15:34:03 #startmeeting RELENG (2017-01-16)
15:34:03 Meeting started Mon Jan 16 15:34:03 2017 UTC. The chair is dgilmore. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:34:03 Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:34:03 The meeting name has been set to 'releng_(2017-01-16)'
15:34:03 #meetingname releng
15:34:03 The meeting name has been set to 'releng'
15:34:03 #chair dgilmore nirik tyll sharkcz bochecha masta pbrobinson pingou maxamillion mboddu
15:34:03 Current chairs: bochecha dgilmore masta maxamillion mboddu nirik pbrobinson pingou sharkcz tyll
15:34:06 #topic init process
15:34:16 morning
15:34:19 .hello mohanboddu
15:34:20 mboddu: mohanboddu 'Mohan Boddu'
15:34:21 .hello maxamillion
15:34:23 maxamillion: maxamillion 'Adam Miller'
15:35:16 we have a packed agenda today
15:35:23 .hello sharkcz
15:35:24 sharkcz: sharkcz 'Dan Horák'
15:35:31 https://pagure.io/releng/issues?status=Open&tags=meeting
15:36:37 .hello puiterwijk
15:36:38 puiterwijk: puiterwijk 'Patrick "マルタインアンドレアス" Uiterwijk'
15:38:05 let's get started
15:39:04 #topic Alternative Architectures updates
15:39:20 sharkcz: anything we need to know on alternative arches?
15:40:18 nope, I'm checking with pbrobinson about the shadow host reinstall and getting the automation back running
15:40:28 okay
15:40:50 #info sharkcz is checking with pbrobinson on shadow host reinstall and getting the automation back running
15:41:03 sharkcz: when were updates last pushed?
15:41:33 on Jan 12th
15:41:48 :(
15:41:50 OKAY
15:41:58 and was that all arches?
15:42:01 or just some?
15:42:02 yes
15:42:11 we need to do them more often
15:42:21 * nirik nods
15:42:27 once per week is sufficient for us
15:42:43 sharkcz: not really
15:43:36 though it's also true of other arches
15:45:33 anyway
15:45:49 I know we have an issue with insufficient power builders
15:46:03 #info there are currently insufficient power builders
15:46:22 #info we have not yet set up hosts to do aarch64 or power image builds on
15:46:57 anything else?
15:47:07 not from me
15:47:10 oh...
15:47:18 nirik:
15:47:20 ?
15:47:29 we might want to identify any work needed to bodhi/etc for f26+ for alternative arch updates...
15:47:47 may just need sync scripts adjustment to put them in place but there might be more, I don't know
15:48:12 nirik: we will
15:48:23 it will just be the syncing of the content
15:48:31 .fas linuxmodder
15:48:32 linuxmodder: linuxmodder 'Corey W Sheldon'
15:48:51 ok. Will make pushes longer again, but oh well.
15:49:15 #info we need to identify changes needed to the bodhi processes for f26 to push alternate arch updates to the correct locations
15:49:34 nirik: we should file an issue in pagure for the overall work, then subtasks
15:49:47 sure
15:49:51 nirik: hopefully not that much longer
15:50:07 brb someone is at my door
15:56:54 back
15:57:36 #topic 130 consider longer logs retention for koschei
15:57:42 https://pagure.io/releng/issue/130
15:57:59 right now it's 1 day and that's really too short.
15:58:19 I guess we could go a week?
15:58:20 if there's a run on a friday or saturday and you see the result monday there's no logs
15:58:34 yeah, a week seems fine
15:58:40 any randomly picked time will be insufficient for some people
15:58:46 sure.
15:58:51 but we do not have petabytes of storage
15:59:02 longer than a week you probably want to redo it anyhow to see what changed...
15:59:29 #agreed we will change the log file retention from 1 day to 7 days for koschei
16:00:29 #topic 133 Enable Signed Repository Metadata
16:00:36 https://pagure.io/releng/issue/133
16:01:06 So I think we should figure out how to do this
16:01:34 I guess. it doesn't buy us much, but more security is better than less.
16:01:39 though I know some users will complain, there was already one bug over how dnf deals with the gpgkeys over the openh264 repo
16:01:50 that was user error (IMHO)
16:01:58 nirik: provides protection for people using baseurl
16:02:28 sure, but "doctor, it hurts when I do this...well, don't do that then"
16:02:33 does it mean a signer has to unlock their key and sign the repo metadata?
16:02:39 nirik: user error, bug in dnf, unrealistic expectations
16:02:57 this may need to tie into the signed koji repos thing...
16:03:09 masta: we would need to set up all the processes that make repos to sign the repodata
16:03:12 masta: I wouldn't go with that, and go for integration with autosigning.
16:03:29 if we do that as part of the creation or shipping of the repos we should decide
16:03:47 nirik: I do not think so
16:03:54 I want to get less manual signing, not more
16:04:19 given it needs to be done at the end of long running processes
16:04:27 manual is not feasible
16:04:39 I can see manually signed release repos, but not updates... those would want automation.
16:04:50 the repodata signature is a detached signature for repodata.xml
16:04:52 I'd want automation for all kinds
16:05:16 masta: well we would want branched and rawhide composes to have signed repodata also
16:05:52 do we all think there is value in signing the repodata?
16:06:10 sure.
16:06:34 yes, I've always wanted it... but it was something I've never mentioned... because I thought it was too hard a problem (in the context of manually signing)
16:06:45 #info we agree there is value in signing repodata
16:07:08 we need to identify all the places to be signed
16:07:36 I think the repos produced by pungi composes and bodhi updates
16:08:00 do we want to retroactively change epel, and existing fedora updates streams?
16:08:08 I think so
16:08:32 even if we do not change the configs
16:08:54 I think setting up the signing without a bunch of exceptions and corner cases is best
16:09:05 I do wonder how things like satellite will work
16:09:18 say someone imports epel into satellite/spacewalk
16:09:21 since only repodata.xml makes sense... sure... only that one file would churn.
16:09:25 does it make its own repos
16:10:34 masta: well unless you set repo_gpgcheck=1 in the repo file or globally nothing changes from the user perspective
16:10:59 rhartman|rh: do you know if satellite/spacewalk/katello etc make their own repos?
16:11:12 or do they also suck in the repodata?
16:11:45 but I guess it's up to people using such tech to manage the repo definitions anyway
16:11:48 dgilmore, i'm not familiar with the latter 2, satellite I can investigate, I believe they do
16:12:13 rhartman|rh: spacewalk is the open source satellite 5 code base
16:12:36 rhartman|rh: and katello I believe is the open source satellite 6 code base
16:13:36 does someone want to take ownership of finding and documenting our use cases?
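The point above that nothing changes for users unless repo_gpgcheck=1 is set can be illustrated with a minimal sketch. It assumes a hypothetical .repo file (the repo ids and mirror.example.org URLs are made up for illustration) and only inspects per-repo settings; a global setting in the [main] section of dnf.conf is not considered here.

```python
import configparser

# Hypothetical .repo content for illustration only; the repo ids and
# URLs are assumptions and do not describe any real Fedora repository.
SAMPLE_REPO = """\
[example-updates]
name=Example updates (hypothetical repo)
baseurl=https://mirror.example.org/updates/$releasever/$basearch/
enabled=1
gpgcheck=1
repo_gpgcheck=1

[example-testing]
name=Example testing (hypothetical repo)
baseurl=https://mirror.example.org/testing/$releasever/$basearch/
enabled=1
gpgcheck=1
"""


def repo_gpgcheck_status(repo_text):
    """Map each repo id to True if repo_gpgcheck is switched on for it."""
    # Interpolation is disabled so values are read back literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    status = {}
    for section in parser.sections():
        # A repo without the option is treated as "off" in this sketch.
        value = parser.get(section, "repo_gpgcheck", fallback="0")
        status[section] = value.strip().lower() in ("1", "true", "yes", "on")
    return status


if __name__ == "__main__":
    for repo_id, enabled in repo_gpgcheck_status(SAMPLE_REPO).items():
        print(repo_id, "repo_gpgcheck on" if enabled else "repo_gpgcheck off")
```

A repo that only sets gpgcheck=1 still verifies package signatures; the sketch just shows which repos would additionally ask the client to verify the metadata signature.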
16:13:41 dgilmore, i'll take the ai to follow up, assuming your concern is those would need updates as well
16:14:28 rhartman|rh: well my concern is people doing something like importing epel or fedora would lose the ability to check the gpg signing of the repos
16:15:16 #action rhartman to follow up with the spacewalk teams code bases on signing of repodata and how it would work for them
16:15:38 rhartman|rh: as they will have no ability to sign repos themselves using the official keys
16:15:39 dgilmore, so like a yum repo-sync kinda thing, and run createrepo_c.. so then no more gpg sig?
16:15:51 which yum/dnf afaik makes an assumption on
16:15:59 masta: exactly
16:16:28 hrm...
16:17:58 #info let's follow up next week. If someone wants to work on this please reach out to dgilmore
16:18:31 #topic #6557 Something broken in Rawhide's koji
16:18:39 https://pagure.io/releng/issue/6557
16:19:02 this looks like it's a bug in elfutils on aarch64
16:19:19 we need to do better about triaging and providing feedback on tickets
16:19:30 there are also other issues in koji right now
16:19:37 downloads of rpms are failing a lot
16:19:50 and it looks like srpms are randomly corrupted
16:20:00 The latest issue there is a very weird thing with DNS resolution sometimes working and other times not
16:20:27 puiterwijk: joy
16:20:52 And it's not like anything changes: within the exact same dnf run, getting the repodata works, but then getting the packages gives a dns error
16:21:08 #info we need to do better at triaging issues and providing feedback. the issue that resulted in this ticket seems to be an elfutils bug on aarch64
16:21:09 corrupted srpms are in my experience incompletely downloaded, not some shuffled bits or so
16:21:41 sharkcz: we started getting a lot of them in the last 24 hours
16:21:57 might be related to the buildroot issues
16:22:00 which puiterwijk and nirik did change things in kojipkgs yesterday
16:22:18 yes yes we did.
16:22:19 puiterwijk: nirik: would you like to summarise the changes you made?
16:22:48 I added a kojipkgs02 instance (setup exactly like the 01 one)... then we moved both of them behind proxy01/10 and haproxy.
16:23:07 so you should be able to take down either one and the other one keeps working transparently
16:23:09 that is a lot of extra layers
16:23:15 it is.
16:23:28 but it means kojipkgs01 is no longer a SPOF
16:23:44 any of which could be causing the current corruption of srpms and is hard to debug
16:23:49 sure
16:23:53 puiterwijk: ah, you already debugged the download thing ?
16:24:02 nirik: yes. the latest issue is this DNS thing...
16:24:08 dgilmore, how can we reproduce, would a curl/wget work for that?
16:24:19 Which makes no sense at all, and is not related to our change other than that we changed DNS
16:24:25 masta: no, curl/wget work.
16:24:30 note: the sometimes failing to download packages for the mock chroot was happening before.
16:24:40 #info nirik added a kojipkgs02 instance (setup exactly like the 01 one)... then we moved both of them behind proxy01/10 and haproxy.
16:25:01 puiterwijk: I have a thought...
16:25:07 nirik: ah, me too
16:25:11 nirik: right
16:25:11 we aren't allowing port 53 tcp on the builders
16:25:28 nirik: the new issue is what seems to be corrupted srpm downloads frequently
16:25:28 so when the result is too large for udp it sends tcp and it's dropped?
16:25:37 nirik: well, it's the same DNS result..
16:25:47 dgilmore: yep. saw it. Not dug into it much yet.
16:25:51 we also have another issue...
16:26:04 buildvm's are very unstable
16:26:17 and we haven't been able to isolate why yet.
16:26:38 #info new issue is that we seem to frequently get corrupted srpm downloads for build tasks
16:26:39 so, looks like my monday is all set. ;)
16:26:56 #info new issue is that buildvm's seem unstable after moving to f25
16:27:25 but it's not f25
16:27:31 #info new issue is that armv7hl builds get killed by oom much more frequently than previously
16:27:34 they were fine from dec to last week
16:27:57 nirik: updates applied last week?
16:28:07 I was thinking it was a kernel issue, but I moved them back to the kernel they were on and they still blew up last night
16:28:19 so, it might be a userspace thing... yes, updates last week
16:28:24 how are they blowing up?
16:28:43 they hang, they spew kernel oopses, then they slowly reboot (over many many minutes)
16:29:32 and journald /systemd dies and corrupts the journal
16:29:46 https://bugzilla.redhat.com/show_bug.cgi?id=1413314 is some of the dmesg with 4.9.x kernel
16:30:51 lovely
16:31:42 yeah, it's always fun. ;)
16:31:51 and i686 builds seem to be more problematic than x86_64?
16:32:01 not sure.
16:32:27 most of the complaints I've seen in irc channels have been issues with i686
16:32:36 but that may be a coincidence
16:33:04 so are the arm builders doing large builds, kernel, tool chains, atlas... big jobs?
16:33:40 #action puiterwijk, nirik and dgilmore to do more investigation
16:34:00 masta: libreoffice is one, but it's others
16:34:05 the arm builders were hitting OOM all the time, so I updated them last week and they are running 4.9.x kernels just fine
16:34:12 masta: including nirik's ssh session when looking
16:34:23 the above issue is buildvms _only_
16:34:32 #info arm builders are happier on 4.9 kernel
16:34:37 buildhw and arms seem just fine with the 4.9.x kernel
16:34:53 libreoffice is still failing, but that might be its fault
16:34:58 okay
16:35:04 any more to discuss here?
16:35:20 our time is technically up and we have not done half of the issues?
16:36:12 #topic #6570 Fedora Core 1 archive incomplete
16:36:20 https://pagure.io/releng/issue/6570
16:36:35 I am really unsure of what we can do here
16:36:52 if someone still has some cds? ;)
16:37:01 we could send out a call...
16:37:16 I might have a CD at home, kital also might have one at his office.
16:37:33 I think I have access to the GOLD tree
16:37:42 (not sure if home-burned CDs are still readable 10 years after)
16:37:56 yeah, no telling.
16:37:58 I am not sure how it was messed up
16:38:01 * nirik got rid of his cd's long ago
16:38:52 ls /mnt/redhat/released/FC-1/GOLD/x86_64/os/Fedora/RPMS/|wc -l
16:38:52 1476
16:38:55 dgilmore, sorry was in simultaneous meetings, reading up. And will pursue that info
16:39:04 rhartman|rh: cheers
16:39:35 will have to look at it as what's in the internal tree actually has a different layout to what is on the mirrors
16:40:37 #info we assume that something in the archiving process caused the issues
16:41:07 #info dgilmore has access to the internal Red Hat copy of FC-1 it however looks different to what was shipped
16:41:26 for me the biggest question is do we do anything?
16:43:18 Good question, I lean towards yes.
16:43:37 It's what we know when we know it... so now we know, should fix it.
16:44:30 we likely should also look at providing tooling for release archiving
16:45:07 * nirik also thinks we should fix...
16:45:11 for posterity!
16:45:19 #info we should put back the missing content
16:45:33 #action dgilmore will put back the missing content
16:46:02 #action file an issue for new tooling to be used in archiving releases
16:46:37 #topic #6577 Fedora 26 mass rebuild
16:46:44 https://pagure.io/releng/issue/6577
16:47:01 so the schedule has us doing a mass rebuild on Feb 1
16:47:13 we need to update the scripts
16:47:23 we need to file tracker bugs
16:47:36 set up failing build monitoring
16:47:53 I guess mboddu should probably do it
16:47:58 and I will help him
16:48:02 +1
16:48:09 we will use the releng user to do the mass rebuild
16:48:16 which reminds me
16:48:17 dgilmore: sure
16:48:29 I am the only person getting email sent to releng@fp.o
16:48:37 are there others who want that email?
16:49:07 mostly it is bug email from mass rebuild FTBFS
16:49:39 we know there will be a new gcc
16:50:07 dgilmore: I think I should get them as well
16:50:38 mboddu: okay, will add you
16:50:51 dgilmore: thanks :)
16:51:16 #action dgilmore to add mboddu to list to receive releng@fp.o email
16:51:22 huh, I thought I was already getting those emails, but no.. those were from a mail list named rel-eng@
16:51:58 masta: yeah it's different
16:52:09 it was something set up when i made the releng user in fas
16:52:44 dgilmore, does it get any actionable email from real people?
16:53:13 masta: sometimes people set the bugzilla user as needinfo
16:53:29 otherwise no
16:53:42 okay, add me too... as a backup from you and mboddu
16:53:49 okay
16:54:00 #action dgilmore to add masta to list to receive releng@fp.o email
16:54:25 #info need to determine the status of all things needed prior to the mass rebuild
16:54:46 I know the pkgconfig one started discussion only after being approved
16:55:33 #topic #6578 run repoclosure before merging mass rebuilds
16:55:40 https://pagure.io/releng/issue/6578
16:55:59 wording here may have been bad
16:56:13 we can not really run it for the mass rebuilds
16:56:30 but we can when merging in side targets
16:57:06 and given how much of a mess python3 was in when it was merged I think there is a lot of value in making sure that we do not merge in things known broken
16:57:11 seems reasonable for completed side-tags
16:57:36 it fits into my long term plan of detecting when things cause breakage and not letting in until the breakage is fixed
16:57:52 most side tags have been fine
16:58:14 does everyone agree we should add the extra checking?
16:58:21 +1
16:58:27 +1
16:58:40 mboddu: maxamillion: nirik: puiterwijk: sharkcz: pbrobinson: bowlofeggs:
16:58:49 tyll_:
16:59:02 +1
16:59:17 +1
16:59:21 sorry, multitasking badly
16:59:25 sure. +1
16:59:28 I will mark that as accepted
16:59:30 Sure, +1
16:59:46 #accepted we will run repoclosure on side targets before merging
17:00:02 #action dgilmore to update the sop on how to run repoclosure
17:00:17 #topic #6587 PDC ownership in Fedora
17:00:23 https://pagure.io/releng/issue/6587
17:00:38 so we need someone to step up and own pdc in fedora
17:00:55 Ralph does not have time for it in his current role
17:01:04 there is a lot more we can and should do with it
17:01:27 we just need to have someone who can look after it and help drive new features for us with upstream
17:01:39 Do we have an existing postgres database we can host it on?
17:01:48 maybe the place koji db runs?
17:01:50 we are currently running an old version as no one has had the time to do the upgrades
17:02:02 masta: we have it up and running
17:02:06 oh!
17:02:06 masta: it's all in place
17:02:15 just not looked after and loved as it should be
17:02:32 https://pdc.fedoraproject.org/
17:03:44 #info someone is needed to shepherd and love PDC for us
17:03:59 I only just started messing with pdc, so I'm not going to latch on to this, but I'll look into it to potentially help here. Okay?
17:04:40 #info we need someone to guide and shepherd it for Fedora's needs. be involved in upstream. ensure we get code updates and make the most of what PDC offers us
17:04:48 okay cheers
17:05:13 #topic Open Floor
17:05:24 does anyone have anything?
17:05:50 nothing from me
17:06:29 * nirik thinks everything he had was already covered.
17:07:51 let's wrap up then
17:07:54 Thanks all
17:07:57 #endmeeting