15:34:03 <dgilmore> #startmeeting RELENG (2017-01-16)
15:34:03 <zodbot> Meeting started Mon Jan 16 15:34:03 2017 UTC.  The chair is dgilmore. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:34:03 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:34:03 <zodbot> The meeting name has been set to 'releng_(2017-01-16)'
15:34:03 <dgilmore> #meetingname releng
15:34:03 <zodbot> The meeting name has been set to 'releng'
15:34:03 <dgilmore> #chair dgilmore nirik tyll sharkcz bochecha masta pbrobinson pingou maxamillion mboddu
15:34:03 <zodbot> Current chairs: bochecha dgilmore masta maxamillion mboddu nirik pbrobinson pingou sharkcz tyll
15:34:06 <dgilmore> #topic init process
15:34:16 <nirik> morning
15:34:19 <mboddu> .hello mohanboddu
15:34:20 <zodbot> mboddu: mohanboddu 'Mohan Boddu' <mboddu@bhujji.com>
15:34:21 <maxamillion> .hello maxamillion
15:34:23 <zodbot> maxamillion: maxamillion 'Adam Miller' <maxamillion@gmail.com>
15:35:16 <dgilmore> we have a packed agenda today
15:35:23 <sharkcz> .hello sharkcz
15:35:24 <zodbot> sharkcz: sharkcz 'Dan Horák' <dan@danny.cz>
15:35:31 <dgilmore> https://pagure.io/releng/issues?status=Open&tags=meeting
15:36:37 <puiterwijk> .hello puiterwijk
15:36:38 <zodbot> puiterwijk: puiterwijk 'Patrick "マルタインアンドレアス" Uiterwijk' <puiterwijk@redhat.com>
15:38:05 <dgilmore> let's get started
15:39:04 <dgilmore> #topic Alternative Architectures updates
15:39:20 <dgilmore> sharkcz: anything we need to know on alternative arches?
15:40:18 <sharkcz> nope, I'm checking with pbrobinson about the shadow host reinstall and getting the automation back running
15:40:28 <dgilmore> okay
15:40:50 <dgilmore> #info sharkcz is checking with pbrobinson on shadow host reinstall and getting the automation back running
15:41:03 <dgilmore> sharkcz: when were updates last pushed?
15:41:33 <sharkcz> on Jan 12th
15:41:48 <dgilmore> :(
15:41:50 <dgilmore> OKAY
15:41:58 <dgilmore> and was that all arches?
15:42:01 <dgilmore> or just some?
15:42:02 <sharkcz> yes
15:42:11 <dgilmore> we need to do them more often
15:42:21 * nirik nods
15:42:27 <sharkcz> once per week is sufficient for us
15:42:43 <dgilmore> sharkcz: not really
15:43:36 <dgilmore> though it's also true of other arches
15:45:33 <dgilmore> anyway
15:45:49 <dgilmore> I know we have an issue with insufficient power builders
15:46:03 <dgilmore> #info there are currently insufficient power builders
15:46:22 <dgilmore> #info we have not yet set up hosts to do aarch64 or power image builds on
15:46:57 <dgilmore> anything else?
15:47:07 <sharkcz> not from me
15:47:10 <nirik> oh...
15:47:18 <dgilmore> nirik:
15:47:20 <dgilmore> ?
15:47:29 <nirik> we might want to identify any work needed to bodhi/etc for f26+ for alternative arch updates...
15:47:47 <nirik> may just need sync scripts adjustment to put them in place but there might be more, I don't know
15:48:12 <dgilmore> nirik: we will
15:48:23 <dgilmore> it will just be the syncing of the content
15:48:31 <linuxmodder> .fas linuxmodder
15:48:32 <zodbot> linuxmodder: linuxmodder 'Corey W Sheldon' <sheldon.corey@openmailbox.org>
15:48:51 <nirik> ok. Will make pushes longer again, but oh well.
15:49:15 <dgilmore> #info we need to identify changes needed to the bodhi processes for f26 to push alternate arch updates to the correct locations
15:49:34 <dgilmore> nirik: we should file an issue in pagure for the overall work, then subtasks
15:49:47 <nirik> sure
15:49:51 <dgilmore> nirik: hopefully not that much longer
15:50:07 <dgilmore> brb someone is at my door
15:56:54 <dgilmore> back
15:57:36 <dgilmore> #topic 130 consider longer logs retention for koschei
15:57:42 <dgilmore> https://pagure.io/releng/issue/130
15:57:59 <nirik> right now it's 1 day and that's really too short.
15:58:19 <dgilmore> I guess we could go a week?
15:58:20 <nirik> if there's a run on a friday or saturday and you see the result monday there's no logs
15:58:34 <nirik> yeah, a week seems fine
15:58:40 <dgilmore> any randomly picked time will be insufficient for some people
15:58:46 <nirik> sure.
15:58:51 <dgilmore> but we do not have petabytes of storage
15:59:02 <nirik> longer than a week you probably want to redo it anyhow to see what changed...
15:59:29 <dgilmore> #agreed we will change the log file retention from 1 day to 7 days for koschei
16:00:29 <dgilmore> #topic 133 Enable Signed Repository Metadata
16:00:36 <dgilmore> https://pagure.io/releng/issue/133
16:01:06 <dgilmore> So I think we should figure out how to do this
16:01:34 <nirik> I guess. it doesn't buy us much, but more security is better than less.
16:01:39 <dgilmore> though I know some users will complain, there was already one bug over how dnf deals with the gpgkeys over the openh264 repo
16:01:50 <nirik> that was user error (IMHO)
16:01:58 <dgilmore> nirik: provides protection for people using baseurl
16:02:28 <nirik> sure, but "doctor, it hurts when I do this...well, don't do that then"
16:02:33 <masta> does it mean a signer has to unlock their key and sign the repo metadata?
16:02:39 <dgilmore> nirik: user error, bug in dnf, unrealistic expectations
16:02:57 <nirik> this may need to tie into the signed koji repos thing...
16:03:09 <dgilmore> masta: we would need to set up all the processes that make repos to sign the repodata
16:03:12 <puiterwijk> masta: I wouldn't go with that, and go for integration with autosigning.
16:03:29 <dgilmore> if we do that as part of the creation or shipping of the repos we should decide
16:03:47 <dgilmore> nirik: I do not think so
16:03:54 <puiterwijk> I want to get less manual signing, not more
16:04:19 <dgilmore> given it needs to be done at the end of long running processes
16:04:27 <dgilmore> manual is not feasible
16:04:39 <masta> I can see manually signed release repos, but not updates... those would want automation.
16:04:50 <dgilmore> the repodata signature is a detached signature for repodata.xml
16:04:52 <puiterwijk> I'd want automation for all kinds
16:05:16 <dgilmore> masta: well we would want branched and rawhide composes to have signed repodata also
16:05:52 <dgilmore> do we all think there is value in signing the repodata?
16:06:10 <nirik> sure.
16:06:34 <masta> yes, I've always wanted it... but it was something I've never mentioned... because I thought it was too hard a problem (in the context of manually signing)
16:06:45 <dgilmore> #info we agree there is value in signing repodata
16:07:08 <dgilmore> we need to identify all the places to be signed
16:07:36 <dgilmore> I think the repos produced by pungi composes and bodhi updates
16:08:00 <dgilmore> do we want to retroactively change epel, and existing fedora updates streams?
16:08:08 <dgilmore> I think so
16:08:32 <dgilmore> even if we do not change the configs
16:08:54 <dgilmore> I think setting up the signing without a bunch of exceptions and corner cases is best
16:09:05 <dgilmore> I do wonder how things like satellite will work
16:09:18 <dgilmore> say someone imports epel into satellite/spacewalk
16:09:21 <masta> since only repodata.xml makes sense... sure... only that one file would churn.
16:09:25 <dgilmore> does it make its own repos
16:10:34 <dgilmore> masta: well unless you set repo_gpgcheck=1 in the repo file or globally nothing changes from the user perspective
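Editor's sketch, not part of the meeting log: the point above is that signed repository metadata is only verified by clients whose repo definition opts in with repo_gpgcheck=1. The following minimal Python illustration reports which repos in a .repo file have that option enabled. The repo id, mirror URL, and key path are hypothetical placeholders, and the sketch only looks at per-repo settings, not at a global setting in the main dnf configuration.

```python
import configparser

# Hypothetical repo definition; the repo id, URLs, and key path are placeholders.
EXAMPLE_REPO = """\
[example-updates]
name=Example updates (hypothetical)
baseurl=https://mirror.example.org/updates/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-example
"""

TRUE_VALUES = {"1", "true", "yes"}


def repos_checking_metadata(repo_text):
    """Return the ids of repos that set repo_gpgcheck to a true value."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    enabled = []
    for section in parser.sections():
        value = parser.get(section, "repo_gpgcheck", fallback="0")
        if value.strip().lower() in TRUE_VALUES:
            enabled.append(section)
    return enabled


if __name__ == "__main__":
    print(repos_checking_metadata(EXAMPLE_REPO))
```

A repo that omits the option is treated here as not checking metadata signatures, which matches the "nothing changes from the user perspective" remark in the log.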
16:10:59 <dgilmore> rhartman|rh: do you know if satellite/spacewalk/katello etc make their own repos?
16:11:12 <dgilmore> or do they also suck in the repodata?
16:11:45 <dgilmore> but I guess it's up to people using such tech to manage the repo definitions anyway
16:11:48 <rhartman|rh> dgilmore, i'm not familiar with the latter 2, satellite I can investigate, I believe they do
16:12:13 <dgilmore> rhartman|rh: spacewalk is the open source satellite 5 code base
16:12:36 <dgilmore> rhartman|rh: and katello I believe is the open source satellite 6 code base
16:13:36 <dgilmore> does someone want to take ownership of finding and documenting our use cases?
16:13:41 <rhartman|rh> dgilmore, i'll take the ai to follow up, assuming your concern is those would need updates as well
16:14:28 <dgilmore> rhartman|rh: well my concern is people doing something like importing epel or fedora would lose the ability to check the gpg signing of the repos
16:15:16 <dgilmore> #action rhartman to follow up with the spacewalk teams' code bases on signing of repodata and how it would work for them
16:15:38 <dgilmore> rhartman|rh: as they will have no ability to sign repos themselves using the official keys
16:15:39 <masta> dgilmore, so like a yum repo-sync kinda thing, and run createrepo_c.. so then no more gpg sig?
16:15:51 <dgilmore> which yum/dnf afaik makes an assumption on
16:15:59 <dgilmore> masta: exactly
16:16:28 <masta> hrm...
16:17:58 <dgilmore> #info let's follow up next week. If someone wants to work on this please reach out to dgilmore
16:18:31 <dgilmore> #topic #6557 Something broken in Rawhide's koji
16:18:39 <dgilmore> https://pagure.io/releng/issue/6557
16:19:02 <dgilmore> this looks like it's a bug in elfutils on aarch64
16:19:19 <dgilmore> we need to do better about triaging and providing feedback on tickets
16:19:30 <dgilmore> there are also other issues in koji right now
16:19:37 <dgilmore> downloads of rpms are failing a lot
16:19:50 <dgilmore> and it looks like srpms are randomly corrupted
16:20:00 <puiterwijk> The latest issue there is a very weird thing with DNS resolution sometimes working and other times not
16:20:27 <dgilmore> puiterwijk: joy
16:20:52 <puiterwijk> And it's not like anything changes: within the exact same dnf run, getting the repodata works, but then getting the packages gives a dns error
16:21:08 <dgilmore> #info we need to do better at triaging issues and providing feedback. the issue that resulted in this ticket seems to be an elfutils bug on aarch64
16:21:09 <sharkcz> corrupted srpms are in my experience incompletely downloaded, not some shuffled bits or so
16:21:41 <dgilmore> sharkcz: we started getting a lot of them in the last 24 hours
16:21:57 <sharkcz> might be related to the buildroot issues
16:22:00 <dgilmore> which puiterwijk and nirik did change things in kojipkgs yesterday
16:22:18 <nirik> yes yes we did.
16:22:19 <dgilmore> puiterwijk: nirik: would you like to summarise the changes you made?
16:22:48 <nirik> I added a kojipkgs02 instance (set up exactly like the 01 one)... then we moved both of them behind proxy01/10 and haproxy.
16:23:07 <nirik> so you should be able to take down either one and the other one keeps working transparently
16:23:09 <dgilmore> that is a lot of extra layers
16:23:15 <nirik> it is.
16:23:28 <nirik> but it means kojipkgs01 is no longer a SPOF
16:23:44 <dgilmore> any of which could be causing the current corruption of srpms and is hard to debug
16:23:49 <dgilmore> sure
16:23:53 <nirik> puiterwijk: ah, you already debugged the download thing ?
16:24:02 <puiterwijk> nirik: yes. the latest issue is this DNS thing...
16:24:08 <masta> dgilmore, how can we reproduce, would a curl/wget work for that?
16:24:19 <puiterwijk> Which makes no sense at all, and is not related to our change other than that we changed DNS
16:24:25 <puiterwijk> masta: no, curl/wget work.
16:24:30 <nirik> note: the sometimes failing to download packages for the mock chroot was happening before.
16:24:40 <dgilmore> #info nirik added a kojipkgs02 instance (set up exactly like the 01 one)... then we moved both of them behind proxy01/10 and haproxy.
16:25:01 <nirik> puiterwijk: I have a thought...
16:25:07 <puiterwijk> nirik: ah, me too
16:25:11 <dgilmore> nirik: right
16:25:11 <nirik> we aren't allowing port 53 tcp on the builders
16:25:28 <dgilmore> nirik: the new issue is what seems to be corrupted srpm downloads frequently
16:25:28 <nirik> so when the result is too large for udp it sends tcp and it's dropped?
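Editor's sketch, not part of the meeting log: the hypothesis above relies on standard DNS behaviour, where a server that cannot fit an answer into a UDP reply sets the TC (truncated) bit in the header and the client is expected to retry over TCP port 53. If TCP is blocked, that retry fails. The Python below only shows how the TC bit is read from a raw DNS message header; the helper names and the sample header values are illustrative assumptions, and it does not reproduce what the builders actually saw.

```python
import struct

# Bit mask of the TC (truncated) flag in the 16-bit DNS header flags field.
TC_FLAG = 0x0200


def is_truncated(message: bytes) -> bool:
    """Return True if the DNS message header has the TC bit set."""
    if len(message) < 12:
        raise ValueError("a DNS message needs at least a 12-byte header")
    _query_id, flags = struct.unpack("!HH", message[:4])
    return bool(flags & TC_FLAG)


def make_response_header(truncated: bool) -> bytes:
    """Build an illustrative response header (hypothetical id and counts)."""
    flags = 0x8000 | 0x0100 | 0x0080  # QR (response), RD, RA
    if truncated:
        flags |= TC_FLAG
    # id, flags, question count, answer count, authority count, additional count
    return struct.pack("!HHHHHH", 0x1234, flags, 1, 0, 0, 0)


if __name__ == "__main__":
    for truncated in (False, True):
        header = make_response_header(truncated)
        action = "retry over TCP" if is_truncated(header) else "use UDP answer"
        print(truncated, action)
```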
16:25:37 <puiterwijk> nirik: well, it's the same DNS result..
16:25:47 <nirik> dgilmore: yep. saw it. Not dug into it much yet.
16:25:51 <nirik> we also have another issue...
16:26:04 <nirik> buildvm's are very unstable
16:26:17 <nirik> and we haven't been able to isolate why yet.
16:26:38 <dgilmore> #info new issue is that we seem to frequently get corrupted srpm downloads for build tasks
16:26:39 <nirik> so, looks like my monday is all set. ;)
16:26:56 <dgilmore> #info new issue is that buildvm's seem unstable after moving to f25
16:27:25 <nirik> but it's not f25
16:27:31 <dgilmore> #info new issue is that armv7hl builds get killed by oom much more frequently than previously
16:27:34 <nirik> they were fine from dec to last week
16:27:57 <dgilmore> nirik: updates applied last week?
16:28:07 <nirik> I was thinking it was a kernel issue, but I moved them back to the kernel they were on and they still blew up last night
16:28:19 <nirik> so, it might be a userspace thing... yes, updates last week
16:28:24 <dgilmore> how are they blowing up?
16:28:43 <nirik> they hang, they spew kernel oopses, then they slowly reboot (over many many minutes)
16:29:32 <nirik> and journald /systemd dies and corrupts the journal
16:29:46 <nirik> https://bugzilla.redhat.com/show_bug.cgi?id=1413314 is some of the dmesg with 4.9.x kernel
16:30:51 <dgilmore> lovely
16:31:42 <nirik> yeah, it's always fun. ;)
16:31:51 <dgilmore> and i686 builds seem to be more problematic than x86_64?
16:32:01 <nirik> not sure.
16:32:27 <dgilmore> most of the complaints I've seen in irc channels have been issues with i686
16:32:36 <dgilmore> but that may be a coincidence
16:33:04 <masta> so are the arm builders doing large builds, kernel, tool chains, atlas... big jobs?
16:33:40 <dgilmore> #action puiterwijk, nirik and dgilmore to do more investigation
16:34:00 <dgilmore> masta: libreoffice is one, but it's others
16:34:05 <nirik> the arm builders were hitting OOM all the time, so I updated them last week and they are running 4.9.x kernels just fine
16:34:12 <dgilmore> masta: including nirik's ssh session when looking
16:34:23 <nirik> the above issue is buildvms _only_
16:34:32 <dgilmore> #info arm builders are happier on 4.9 kernel
16:34:37 <nirik> buildhw and arms seem just fine with the 4.9.x kernel
16:34:53 <nirik> libreoffice is still failing, but that might be its fault
16:34:58 <dgilmore> okay
16:35:04 <dgilmore> any more to discuss here?
16:35:20 <dgilmore> our time is technically up and we have not done half of the issues?
16:36:12 <dgilmore> #topic #6570  Fedora Core 1 archive incomplete
16:36:20 <dgilmore> https://pagure.io/releng/issue/6570
16:36:35 <dgilmore> I am really unsure of what we can do here
16:36:52 <nirik> if someone still has some cds? ;)
16:37:01 <nirik> we could send out a call...
16:37:16 <rsc> I might have a CD at home, kital also might have one at his office.
16:37:33 <dgilmore> I think I have access to the GOLD tree
16:37:42 <rsc> (not sure if home-burned CDs are still readable 10 years after)
16:37:56 <nirik> yeah, no telling.
16:37:58 <dgilmore> I am not sure how it was messed up
16:38:01 * nirik got rid of his cd's long ago
16:38:52 <dgilmore> ls /mnt/redhat/released/FC-1/GOLD/x86_64/os/Fedora/RPMS/|wc -l
16:38:52 <dgilmore> 1476
16:38:55 <rhartman|rh> dgilmore, sorry was in simultaneous meetings, reading up. And will pursue that info
16:39:04 <dgilmore> rhartman|rh: cheers
16:39:35 <dgilmore> will have to look at it as what's in the internal tree actually has a different layout to what is on the mirrors
16:40:37 <dgilmore> #info we assume that something in the archiving process caused the issues
16:41:07 <dgilmore> #info dgilmore has access to the internal Red Hat copy of FC-1 it however looks different to what was shipped
16:41:26 <dgilmore> for me the biggest question is do we do anything?
16:43:18 <masta> Good question, I lean towards yes.
16:43:37 <masta> It's what we know when we know it... so now we know, should fix it.
16:44:30 <dgilmore> we likely should also look at providing tooling for release archiving
16:45:07 * nirik also thinks we should fix...
16:45:11 <nirik> for posterity!
16:45:19 <dgilmore> #info we should put back the missing content
16:45:33 <dgilmore> #action dgilmore will put back the missing content
16:46:02 <dgilmore> #action file an issue for new tooling to be used in archiving releases
16:46:37 <dgilmore> #topic #6577 Fedora 26 mass rebuild
16:46:44 <dgilmore> https://pagure.io/releng/issue/6577
16:47:01 <dgilmore> so the schedule has us doing a mass rebuild on Feb 1
16:47:13 <dgilmore> we need to update the scripts
16:47:23 <dgilmore> we need to file tracker bugs
16:47:36 <dgilmore> setup failing build monitoring
16:47:53 <dgilmore> I guess mboddu should probably do it
16:47:58 <dgilmore> and I will help him
16:48:02 <maxamillion> +1
16:48:09 <dgilmore> we will use the releng user to do the mass rebuild
16:48:16 <dgilmore> which reminds me
16:48:17 <mboddu> dgilmore: sure
16:48:29 <dgilmore> I am the only person getting email sent to releng@fp.o
16:48:37 <dgilmore> are there others who want that email?
16:49:07 <dgilmore> mostly it is bug email from mass rebuild FTBFS
16:49:39 <dgilmore> we know there will be a new gcc
16:50:07 <mboddu> dgilmore: I think I should get them as well
16:50:38 <dgilmore> mboddu: okay, will add you
16:50:51 <mboddu> dgilmore: thanks :)
16:51:16 <dgilmore> #action dgilmore to add mboddu to list to receive releng@fp.o email
16:51:22 <masta> huh, I thought I was already getting those emails, but no.. those were from a mail list named rel-eng@
16:51:58 <dgilmore> masta: yeah it's different
16:52:09 <dgilmore> it was something set up when I made the releng user in fas
16:52:44 <masta> dgilmore, does it get any actionable email from real people?
16:53:13 <dgilmore> masta: sometimes people set the bugzilla user as needinfo
16:53:29 <dgilmore> otherwise no
16:53:42 <masta> okay, add me too... as a backup from you and mboddu
16:53:49 <dgilmore> okay
16:54:00 <dgilmore> #action dgilmore to add masta to list to receive releng@fp.o email
16:54:25 <dgilmore> #info need to determine the status of all things needed prior to the mass rebuild
16:54:46 <dgilmore> I know the pkgconfig one started discussion only after being approved
16:55:33 <dgilmore> #topic #6578 run repoclosure before merging mass rebuilds
16:55:40 <dgilmore> https://pagure.io/releng/issue/6578
16:55:59 <dgilmore> wording here may have been bad
16:56:13 <dgilmore> we can not really run it for the mass rebuilds
16:56:30 <dgilmore> but we can when merging in side targets
16:57:06 <dgilmore> and given how much of a mess python3 was in when it was merged I think there is a lot of value in making sure that we do not merge in things known broken
16:57:11 <masta> seems reasonable for completed side-tags
16:57:36 <dgilmore> it fits into my long term plan of detecting when things cause breakage and not letting in until the breakage is fixed
16:57:52 <dgilmore> most side tags have been fine
16:58:14 <dgilmore> does everyone agree we should add the extra checking?
16:58:21 <masta> +1
16:58:27 <sharkcz> +1
16:58:40 <dgilmore> mboddu: maxamillion: nirik: puiterwijk: sharkcz: pbrobinson: bowlofeggs:
16:58:49 <dgilmore> tyll_:
16:59:02 <mboddu> +1
16:59:17 <maxamillion> +1
16:59:21 <maxamillion> sorry, multitasking badly
16:59:25 <nirik> sure. +1
16:59:28 <dgilmore> I will mark that as accepted
16:59:30 <puiterwijk> Sure, +1
16:59:46 <dgilmore> #accepted we will run repoclosure on side targets before merging
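Editor's sketch, not part of the meeting log: the check agreed above asks whether every dependency required by packages in a side target can be satisfied by something in the set being merged. The toy Python below illustrates that closure idea only. It matches requirement strings exactly rather than comparing versions the way the real repoclosure tool does, and the package names and capability strings are hypothetical.

```python
def unresolved_requires(packages):
    """Map each package to the requirements nothing in the set provides.

    `packages` maps a package name to a dict with optional "provides"
    and "requires" lists. Matching is by exact string, an assumption
    made to keep the sketch short.
    """
    provided = set()
    for name, meta in packages.items():
        provided.add(name)  # a package implicitly provides its own name
        provided.update(meta.get("provides", ()))

    broken = {}
    for name, meta in packages.items():
        missing = sorted(r for r in meta.get("requires", ()) if r not in provided)
        if missing:
            broken[name] = missing
    return broken


# Hypothetical side-target contents.
SIDE_TARGET = {
    "interp": {"provides": ["interp(abi) = 2"], "requires": []},
    "mod-good": {"provides": [], "requires": ["interp(abi) = 2"]},
    "mod-stale": {"provides": [], "requires": ["interp(abi) = 1"]},
}

if __name__ == "__main__":
    for package, missing in unresolved_requires(SIDE_TARGET).items():
        print(package, "is missing", missing)
```

In this sketch an empty result would mean the set is closed; any entry names a package that would be known broken if merged as is.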
17:00:02 <dgilmore> #action dgilmore to update the sop on how to run repoclosure
17:00:17 <dgilmore> #topic  #6587 PDC ownership in Fedora
17:00:23 <dgilmore> https://pagure.io/releng/issue/6587
17:00:38 <dgilmore> so we need someone to step up and own pdc in fedora
17:00:55 <dgilmore> Ralph does not have time for it in his current role
17:01:04 <dgilmore> there is a lot more we can and should do with it
17:01:27 <dgilmore> we just need to have someone who can look after it and help drive new features for us with upstream
17:01:39 <masta> Do we have an existing postgres database we can host it on?
17:01:48 <masta> maybe the place koji db runs?
17:01:50 <dgilmore> we are currently running an old version as no one has had the time to do the upgrades
17:02:02 <dgilmore> masta: we have it up and running
17:02:06 <masta> oh!
17:02:06 <dgilmore> masta: it's all in place
17:02:15 <dgilmore> just not looked after and loved as it should be
17:02:32 <dgilmore> https://pdc.fedoraproject.org/
17:03:44 <dgilmore> #info someone is needed to shepherd and love PDC for us
17:03:59 <masta> I only just started messing with pdc, so I'm not going to latch on to this, but I'll look into it to potentially help here. Okay?
17:04:40 <dgilmore> #info we need someone to guide and shepherd it for Fedora's needs. be involved in upstream. ensure we get code updates and make the most of what PDC offers us
17:04:48 <dgilmore> okay cheers
17:05:13 <dgilmore> #topic Open Floor
17:05:24 <dgilmore> does anyone have anything?
17:05:50 <mboddu> nothing from me
17:06:29 * nirik thinks everything he had was already covered.
17:07:51 <dgilmore> let's wrap up then
17:07:54 <dgilmore> Thanks all
17:07:57 <dgilmore> #endmeeting