15:37:31 <dgilmore> #startmeeting RELENG (2014-08-11)
15:37:31 <zodbot> Meeting started Mon Aug 11 15:37:31 2014 UTC.  The chair is dgilmore. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:37:31 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:37:41 <dgilmore> #meetingname releng
15:37:41 <zodbot> The meeting name has been set to 'releng'
15:37:50 <masta> neat trick that addchair
15:37:54 <dgilmore> #chair dgilmore nirik tyll sharkcz bochecha masta pbrobinson
15:37:54 <zodbot> Current chairs: bochecha dgilmore masta nirik pbrobinson sharkcz tyll
15:38:07 * pbrobinson is here
15:38:13 * nirik is here.
15:38:19 <dgilmore> #topic init process
15:38:20 * masta waves
15:38:33 * bochecha is here
15:39:18 <dgilmore> #topic #5931 [Proposal] Move new branch and unretire requests to pkgdb2
15:39:25 <dgilmore> https://fedorahosted.org/rel-eng/ticket/5931
15:39:35 <dgilmore> no pingou
15:40:56 <dgilmore> okay lets move on
15:41:12 <dgilmore> #topic #5959 Enable keep-alive connections for koji (primary and secondaries)
15:41:18 <dgilmore> https://fedorahosted.org/rel-eng/ticket/5959
15:41:37 <dgilmore> tyll: so from memory we used to and it caused issues so we turned it off
15:42:03 <dgilmore> but that was a long time ago
15:42:09 <dgilmore> so we can revisit it
15:43:36 <dgilmore> nirik: maybe we should just turn keep alive on and see if anything breaks
15:43:52 <nirik> sure. is that in koji or apache or squid or ?
15:43:58 <pbrobinson> we could possibly do it on a secondary koji first to see?
15:44:01 <dgilmore> apache on the hubs
15:44:15 <dgilmore> pbrobinson: yeah.
15:44:36 * danofsatx-work is sitting in, as usual
15:44:37 <pbrobinson> if I know what issues to look for happy to have arm.koji be the guinea pig
15:45:05 <nirik> sure. Could do a secondary first to see, then do primary...
15:45:34 <dgilmore> pbrobinson: it was years ago, and I honestly don't remember the specifics
15:46:19 <dgilmore> i think its related to how koji uses raw sockets for ssl auth
15:46:26 <dgilmore> but thats a guess
15:46:34 <pbrobinson> well still happy to do arm.koji and keep an eye out
15:46:50 <dgilmore> #action pbrobinson to turn on on arm koji and see how it goes
15:47:20 <dgilmore> lets see how that goes and look at the rest for next week
15:47:29 <dgilmore> #topic #5870 rawhide signing
15:47:36 <dgilmore> https://fedorahosted.org/rel-eng/ticket/5870
15:48:01 <dgilmore> tyll: this is mostly working right? well other than primary sigul being down right now
15:48:34 <dgilmore> sigul got really unhappy yesterday when i poked at it. texlive seemed to make it lock up
15:48:41 <nirik> it's up again now.
15:48:46 <nirik> processing texlive for f21
15:48:57 <pbrobinson> texlive would make most things lock up and/or cry....
15:49:07 <dgilmore> its a nasty piece of work
15:52:00 <nirik> the one part where that hiccups the most is
15:52:07 <nirik> when it's signing the 1.5+gb src.rpm
15:52:19 <dgilmore> yeah
15:52:46 <dgilmore> at some point we really need to try get some of the bugs in sigul fixed
15:53:12 * nirik nods
15:53:12 <masta> yeah
15:53:31 <dgilmore> I guess we need an update from tyll on where the code etc is and how its all going. then close this as done
15:53:45 <dgilmore> unless we want to look at gating builds
15:53:47 <masta> I've noticed it's more stable on the ppc64 side, less prone to get stuck on large batches
15:54:40 <nirik> dgilmore: yeah. we might, but that could be a separate ticket.
15:55:09 <dgilmore> nirik: yeah.
15:55:39 <dgilmore> #action tyll provide update and status on where the code is.  then close this ticket
15:55:49 <dgilmore> #topic #5914 Move fedmsg based blocking service to Fedora Infrastructure
15:55:58 <dgilmore> https://fedorahosted.org/rel-eng/ticket/5914
15:56:20 <dgilmore> need to see where tyll is with this
15:56:46 <dgilmore> #action tyll provide status and let us know anything we need to do
15:56:58 <dgilmore> #topic Secondary Architectures updates
15:56:58 <dgilmore> #topic Secondary Architectures update - ppc
15:57:15 <masta> hey
15:57:17 <dgilmore> pbrobinson: masta: sharkcz: hows things with ppc
15:57:48 <pbrobinson> I believe we now have all the core toolchain bits for PPC-LE (binutils/gcc/glibc) etc so it should start moving forward again RSN
15:57:54 <masta> pretty good, I think we are still working out python stuff.
15:58:05 <pbrobinson> masta: I believe that should now be fixed?
15:58:58 <masta> pbrobinson: possibly, I'm not 100% sure... but there was some outlying issue I thought on Friday. (forgot the details)
15:59:57 <masta> nothing more to report.
16:00:26 <dgilmore> anything else to update here?
16:00:41 <pbrobinson> for ppc or all 2ndary?
16:00:50 <dgilmore> ppc
16:00:55 <pbrobinson> none
16:00:58 <dgilmore> #topic Secondary Architectures update - s390
16:01:07 <dgilmore> sharkcz is not here.
16:01:24 <pbrobinson> there's still an outstanding bug in gcc that we're awaiting confirmation about here
16:01:32 <pbrobinson> it affects openssl on s390
16:01:42 <pbrobinson> but not sure what other impact
16:01:48 <dgilmore> yeah, thats a general bug though right?
16:02:04 <pbrobinson> awaiting confirmation from Jakob
16:02:15 <dgilmore> okay
16:02:22 <pbrobinson> the other ABI issue has the fix landed
16:02:29 <dgilmore> awesome
16:02:36 <dgilmore> #topic Secondary Architectures update - arm
16:02:41 <pbrobinson> going strong
16:02:42 <dgilmore> pbrobinson: ill let you give this
16:03:02 <pbrobinson> a few packages I need to beat up (one has issues with something NEON based)
16:03:22 <pbrobinson> but other than that looking really good.... if I do say so myself ;-)
16:03:33 <pbrobinson> that's it
16:03:51 <dgilmore> cool
16:04:02 <dgilmore> #topic Open Floor
16:04:07 <dgilmore> so i have one item
16:04:17 <dgilmore> there will be an extra mass rebuild this week
16:05:11 <pbrobinson> this is good for secondary arches as it gives us opportunity to fix up issues that appear on all of them (various and none the same)
16:05:20 <masta> when does it start?
16:05:22 <pbrobinson> but obviously not nice in general
16:05:32 <pbrobinson> Thursday night (UK time)
16:06:13 <tyll> I am now here as well
16:06:19 <dgilmore> need to refactor the mass rebuild script to do things right
16:06:37 <nirik> dgilmore: 21 and 22? or?
16:06:43 <dgilmore> but I believe it should all go pretty smoothly
16:06:50 <dgilmore> nirik: yeah both f21 and f22
16:07:07 <nirik> and only a subset of archfull ones?
16:07:13 <pbrobinson> but only binaries
16:07:17 <pbrobinson> so noarch are OK
16:07:23 <dgilmore> only archful packages
16:07:46 <nirik> all of them? or could we find out a subset based on time the bug was around?
16:08:06 <masta> tool chain issues force the mass rebuild?
16:08:13 <dgilmore> gcc bugs
16:08:38 <dgilmore> https://fedorahosted.org/rel-eng/ticket/5962
16:08:43 <dgilmore> has all the details
16:11:02 <dgilmore> anyone else have anything?
16:11:10 <dgilmore> tyll: want to give up some updates?
16:11:59 * pbrobinson has nothing more, will need to run momentarily
16:12:04 <masta> I believe I am on sign/mash/push duty this week
16:12:06 <tyll> dgilmore: the code for the autosigning is in the rel-eng repo (autosigner.py)
16:12:15 <dgilmore> tyll: excellent
16:12:33 <tyll> https://fedorahosted.org/rel-eng/ticket/5870#comment:15 has some open questions that I identified
16:13:41 <tyll> so to use it finally I believe we need to use a gating tag to make sure that everything is really signed, currently there is no code to ensure this, but only everything is signed when it is tagged
16:14:34 <tyll> an open task where I cannot do anything is to see, if there is now enough space on the secondary kojis to store the signed RPMs
16:14:51 <nirik> sigul seems stuck on texlive again.
16:15:13 <dgilmore> tyll: we really have very little insight into the storage availability on the kojis
16:15:36 <tyll> nirik: the autosigning script is also trying to sign it
16:15:50 <nirik> we were supposed to have a bunch more space, but when I asked last they said it wasn't available... will keep following up with them
16:16:20 <nirik> tyll: yeah, thats where it's sticking (well, I think it's siguls fault, not the script)
16:16:57 <dgilmore> nirik: likely sigul
16:17:25 <tyll> nirik: currently it is still waiting for koji within a reasonable timeout, so maybe it will work eventually
16:17:34 <tyll> s/koji/sigul
16:19:08 <tyll> there is also still a potential issue with garbage collection on koji, because signed RPMs will be removed after 12 weeks, therefore if Rawhide packages are not rebuilt within 12 weeks, the RPM will only be published unsigned
16:19:50 <tyll> for the last one I had the idea to make mash write out signed RPMs if a signature exists, but then it will require an admin koji certificate
16:20:13 <dgilmore> tyll: they only get cleaned up if not the latest
16:20:47 <tyll> I see, then it is not a problem
16:21:56 <tyll> so do you want to discuss about how to implement the gating tag or want to know something about the signing script?
16:22:17 <nirik> I think the gating tag would be great for rawhide.
16:22:29 <nirik> Not sure it's worth it for f21 at this point... since it's going to move to bodhi soon
16:23:01 <dgilmore> I am thinking that using a gating tag might be a great way to work on not having rawhide so broken.
16:23:18 <dgilmore> force things to stay in the gating tag until they pass some sanity checks
16:24:26 <tyll> yes, but for rawhide signing there should be one tag that only means the build is not yet signed or it still needs to be signed if it is not yet tagged
16:24:32 <nirik> dgilmore: yep.
16:24:56 <tyll> otherwise the progress needs to be stored somewhere else to make sure that every build is signed
16:25:28 <masta> I like the idea  of the tag, just sign the tag, and then presumably untag the builds from there.
16:25:32 <nirik> yeah, so... f22-unsigned (sign it) -> f22-pending -> some checks at compose time at least -> f22
16:25:53 <dgilmore> nirik: something like that sounds good
16:26:12 <dgilmore> nirik: probably though id go f22-pending -> f22-unsigned -> f22
16:26:54 <dgilmore> where pending is in the buildroot but wont get shipped out until its passed checks etc, and we dont sign it until its passed checks
16:26:58 <nirik> but that wouldn't allow for compose time checks would it? I guess it could untag
16:27:04 <dgilmore> we will need to do the same for secondaries
16:27:12 <dgilmore> will take some working out the workflow
16:27:17 <nirik> so checks would be after build time only?
16:27:31 <nirik> yeah.
16:27:38 <dgilmore> what checks are you thinking?
16:28:28 <nirik> not sure, I think we talked about some compose time checking long ago...
16:28:37 <nirik> but perhaps it doesn't make sense if we have build time
16:28:51 <dgilmore> ideally we catch things before compose time
16:28:59 <nirik> we should invite tflink and see what his thoughts are around it.
16:29:04 <dgilmore> but having sanity checks on the compose would be good also
16:30:10 <dgilmore> lets table the discussion and work out a plan of attack
16:30:36 <tyll> so is it ok to create a f22-unsigned tag that I can use to implement the code by tagging builds in there manually?
16:30:51 <dgilmore> tyll: sure.
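
The tag flow discussed above can be sketched roughly as follows. This is a minimal illustration only, assuming dgilmore's ordering (f22-pending -> f22-unsigned -> f22). The `advance` function, the `tags` dict and the build name are hypothetical and make no koji or sigul calls; the real workflow was still to be worked out.

```python
# Hypothetical sketch of the gating-tag flow (f22-pending -> f22-unsigned -> f22).
# No koji or sigul calls are made; tag moves are simulated with a plain dict.

PENDING, UNSIGNED, FINAL = "f22-pending", "f22-unsigned", "f22"


def advance(tags, build, passed_checks, is_signed):
    """Move one build at most one step forward and return its new tag."""
    current = tags[build]
    if current == PENDING and passed_checks:
        tags[build] = UNSIGNED  # checks passed, now waiting for a signature
    elif current == UNSIGNED and is_signed:
        tags[build] = FINAL     # signed, eligible to be shipped
    return tags[build]


# Example build name is made up for illustration.
tags = {"foo-2.0-1.fc22": PENDING}
advance(tags, "foo-2.0-1.fc22", passed_checks=True, is_signed=False)
advance(tags, "foo-2.0-1.fc22", passed_checks=True, is_signed=True)
print(tags["foo-2.0-1.fc22"])
```

Keeping the current tag as the only state means progress does not need to be stored anywhere else, which is the property tyll asked for: a build that is still in the unsigned tag is, by definition, not yet signed.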
16:32:26 <tyll> ok, then regarding the fedmsg blocking service: Did I get it right that retired EPEL packages should be blocked, all builds untagged and the pkg unblocked in the build tag?
16:32:57 <dgilmore> tyll: yeah.
16:33:12 <nirik> yep.
16:33:30 * nirik isn't sure all builds being untagged matters, but I guess it can't hurt
16:34:05 <tyll> the untagging is there in case the package needs to be re-introduced e.g. as ppc-only after it went to RHEL for x86_64
16:34:15 <dgilmore> nirik: i think if its not, the epel build will show up in the buildroot and not the rhel version
16:34:33 <nirik> weird. even tho it's blocked?
16:34:57 <tyll> nirik: if it is re-introduced it needs to be unblocked again in the future
16:34:59 <dgilmore> yeah, it shows up in latest-pkg due to inheritance when unblocked in the -build tag
16:35:15 <tyll> oh, or that
16:35:16 <nirik> neat.
16:35:28 <nirik> wonder if we have some that are blocked that we have not untagged right then
16:35:34 <dgilmore> likely
16:37:58 <dgilmore> anything else? or should i wrap up?
16:38:00 <tyll> to deploy the fedmsg blocker service, we still need to decide whether to use a current or a new koji certificate
16:38:18 * nirik would say new...
16:38:29 <nirik> just to make sure it's separate
16:39:06 <tyll> nirik: will you create one and store it in private ansible?
16:39:09 <dgilmore> its easy to make one
16:39:13 <dgilmore> lets go with new
16:39:29 <tflink> nirik: interestingly enough, we were talking about how to better support repo checks @ push time earlier today
16:39:33 <nirik> sure, or dgilmore can.
16:39:55 <tflink> the conclusion was that we need to gather use cases before going too much farther forward with implementation
16:40:07 <nirik> tflink: cool. :) I think really all we need now is to make sure whatever setup we move to has the tags setup so we can check in both after build and before/during compose.
16:40:09 <dgilmore> tflink: okay
16:40:17 <nirik> yeah, thats a good idea too.
16:40:29 <tyll> will the certificate be valid for a long time or does it need to be updated every 6 months?
16:40:37 <dgilmore> tyll: 10 years
16:42:02 <tflink> I was talking with someone else about checks during/before push recently (I think it may have been lmacken but not sure) and one thought was to use fedmsg: send out a fedmsg at push start, taskotron checks that push contents are OK and emits fedmsg which the push process is waiting for
16:42:33 <nirik> could work. if taskotron could work on an entire repo instead of a package(s)
16:43:18 <dgilmore> we could mash the repo then run checks
16:43:22 <nirik> or... I wonder, if the checks were fast enough... we could just do those after each build
16:43:33 <dgilmore> then deal with breakage according to policy and remash
16:43:36 <tyll> can we use such a certificate for the autosigning as well? But I guess it won't work, because the certificate is also used for sigul, isn't it?
16:43:47 <dgilmore> though deltarpm probably would suck
16:43:51 <nirik> ie, foo-2.0 is built, check the entire repo for some common breaks, if it is broken, don't allow foo-2.0 in
16:44:12 <dgilmore> tyll: the autosigning needs to be done by a real user
16:44:20 <nirik> it would be probably much better to check at build time, then we could prevent problems
16:44:24 <dgilmore> tyll: and the keys are per person
16:44:37 <dgilmore> so if i kicked it off it would be done as mee
16:44:38 <dgilmore> me
16:45:51 <tyll> dgilmore: yes, this is how it is currently done
16:46:25 <tflink> nirik: actually, most of the checks are already tag-based and modified to report per-update/package
16:47:18 <nirik> tflink: yeah, thinking on it, doing them before compose time would be better, because at compose time all we can do is fail the compose, but at package build time we could reject the package entering the compose and avoid the breakage.
16:47:31 <nirik> I'll go back and look at what we were thinking for compose time stuff.
16:48:12 <tflink> when we (qa folks) were talking about it earlier today, one idea we had was for multiple levels of "strictness"
16:48:45 <nirik> the conversation I remember was from a few years back and we might have not thought we would have anything like taskotron...
16:48:51 <nirik> or fedmsg
16:49:02 <tflink> one level for what we're currently doing where strict rules will cause as many problems as anything but use a more strict form of upgradepath/depcheck @ push time when we know exactly what is about to be pushed and probably want to be more strict
16:52:32 * nirik nods.
16:52:54 <nirik> with a pending tag system we can also allow maintainers to override... just tag in directly.
16:53:51 <dgilmore> i think id rather not allow that
16:54:06 <nirik> ok, then we would need some other way to allow...
16:54:14 <tflink> getting the basic design for qa systems to support this is one thing that we want to get started this week, so if you all have use cases around how this might work, that would be appreciated
16:54:38 <tflink> specifically, the ability to override results and what data needs to go where in what format
16:55:01 <dgilmore> nirik: id rather use some app that they have to give a reason for it
16:55:28 <nirik> dgilmore: ok. I really would expect it to be very rare tho... and we can always see who tagged it and ask them.
16:55:37 <dgilmore> make people put some thought into why they need to force something in
16:55:49 <nirik> or even note that in the compose... "foo was manually tagged in by kevin"
16:56:12 <nirik> yep
16:57:02 <dgilmore> i really would like a tool that shows the differences between rpm builds
16:57:17 <dgilmore> i could see the override going into that
16:57:18 <nirik> some folks were touting rpmgrill
16:57:31 <dgilmore> where you give a big fat "there is a soname bump" etc message
16:57:38 <tflink> IIRC, rpmgrill can't quite do that
16:57:47 <dgilmore> not heard of rpmgrill
16:57:48 <tflink> it's more of a collection of rpm tools
16:57:52 <nirik> ok
16:58:16 <tflink> but we'll likely have an rpmgrill task in taskotron before too long. the env-and-stacks folks want it
16:59:03 <nirik> anyhow, the common case for overriding I see is something that causes broken deps, but is an important security/bugfix and the thing that breaks needs upstream work...
16:59:20 <nirik> (if we block on broken deps, which I would love to see)
16:59:45 <tflink> the other case is when we find a bug in the tools :)
16:59:58 <nirik> yeah.
17:00:03 <tflink> not that I think the tools are buggy but I also know that they aren't perfect
17:00:34 <nirik> also if we key on moving into a tag, we can retag it into the candidate tag to make it retest once other things are fixed.
17:00:38 <tflink> there is at least one situation that we haven't thought of and that the tools may not handle well
17:01:30 <dgilmore> tflink: possibly more than one
17:02:15 <tflink> dgilmore: probably more than one :) that's why I said "at least", though
17:02:28 <nirik> rawhide also presents some problems.
17:02:35 <nirik> since things are not in update bundles.
17:02:49 <nirik> so how can it know that libfoo, bar, baz are all to be tested at once.
17:03:36 <dgilmore> nirik: need to test the whole batch in -pending as an update
17:03:52 <tflink> that's what we're doing for non-rawhide right now, anyways
17:04:08 <tflink> just grab the whole tag and work on that as a group
17:04:08 <nirik> but then if you build libfoo, it gets tagged into pending, and while you are building bar/baz it gets rejected and untagged...
17:04:22 <nirik> anyhow, we can sort this all out I'm sure.
17:04:35 <tflink> we'd have to do a "rawhide-pending" tag to make it work, I think
17:08:31 * dgilmore needs to run to catch a plane
17:08:36 <dgilmore> nirik: want to finish up
17:12:52 <nirik> sure, anything else for open floor?
17:16:00 <nirik> ok, thanks for coming everyone!
17:16:02 <nirik> #endmeeting