17:00:48 <Oxf13> #startmeeting Fedora Release Engineering
17:00:48 <zodbot> Meeting started Fri Aug 20 17:00:48 2010 UTC.  The chair is Oxf13. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:48 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
17:00:58 <Oxf13> #meetingname fedora-releng
17:00:58 <zodbot> The meeting name has been set to 'fedora-releng'
17:01:03 <Oxf13> #topic Roll Call
17:01:16 * nirik is lurking around in the cheap seats.
17:01:32 <Oxf13> ping: notting dgilmore lmacken spot wwoods rdieter nirik poelcat
17:02:01 * notting is here
17:02:33 * jsmith lurks
17:04:06 <Oxf13> small crew today
17:04:21 <Oxf13> #info present are nirik notting jsmith and Oxf13
17:04:30 <Oxf13> #topic Fedora 14 Alpha
17:04:41 <Oxf13> so I was on leave, somebody else want to recap this?
17:05:06 <nirik> we had a 1 week slip due to a blocker bug...
17:05:26 <nirik> but things should be going out to mirrors for a release next week...
17:05:39 <notting> dgilmore composed the RCs. we had a moment where we thought we needed a RC5, but the bug in question was deemed not a blocker
17:05:57 <notting> right now, we're GO for on-schedule release next week.
17:06:04 <notting> #info go for on-schedule alpha release next week
17:06:38 <notting> as mentioned on the other channel, i didn't know who is taking the responsibility of the push (and therefore most of the rest of the alpha tickets)
17:06:51 <Oxf13> I know there will probably be a retrospective at some point, but from our point of view, what could have been done differently to avoid the slip?
17:07:17 <Oxf13> #info oxf13 responsible for ensuring the push and the rest of the tickets get done.
17:07:23 <Oxf13> #info dgilmore will be staging the bits today
17:07:43 <Oxf13> #question What could have been done differently to avoid the slip?
17:07:50 <Oxf13> (is #question a real command?)
17:08:07 <tibbs> Higher caliber weapons?  More advanced torture devices?
17:08:39 <notting> Oxf13: 1) land python 2.7 earlier 2) land systemd earlier 3) don't have gnome reverting at the same time
17:08:40 <nirik> well, if we had more testable things sooner we might have spotted the bug...
17:09:02 <notting> Oxf13: #1 and #2 are under fedora's control, to some extent. #3 less so.
17:09:07 <jsmith> Right... the bug wasn't directly related to python or systemd, but those things certainly contributed to the lack of testing
17:10:02 <Oxf13> ok.  I had previously expressed extreme distaste with landing python so late, and I was unaware of boost happening at the same time
17:10:09 <jsmith> (and yes, there's already a retrospective going on at https://fedoraproject.org/wiki/Fedora_14_Schedule_Retrospective)
17:10:14 <Oxf13> I hadn't directly talked about systemd but I was also upset about that
17:10:17 <notting> the bug was in basic video mode handling, correct? so it was just a run-of-the-mill anaconda regression?
17:10:32 <Oxf13> unfortunately we gave people a feature deadline and they hit it, so it's not the developer's fault as much
17:10:40 <notting> jsmith: essentially, the late-landings led to a late RC
17:10:52 <jsmith> notting: Yeah, I'd say that's a fair assessment
17:11:32 <notting> Oxf13: also, there were some rel-eng process delays, but i don't think those really caused the slip
17:12:29 * poelcat here
17:12:31 <jsmith> That's the thing -- it's hard to point to one big fault
17:12:39 <jsmith> Instead, it was a bunch of small things that added up
17:13:10 <notting> if i had to pick one thing to concentrate on, it would be figuring out a way to get the RCs out sooner
17:13:13 <Oxf13> sure, that's almost always the case
17:13:45 <Oxf13> from my POV, and after talking to some developers, trying to set different dates for different features seems kind of lame
17:13:59 <Oxf13> so instead, I think we just need to increase the time between feature freeze and Test Compose/RC
17:14:12 <Oxf13> perhaps 2 weeks instead of 1
17:14:51 <notting> perhaps. did we actually get a TC before we got to the RC-ish date?
17:15:30 <Oxf13> we had a few TCs but they were basically stillborn
17:15:46 <Oxf13> couldn't get any meaningful testing out of them due to very early crashes
17:15:53 <Oxf13> the TC compose effort turned into the RC compose effort
17:16:34 <jsmith> And even the RCs were a bit rough, as I recall
17:16:52 <poelcat> yes, after two mis-fires with the TC we moved to the RC
17:16:58 <Oxf13> yeah, we had no blockers on the list so we could compose the RCs, but we had no blockers because we couldn't test anything to see what was wrong.
17:17:23 * poelcat thought it was positive that we resolved all the blockers before moving the RC
17:17:23 <notting> istr we got *some* TC testing w/updates images. but not a great deal.
17:17:30 <poelcat> it was a crunch but people got it done
17:19:48 <Oxf13> poelcat: that is positive, but as we see there was just stuff lurking behind the doors we couldn't previously open
17:20:06 <Oxf13> so I have a proposal for next release.
17:20:07 <poelcat> IMHO this seems like more of an issue for FESCo to dive into and solve rather than releng
17:20:48 <notting> Oxf13: don't have everyone leave the same weekend. j/k
17:20:51 <Oxf13> #idea Have two weeks of time between feature freeze and Alpha RC
17:21:31 <notting> Oxf13: we can certainly propose that.
17:21:49 <notting> an idea that was proposed to me - compose the alpha RCs from a separate tag, while the branched tree can still move on
17:21:59 <Oxf13> poelcat: while I agree that it's a FESCo thing, our group should identify at our level what would have made a difference.
17:22:22 <Oxf13> notting: that's how we used to do it, before N-F-R, and there was a lot of confusion around what got into the alpha tag, and how it got in
17:22:27 <Oxf13> and it involved lots of releng tickets
17:22:32 <Oxf13> which developers didn't like
17:22:39 * jsmith would point out that there's still some confusion
17:22:56 <Oxf13> it also meant that what was produced nightly was /not/ what was in the alpha
17:23:00 <Oxf13> which reduced our testing ability
17:23:30 <Oxf13> "It was working in the nightly tree, why wasn't that in Alpha?"
17:23:43 <jsmith> My opinion is that the new way is better -- it just hasn't been well explained
17:24:16 <notting> Oxf13: perhaps. a lot of this request comes from the extended slip, where everything freezes for a week while we verify one bug and re-run through the qualification
17:24:23 <Oxf13> right now, the push to stable is at a dead stop when we enter RC phase.  We could instead still do a push of selected things to stable, but that puts the tree at risk.
17:24:32 <Oxf13> notting: yep.
17:24:38 <nirik> I think it might be worth considering an earlier go/nogo if there's a slip.
17:25:27 <nirik> ie, if you slip and then fix the blockers, have the next go/nogo at the time all the blockers are solved.
17:25:42 <notting> Oxf13: perhaps when we reach a certain level of RC-ness, we tag the alpha then.
17:26:24 <nirik> in this case it would have let us unfreeze monday most likely, instead of waiting until wed, for thursdays compose.
17:26:43 <Oxf13> nirik: that could work.  We wouldn't move the ship date earlier, but we could move the compose date earlier.
17:26:49 <nirik> right.
17:27:01 <notting> should we discuss this at fesco next week too?
17:27:02 <Oxf13> notting: that's a possibility, but I see risk in that
17:28:16 <poelcat> Oxf13: how does moving the compose date earlier help? in the case of the alpha it took a full week TC --> RC (thurs to thurs) before we had something usable
17:28:21 <Oxf13> notting: yeah, this is probably at a higher level a FESCo issue, I just wanted to have some ideas floated and discussed from our level.
17:28:41 <Oxf13> poelcat: We were talking about in the event of a slip.
17:29:09 <Oxf13> poelcat: if we slip for an issue, and have a fix the very next day, we could compose another RC and try to get to a go/nogo earlier than a full week.  May not be possible, depends on QA resources
17:29:30 <poelcat> ahh
17:30:07 <notting> we don't want to move the full go/no-go meeting. just get the engineering/QA 'yup, we're a go' stamp earlier.
17:30:26 <poelcat> so for the f14 alpha how many days earlier could we have gotten the stamp from QA?
17:30:48 <Oxf13> I have no idea
17:30:56 <nirik> poelcat: probably monday.
17:30:59 <notting> at least two, i think
17:31:44 * poelcat recalls that full testing wrapped up on wed sometime
17:31:45 <Oxf13> poelcat: the "problem" we're looking to solve is having the stable repo remain frozen for an extended period of time in the event of a slip
17:31:59 <nirik> if we had decided the x issue was not a blocker, it would have been sooner... I think we had a fix for the blocker friday.
17:32:40 <Oxf13> nirik: I think we'd have to talk to QA though.  They may have needed the rest of the week to confirm that there were no other bugs
17:33:03 <poelcat> Oxf13: sort of related, is this page correct? https://fedoraproject.org/wiki/Change_deadlines
17:33:08 <poelcat> i created it while you were out
17:33:19 <Oxf13> poelcat: will review after meeting
17:33:48 * poelcat thinks some of the details might be slightly off
17:33:51 <poelcat> thanks
17:33:55 <notting> should we move on?
17:36:42 <Oxf13> yeah, probably
17:36:49 <Oxf13> what conclusions have we reached for the notes?
17:37:05 <Oxf13> #info propose 2 weeks between Feature Freeze and RC
17:37:17 <notting> #info discuss possible unfreezing earlier with fesco
17:37:18 <Oxf13> #info Investigate "unfreezing" sooner in the case of a slip
17:37:24 <notting> whoops
17:37:56 <Oxf13> I guess the alpha tag falls under that one
17:38:12 <Oxf13> #info investigate earlier go/no go decision in case of a slip
17:38:17 <Oxf13> ok, let's move on.
17:38:25 <Oxf13> #topic dist-git
17:38:39 <Oxf13> can't remember if we've had a meeting since this happened
17:39:08 <Oxf13> oh we did
17:39:17 <Oxf13> #info fedpkg updates have been going through resolving lots of issues
17:39:39 <Oxf13> #info still some design issues, such that using a local private branch that does not track any remote branch breaks some fedpkg actions
17:40:15 <Oxf13> I'll continue to develop on it.
17:41:02 <Oxf13> anything else on this?
17:42:20 <notting> not from me.
17:42:25 <Oxf13> ok.
17:42:32 <Oxf13> #topic SOPs
17:42:42 <Oxf13> so our SOPs got a workout the last couple weeks as I was on leave.
17:42:53 <Oxf13> I believe that notting found a number of holes in them that need updating
17:42:58 <Oxf13> correct?
17:43:23 <notting> yeah. i sent what i found along in e-mail.
17:43:33 <Oxf13> yep, I've flagged that
17:43:36 <notting> i can post it somewhere more public if we need to
17:43:46 <Oxf13> I also still have some gnote notes on further SOPs
17:43:53 <Oxf13> poelcat: are you willing to do some legwork on these again?
17:43:54 <notting> some of it was SOP, some of it was system setup/puppet tweaks. i think i fixed most of the infrastructure things
17:44:22 <poelcat> Oxf13: absolutely
17:44:23 <nirik> I can try and write up an updates pushing SOP if folks like. (I would need help on the non epel parts of it tho)
17:44:32 <Oxf13> nirik: that would rock
17:44:33 <poelcat> otherwise one of my annual personal goals will fail :)
17:44:43 <notting> Oxf13: one thing i didn't write mail on... the docs we have on sigul are... not good.
17:44:57 <Oxf13> poelcat: ok, thanks.  trying to divide up time between responding to releng tickets, doing fedpkg work, and getting some wiki work done.  I'll send some stuff along.
17:45:11 <Oxf13> notting: yeah, some of that is on purpose, some isn't.
17:45:12 <poelcat> Oxf13: i'm out all next week, but maybe open tickets against me or something like that
17:45:24 <Oxf13> poelcat: ok
17:45:29 * poelcat was also hoping to do a run through of release SOP page and figure out where we have gaps
17:45:59 <poelcat> i think more of them are done than we link to
17:46:06 <poelcat> which is a good problem to have
17:46:12 <dgilmore> gahh im here now
17:46:14 <Oxf13> indeed!
17:46:59 <Oxf13> dgilmore: no worries, we've assigned all work items to you
17:47:18 <dgilmore> Oxf13: ok
17:47:42 <dgilmore> Oxf13: in that case we are going to support sparc as a primary arch :)
17:47:44 <Oxf13> #info oxf13 to send some SOPs to poelcat
17:47:57 <Oxf13> #info Oxf13 to update some SOPs based on notting's experiences running things.
17:48:25 <Oxf13> I guess those were more of action than info, but oh well.
17:48:33 <notting> dgilmore: if you've got other SOP notes from running through them, send them to Oxf13
17:48:41 <Oxf13> #topic open floor
17:48:48 <Oxf13> alright, any topics from you folks?
17:49:01 <notting> yup
17:49:20 <dgilmore> notting: most of the issues i ran into were broken mock configs that you updated
17:49:24 <notting> first of all, i want to extend an official thanks to jwb for his work on rel-eng the past few years
17:49:34 <dgilmore> they worked for one use case but not the other that uses them
17:49:43 <nirik> agreed. Hurray for jwb.
17:49:45 <dgilmore> notting: indeed many cheers for him
17:50:23 <notting> #agreed rel-eng thanks josh boyer for his rel-eng service
17:50:52 <Oxf13> super agree here
17:51:06 * jsmith agrees as well
17:51:50 <notting> ok, and an operational question. should we note down somewhere who is the primary person pushing updates for what? just to avoid confusion and to have an easy fallback path when people are out
17:52:02 <dgilmore> notting: probably
17:52:11 <dgilmore> ive been doing it the last couple of weeks
17:52:22 <notting> i've been doing f14 only
17:52:37 * nirik has been pushing el4/el5, but would be happy to do more if folks would like.
17:52:59 <Oxf13> I was doing all the Fedora updates prior to taking leave
17:53:12 <Oxf13> and I'm willing to take on some of that again
17:53:29 <Oxf13> but I also like the idea of having nirik do some of that :)
17:53:44 <nirik> perhaps some kind of rotating schedule or something...
17:53:49 * nirik doesn't know what would be best
17:53:56 <Oxf13> that's not a bad idea
17:54:06 <notting> i'm ok with doing f14 for now. not sure i have time to push f12 + f13 + f14
17:54:26 <notting> but i'm also ok with someone else doing f14 :)
17:54:28 <Oxf13> notting: nod.  I'd say the rotating schedule would concern non-branched updates.
17:54:49 <dgilmore> Oxf13: yeah
17:55:11 <dgilmore> i need to get you to set up nirik's access in sigul.
17:55:21 <dgilmore> so he has access to do el6 also
17:55:32 <nirik> yeah, el6...
17:56:06 <Oxf13> ok.
17:56:18 <Oxf13> #action Oxf13 to get nirik set up in sigul for pushing updates
17:58:36 <nirik> thanks. happy to help
17:58:43 <Oxf13> alright, anything else?
17:58:57 <dgilmore> i have nothing else
17:59:09 <notting> so, for now dgilmore + nirik are doing el*, i'm doing f14, and f12/f13 is ...?
17:59:19 <dgilmore> notting: me
17:59:36 <Oxf13> #idea set up a rotating schedule for Fedora updates pushes
17:59:41 <Oxf13> at least for this week
17:59:53 <Oxf13> I may take over Fedora stuff next week to give dgilmore a break
18:01:27 <poelcat> how technical does a person need to be to help with pushes?
18:02:13 <poelcat> IOW could I help?
18:03:18 <notting> not sure it's 'technical' as much as 'understand somewhat the guts of the infrastructure to debug when it goes wrong'
18:04:01 <poelcat> notting: so there's stuff not covered in the SOPs that's more gained by experience?
18:04:15 <notting> well, ideally it wouldn't go wrong, and you wouldn't need to know
18:04:22 <dgilmore> poelcat: why things fail is wide and varied
18:04:32 * nirik has a list tho to add to a sop. ;)
18:04:33 <dgilmore> and you cant cover all failure cases
18:05:05 <poelcat> dgilmore: right, that's the 'experience' component I was suggesting
18:05:47 <Oxf13> yep
18:05:56 <Oxf13> it takes some experience to recognize the failure states and work from there
18:06:18 <Oxf13> or at least having somebody else with that experience on hand to look over issues, which is how jwb got up to speed
18:06:40 <Oxf13> he took over the pushing and when he ran into issues he pinged me about them.  Eventually he was able to use his experience to figure things out on his own
18:07:49 <dgilmore> i imparted onto nirik what knowledge i had of the most common cases with epel pushes
18:08:19 <Oxf13> ideally we should document the issues as we hit them in a SOP for pushes
18:08:21 * nirik wrote them down. ;)
18:08:39 <dgilmore> hopefully the newer bodhi fixes many of them
18:08:56 <Oxf13> heh, that's been the mantra for years
18:09:21 <dgilmore> Oxf13: biggest issue for epel has been making sure packages meet epel's testing policies
18:09:32 <dgilmore> and bodhi has something for support of that now
18:10:02 <Oxf13> cool
18:11:10 <nirik> dgilmore: was going to talk to luke about that...
18:11:18 <nirik> as it doesn't seem to be enabled/working. ;(
18:11:31 <Oxf13> Alright, I'd like to wrap up here
18:12:18 <dgilmore> nirik: ok
18:12:23 * dgilmore is done
18:12:26 * dgilmore needs food
18:13:33 <Oxf13> ok.
18:13:36 <Oxf13> Thanks all!
18:13:38 <Oxf13> #endmeeting