16:02:11 #startmeeting Fedora QA meeting
16:02:11 Meeting started Mon Nov 3 16:02:11 2014 UTC. The chair is adamw. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:02:11 Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:02:15 #meetingname fedora-qa
16:02:15 The meeting name has been set to 'fedora-qa'
16:02:19 #topic Roll call
16:02:21 * kparal is here
16:02:31 * satellit listening
16:02:32 ahoy, welcome to the first post-clock-change meeting, always a source of fun :)
16:02:39 * pwhalen is here
16:02:44 * roshi is here
16:03:26 .hello sgallagh
16:03:27 sgallagh: sgallagh 'Stephen Gallagher'
16:05:38 ummm....is this thing on?
16:06:07 no, it isn't
16:06:12 * adamw turns it off
16:06:24 #chair roshi danofsatx
16:06:24 Current chairs: adamw danofsatx roshi
16:06:43 alrighty, let's get rolling as it might be a packed meeting
16:07:10 oh, shoot.
16:07:17 i've just noticed I never hit Send on the meeting announce mail
16:07:22 so, the proposed agenda for the record:
16:07:30 1. Fedora 21 Beta: remaining work
16:07:35 2. Fedora 21 Final schedule
16:07:38 3. Test Days
16:07:40 4. Open floor
16:08:20 #topic Fedora 21 Beta: remaining work
16:09:02 so first thing I had under this topic is the bug that was discovered in upgrade shortly after we signed off on Beta: https://bugzilla.redhat.com/show_bug.cgi?id=1159292
16:10:24 #info major upgrade issue needs mitigating / resolving somehow for beta: https://bugzilla.redhat.com/show_bug.cgi?id=1159292
16:10:56 yeah, that's... pretty terrible
16:11:37 we've been discussing approaches to this in #anaconda for the last few minutes
16:11:48 * sgallagh goes to read the scrollback
16:12:02 we now have two viable fixes to fedup, but there's still a worrying case, which is if someone runs an upgrade without updating fedup first
16:12:20 what about the redirect to a newer upgrade.img?
16:12:45 adamw: Our statement on fedup is that it should always be up-to-date before executing.
16:12:59 kparal: where?
16:13:07 Maybe to avoid this in the future, we could have it self-update before continuing?
16:13:08 "should" - the magic keyword.
16:13:17 s/should/must/
16:13:18 we need to have a newer upgrade.img first. ;)
16:13:24 adamw: I posted it to the bug report
16:13:24 sgallagh: there's various things we can do in future, none of them relevant now.
16:13:31 kparal: i mean, where does the new upgrade.img go?
16:13:51 OK, so if this was Final, I'd be more worried.
16:13:51 adamw: some other location than test/21-Beta
16:13:54 I suppose there's the one in the branched tree... if it has a fix.
16:14:21 nirik: not yet.
16:14:24 it could, easily enough.
16:14:25 But do we generally have a lot of people upgrading to Beta who *wouldn't* hear us if we made a reasonable announcement?
16:14:35 but then the problem with that is it gets rebuilt daily, so it *could* potentially be broken any day.
16:14:53 adamw: yep
16:15:05 that's why I proposed creating test/21-Beta-fedup-fix. should be picked by mirrors automatically
16:15:10 so let's back up and quantify the problem a bit
16:15:11 and then mirrormanager can point to it
16:15:42 We have a frozen upgrade.img in the Beta tree that contains the misfeatured systemd. I don't know if releng considers it viable to just delete that, but we can't *change* it.
16:15:53 Beta release day is tomorrow.
16:16:13 We know ways to change both fedup and upgrade.img so this doesn't happen, but we don't have either fix in Bodhi yet.
16:16:39 We can provide an update for fedup for stable releases, but we can't *force* people to do it.
16:16:53 adamw: Can't we?
16:17:01 AIUI, no.
16:17:08 do you have an idea on that front?
16:17:26 adamw: Do the upgrades happen to the Beta tree or the stable tree?
16:17:35 can't fedup update itself as a "pre-stage 1" event?
16:17:46 danofsatx: a fedup we've already released can't.
16:17:57 time paradox!
16:18:01 word.
16:18:08 sgallagh: once the mirrormanager redirect is in place, it uses whatever location configured by fedora releng. before that, you need to use --instrepo manually
16:18:09 we don't need fixes for this which are design changes for fedup that will take weeks - i mean, that's not *useless*, but it doesn't fix the problem we have right now.
16:18:17 danofsatx: That was the same suggestion I was making for the future, but let's solve the present first
16:18:22 sgallagh: there are two major moving parts to fedup
16:18:56 sgallagh: the bits that run on the local system to download packages, download upgrade.img, and prepare the second stage boot, and the bits that live inside upgrade.img that mostly actually control the second stage boot
16:19:12 sgallagh: changes to the *first* stage occur in the fedup package, in the 'source' distro
16:19:18 Right
16:19:23 changes to the *second* stage occur in fedup-dracut (and systemd and whatever else), in the 'target' distro
16:19:46 right now we can update fedup for f19 and f20 to cause changes in stage1 (but we can't be 100% sure people will actually have installed those updates)
16:20:25 we can also generate a new upgrade.img , but what we can't do is replace the one in the Beta tree. we can at least plausibly have mirrormanager point elsewhere so that people who don't explicitly specify a --instrepo don't get the Beta one, but instead...some other one.
16:20:33 people would still be able to use the Beta one with an explicit --instrepo in that case.
16:20:36 What I'm wondering is whether we could stick something into the %pre section of something like basesystem such that it would abort any attempt to continue the transaction if the current system did not already have the patched version of fedup or newer
16:21:26 aborting package upgrade transactions in the middle is the thing we're trying to *solve*, isn't it? :)
16:21:29 adamw: why exactly we can't replace the one in Beta tree?
16:21:49 kparal: because it'd break the consistency of the tree, some bits of it would then not have been generated *from* it, which is a no-no by policy
16:21:51 adamw: Well, I was thinking about trying to abort it at the beginning, before anything had changes
16:21:56 But I suppose we can't guarantee the order
16:22:04 sgallagh: I don't think there's a strong enough guarantee of the ordering, yeah
16:22:07 so "just" a policy issue
16:22:30 so i'm thinking perhaps a three-pronged attack might be strong enough:
16:22:42 1. fix fedup in f19 and f20
16:22:49 2. wipe the upgrade.img from the Beta tree
16:22:59 3. build a new one, shove it somewhere else, and have mirrormanager point to that
16:23:17 in order of priority, i guess we'd switch 2 and 1 over
16:23:23 humm...
16:23:36 If it's "just" policy, I wonder if we can't make a one-time exception and move 3. into the Beta tree
16:23:42 wwoods: you here? doesn't fedup also get the signed treeinfo and such? not just the updates.img?
16:24:03 nirik: it does use .treeinfo to find the updates.img, yeah
16:24:08 I think it also downloads yum metadata
16:24:10 * nirik thinks it may break other stuff too if we muck with treeinfo
16:24:19 ...
16:24:23 it uses instrepo as a package repo if it's a valid one, i'm not sure if it requires it to be
16:24:26 I am pretty opposed to mucking with the beta tree personally.
16:24:39 nirik: i think just flat removing upgrade.img from it is as far as i'd be willing to go
16:24:43 nirik: Even removing the upgrade.img?
16:24:47 that doesn't really violate any rules afaik
16:25:02 Would that break the treeinfo?
16:25:18 note that in my plan 3 is substantially less important than 1 or 2, because the consequence of not doing it is not 'people wind up with broken systems' but 'it's more difficult for people to upgrade'
16:25:29 I guess I could handle that... if you can talk dgilmore into it. ;)
16:25:43 sgallagh: if .treeinfo wasn't re-generated it'd have a pointer to a non-existent image, i guess.
16:25:46 adamw: instrepo does have to be a repo. it can be an empty repo, but it needs to be a repo.
16:25:51 k
16:26:10 so if we make a new update img we would also need signed treeinfo, etc, etc.
16:26:16 right.
16:26:18 so it's not as simple as just a update image
16:26:25 wwoods: fedup expects a .treeinfo and only a .treeinfo ?
16:26:37 you can't just have instrepo be a direct pointer to an image?
16:26:37 it expects .treeinfo.signed
16:26:43 k
16:26:46 or .treeinfo, if you have --nogpgcheck
16:27:12 can we just tell people to use the branched tree?
16:27:20 I guess it's not signed tho
16:27:23 nirik: the thing that worries me about that is it's generated automatically, nightly.
16:27:30 in theory it could break any day. worse than this.
16:27:33 some other regression might appear there
16:27:34 right
16:27:45 yeah, but this is a beta... if you are upgrading to a _BETA_ don't you expect things like that?
16:28:02 in theory (again) no-one should be breaking it between beta and final, but then in theory systemd shouldn't have broken it between TC3 and TC4. :P
16:28:06 nirik: No, you probably expect the upgrade to work and then the runtime to be flaky
16:28:29 we do also have the option of just fixing fedup, communicating the issue very clearly and declaring that that's enough.
16:28:44 (for Beta, obviously we'd fix it harder for Final).
16:28:51 adamw: that makes me scared
16:29:30 I guess I am ok with your 3pronged attack, although I don't know what all it will take to make a fixed one. We do have the fix for the updates.img side of things in hand?
16:29:31 nirik: how much work would it be to build a sort of mini-tree with a fixed fedup?
16:29:40 nirik: i dunno if it's done yet, but it'd be trivial
16:29:47 fedup isn't the problem. upgrade.img is the problem.
16:29:49 nirik: just revert the systemd patch that introduced the timeout
16:30:03 wwoods: sorry, when i say 'fix fedup' i mean 'work around the problem in fedup', as we've been working on.
16:30:10 (or add a systemd patch that just sets the action to "not reboot")
16:30:17 well, I guess in theory it's just pungi runs and then signing things/moving them around.
16:30:46 ok, well
16:30:50 Let's attempt to do the "fix and communicate" piece regardless of anything else we do.
16:31:01 i think we may have at least discussed things as far as it makes sense to in a QA meeting, QA doesn't make the ultimate call on this anyway
16:31:06 Do we have a fixed/worked-around version of fedup ready for a Bodhi update yet?
16:31:13 as a QA group, do we want to make a formal suggestion / recommendation?
16:31:25 yeah, and we can ping dgilmore as soon as he's in later today.
16:31:29 sgallagh: wwoods has a fix (different from mine) which he thinks works, i can test it here after the meeting
16:33:00 my personal opinion is that I'd be definitely OK if we wiped the broken upgrade.img from Beta tree, shipped a fedup that works around the issue, and pushed out a systemd without the patch so the upgrade.img in development/21 doesn't have the bug. we could then work to provide an alternate upgrade.img somehow or other, but doing that is less important than making it hard to run into the bug.
16:33:29 i'd be possibly ok with the just-change-fedup-and-document-it plan, it *is* still a beta.
16:33:36 any other thoughts from qa folks?
16:34:10 the only downside to doing that would be that the development/21 one isn't signed... if that matters to people.
16:34:18 I'm fine with the fix and document
16:34:32 As am I
16:34:52 We've always said (fairly clearly) that upgrades are only supported from a *fully-updated* previous release.
16:34:54 as the lower bound for what we do
16:35:07 Now, of course the problem is pushing out this fix to the mirrors in time for the beta release
16:35:12 adamw: sounds good
16:35:21 Do we know what the usual lag is on that?
16:35:30 masta: is on push duty this week, please coordinate with him...
16:35:48 sgallagh: i think we can do it with some co-ordination, if wwoods gets the builds to bodhi relatively quickly, we test them quickly, and releng does a stable push right after we test
16:36:02 * nirik nods.
16:36:04 nirik: what's the window on stable pushes usually?
16:36:08 (and is it worth asking the mirrors not to actually release the Beta until Wednesday if we had to?)
16:36:24 sgallagh: that's worthless, it's the f19 and f20 package sets that matter in this perspective
16:36:25 sgallagh mirrors? usually daily, but some mirror releases manually
16:36:26 adamw: Right, but the trickle-out to the actual mirrors is what concerns me
16:36:33 * masta looks in
16:36:37 adamw: well, I have been pushing all {21|20|19} updates and updates-testing... that takes a long time. ;)
16:36:42 who invoked the masta?
16:36:45 but if we do just a stable of each it shouldn't be too bad
16:36:46 the collective
16:36:50 I know we usually say "within 48 hours"
16:36:55 sgallagh: also, the mirrors are largely on autopilot so we can't really tell them to not do things
16:36:59 nirik: only 19 and 20 would matter, so a special stable push for just those two maybe
16:37:03 masta: we need to make sure a as yet unfiled update for fedup goes out asap.
16:37:13 to 19 and 20.
16:37:22 nirik: ok, I'll block pushing on that one update. thanks
16:37:22 yeah,
16:37:25 Well, I suppose we could just not have the Beta announcement go live until the mirrors have synced
16:37:48 btw, as a side note, i see a huge pile of changes to systemd yesterday. i'd have been much happier with *one* change to drop the offending patch. sigh.
16:37:59 so, need update filed, karma for stable, then push 19/20 stable only.
16:38:16 adamw: See my previous concerns about "opening the floodgates..."
16:38:33 the systemd package diff from the build in Beta to the current f21 git tip is 81,000 lines. yay, systemd.
16:38:48 adamw: he backports lots of stuff from rawhide to stable branches... sometimes it bites. Most of the time its fine
16:39:01 I vote we ask them to revert all changes and fix this one issue first.
16:40:07 wow, they changed it from being '216 plus a bunch of stuff' to being '217 minus a bunch of stuff'. i think i need to go in there and break some heads about proper stable software development practices again.
16:40:35 * danofsatx goes to hunt for life-giving coffee
16:40:46 sgallagh: that seems like an excellent plan
16:40:58 adamw: yours as well :)
16:41:09 so let's see
16:42:16 propose #agreed QA recommends that we at a minimum ensure fedup for 19 and 20 is updated to work around this issue today (and will provide karma to ensure that's possible). We also endorse the proposal to delete the Beta upgrade.img and provide an alternative in a special location if releng considers it viable and not too much work.
16:43:05 sounds reasonable
16:43:10 ack
16:43:41 ack
16:43:51 ack
16:43:55 ack
16:44:04 * danofsatx found coffee....and Skittles
16:44:18 adamw: I'm reaching out to zbyszek over in #fedora-devel. I will update if and when he responds
16:45:18 sgallagh: i'm drafting a mail touching on that and other concerns
16:45:22 #agreed QA recommends that we at a minimum ensure fedup for 19 and 20 is updated to work around this issue today (and will provide karma to ensure that's possible). We also endorse the proposal to delete the Beta upgrade.img and provide an alternative in a special location if releng considers it viable and not too much work.
16:45:29 ok, that doesn't leave us with much time for our other topics, but let's try
16:45:43 sgallagh: mattdm: i assume there'll be a special fesco meeting or ticket or something to come up with a final plan for fedup?
16:45:53 wwoods: many thanks for being around to work on this btw
16:46:25 adamw: zbyszek is willing to do the aforementioned stripped-down build
16:46:25 adamw maybe? fesco is best at doing things on wednesday afternoon, rather than at short notice
16:46:41 mattdm: I think we can probably manage a quorum to deal with this
16:46:58 Even if I have to start making phone calls
16:47:04 sgallagh: that's good news
16:47:18 sgallagh do you want to organize that? ("want")
16:47:21 mattdm: sgallagh: well, we need someone or other to make a choice and start making stuff happen
16:47:53 adamw: Since I'm still catching up, can you send a summary of the options and I'll do what I can to wrangle a decision?
16:47:53 qa, releng and wwoods can handle the 'get fedup updated' side of the equation, but we'd probably need more powah to start wiping upgrade.imgs and building special trees
16:47:57 sgallagh: sure
16:48:18 I doubt there will be any argument on the "minimal solution" of fix-and-scream-loudly at least
16:48:18 we likely need dgilmore around, which he will be later today when he wakes up. :)
16:48:23 right
16:48:54 ops, sorry, late here, re-reading backlog
16:48:56 nuking the image from mm can be done pretty quickly
16:49:08 nirik: has MM already been updated to point to beta tree for fedup?
16:49:10 building a new one and pointing to it might be more tricky
16:49:20 nirik: i'd say it'd be a good idea to drop that *right now* for the present
16:49:23 adamw: good question, I don't think so, since the bits are still closed.
16:49:26 OK.
16:49:27 * nirik looks to make sure
16:49:57 OK, so the other stuff I had for f21 beta:
16:50:34 #info Common Bugs needs updating - https://fedoraproject.org/wiki/Common_F21_bugs , http://bit.ly/fedora-commonbugs-proposed is the list of bugs that need to be added to the page (plus some old stale ones)
16:51:18 i had a note about livecd-tools stable updates but it looks like both 19 and 20 packages got karma and were pushed stable, so yay
16:51:31 did anyone have any other things we need to square up ahead of Beta release?
16:52:06 Beta looks good from my viewpoint, but I don't know everything.
16:52:23 (other than fedup, that is)
16:52:54 sgallagh had something, according to kparal, iirc
16:53:05 That was for the Final discussion
16:53:22 yeah, that's next
16:53:29 ok, as we're short on time, let's go to that
16:53:40 #topic Fedora 21 Final schedule
16:53:59 so there was some talk in go/no-go about moving up the 21 Final schedule in some way
16:54:13 it has been suggested to move up any or all of the following:
16:54:30 a) TC1 compose date, b) Final freeze date, c) the actual scheduled Final release date
16:55:03 move up to when?
16:55:07 Right; my recommendation was that we should move the Final Freeze (and associated compose) up by one week
16:55:14 link to current schedule (for documentary purposes only, of course)
16:55:24 Giving us a two-week period to do release validation and deal with blockers
16:55:27 #link https://fedoraproject.org/wiki/Releases/21/Schedule
16:55:30 currently the schedule is for Final TC1 to be composed on 2014-11-11 (next Tuesday), Final freeze to occur 2014-11-25, and Final Go/No-Go on 2014-12-04
16:55:37 Hopefully to avoid slipping, which we cannot afford.
16:55:47 danofsatx: also https://fedorapeople.org/groups/schedule/f-21/f-21-quality-tasks.html
16:55:48 (If we slip Final at all at this point, it pretty much becomes January)
16:56:13 #info it was discussed at Go/No-Go whether any or all of a) TC1 compose date, b) Final freeze date, c) the actual scheduled Final release date could be moved up from the current schedule - see https://fedoraproject.org/wiki/Releases/21/Schedule
16:56:25 sgallagh: with current schedule, we have one week for slip... the second week we have before holidays is no-go
16:56:41 my concern with this is that folks might get pretty burned out with what would be basically an eternal validation treadmill from Beta TC1 to Final release
16:56:50 sgallagh: well, not 2 weeks with thanksgiving, but yeah
16:56:56 also that we have not yet done necessary criteria/test case alterations for Final
16:57:00 and with beta issues you're fighting right now
16:57:01 jreznik: We're already looking at the 9th. Slipping to the 16th is getting into vacation territory
16:57:18 what do QA folks think about any or all of these ideas?
16:57:24 sgallagh: 16th is still somehow doable (means it has to be ready week before)
16:57:52 but if it means we would burn out our qa friends, then january does not look that bad
16:57:55 I'm sorry, I'm not getting what "move up" means - one week sooner or later?
16:58:13 kparal: earlier
16:58:18 thanks
16:58:39 I'm for moving it earlier
16:58:54 same here
16:59:08 Thanksgiving (USA) is the other argument for moving it up; not a ton of dev will be happening that week anyway.
16:59:22 I'd rather use that time for finding blockers if we can
16:59:41 kparal: roshi: when you say 'it', which 'it'?
16:59:44 adamw: just checked mm and fedora-install-21 is not currently going anywhere.
16:59:47 a), b), c) or any combination
16:59:52 sgallagh: it's nov 27, right?
17:00:03 Thanksgiving is Nov. 27th, yes
17:00:12 nirik: is that what fedup uses?
17:00:16 earlier
17:00:19 many folks in the us however take the 26,27th,28th off
17:00:27 adamw: yes, I think so.
17:00:31 adamw: probably everything, if that's reasonable
17:00:38 sgallagh: sorry for us european folks being barbarian and forgetting thanksgivings...
17:00:45 jreznik: =)
17:00:45 nirik: the same here in cz
17:00:53 the 27th and 28th are both RH holidays.
17:00:58 jreznik: and us canadian idiots having it inconveniently at a different time
17:01:14 and many people are going to take some PTO before christmas
17:01:27 so the earlier the better
17:01:28 jreznik: You're not barbarians. And I really don't want to picture that, thanks
17:01:32 kparal: that's why we don't have much time for slips before holidays
17:01:44 so...
17:01:59 if everyone's OK with spinning Final TC1 tomorrow, i guess you're all masochists...and...
17:02:10 sgallagh: no, I just feel very bad when I forget as I understand how important day it is for you guys
17:02:19 insane and QA, that's the same, isn't it
17:02:50 propose #agreed QA in principle supports moving any or all of Final TC1, Final freeze and Final release date a week earlier on the schedule to provide more time for blocker identification and fixing and less likelihood of a post-Christmas slip
17:03:04 ack
17:03:10 you folks do realize this means more release validation testing, more blocker bug review and more FE bug review, right
17:03:10 =)
17:03:24 yeah
17:03:30 * adamw calls the whisky depot
17:03:39 ack
17:03:53 ack thpppppt
17:04:00 if anyone QA-ish disagrees with the proposal, please speak up
17:04:07 (or forever hold your peace, etc, etc)
17:05:13 it's really tough question as it would mean the first go/no-go on Nov 27...
17:05:23 jreznik: No
17:05:42 Or at least, I don't think we need a Go/No-Go earlier, just more time
17:05:50 I suppose an unofficial check-in would be a good idea though
17:06:26 I'm not even going to dare to suggest that we could be Go on that day. (Especially since rel-eng wouldn't be able to do the necessary stuff over the holiday weekend to get to the mirrors)
17:06:27 sgallagh: one of what adamw said is "Final release date a week earlier"
17:06:39 oh, you're right.
17:07:00 sgallagh: we did go/no-go on thanksgiving once and it was far from ideal
17:07:09 I'm not sure that would work
17:07:10 Point of note, I will not be attending any IRC meetings on the 27th.
17:07:25 I'd suggest keeping the planned release date, just getting extra validation/blocker time
17:07:38 And on the oddball chance we're actually done early: everyone take a day off
17:08:03 sgallagh: yeah and seems like qa are masochist enough - if we would be able to start testing earlier with early tc1, it could be enough
17:08:32 So, 21 TC1 tomorrow, then?
17:08:35 * danofsatx ducks
17:08:53 it's going to be "good times" :)
17:09:48 there's still an option to slip to January right now - but that would be quite a lot of additional work (as early Jan release is almost impossible with people still being out)
17:10:06 again this is just QA voting in principle, you can work out the details for fesco meeting or whatever
17:10:14 #agreed QA in principle supports moving any or all of Final TC1, Final freeze and Final release date a week earlier on the schedule to provide more time for blocker identification and fixing and less likelihood of a post-Christmas slip
17:10:16 and I really don't want that but I also don't want to shoot ourselves into the knee now
17:10:47 we should try and get fesco to decide this and communicate it asap... as developers will have less time before freeze.
17:10:50 adamw: It was probably going to be *effectively* QA's decision, as if you guys weren't willing to run that treadmill again, it wasn't going to happen
17:11:11 sgallagh: i'd say you should also run it by anaconda and releng folk, as they're the others who mainly get caught on the treadmill.
17:11:14 nirik: I'll file a ticket and ask for votes right-the-hell-now
17:11:23 sgallagh: thanks
17:11:40 anaconda/releng could be also affected as adamw said
17:11:48 * sgallagh nods
17:11:48 ok, I had a topic for a Test Day checkin but we're 10 minutes over time
17:11:51 roshi: are you on top of test days?
17:12:00 for the most part
17:12:05 waiting on dates for some stuff
17:12:10 sgallagh: add me to CC pls
17:12:12 atomic
17:12:16 Will do
17:12:22 I'll check in on all that today
17:16:34 OK.
17:16:36 so just quickly
17:16:37 #topic Open floor
17:16:38 anyone have anything urgent that can't wait till next week?
17:17:39 * roshi has nothing
17:19:39 OK, let's wrap up then
17:19:45 thanks for coming and following the topics, folks
17:20:02 np - thanks for running the meeting :)
17:20:13 #endmeeting