16:02:45 #startmeeting Fedora QA meeting
16:02:45 Meeting started Mon Jan 27 16:02:45 2014 UTC. The chair is adamw. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:02:45 Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:02:51 #meetingname fedora-qa
16:02:51 The meeting name has been set to 'fedora-qa'
16:02:55 #topic Roll call
16:02:57 * tflink is here
16:03:04 ahoyhoy folks, sorry i'm late
16:03:11 * brunowolff is here
16:03:13 * mkrizek is here
16:03:16 * kparal here
16:03:50 * Viking-Ice here
16:04:01 * garretraziel here
16:04:34 big turnout, i see
16:05:09 * pwhalen is here
16:05:55 #topic Previous meeting follow-up
16:06:24 * kparal nudges pschindl, roshi
16:06:35 * roshi is here
16:06:44 #info "adamw to post a summary of our position on EOL to the FESCo ticket" - I did that, see https://fedorahosted.org/fesco/ticket/1198#comment:42
16:06:58 #chair roshi kparal
16:06:58 Current chairs: adamw kparal roshi
16:07:46 * nirik is lurking
16:07:57 #info "adamw to post a mail to test@ highlighting things we can be working on during the 'quiet time'" - did that too, see https://lists.fedoraproject.org/pipermail/test/2014-January/120167.html , thanks to viking for adding another suggestion
16:09:03 any other follow-ups from last time?
16:09:04 I should add in both of these contexts that me and roshi have started working on the QA community role which merges bugtriaging and testing process QA community member
16:09:21 it doesn't look like fesco is driving the EOL process bus right over a cliff or anything, so i don't see anything to worry about there
16:09:41 #info Viking-Ice and roshi have started working on the QA community role which merges bugtriaging and testing process QA community member
16:10:06 thanks for looking at that, guys
16:10:23 hopefully people can earn respectfully their badges for participating
16:10:37 * nirik hopes you will run plans by maintainers as well for feedback before implementing anything.
16:11:06 I'm not wg's so of course I do
16:11:25 neither will I attack our foundation while I'm at it
16:11:39 * nirik rolls eyes.
16:13:02 allllrighty then
16:13:04 moving along
16:13:18 * pschindl is here
16:13:30 #topic SELinux update discussion
16:13:38 hi pschindl, i see you sneaking in late ;)
16:14:05 so, in case anyone missed the agenda and wants to do some speed-reading, the background is at https://lists.fedoraproject.org/pipermail/devel/2014-January/194077.html and https://lists.fedoraproject.org/pipermail/devel/2014-January/194384.html
16:14:36 as there was rather a lot of discussion about the right response to the SELinux Update Disaster of 2014 on devel@, and it's obviously a QA-y topic, it seemed right to bring it up
16:14:39 * cmurf was here at 9.00, then spaced out, and is now here.
16:14:40 * kparal has been speed-reading the whole day and still hasn't gotten to this one yet :/
16:14:51 do folks think i was on point with my posts there? anyone on Team Kevin? :)
16:14:56 (Kofler, not Fenzi)
16:15:14 I kinda skipped this thread entirely
16:15:24 but I dont doubt that you nailed it
16:15:41 pretty well
16:16:05 yeah, I don't think anyone is on board with remove selinux entirely and drop all our updates process completely.
16:16:16 so mostly what I said is automated testing will be really helpful with the 'delayed action' kind of bug (update initially seems ok, but an important action fails later), and we should make writing that test a priority
16:16:32 and aside from that, i just kept banging the bodhi 2.0 drum, because this is yet another case of overloading the meaning of "+1" and "-1"
16:16:46 I think, given it was a mistake the things we should be doing is finding better process that might have prevented or caught the mistake, before the update went to stable.
16:16:48 * kparal thinks there are easier tests to write which will be no-less helpful
16:16:49 but if anyone had some more immediately productive thoughts...
16:16:59 kparal: which ones?
16:17:06 I read it was nothing to do here at the moment except possibly encourage maintainers to not always use the default karma of 3.
16:17:20 The selinux guys weren't trying to add new restrictions in the update.
16:17:28 "nothing to do" until there's a more fixed process change.
16:17:29 adamw: for example something that breaks actual installation of a package, not just delayed issues
16:17:33 one option would be to only make karma count after the update has reached -testing
16:17:46 cmurf: that was kinda my position - i don't see that there's much twiddling around the edges we can do that will help enough stuff without harming too much, until we have better karma system
16:18:05 kparal: well...this kind of bug is rather worse than an update that just doesn't install
16:18:14 yeah, I know
16:18:21 adamw: I'm fine with that. I got hit by this selinux thing also, but I also consider it a kind of edge case.
16:18:21 I dont think new karma system will prevent things from slipping under the radar
16:18:41 Viking-Ice: i don't think it's a magic bullet, but we just can't even address the obvious problems in the current case sensibly without it
16:19:16 the obvious problem is "This update fixed my bug!" isn't an entirely solid reason to say 'OK, let's ship it', but that's really damn hard to fix without the concept of different types of karma
16:19:17 it would be great to have this covered by automated testing, I'm just not sure we want to have this as the absolute priority
16:19:30 speaking of bodhi2.0, I'd like to do a irc meeting on it sometime in the next few weeks... go over where we are, and plans and see if theres anything we can do to move it along...
16:19:46 would be great to get as many qa folks as possible to attend.
16:20:04 The short term solution seems to be making sure selinux updates spend more time in updates, since they can have effects that don't show up until significantly after the update is installed.
16:20:08 i mean, we could float something like "tell people on bug reports to leave 0 not +1 when writing "this update fixed my bug!", but I suspect a lot of them would ignore it, and i'm sure sometimes maintainers *do* want "this update fixed my bug!" to mean +1
16:20:32 brunowolff: well, so can updates to lots of other packages, theoretically.
16:20:53 kparal, tflink had outlined the priorities in automation right
16:20:56 what about a per package default?
16:21:31 it starts as 3, but then, e.g. selinux-policy could have its default changed to 10 or whatever so that the package maintainer doesn't have to always remember to change it from 3 to 10?
16:21:34 I think autopush should be re-evaluated in the first place (not directly relevant here) and then we shouldn't push a kernel or selinux update in a day (even with enough testers), unless it's security related
16:21:35 And some other ones probably should also wait longer. Glibc and kernel should also wait a bit.
16:21:40 karma
16:21:42 cmurf: per-package settings for various update attributes would be interesting i guess...but i don't know how hard it is to implement, and if it's in bodhi or fedpkg or where
16:21:53 I got very annoyed the last time kernel broke my system because it was pushed in 3 hours or so
16:21:56 let me start gathering #infos...
16:22:34 #info brunowolff notes 'make selinux updates spend longer in testing' is a commonly suggested short term approach
16:22:44 #info cmurf suggests per-package defaults for update release criteria
16:22:45 adamw: I think it's merely adding one switch, it's not really fixing the problem.
16:22:47 Add systemd and dracut to the list. Dracut problems can take a long time to show up.
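The per-package default floated at 16:20:56 to 16:21:31 could be sketched roughly as below. This is only an illustration, not Bodhi code: the names `DEFAULT_STABLE_KARMA`, `STABLE_KARMA_OVERRIDES`, `stable_karma_for` and `can_autopush` are hypothetical, the default of 3 is the figure quoted in the discussion, and 10 is just the example value suggested for selinux-policy.

```python
# Hypothetical sketch, not Bodhi's API: per-package defaults for the
# karma an update needs before it is auto-pushed to stable.

DEFAULT_STABLE_KARMA = 3  # the default threshold mentioned in the meeting

# Illustrative override only; 10 is the example value from the discussion.
STABLE_KARMA_OVERRIDES = {
    "selinux-policy": 10,
}


def stable_karma_for(package, requested=None):
    """Return the karma threshold for an update to `package`.

    A value the maintainer sets explicitly on the update wins; otherwise
    the per-package default applies, falling back to the global default.
    """
    if requested is not None:
        return requested
    return STABLE_KARMA_OVERRIDES.get(package, DEFAULT_STABLE_KARMA)


def can_autopush(package, karma, requested=None):
    """True once the update's karma reaches its threshold."""
    return karma >= stable_karma_for(package, requested)
```

Keeping the explicit per-update value as the highest priority matches the point made later in the log that maintainers still need to be able to override whatever default is chosen.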
16:22:49 however, I saw that FESCo rejected the idea of having a minimal time for critical path packages in testing
16:22:57 it would just moderate the problem
16:23:08 brunowolff: +1
16:23:13 #info kparal suggests re-evaluating autopush and a minimum updates-testing time for certain updates
16:23:43 I dont think having this longer will solve anything
16:23:47 in testing
16:24:11 brunowolff, problem is the test matrix.. not all install options get tested
16:24:12 question, did this selinux-policy go to updates? or was it only in updates-testing?
16:24:29 it went stable'
16:24:33 oops
16:24:43 part of the thing that surprised people is it made it to stable in about two hours from submission without ever touching u-t
16:24:47 adamw: disable push to -stable until it has reached -testing
16:24:58 this should ensure a wider distribution among testers
16:25:14 still wouldn't help in this case
16:25:18 #info pingou suggests disallowing push to stable until update has made it to -testing
16:25:25 pingou: yeah i'm not super fussy on how to embargo certain updates, but they need to be embargoed
16:25:27 either by time
16:25:27 because we would need at least two days in testing for someone to discover this
16:25:29 yeah, that suggestion came up a lot but i'm not a huge fan
16:25:31 or by higher karma
16:25:41 or by ending autopush
16:25:43 adamw: why not?
16:25:43 it's a completely arbitrary measure, really - the push to updates-testing is a technical implementation detail
16:25:45 or some combination
16:25:53 if people tested the package they tested the package
16:25:57 That seems overkill. I think what we are worried about is stuff automatically moving to stable too quickly.
16:25:58 yes
16:26:03 and maintainers need the ability to overwrite whatever gets implemented and push to stable upon their own judgement
16:26:25 the problem here is that people, arguably correctly, gave +1 just for installing and rebooting. usually selinux changes manifest quickly on a reboot.
16:26:29 Viking-Ice: they always have that (as long as it meets the absolute baseline requirements)
16:26:32 but this time, they manifest only on the next update attempt.
16:26:52 adamw: we could evaluate the karma only once the update is in -testing, so that if it's +1 already, it's gonna be pushed to -stable the day after
16:26:59 cmurf: basically, yeah, it's a combination of the fact that it's a delayed action bug *and* selinux updates get karma fast *and* the threshold default and and and...
16:27:21 adamw: yes it's not any one single thing
16:28:50 I still think that approaching this as a karma thing is wrong but hey that's just me...
16:28:55 i think the simplest option in the short term is encourage certain package (maintainers) to set karma requirements higher
16:29:36 Viking-Ice: well it's presently the simplest knob to tweak, it's already there, it's not exactly reliable which means it's not a good long term fix
16:29:40 (which the selinux-policy maintainers have already said they plan to. ;)
16:29:53 right, so…?
16:30:21 yeah, i just wanted to make sure we didn't leave out any thoughts on the topic, and if anyone had a genius idea make sure we don't lose it =)
16:30:28 Viking-Ice: it's certainly better than nothing, right?
16:31:05 i do like the idea of letting the default update config be changed per-package, but i don't know if it's feasible - did anyone want to suggest that, via a ticket or whatever?
16:32:02 Do we really want developer time doing that rather than getting 2.0 out?
16:32:13 cmurf, right it's as good as something better comes along to replace it
16:33:09 brunowolff: also a consideration, yeah.
16:33:31 the possible consequence of higher karma is if a cat is still let out of the bag, it takes that much longer (possibly) to get a fix released. Presumably a package update can be yanked fairly quickly if it's bad enough.
16:34:16 cmurf: at present we forbid ourselves from 'yanking' an update that goes stable
16:34:25 if needed, the karma can be decreased and package pushed automatically
16:34:36 *manually
16:34:36 if you really want to revert the change you have to do it with a spec bump
16:35:15 so, do we have any consensus proposal here, or are we generally OK with leaving the update policy as-is and letting the selinux maintainers take care of improving their processes?
16:35:27 pulling from stable is bad for a lot of reasons. :) In this selinux case it would have made pushing out the fix take longer.
16:35:28 in that case I still think increasing the karma requirement is the better approach for now, hopefully that avoids the worst updates from making stable in the first place
16:36:06 adamw: I'm fine with the consensus and letting selinux maintainers improve their process.
16:36:18 +1 letting the selinux maintainers take care of improving their processes?
16:36:31 +1
16:37:32 proposed #agreed while there are various little tweaks that could be considered to the update policy and process as a 'response' to this, we don't think any of them obviously solves the problem without causing other problems, and we trust the selinux maintainers to configure their updates more conservatively in future
16:37:37 +1
16:37:39 sound OK?
16:37:47 ack
16:38:01 ack
16:38:11 ack
16:38:14 ack
16:38:14 ack
16:38:18 ack
16:38:25 ack
16:40:26 thanks
16:40:31 agreed before christmas ?
16:40:31 #agreed while there are various little tweaks that could be considered to the update policy and process as a 'response' to this, we don't think any of them obviously solves the problem without causing other problems, and we trust the selinux maintainers to configure their updates more conservatively in future
16:40:33 sorry :)
16:40:40 #topic Unmaintained package discussion
16:40:51 this is another one that came up on devel@, viking wanted us to talk it over
16:41:26 richard hughes noted as part of his appdata stuff that the repos contain a lot of apparently unmaintained (upstream and/or downstream) software, and there's been a 'spirited' discussion as to whether we need to Do Something About It
16:41:34 anyone apart from me and viking been following it?
16:41:41 linky: https://lists.fedoraproject.org/pipermail/devel/2014-January/194251.html
16:41:41 Yes
16:41:59 basically we do not want to waste our resources in our community triaging/filing bugs against components that never will be fixed because upstream/maintainership is dead
16:42:10 I only skimmed it. but unmaintained doesn't mean broken
16:42:13 fwiw my general opinion is it would probably be a good thing if someone took charge of a careful effort to weed the repo a bit, but i'm not the person who has the time for it
16:42:31 is not that infra jobs
16:42:46 and was no fesco working on a better cleanup policy or was I dreaming
16:42:51 Viking-Ice: let's improve our bugzilla to add a banner to specified packages to clearly state that
16:42:54 no. infrastructure just builds and distributes.
16:43:11 zbigniew suggested "Even a simple list of packages ordered by the time from last non-mass-rebuild release multiplied by the number of currently open bugs would be quite useful"
16:43:14 yes, there was a fesco ticket on it... I've just not had time to try and work out any proposal.
16:43:15 which i thought was kinda genius
16:43:32 there's also pingou's list of packages not built in the last 200+ days...
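The ordering zbigniew suggested (quoted at 16:43:11) can be sketched as a small scoring function. This is an illustration under stated assumptions, not an existing Fedora tool: `staleness_score` and `rank_packages` are hypothetical names, and the caller is assumed to already have each package's last non-mass-rebuild release date and its open bug count.

```python
from datetime import date


def staleness_score(last_real_release, open_bugs, today):
    """Days since the last non-mass-rebuild release, times open bugs."""
    days_idle = (today - last_real_release).days
    return days_idle * open_bugs


def rank_packages(packages, today):
    """Sort package records so the highest-scoring (stalest) come first.

    Each record is a dict with hypothetical keys "name",
    "last_real_release" (a datetime.date) and "open_bugs" (an int).
    """
    return sorted(
        packages,
        key=lambda p: staleness_score(
            p["last_real_release"], p["open_bugs"], today
        ),
        reverse=True,
    )
```

Because the idle time is multiplied by the bug count, a package with no open bugs scores zero however old its last release is, which fits the remark at 16:42:10 that unmaintained doesn't mean broken.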
16:43:35 I've been following it, I'm not sure what there is to be done?
16:43:36 Viking-Ice: i find this kind of thing isn't usually a case of "it's X team's JOB to go and do it", but "person Y decides to go out and do it"
16:43:37 it will improve the situation but it wont solve it
16:43:52 if you're person Y and you come up with a good plan, people usually will let you go ahead and do it
16:44:12 nirik: I should update to 150+ days since the last mass-rebuild seems to be ~133-134 days ago
16:44:23 or 140+ days even
16:44:27 adamw, we need to put our time and effort to those that respond well to our efforts heck in some cases respond at all
16:44:34 cmurf: i think viking may have wanted us to take a position on the issue given its qa implications (the idea that we may be testing/submitting bugs into a black hole)
16:44:52 Oh
16:45:01 Viking-Ice: did you have a specific proposal for people to vote on, or anything, or did you just want to talk the issue over?
16:45:05 but there's no way to know that really.
16:45:32 surely we can somehow detect active maintainers
16:45:42 More like active packages?
16:45:51 Like a "package activity" meter?
16:46:03 I don't think automatic detection is going to work all that well. There will be lots of false positives.
16:46:17 just ping the maintainers
16:46:35 are you active no response == orphan packages
16:46:43 This kinda gets back to the old question of bugs that maintainers don't respond to.
16:46:56 Maybe we should worry more about cases where we run into non-functioning packages, where the maintainer is unable or unwilling to do anything to get a package functioning again.
16:47:04 If they just get pinged on any stale bug, it's annoying to them.
16:47:24 brunowolff: I think so.
16:47:37 brunowolff: right.
16:47:41 I was thinking about monthly or 3 months or at the end of each cycle ping
16:47:44 not per bug
16:47:52 hmm
16:47:55 Even if a package has a bug, it might still be better to have the package with the bug, as opposed to not at all.
16:48:01 Thats been proposed before. Some packagers were very against it.
16:48:20 "now you have maintained foo package for a whole release cycle do you plan keep on doing that for another" kinda of thing
16:48:27 are you there? are you there? are you there? gets really old...
16:48:59 I'd expect a blanket reminder email to just result in "blah yes i'm here stop bugging me" rather than it getting the intended result which is to ensure their packages have up to date information, including whether they're abandoned.
16:49:20 let's say we dont want to bother everybody then that means we need health monitoring on components
16:49:26 the "are you present" ping doesn't really solve the problem
16:49:36 it proves the maintainer is breathing, maybe
16:49:43 some easy way for us in QA to spot components in trouble
16:49:55 yeah, that might be a better approach...
16:50:11 Then why not do something in the cases where QA notices a problem.
16:50:14 I could work with integrating that into the new triage process and make that a standard routine to look at
16:50:15 so then we need a metric to define "trouble"
16:50:37 right. there's lots of possible data...
16:50:48 X bugs in Y months and Z maintainer responses
16:50:48 trend monitoring of reports as well as their status
16:51:04 cmurf: integrating review of package and dropping the one with the worst review ?
16:51:10 upstream releases not updated to, important bugs with no response, etc
16:51:14 we kinda need that anyway since triagers will need to see the whiteboard flag "triage"
16:51:34 misc: maybe but is the review process a reasonably scientific sample?
16:51:38 if there's only one review...
16:52:20 health monitoring on components is that not like an extension of the retrace server ?
16:52:20 well, perhaps packages 'in trouble' could be asked to be re-reviewed? if the problems were packaging related...and not just upstream issues or the like
16:52:51 nirik, I'm actually adding additional step called upstreaming into the triage process
16:53:12 well, retrace doesn't give the story entirely either.
16:53:31 you could have a component thats really buggy, but the maintainer is very responsive and trying to help.
16:53:46 then the maintainer is responsive
16:53:53 right
16:53:53 and even upstream is doing new releases to fix things... ie, the situation is improving.
16:53:56 that's what matters
16:54:07 as well as some bugs takes longer to fix than others
16:54:29 really it's going to need to be a collection of information, then a group of humans looking at the 'in trouble' ones to see what can be done.
16:54:31 cmurf: i was more thinking of review like in android market, ie if users give a lot of bad review, then something is going on
16:54:42 nirik, and that group of humans is the triagers
16:55:08 misc: so this could be done through software center now I think? but we need a way to track and manage packages that aren't in center.
16:55:10 it could be, sure.
16:55:35 so let's agree looking at here in QA somehow figure out a way to "health monitor" components
16:55:57 i think software center does it's "reviews" via tagger (?)
16:56:09 it's not reinventing that particular wheel anyway
16:56:14 ignore software center we are not going to be using that
16:56:19 cmurf: yeah, but I think cleaning a bit would be beneficial, we can iterate later for stuff that are not covered
16:56:33 Viking-Ice: that sounds like it'd be a useful process to have indeed
16:56:50 in fact, having multiple set of indicator and looking manually at the biggest "offenders" could be sufficient
16:56:56 Viking-Ice: do you want a #action ? or does anyone else want to take it?
16:57:21 Viking-Ice: I don't think software center or tagger reviews is *the* only way or even primary way to describe package health, but it could be one component in a metric that describes component health
16:57:22 adamw, we probably have to write something for it ourselves mean we need a list for the new triaging effort way to filter out bugs that has been labeled with "triaged"
16:57:28 if one is not useful, then you drop that source of metrics, if one doesn't cover all, then you add more
16:57:34 and present that as an list to the next triager
16:57:49 sorry, not quite sure what you're getting at
16:58:04 adamw, wait for the proposal ;)
16:58:19 you will understand when you see how it works
16:59:03 ok, but just as a process thing, do you want the #action item for taking a swing at this?
16:59:29 Viking-Ice: it would fit in nicely with the proposal - might as well throw it in
16:59:39 right
16:59:46 put it on the pile
16:59:59 #action viking-ice to look at a possible process for flagging bitrotten packages as part of the larger triage process proposal
17:00:03 that wording OK?
17:00:43 yeah sure dont think it hardly matters
17:01:05 roger
17:01:08 #topic open floor
17:01:13 so, anything else? quickly, as we're over time
17:01:37 * roshi has nothing
17:02:26 okey dokey, thanks for coming, folks
17:02:34 * adamw sets special very short quantum fuse
17:02:48 #endmeeting