17:05:32 <sgallagh> #startmeeting Discussion of Koji/Rawhide automated testing plans
17:05:32 <zodbot> Meeting started Mon Jul  1 17:05:32 2013 UTC.  The chair is sgallagh. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:05:32 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
17:05:46 <sgallagh> #meetingname Discussion of Koji/Rawhide automated testing plans
17:05:46 <zodbot> The meeting name has been set to 'discussion_of_koji/rawhide_automated_testing_plans'
17:05:59 <puiterwijk> sgallagh: you might want to do a #chair?
17:06:17 <sgallagh> puiterwijk: As soon as the mikes join, I will
17:06:23 <puiterwijk> I see
17:06:37 * nirik is here as well, but also doing other stuff.
17:07:56 <sgallagh> #chair mbonnet mikem23 puiterwijk sgallagh
17:07:56 <zodbot> Current chairs: mbonnet mikem23 puiterwijk sgallagh
17:08:15 <sgallagh> #topic Overview/Goals
17:08:47 <sgallagh> Ok, so first a little context.
17:09:05 <sgallagh> After FUDCon Lawrence, we came to Fedora with some ideas for making Rawhide more usable.
17:09:50 <sgallagh> Specifically, we had some ideas about producing tests that could "gate" package drops into the public Rawhide repo, so if they impacted any functionality, the repocreate wouldn't happen
17:09:51 <tflink> ah, I was wondering what "meet on freenode" meant :)
17:10:21 <sgallagh> #chair  mbonnet mikem23 puiterwijk sgallagh tflink
17:10:21 <zodbot> Current chairs: mbonnet mikem23 puiterwijk sgallagh tflink
17:10:22 * nirik likes the idea in principle, but really hopes we can do things so it doesn't slow rawhide down.
17:10:28 <sgallagh> #chair  mbonnet mikem23 puiterwijk sgallagh tflink nirik
17:10:28 <zodbot> Current chairs: mbonnet mikem23 nirik puiterwijk sgallagh tflink
17:10:54 <sgallagh> nirik: Well, I'm not sure we'll define that as a goal.
17:11:03 <sgallagh> Certainly, I'd like not to *cripple* Rawhide
17:11:13 <sgallagh> But right now, it has a habit of running very fast and tripping
17:11:16 <tflink> I think it would slow rawhide down by definition, no?
17:11:17 <nirik> well, I think it's something to keep in mind.
17:11:29 <nirik> I don't think we should say 'no slowdown'
17:11:53 <mikem23> Well, it depends on the tests
17:11:56 * nirik doesn't think rawhide trips all that much, but it could be better for sure. ;)
17:12:26 <sgallagh> nirik: Well, the long-term goal here would be that we should reasonably be able to expect that people working on building stuff for Fedora should be running Rawhide.
17:12:34 <sgallagh> Which is far from the truth today.
17:13:18 <sgallagh> Anyway, the idea of gating Rawhide on a set of minimally-available functionality was more or less agreed to be desirable, but we didn't have any resources to spend on it at the time.
17:13:19 <nirik> sure, but I think some of that is just misperception that rawhide is broken all the time, which isn't really the case as much as it used to be
17:13:55 <sgallagh> nirik: That may be partially true, but one of the easiest ways to change perception would be to implement a plan to prevent the perception from being reality.
17:14:09 <sgallagh> Then it can at least be demonstrated to have been "yesterday's news"
17:14:26 * nirik has been running rawhide full time on his main laptop for 7 months now.
17:14:29 <sgallagh> Anyway, we now have an available resource, at least for the next two months: puiterwijk
17:14:36 * nirik cheers.
17:14:53 <puiterwijk> wow, I'm seen as a resource... not sure if I should be too happy about that :-)
17:15:47 * dgilmore is kinda here
17:15:49 <mikem23> BTW, did folks have a chance to read my reply to the invite?
17:16:08 <sgallagh> Essentially, I had hired puiterwijk as an intern and he's already finished what I had scheduled for him, so I got approval to move him onto this project.
17:16:48 <sgallagh> mikem23: Yeah, I did. I'm glad to hear that it may be less work than we thought.
17:17:20 <tflink> sgallagh: not really, unless I'm severely misunderstanding something, there are several parts missing to the plan
17:17:23 * nirik is unsure of what exactly is being proposed.
17:17:28 <sgallagh> Before we move into actual execution plans, I just wanted to confirm that everyone understands what we're trying to do.
17:17:32 <mikem23> Well, it's not necessarily less work, but little to none of it needs to land in Koji proper
17:17:40 <sgallagh> Which, clearly, is not the case.
17:17:50 <nirik> I don't. Is this written up somewhere? and got buyin from fesco/releng/qa?
17:18:09 <nirik> I mean I recall the discussions at fudcon, but that wasn't very detailed.
17:18:16 <sgallagh> nirik: There was a long thread on fedora-devel back in the February/March timeframe
17:18:23 <mikem23> Let's talk about the current state of things. How things get into Rawhide
17:18:36 <sgallagh> And no, we don't have a full writeup. That's part of the intended outcome of this discussion
17:18:45 <nirik> yeah, but my memory of it was "this would be nice to do something, let's come up with a plan and come back and tell everyone what we want to implement"
17:18:50 <nirik> ok, great.
17:19:11 <dgilmore> sgallagh: so kinda what you want is to have things not land and go out in rawhide unless it's passed some kind of automated qa?
17:19:15 <sgallagh> Specifically, puiterwijk's first task in the next week or so is going to be putting together a proposal and design document
17:19:24 <sgallagh> dgilmore: In a nutshell, yes.
17:20:07 <sgallagh> puiterwijk's job would be to get that framework in place, not to write the initial tests (other than some placeholders to test his work, of course)
17:20:11 <dgilmore> sgallagh: ok, thats not too hard to do and doesn't involve any koji changes
17:20:32 <dgilmore> sgallagh: we could have builds tag into f20-candidate
17:20:34 <nirik> right, just change things to land in f20-pending or whatever
17:20:38 <dgilmore> which would trigger the qa
17:20:40 <sgallagh> dgilmore: Right, which I didn't know when I called this meeting. Let me forward mikem23's email response
17:20:47 <dgilmore> when that passes we move them to f20
17:20:50 <nirik> however, what does that mean for the buildroot?
17:21:17 <dgilmore> nirik: i guess it would mean we need buildroot override capabilities for rawhide
17:21:25 <sgallagh> nirik: I was going to get to that. I'd like to see this address buildroot concerns as well
17:21:31 <nirik> yeah, and chainbuilds no longer work.
17:21:38 <nirik> which would be unhappy for some people.
17:21:40 <sgallagh> I've had a couple cases in the last few months where bad packages in the buildroot caused issues
17:21:40 <dgilmore> since until something passes qa it wouldn't go to the buildroot
17:21:47 <mikem23> dgilmore, right, then the work becomes building an automated system for moving stuff from f20-candidate (or whatever suffix we like) into f20
17:21:54 <dgilmore> nirik: would need a wrapper of some kind
17:22:03 <dgilmore> mikem23: right
17:22:13 <nirik> well, if the tests were quick enough and automated, I guess it could still work.
17:22:19 <nirik> just be slower.
17:22:30 <sgallagh> nirik: Well, as far as chain-builds, now that bodhi can create overrides, I think we can modify the chain-build procedure to land those. Then it will work on *any* branch too.
17:22:32 <dgilmore> nirik: always possible yeah
17:22:47 <puiterwijk> nirik: well, if we would do it with every build, it might trip on some things that also need another package updated
17:22:56 <mikem23> for chain-builds, we could also create an alternate buildroot/target
17:23:38 <nirik> puiterwijk: yeah, some tests are just not going to be easy at all.
17:23:48 <mikem23> i.e. one that includes the f20-candidate builds
17:24:03 <dgilmore> mikem23: thats also an option
17:24:21 <nirik> or we could just leave buildroot for now and keep populating it from the candidates.
17:24:49 <mikem23> right, but not the rawhide compose
17:25:06 <dgilmore> lots of options
17:26:39 <nirik> yeah, thats part of the problem... there's a lot of ways to hook things in, so it's hard to say whats best here.
17:27:10 <mikem23> and if we want to get fancy, there could be multiple stages of validation, with stage1 getting it into the buildroot and stage2 getting it into the compose. ....but perhaps it is best to start with a simple perturbation of what we have
17:28:27 <mikem23> So, what are the main issues we're trying to address. What kind of rawhide breakage are we most worried about?
17:28:50 <mbonnet> seems like at least a repoclosure would be good before putting things in the chroot
17:28:53 <nirik> well, broken deps are annoying... but that's a pretty difficult problem to tackle.
17:28:55 <sgallagh> Personally, I'm in favor of creating the alternate buildroot target as well, since if we introduce a broken dependency (especially something like glibc or gcc), I'd rather that we avoid breaking EVERYTHING
17:29:06 <mikem23> mbonnet ++
17:30:08 <sgallagh> mikem23: Well, when we first discussed this, the goal was to base it on functional tests (rather than API tests, etc.)
17:30:34 <sgallagh> I.e. the most basic test would be: could we create a VM image that boots to a usable system with SSH access?
17:31:03 <sgallagh> If the answer is no, then something SERIOUSLY bad went into this build, and it needs to be blocked.
17:31:16 <mbonnet> sgallagh: I think we might want to separate issues into "things that prevent people from being able to build" and "things that will cause runtime breakage"
17:31:18 <sgallagh> And then expand on that in *priority* order.
17:31:31 <sgallagh> mbonnet: That's fair.
17:31:50 <nirik> how do we want to handle groups of updates? or does this operate only on one build at a time?
17:31:54 <sgallagh> I'm sort of looking to ensure that someone could use Rawhide as a rolling release distro if they were so inclined.
17:32:00 <mbonnet> sgallagh: issues in the first category should be checked before letting the package into the buildroot.  Things in the second category could be checked later and trigger an untag of the offending build.
17:32:16 <sgallagh> nirik: Well, right now the repocreate runs periodically, right?
17:32:22 <sgallagh> I'd tie it to that I think
17:32:23 <tflink> nirik: if we're talking about dependency trees, they can't be one build at a time
17:32:24 <mikem23> nirik, with rawhide, we don't really have natural grouping of updates
17:32:35 <nirik> right, so we run into problems.
17:32:41 <nirik> libfoo builds and is a abi bump
17:32:42 <sgallagh> mbonnet: Well, I'd want it to be checked *before* it became public.
17:32:53 <nirik> it goes to test and fails because it causes 20 broken deps.
17:33:03 <sgallagh> Like I said: that way someone doing bleeding-edge development or DevOps could use Rawhide for their base platform
17:33:08 <nirik> how could we rebuild those 20 packages without it being available.
17:33:20 <mbonnet> sgallagh: I think any kind of functional testing is going to take time and resources, and blocking buildroot updates on that may create unreasonable delays, especially as the library of tests that get run increases.
17:33:23 <mikem23> nirik, that would be a job for the alternate buildroot
17:33:24 <nirik> if we do push it into buildroot each one of those packages by itself would fail until libfoo passes.
17:33:52 <nirik> each new buildroot would mean another newrepo task. ;( we already have a lot.
17:34:02 <sgallagh> mbonnet: Yeah, I'm okay with having two levels of checks to address that.
17:34:10 <mikem23> nirik, just two buildroots up from our one
17:34:16 <sgallagh> mbonnet: I was trying to express my desires, not an implementation :)
17:34:27 <nirik> ah, so one for 'all pending stuff'
17:34:35 <mbonnet> sgallagh: sure
17:34:52 <mikem23> yep, which is effectively the one we have now
17:35:09 * nirik nods
17:36:34 * nirik will have to think about the flow some here...
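[Editor's sketch, not part of the meeting log. The multi-stage gating idea raised above (one set of checks before a build reaches the buildroot, a second set before it reaches the compose) could look roughly like the following. The tag names "f20-candidate" and "f20" come from the discussion; "f20-buildroot" and the promote() helper are hypothetical, and no real Koji call is made.]

```python
from typing import Callable, Dict, List

# "f20-candidate" and "f20" are named in the discussion.
# "f20-buildroot" is an invented name for the intermediate stage.
STAGES = ["f20-candidate", "f20-buildroot", "f20"]

Check = Callable[[str], bool]


def promote(build: str, tag: str, checks: Dict[str, List[Check]]) -> str:
    """Return the furthest tag `build` reaches, starting from `tag`.

    `checks` maps a destination tag to the checks a build must pass
    before it may be moved into that tag.
    """
    idx = STAGES.index(tag)
    while idx < len(STAGES) - 1:
        next_tag = STAGES[idx + 1]
        # Stop at the first stage whose checks the build fails.
        if not all(check(build) for check in checks.get(next_tag, [])):
            break
        idx += 1
    return STAGES[idx]
```

[A build that passes the buildroot checks but fails the compose checks stays in the intermediate tag, so other packages can still be built against it without it reaching the public repo.]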
17:37:30 <nirik> as to building a vm on each buildroot, not sure how easy that would be... lots of moving parts there.
17:37:56 <tflink> and it would take quite a bit of time
17:38:06 <nirik> we could duct tape something with fedmsg / fire compose / upload to cloud / run / test. But I don't see it being easy
17:38:55 <sgallagh> Sadly, "easy" and "useful" seem to be a continuum.
17:39:13 <nirik> are there non vm type tests we could start with? or you think a vm is the base to build on?
17:39:53 <sgallagh> nirik: Well, above we were talking about having two types of tests.
17:39:53 <mikem23> I think if the complexity of the tests starts to balloon, you'll probably need to get into the idea I mentioned earlier, multiple levels of gating
17:40:16 <sgallagh> One for blocking the buildroot, another for the public yum repo
17:40:18 <nirik> yeah, so start with buildroot ones?
17:40:20 <mikem23> So the thing is, rawhide is rawhide
17:40:30 <mikem23> we're talking about 'taming' rawhide to some extent
17:40:32 <sgallagh> I think we can start with the buildroot ones for maximum short-term gain
17:40:46 <nirik> IMHO, rawhide is much less raw than it used to be.
17:40:54 * nirik has been working to make it so.
17:40:55 * mattdm parachutes into the conversation
17:41:06 * mattdm wishes there were anaconda for rawhide
17:41:22 <sgallagh> mikem23: Well real or perceived instability of Rawhide tends to lead to people not actually testing their changes there, but just pushing them to Rawhide as part of the process of backporting them to stable releases
17:41:24 <mikem23> nirik, fair enough, but what makes it that way. Is there an effective gating process elsewhere?
17:41:24 <tflink> once you get to implementation, there's a whole dimension of this problem that I haven't seen discussed: gating and overriding results
17:42:04 <tflink> if that's part of a later conversation, that's fine but it's not a trivial problem
17:42:06 <nirik> mikem23: nope... aside from people finding breakage and fixing it quickly, stressing to maintainers to test changes before building, etc.
17:42:35 <puiterwijk> tflink: I think that's more for later, as right now we're trying to get the ideas/goals clear
17:42:42 <nirik> http://fedoraproject.org/wiki/Releases/Rawhide#Audience
17:43:23 <sgallagh> tflink: I agree, it's something we need to keep in mind. But right now I think we're still trying to figure out *where* things go, then we'll get on to how we implement them
17:43:28 <nirik> buildroot does break from time to time, but thats usually noticed the day of.
17:43:31 <tflink> puiterwijk: ok, just wanting to make sure that expectations aren't too high
17:44:00 * tflink isn't clear how you can write a proposal without discussing all the required moving parts, though
17:44:08 <nirik> and a 'untag' gets things working again (if it's the base buildsys group)
17:44:34 <mattdm> on the thing a little bit ago: rather than composing a whole new vm, tests could be taking a nightly minimal snapshot and running yum install of the new package on that....
17:44:44 <puiterwijk> tflink: I didn't say for a later conversation, just not at this specific time in the conversation, I think;)
17:44:50 <sgallagh> nirik: untag is still unpleasant to people who picked the package up on their system
17:45:10 <nirik> sgallagh: they wouldn't have... it's only just been built. Unless they manually downloaded it and updated with it
17:45:44 <nirik> for example, filesystem broke a while back.
17:45:44 <sgallagh> nirik: Right now, anything built in Rawhide ends up in the public repo unless it's caught and untagged before the repocreate run
17:45:51 <sgallagh> Or am I mistaken?
17:45:53 <nirik> it was noticed the next newrepo after it landed.
17:46:04 <nirik> yes, once a day when it's composed.
17:46:15 <nirik> if you build something it lands in the buildroot next newrepo.
17:46:23 <nirik> it's composed and pushed out once a day later.
17:46:41 <sgallagh> ah
17:46:49 <sgallagh> Ok, so my understanding there was off
17:46:51 <mattdm> (so, isn't the "gating" proposal about running some quick tests before that happens?)
17:46:56 <nirik> from the time it's built to the time the compose starts it could be untagged and never go out in the compose.
17:47:04 <nirik> right.
17:47:38 <nirik> so, another option here:
17:47:53 <mikem23> OTOH, It could also be untagged shortly after the compose starts and still get in.
17:47:55 <mattdm> could do a compose with everything, run tests, and on failure bisect until problem is found
17:47:57 <sgallagh> Ok, so if the compose is only happening once a day, that might well be the place to run the functional tests, at least.
17:48:08 <nirik> have buildroot checks, and have compose checks... the compose checks just run before/as the compose
17:48:21 <sgallagh> nirik: +1
17:48:29 <nirik> yeah, but then do we just fail the compose? or try and prune things that are causing problems?
17:48:42 <mikem23> how long do composes take? I'm not sure we have time to bisect
17:48:43 <nirik> mikem23: yep. very true.
17:48:46 <mattdm> first pass, fail the compose
17:48:52 <mattdm> second pass, be more clever.
17:48:54 * nirik looks
17:49:17 <nirik> 3.5 hours or so
17:49:21 <mattdm> depending on the tests and the output we get we can probably be smarter than just bisecting
17:49:37 <mikem23> heh, yeah, no bisect with composes
17:49:44 <mattdm> what is the bottleneck in that 3.5 hours?
17:49:51 <nirik> much of thats deltarpms and such I suspect.
17:50:05 <nirik> also just the sheer number of packages.
17:50:08 <nirik> http://kojipkgs.fedoraproject.org/mash/rawhide-20130701/logs/
17:50:10 <mattdm> can some of that be delayed and done only after the successful compose?
17:50:13 <mikem23> lots and lots of io
17:50:28 <sgallagh> Good point: deltarpms should probably only be generated if the tests pass
17:50:33 <mikem23> right, but that might be a significant restructure of the compose process
17:50:39 * mikem23 is not sure
17:50:50 <sgallagh> Probably worth looking at that.
17:51:19 <nirik> yeah... worth seeing if we can just add tests early, then untag things that are busted
17:51:46 <dgilmore> sgallagh: the slowest part of the nightly rawhide compose is deltarpm generation
17:52:08 <mikem23> Or we could just run our own 'ultralight' compose for testing purposes and only do the full rawhide compose from the gated tag
17:52:23 <mattdm> here's a question: is it useful to do the gating for packages in the "base design" (whatever that ends up being)?
17:52:26 <sgallagh> dgilmore: Ok, so then if we reordered things so that deltarpms are all generated *after* all of the other packages are composed, then we can run the tests on the full packages first.
17:52:31 <dgilmore> mashing a repo without deltarpm is much much quicker
17:52:36 <sgallagh> And abort if they fail *before* wasting all the deltarpm time
17:52:40 <tflink> are we assuming that new tests would have to be created?
17:53:02 <sgallagh> mikem23: That could work
17:53:29 * nirik likes the idea of it just being part of the compose, so there's no timing issues.
17:53:38 <sgallagh> dgilmore: How much longer does a create+delta take vs. a create?
17:53:39 <mattdm> mikem23 what is ultralight compose?
17:53:41 <sgallagh> 2x? 10x?
17:53:47 <dgilmore> sgallagh: its doable, we could mash without deltas and write some new process to generate delta rpms
17:54:06 <nirik> if we do things separately we need to make sure it's done before compose starts... if it's part of compose it can naturally just go on after tests.
17:54:10 <dgilmore> sgallagh: i think mashing without deltas is about 15 minutes
17:54:29 <sgallagh> dgilmore: And with it is 3.5 hours?
17:54:47 <dgilmore> sgallagh: ~ yeah, depends on what exactly changes in the tree
17:54:58 <mikem23> mattdm, a compose with all the extra stuff not needed for first line tests turned off. notably deltarpms. probably other stuff as well
17:55:13 <mikem23> not sure how fast we can get it down to
17:55:22 <mattdm> mikem23 but still of the entire tree. got it.
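[Editor's sketch, not part of the meeting log. The reordering discussed above (mash without deltarpms, run the tests, and only then spend time on deltarpm generation) can be expressed as a small pipeline. The function names and callables here are hypothetical placeholders; they do not correspond to real mash or compose tooling.]

```python
from typing import Callable, List


def run_compose(
    mash_without_deltas: Callable[[], str],
    tests: List[Callable[[str], bool]],
    generate_deltas: Callable[[str], None],
) -> bool:
    """Run the cheap steps first and skip deltarpms if any test fails.

    Returns True if the compose completed, False if it was aborted.
    """
    repo_path = mash_without_deltas()  # the quick step per the discussion
    for test in tests:
        if not test(repo_path):
            # Abort before paying for the slow deltarpm step.
            return False
    generate_deltas(repo_path)  # the slow step, only after tests pass
    return True
```

[The point of the ordering is that a failed compose costs only the mash and test time, not the hours reported for deltarpm generation.]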
17:56:23 <sgallagh> mikem23: Well, if all the new stuff is in a side-tag anyway, can we just create that as a temporary repo and point our tests to the existing buildroot+the temp repo? Creating the temp repo would be practically instantaneous (except for mass rebuilds)
17:57:46 <mikem23> sgallagh, yes, you could do testing based on normal rawhide compose + pending updates via yum. However this would not catch installer issues and possibly others
17:57:56 <mikem23> still, probably worth considering
17:58:13 <mattdm> i think it's okay for installer testing to be separate
17:58:25 <mattdm> there's a project to do automatic testing of the installer anyway.
17:58:41 <tflink> mattdm: which  project?
17:59:22 <mikem23> would still need a "no deltas" mash of the pending tag to update from (to get multilib right).
17:59:37 <mattdm> tflink I don't remember details I just remember anaconda team being interested
17:59:59 <tflink> mattdm: ok, I'm aware of at least 2 that are targeted @ anaconda
18:00:11 <tflink> I suspect you're talking about dogtail, though
18:00:39 <dgilmore> mikem23: rawhide is never installable
18:00:50 <dgilmore> so we cant do installer testing anyway
18:01:47 <nirik> unless anaconda folks are ok with us making it installable again. :)
18:01:50 <mattdm> dgilmore Rawhide is _not currently_ installable. It used to be, of course. Making it so again would be very useful.
18:02:07 <sgallagh> "installable" can mean different things, too.
18:02:12 <dgilmore> mattdm: it was turned off on purpose as part of no frozen rawhide
18:02:20 <sgallagh> "Anaconda doesn't work" is different from "an AMI image can be generated"
18:02:30 <mattdm> I think automatic testing of anaconda and better gating of rawhide are on a course for making it work again.
18:02:46 <nirik> well, it was disabled as part of no frozen rawhide by request of anaconda folks.
18:02:49 <mattdm> sgallagh Except, I think we eventually need to converge on AMI generation using anaconda as well.
18:03:02 <nirik> they were getting a flood of bug reports for things when they were not finished landing them.
18:03:06 <nirik> which just made more work for them
18:03:35 <dgilmore> to make a change there anaconda folks would need to agree
18:03:45 <mattdm> absolutely
18:03:47 <nirik> so, they would build a new anaconda with partial support for something that they were working on and people would file bugs on it that they would need to close as 'hang on, we are working on it' all the time
18:04:24 <nirik> so, yeah, perhaps we could change perceptions or get someone to triage their rawhide bugs or something.
18:04:37 <mattdm> or work on a branch and check things in when they work?
18:05:09 <nirik> yeah, they also got things like 'why is there no new version to fix foo', etc.
18:06:34 <mattdm> I don't see why anaconda is special in that particular regard.
18:06:50 <nirik> they also may be more receptive now that the big re-write has happened.
18:07:08 <nirik> anyhow, something to talk with them about
18:07:45 <sgallagh> mattdm: Because they were literally the first thing anyone saw when trying to install Rawhide
18:08:44 <mattdm> sgallagh So they were getting a lot of bug reports related to things _in general_ which were broken in rawhide and not really anaconda related?
18:08:57 <mikem23> gating may change how the anaconda team feels about installable rawhide
18:09:06 <mattdm> it seems like _this_ proposal directly addresses that
18:09:16 <mattdm> mikem23 exactly -- i hope so
18:09:29 <sgallagh> mattdm: Yeah, I suspect that this is at least a partial solution to that problem
18:09:34 <mattdm> (not just change how they feel but also really address the underlying concerns)
18:09:39 <sgallagh> But we should probably ask dcantrell/dlehman about it
18:09:51 * mattdm nods
18:10:25 * sgallagh notes that we're past the one-hour mark. I'm free to continue, but I understand if others need to reconvene.
18:10:55 <mikem23> I should head out
18:11:10 <mikem23> Would love to see a write up of the meeting and/or a proposal
18:12:22 <sgallagh> Well, I've seen a lot of ideas, but no concrete proposals yet.
18:12:24 <nirik> yeah, I think we at least have some agreement on where to hook into?
18:12:41 <sgallagh> Ok, perhaps I missed it.
18:13:25 <sgallagh> mikem23: Also, I've got zodbot recording this, so I'll send the record to all of the participants when we close it out
18:14:03 <sgallagh> nirik: Would you mind summarizing where you think the hooks are?
18:14:20 <sgallagh> I got that we probably wanted to do something during the compose, but I missed where the buildroot hooks would be
18:14:45 <nirik> So, I think we need two places... buildroot checks and compose checks
18:14:51 * sgallagh nods
18:14:56 <nirik> the compose checks could just be added to the compose process.
18:15:06 <nirik> the buildroot we could look at hooking into fedmsg for...
18:15:23 <nirik> or even just run something periodically...
18:15:26 <sgallagh> nirik: With the caveat that we may want to split out the repocreate and deltarpm generation to save time
18:15:46 <puiterwijk> nirik: except that fedmsg is based on an "as available" service, and there's no promise if messages even arrive at all, let alone in which timeframe, AFAIK?
18:16:00 <nirik> true.
18:16:11 <puiterwijk> sgallagh: well, or just move the deltarpm creation to the back, and have it just bail out before that if tests fail, right?
18:16:12 <sgallagh> True, if we ran it periodically, all we'd really need to do is email all of the owners of packages that built something between checks, so they could see if they caused it or were affected by it
18:16:27 <nirik> so, the buildroot checking would just need to look at anything tagged into the buildroot, run checks on them and then move them to the 'normal' buildroot.
18:16:35 <sgallagh> puiterwijk: That's what I was trying to say. Do the create, then run tests, then deltas if the tests pass.
18:16:40 <nirik> or if they failed, mail the maintainer.
18:16:55 <puiterwijk> sgallagh: I see, your message seemed to indicate that they were to get two different processes
18:17:01 <tflink> sgallagh: not sure that would work so well - we've had issues with signal-to-noise ratio before with automated checks
18:17:30 <sgallagh> tflink: Explain?
18:17:54 <sgallagh> My recommendation initially is that we limit ourselves to critical issues.
18:18:04 <sgallagh> i.e. gcc won't produce usable binaries.
18:18:17 <tflink> autoqa used to be a lot more spammy, sending out a lot of emails. we got dev pushback and AFAIK, some people started routing autoqa mail to /dev/null
18:18:37 <tflink> my concern was more with "emailing everyone who built" on check failure
18:18:48 <mattdm> in this case, the packages won't land until the problem is fixed, right?
18:18:53 <sgallagh> Sure, but between the checks, it should be a short list
18:18:55 <puiterwijk> mattdm: right
18:18:59 <nirik> well, for buildroot I think a good first cut would be to run a basic build with it and fail if it can't install the base buildroot in mock
18:19:18 <sgallagh> nirik: That's a good place to start.
18:19:40 <nirik> I guess it would need to untag in that case... so the fix could be built.
18:19:40 <sgallagh> I might also suggest having it run at least a trivial compile (or an autoconf suite)
18:19:50 <nirik> but untagging in every case wouldn't be a good idea.
18:20:21 <nirik> sgallagh: we could, might be tricky to identify the bad build tho, no?
18:20:38 <nirik> if gcc, glibc, and libfoobar all landed at once...
18:20:40 <sgallagh> tflink: Perhaps: "Mail everyone whose package included a BuildRequires on the failed package"?
18:20:54 <sgallagh> nirik: True
18:21:21 <nirik> we could also do different things for critpath/non critpath
18:21:28 <sgallagh> Which was why my initial suggestion was to just treat the packages as a set and notify everyone that had built in that 20-minute period
18:21:37 <mattdm> nirik +1 critpath differentiation
18:21:37 <tflink> sgallagh: I might not be understanding the types of checks being proposed, but I don't see why it can't be limited to the candidates for which build(s) broke stuff
18:22:04 <sgallagh> tflink: That's what I said: only the owners of packages that are currently trying to enter the buildroot.
18:22:06 <tflink> or I could be misunderstanding the scope/implementation
18:22:14 <sgallagh> Not all maintainers in Fedora.
18:22:17 <sgallagh> That's horrifying
18:22:51 <tflink> it sounds like I'm really not understanding what's being proposed - I'll start re-reading the backscroll
18:23:11 <nirik> so foo, bar and baz are built and are in a newrepo... the tests run after that and see that mock cannot init a buildroot, so it untags all three and mails the maintainers "hey, one of these 3 packages broke the buildroot, if it's not you, retag, if it is, fix it"
18:23:40 <sgallagh> nirik: That's pretty much what I was trying to say, but more elegantly put.
18:24:05 <mattdm> I think that's fine as a first pass and then a further refinement can try to identify which package
18:24:28 <tflink> the check is just for composability, then?
18:24:31 <nirik> I think init buildroot in mock is a good first goal, yeah... then we could make it build some test package or something, etc
18:24:36 <tflink> nothing more fine grained than that?
18:24:54 <nirik> tflink: its for 'the base buildroot... ie, the buildsys-build group can be installed in mock'
18:25:03 <nirik> if thats not true, 0 builds will happen after that.
18:25:31 <mattdm> then we do more fine-grained tests against an image built from the compose, right?
18:25:42 <sgallagh> tflink: That's one test, not the exhaustive set. As nirik says, we may also add a test for "does this (set of) package(s) compile in this buildroot"
18:25:46 <nirik> so, this just tests the packages in that base buildsys-build that you can do a 'yum install buildsys-build' or whatever and have it work.
18:27:17 <nirik> yeah, so I think the division would be:
18:27:31 <nirik> buildroot test -> check that we can build things sanely.
18:27:46 <nirik> compose test -> check that the stuff we built is sane as a collection.
18:28:31 * nirik has to step away for a few minutes.
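[Editor's sketch, not part of the meeting log. The first-cut buildroot check described above (test the newly tagged builds as a set; if the buildroot cannot be initialised, untag the whole set and mail only those maintainers) might be modelled as below. The gate_batch() helper, the buildroot_ok callable and the message wording are assumptions for illustration; real untagging and mailing would go through Koji and a mail system, which are not called here.]

```python
from typing import Callable, Dict, List, Tuple


def gate_batch(
    builds: Dict[str, str],
    buildroot_ok: Callable[[List[str]], bool],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Check a batch of newly tagged builds as one set.

    `builds` maps a build name to its maintainer's address.
    Returns (builds kept in the tag, list of (address, message) to send).
    """
    names = sorted(builds)
    if buildroot_ok(names):
        return names, []  # nothing to untag, nobody to mail

    # The check cannot tell which build is at fault, so untag the whole
    # batch and notify only the maintainers of builds in that batch.
    message = (
        "One of these builds broke the buildroot: " + ", ".join(names)
        + ". If it is not yours, retag it; if it is, fix it."
    )
    mails = [(builds[name], message) for name in names]
    return [], mails
```

[Treating the batch as a unit keeps the first version simple; identifying the single offending build, or handling critpath packages differently, would be later refinements as suggested in the discussion.]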
18:35:36 <sgallagh> Sorry, had to disappear for a few minutes. Tornado warning in my area.
18:35:56 <sgallagh> I'm going to close out this meeting and try to schedule a continuation tomorrow. I'll send out the recorded minutes shortly.
18:36:57 <sgallagh> #endmeeting