18:15:57 #startmeeting Fedora Release Engineering Meeting
18:15:57 Meeting started Mon Aug 31 18:15:57 2009 UTC. The chair is Oxf13. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:15:57 Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:16:05 #topic roll call
18:16:16 ok
18:16:36 ping: notting jwb warren wwoods rdieter lmacken poelcat dgilmore spot
18:17:01 hello
18:17:20 Have you folks seen the ongoing nss drama? It isn't fully fixed yet, but at least most packages build against it now.
18:17:39 yes, been watching it
18:18:16 * dgilmore is here
18:18:30 * spot is mostly here
18:19:05 i'm here in spirit, though on the phone
18:20:23 alright lets get started.
18:21:11 Our last meeting was the 17th, and there weren't any action items for the meeting
18:21:21 #topic Fedora 12 Alpha recap
18:21:50 So 12 Alpha went out, albeit a bit late
18:21:52 woo and all that
18:22:10 pretty uneventful release process, no big surprises
18:22:21 Oxf13: I see there is a dist-f12-maven, are they actively working on it now?
18:22:41 warren: we'll get to that later, thanks
18:22:42 Oxf13: are the extra targets like that the reason for newRepo taking so long, or other reasons?
18:22:46 ok
18:23:52 We also enabled early branching for F-12
18:24:39 so far, only fedora-release has been built for fedora 13
18:24:52 but I suspect we'll see more as time progresses
18:25:07 I've branched a few packages.
18:25:34 cool
18:26:13 still going into rawhide for now, yes
18:26:14 ?
18:26:27 F-12 branched packages go into rawhide
18:26:36 and devel/ for unbranched packages go to rawhide as well
18:26:48 k. need access to the f12 key via sigul before they start going to updates-candidates
18:26:51 builds from devel/ where the package already has an F-12 branch go into dist-f13, which does nothing yet.
18:26:59 jwb: did I give that to you yet?
18:27:05 don't think so
18:27:10 k
18:27:14 easy enough to fix
18:27:26 pre-branching was the last ticket for the Alpha milestone, and I can go close that now
18:27:45 I have a lot of tickets to create for the Beta milestone
18:27:59 warren: a bunch of targets have been removed
18:28:07 anything else regarding F12 Alpha?
18:28:16 Oxf13, did we ever decide if bit flip could be automated?
18:28:25 mostly so you don't have to be awake at ass-early AM
18:28:30 it could be yes
18:28:45 doable for beta?
18:29:46 sure, it's just an at job
18:30:06 k. add another ticket to open for beta i guess :)
18:30:15 'schedule bit flip at job'
18:30:23 I have a question about the targets when it is appropriate to ask.
18:30:33 warren: noted.
18:31:29 #topic Snapshot 1
18:32:21 Snapshot 1 is scheduled to be released this Friday
18:32:28 snapshots are typically just Live images
18:34:05 Live images haven't been composing lately due to ssl and nss fallout
18:34:21 multiple people are working on that issue today, hopefully it'll be cleared up by the time we're ready to compose the snapshot
18:34:54 is anything composing right now? anaconda has broken deps on ssl
18:35:40 yeah, not much
18:35:44 other than a repo of packages
18:36:41 is nscd still broken with ssl?
18:36:50 #info Need to track ssl/nss efforts leading up to snapshot release.
18:37:12 warren: no clue. We don't really test the bits, we just make them available for others.
18:37:28 it was a broken dep
18:37:32 looks fixed
18:37:42 we were having hell rebuilding glibc last week due to nss
18:37:43 anything else on snapshot 1?
18:38:11 here (late)
18:39:35 #topic Slow newRepo tasks
18:39:47 newRepo tasks seem to have gotten even slower lately.
18:39:59 We knew they got slow after the koji upgrade, but recently it's gotten worse
18:40:07 multiple hour delays trying to do chain-build
18:40:08 oh, the maven target was removed?
18:40:20 I looked into this a bit and noticed a few things.
18:40:39 1) we had a lot of extra build targets that cause newRepo tasks anytime anything in the fedora stack changed.
18:40:47 of note, dist-f12-openssl and dist-f12-maven
18:40:58 both gone now right?
18:41:01 the openssl target was being actively used until last week, and I retired it.
18:41:19 dist-f12-maven was created a while ago, back toward the beginning of the f12 cycle. It was heavily used for a short period of time
18:41:24 Was maven really needed? It seems maven is self-contained and doesn't need its own buildroot?
18:41:38 I mean, maven isn't any less broken with its own buildroot.
18:41:49 the maven work has been taken over by other people, who either didn't know about, or care about the buildroot and private branches we created and did things on devel/ into the main repos
18:41:56 so I've removed the dist-f12-maven target too.
18:42:31 2) I also made it a point to cut short the inheritance inspection for each newRepo task
18:42:43 cut short means?
18:42:50 our current tags, dist-f12 and dist-f12-build had inheritance going all the way back to dist-fc6
18:42:58 oh
18:43:11 however nothing is changing in anything from dist-f9 or further back
18:43:29 so I've created f9-cutoff and f9-build-cutoff
18:43:42 I'm cloning dist-f9-updates and dist-f9-build into these tags
18:43:55 and will make dist-f10(-build) inherit from them accordingly
18:44:08 Oxf13: i thought we decided awhile ago to use dist-f9-eol
18:44:14 and then bump it each release
18:44:20 This will save some time when getting the latest build information, for the sake of newRepo and other such tasks
18:44:39 dgilmore: I don't recall any decision.
I had asked about it one night and by the time anybody gave me feedback I had already created the cutoff tags
18:44:58 im looking to upgrade /mnt/koji hopefully that will help some also
18:44:59 the name really doesn't matter, it will only be seen by people digging through inheritance listings
18:45:08 Oxf13: i gave feedback when i saw your comment
18:45:21 dgilmore: but you didn't see it until after I made the tags (:
18:45:33 regardless
18:45:47 this is a small but important speedup for our process.
18:45:51 we can probably try limiting newRepo tasks to just 3
18:45:55 since its at 4 now
18:46:07 3) I realized that we recently added epel to our koji setup.
18:46:28 Oxf13: which added 3 targets
18:46:31 this adds at least 3 targets, dist-4e-epel, dist-5e-epel, and dist-5e-epel-infra
18:46:47 Oxf13: we should be able to disable the olpc targets
18:47:01 and with updates and overrides going out frequently this adds even more newRepo tasks to our queue
18:47:13 and as dgilmore mentioned, we have a hard cap on how many concurrent newRepo tasks we run
18:47:23 its 4 right now
18:47:33 its 3 internally
18:47:42 and there are many more targets internally
18:47:48 dgilmore: do we have good data as to if 3 at a time will result in more finished in an hours time than 4 at a time?
18:48:17 Oxf13: no. i only know that internally limited to 3 to reduce db thrashing
18:48:20 dgilmore: are you certain that nothing is using the olpc targets?
18:48:47 Oxf13: cjb would know for sure. but their last releases have come from fedora proper
18:49:52 ok. so as I see it, we have a number of avenues to pursue in order to speed up our repo creations
18:49:56 ol
18:50:05 A) reduce number of targets.
18:50:17 B) Shorten inheritance chains
18:50:27 C) Fine tune concurrent repo run limits
18:50:53 D) profile newRepo task to discover delays and improve
18:50:59 E) speed up disk to reduce time to run createrepo
18:51:26 dgilmore: when I last looked at it, the createrepo time was a small small fraction of the total task time
18:51:59 which box does it run on?
18:52:06 Oxf13: it spends ~ half the time in init and half on the builder doing the createrepo and uploading the metadata
18:52:11 warren: the builderws
18:52:14 builders
18:52:16 ooh
18:52:17 er not even half.
18:52:27 http://koji.fedoraproject.org/koji/taskinfo?taskID=1646486 for example
18:52:35 Mon, 31 Aug 2009 16:57:46 UTC
18:52:50 the i386 task didn't even get started until Mon, 31 Aug 2009 17:44:32 UTC
18:53:01 and it was done by Mon, 31 Aug 2009 17:51:08 UTC
18:53:02 Do the createrepo runs on ppc builders take longer?
18:53:14 that's only 7 minutes for createrepo and import
18:53:19 Oxf13: ok, last i looked it was about half/half
18:53:28 err, newRepo tasks
18:53:29 by far the most time we're spending is on init
18:53:29 Oxf13: so the init is taking forever
18:53:46 Oxf13: which really is all on the db/hub
18:54:05 Oxf13: i have an idea to try something
18:54:18 Oxf13: kojira runs on koji2
18:54:34 ill make sure that the builders and the public hit koji1
18:54:37 warren: ppc builders seem to do the createrepo / import task in about 7 minutes, same as the other arches.
18:54:40 and see if that helps at all
18:54:58 ok.
18:55:21 #action dgilmore to move builders + public to koji1 allowing kojira more of koji2 resources in an attempt to speed up newRepo init time
18:55:55 #info typically newRepo init time is close to an hour, whereas the actual createrepo time + import is 7~ minutes
18:56:10 database queries are the slow part?
18:56:36 warren: something in the init process. We need proper profiling to know which part is the "slow" part.
18:56:41 dgilmore: when was the koji upgrade?
18:56:55 Oxf13: before the f11 mass rebuild
18:57:00 Oxf13: this code is in koji git?
18:57:16 warren: yes
18:57:20 Oxf13: we needed to support strongerhashes
18:57:51 dgilmore: the upgrade also allowed adding epel buildroots right?
18:58:01 warren: Oxf13 it allowed external repos
18:58:01 so any epel newRepo task was after the upgrade?
18:58:23 Oxf13: i added epel buildroots shortly after the upgrade
18:58:32 ok
18:58:36 so people could scratch build epel builds
18:58:45 dgilmore: which has been incredibly helpful
18:58:59 http://koji.fedoraproject.org/koji/taskinfo?taskID=1267794
18:59:12 Ok, prior to the upgrade, newRepo tasks were going as quick as 10 minutes
18:59:18 that's init, createrepo, and import
18:59:33 init was taking 4~ minutes
19:00:27 #info prior to koji upgrade, newRepo init duration was 4~ minutes. Now it's 60~ minutes
19:01:28 interesting
19:01:30 Oxf13: that's also "prior to mass rebuild"
19:01:48 mbonnet: what other variable does that throw in?
19:01:55 mbonnet: to that particular mass rebuild. We've had mass rebuilds before that too
19:02:10 warren: many more packages, more data to deal with, more loop iterations
19:02:52 Oxf13: sure, just saying it may not be tied directly to the upgrade. The newRepo code hasn't changed significantly.
19:03:02 mbonnet: erm, it was the same amount of packages, just the top level tag had more builds
19:03:12 what's the best way to instrument python apps?
19:03:19 Oxf13: that's what I meant, more builds
19:03:59 Oxf13: I've seen a lot of variability in newRepo task duration too
19:04:16 I don't think they *all* take 60 minutes.
19:04:44 do any now take less than 40 minutes?
19:04:53 I've seen between 40 - 120 minutes lately
19:04:59 mbonnet: as of late, anything dist-f* seems to be taking 60+ minutes
19:05:45 does anybody remember the date of the upgrade?
19:06:16 do we have numbers on the database load?
19:06:25 http://koji.fedoraproject.org/koji/tasks?state=closed&view=tree&method=newRepo&order=-completion_time
19:06:38 might be easy to add a "Duration" column
19:06:41 just subject two numbers
19:06:47 subtract
19:07:56 https://koji.fedoraproject.org/koji/taskinfo?taskID=1644875
19:07:59 < 30 minutes
19:08:07 what were you guys wondering about on db load?
19:08:38 and actually, that init took about 15minutes
19:08:44 db load (db3 where koji is) has stayed pretty flat. Around 1
19:08:55 mmcgrath: any different from March?
19:08:59 mmcgrath: speculating on the cause of the slow newRepo tasks
19:09:40 anyway, this needs more investigation
19:09:44 mmcgrath: how about load on the kojihub machines?
19:09:54 mbonnet: I could help poke if someone could open a ticket telling me what commands get run and where
19:10:03 #action 0xf13 to plot newRepo task duration over time for the past year or so
19:10:15 #action Oxf13 to plot newRepo task duration over time for the past year or so
19:10:19 I mistype my own freaking nick
19:10:24 koji2's busy, load around 4 but not horribly busy
19:10:37 not swapping or anything
19:10:40 mmcgrath: is that the one the builders hit?
19:10:44 mmcgrath: everything is on koji2 right now
19:10:55 mbonnet: yeah, ping me after the meeting if you we want to look closer.
19:10:56 dgilmore: correct.
19:11:01 mbonnet: builders, public, kojira
19:11:08 dgilmore: gotcha
19:12:00 anything else regarding newRepo tasks for today's meeting?
19:12:39 alright.
19:12:40 Oxf13: are logs of each newRepo kept anywhere?
19:12:57 warren: what kind of logs?
19:12:59 there aren't really any logs
19:13:10 there might be some minimal output from createrepo, but that's not the slow part
19:13:25 someplace instrumentation could be printed to
19:13:45 warren: I suggest taking that to #koji after the meeting
19:13:51 ok
19:13:52 if you're interested in adding profiling code
19:14:01 probably much easier to just log to syslog from the kojid
19:14:41 alright looks like we're done with newRepo, we've got some action items out of it.
19:14:46 warren: did you have any further topics?
19:14:50 no
19:15:01 #topic Open Floor
19:15:10 anything else for the gallery?
19:16:09 Any reason for us not to make F-12 branches of new packages?
19:16:15 us == CVS admins?
19:16:32 Folks are asking for them; I'm not sure why, but I assume they understand where to find their builds.
19:17:36 tibbs: you could
19:17:52 tibbs: im sure people are asking because they dont know that they dont need to
19:18:24 I guess in that case they'll be confused about why they don't see their devel builds anywhere.
19:19:15 As long as there's no technical reason why I shouldn't do it, I'll go ahead and make F-12 builds when requested.
19:21:29 tibbs: yeah, probably just a communication issue
19:22:45 alright, thanks all for coming!
19:22:51 #endmeeting
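
The timing breakdown quoted for task 1646486 (18:52 to 18:53 in the log) can be reproduced by subtracting the three timestamps, which is also roughly what a "Duration" column would do. A minimal sketch in Python, under two assumptions that the log does not confirm: that the first timestamp (16:57:46) is when the parent newRepo task was created, and that the i386 subtask's completion marks the end of createrepo plus import. The function name phase_durations is made up for illustration and is not part of koji.

```python
from datetime import datetime

# Timestamp layout as it appears in the log, e.g. "Mon, 31 Aug 2009 16:57:46 UTC".
# The trailing "UTC" is matched literally; all three values share that zone.
FMT = "%a, %d %b %Y %H:%M:%S UTC"


def phase_durations(created, child_started, child_done):
    """Split one newRepo task into an init phase and a createrepo+import phase.

    Hypothetical helper: the phase boundaries are an assumption drawn from
    the discussion, not something koji itself reports under these names.
    """
    t0 = datetime.strptime(created, FMT)
    t1 = datetime.strptime(child_started, FMT)
    t2 = datetime.strptime(child_done, FMT)
    return {
        "init": t1 - t0,               # time before the per-arch subtask starts
        "createrepo_import": t2 - t1,  # time spent on the builder
    }


if __name__ == "__main__":
    phases = phase_durations(
        "Mon, 31 Aug 2009 16:57:46 UTC",
        "Mon, 31 Aug 2009 17:44:32 UTC",
        "Mon, 31 Aug 2009 17:51:08 UTC",
    )
    for name, delta in phases.items():
        print(f"{name}: {delta}")
```

For the timestamps above this gives just under 47 minutes of init and just under 7 minutes of createrepo and import, in line with the figures mentioned in the meeting.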