18:00:08 #startmeeting
18:00:08 Meeting started Fri Dec 2 18:00:08 2011 UTC. The chair is davej. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:08 Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:00:13 whoo
18:00:18 #meetingname kernel
18:00:18 The meeting name has been set to 'kernel'
18:00:32 word
18:00:43 here
18:00:50 \o/
18:01:06 shall we start with the usual run through of each release ?
18:01:19 sure
18:01:49 want me to start with rawhide?
18:02:02 lets do them sequentially. so f14. those 200 or so bugs left open. I'm guessing a lot of them will be closed and stay as such when users move to newer releases.
18:02:16 but a bunch of them are sure to be still problems in 15/16
18:02:57 which is kinda bad, because bisecting is going to be even less of an option
18:03:00 some of those have been around for ages. When they become 15/16 bugs, I think we should try and do something more drastic about the longer open ones
18:03:19 I mean, if they've been open 3 years, chances are, we're not going to fix it, and they should really start bugging upstream instead
18:03:26 yes
18:04:50 that's about all I have for 14. anything to add ?
18:05:01 the -106 kernel is the last update
18:05:02 EOL is Dec 8
18:05:17 yeah, unless something dramatic happens this week, we're done there.
18:06:11 so. 15/16 we might as well deal with together, given they're now the same kernel
18:06:37 usual story: new release - buried alive in bugs.
18:06:38 except for the fake versioning and some config options, yeah
18:06:57 cebbert: anyone mentioned any feedback about your new change for the versioning ?
18:07:26 i pushed an F15 update that should fix vbox, nobody has provided feedback
18:07:45 ok
18:08:08 cebbert, s/fix vbox/fix compiling vbox modules
18:08:10 i guess i should post something somewhere - i couldn't find a bug report for that
18:08:16 heh
18:08:28 there seemed to be more noise on the lists than in bz about that
18:08:29 cebbert, i think i aliased one to amd_iommu.h
18:09:22 so something that I wrote down as a potential for f15/f16 updates. Dropping CONFIG_OPTIMIZE_FOR_SIZE. thoughts ?
18:09:46 has anyone tested if it makes a difference?
18:09:53 is it really that much faster
18:09:56 ?
18:10:31 I've done a bunch of microbenchmarks this week, and it's measurable. not massive, but still.
18:10:47 i'm fine with dropping it
18:11:28 http://fpaste.org/AHQJ/ is one run I did
18:11:59 different base version, but 3.1.2 -> 3.1.4 shouldn't have skewed them
18:12:34 doesn't seem like there's any real downside, and stuff gets slightly faster, so I can't see users complaining about the change
18:13:13 #action switch F15/F16 builds from -Os to -O2
18:14:04 anything more for 15/16 ?
18:14:11 would be nice if we could use -Os some places and -O2 others
18:14:30 cebbert: I think we do. some parts of acpi at least used to be -Os
18:14:47 because the interpreter was huge, and didn't need to be fast
18:14:50 davej, so on these "i did something on the disk and now the GUI/machine hangs for a few seconds" issues...
18:15:12 i'm wondering if we should poke at that more. the mm patch we added didn't seem to do much, did it?
18:15:17 Please?!?
18:15:32 it did help somewhat
18:15:50 i'm also wondering if full disk encryption plays into it at all
18:16:00 cebbert, not for me. at least not that i can tell
18:16:02 the last few versions, something weird has been going on in block/ see also all those 'copying files to usb kills interactivity' bugs.
18:16:11 did anyone try reducing the cache size and latency numbers in /proc/sys/vm?
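A rough sketch of the arithmetic behind that /proc/sys/vm question, not something anyone in the meeting ran. It assumes the knobs in question are the standard dirty-page sysctls (dirty_ratio, dirty_background_ratio, dirty_expire_centisecs, dirty_writeback_centisecs). The 4 GiB memory size, the 20 percent ratio, and the 10 MiB/s device speed are illustrative assumptions, and the helper names are made up for this example. The kernel applies the ratio to dirtyable memory rather than total RAM, so the result is only an approximation.

```python
from pathlib import Path

VM_DIR = Path("/proc/sys/vm")

# Standard writeback sysctls; which ones matter here is an assumption.
KNOBS = (
    "dirty_ratio",
    "dirty_background_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_vm_knob(name, vm_dir=VM_DIR):
    """Return the integer value of a /proc/sys/vm entry, or None if unreadable."""
    try:
        return int((Path(vm_dir) / name).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def dirty_limit_bytes(dirtyable_bytes, dirty_ratio_percent):
    """Approximate dirty page cache allowed before writers get throttled."""
    return dirtyable_bytes * dirty_ratio_percent // 100


def flush_seconds(dirty_bytes, device_bytes_per_sec):
    """Time to write that much dirty data at a steady device speed."""
    return dirty_bytes / device_bytes_per_sec


if __name__ == "__main__":
    for knob in KNOBS:
        print(knob, read_vm_knob(knob))
    # Illustrative numbers only: 4 GiB dirtyable, 20% ratio, 10 MiB/s USB stick.
    limit = dirty_limit_bytes(4 * 1024**3, 20)
    print("dirty limit (bytes):", limit)
    print("seconds to flush:", flush_seconds(limit, 10 * 1024**2))
```

With those assumed numbers the limit works out to about 819 MiB, which a 10 MiB/s device would need roughly 80 seconds to drain. That is the kind of gap that could explain an interactivity stall while copying to slow media.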
18:16:22 there's various threads upstream going on that might yield some improvements eventually
18:16:32 but that's going to be 'pick up on a rebase' type material
18:17:14 cebbert, that kind of thing is what i was meaning by poke. do we need to change any defaults to avoid flushing back 4G of page cache to disk, etc
18:17:49 i guess i should be able to reproduce that and test, i think i have a machine with 4GB
18:18:03 it's annoying that the kernel can't get that right and pick sane defaults for itself.
18:18:04 i'll look at it
18:18:20 davej, right
18:18:28 but afaik, it doesn't even try
18:18:40 akpm seems to be of the opinion that distros should run something magical on boot up to adjust that sort of thing
18:18:44 which seems kinda hokey
18:18:45 it should be throttling dirty data per-device based on how fast it can write data
18:19:31 "never allow more than X seconds worth of unwritten data on any device"
18:20:08 so something I'm most concerned about in 15/16, are those ext4 bug reports. has Eric made any progress on those ?
18:20:15 but instead we have a giant shared writeback quota
18:20:36 davej, no. he just looked at it last night for the first time
18:20:40 ok.
18:20:59 those all seem to be happening on resume from suspend, iirc
18:21:00 davej, apparently there's a thread upstream, but it's all of like 4 emails long and consists mostly of Ted asking questions
18:21:21 cebbert: I noticed that on a few. that's going to be fun to debug.
18:22:57 ok. move onto rawhide ?
18:23:07 sure
18:24:01 so rawhide is at 3.2-rc4 now. which is fairly stable from what i can tell
18:24:01 want to talk about -extra ?
18:24:08 ok
18:24:17 and then we added a modules-extra subpackage as discussed on the list
18:24:36 a couple of hiccups since it went in.
one was a broken dep in PAE, which i fixed this morning
18:25:06 and now linville has hit some other warnings with some modules in the main kernel package requiring things in modules-extra
18:25:25 how do we make sure someone updating from f16 gets both packages?
18:25:38 he has a patch to revert some of the sorting logic, but i'm wondering if we want to just call depmod to sort it all out and then move things
18:25:54 that's the thing, do we want them to automatically get -extra ? it's supposed to be the shit that not everyone needs, so..
18:25:59 cebbert, why would we need that?
18:26:02 probably better to call depmod
18:26:20 i have two main concerns calling depmod
18:26:43 1) we'd need to call it, use the results, then throw them all away. which is ok but somewhat wasteful
18:26:57 2) lately it seems to want to consume every piece of memory the machine has
18:27:07 2 is going to suck if it kills koji builds...
18:27:39 yeah, depmod has been sucking pretty hard lately. not heard anything more on it from jonmasters
18:28:17 i'll look at linville's issues and patch a bit more today and maybe try depmod
18:28:22 see how it goes
18:28:28 what happens if an f16 user is using a module that's now in the -extras pkg? won't things get broken if he only gets the base f17 package?
18:29:02 cebbert, yes, but it's somewhat going to be the case on a fresh install too...
18:29:15 well the breakage won't be of the order "can't boot", so it's not critical, and should be fairly obvious.
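The "call depmod and use the results" idea above could look something like the sketch below. It only covers the checking step: given the text of a modules.dep file as depmod writes it (one "module: dependencies" entry per line), it lists modules in the main package that need something from the extras package. The "extra/" path prefix, the function names, and the sample module names are all assumptions made up for illustration, not the actual spec file logic or linville's patch.

```python
def parse_modules_dep(text):
    """Parse modules.dep text into {module_path: [dependency_paths]}."""
    deps = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        module, _, rest = line.partition(":")
        deps[module.strip()] = rest.split()
    return deps


def cross_package_deps(deps, extra_prefix="extra/"):
    """Return main-package modules that depend on modules under extra_prefix.

    extra_prefix is an assumed location for the modules-extra contents.
    """
    problems = {}
    for module, needed in deps.items():
        if module.startswith(extra_prefix):
            continue  # extras depending on anything is fine
        in_extras = [dep for dep in needed if dep.startswith(extra_prefix)]
        if in_extras:
            problems[module] = in_extras
    return problems


# Hypothetical sample data; foo, bar and baz are not real modules.
SAMPLE = """\
kernel/drivers/net/foo.ko: extra/drivers/net/bar.ko
extra/drivers/net/bar.ko:
kernel/fs/baz.ko:
"""

if __name__ == "__main__":
    for module, needed in cross_package_deps(parse_modules_dep(SAMPLE)).items():
        print(module, "needs", ", ".join(needed))
```

Anything this reports would either have to move back into the main package or have its dependency moved out of extras. The memory cost of running depmod itself, raised as concern 2 above, is not addressed by a check like this.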
18:29:21 them: "my hardware isn't working" us: install modules-extra
18:29:28 davej: it's still on my todo, I apologize for not getting you an update sooner
18:29:28 aargh
18:30:06 there is 0 point in making a modules-extra package that always gets installed by default
18:30:15 me goes off to get "install kernel-modules-extra" tattooed on his forehead ;)
18:30:23 :)
18:30:43 honestly, if this proves to be more trouble than it's worth, i'm fine with dropping it entirely
18:30:49 years ago there was something ubuntu did that showed some "there are binary drivers for your hardware, clicky here to get them" popup on first login. it would be nice if we had a popup like that for -extras
18:30:52 right below "did you try an updated kernel"
18:31:07 davej: I've set aside some time to get some progress on that before this time next week. I'll get you an update.
18:31:13 jonmasters: cool
18:31:29 * jonmasters has been sucked into mostly all day meetings a bit this week
18:32:19 hmm, i wonder if we could recognize that automatically. we have all the dependency info for all modules available at build time.
18:32:59 hrm?
18:33:49 if we could somehow fake the autoloading out, we could detect people trying to load modules that aren't installed
18:34:28 no
18:34:36 because depmod is run at package install time
18:35:02 so anything in ...`uname -r`/extras/ isn't included when the kernel package is installed
18:35:23 right, that would have to change, or we'd have to provide dummy modules for everything we removed
18:36:12 anyway, in theory it could be done
18:36:42 i'm starting to like 'turn off esoteric modules' more and more ;)
18:36:56 that's another point..
18:36:58 heh
18:37:11 if enough people are complaining about something, that might mean it should be moved back out of extras
18:37:34 yeah, so back and forth
18:37:44 anyway, better move on, to get through the agenda items we had noted down..
18:38:04 so something that came up earlier this week when I was doing the -O2/-Os benchmarking.
18:38:18 the old story of us doing only -debug builds for rawhide.
18:38:29 couple things here.
18:38:45 1. it's a pain for people doing benchmarking.
18:39:17 2. some debug options seem to have gotten more expensive (or we're doing more work which is spinlock-heavy these days)
18:39:57 we do have the "--with=release" build option now for people who want to benchmark
18:40:26 right, but that still requires them to do a separate build
18:41:06 so the specfile contortions to do a separate -nondebug build are too ugly for words
18:41:14 but I'm wondering if it might be worth doing a release build once a week
18:41:25 so for eg, the monday build every week has debug off
18:41:30 then we turn it back on for the rest of the week
18:41:50 i was hoping that build option would quiet the complaints at least a little :)
18:42:14 i have a feeling people will just wait to update until mondays
18:42:27 hmm, i'm guessing people would only install those kernels and skip the rest
18:42:41 fine, less bugs ;-)
18:42:50 N
18:43:03 yes, at the cost of not finding stuff that needs to be fixed during the merge window
18:43:11 *potentially
18:43:48 well for most of the stuff that lockdep etc finds, most the time we either get a bunch of reports, or we see it ourselves anyway.
18:44:19 why is the spinlock debug so slow? maybe it's not using queuing like the asm code does?
18:45:11 now there's a project for someone: optimize the spinlock debugging...
18:45:14 cebbert: the biggest complaint seems to be "gnome-shell got really slow", indicating that dri is perhaps doing something special
18:46:36 if it's too slow to be useful, optimizing a debug feature might make sense
18:47:27 aiui it's pretty fast already, but the problem is the overhead involved in tracking all the locks
18:47:52 so if something is creating thousands of objects with locks in them, that's a lot of work for lockdep to keep up with
18:48:08 (I'm theorising here, I have no idea wtf dri is actually doing)
18:48:11 oh that's right, you can't turn that off by itself
18:48:39 so we don't really know if it's just the spinlock debugging that's the issue
18:48:52 another proposal, we don't do it every monday, we pick a day at random to surprise people ;)
18:48:57 i guess someone could test with just that enabled
18:49:26 or.. we don't turn off /all/ debug, just the heavyweight stuff.
18:49:41 yeah, that might work
18:50:07 missing a lockdep report or two isn't the end of the world. things like slab debug are more concerning.
18:51:32 davej, random seems worse
18:51:45 REGRESSION! YESTERDAY'S KERNEL WAS AWESOME AND TODAY'S SUCKS
18:51:46 ok, how about once per -rc ?
18:53:24 there are no good answers :/
18:54:29 I'd like to get to a point where we can track performance over time to see where things are going wrong.
18:54:56 because right now, we do pretty much no work in this area, then when someone decides to make a new rhel, we hear all the "hey, this is slower than 2.6.18" reports
18:54:57 and that point should be perhaps right before/after Alpha?
18:55:31 and also, if we make big changes in options, we really want to know if/when performance drops off
18:55:50 in my tiny brain, we'd still want Alpha to have debug on because it's the first big release people try. but after that...
18:55:59 agreed
18:56:18 the enterprise kernel people can do their own benchmarks ;)
18:56:23 that doesn't help us on what to do in branched state
18:56:35 does rawhide default to debug again?
18:56:48 rawhide never turns off right now.
18:57:07 i meant in our glorious future cases :)
18:57:10 i'd leave it that way
18:57:30 so rawhide is always on, Alpha is on, post Alpha is off
18:57:49 coming up on end of the hour.
18:58:09 is there another meeting after?
18:58:14 I think so
18:58:19 the cloud people up next I think
18:58:24 CLOUD!
18:58:41 the cloud people should put their meeting in the right spot then
18:58:41 we can take this part back to #fedora-kernel
18:58:51 ok
18:58:55 meeting channel page shows before us
18:59:02 anyway, sure #fedora-kernel
18:59:13 ok. so might as well wrap it up here for now.
18:59:32 same time in two weeks..
18:59:36 #endmeeting