18:00:00 #startmeeting Infrastructure (2012-04-19)
18:00:00 Meeting started Thu Apr 19 18:00:00 2012 UTC. The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:00 Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:00:01 #meetingname infrastructure
18:00:01 The meeting name has been set to 'infrastructure'
18:00:01 #topic Robot Roll Call
18:00:01 #chair smooge skvidal CodeBlock ricky nirik abadger1999 lmacken dgilmore mdomsch threebean
18:00:01 Current chairs: CodeBlock abadger1999 dgilmore lmacken mdomsch nirik ricky skvidal smooge threebean
18:00:11 who all is around for a infrastructure meeting?
18:00:15 * lmacken
18:00:19 * skvidal is here
18:00:23 * threebean is here
18:00:27 * athmane is here
18:00:31 * ianweller is last-minuting a physics lab report but will pay attention
18:01:18 * pingou
18:01:24 buenos
18:02:06 ok, lets go ahead and dive in.
18:02:13 #topic New folks introductions and Apprentice tasks.
18:02:14 If any new folks want to give a quick one line bio or any apprentices
18:02:14 would like to ask general questions, they can do so here.
18:02:32 any new folks? or questions on easyfix tickets or general getting started questions?
18:03:54 ok, moving on then...
18:04:02 #topic two factor auth status
18:04:03 no changes on 2fa - pto the beginning of this week.
18:04:18 ok. We can ping it next week...
18:04:29 #topic Staging re-work status
18:04:34 I've worked on this some.
18:04:40 I've commited docs on the new setup.
18:04:56 please read http://infrastructure.fedoraproject.org/infra/docs/staging.txt and let me know if anything needs fixing.
18:05:18 I'm going to try and get the bapp01->02 move done, and easyfix moved to production, then we can do the staging branch killing. ;)
18:05:53 any questions on that ?
18:06:35 ok, I'll schedule the actual staging killing after those other 2 things are done.
18:06:38 #topic Applications status / discussion
18:06:44 any news on applications this week?
18:07:08 abadger1999 / threebean / lmacken / pingou / CodeBlock / anyone else who works on apps. ;)
18:07:33 New python-fedora this week that should resolve some issues clients were seeing.
18:07:45 yeah, thanks for that fix.
18:07:54 abadger1999++
18:08:12 * nirik still doesn't fully understand it, but thats ok
18:08:16 I'm not aware of any of our scripts that were running into this but it won't hurt them to update on the infra boxes.
18:08:58 fedmsg progress - got a mediawiki plugin done and tested locally. waiting still on lots of package reviews.
18:09:13 he's alive!
18:09:44 threebean: could you perhaps drop a list to the infra list? or a wiki page link or something? then we can try and divide them up...
18:09:54 sure thing
18:10:08 * threebean writes that now
18:10:09 #info reviews needed to move fedmsg ahead.
18:10:20 #action threebean to post list
18:11:08 ok, any other app news?
18:11:22 where did we ever get with re-writing fpaste?
18:12:15 dgilmore: oh yeah, have some news on that...
18:12:32 we are looking at possibly using sticky-notes instead for the server side.
18:12:46 http://gitorious.org/sticky-notes
18:12:47 mmello, I believe was interested in helping out with reviews, too.
18:13:01 it's under review, pending some fixes from upstream.
18:13:13 athmane has found some issues and submitted patches to it.
18:13:24 abadger1999: yes..I'm
18:13:32 nirik: cool
18:13:44 mmello: get to work ;)
18:13:50 abadger1999: threebean is helping to understand the process and I'm currently review my first package
18:13:53 dgilmore: :)
18:14:03 #info looking at sticky-notes for a pastebin server
18:14:11 nirik: would the current fpaste client work with the new server? or need rewritten?
18:14:18 there's a command line 'pastebinit' already in fedora.
18:14:25 cool
18:14:35 so, we could look at making it obsolete fpaste and perhaps provide a compat link.
18:14:50 it misses a few things like the --sysinfo tho.
18:15:18 anyhow, it looks promising.
18:15:34 hello nirik and everyone
18:15:44 morning codemaniac. welcome.
18:16:01 good morning to you too nirik
18:16:03 nirik: :) figured id ask
18:16:09 but php
18:16:15 hi all. did i get lost timezones again?
18:16:17 pingou: yeah, its php.
18:16:27 ctria: meeting just started 15m ago. ;)
18:16:35 any further news on applications?
18:16:35 :(
18:16:49 sorry for joining late team
18:16:56 no problem at all.
18:17:24 CodeBlock: you around ?
18:17:39 ok, will move along then...
18:17:44 #topic Meeting time
18:17:54 is everyone ok with this meeting time? 18UTC?
18:17:59 +1
18:18:00 +1
18:18:03 +1
18:18:07 +1
18:18:25 ok then... seems popular.
18:18:47 yea niriki , i am very much comfortable with this time , its +5:30 UTC here
18:19:07 +1024
18:19:13 cool.
18:19:34 #topic Upcoming Tasks/Items
18:19:43 #info 2012-05-01 to 2012-05-15 - F17 Final Freeze.
18:19:43 #info 2012-05-01 - nag fi-apprentices.
18:19:43 #info 2011-05-03 - gitweb-cache removal day.
18:19:43 #info 2012-05-09 - Check if puppet works on f17 yet.
18:19:43 #info 2012-05-10 - drop inactive fi-apprentices
18:19:44 #info 2012-05-15 - F17 release
18:19:53 so final freeze is coming up.
18:20:25 One thing I will probibly schedule soon is a migration of our hosted machine... hosted03 -> hosted01/02. Should be a pretty easy migration.
18:20:37 anyone have other items they are working on they would like to schedule or shout out?
18:21:46 ok, I'll take that as a no. ;)
18:21:55 nothing here
18:22:06 #topic Private Cloud
18:22:14 just a short mention of our cloud plans....
18:22:14 we have new progress?
18:22:35 not really... but I made a wiki page: https://fedoraproject.org/wiki/Infrastructure_private_cloud
18:22:58 so, if people have suggestions/ideas, post them on list.
18:23:01 * skvidal counts backward from 5
18:23:16 we are currently waiting for hardware still, but I think we have close to the config we want speced out.
18:23:32 nirik: you have already finalized the budget and initial designs right , t
18:24:02 what exactly the cloud is going to host ?
18:24:05 codemaniac: yeah, for the initial part of it. We can hopefully expand it next year if it works out. ;)
18:24:11 codemaniac: read the page - it explains
18:24:12 see the wiki page...
18:24:52 so, I think it will be very handy once we have it in place.
18:24:56 just curiosity, can't we have any cloud solution from red hat or this is what red hat proposes?
18:25:13 as we get closer to deploying it, we will no doubt want lots of people to help test it and such.
18:25:22 nirik: who's doing the work? I'd love to help. I should have some time in the next couple of weeks to ramp up
18:25:31 ctria: we evaluated a number of cloud tech, and found eucalyptus to meet our needs.
18:25:52 herlo: skvidal is point on it... we are kinda holding on hardware tho.
18:25:59 hi
18:26:13 nirik: skvidal: okay, just ping me when you are ready
18:26:14 we have a test cluster running. it doesn't do a lot b/c of the lack of ips to use w/it
18:26:27 nod
18:26:44 yeah. hard to use when you can't get to it from outside easily.
18:26:51 me too in , would like to offer whatever i can
18:27:23 #info waiting for hardware, will move forward once thats done.
18:27:24 what will it take to get more IPs? Or do we need to free some up elsewhere?
18:27:53 the new hardware will have a pile of external ips for us to use.
18:28:01 okay, great
18:28:14 * herlo is done asking questions now
18:29:25 cool...
18:29:28 #topic Open floor
18:29:42 anyone have items for open floor? general questions? etc
18:29:59 global updates
18:30:14 should I go ahead and just shoot a yum update on most everything?
18:30:59 yeah, I think so.
18:31:02 +1
18:31:08 catch back up after the freeze.
18:31:10 ok
18:31:37 I don't think we need reboots tho.
The last update kernel fixes several security issues, but not sure how much they apply to our setup
18:32:04 second thing
18:32:12 builders and koji
18:32:25 I'd like to take buildvm01 and 02 out of koji for a while
18:32:31 skvidal: ok
18:32:31 so I can mess with them
18:32:44 sounds fine to me. Not much build activity right now anyhow.
18:32:56 and I'd like to find out about more migration to having more builders be vm-based
18:33:32 what were the next steps in that plan?
18:33:53 1. we need a way of telling koji that some pkgs need A LOT of ram
18:34:13 2. we need to convert more of the x86-## boxes to buildvmhost
18:34:33 skvidal: we have no way today of knowing that
18:34:50 A lame question , what is koji ?
18:34:52 dgilmore: won't we need that for arm anyhow? and seed with their list?
18:34:58 codemaniac: the build hub
18:35:00 codemaniac: not lame at all. ;) Our buildsystem...
18:35:11 skvidal: for arm we have a seperate channel called heavybuilder, and it has a list of packages in the koji hub policy that it sends to that channel
18:35:13 is there a list of packages to blame for LOT ram?
18:35:16 but its not an automatic thing
18:35:23 its a manually maintained list
18:35:25 dgilmore: oh - so we can't do that?
18:35:28 ctria: more or less
18:35:40 http://fedorahosted.org/koji and http://koji.fedoraproject.org/koji/
18:35:41 then we could probably create a new channel and redirect them there
18:35:44 codemaniac: https://fedoraproject.org/wiki/Using_the_Koji_build_system should answer the questions do not hesitate to ask for clarification
18:35:48 skvidal: we can but its prone to failure and will need constant tweaking
18:35:56 in the new channel to have better specs for builders
18:36:07 dgilmore: having all of our hosts have 8GB+ ram seems.... like poor resource use
18:36:13 dgilmore: does changing that need a hub restart?
18:36:16 thanks misc, nirik .Sure i will ask
18:36:20 nirik: it does
18:36:20 dgilmore: since 95% of our pkgs don't need even 2GB of ram
18:36:25 thats a pain.
18:36:25 skvidal: right
18:36:39 skvidal: +1 to not using a hammer for all builders
18:36:41 I suspect the largermem pkgs are probably under 50
18:36:46 wonder if we could ask upstream koji for a better fix... some on the fly thing.
18:37:06 and I think the list would not grow much over time
18:37:19 nirik: well we could probably write the policy so that if the task load is above say 4 where 6 is the max it goes to a bigger builder
18:37:28 I've run into similar issues, I think it's possible to have a koji setting for ram requirements.
18:37:43 nirik: but there could be something under a load of 4 that uses lots of ram to say link
18:37:53 yeah.
18:37:59 the last load is dynamically worked out based on avaergae build completion time
18:38:05 the lask load
18:38:10 task
18:38:11 lol
18:38:14 couldn't cgroup be used to track the ram used by a set of process ?
18:38:25 misc: builders are all rhel6
18:38:53 so, I guess we all ponder on the problem and try and come up with a good solution...
18:39:02 nirik: I think your suggestion is a good place to start
18:39:07 ask koji upstream for help
18:39:24 ok so can we track the memory on vm ?
18:39:25 it would be nice to have a better solution for arm too...
18:39:33 I think that's how balooning does, no ?
18:39:36 nirik: probably the best solution would be to have koji keep track of ram usage during builds and store that. then we could do things nicely
18:39:43 but thats likely alot of work
18:39:43 misc: the problem is that we get a package to build and we don't know how much mem it will need to build.
18:39:45 misc: you mean on the fly allocation of ram?
18:39:52 ( or maybe polling memory in the vm every minutes, to have some stats )
18:39:53 +1 nirik's suggestion
18:40:00 misc: normally the event which tells us it is out of memory is that it OOMs
18:40:01 dgilmore: hum... yeah.
18:40:26 we can somewhat safely make assumptions based on previous builds
18:40:30 or perhaps a FTBFS run could also tell us which packages need more mem and we set the list based on that?
18:40:46 at least it would be a good seed.
18:41:07 i suspect that having a task load of more than 4 would cover 95% of cases
18:41:11 maybe 100%
18:41:12 #info need a solution to find builds that need lots of memory.
18:41:13 there is definition for package name in koji policies http://fedoraproject.org/wiki/Koji/Policies
18:41:22 #idea ask koji developers upstream for a solution
18:41:44 nirik: hmm
18:42:04 nirik: so here's an idea of what i could do on buildvm##
18:42:14 it would take a while, though...
18:42:30 iteratively build every pkg on the 4gb they have available
18:42:37 and see which ones cause it to oom
18:42:53 that would take quite a long while...
18:42:56 channel = method createrepo :: use createrepo has req_channel :: req is_child_task :: parent source */gcc* :: use heavybuilder source */glibc* :: use heavybuilder source */kernel* :: use heavybuilder
18:42:58 I know :)
18:43:07 dgilmore: -ENOCONTEXT
18:43:09 thats a snippet of the policy use on arm
18:43:28 once we have the cloud setup, we could fire up N builders with 4GB and mass rebuild... :)
18:43:28 it came aross bas
18:43:44 nirik: so...
18:43:49 ceph is apparently another one.
18:43:58 webkitgtk is also a pig ;)
18:44:01 nirik: I could convert an existing x86-## box
18:44:05 nirik: to buildvmhost
18:44:06 skvidal: http://fpaste.org/QfFj/ thats the full current policy used on arm
18:44:16 nirik: and then i'd have 8+ builders to test with
18:45:00 yeah, still will take a while, but less while. ;)
18:45:53 I'm open to other options...
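[Note added to the log] The ARM policy snippet quoted at 18:42:56 sends builds whose source matches a glob such as */gcc* to the heavybuilder channel. The idea of a manually maintained pattern list can be sketched in Python as below. This is only an illustration under assumptions, not Koji's code or configuration: HEAVY_PATTERNS, pick_channel, the "default" channel name, and the example.invalid URLs are all hypothetical names invented for this note.

```python
from fnmatch import fnmatchcase

# Hypothetical pattern list, copied from the globs in the ARM policy snippet.
# In the discussion this list is maintained by hand.
HEAVY_PATTERNS = ["*/gcc*", "*/glibc*", "*/kernel*"]


def pick_channel(source, default="default"):
    """Return 'heavybuilder' if the build source matches a heavy glob.

    `source` is a build source string (hypothetical format); `default`
    is an assumed name for the channel used by ordinary builds.
    """
    for pattern in HEAVY_PATTERNS:
        if fnmatchcase(source, pattern):
            return "heavybuilder"
    return default


if __name__ == "__main__":
    for src in ("git://example.invalid/gcc", "git://example.invalid/bash"):
        print(src, "->", pick_channel(src))
```

A list like this has the weakness raised in the meeting: a memory-hungry package that nobody has added (ceph and webkitgtk are mentioned) still lands on a small builder.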
18:45:56 considering our build load
18:46:03 I could probably rob 2 more machines w/o us losing out muchj
18:46:24 I'm fine with that, but perhaps we should see if koji dev's have any brilliant ideas we are missing first?
18:46:24 skvidal: sure. we really only hit the builders really hard for mass rebuilds
18:46:25 our peak afaict has been like 23% use
18:46:36 nirik: you're no fun at all ! :)
18:46:46 I know, right? :)
18:46:52 nirik: I can bounce an email to buildsys list
18:46:58 dgilmore: technically, that's some form of mass rebuild, so maybe this can be done at the same time
18:47:20 ( or, if someone is gonna rebuild everything, this can produce a list of FBTS and a list of package that need more ram )
18:47:40 nirik: ill talk with the mikes
18:48:07 dgilmore: even better :)
18:48:12 works for me.
18:48:22 #action dgilmore to talk to mikes about koji options.
18:48:39 misc: we typically only do mass rebuilds when there's a good reason... rpm payload change, gcc change, etc.
18:48:40 agh, late, sorry
18:48:41 * CodeBlock here
18:49:08 misc: but we hope to use the new cloud setup to mass rebuild to check the FTBFS stuff.
18:49:11 nirik: I think I'd like the opportunity to see if I can do a sensible port of ftbfs over to mockchain, etc
18:49:38 yeah, would be good. keeping in mind hopefully that it would be portable to the privatecloud.
18:49:52 it should just be ssh-based
18:50:08 so the only difference is how you get the instances started
18:50:13 not how you spawn off jobs
18:50:26 yeah.
18:50:45 question about ftbfs
18:50:55 does that build all rawhide srpms against rawhide?
18:51:20 yep.
18:51:34 we can ask mdomsch for his scripts.
18:51:55 skvidal: i believe thats how mdomsch did it
18:51:58 I think it was just basically: build everything once, get list of fails, rebuild, cycle until you have no changes.
18:52:19 nirik: nod
18:52:26 nirik: I was checking on which base it built from
18:52:30 but there's issues with that too I guess.
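[Note added to the log] The loop described at 18:51:58 (build everything once, collect the failures, rebuild them, repeat until the failure list stops changing) can be sketched as follows. This is a minimal illustration, not mdomsch's scripts, which are not shown in this log. The names ftbfs_fixed_point and try_build are hypothetical; try_build stands in for whatever actually runs a build and reports success. The reason a retry might succeed (for example, a dependency that only became available in an earlier pass) is an assumption, not something stated in the meeting.

```python
def ftbfs_fixed_point(packages, try_build):
    """Rebuild failing packages until the failing set stops changing.

    packages:  iterable of package names (hypothetical input).
    try_build: callable taking a package name and returning True on a
               successful build, False otherwise (hypothetical hook).
    Returns the set of packages that still fail at the fixed point.
    """
    failing = set(packages)
    while True:
        # One pass: attempt every package that is currently failing.
        still_failing = {pkg for pkg in failing if not try_build(pkg)}
        if still_failing == failing:
            # No package was fixed in this pass, so stop cycling.
            return still_failing
        failing = still_failing
```

The loop always terminates, because each pass either shrinks the failing set or returns. As noted later in the meeting, the result is only meaningful if the repository being built against does not change between passes, which is the motivation for the rawhide snapshot idea.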
18:52:32 can someone expand ftbfs for me plz
18:52:32 ie: did he take a single rawhide snapshot
18:52:40 Fails To Build From Source
18:52:42 herlo: fails to build from source.
18:52:45 tx
18:52:49 skvidal wins!
18:52:51 ie, you get the src.rpm we have in git and it doesn't work.
18:53:06 http://fedoraproject.org/wiki/Fails_to_build_from_source
18:53:25 yeah, I have done something very similar with my koji project
18:53:44 rawhide can be a challenge, since there's so many moving packages. ;(
18:53:50 * dgilmore needs to run
18:54:08 nirik: which is why I was thinking about taking a snapshot of rawhide into a local mirror
18:54:13 nirik: that all the builders can see
18:54:47 you mean for the build repo/binary rpms? or the git repos building from?
18:54:51 or both.
18:55:25 build repo/binary
18:55:37 so that what it is building against for the build time doesn't change
18:55:54 yeah.
18:56:04 although if we can get it down to under a day, it would be the same rawhide. ;)
18:56:18 I cannot fathom how many machines we would need for that
18:56:28 large number...
18:56:32 dgilmore: can I convert x86-16,17, 18 over to buildvmhosts?
18:57:20 skvidal: I think he had to run...
18:57:27 okay
18:57:34 lets continue out of meeting on this...
18:57:36 ok
18:57:39 sorryt for running on
18:57:41 anything else for meeting?
18:57:47 no worries at all. Good to discuss.
18:58:06 one more item
18:58:26 does anyone else have any interest in ansible and/or tryting to write playbooks?
18:58:33 * herlo does
18:58:38 just haven't had time yet
18:58:42 understood.
18:58:47 will next week or two for sure
18:58:54 one last thing
18:58:58 depending on what gets done with euca, I'd be up for it
18:59:01 yeah, I'd like to look at ansible more too.
18:59:09 I didn't hear any horrible screaming about the ssh key thing last week
18:59:15 just some minor grumbling
18:59:27 so....
I'm going to go ahead and make a magic key and put it on lockbox01
18:59:44 and start playing with it on the builder installs I've been working on
18:59:51 fair enough.
19:00:03 it would only be those instances for now, right? until we expand ansible use?
19:00:09 exactly
19:00:12 nothing else will have it
19:00:23 seems sane. Shake out any issues now
19:01:06 #info will setup initial ansible ssh keys for testing with.
19:01:19 okay, I'm done with all my openfloor things
19:01:20 ok, shall we call it a meeting?
19:01:23 :)
19:01:36 yup
19:01:40 call it
19:01:43 I have one quick thing
19:02:00 CodeBlock: go for it.
19:02:33 we talked about Stickynotes as a pastebin last week, I quickly made a Fedora theme mockup (showed nirik last week, but wanted to get some more voices about it)
19:02:33 http://images.srv1.elrod.me/stickynotes-fedora.png
19:02:44 CodeBlock: oh, codemaniac was looking to see if he could help with dpsearch. If you catch him on-line, see if he can assist any
19:03:13 I think it's a pretty good start... might need more blue. :)
19:03:14 nirik: cool, he's been trying to ping me and our schedules have been totally missing each other
19:03:48 CodeBlock: oh, cool
19:03:52 * herlo looks
19:04:00 #info provide feedback on sticky-notes theme: http://images.srv1.elrod.me/stickynotes-fedora.png
19:04:03 oh, very nice!
19:04:30 CodeBlock: I like the theme, I bet we could package that up pretty quickly
19:04:48 ping me when you have a package and I'll do a review
19:05:02 we can get that into the infra repo soon I hope.
19:05:02 herlo: it's already up. ;) dcr226 has it up and I am reviewing.
19:05:07 herlo: Cool. It needs some coding work, I just wanted to get a working prototype, but I will continue working on it
19:05:09 nirik: with the theme?
19:05:19 I'm specifically talking about the theme
19:05:23 herlo: oh, sorry, no, not the theme...
19:05:27 yeah, would be good to get that in too.
19:05:28 :) nw
19:05:42 ok, calling it a meeting as we are over.
19:05:45 nirik: yeah, knew about the stickynotes review
19:05:54 CodeBlock: awesome.
19:05:54 Lets all continue over on #fedora-admin and #fedora-noc and #fedora-apps.
19:05:58 indeed
19:05:58 thanks for coming everyone!
19:06:02 #endmeeting