19:00:01 #startmeeting Infrastructure (2012-02-02)
19:00:01 Meeting started Thu Feb 2 19:00:01 2012 UTC. The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:01 Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:00:01 #meetingname infrastructure
19:00:01 The meeting name has been set to 'infrastructure'
19:00:01 #topic Robot Roll Call
19:00:01 #chair smooge skvidal Codeblock ricky nirik abadger1999 lmacken dgilmore mdomsch
19:00:01 Current chairs: Codeblock abadger1999 dgilmore lmacken mdomsch nirik ricky skvidal smooge
19:00:16 * skvidal is here
19:01:05 I'll wait a few and start after folks drift in...
19:03:16 ok... I guess let's dive in.
19:03:18 #topic New folks introductions and apprentice tasks/feedback
19:03:26 any new folks around who would like to say hi?
19:03:42 or apprentices have questions/concerns/notes?
19:03:47 hello
19:03:49 * abadger1999 here but not new ;-)
19:03:56 * jsmith lurking, but not new either
19:03:57 * nirik just sent out the monthly apprentice ping email.
19:04:02 Southern_Gentlem: Greetings :-)
19:04:40 * jds2001 says hi :)
19:04:42 welcome Southern_Gentlem
19:05:44 * wsterling here
19:05:50 hey wsterling
19:06:14 ok, moving along then...
19:06:25 #topic two factor auth status
19:06:33 I don't see herlo around...
19:06:56 I'm going to try next week and see what I can do to move this forward. I'd like to see us get to the first planned thing soon.
19:07:04 hey yall
19:07:11 (which is sudo for sysadmin-main using 2 factor)
19:07:15 hey dgilmore
19:07:34 #topic Staging re-work status
19:07:42 so averi has been working on this.
19:08:11 we have been testing each of our stg machines and seeing what needs to be fixed when we point them at the master branch in puppet instead of the staging branch.
19:08:39 however, an implementation issue/question has come up with respect to the apps... how to organise the stg config in the master branch.
19:09:00 this might be something for the list, but thought I would bring it up here.
19:09:38 The options seem to be:
19:10:09 a) any files that are different between stg and prod get conditionals and their own copy of the file in .stg.
19:10:36 b) we copy everything to another tree for the application (ie, modules/bodhi/ vs modules/bodhi.stg/)
19:10:41 b is a lot of changes tho.
19:11:08 well b is true for any app we're deploying twice in 2 different configurations
19:11:15 for example: rsync
19:11:24 we can have an 'rsyncd' module
19:11:32 but if our configurations are wildly different enough
19:11:40 it will be a pain to keep both of those in one module
19:11:55 the rsync stuff is more like plan a) above I think.
19:11:56 when, in reality, we end up needing rsyncd-download and rsyncd-projects or some-such thing for modules
19:12:02 the rsync stuff, currently
19:12:04 but another example
19:12:05 httpd
19:12:12 httpd really is more like b
19:12:29 we have a daemon which is going to provide divergent services with divergent configs
19:12:40 why should httpd be radically different?
19:12:41 putting them all in one module is just a recipe for confusion and frustration
19:12:55 jds2001: have you looked at our websites/httpd module layout?
19:13:02 ever tried to follow it back? it's a nightmare
19:13:22 anyway
19:13:28 my point is more like this
19:13:30 yeah, but it winds up being very modular, which i think is a good thing
19:13:41 jds2001: modular in much the same way that atoms are modular
19:13:43 but yeah, following it all back is living hell
19:13:44 yes, you can build anything
19:13:50 but you have to know WAY too much about it
19:14:09 * jds2001 wont argue that :)
19:14:09 ugh sorry..
19:14:14 anyway - if bodhi in staging is going to be a major diverging point
19:14:26 perhaps what would be good is to post the changes we have on say app01.stg and have people propose their puppet solution. ;)
19:14:26 then I say just make a separate module dir
19:14:33 bodhi.stg or bodhi.tng
19:14:37 but it's not very different.
19:14:45 if it is only going to be config changes
19:14:50 but they are across a lot of files
19:14:52 then i'd say think about
19:15:05 bodhi/files/staging/somedir <-- and using recurse
19:15:14 and bodhi/files/production/somedir <-- and using recurse
19:15:36 ie: if it is all of the same piece - w/ different configs - then all in one module
19:15:55 if it is not really the same service/implementation anymore then separate modules
19:15:57 it's more just bodhi.cfg
19:16:13 then why is that difficult?
19:16:15 why not just
19:16:19 bodhi.cfg.$hostname
19:16:24 or bodhi.cfg.stuff
19:16:32 have bodhi.cfg be 'production'
19:16:37 and everything else is a modification around that
19:16:50 yeah, that was my thought, but not sure if it works for application developers. ;)
19:17:02 which part?
19:17:07 and more to the point
19:17:13 we did this with the postfix configs for stg hosts... added a conditional and stg hosts get foo.stg config.
19:17:17 abadger1999: ?
19:17:32 I thought our app devels were going to start treating the apps more like apps
19:17:35 and less like configs?
19:17:53 hum/
19:17:53 those apps need to be configured, no? :)
19:17:55 ?
19:18:14 jds2001: that's not the same thing as hotfixes and patching
19:18:21 right.
19:18:37 to me, a hotfix should be a new rpm
19:18:46 nirik: I thought the discussion from fudcon resolved out to:
19:18:51 1. everything is production
19:19:02 if we change just cfg to have foo.cfg and foo.cfg.stg that works fine until you need to change something in a manifest, in which case you need to add it with a conditional for stg... which I don't know is that bad really...
19:19:04 2. we have some boxes called 'staging' but really all they are is production boxes for developers
19:19:12 right.
19:19:30 have we actually encountered the manifest conditional issue, yet?
19:19:38 so it's the same apps but with different config or versions or changes...
19:19:56 well, we are using it in the postfix stuff now I think...
19:20:07 nirik: explain?
19:20:09 if we have separate files for bodhi.cfg vs bodhi.cfg.stg; we'll need to have conditional manifests.
19:20:18 abadger1999: ???
19:20:29 we need to have source=
19:20:31 have a list
19:20:38 but it's not CONDITIONAL
19:20:41 since we'll want different files depending on whether the host is on a stg box. right?
19:20:46 again
19:20:51 a source= fall through list
19:20:52 yeah. so, look at:
19:21:06 puppet/modules/postfix/manifests/init.pp
19:21:08 postfix, rsync, resolv.conf
19:21:16 yeah.
19:21:30 for staging hosts we set: postfix_group = stg
19:21:36 so they get a staging config.
19:22:33 but this only works for files, right?
19:22:47 umm this is unix - everything is a file :)
19:22:54 Hmm... I'm leery of mixing this in there as well but I'm not sure we'll hit what I'm thinking of in practice.
19:23:00 if stg needs a change from prod in say bodhi to work with a new version, that would need to be conditional.
19:23:01 abadger1999: huh?
19:23:02 Say we have bodhi.cfg, bodhi.cfg.masher
19:23:13 and then we need stg versions of both of those.
19:23:35 What's the fallthru look like?
19:23:37 yeah.
19:23:47 unless we're talking about 100 boxes
19:23:51 the fall through is
19:23:55 $hostname
19:24:03 $bodhi_group
19:24:10 bodhi.cf
19:24:12 g
19:24:14 or
19:24:21 $bodhi-group.$hostname
19:24:23 $hostname
19:24:27 $bodhi-group
19:24:30 bodhi.cfg
19:25:17 am I misunderstanding something?
19:25:34 So bodhi group would be ('', 'masher', 'stg', 'masher.stg') ?
19:25:46 bodhigroup can be whatever you want
19:25:59 bodhigroup=toshiolovesgingerale
19:26:20 you just make a file corresponding to it
19:26:45 I think the files part works fine with this, but the issue is that other changes in manifests, etc. will need to be in a staging or host conditional... which means if you are not careful you leak that change to production... but perhaps that's not such a big deal in practice.
19:27:04 nirik: I'm still confused by that
19:27:21 is there a change we normally make to enable/disable something that is not actually done in the config file
19:27:26 and then the service is reloaded?
19:27:31 why not make the conditional on bodhi-group
19:27:31 sorry 'notified'
19:27:37 skvidal: so, say we have a new raffle app.
19:27:37 jds2001: indeed
19:27:41 nirik: okay
19:27:51 we want to test out the new version. It needs 2 new packages installed.
19:28:06 so, we add that to the raffle module, but we don't want the production one getting that yet.
19:28:17 okay
19:28:20 so, we have to add it with 'if != staging' or whatever.
19:28:34 and that's onerous?
19:28:37 then we test it out and get it working after some more changes.
19:28:44 and we clip the 'if'
19:28:50 then we need to recall what things should be changed to make it alive in prod.
19:29:00 which are all inside the 'if'
19:29:11 unless there's other things in there that should stay in stg.
19:29:37 or someone forgets the if and adds it and it messes up production.
19:29:50 okay
19:29:59 I am failing to see how this is a problem that's unique to this situation
19:30:10 when we would migrate from master/staging with branches
19:30:12 perhaps it's not.
19:30:13 we'd have all of this and more
19:30:19 true.
19:30:30 Well.. it's what the solution looks like.
19:30:36 abadger1999: ??
19:31:46 So if we had a separate module-level directory for bodhi, we could capture the persistent differences between production and stg in conditionals
19:31:55 * nirik notes we make poor use of comments in puppet files. Perhaps using them more would help this kind of thing. ;)
19:32:09 And the testing-of-new-version changes would just be in the file itself.
19:32:34 abadger1999: okay...
19:32:36 then do that
19:32:45 * skvidal doesn't understand all the handwringing here..
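
The source= fall-through list described between 19:20:51 and 19:24:30 could look roughly like the Puppet sketch below. It is an illustration only: the class name bodhi::config, the target path /etc/bodhi.cfg, and the variable $bodhi_group are assumptions rather than names taken from the real repository, and the underscore in $bodhi_group stands in for the hyphenated spelling used in the chat.

```puppet
# Sketch only: class name, target path, and $bodhi_group are assumptions.
class bodhi::config {
  file { '/etc/bodhi.cfg':
    ensure => file,
    # Puppet walks this list in order and uses the first source that exists,
    # so the most specific file name goes first and the production default last.
    source => [
      "puppet:///modules/bodhi/bodhi.cfg.${::hostname}",
      "puppet:///modules/bodhi/bodhi.cfg.${bodhi_group}",
      'puppet:///modules/bodhi/bodhi.cfg'
    ],
  }
}
```

With a layout like this, a staging host whose node sets $bodhi_group to 'stg' would pick up bodhi.cfg.stg, and any host without a more specific file would fall back to the plain bodhi.cfg, so the manifest itself needs no conditional.
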
19:32:46 then when we merge from stg to production, or vice-versa, it would be a straight cp bodhi.stg/file bodhi/file
19:32:46 right. It's a lot more change tho.
19:33:16 So... it's just what it looks like/how we use it afterwards.
19:33:18 nirik: how so?
19:33:48 because puppet hates duplicates... so all classes will need to also be renamed, right? and then conditional on those in the node files?
19:34:05 bodhi::app::epelmasher vs bodhi-stg::app::epelmasher
19:34:44 nirik: I thought that was only true if you tried to include them both
19:34:50 * skvidal tests
19:34:52 yeah, I did too.
19:34:54 not sure.
19:35:07 easy to find out
19:35:12 * skvidal picks on his favorite fall-host
19:35:13 but then how do you include one of the stg ones?
19:35:23 in the host manifest
19:35:28 bodhigrp=staging
19:35:36 if $bodhigrp=='staging':
19:35:40 include bodhi.stg
19:35:44 else
19:35:48 include bodhi
19:35:49 fi
19:36:15 so releng01.stg node file has:
19:36:17 include bodhi::app::masher
19:36:21 how does that translate?
19:36:27 yeah, puppet has no way of telling what's in your puppet code - only what ends up in a compiled catalog does it care about.
19:36:42 nirik: something I'm not sure of
19:36:43 if you can do
19:36:48 include $var::app::masher
19:36:50 If we could do include bodhi$env that would be even easier.
19:36:55 abadger1999: testing
19:36:56 yeah.
19:36:59 and if someone does something stupid like tries to combine staging and production, they deserve the fail that awaits.
19:37:16 (and we want it to fail)
19:37:59 anyhow, how about we investigate this more and go on with the meeting?
19:38:08 works for me
19:38:33
19:38:36 if we can work out a way to have separate dirs without massive churn it's fine with me.
19:38:44 We don't have to make it perfect, just better than what we have now ;-)
19:38:47 I wanted to throw out something with the staging re-work but not how we are going to organize files in Puppet.
19:38:59 wsterling: go ahead...
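
The shell-style if/else typed at 19:35:28 to 19:35:49 is pseudocode. In Puppet's own syntax the same idea might be written as in the sketch below. The fully qualified node name and the module name bodhi_stg are hypothetical; the chat calls the staging tree bodhi.stg or bodhi-stg, and an underscore is used here only on the assumption that it is the safer spelling for a class name.

```puppet
# Sketch only: the node name and the bodhi_stg module are hypothetical.
node 'releng01.stg.example.org' {
  $bodhigrp = 'staging'

  # Pick the staging copy of the module on staging hosts,
  # and the production module everywhere else.
  if $bodhigrp == 'staging' {
    include bodhi_stg
  } else {
    include bodhi
  }
}
```

Only one of the two classes ends up in the compiled catalog for a given host, which matches the point made at 19:36:27 that Puppet only cares about what is actually included.
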
19:39:34 As to the building out of infrastructure on-demand in staging there is a module that will allow Puppet to provision libvirt guests, https://github.com/carlasouza/puppet-virt#readme
19:39:51 huh... interesting.
19:39:58 * skvidal vomits
19:40:04 Chef is more mature in that area but it would require an entire shift which would be hard...
19:40:04 1. we don't have 'images'
19:40:17 2. we do not have templates
19:40:33 3. I do not think we want to dig further into the hole which is puppet
19:40:37 this is just my opinion, though.
19:41:21 we have been looking at various cloud/virt setups to see if they will help us out too.
19:41:30 wsterling: thanks for the link.
19:41:46 #topic Upcoming outages
19:41:55 we have two outages tonight...
19:42:03 #info download-i2 outage tonight.
19:42:12 #info pkgs and koji outage tonight.
19:42:17 Hopefully they will go smoothly.
19:42:32 #topic Applications status / discussion
19:42:45 abadger1999 / lmacken / threebean: any application news of note?
19:43:02 New FAS will be coming out today.
19:43:06 new bodhi went out yesterday
19:43:16 I was going to see if I could summarize the url thread and come up with something we can all agree on and move forward with.
19:43:30 unless someone else would like to (that would be just fine with me too :)
19:43:36 Cool.
19:44:07 I played around with glusterfs some... if everyone could look at my mail and see if there are any other places where it might help us that would be great.
19:44:29 If anyone is interested in working on FAS, I have a few i18n issues that should be interesting to work on.
19:45:01 It could serve them well for working on i18n in other web projects as well.
19:45:15 #info will try and finalize url scheme next week.
19:45:28 #info please note uses for glusterfs on the list.
19:45:42 #info some easyfix fas tickets available in the i18n space.
19:46:05 s/easyfix/interesting/ :-)
19:46:13 also next week I can look at spinning up stuff for packages production... would be good to move forward to prod
19:46:22 sorry.
19:46:22 #undo
19:46:22 Removing item from minutes:
19:46:32 #info some interesting fas tickets available in the i18n space.
19:47:14 ok, any other apps news? OH... I have one more.
19:47:27 We talked about search engines again the other day.
19:47:49 I noted that sphinx had a mw plugin and would be easy to set up for that... but harder for everything else.
19:48:04 then we thought about xapian... which is being used by tagger.
19:48:17 Thoughts on trying to use xapian for all our searching?
19:49:42 I'll likely start a list thread on that idea.
19:49:46 I think, see how easy it is to administrate xapian+omega for searching mediawiki.
19:50:00 Compare to ease of administrating sphinx-mw plugin.
19:50:05 yeah.
19:50:07 Pick the winner.
19:50:21 we could probably set up a test xapian box and crawl the wiki and see.
19:50:46 I think sphinx will be easier for the wiki, but less easy for everything else.
19:51:11 It's not impossible to change search technology in mw later.
19:51:15 ping: back on the include $var discussion - when we get a chance
19:51:43 So if sphinx is a snap to set up and maintain... I don't see a problem doing that until/unless we have the itch to deploy something more complex for all of fp.o
19:52:06 Otherwise it'll never get done :-)
19:52:06 well, the plugin still needs packaging, but yeah... we can see.
19:52:09 yep.
19:52:17 #topic Staging redux
19:52:26 skvidal: ?
19:52:35 $var=foo
19:52:43 include "${var}::app"
19:52:44 works
19:53:08 you can see an example of this in torrent02.fedoraproject.org.pp in manifests/nodes
19:53:23 you have to have the braces and the double quotes
19:53:31 w/o the double quotes it won't expand it
19:53:39 and puppet will quietly not include it
19:53:45 (and it won't tell you anything about that either)
19:53:57 neat
19:54:05 so, abadger1999 I think that's what you want
19:54:17
19:54:55 wfm if it works for everyone else.
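
The interpolated include reported as working at 19:52:43 could be applied to the bodhi case along the lines of the sketch below. This is a guess at how it might be used, not the contents of the torrent02 node file mentioned above: the node name, the variable $bodhi_module, and the bodhi_stg module are all hypothetical.

```puppet
# Sketch only: the node name, $bodhi_module, and the bodhi_stg module
# are hypothetical.
node 'releng01.stg.example.org' {
  # A staging host points at the staging copy of the module;
  # a production host would set this to 'bodhi' instead.
  $bodhi_module = 'bodhi_stg'

  # Both the braces and the double quotes are needed; per the notes at
  # 19:53:23 to 19:53:45, the unquoted form is silently not included.
  include "${bodhi_module}::app::masher"
}
```

Compared with an if/else block in every node file, this keeps the staging/production choice in a single variable per host.
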
19:55:01 so, with this setup we want bodhi.stg and bodhi to be exactly the same... except staging machines use bodhi.stg/
19:55:34 so, when you make changes in stg it only affects stg and when you are done you can cp/rsync them to production with only your changes, right?
19:55:59 and I think if we decide any app configuration is sufficiently weird/different that the above helps - we can do that
19:55:59 and we want that setup for any of our apps we usually test in stg? or ?
19:56:11 but in the cases where it is just a config file (like postfix) we don't need to
19:56:15 we can just use fall-through files
19:56:49 sounds good.
19:57:34 #topic Open Floor
19:57:43 anyone have anything for open floor?
19:58:05 euca and rhev
19:58:08 and $other things
19:58:17 last week I got to play with both eucalyptus and rhev
19:58:26 and they are definitely very different :)
19:58:44 yeah.
19:58:58 eucalyptus might get us a place where we can treat more systems like what wsterling was recommending
19:59:16 but instead of using puppet, we'd be using the euca2ools
19:59:19 what does it use as storage, btw? filesystem? lvm?
19:59:32 local + nfs + otherthings
19:59:45 you can have it talk to iscsi w/o any problem.
20:00:11 ok.
20:00:30 I'm going to work on installing euca3 tomorrow
20:00:40 which is currently 'devel' but it is a slushie devel
20:01:06 sounds good.
20:01:08 I talked to one of the euca devs for a while last friday and his recommendation is euca3 b/c the interface is better and a variety of features act more like they should
20:01:39 he also worked on a couple of ways to simplify deployment of new systems w/o having to rely on someone else's images
20:01:52 I talked about the ridiculousness of image creation on my blog last week
20:02:03 and it was followed up by a series of other comments/blogs
20:02:24 the reality is that creating images to use on instances in euca is stupidly hard
20:02:40 yeah, sounds like there might be progress on fixing it?
20:02:43 and there is no legitimate reason for this complexity - it appears to be entirely amazon's fault
20:02:46 yeah
20:02:57 by amazon I mean - b/c of a desire to copy ec2
20:03:12 no one has been focusing on the local deployment/image creation mechanism
20:03:22 but in each and every case the admin maintaining the systems has either
20:03:31 1. been using other people's images (which I cannot personally imagine)
20:03:47 2. or hacking it up by themselves and figuring out the same 40 steps everyone, ultimately, takes
20:04:11 in the case of some red hatters they spent a lot of time writing boxgrinder to work around ec2/euca not allowing you to simply run the frelling installer
20:04:21 it is, on its best day, the dumbest thing I've ever seen
20:04:29 not their work
20:04:31 it's not dumb
20:04:43 that they had to do it - rather than the cloud systems just fixing it is dumb
20:05:03 well, hopefully it can get fixed in the right place... and you can just install whatever you want to install. ;)
20:05:06 yes
20:05:08 exactly
20:05:10 there is some progress
20:05:14 we'll see where that goes
20:05:30 I'm still using junk03 and 05 right now
20:05:48 and 02 is a rhev manager install - but it can be reinstalled - I have all my notes on it
20:05:50 we should have some more machines there soon for testing.
20:05:56 junk04?
20:06:06 and 3-4 new ones.
20:06:09 nirik: great
20:06:17 it would be great to do some torture tests on gluster
20:06:27 I'm going to do some more gluster testing... yeah.
20:06:34 in particular on slow links.
20:06:51 nod and I'd like to see how well it performs with brutal writes
20:06:55 even over fast links
20:07:01 like lots and lots of small files
20:07:21 the case I'm thinking of is all the cache and other data that packages is going to be making
20:07:27 especially the git checkouts
20:07:28 yeah, my test the other day is some virtuals here at home... but I can set up more on new junk boxes easily.
20:07:50 for serving lots of small files, the docs suggest using the nfs frontend...
20:07:58 but that's less than ideal for failover.
20:08:22 anyhow, we are over time, anything else? or shall we call it a meeting?
20:08:57 I don't have anything else.
20:09:53 thanks for coming everyone!
20:09:53 #endmeeting