19:00:01 #startmeeting Infrastructure (2011-05-26)
19:00:01 Meeting started Thu May 26 19:00:01 2011 UTC. The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:01 Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:00:01 #meetingname infrastructure
19:00:01 #topic Robot Roll Call
19:00:01 #chair goozbach smooge skvidal codeblock ricky nirik abadger1999
19:00:01 The meeting name has been set to 'infrastructure'
19:00:01 Current chairs: abadger1999 codeblock goozbach nirik ricky skvidal smooge
19:00:05 * ricky
19:00:11 * skvidal is here
19:00:13 * janfrode
19:00:15 * CodeBlock
19:00:39 * Klainn lurking
19:01:17 ok, lets go ahead and dive in...
19:01:32 #topic New folks introductions
19:01:38 any new folks want to say hi? :)
19:01:51 or apprentice folks like to chime with any questions or comments...
19:02:13 janfrode: I saw your check on that one machine... meaning to reply there. ;) Sorry for the delay.
19:02:37 nirik: yes, was it something like that you wanted per machine ?
19:02:59 yeah. I think we can fix some of the things on that one and it will help with others.
19:03:35 * nirik will try and look in more detail on that later today.
19:04:09 any other new folks/apprentice issues? I did update the page a bit more: https://fedoraproject.org/wiki/Infrastructure_Apprentice
19:04:48 I'll also try and find a few more easy tickets for apprentice folk to look into if they find them interesting...
19:05:03 ok, moving on then in a minute...
19:05:11 * StylusEater is late, sorry
19:05:22 is late also, sorry
19:05:25 no problem.
19:05:38 you guys have any questions or comments on apprentice stuff?
19:05:47 here but not here
19:05:57 do we think it is working?
19:06:06 I think it's too early to tell...
19:06:10 does being able to login and look around help?
19:06:16 does for me
19:06:17 nirik: not right off hand
19:06:18 skvidal: can we run the puppet changes from that ticket?
19:06:30 from which ticket?
19:06:31 StylusEater: which ticket?
19:06:32 well in understanding tuff
19:06:33 skvidal: yes being able to login does help
19:06:34 stuff
19:06:43 goozbach: tuff stuff
19:07:00 nirik: grabbing it
19:07:21 .ticket 2777
19:07:22 StylusEater: #2777 (/etc/system_identification missing on some hosts?) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2777
19:07:59 skvidal: we were in freeze when the last meeting happened and you mentioned to put some changes in the ticket
19:08:04 ah
19:08:05 yes
19:08:09 thanks for reminding me
19:08:13 yah - I can check that in
19:08:13 cool.
19:08:40 skvidal: mcpita feedback too if you have time.
19:08:55 .ticket 2747
19:08:56 StylusEater: #2747 (publictest minder script/db) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2747
19:08:56 StylusEater: oh yes - I completely forgot about that...
19:09:06 StylusEater: again, thanks for the reminder
19:09:08 excellent. We can handle those out of band?
19:09:13 yes
19:09:38 ok, moving on in a minute unless something more on this topic comes up...
19:09:57 when is the apprentice cleanup?
19:10:25 CodeBlock: good question. I had it as part of the per release housecleaning, but I think we should make it more often. ;)
19:10:37 I was going to mail the group on the 1st and ask for feedback and how people were doing.
19:10:50 people who don't reply by say the 7th, we could remove?
19:10:55 sounds good
19:10:58 +1
19:11:00 we can always readd people who are away.
19:11:04 yep
19:11:20 nirik: good idea
19:11:48 #action nirik will ping all fi-apprentice folks on the 1st. Inactive people will be removed the 7th.
19:11:56 #topic Upcoming Tasks/Items
19:12:18 CodeBlock: when did you want to do the noc01 update? monday (which is a holiday), or ?
19:12:41 gah, yes Monday is a holiday ..and Tuesday is updates day
19:12:50 2011-05-31 18UTC UPDATES/reboots
19:13:07 Wednesday?
19:13:08 also, Smooge wanted to do secondary cutover on the 31st (from 01 to 02)
19:13:15 (re nagios)
19:13:16 sure.
19:14:04 also, next week I am going to look at a new puppet and app01.
19:14:10 (or later this week I guess)
19:14:20 they are both on xen14, which is heading out of warranty.
19:14:39 so Wednesday the nagios stuff is definitely happening, it's been pushed back like 5 times now.
19:14:46 yeah.
19:14:54 it would be good to get it over.
19:14:57 yeah
19:15:07 Other upcoming items:
19:15:14 2011-06-14 or so: post release housecleaning tasks.
19:15:14 2011-06-17 FPCA drop dead.
19:15:47 I'm going to try and add more details to the post release housecleaning.
19:15:51 Re: the fpca drop dead. I have a script to gather some fas/pkgdb metrics to run.
19:15:53 will mail the list for more feedback
19:16:02 abadger1999: cool.
19:16:34 Will do that sometime this week.
19:16:43 I think we are about 33% of people in cla_done+1 who have signed the fpca now.
19:16:58 nirik: That's great news!
19:17:38 excuse me, 36% it seems. ;)
19:18:07 ok, anyone have any other upcoming items they would like to discuss?
19:18:49 ok, moving on then...
19:18:56 * skvidal had one
19:19:01 go ahead:
19:19:07 merging of staging into master
19:19:09 * nirik can't type. go ahead.
19:19:10 and doing away with staging
19:19:28 1. is anyone opposed to getting rid of 'staging' in our puppet git repo?
19:19:40 I think stg the instances/machines has been useful.
19:19:57 nirik: the machines, sure
19:20:08 but I'm just talking about the staging 'branch' in our git repo
19:20:09 you mean change how we deal with it, right?
19:20:13 yes
19:20:32 I've found that generally folks apply changes to master
19:20:37 and never merge them over to staging
19:20:42 for a while I thought it was just me
19:20:47 but then I discovered it was not
19:21:10 so in looking at what we have
19:21:16 I think if we can come up with a better way that would be great.
19:21:28 well since the machines we use are just hostname.stg.blah
19:21:37 and we have to have that anyway for the nodename
19:21:51 .stg modules?
19:21:57 is there any reason that we want to make something to test out - we couldn't do them the same
19:21:59 That we delete when we're done?
19:22:03 or -staging class names
19:22:09 ricky: why not?
19:22:16 ricky: it's not like we're running out of disk space ;)
19:22:25 or is there some way to #include the real one and just add our changes/staging changes above that?
19:22:54 Puppet lets you do that, but then you can take those configs and paste them directly into the production configs
19:22:55 inheritance in puppet is sometimes...... challenging to follow out
19:23:05 So it's probably easier to just copy entire modules or classes that you need to modify
19:23:09 nod
19:23:23 There's things I like about the staging branch setup... but maybe it could be done differently.
19:23:41 abadger1999: what do you like about it?
19:23:47 well, most of the time they will not be different from the real machine they are staging for except for a staging variable and /etc/hosts, right?
19:23:58 Being able to merge master <=> staging is nice when you just have your changes -- debit is it's a pain when there's other people's changes and you don't know what to do with it.
19:24:30 At work we run two git branches for puppet.. "master" on our lab/staging servers, and "production" on our production servers. We try to do all changes in "master", test them a few days on lab/staging before merging to "production".
19:24:42 having the $environment variable so a template can be the same for staging and production but create a different file for .stg and prod is the other thing I like.
19:25:00 Part of the trouble stems from the fact that our puppet configs is one giant repo, instead of being split out into separate bits that people care about.
But I don't see that changing anytime soon
19:25:08 janfrode: yeah, our prod is 'master' which is a bit backwards. ;(
19:25:24 abadger1999: see the $environment actually is one of the items that bothers me
19:25:34 I dislike git branches in general, though, so if we can get something similar without the git branches, I'd be more than happy.
19:25:36 So you end up doing a merge, and having 10 other people's staging changes conflicting
19:25:47 ricky: yes
19:26:20 abadger1999: b/c you end up diverging so much and not being able to find the changes
19:26:22 it would be nice whatever we end up with has a workflow that requires the changes in stg first...
19:26:25 but part of this is the way git does branches
19:26:32 nirik: except that doesn't work
19:26:38 skvidal: I mean in a template:
19:26:43 nirik: The problem with that right now is that our staging environment isn't complete :-/
19:26:45 nirik: since our staging environment doesn't even remotely mimic our actual envir
19:26:46 or automatically having them appear in stg when they land in prod
19:26:56 yeah, there is that too.
19:27:01 nirik: automatically just means anyone doing any real serious new work
19:27:05 in staging
19:27:12 There was work to make koji01.stg actually work and be testable against two years back, but that never finished
19:27:18 is going to get tromped on when someone adds something in master
19:27:23 I think staging has been mostly useful for web stuff, and that's about it
19:27:34 ricky: and then only with great effort
19:27:36 *Maybe* testing updates, if people notice that they break
19:27:38 skvidal: Things like this: http://fpaste.org/OYiM/ (from pkgdb.cfg)
19:28:03 it was useful for pkgs01 changes recently.
19:28:03 abadger1999: that exact same thing - in the postfix config - makes me want to hurt things
19:28:17 this is just what ricky and I were discussing
19:28:17 skvidal: ah? So it's causing lots of divergence?
19:28:23 * abadger1999 looks at postfix config
19:28:31 in #fedora-noc
19:28:43 our modules are named after the app name
19:28:47 NOT after the service name
19:28:50 so we end up with 'httpd
19:28:51 module
19:28:54 which is useless
19:29:00 b/c everything has an apache server on it
19:29:08 and we end up with entire world inside that module
19:29:26 we're much better off with service-instance names for our modules
19:29:28 so we have a module like
19:29:36 mailman-project
19:29:38 mailman-hosted
19:29:42 etc, etc
19:30:01 rather than having to do some horrible things to create 1 template file for fairly divergent configurations
19:30:13 * nirik nods.
19:30:21 skvidal: I don't see any <% if environment ==.... in the postfix module.
19:30:23 I'm fine with configfile.cfg.hostname, too
19:30:32 abadger1999: in that case it's not the environment that's tripping it up
19:30:36 skvidal: I think the service thing is different than the environment thing
19:30:38 abadger1999: in that case it is the 'is router' thing
19:30:44 * ricky isn't a huge fan of configfile.cfg.hostname
19:31:00 abadger1999: it is and it is not - it's all of the same problem in my head
19:31:07 skvidal: Yeah -- I'd agree with you that using templates for doing different services from the same template is not something we really want to do.
19:31:23 We end up having rsyncd.conf.download[1-5] and having to copy the same file back and forth - it'd be nice to have at the servertype level instead of at the node level
19:31:26 abadger1999: the issue is - we seem eager to conflate things to being 'one service' but it ends up costing us a lot of time when we have to make a big change
19:31:40 I like it for environment as it means changes to the staging and prod environment are in sync.... don't have to update in two places.
19:31:52 ricky: well in that case a service-instance would be like: rsyncd-download
19:32:26 but I would not recommend a global 'rsyncd' module b/c it just means we cram a bunch of disparate junk into the rsyncd module dir
19:32:32 Yes, I'd love to have that. And have most puppet things split by server type in a consistent way.
19:32:42 janfrode: you've said you use puppet in production
19:32:46 janfrode: how do y'all split up your modules?
19:33:17 abadger1999: how does it stay in sync? If you make a change to the template in master - you have to update it in staging, too - or at least merge it
19:33:26 skvidal: it's per app, not service..
19:33:45 janfrode: so you have one big dir of 'httpd' that has a billion machines in it?
19:34:07 skvidal: Yes, but if we had staging and master and one config file for the staging instance and one config file for the master instance, then we'd have four places to edit.
19:34:17 skvidal: not gotten there yet.. We don't have httpd configs in puppet yet.
19:34:25 abadger1999: I'm not sure I understand
19:34:34 skvidal: I think the idea is that you make a change to stg, test, fix, come up with a final change that works, and then that exact change can be applied to master...
19:34:51 nirik: that is my understanding too
19:34:55 nirik: which, again, fails b/c our staging env is only a handful of machines
19:35:00 * nirik wonders if bcfg is looking better (if only for a chance to reorg our existence:)
19:35:04 One thing I wouldn't mind is to have a small httpd module that sets up basic stuff like logging, and then have anything more than that be configured per-service
19:35:05 people, for example, planet, most/all of koji
19:35:17 skvidal: I'd love something like a template when included into a staging host it uses the staging $environment ; when included into a production host it uses the production $environment.
19:35:26 skvidal: but we do have things like an sudo module, and a separate "project" module that overrides the sudoers "content" setting.
19:35:32 skvidal: so we're really trying to find a way to work around the asymmetry of the environments
19:35:53 skvidal: although.... I guess that we'd need some way to prevent the new version of the file going to production before we've tested in staging... so we always need at least two steps... hmm...
19:36:08 StylusEater: and the persistent asymmetry - since I doubt seriously we're ever going to duplicate the whole thing
19:36:15 skvidal: yes
19:36:31 nirik: so maybe that's a path to take
19:36:57 so can we identify base configurations that will always stay the same...maybe pack those into modules, then per node make tweaks? I imagine that's what skvidal is saying but I don't want to put words in his mouth.
19:37:08 Are we actually ever hoping to have a real staging environment, or will staging always only be useful for testing web app changes?
19:37:21 skvidal: well, or I guess targeted pruning/cleanups... ie, pick one thing and fix it up, next...
19:37:21 If we decide that, then maybe we can better design how staging vs. prod will actually work
19:37:25 ricky: I would not wager on ever having a real/complete staging env
19:37:53 nirik: so let's talk, briefly, about the places where staging is used
19:37:56 1. web app dev
19:37:58 2. nagios
19:38:08 those are the 2 I know of
19:38:10 hardly nagios...
19:38:22 CodeBlock: I was just going to mention that
19:38:36 yeah, mostly/all web app stuff...
19:38:37 that was basicaly a testing ground for nag3...
19:38:38 there is no point in staging nagios - it doesn't conflict
19:38:41 skvidal: I'd say the staging env works fairly well for testing some things -- the incompleteness doesn't mean we can't test read-only operation of bodhi, for instance, but we can't test that bodhi-koji-mash-createrepo all works.
19:38:47 *basically
19:39:02 But that mostly falls under web, right?
19:39:12 it's also used for pkgs/git/scm testing
19:39:24 and possibly db stuff?
19:39:25 abadger1999: right - so we don't know that when we update rpm or somesuch that we'll break something in the mash stack
19:39:27 similarly, we can test pkgdb lets people change package ownership, but not whether pkgdb updates bugzilla bugs.
19:39:41 abadger1999: so help me out here
19:39:42 skvidal: right. But that doesn't mean we have to give up on testing bodhi in staging.
19:39:46 no
19:39:49 I'm not saying we do
19:39:52
19:39:59 I don't think anybody here wants to give up any staging functionality we have now
19:40:08 right now if you want to stage something out
19:40:09 you do
19:40:10 Just want to find out which functionality we want/need from it
19:40:12 git checkout staging
19:40:13 modify
19:40:15 commit
19:40:16 push
19:40:21 blah blah
19:40:25 test changes
19:40:35 git merge your changes
19:40:40 see how much they conflict
19:40:41 manually fix
19:40:45 git push
19:40:53 all I'm really suggesting is this
19:41:00 instead of doing git checkout staging
19:41:10 for your modules you're acting on
19:41:19 just make -staging copies of them
19:41:20 modify
19:41:29 and merge them back to not-staging when you're ready
19:41:41 skvidal: How do you merge them back when you're ready?
19:41:44 abadger1999: cp?
19:42:01 Well, that doesn't work without the env var.
19:42:03 abadger1999: git diff/patch
19:42:06 abadger1999: huh?
19:42:09 Conflicts can still happen unfortunately :-/
19:42:11 without the env var?
19:42:13 ricky: yes
19:42:21 ricky: but you don't have to dance between git branches
19:42:26 ricky: and occasionally have to start over :)
19:42:28 skvidal: So if we can get the s/env var/puppet environment var/ working, that would be fine.
19:42:46 abadger1999: I'm really not understanding the issue
19:42:51 skvidal: If you had pkgdb.cfg configured for staging... it has to take into account the incompleteness of staging.
19:42:52 Hm.
I happen to hate git less than most people here, so maybe it's just me
19:43:07 But I prefer conflicts git knows about to conflicts git doesn't know about
19:43:10 for instance, we don't want to send email in staging for some things, but we would in production.
19:43:11 ricky: I'm not alone in having to 'rm -f puppet' and git clone it all again
19:43:25 But yeah, I can see the pain for people that don't like git :-)
19:43:38 abadger1999: so you just want to set an env var that cascades down?
19:43:51 skvidal: So the staging config isn't going to be 100% like the production config... unless we can conditionalize those sections.
19:43:59 We could keep the environments variable and just get rid of the branch.
19:43:59 skvidal: I think that might work.
19:44:05 skvidal: he wants to set that in stg when testing, but remove it before committing/pushing to master/prod
19:44:07 skvidal: Then we could cp the files
19:44:15 Actually, we could even do this:
19:44:28 For all staging changes, add an if $environment == "staging" { stuff }
19:44:34 Then manually merge that in after you're done
19:44:58 ricky: We could do that -- I like cp the file better though --
19:44:58 Then at least conflicts in the puppet manifests don't happen because there's only one copy of the file
19:45:07 Unfortunately conflicts still happen in config files
19:45:19 that keeps the conditionalized sections for things that are permanent differences between the environments.
19:46:35 okay - so it sounds like there is a fair amount of difference of how we do these things.
19:46:36 so pkgs01.stg.whatever would be a hard link to pkgs01.whatever and when making stg changes, add them to the $env == "staging"
19:46:44 I'm becoming crazy about the copy/rename classes/files because it increases the startup time for tiny changes that I typically do
19:46:49 **becoming less crazy
19:46:55 skvidal: So question about staging in branches -- do you think more/most of the issues are the sync from master=>stg not happening (rather than stg => master) ?
19:47:09 abadger1999: for me it has been both ways
19:47:13 k
19:47:16 * nirik thinks we need to ponder on this morning perhaps.
19:47:19 abadger1999: and also it has been a fair bit of "where the hell is this"
19:47:20 more.
19:47:23 sheesh.
19:47:27 skvidal: +1 to that last point.
19:47:34 I think the problem is that stg => master doesn't happen enough, or doesn't happen cleanly (ie it has to be via a cherrypick)
19:47:35 That drives me crazy too.
19:47:41 abadger1999: b/c I hate having that several minutes of "where is this in puppet" which I invariably have to do
19:47:45 Which then cascades into making master => stg all broken
19:48:13 * nirik hugs git-grep
19:48:20 git diff --stat staging master
19:48:21 665 files changed, 3846 insertions(+), 22484 deletions(-)
19:48:31 this is a situation where I would love to have a separate git repo for each of the major sections/modules
19:48:33 not a branch
19:48:36 a separate TREE
19:48:45 so when someone on the web team modifies a file
19:48:48 I don't have to care
19:48:50 Yes, that's what I mentioned above with the "doesn't seem too likely to happen" :-/
19:49:09 It would mean that I can do merges on just the stuff that I care about
19:49:23 can puppet deal with multiple repos like that?
19:49:23 So somebody else's weird conflicts become not my problem :-)
19:49:32 Yeah, puppet doesn't care about the repos at all
19:49:36 skvidal: Maybe we should break smolt off into a separate tree when we do that split and experiment?
19:49:39 ie, if we split out say smolt into its own thing
19:49:39 It's just our git pull script would have to change
19:49:45 abadger1999: jinx.
19:49:50 :-)
19:49:50 It's quite easy to do incrementally
19:49:55 nirik: all we do on puppet push
19:50:00 nirik: is copy the files over to a path
19:50:06 nirik: it's not rocket science-y :)
19:50:25 We could do that, or smolt could be our first bcfg module ;D
19:50:43 anyhow, we have spent a while on this, shall we ponder more on it? or do we think we are going to come to a conclusion here?
19:50:53 CodeBlock: :)
19:51:05 nirik: actually
19:51:11 with what codeblock said, in mind
19:51:16 I'd be very happy with splitting modules into separate repos, and I think it'd alleviate a ton of the merging pain
19:51:19 I was curious if you'd be willing to do something on the puppet1 upgrade
19:51:43 * nirik glanced at bcfg the other day.
19:52:01 nirik: how about naming the machine adminsrv or something like that
19:52:08 instead of the application-specific name of puppet
19:52:14 we can have as many cnames as we like :)
19:52:25 sure, that seems like a good idea...
19:52:28 but in the event we do decide to move to something else.....
19:52:51 Wouldn't we do a rebuild/rename when we move to something else anyway?
19:53:01 ricky: not necessarily
19:53:08 ricky: just standing up an app
19:53:09 Not hugely against renaming now, but don't see how the trouble of doing it can't be bundled with the trouble of switching
19:53:15 ricky: is all that's actually required
19:53:26 and it keeps us from making confusing statements when testing like
19:53:26 OK, I like rebuilding stuff, but fine then
19:53:44 "go edit bcfg2 in puppet"
19:53:47 "waha?"
19:53:58 etc
19:54:03 I think we don't want a 'lets cut all over to this tuesday'... instead we might keep using puppet for a while and have say bcfg2 in a mode where it makes no changes, but reports, etc.
19:54:12 nirik: nod
19:54:17 so, the same machine could well be running 2 of them at once.
19:54:32 nod
19:54:42 yeah
19:54:43 since in both cases they are both just a webserver/service
19:54:46 and a dir full of files
19:55:16 so, summary here: we don't like the branch and merge pain, we want to figure out a plan to remove that pain, further discussion on list or in -admin?
19:55:34 sure
19:55:45 #topic Quick Release retrospective
19:55:54 * CodeBlock might play around with setting up bcfg on a pt or three
19:55:58 * CodeBlock shuts up now
19:56:00 Anyone have items from the release? what went well? what blew up?
19:56:39 websites had a bunch of changes, but websites folks did great making them quickly... :)
19:56:52 our mirror tiering system didn't work too well, and we need to fix it before next release.
19:57:01 With the new spins creation process, the file name changes shouldn't bite websites as much next time. As in there shouldn't be any :-)
19:57:12 that would be good.
19:57:21 And we've now got a list of docs that we double check with docs in advance, which should mean less broken links next time
19:57:35 **fewer
19:58:34 #topic Open Floor
19:58:38 anything for open floor?
20:00:12 * nirik listens to the crickets
20:00:19 ok, thanks for coming everyone.
20:00:23 #endmeeting
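
Editor's sketch (not part of the meeting): the idea floated during the staging discussion, one shared template whose output differs only by an environment variable, can be illustrated roughly as below. This is a plain-Python stand-in and not Puppet code. The template text, the `example.org` host names, the `send_email` setting, and the helper names `environment_for` and `render_config` are all hypothetical, chosen only to mirror the "hostname.stg.blah" naming and the "no email in staging" example from the log.

```python
from string import Template

# Hypothetical single template shared by staging and production, so a
# change to the template only has to be made in one place.
CONFIG_TEMPLATE = Template(
    "base_url = https://$host/pkgdb\n"
    "send_email = $send_email\n"
)

# Hypothetical per-environment values: only the permanent differences
# between the two environments live here.
ENVIRONMENTS = {
    "production": {"host": "admin.example.org", "send_email": "True"},
    "staging": {"host": "admin.stg.example.org", "send_email": "False"},
}


def environment_for(hostname: str) -> str:
    """Infer the environment from the node name (hostname.stg.domain)."""
    return "staging" if ".stg." in hostname else "production"


def render_config(hostname: str) -> str:
    """Render the shared template with the values for this host's environment."""
    return CONFIG_TEMPLATE.substitute(ENVIRONMENTS[environment_for(hostname)])


if __name__ == "__main__":
    for node in ("app01.example.org", "app01.stg.example.org"):
        print(f"# {node}")
        print(render_config(node))
```

The point of the sketch is only that the environment-specific values sit in one small table while the template stays shared; how that would map onto the actual Puppet manifests was left open in the meeting.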