19:00:18 #startmeeting Infrastructure ansible meetup (2013-07-17)
19:00:18 Meeting started Wed Jul 17 19:00:18 2013 UTC. The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:18 Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:00:19 #meetingname infrastructure-ansible-meetup
19:00:19 #topic Intro
19:00:19 The meeting name has been set to 'infrastructure-ansible-meetup'
19:00:29 hey everyone. who's around for some ansible talk? ;)
19:00:35 * oddshocks
19:00:38 * Smoother1rOgZ
19:00:40 * kubo
19:00:41 here :)
19:00:43 * pingou
19:01:01 * threebean is here
19:01:03 * nirik notes he's also in a fesco meeting and arm has this room in 1 hour. ;)
19:01:09 * tflink is here
19:01:29 * abadger1999 here
19:01:38 so, what I want to do here is just go over our existing ansible setup and some basic ansible info and then toss out some questions we have going forward.
19:01:49 then everyone can consider those and we can come up with a plan.
19:02:22 so, a bit of background first:
19:02:42 In fedora infrastructure in the past, we had/have puppet.
19:02:52 we are migrating over to ansible.
19:02:53 * lmacken here
19:03:00 http://infrastructure.fedoraproject.org/cgit/ansible.git/
19:03:04 is our ansible repo.
19:03:24 we also have an ansible-private repo that contains passwords and keys and such in it that is not public.
19:03:49 in the existing puppet model, we have every single host pulling from the puppetmaster 2x/hour.
19:04:16 in the new ansible world, things are largely push on changes.
19:05:17 we have an ssh-agent on lockbox01 that contains the ansible key. Using sudo there, sysadmin-main folks can run playbooks and commands and use the ansible key out to hosts.
19:05:38 we have a number of things moved to ansible so far:
19:06:14 all builders/buildvms/buildvmhosts/releng, cloud stuff, all our arm machines, badges, mirrorlists and probably some more I am forgetting.
19:06:50 in the past we have/had func for one off commands on hosts. The ansible command should have this pretty much replaced now.
19:07:35 so only -main can fire ansible?
19:07:35 for playbooks we also have some nice logging in place.
19:07:49 Smoother1rOgZ: currently that is the case... will talk more about that in a bit. ;)
19:07:57 * Smoother1rOgZ nods
19:08:09 if you look in the ansible/scripts/ dir there's a number of handy scripts.
19:08:30 one of these is 'logview' that lets you see anything you like about playbook runs
19:08:59 this also logs ansible-cmd runs.
19:09:24 It also sends out a daily log of changes made via ansible.
19:10:29 we also have scripts in there to do virthost updates, etc.
19:11:01 #topic handy command line things
19:11:13 So, some things I personally have been using a lot with playbooks:
19:11:32 --list-hosts helpfully tells you exactly what hosts will be affected by that playbook run.
19:11:48 --check just tells you what would change, but doesn't make any changes.
19:11:53 nice
19:12:12 --limit -l (limits what hosts are affected... so you can run a playbook but limit it to just one host in the group)
19:12:31 --diff (great with --check) also shows you diffs of any copy/templates changes.
19:12:56 wow
19:13:02 -f N (number of forks to run... I think it defaults to 5 or so, but you can do 1 if you want to run one at a time)
19:13:23 --start-at-task=taskname will let you run a playbook, but skip to that task.
19:13:24 this sounds awesome
19:13:35 --step (step thru tasks one at a time)
19:13:57 * tflink needs to read more docs, didn't realize half of those existed but has wanted them when running playbooks
19:14:06 tflink, me either
19:14:50 yeah, they are all very handy. ;)
19:15:01 #topic Making a new instance
19:15:19 so, right now when we want to spin up a new instance for foo, you can do much of it in ansible.
19:15:37 1. add dns and/or 2fa keys (not in ansible)
19:15:47 2. add the new machine to ansible inventory
19:16:01 3. add ./inventory/host_vars/FQDN host_vars for the new host.
19:16:56 that will have in it ip addresses, dns resolv.conf, ks url/repo, volume group to make the host lv in, etc etc.
19:17:10 4. add any needed vars to inventory/group_vars/ for the group
19:17:33 this has memory size, lvm size, cpus, etc
19:17:45 5. add tasks/virt_instance_create.yml task to top of group/host playbook
19:18:10 6. run the playbook and it will go to the virthost you set, create the lv, guest, install it, wait for it to come up, then continue configuring it.
19:18:18 it's pretty slick.
19:18:26 look at mirrorlists* for a good example.
19:18:32 cool.. i need to do a portion of that soon for the badges production nodes
19:18:40 although I think some of it you already did on the first pass through
19:19:32 yeah.
19:19:42 a similar set of steps applies for cloud images.
19:19:47 s/images/instances/
19:20:22 so, that's kinda what we have implemented so far... any questions on all that stuff so far?
19:20:57 nice, but i don't understand how ansible picks up variables from inventory/host_vars/FQDN.. from where do you call ansible command?
19:21:33 kubo: so, you run 'ansible-playbook /path/to/playbooks/playbook.yml'
19:21:39 in that playbook there should be:
19:21:57 let's take a real example actually:
19:22:05 playbooks/groups/mirrorlist.yml
19:22:14 in there is a line: hosts: mirrorlist
19:22:28 looking in inventory/inventory we see the mirrorlist group:
19:22:41 [mirrorlist]
19:22:42 mirrorlist-osuosl.fedoraproject.org
19:22:42 mirrorlist-ibiblio.fedoraproject.org
19:22:42 mirrorlist-phx2.phx2.fedoraproject.org
19:22:43 when you say "you can do these things in ansible", do you mean add it to the ansible git repo and then ansible will configure those aspects?
19:22:47 so it needs to run on those 3 hosts.
19:23:18 it then looks for variables... it looks first for host variables for each host, then group variables, then global variables.
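A rough sketch of how layered variables can be resolved for one host. This is not ansible's code: `resolve_vars` and every variable name and value below are made up for illustration, and the ordering used in the example (global lowest, then group, then host) is an assumption for the sketch only. The ansible docs are the authority on the real precedence rules.

```python
from typing import Any, Dict, Iterable


def resolve_vars(layers: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge variable layers, ordered from lowest to highest precedence.

    A key set in a later layer replaces the same key from an earlier one.
    """
    merged: Dict[str, Any] = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical values, not taken from the real inventory.
global_vars = {"ks_repo": "http://example.invalid/repo", "num_cpus": 1}
group_vars = {"num_cpus": 2, "mem_size": 4096}   # e.g. inventory/group_vars/<group>
host_vars = {"mem_size": 8192, "ip": "192.0.2.10"}  # e.g. inventory/host_vars/<FQDN>

# Assumed ordering for this sketch: global < group < host.
resolved = resolve_vars([global_vars, group_vars, host_vars])
print(resolved)
```

Taking the ordering as an argument keeps the sketch honest: whoever calls it decides which layer wins, so it can be reordered to match whatever the docs say.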
19:23:33 http://www.ansibleworks.com/docs/playbooks2.html/#understanding-variable-precedence has the full list.
19:23:37 do we have a git hook to trigger ansible once changes are made or do we need to fire ansible on our own?
19:23:45 I thought global variables have the lowest precedence
19:23:50 ok, got it. I have a little problem with ansible variables :)
19:24:08 tflink: yeah, sorry, I was just listing them out there, that's not the correct order of precedence.
19:24:32 abadger1999: you can add to the git repo, then run the 'sudo -i ansible-playbook ...' to make it happen. (currently)
19:24:40 Smoother1rOgZ: we don't yet, going to get to that here. ;)
19:25:29 Do we have to extend the playbooks? I saw that we are using existing ones only.
19:25:49 not sure I understand the question...
19:26:03 we have been adding playbooks as we go and add new hosts or migrate things from puppet.
19:26:52 hm, ok - I mean the logic components for playing the playbooks are part of the ansible distribution
19:27:10 Do we need to extend them or is "anything in place" - aka batteries included?
19:27:14 yeah, the modules and such we use are pretty much upstream in ansible.
19:27:33 I've not run into much that needed any extension, but we can actually do that if we need to.
19:27:39 k, nice
19:27:50 ie, if we needed a local module for something we could just add it to our git repo until it's upstreamed.
19:28:14 ansible is available in fedora/epel, so everyone feel free to play with it on your home machines. it's easy to do so
19:28:27 nirik: will do :)
19:28:39 #info short term needs/wants
19:28:48 #topic short term needs/wants
19:28:57 * mdehaan is done w/ other call
19:29:08 so, there's a bunch of short term things we want to try and do (or me at least):
19:29:19 * Smoother1rOgZ uses it at $dayjob but with different architecture ;)
19:29:27 Disable/remove func.
19:29:27 way for non sysadmin-main to run playbooks/commands.
19:29:27 way to trigger runs from commits.
19:29:27 better use of roles?
19:29:27 concrete way to handle stg vs prod
19:29:28 concrete way to handle hotfixes/patches
19:29:29 setup backup02 agent
19:29:53 all of these are things skvidal and I talked about and he was working on. ;(
19:30:00 we can probably nuke func anytime
19:30:11 Ooo
19:30:25 for the others we need some more discussion.
19:30:27 * threebean nods
19:30:27 for the non-sysadmin-main permissions.. did you two have anything in mind?
19:30:31 poor Func
19:30:37 * Smoother1rOgZ recalls having some chat with seth & nirik regarding the prod/stg layout
19:30:52 it served us well, but ansible does everything now we needed it for. ;)
19:31:01 it's tricky.. because I think it means you need to have read access to the private repo (unless we delegated responsibility to a daemon or something).
19:31:02 that was the plan!
19:31:38 so, on non sysadmin-main playbook runs: I have a wrapper script skvidal wrote that we were thinking of trying out. It would look at the playbook and the user and see if they were allowed to run that.
19:32:16 we were also talking about a trigger in git that could fire off runs.
19:32:27 :)
19:32:33 but that was going to be pretty complex.
19:32:44 so, I am open to ideas on how best to handle things.
19:32:57 depends on what that trigger would stand for
19:33:07 git hook Smoother1rOgZ I think
19:33:13 I'd really like to get it so people who have perms on some group of hosts to sudo can run playbooks on them... or anyone can trigger runs if they make some minor change later.
19:33:31 yeah.. the git trigger loses a lot of the flexibility like limiting to certain hosts, stepping through playbooks, etc..
19:33:56 right now with puppet there's several work flows:
19:33:57 I know - my concern is more about what type of trigger
19:33:58 nirik: ansible is pull as much as push right, could we invert the approach: if you have sudo on X and want to update X, go to X and run ansible on it?
19:34:08 a) I am making this important change and I want it pushed/applied asap!
19:34:13 or would this be too complex/cumbersome?
19:34:22 b) I am making this minor/cosmetic thing and I want it changed sometime later.
19:34:37 c) I am making this change, but want to give other people some time to look at my commit before it goes live
19:34:51 b and c are the same no?
19:34:51 pingou: won't work for our setup due to private repo...
19:34:59 nirik: good point
19:35:00 unless we make private available everywhere.
19:35:07 no
19:35:17 could we use two user setting with sudo?
19:35:19 yeah, I guess b and c overlap.
19:35:28 but if someone does 'a' after 'c', then 'c' does not work
19:35:34 nirik: I don't think b and c are the same.
19:35:35 nirik: perhaps overkill but something to think about, we've got at least one user integrating things with Jenkins + Gerrit
19:35:46 nirik: C has something like a mentoring workflow - which is great for beginners (like me)
19:35:50 yeah, we talked about that a bit too...
19:36:17 rhaen, +1 :)
19:36:18 I'd be more interested in that if any of those CM things were packaged and sane. ;)
19:36:18 mdehaan: we are using Jenkins that way at work for puppet - which works well.
19:36:32 CM?
19:37:20 sorry, CI (Continuous integration)
19:37:44 k. Well, Jenkins provides a yum repo which is ...well, working
19:37:59 however nowhere near Fedora packaging guidelines
19:38:05 yeah, but... jenkins: not packaged the way we require and java.
19:38:14 buildbot is a bit behind in the repos, but it's been working well for me thus far
19:38:18 nirik: yep. :(
19:38:33 I've not looked at buildbot too much
19:39:30 nirik: what are the guys using for the buildvms?
19:39:59 so, we don't need to decide this now, but good to think on. Anything else in the short term list people would like to discuss or have me expand on?
19:40:05 nirik: I have playbooks for buildmaster/buildslave setup if you're ever interested
19:40:05 rhaen: libvirt/kvm?
19:40:47 nirik: oh, sorry - thought of testing stuff and the test driver for it. nm.
19:40:47 tflink: ok. I can take a look.
19:40:59 tflink: playbooks for buildbot/taskbot?
19:40:59 nirik: do you have all of this written down somewhere or on the wiki?
19:41:04 tflink: interested! :)
19:41:05 currently we aren't using ansible's 'roles' much at all. we should look at that.
19:41:06 handsome_pirate: yep
19:41:11 tflink: I'd like those, if possible
19:41:12 Smoother1rOgZ: not yet, but I can.
19:41:13 so we can add thoughts/comments/etc
19:41:39 handsome_pirate: I'm waiting to figure out where all this ansible-in-infra stuff is going before I make them public
19:41:39 #topic Upcoming migrations
19:41:51 so, things that are migrating or looking to soon:
19:41:56 nirik: should we run several more meetings on this or shall we include this in the misc meeting?
19:41:59 nirik: roles make your pathing life a lot easier and playbooks a lot cleaner
19:42:04 backup server (I'm almost done)
19:42:04 recommended
19:42:24 yeah, roles are super awesome
19:42:24 tflink: Roger
19:42:26 tflink: you are looking at qa machines migrating right?
19:42:44 mdehaan: yeah. I need to look and see what we could/should move. Suggestions welcome.
19:42:51 * pingou would be interested in migrating fedocal (good exercise)
19:42:55 nirik: We are
19:42:56 nirik: yeah, in the process of. have the base done, waiting to figure out whether we're going to be maintaining our own repo or merging in with infra
19:43:08 cool.
19:43:21 virthost, autoqa and taskbot-demo playbooks are done and (as far as I know) working
19:43:23 are there any other good short term targets?
19:43:57 tagger shouldn't be too hard as well and I guess badges will go directly to ansible, right?
19:44:06 * pingou looks at threebean
19:44:21 yeah, the only wrinkle is that we don't have proxies moved yet...
19:44:30 so proxy config for applications still is in puppet.
19:44:33 yup, tagger should be pretty simple. badges already in ansible.
19:44:47 collectd setup?
19:45:05 rhaen: already moved... well, the client part anyhow.
19:45:07 I've found something in the repo about it - should be fairly easy
19:45:18 but log02 should be not too bad to migrate, yeah
19:45:19 k
19:45:30 I was thinking of standing up a second proxy in phx2 with one of our extra ips and use it for setting up ansible against
19:45:43 smooge: not a bad plan...
19:46:31 that way it doesn't interfere with things and we can use it as a compare
19:46:39 another good way to do things is to split off some app from app* servers in puppet and migrate to new app specific instances in ansible.
19:47:04 for the backup server I have been using --check and --diff a lot... see what changes are there and if they are ok or not.
19:47:24 we could decouple stg from prod for the migration no?
19:47:54 pingou: we could... but that might leave us in trouble if we are trying to test something in stg and it doesn't fully test it the way it is in prod
19:48:00 as in, move fedocal/tagger/... stg to ansible, check if that works, nuke puppet and deploy w/ ansible
19:48:38 pingou: yeah, would work.
19:48:58 the only problem is if we need to hotfix in the middle of the migration and want to test on stg first
19:49:01 ideally we would create new instances over just switching them from puppet.
19:49:10 but we could do that if needed.
19:49:31 well stg can be easily rebuilt no?
19:49:36 in theory. ;)
19:49:41 fair enough :)
19:49:57 #topic Longer term fun things
19:50:09 Some longer term plans once we have moved over things:
19:50:25 nagios integration - Just make it add/deal with nagios when we add new hosts.
19:50:41 fireball mode - so we can run over everything quicker.
19:51:10 nirik: technically I want to rip the 0mq out of fireball and make it a socket server, and chunk files in it
19:51:11 script common tasks - like make new hosted project, or clean denyhosts entry or whatever. Just have an ansible script that does it all for you
19:51:16 not sure if anyone is bored and would want to help
19:51:33 mdehaan: yeah, saw that, but in the end it will still have the same effect right? just with fewer deps?
19:51:37 yes
19:51:44 * nirik nods.
19:52:02 #topic Questions / Open floor
19:52:19 ok, any questions about our setup or plans or anything at all? :) toss em out. ;)
19:52:29 I assume that the discussion about what to do with qa is for a later date?
19:52:45 So if we want to create a new service on a new host -- what directories/files would we need to touch?
19:52:59 it can be. I guess from our talks I was thinking it would be better in infra repo, but I'm open to whatever works best with you.
19:53:30 the sooner we get it figured out, the better
19:53:46 I need to start cleaning stuff up and writing docs
19:53:54 abadger1999: dns, 2fakeys, inventory, inventory/host_vars/FQDN, inventory/group_vars/groupname, playbook, any new tasks for the playbook.
19:54:21 short term I guess I will look at the ansible-playbook wrapper script to allow people to run playbooks.
19:54:33 if that works we can use it until we have something better
19:54:53 if folks want to help me test that and/or review/fix it, I would be happy to have the help.
19:54:54 oky
19:55:21 I will help on this
19:55:29 do we want to migrate existing services off of app servers onto service-oriented boxes?
19:55:31 tflink: if that's in place and working you could start running the playbooks yourself.
19:55:36 abadger1999: I think so, yes.
19:55:37 abadger1999: yes.
19:55:50 Cool. I think we could do that with the new elections code pretty soon.
19:55:55 cool
19:55:55 excellent.
19:56:03 mirrorlists is a good example of this.
19:56:15 we moved that off from the app* boxes to mirrorlists-* instances.
19:56:28 nirik: sure, that's been the biggest problem so far
19:56:48 yeah, I want to lower the barrier there.
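A rough sketch of the wrapper idea from the short term list: look at the playbook and the user, and decide whether that user may run it. This is not skvidal's script. The `PLAYBOOK_ACL` table, the `may_run` helper and every group name other than sysadmin-main are hypothetical, and falling back to sysadmin-main for unlisted playbooks is an assumption meant to mirror how things work today.

```python
from typing import Dict, Optional, Set

# Hypothetical table: playbook path -> groups allowed to run it.
# Only sysadmin-main and the mirrorlist playbook path come from the
# discussion; everything else here is invented for illustration.
PLAYBOOK_ACL: Dict[str, Set[str]] = {
    "playbooks/groups/mirrorlist.yml": {"sysadmin-main", "sysadmin-web"},
    "playbooks/groups/badges.yml": {"sysadmin-main", "sysadmin-badges"},
}

# Assumed fallback: playbooks not in the table stay sysadmin-main only.
DEFAULT_ALLOWED: Set[str] = {"sysadmin-main"}


def may_run(playbook: str, user_groups: Set[str],
            acl: Optional[Dict[str, Set[str]]] = None) -> bool:
    """Return True if any of the user's groups is allowed for this playbook."""
    table = PLAYBOOK_ACL if acl is None else acl
    allowed = table.get(playbook, DEFAULT_ALLOWED)
    return bool(allowed & user_groups)


print(may_run("playbooks/groups/mirrorlist.yml", {"sysadmin-web"}))
print(may_run("playbooks/groups/badges.yml", {"sysadmin-web"}))
```

A real wrapper would still need to get the user's groups from somewhere (FAS, presumably) and then hand off to ansible-playbook under sudo. That part is left out since none of it was settled in the meeting.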
19:56:59 nirik: sure thing for the help
19:57:09 * nirik notes the arm folks have the room in a few minutes... anything else?
19:57:15 nirik: with apps* box, sudo will be handled by fas group right?
19:57:18 we can continue over in #fedora-admin. ;)
19:57:42 pingou: for split out app? or ?
19:57:54 nirik: thinking of the sudo to run playbook question
19:58:10 ah, yes, it would/will be.
19:58:23 ok folks, let's move to #fedora-admin... thanks for coming everyone.
19:58:26 #endmeeting