19:00:18 <nirik> #startmeeting Infrastructure ansible meetup (2013-07-17)
19:00:18 <zodbot> Meeting started Wed Jul 17 19:00:18 2013 UTC.  The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:18 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:00:19 <nirik> #meetingname infrastructure-ansible-meetup
19:00:19 <nirik> #topic Intro
19:00:19 <zodbot> The meeting name has been set to 'infrastructure-ansible-meetup'
19:00:29 <nirik> hey everyone. who's around for some ansible talk? ;)
19:00:35 * oddshocks 
19:00:38 * Smoother1rOgZ 
19:00:40 * kubo 
19:00:41 <rhaen> here :)
19:00:43 * pingou 
19:01:01 * threebean is here
19:01:03 * nirik notes he's also in a fesco meeting and arm has this room in 1 hour. ;)
19:01:09 * tflink is here
19:01:29 * abadger1999 here
19:01:38 <nirik> so, what I want to do here is just go over our existing ansible setup and some basic ansible info and then toss out some questions we have going forward.
19:01:49 <nirik> then everyone can consider those and we can come up with a plan.
19:02:22 <nirik> so, a bit of background first:
19:02:42 <nirik> In fedora infrastructure in the past, we had/have puppet.
19:02:52 <nirik> we are migrating over to ansible.
19:02:53 * lmacken here
19:03:00 <nirik> http://infrastructure.fedoraproject.org/cgit/ansible.git/
19:03:04 <nirik> is our ansible repo.
19:03:24 <nirik> we also have an ansible-private repo that contains passwords and keys and such in it that is not public.
19:03:49 <nirik> in the existing puppet model, we have every single host pulling from the puppetmaster 2x/hour.
19:04:16 <nirik> in the new ansible world, things are largely push on changes.
19:05:17 <nirik> we have an ssh-agent on lockbox01 that contains the ansible key. Using sudo there, sysadmin-main folks can run playbooks and commands and use the ansible key out to hosts.
19:05:38 <nirik> we have a number of things moved to ansible so far:
19:06:14 <nirik> all builders/buildvms/buildvmhosts/releng, cloud stuff, all our arm machines, badges, mirrorlists and probably some more I am forgetting.
19:06:50 <nirik> in the past we have/had func for one off commands on hosts. The ansible command should have this pretty much replaced now.
19:07:35 <Smoother1rOgZ> so only -main can fire ansible?
19:07:35 <nirik> for playbooks we also have some nice logging in place.
19:07:49 <nirik> Smoother1rOgZ: currently that is the case... will talk more about that in a bit. ;)
19:07:57 * Smoother1rOgZ nods
19:08:09 <nirik> if you look in the ansible/scripts/ dir there's a number of handy scripts.
19:08:30 <nirik> one of these is 'logview' that lets you see anything you like about playbook runs
19:08:59 <nirik> this also logs ansible-cmd runs.
19:09:24 <nirik> It also sends out a daily log of changes made via ansible.
19:10:29 <nirik> we also have scripts in there to do virthost updates, etc.
19:11:01 <nirik> #topic handy command line things
19:11:13 <nirik> So, some things I personally have been using a lot with playbooks:
19:11:32 <nirik> --list-hosts helpfully tells you exactly what hosts will be affected by that playbook run.
19:11:48 <nirik> --check just tells you what would change, but doesn't make any changes.
19:11:53 <threebean> nice
19:12:12 <nirik> --limit / -l (limits what hosts are affected... so you can run a playbook but limit it to just one host in the group)
19:12:31 <nirik> --diff (great with --check) also shows you diffs of any copy/template changes.
19:12:56 <threebean> wow
19:13:02 <nirik> -f N (number of forks to run... I think it defaults to 5 or so, but you can do 1 if you want to run one at a time)
19:13:23 <nirik> --start-at-task=taskname will let you run a playbook, but skip to that task.
19:13:24 <oddshocks> this sounds awesome
19:13:35 <nirik> --step (step through tasks one at a time, confirming each)
19:13:57 * tflink needs to read more docs, didn't realize half of those existed but has wanted them when running playbooks
19:14:06 <smooge> tflink, me either
19:14:50 <nirik> yeah, they are all very handy. ;)
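The playbook flags listed above can be combined on one command line. A minimal Python sketch, assuming a hypothetical helper named build_playbook_cmd (it is not part of ansible or of the scripts in ansible/scripts/), that only assembles the argument list and does not run anything:

```python
from typing import List, Optional


def build_playbook_cmd(
    playbook: str,
    check: bool = False,
    diff: bool = False,
    limit: Optional[str] = None,
    forks: Optional[int] = None,
    list_hosts: bool = False,
    start_at_task: Optional[str] = None,
    step: bool = False,
) -> List[str]:
    """Assemble an ansible-playbook argument list from the flags discussed.

    Hypothetical helper for illustration; it only builds the list.
    """
    cmd = ["ansible-playbook", playbook]
    if list_hosts:
        cmd.append("--list-hosts")  # show which hosts would be affected
    if check:
        cmd.append("--check")  # report what would change, change nothing
    if diff:
        cmd.append("--diff")  # show diffs for copy/template changes
    if limit:
        cmd += ["--limit", limit]  # restrict the run to a host or subset
    if forks is not None:
        cmd += ["-f", str(forks)]  # number of parallel forks
    if start_at_task:
        cmd.append(f"--start-at-task={start_at_task}")
    if step:
        cmd.append("--step")  # confirm each task before it runs
    return cmd


if __name__ == "__main__":
    # Dry run with diffs against a single host of the mirrorlist group.
    print(" ".join(build_playbook_cmd(
        "playbooks/groups/mirrorlist.yml",
        check=True,
        diff=True,
        limit="mirrorlist-osuosl.fedoraproject.org",
    )))
```

Passing check=True and diff=True together gives the dry-run-with-diffs combination; the resulting list could be handed to subprocess.run on a machine where ansible-playbook is installed.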
19:15:01 <nirik> #topic Making a new instance
19:15:19 <nirik> so, right now when we want to spin up a new instance for foo, you can do much of it in ansible.
19:15:37 <nirik> 1. add dns and/or 2fa keys (not in ansible)
19:15:47 <nirik> 2. add the new machine to ansible inventory
19:16:01 <nirik> 3. add ./inventory/host_vars/FQDN host_vars for the new host.
19:16:56 <nirik> that will have in it ip addresses, dns resolv.conf, ks url/repo, volume group to make the host lv in, etc etc.
19:17:10 <nirik> 4. add any needed vars to inventory/group_vars/ for the group
19:17:33 <nirik> this has memory size, lvm size, cpus, etc
19:17:45 <nirik> 5. add tasks/virt_instance_create.yml task to top of group/host playbook
19:18:10 <nirik> 6. run the playbook and it will go to the virthost you set, create the lv, guest, install it, wait for it to come up, then continue configuring it.
19:18:18 <nirik> it's pretty slick.
19:18:26 <nirik> look at mirrorlists* for a good example.
19:18:32 <threebean> cool.. i need to do a portion of that soon for the badges production nodes
19:18:40 <threebean> although I think some of it you already did on the first pass through
19:19:32 <nirik> yeah.
19:19:42 <nirik> a similar set of steps applies for cloud images.
19:19:47 <nirik> s/images/instances/
19:20:22 <nirik> so, that's kinda what we have implemented so far... any questions on all that stuff so far?
19:20:57 <kubo> nice, but i don't understand how ansible picks up variables from inventory/host_vars/FQDN.. where do you call the ansible command from?
19:21:33 <nirik> kubo: so, you run 'ansible-playbook /path/to/playbooks/playbook.yml'
19:21:39 <nirik> in that playbook there should be:
19:22:05 <nirik> let's take a real example actually: playbooks/groups/mirrorlist.yml
19:22:14 <nirik> in there is a line: hosts: mirrorlist
19:22:28 <nirik> looking in inventory/inventory we see the mirrorlist group:
19:22:41 <nirik> [mirrorlist]
19:22:42 <nirik> mirrorlist-osuosl.fedoraproject.org
19:22:42 <nirik> mirrorlist-ibiblio.fedoraproject.org
19:22:42 <nirik> mirrorlist-phx2.phx2.fedoraproject.org
19:22:43 <abadger1999> when you say "you can do these things in ansible", do you mean add it to the ansible git repo and then ansible will configure those aspects?
19:22:47 <nirik> so it needs to run on those 3 hosts.
19:23:18 <nirik> it then looks for variables... it looks first for host variables for each host, then group variables, then global variables.
19:23:33 <nirik> http://www.ansibleworks.com/docs/playbooks2.html/#understanding-variable-precedence has the full list.
19:23:37 <Smoother1rOgZ> do we have a git hook to trigger ansible once changes are made, or do we need to fire ansible on our own?
19:23:45 <tflink> I thought global variables have the lowest precedence
19:23:50 <kubo> ok, got it. I have a little problem with ansible variables :)
19:24:08 <nirik> tflink: yeah, sorry, I was just listing them out there, that's not the correct order of precedence.
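The precedence point above (global variables lowest, host variables winning over group variables) can be sketched in Python as a layered dictionary merge. This is a simplified illustration only: real ansible has more variable sources than these three, the linked documentation has the full list, and the function and variable names here are hypothetical.

```python
from typing import Any, Dict


def resolve_vars(
    global_vars: Dict[str, Any],
    group_vars: Dict[str, Any],
    host_vars: Dict[str, Any],
) -> Dict[str, Any]:
    """Merge three variable layers; later layers override earlier ones.

    Simplified model: global < group < host. Not ansible's actual code.
    """
    merged: Dict[str, Any] = {}
    for layer in (global_vars, group_vars, host_vars):
        merged.update(layer)  # a key in a later layer replaces the earlier value
    return merged


if __name__ == "__main__":
    # Hypothetical values, loosely modelled on the kinds of settings
    # mentioned for group_vars (memory) and host_vars (volume group).
    print(resolve_vars(
        {"mem_size": 1024, "volgroup": "vg_default"},
        {"mem_size": 4096},
        {"volgroup": "vg_host"},
    ))
```

In this model a value set in inventory/host_vars/FQDN would replace the same key from inventory/group_vars/, which in turn would replace a global default.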
19:24:32 <nirik> abadger1999: you can add to the git repo, then run the 'sudo -i ansible-playbook ...' to make it happen. (currently)
19:24:40 <nirik> Smoother1rOgZ: we don't yet, going to get to that here. ;)
19:25:29 <rhaen> Do we have to extend the playbooks? I saw that we are using existing ones only.
19:25:49 <nirik> not sure I understand the question...
19:26:03 <nirik> we have been adding playbooks as we go and add new hosts or migrate things from puppet.
19:26:52 <rhaen> hm, ok - I mean the logic components for playing the playbooks are part of the ansible distribution
19:27:10 <rhaen> Do we need to extend them or is "anything in place" - aka batteries included?
19:27:14 <nirik> yeah, the modules and such we use are pretty much upstream in ansible.
19:27:33 <nirik> I've not run into much that needed any extension, but we can actually do that if we need to.
19:27:39 <rhaen> k, nice
19:27:50 <nirik> ie, if we needed a local module for something we could just add it to our git repo until it's upstreamed.
19:28:14 <nirik> ansible is available in fedora/epel, so everyone feel free to play with it on your home machines. it's easy to do so.
19:28:27 <rhaen> nirik: will do :)
19:28:39 <nirik> #info short term needs/wants
19:28:48 <nirik> #topic short term needs/wants
19:28:57 * mdehaan is done w/ other call
19:29:08 <nirik> so, there's a bunch of short term things we want to try and do (or me at least):
19:29:19 * Smoother1rOgZ uses it at $dayjob but with different architecture ;)
19:29:27 <nirik> Disable/remove func.
19:29:27 <nirik> way for non sysadmin-main to run playbooks/commands.
19:29:27 <nirik> way to trigger runs from commits.
19:29:27 <nirik> better use of roles?
19:29:27 <nirik> concrete way to handle stg vs prod
19:29:28 <nirik> concrete way to handle hotfixes/patches
19:29:29 <nirik> setup backup02 agent
19:29:53 <nirik> all of these are things skvidal and I talked about and he was working on. ;(
19:30:00 <nirik> we can probably nuke func anytime
19:30:11 <handsome_pirate> Ooo
19:30:25 <nirik> for the others we need some more discussion.
19:30:27 * threebean nods
19:30:27 <threebean> for the non-sysadmin-main permissions.. did you two have anything in mind?
19:30:31 <mdehaan> poor Func
19:30:37 * Smoother1rOgZ recalls having some chat with seth & nirik regarding the prod/stg layout
19:30:52 <nirik> it served us well, but ansible now does everything we needed it for. ;)
19:31:01 <threebean> it's tricky.. because I think it means you need to have read access to the private repo (unless we delegated responsibility to a daemon or something).
19:31:02 <mdehaan> that was the plan!
19:31:38 <nirik> so, on non sysadmin-main playbook runs: I have a wrapper script skvidal wrote that we were thinking of trying out. It would look at the playbook and the user and see if they were allowed to run that.
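The wrapper idea described above (look at the playbook and the user, then decide whether the run is allowed) could be sketched in Python roughly as follows. This is not skvidal's script: the mapping, the function name may_run and the group name sysadmin-web are hypothetical, chosen only to illustrate the check.

```python
from typing import Dict, Set

# Hypothetical mapping of playbook path -> groups allowed to run it.
PLAYBOOK_ACL: Dict[str, Set[str]] = {
    "playbooks/groups/mirrorlist.yml": {"sysadmin-web"},
}


def may_run(playbook: str, user_groups: Set[str], acl: Dict[str, Set[str]]) -> bool:
    """Return True if a user in user_groups may run the given playbook.

    Assumption: sysadmin-main may run everything, as is the case today;
    anyone else needs a group listed for that specific playbook.
    """
    if "sysadmin-main" in user_groups:
        return True
    allowed = acl.get(playbook, set())  # unknown playbooks allow nobody
    return bool(allowed & user_groups)


if __name__ == "__main__":
    print(may_run("playbooks/groups/mirrorlist.yml", {"sysadmin-web"}, PLAYBOOK_ACL))
```

A real wrapper would also need to deal with the private repo access question raised above; this sketch covers only the allow/deny decision.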
19:32:16 <nirik> we were also talking about a trigger in git that could fire off runs.
19:32:27 <Smoother1rOgZ> :)
19:32:33 <nirik> but that was going to be pretty complex.
19:32:44 <nirik> so, I am open to ideas on how best to handle things.
19:32:57 <Smoother1rOgZ> depends on what that trigger would stand for
19:33:07 <pingou> git hook Smoother1rOgZ I think
19:33:13 <nirik> I'd really like to get it so people who have sudo perms on some group of hosts can run playbooks on them... or anyone can trigger runs if they make some minor change later.
19:33:31 <threebean> yeah.. the git trigger loses a lot of the flexibility like limiting to certain hosts, stepping through playbooks, etc..
19:33:56 <nirik> right now with puppet there's several work flows:
19:33:57 <Smoother1rOgZ> I know - my concern is more about what type of trigger
19:33:58 <pingou> nirik: ansible is pull as much as push, right? could we invert the approach: if you have sudo on X and want to update X, go to X and run ansible on it?
19:34:08 <nirik> a) I am making this important change and I want it pushed/applied asap!
19:34:13 <pingou> or would this be too complex/cumbersome?
19:34:22 <nirik> b) I am making this minor/cosmetic thing and I want it changed sometime later.
19:34:37 <nirik> c) I am making this change, but want to give other people some time to look at my commit before it goes live
19:34:51 <pingou> b and c are the same no?
19:34:51 <nirik> pingou: won't work for our setup due to private repo...
19:34:59 <pingou> nirik: good point
19:35:00 <nirik> unless we make private available everywhere.
19:35:07 <pingou> no
19:35:17 <smooge> could we use two user setting with sudo?
19:35:19 <nirik> yeah, I guess b and c overlap.
19:35:28 <misc> but if someone does 'a' after 'c', then 'c' does not work
19:35:34 <rhaen> nirik: I don't think b and c are the same.
19:35:35 <mdehaan> nirik: perhaps overkill but something to think about, we've got at least one user integrating things with Jenkins + Gerrit
19:35:46 <rhaen> nirik: C has something like a mentoring workflow - which is great for beginners (like me)
19:35:50 <nirik> yeah, we talked about that a bit too...
19:36:17 <kubo> rhaen, +1 :)
19:36:18 <nirik> I'd be more interested in that if any of those CM things were packaged and sane. ;)
19:36:18 <rhaen> mdehaan: we are using Jenkins that way at work for puppet - which works well.
19:36:32 <rhaen> CM?
19:37:20 <nirik> sorry, CI (Continuous integration)
19:37:44 <rhaen> k. Well, Jenkins provides a yum repo which is ...well, working
19:37:59 <rhaen> however nowhere near Fedora packaging guidelines
19:38:05 <nirik> yeah, but... jenkins: not packaged the way we require and java.
19:38:14 <tflink> buildbot is a bit behind in the repos, but it's been working well for me thus far
19:38:18 <rhaen> nirik: yep. :(
19:38:33 <nirik> I've not looked at buildbot too much
19:39:30 <rhaen> nirik: what are the guys using for the buildvms?
19:39:59 <nirik> so, we don't need to decide this now, but good to think on. Anything else in the short term list people would like to discuss or have me expand on?
19:40:05 <tflink> nirik: I have playbooks for buildmaster/buildslave setup if you're ever interested
19:40:05 <nirik> rhaen: libvirt/kvm?
19:40:47 <rhaen> nirik: oh, sorry - thought of testing stuff and the test driver for it. nm.
19:40:47 <nirik> tflink: ok. I can take a look.
19:40:59 <handsome_pirate> tflink:  playbooks for buildbot/taskbot?
19:40:59 <Smoother1rOgZ> nirik: do you have all of this written down somewhere or on the wiki?
19:41:04 <rhaen> tflink: interested! :)
19:41:05 <nirik> currently we aren't using ansible's 'roles' much at all. we should look at that.
19:41:06 <tflink> handsome_pirate: yep
19:41:11 <handsome_pirate> tflink:  I'd like those, if possible
19:41:12 <nirik> Smoother1rOgZ: not yet, but I can.
19:41:13 <Smoother1rOgZ> so we can add thoughts/comments/etc
19:41:39 <tflink> handsome_pirate: I'm waiting to figure out where all this ansible-in-infra stuff is going before I make them public
19:41:39 <nirik> #topic Upcoming migrations
19:41:51 <nirik> so, things that are migrating or looking to soon:
19:41:56 <rhaen> nirik: should we run several more meetings on this or shall we include this in the misc meeting?
19:41:59 <mdehaan> nirik: roles make your pathing life a lot easier and playbooks a lot cleaner
19:42:04 <nirik> backup server (I'm almost done)
19:42:04 <mdehaan> recommended
19:42:24 <jamielinux> yeah, roles are super awesome
19:42:24 <handsome_pirate> tflink:  Roger
19:42:26 <nirik> tflink: you are looking at qa machines migrating right?
19:42:44 <nirik> mdehaan: yeah. I need to look and see what we could/should move. Suggestions welcome.
19:42:51 * pingou would be interested in migrating fedocal (good exercise)
19:42:55 <handsome_pirate> nirik:  We are
19:42:56 <tflink> nirik: yeah, in the process of. have the base done, waiting to figure out whether we're going to be maintaining our own repo or merging in with infra
19:43:08 <nirik> cool.
19:43:21 <tflink> virthost, autoqa and taskbot-demo playbooks are done and (as far as I know) working
19:43:23 <nirik> are there any other good short term targets?
19:43:57 <pingou> tagger shouldn't be too hard as well and I guess badges will go directly to ansible, right?
19:44:06 * pingou looks at threebean
19:44:21 <nirik> yeah, the only wrinkle is that we don't have proxies moved yet...
19:44:30 <nirik> so proxy config for applications still is in puppet.
19:44:33 <threebean> yup, tagger should be pretty simple.  badges already in ansible.
19:44:47 <rhaen> collectd setup?
19:45:05 <nirik> rhaen: already moved... well, the client part anyhow.
19:45:07 <rhaen> I've found something in the repo about it - should be fairly easy
19:45:18 <nirik> but log02 should be not too bad to migrate, yeah
19:45:19 <rhaen> k
19:45:30 <smooge> I was thinking of standing up a second proxy in phx2 with one of our extra ips and use it for setting up ansible against
19:45:43 <nirik> smooge: not a bad plan...
19:46:31 <smooge> that way it doesn't interfere with things and we can use it as a comparison
19:46:39 <nirik> another good way to do things is to split off some app from app* servers in puppet and migrate to new app specific instances in ansible.
19:47:04 <nirik> for the backup server I have been using --check and --diff a lot... see what changes are there and if they are ok or not.
19:47:24 <pingou> we could decouple stg from prod for the migration, no?
19:47:54 <nirik> pingou: we could... but that might leave us in trouble if we are trying to test something in stg and it doesn't fully test it the way it is in prod
19:48:00 <pingou> as in, move fedocal/tagger/... stg to ansible, check if that works, nuke puppet and deploy w/ ansible
19:48:38 <nirik> pingou: yeah, would work.
19:48:58 <pingou> the only problem is if we need to hotfix in the middle of the migration and want to test on stg first
19:49:01 <nirik> ideally we would create new instances over just switching them from puppet.
19:49:10 <nirik> but we could do that if needed.
19:49:31 <pingou> well stg can be easily rebuilt no?
19:49:36 <nirik> in theory. ;)
19:49:41 <pingou> fair enough :)
19:49:57 <nirik> #topic Longer term fun things
19:50:09 <nirik> Some longer term plans once we have moved over things:
19:50:25 <nirik> nagios integration - Just make it add/deal with nagios when we add new hosts.
19:50:41 <nirik> fireball mode - so we can run over everything quicker.
19:51:10 <mdehaan> nirik: technically I want to rip the 0mq out of fireball and make it a socket server, and chunk files in it
19:51:11 <nirik> script common tasks - like make new hosted project, or clean denyhosts entry or whatever. Just have an ansible script that does it all for you
19:51:16 <mdehaan> not sure if anyone is bored and would want to help
19:51:33 <nirik> mdehaan: yeah, saw that, but in the end it will still have the same effect right? just with fewer deps?
19:51:37 <mdehaan> yes
19:51:44 * nirik nods.
19:52:02 <nirik> #topic Questions / Open floor
19:52:19 <nirik> ok, any questions about our setup or plans or anything at all? :) toss em out. ;)
19:52:29 <tflink> I assume that the discussion about what to do with qa is for a later date?
19:52:45 <abadger1999> So if we want to create a new service on a new host -- what directories/files would we need to touch?
19:52:59 <nirik> it can be. I guess from our talks I was thinking it would be better in infra repo, but I'm open to whatever works best with you.
19:53:30 <tflink> the sooner we get it figured out, the better
19:53:46 <tflink> I need to start cleaning stuff up and writing docs
19:53:54 <nirik> abadger1999: dns, 2fakeys, inventory, inventory/host_vars/FQDN, inventory/group_vars/groupname, playbook, any new tasks for the playbook.
19:54:21 <nirik> short term I guess I will look at the ansible-playbook wrapper script to allow people to run playbooks.
19:54:33 <nirik> if that works we can use it until we have something better
19:54:53 <nirik> if folks want to help me test that and/or review/fix it, I would be happy to have the help.
19:54:54 <Smoother1rOgZ> okay
19:55:21 <smooge> I will help on this
19:55:29 <abadger1999> do we want to migrate existing services off of app servers onto service-oriented boxes?
19:55:31 <nirik> tflink: if that's in place and working you could start running the playbooks yourself.
19:55:36 <threebean> abadger1999: I think so, yes.
19:55:37 <nirik> abadger1999: yes.
19:55:50 <abadger1999> Cool.  I think we could do that with the new elections code pretty soon.
19:55:55 <pingou> cool
19:55:55 <nirik> excellent.
19:56:03 <nirik> mirrorlists is a good example of this.
19:56:15 <nirik> we moved that off from the app* boxes to mirrorlist-* instances.
19:56:28 <tflink> nirik: sure, that's been the biggest problem so far
19:56:48 <nirik> yeah, I want to lower the barrier there.
19:56:59 <Smoother1rOgZ> nirik: sure thing for the help
19:57:09 * nirik notes the arm folks have the room in a few minutes... anything else?
19:57:15 <pingou> nirik: with apps* box, sudo will be handled by fas group right?
19:57:18 <nirik> we can continue over in #fedora-admin. ;)
19:57:42 <nirik> pingou: for split out app? or ?
19:57:54 <pingou> nirik: thinking of the sudo to run playbook question
19:58:10 <nirik> ah, yes, it would/will be.
19:58:23 <nirik> ok folks, let's move to #fedora-admin... thanks for coming everyone.
19:58:26 <nirik> #endmeeting