18:00:02 #startmeeting Infrastructure (2016-02-04)
18:00:02 Meeting started Thu Feb 4 18:00:02 2016 UTC. The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:02 Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:00:02 The meeting name has been set to 'infrastructure_(2016-02-04)'
18:00:02 #meetingname infrastructure
18:00:02 #topic aloha
18:00:02 #chair smooge relrod nirik abadger1999 lmacken dgilmore mdomsch threebean pingou puiterwijk pbrobinson
18:00:02 The meeting name has been set to 'infrastructure'
18:00:02 Current chairs: abadger1999 dgilmore lmacken mdomsch nirik pbrobinson pingou puiterwijk relrod smooge threebean
18:00:03 #topic New folks introductions / Apprentice feedback
18:00:49 morning folks. ;) any new people like to give a short one line introduction?
18:00:57 or apprentices with questions or comments or ideas?
18:01:19 morning..
18:01:49 hi
18:01:59 * doteast present
18:01:59 * sayan is here
18:02:13 * danielbruno is present
18:02:23 * aikidouke c'est ici
18:02:34 * smdeep is also here
18:04:18 ok.
18:04:27 let's go ahead and move to status / info dump...
18:04:33 hold on to your irc clients.
18:04:50 #topic announcements and information
18:04:51 #info More work on ansible network config in base role - kevin/doteast
18:04:51 #info Apprentice nag email went out for this month, please answer it! - kevin
18:04:51 #info consolidated our virt-install commands down to just a few - kevin
18:04:51 #info New installs should be able to dynamically grow cpu/mem and have watchdog - kevin
18:04:52 #info Setup our virthosts to use qemu-kvm-rhev now - kevin
18:04:53 #info Cleaned up a bunch of deprecated syntax in ansible, just a few more to go - kevin
18:05:27 yay clean syntax
18:05:42 indeed. a bit more to go, but a good start. ;)
18:05:58 * nirik replaced all the calls to 'sudo' with the new 'become' statement.
18:06:22 anyhow, anything else in there people would like to discuss or additional stuff to note?
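The 'sudo' to 'become' cleanup mentioned above can be sketched as a small text rewrite. This is a minimal sketch, not the script that was actually used: the helper names `modernize_line` and `modernize_playbook` are hypothetical, and it only handles the two deprecated keys `sudo` and `sudo_user`, assuming they appear as plain YAML keys at the start of a line.

```python
import re

# Deprecated privilege-escalation keys and their 'become' replacements.
# Only these two keys are handled; this is an illustrative subset.
RENAMES = {
    "sudo": "become",
    "sudo_user": "become_user",
}

# Match a key at the start of a YAML line (after optional indentation and an
# optional "- " list marker), followed by a colon. 'sudo_user' is listed
# first so that it is tried before the shorter 'sudo'.
_KEY_RE = re.compile(
    r"^(?P<prefix>\s*(?:-\s+)?)(?P<key>sudo_user|sudo)(?P<colon>\s*:)"
)


def modernize_line(line: str) -> str:
    """Rewrite one playbook line, leaving unrelated lines untouched."""
    return _KEY_RE.sub(
        lambda m: m.group("prefix") + RENAMES[m.group("key")] + m.group("colon"),
        line,
    )


def modernize_playbook(text: str) -> str:
    """Apply the rename to every line of a playbook given as one string."""
    return "\n".join(modernize_line(line) for line in text.split("\n"))
```

Because the pattern is anchored to the key position, a task such as `command: sudo ls` is left alone; only the keyword itself is renamed. A real cleanup would still need a manual review of each diff.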
18:06:41 you sent out a warning of some sort this week or last about a bug in ansible?
18:06:45 * aikidouke trying to find
18:07:05 yeah, with handlers... I think it may be already fixed upstream. It's known about in any case.
18:07:36 ok
18:08:18 otherwise things are going pretty smoothly. We do have some more scripts to port over to the v2 api.
18:08:55 ok, let's move on to the discussion items...
18:09:15 #topic Infrastructure Year In Review Article
18:09:15 #link https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org/thread/7OVY7QBL5OZPYLK5V7EMWEZH2L5M6HOI/
18:09:30 we have a gobby doc up for this... if everyone could give it a look over.
18:10:05 when do you want to send it off to comm-ops?
18:10:31 (or when do you want it sent/posted)
18:10:37 we could send it later today or tomorrow I guess? threebean is traveling... but I think he got all his adds in
18:11:09 ok sounds good, i'll send it out tomorrow
18:11:21 aikidouke++
18:11:31 :)
18:11:48 sounds good. I think it looks fine as it is, but we can give it until tomorrow for any last minute edits.
18:12:31 anything else on YIR?
18:12:57 alright, moving along
18:13:00 Thanks for getting to this, guys! :)
18:13:28 thanks for bringing it up jflory7 :)
18:13:31 #topic Spring cleaning - kevin
18:14:02 so, we have been making a number of improvements in our ansible deployment story of late. Including (but not limited to):
18:14:02 * pingou late
18:14:57 * I have our vmhosts using qemu-kvm-rhev now. This should allow us to do live migrate without shared storage
18:15:16 * we have setup kvm watchdog by default so hosts that go unresponsive will reboot.
18:15:45 * I have set our install to set maxvcpus and maxmemory, this allows us to dynamically add cpus or memory to guests without having to reboot them
18:16:12 * we are almost ready to land changes that make ip/network config done by ansible as well instead of just on install and manually adjusted.
18:16:29 * I plan to make some kickstart changes to make hosts leaner, etc.
18:16:59 So, with all these changes (once they are in and tested) I think it's a good time to look at reinstalling everything.
18:17:16 this will also make sure that what we have in ansible is correct and working, etc.
18:17:59 Just wanted to give everyone a heads up and also solicit any other big changes we want to land in vm deployment before we start reinstalling.
18:18:15 and I'll probably look at trying to reinstall most everything before alpha freeze.
18:18:49 do we have any rhel6 hosts remaining to move to 7 ?
18:18:49 any questions, comments?
18:19:12 a few
18:19:30 would this be a good time to do that or no?
18:19:38 the collab03/hosted-lists01 will just go away once we finish our mailman3 migration
18:19:47 nirik: will we increase 01, 02 to 03, 04 or rebuild one by one and keep the numbers equal?
18:19:48 well, they are stalled for various reasons
18:20:07 ask: needs someone to move askbot packages to 7
18:20:18 fpaste: needs someone to find or write a paste replacement
18:20:53 * aikidouke nods
18:21:07 fas: needs moving to fas3
18:21:26 wiki: needs puiterwijk to finish migration
18:21:41 packages: needs porting to 7
18:21:48 that's about all of them I think.
18:22:06 well, there's a few colo virthosts, but we should re-do them as part of this rebuild.
18:22:42 pingou: not sure, but good question. I guess I'd say to leave the numbers and just redo them one at a time... but that would mean downtime for some things I guess.
18:22:44 k - should there be tickets opened for that list or is there a 6 - 7 wiki page for infra that can be updated?
18:23:03 or a sticky note on a monitor :)
18:23:25 nirik: we could bump the number for the instances that are single
18:23:41 there is a wiki page, I can't seem to find it right now, but will look.
18:23:51 * aikidouke i can poke around
18:23:55 pingou: still likely requires some downtime when we switch to it ?
18:24:26 nirik: likely, but minimal no? when we switch the ip/host no?
18:24:40 https://fedoraproject.org/wiki/Infrastructure/RHEL6_hosts
18:25:06 pingou: true, unless they are db hosts... those will still need downtime to migrate the db's
18:25:20 ah yes, definitely for those
18:26:03 I'll post to the list and we can come up with a plan.
18:26:15 It would be nice for it to be pretty automated/easy and not so much downtime
18:26:43 we do have a ton of hosts now... over 500.
18:26:55 wow, nice growth
18:26:56 wow
18:27:14 but reinstalls on a lot of them should be trivial. Destroy, run playbook, next
18:27:40 wow.
18:28:06 but that should get us to a nicer place on all of them before the f24 cycle really begins.
18:28:18 anyhow, will post to the list, just wanted to bring it up here.
18:28:32 lmacken: you around to teach us a bit about bodhi2?
18:28:38 nirik: sure sure
18:28:54 #topic Learn about bodhi2 - lmacken
18:29:00 take it away. :)
18:29:06 Hi folks! I'm going to quickly go over how bodhi works from an infrastructure perspective.
18:29:35 If you don't really know too much about bodhi or you want a higher level overview of where we've been and where we are now, I recommend checking out my slides from a couple of Flocks ago: https://lmacken.fedorapeople.org/bodhi-flock2014/
18:29:55 So, let's start with our Ansible configuration :)
18:30:07 Here is the bodhi2 ansible role: https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/bodhi2
18:30:29 We have both base and backend roles here. The base role is pretty boring. It ensures the proper deps are there, installs the configuration, and tweaks some SELinux booleans.
18:30:42 This sets up the standard configuration for our basic mod_wsgi web frontend nodes, which are currently bodhi03 and bodhi04.
18:30:51 The backend roles are for our two fedmsg "consumers".
18:31:02 bodhi-backend01 is where "the masher" lives, which is responsible for the entire updates push process.
18:31:18 It's basically just a fedmsg consumer that listens for a `masher.start` fedmsg from the `bodhi-push` tool that releng runs.
18:31:29 You can see all of the steps that it has to take in the docstring here: https://github.com/fedora-infra/bodhi/blob/develop/bodhi/consumers/masher.py#L73
18:31:53 The Masher is also currently responsible for updating the Atomic OSTrees, which uses the fedmsg-atomic-composer tool that I created. https://github.com/fedora-infra/fedmsg-atomic-composer
18:32:35 We're planning to pull the mashing and rpm-ostree work out of Bodhi at some point in the near future, and ideally have our build farm do most of that heavy lifting. Then we'll have bodhi (or another tool like "relengotron"/outhouse) simply handle orchestration.
18:32:50 Next is bodhi-backend02. This runs the "updates handler" which reacts to new updates as they are created. https://github.com/fedora-infra/bodhi/blob/develop/bodhi/consumers/updates.py#L14
18:33:01 This currently handles updating bugs in Bugzilla, as well as querying the wiki for test cases, which tends to take a long time.
18:33:12 Having these steps run outside of the new update creation process speeds things up dramatically.
18:33:38 Next up in ansible are the staging-sync and upgrade playbooks...
18:33:49 The staging-sync playbook is used to sync our production database into staging. https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/playbooks/manual/staging-sync/bodhi.yml
18:34:00 And the upgrade playbook, as you would expect, upgrades bodhi. https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/playbooks/manual/upgrade/bodhi.yml
18:34:14 Not only does it upgrade the RPMs, but it will also perform Alembic database schema migrations automatically :) (threebean++)
18:34:29 So that's it from the ansible side of things...
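The two backend consumers described above follow a common pattern: slow work is triggered by a message on a bus rather than done inline in the web request. The following is a rough sketch of that pattern in plain Python. It does not use the real fedmsg library or Bodhi's code; `TopicDispatcher`, `masher`, and `updates_handler` are hypothetical names, and apart from `masher.start` the topic strings are assumed placeholders.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]
Handler = Callable[[Message], None]


class TopicDispatcher:
    """Toy stand-in for a message bus hub: routes messages by topic suffix."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def subscribe(self, topic_suffix: str, handler: Handler) -> None:
        """Register a consumer for every topic ending in topic_suffix."""
        self._handlers.setdefault(topic_suffix, []).append(handler)

    def publish(self, message: Message) -> int:
        """Deliver a message; return how many handlers received it."""
        topic = str(message["topic"])
        delivered = 0
        for suffix, handlers in self._handlers.items():
            if topic.endswith(suffix):
                for handler in handlers:
                    handler(message)
                    delivered += 1
        return delivered


pushes: List[object] = []     # work picked up by the "masher" consumer
followups: List[object] = []  # work picked up by the "updates handler"


def masher(message: Message) -> None:
    # Stand-in for the push process that starts on a masher.start message.
    pushes.append(message["msg"])


def updates_handler(message: Message) -> None:
    # Stand-in for slow follow-up work (bug updates, test case lookups).
    followups.append(message["msg"])


hub = TopicDispatcher()
hub.subscribe("masher.start", masher)
hub.subscribe("update.request.testing", updates_handler)  # assumed topic name
```

Publishing a message whose topic ends in `masher.start` reaches only the masher stand-in, so the code that creates an update never waits on the push or on the follow-up work.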
18:34:38 As for CI, we have the bodhi test suite running on Jenkins, http://jenkins.fedorainfracloud.org/job/bodhi/
18:34:58 We also have a tool called Rube, which uses the Selenium browser-automation library to interact with the bodhi staging web interface: https://github.com/fedora-infra/rube
18:35:15 Our development workflow currently revolves around git-flow & github pull requests. https://github.com/fedora-infra/bodhi
18:35:30 However, I'd like to plan a mass-triaging day where a few of us get together and consolidate tickets from bugzilla, fedorahosted, and github and migrate everything over to Pagure :)
18:35:37 hurray!
18:35:49 So, that's pretty much all I had for today...
18:35:59 If you want to help out with the bodhi deployment...
18:36:09 We still need to get everything packaged and built in EPEL7, since we're still building various deps in copr. https://github.com/fedora-infra/bodhi/issues/142
18:36:10 there are efforts started to move from trac/github to pagure automagically
18:36:16 I'd also like to streamline the git->pagure PR->jenkins tests->staging deployment/upgrade->rube tests continuous integration+deployment workflow :D
18:36:20 pingou++
18:36:20 but we're hitting a problem on the github side
18:36:27 vivek++
18:36:27 pingou: Karma for vivek changed to 1 (for the f23 release cycle): https://badges.fedoraproject.org/tags/cookie/any
18:36:34 If you're interested in helping out with Bodhi development, I recommend standing up a local development instance, joining #fedora-apps, and checking out some of the open issues on GitHub.
18:36:49 Feel free to ping me anytime with questions :)
18:36:52 Thanks all!
18:36:52 EOF
18:36:59 excellent. Thanks lmacken
18:37:02 lmacken, thanks
18:37:17 thanks
18:37:39 lmacken++
18:37:39 jflory7: Karma for lmacken changed to 8 (for the f23 release cycle): https://badges.fedoraproject.org/tags/cookie/any
18:37:52 lmacken: note that there's an idempotency issue between the bodhi base and backend roles...
the base role copies in some config and then the backend copies over a file or something... should sort that out sometime
18:38:08 nirik: yeah, I forgot about that :\
18:38:26 it fights with file ownership or something
18:39:07 yeah, we can fix it, just will take some looking at it.
18:39:22 #topic Open Floor
18:39:40 anyone have items for open floor? is there something you might like to learn about next week? or teach about?
18:41:12 hi everyone, late but here :) I would like to ask someone to help me with ticket 3294
18:41:25 .ticket 3294
18:41:26 nirik: #3294 (Enable varnish caching for applications) – Fedora Infrastructure - https://fedorahosted.org/fedora-infrastructure/ticket/3294
18:42:11 yeah, puiterwijk is our varnish person... I'll try and nag him to really look at it. ;)
18:42:16 there is a patch, can you please check it and possibly push to staging?
18:42:54 thanks
18:43:44 yeah, and we could look at pushing to stg to test sometime sure.
18:43:56 Sorry for the delay there, I know it's a drag when something sits that long. ;(
18:44:37 I'll push to get it sorted... and do keep bothering us about it! :)
18:44:51 anyone have anything else?
18:45:06 nirik: thank you
18:45:23 ok, I'll give you all back 15min of your day. ;) Enjoy it!
18:45:26 #endmeeting