18:00:02 #startmeeting Infrastructure (2015-07-16)
18:00:02 Meeting started Thu Jul 16 18:00:02 2015 UTC. The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:02 Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:00:02 #meetingname infrastructure
18:00:02 The meeting name has been set to 'infrastructure'
18:00:02 #topic aloha
18:00:02 #chair smooge relrod nirik abadger1999 lmacken dgilmore mdomsch threebean pingou puiterwijk pbrobinson
18:00:02 Current chairs: abadger1999 dgilmore lmacken mdomsch nirik pbrobinson pingou puiterwijk relrod smooge threebean
18:00:03 #topic New folks introductions / Apprentice feedback
18:00:08 morning all.
18:00:10 here
18:00:15 morning
18:00:16 * puiterwijk is here
18:00:25 any apprentices with questions or new folks who would like to introduce themselves?
18:00:47 Well I can introduce myself
18:00:54 please do. :)
18:01:16 * tflink is here
18:01:30 My name is Joshua and I am a developer, the past year or so I have been doing PHP development for ecommerce sites and maintaining a web and DB server
18:01:45 I am 19 years old and am just looking to increase my exposure :) and contribute
18:02:10 Been programming in many different paradigms since I was young, it's just fun :P
18:02:17 * pingou here
18:02:18 nerdsville: welcome!
18:02:27 nirik: thanks! :)
18:02:32 hey nerdsville :)
18:02:34 I take it you are interested more in application development side of things?
18:02:39 pingou: heyyy
18:02:46 * rahulrrixe_ is here
18:02:48 I am interested in both infra and dev
18:02:49 :)
18:02:57 nerdsville has already made some contributions to election
18:03:04 hey all
18:03:04 excellent!
18:03:08 (and I heard might be working on more :-p)
18:03:09 morning dgilmore
18:03:22 Yup I finished part of it, just finishing up the other parts of the revoting
18:03:22 hey nerdsville.
18:03:23 :)
18:03:27 Should be finished soon
18:03:33 Hey Aaditya :)
18:03:37 nerdsville: welcome again and do ask questions as you run into things. :)
18:03:47 Most definitely
18:03:47 afternoon nirik
18:03:58 any other questions or new folks? if not, moving on to GSoC checkins...
18:04:06 just a quick one
18:04:12 pcreech|work: fire away
18:04:16 when messing with ansible scripts, what do you test on?
18:04:43 we don't have any good way to completely test them... you can use --check and --diff
18:04:47 and --syntax-check
18:05:05 k. I'll hopefully be starting on the darkserver one soon... been learning ansible
18:05:07 typically stuff is committed for staging and run there, then production is added.
18:05:31 cool. :) Do ask if you run into any questions... we will try and answer
18:05:37 If I may chime in
18:05:40 Ansible is amazing
18:05:41 :P
18:05:54 nirik: will do!
18:06:19 we are getting really close to finishing our puppet -> ansible migration. :) Of course the last handful of hosts are always the hard ones.
18:06:26 #topic GSoC student update - kushal
18:06:40 kushal: you around? any GSoC folks around who would like to check in on progress?
18:06:46 prth: want to give a status update to everyone?
18:07:07 sure pingou, i've been working on cropping the wallpaper on server side
18:07:30 (for nuancier)
18:08:25 nice. anyone else have updates?
18:08:31 yup
18:08:40 Hi, I have been working on integrating the styles with my askbot instance set up in Openshift. I have also written a blog post about how to integrate them here in my blog: http://anuradhanotes.blogspot.com/2015/07/gsoc-weekly-update-how-to-override.html
18:09:05 This week I have worked on integrating the comments module in the review process. There are some bugs in it and I will update soon when they get fixed.
18:09:47 I have set up ssh server and gitish shell on vps (thanks for that btw).
It worked fine, I will be moving on to implementing api for glittergallery
18:10:11 shell will use api for authorization
18:10:59 sonalkr132: glad to hear it. ;)
18:11:16 rahulrrixe_: do you have a blog? with screenshots? :)
18:12:45 I wonder about that... would it be worth separating out the simple comments? we could reuse them for some other things like copr...
18:13:14 having a comment app? :)
18:13:27 pingou: I haven’t written updates about this week as my thesis presentation is on friend?
18:13:38 pingou: well, or a flask module many apps could use?
18:14:07 I know how to package python apps, if I can be of any help
18:14:12 pingou: After the git part I was working on review comments.
18:15:00 nirik: I wonder if we could do something like this
18:15:23 could be cool
18:15:32 pingou: BTW the last blog link is here https://medium.com/@rahulrrixe/becoming-git-pro-by-getting-into-under-the-hood-417054b3f4aa
18:15:42 just a thought, as copr was talking about what to do for user feedback and was looking at disqus. ;(
18:16:06 #idea make simple comments flask module for apps to use.
18:16:09 yeah, saw this discussion :(
18:16:11 I also have some interest in comments but haven't gotten to it
18:16:15 Sorry thesis presentation is on friday.
18:16:27 rahulrrixe_: ok, good luck with this :)
18:16:47 nirik: I am interested in helping with that
18:16:51 Cool. Any more GSoC updates?
18:17:06 nerdsville: cool.
:) we will need to sort out what we have currently and what we want first
18:17:19 no prob :)
18:17:41 ok, on to announcements/infodump:
18:17:43 #topic announcements and information
18:17:43 #info Various fixes to people01 after migration - kevin
18:17:44 #info Outage template updated - kevin
18:17:44 #info lots of cloud instances moved to new cloud - kevin
18:17:44 #info lots of no longer needed cloud instances killed in old cloud - kevin
18:17:45 #info osbs01.stg setup for releng - kevin
18:17:46 #info backups migrated for hosted04->03 and ongoing for collab04->03 - kevin
18:17:48 #info inode count increased for backups volume when it hit 100% - kevin
18:17:50 #info download01 iDrac fixed - patrick
18:17:52 #info migrating floating ips from old to new cloud (backporting fix) - patrick
18:17:54 #info Authentication infrastructure upgraded to Ipsilon - patrick
18:17:58 #info Koschei is now monitoring f24, a few minor problems fixed - mizdebsk
18:18:00 #info Planning to migrate qadevel from old cloud to infra-proper on Friday - tflink
18:18:02 #info UMDL re-write (aka umdl2) is getting there - pingou
18:18:04 #info pagure work on Fedora 22 - pingou
18:18:06 anything in there anyone would like to especially note or talk about?
18:18:13 (we have no discussion items listed in gobby)
18:18:16 #info PHX2 trip in 2 weeks for physical items
18:18:33 sorry I forgot to put that in gobby
18:18:33 ah yes, good reminder smooge
18:18:42 also, flock is coming up fast.
18:19:09 I am hoping we can retire our old cloud by the end of next week, but we will see
18:19:20 nirik: how far are we?
18:19:34 I killed a ton of instances yesterday. ;)
18:19:40 \ó/
18:19:49 I do want to point out one thing about the Ipsilon migration: If people report any (external) apps that don't work after the update, have them restart it, as it's likely a stale cache on their end
18:19:49 I have about 5-6 to make on new cloud today that we can migrate.
18:19:51 nirik: is there any way to hold off for at least a few days?
18:20:12 tflink: sure, there's no hard deadline. I just want to get it done. ;)
18:20:28 I'm planning to migrate one of our last VMs on friday and while I don't anticipate problems, I'd rather have something to fall back to if something goes horribly wrong
18:20:42 next friday? or tomorrow?
18:20:47 tomorrow
18:21:04 sure. I was hoping for next friday... so that would leave a week?
18:21:14 oh, i thought you meant tomorrow
18:21:15 there's 33 instances left in the old cloud right now.
18:21:28 that is more than are left in puppet correct?
18:22:18 puppet currently has 14: https://fedoraproject.org/wiki/Infrastructure/PuppetToAnsibleMigration
18:22:29 however, later today I am going to kill 3 more, taking us down to 11.
18:22:38 cool!
18:22:48 Destruction
18:23:04 2 more should also go soon (bapp02/app01), down to 9
18:24:01 I am pondering the idea of migrating lockbox01 at that point and just making sure we have good backups of the rest and stopping puppet, but perhaps that's a bad idea, still thinking about it.
18:24:37 kinda tempting
18:25:07 collab03/hosts-lists01 will go away in favor of mailman01/02 as they finish migrating to mailman3
18:25:17 oh rebuilding bodhost01....
18:25:20 releng04/relepel01 will go away in favor of bodhi2 stuff
18:25:40 This is all greek to me lol
18:25:44 hosted03 we need to really migrate, but might take a bit with packages.
18:26:00 nerdsville: sorry. :)
18:26:05 lol np
18:26:10 anyhow, let's move on to the learn section?
18:26:16 nirik: what's with hosted03?
18:26:25 ah trac
18:26:27 yeah :/
18:26:28 pingou: we need to branch all the trac stuff
18:26:38 and build it and make an ansible playbook for it.
18:26:44 not hard, just time consuming.
18:26:57 yeah, I have a lot of packages already branched and built
18:27:10 will check the rest and start the playbook soon
18:27:11 puiterwijk: oh? cool....
might not be as much as I thought then
18:27:33 #topic Learn about: backups with rdiff-backup - kevin
18:27:43 ok, I thought I would talk a bit about our backups today.
18:27:57 We currently have a backup machine (backup01) in our main datacenter.
18:28:19 it uses rdiff-backup to reach out to machines with data we care about on them and backs them up to a local netapp volume
18:28:55 it runs backups daily. All machines that are backed up have /etc and /home backed up and many have additional dirs like /srv or the like backed up.
18:29:37 We don't currently have off-site backups annoyingly, but we have plans to add that in the 4th Quarter (netapp sync to another datacenter)
18:30:04 rdiff-backup is sadly not very active upstream anymore, but it works pretty well overall.
18:30:16 is it like rsync?
18:30:37 nerdsville: it uses librsync yeah, but it can do incrementals
18:30:56 so it stores just the changes in each day's backup and you can restore as of anytime in the past you have backups for
18:31:09 ooh
18:31:19 * pingou needs to set it up for himself
18:31:24 me too lol
18:31:34 backup01 runs a cron job for doing the backups.
18:31:47 Do we have a continuity plan or something like that?
18:31:58 It uses ansible actually. It pulls our ansible repo to find out what needs to be backed up and then runs rdiff-backup commands over those hosts.
18:32:12 nice*
18:32:15 nice!*
18:32:28 http://infrastructure.fedoraproject.org/cgit/ansible.git/tree/playbooks/rdiff-backup.yml is the playbook.
18:32:48 jcvicelli: not sure. what do you mean by that? ;)
18:32:53 what is the git seed stuff
18:33:27 A plan for what to do if something bad happens
18:33:29 that is a checkout of all our pkgs git repos people can download to have them all.
18:33:35 ah
18:33:43 we don't want to back up that as we back up the git repos.
18:34:16 jcvicelli: well it would depend on what that bad thing was. ;)
18:34:41 if that datacenter is unreachable there's not much we can do...
but after we have offsite backups later this year we could restore from them and bring some things up.
18:34:42 Like, if an airplane crashes on the datacenter, what steps to follow
18:34:52 Lol or if y2k2 happened
18:35:08 jcvicelli, there isn't much we could do in that case.
18:35:23 Got it...
18:35:24 we do not have alternate backup centers
18:35:29 yeah, not much there. Actually we do have some offsite stuff already
18:35:52 all the dl.fedoraproject.org released content and all of koji is mirrored already
18:35:59 we have mirrors and clone of the ansible repo :)
18:36:00 it's just not the backups space (not enough room yet)
18:36:15 yes sorry .. I meant more like "alternate servers to restore to"
18:36:18 the DB would be the hardest
18:36:23 smooge: yeah, that too.
18:36:36 Why would the db be hard?
18:37:05 well, if we don't have a dump of it?
18:37:06 nerdsville: that's probably what would be the hardest to reconstruct if we were to lose the datacenter
18:37:12 ah lol
18:37:35 I would expect that would be a "Hey everyone gets a new account." day
18:37:35 Get a schema dump at least backed up
18:37:47 Yes, so that is the purpose of a bcp plan, so it should be easier to put services back
18:37:50 nerdsville: that's upstream :)
18:37:50 we have backups, they are just in the same datacenter
18:38:20 once we have offsite copy of our backups volume we should be in much better shape.
18:38:34 it is just a lot of space needed
18:38:34 unless of course _both_ datacenters go away.
18:38:35 Sure
18:38:42 lol amazon? :P
18:38:45 there's always a worse case. :)
18:38:49 could not afford amazon
18:38:56 I will donate
18:39:10 Let's get donations
18:39:11 :P
18:39:12 nerdsville: not enough :)
18:39:53 we may also do some more offsite at ibiblio once our new machine there comes on line
18:40:03 (just critical stuff)
18:40:25 any other backup questions?
18:40:49 What happens if all internet connectivity is lost!
:O
18:40:54 we currently have 14TB of backups (but that also includes gnome folks): /fedora_backups 26T 14T 12T 55% /fedora_backups
18:41:04 nerdsville: we might have to go out into the big blue room. ;)
18:41:13 lol
18:41:22 #topic Open Floor
18:41:27 anyone have anything for open floor?
18:41:50 less than a month to flock. Everyone work on their talks/slides. ;)
18:42:06 about the current fedora-23 repository status
18:42:16 I can give an update of the current situation
18:42:22 adrianr_: sure. please do.
18:42:35 umdl was not picking up the new release in development/23
18:42:43 (is still not)
18:42:51 that part of the MM2 rewrite has not been required since deployment
18:43:04 ok. ;(
18:43:25 I have added a few fixes to umdl and I am pretty positive this run could be the one to create the repositories correctly
18:43:36 let's see
18:43:44 oh nice. ;)
18:43:47 adrianr_: fixed repomap?
18:43:57 pingou: yes, also
18:44:41 the problem with fixes in repomap is that it is always hard to tell how much is influenced by a change that looks pretty simple
18:45:11 yeah.
18:45:25 and I have prepared a PR for the crawler which supports continent based crawling
18:45:38 we talked last week about it
18:45:50 * pingou looked at it
18:45:52 I would say the code is ready
18:45:56 ok.
18:46:15 this was with the idea that we might make a crawler in or nearer eu and scan eu stuff from there?
18:46:19 so, if there is the possibility to have a crawler in europe we can try it
18:46:39 yeah, I can look into that. :) If you like can you file an infra ticket on it so we don't forget?
18:46:49 nirik: will do
18:47:11 cool.
18:47:40 oh, does someone want to teach about something else next week?
18:47:48 I added a --start-at to umdl2
18:47:54 so you can run it on a part of a tree :)
18:48:11 what is umdl
18:48:11 excellent. also helpful.
18:48:12 pingou: that is really helpful
18:48:22 nerdsville: it's part of mirrormanager.
18:48:27 update master directory list
18:48:30 ah
18:48:41 it runs and looks for repos that we have and if they have changed (we pushed updates, etc)
18:48:57 then the crawler runs against mirrors to see if they are up to date with that info
18:49:10 ah thanks! :)
18:49:38 hey
18:49:40 did I miss the meeting?
18:49:46 hi
18:49:48 Cydrobolt: just about. we are in open floor. ;)
18:49:49 Cydrobolt: just finishing up :)
18:49:54 oh, cool!
18:50:48 Cydrobolt: you have anything to bring up? :)
18:51:01 nirik, nope!
18:51:08 I pushed an update to mote earlier this week
18:51:11 but nothing important
18:51:22 cool.
18:51:30 has anyone reported any issues with it recently? haven't been on IRC as often these past couple of weeks
18:51:56 not that I am aware of.
18:52:24 anyhow, if nothing else will close out the meeting in a minute...
18:52:29 I wish to bring something up
18:52:43 https://fedoraproject.org/wiki/Category:Infrastructure_SOPs
18:52:45 AadityaN1ir: ok, fire away
18:53:02 Many of the links here are broken.
18:53:12 yes, because we moved from .txt to .rst
18:53:22 if someone could fix those up that would be great. ;)
18:53:36 i could do that
18:53:39 cool
18:53:48 that would be lovely. ;) thank you.
18:53:53 can you tell me where it is hosted ?
18:54:16 see the note at the top
18:54:20 http://infrastructure.fedoraproject.org/infra/docs/
18:55:07 ok, I see it.
18:55:12 Thanks
18:55:27 thanks for pointing it out
18:55:32 thanks for coming everyone!
18:55:34 #endmeeting
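
Note on the rdiff-backup "Learn about" topic above: the log describes backup01 pulling /etc and /home (and sometimes more) from each host into /fedora_backups, with the ability to restore as of a past date. A rough sketch of what such commands can look like follows. It is not the actual rdiff-backup.yml playbook or cron job; the hostname, the per-host directory layout, and both helper function names are made up for illustration. It only builds the command lines and does not run them.

```python
"""Sketch: build rdiff-backup command lines like those discussed in the meeting.

Assumptions (not taken from the real Fedora Infrastructure setup): the example
hostname, the <root>/<host>/<dir> layout, and the helper names below.
"""
import shlex


def backup_command(host, remote_dir, local_root):
    """Pull remote_dir from host into a per-host directory under local_root."""
    source = f"{host}::{remote_dir}"            # rdiff-backup remote syntax: host::path
    dest = f"{local_root}/{host}{remote_dir}"   # hypothetical layout
    return ["rdiff-backup", source, dest]


def restore_command(backup_dir, target_dir, as_of="3D"):
    """Restore backup_dir as it was `as_of` ago (3D means three days)."""
    return ["rdiff-backup", "--restore-as-of", as_of, backup_dir, target_dir]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for directory in ("/etc", "/home"):
        cmd = backup_command("host01.example.org", directory, "/fedora_backups")
        print(shlex.join(cmd))
    restore = restore_command("/fedora_backups/host01.example.org/etc", "/tmp/etc-restore")
    print(shlex.join(restore))
```

Each daily run against the same destination adds one increment, which is what makes the "restore as of anytime in the past" behaviour mentioned in the meeting possible. To actually run a command, pass the returned list to subprocess.run on a machine that has rdiff-backup installed on both ends.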