20:00:40 #startmeeting
20:00:40 Meeting started Thu Dec 10 20:00:40 2009 UTC. The chair is mmcgrath. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:40 Useful Commands: #action #agreed #halp #info #idea #link #topic.
20:00:47 #topic Who's here?
20:00:47 * mmcgrath is
20:00:52 * dgilmore
20:00:56 smooge: ping
20:01:08 here
20:01:10 * a-k is
20:01:22 putting clothes in dryer
20:01:50 So the outage is this weekend, I figured we should all go over the timelines and such
20:01:53 jwb: This might interest you
20:01:57 Oxf13: ^^
20:02:00 * skvidal is here
20:02:06 * jwb is semi here
20:02:16 So here's the basics
20:02:31 #action - Tonight we'll migrate db1, db2 and the vpn to PHX2
20:02:43 * nirik is sitting in the back in the cheap seats.
20:02:47 mmcgrath: #topic?
20:02:50 #action - Sometime before Sat. Morning mail will start to go through bastion3 in PHX2
20:02:54 #topic Timeline
20:02:55 #topic movement to PHX
20:03:00 #action - Tonight we'll migrate db1, db2 and the vpn to PHX2
20:03:01 #action - Sometime before Sat. Morning mail will start to go through bastion3 in PHX2
20:03:07 * mmcgrath doesn't get this bot stuff sometimes
20:03:16 The mail part is still in an unknown state to me
20:03:25 I've got a ticket open, I've pinged people about its priority, I'm waiting to hear back.
20:03:49 #action - Smooge is working on some resolvers for us, those should be in place ASAP but isn't a blocker for the move.
20:03:54 it is a blocker to start turning things on though.
20:04:02 mmcgrath: do we have the new bastion host up?
20:04:10 dgilmore: yes, bastion3.fedoraproject.org is up and running
20:04:26 Ok, this takes us to Friday morning.
20:04:42 #action - Smooge will email out named scheme for PHX2 resolve domain
20:04:47 #action jwb/ someone in releng will tell me as soon as they are done writing to /mnt
20:05:03 #action at that time I'll mount the public mirrors as read only.
20:05:07 mmcgrath, will likely be two sets of things there
20:05:12 updates, and rawhide
20:05:18 jwb: totally fine, just let me know when it's done.
20:05:24 k
20:05:28 Ok, this is where the coordination stuff happens.
20:05:35 I'll be flying to PHX at noon on Friday.
20:05:39 smooge: when do you leave?
20:06:16 I leave at ~0900 tomorrow morning
20:06:32 that's localtime?
20:06:37 smooge: MST
20:06:51 k
20:07:05 We both have access to PHX1 and PHX2 from Friday to Wed I believe
20:07:25 * SmootherFrOgZ is around
20:07:37 sorry 0800 tomorrow morning arriving at ~0900 MST
20:07:42 k
20:07:56 So I'll be heading to PHX1 in the afternoon to give everything a final look over
20:07:59 this is still Friday.
20:08:12 mmcgrath, when do you get there
20:08:14 I'll probably shut down stuff like db1 since it's not being used at that point.
20:08:20 nm.. see now
20:08:35 I get in at 3:30
20:08:48 03:30 or 15:30
20:08:51 15
20:08:55 what's the plan for mail at this point?
20:09:05 * mdomsch is here finally
20:09:05 skvidal: I said that already :)
20:09:14 oh
20:09:17 it had just scrolled off
20:09:22 so I missed it on the divider :(
20:09:24 sorry
20:09:26 skvidal: I've got a ticket open and have pinged people about its priority
20:09:29 no worries.
20:09:48 I'm hoping for that to get done tonight, if it's not done by then I'm going to need to hand that off to someone to keep bugging people until it gets done because I won't be around on Friday
20:09:56 skvidal: but the good news is we aren't the blockers on that AFAIK.
20:10:02 bastion3 is setup and listening and ready to forward mail.
20:10:14 Ok, so here's what is left
20:10:27 #action Friday night they'll be configuring the new netapp head unit in PHX2
20:10:41 Then sleep
20:10:59 #action Smooge and Mike will meet at PHX1 at 05:15 am
20:11:12 #action Disable nagios
20:11:34 #action make a final backup of sigul and a few other bits (this will probably be done the night before, just forgot to mention it)
20:11:38 #action power everything down
20:11:50 #action movers will arrive at 05:45 to move
20:11:56 action pray
20:11:58 :)
20:12:00 The movers will actually be helping us unhook cables and such.
20:12:01 oh
20:12:16 mmcgrath: buildsys we should disable friday night
20:12:17 #action I'll be taking a sigul drive, and smooge will have the other.
20:12:30 ah, good
20:12:30 dgilmore: we can do it that morning if we want or the night before
20:12:35 mmcgrath, make sure it's got both as being bootable.
20:12:53 The movers themselves will actually be loading the entire racks onto the truck.
20:12:55 sorry thats meant more as I need to ...
20:13:03 mmcgrath, they will be?
20:13:15 yep, I have no idea how that'll work but I bet it's going to be wicked awesome.
20:13:18 we aren't getting new racks in PHX or they will move them from one to another there?
20:13:21 mmcgrath: lets do it at night make sure all in progress builds complete
20:13:28 dgilmore: sure
20:13:46 #action meet the movers at PHX2 where they'll unload the racks
20:14:02 #action we'll then be moving servers from the old racks (which I guess are now sitting near the new racks) and re-racking everything.
20:14:10 * mmcgrath will bring bandaids
20:14:22 now this is where part of our story has changed a bit
20:14:32 we were under the impression contractors would be on site to re-wire everything
20:14:35 that's not true
20:14:42 bandaids? consider stopping by autozone and getting some work gloves.
20:15:07 Jonathan (the guy normally onsite at PHX2) is going to be pre-wiring some of the racks.
20:15:17 depending on how far he gets we may have a lot of cabling to do ourselves or none.
20:15:19 pjones, I have had a rack go through teflon armored gloves.. they need blood to work
20:15:28 :)
20:15:41 as electronics need smoke
20:15:42 surely you mean kevlar?
20:15:57 so lets see where that leaves us...
20:16:04 #action start powering services back on
20:16:10 let me get the priority list...
20:16:11 (and in any case; knives go through kevlar more easily than they go through canvas. the high velocity of something like a bullet is where kevlar gets its strength from...)
20:16:46 #action get the app servers online first
20:16:49 #action then the buildsystem
20:17:00 I only put them in that order because the app servers should be very straight forward.
20:17:05 do we have dhcp?
20:17:12 dgilmore: yeah, and we run it
20:17:19 should we reconfigure all the builders first for the new networks
20:17:30 * mmcgrath is skipping the network config stuff for a later part of the meeting
20:17:34 ok
20:17:37 because some of it is quite a bit different
20:17:40 but we'll get to that.
20:17:55 Now it's while we're re-racking this stuff that we will need help from people.
20:18:07 I've already worked on a checklist via CSI
20:18:14 basically I'm looking for people to re-certify these hosts
20:18:17 what does that mean?
20:18:21 pjones, I meant kevlar.. but teflon would explain why I couldn't hold anything
20:18:29 verify remote management, power and cyclades works, as well as network.
20:18:37 this is stuff anyone in sysadmin-main can do.
20:19:01 Ok, so any questions about that?
20:19:09 mmcgrath, pointer to checklist
20:19:22 smooge: I'm going to be adding that info to the ticket in a moment -
20:19:25 https://fedorahosted.org/fedora-infrastructure/ticket/1845
20:19:30 ok cool
20:19:57 Ok, so as far as timeline goes and the physical work of things are there any questions?
20:20:10 Sometime after the move we'll be moving the db hosts back to their normal machines.
20:20:35 Ok, so the next topic is about what the PHX2 world will look like.
20:20:46 #topic PHX2 - what does it look like?
20:21:04 So in PHX2 we're going to have 3 networks and might add a 4th one later.
20:21:22 The 3 networks are as follows
20:21:32 1) public network - 10.5.126
20:21:37 2) build network - 10.5.125
20:21:43 2) storage network - 10.5.127
20:21:56 the majority of our services will end up on the build network
20:21:59 that should be 3) storage :)
20:22:12 and the majority of our services will end up on the public network, not build.
20:22:15 * mmcgrath needs more sleep.
20:22:19 the build network will be mostly for releng
20:22:29 and the storage network will be for nfs traffic and possibly backups.
20:22:46 I'm going to be doing tests to find out where backups best work and have the least impact on other things.
20:23:12 This also means that several of our hosts will now have multiple IP addresses.
20:23:33 for example relengX will be on the build and storage network.
20:23:38 we won't be routing them.
20:23:46 Anyone have any questions on this?
20:23:49 hello
20:24:28 Ok. so that's really it.
20:24:46 I'm working on a spreadsheet right now that I'd appreciate some second looks at.
20:24:59 I presume we'll only have DNS A records for the inbound service IPs on those dual-homed boxes
20:25:12 mdomsch: I believe so.
20:25:18 e.g. build1's IP on network 3 won't have an A record
20:25:20 we'll be running our own DNS servers in PHX2, smooge is working on that now.
20:25:49 mdomsch: correct but it will probably have a reverse.
20:25:53 smooge: what do you think?
20:25:56 i am going to have two boxes eventually but at the moment just one.. ns001.phx2.fedoraproject.org
20:26:12 mmcgrath, makes sense
20:26:25 reverse is good to keep track of allocated addresses at least
20:27:01 the hosts will just be on the phx2 subdomain which will be a 'hidden' domain (not on public dns since it will contain 192 and 10.x ips)
20:27:47 smooge: also on the meeting yesterday I think it was requested that we RH to still be able to do lookups and reverse lookups so we might be stuck doing fedora.phx2.redhat.com
20:27:52 unless there's an easy way to just mask it for them?
20:27:55 * mmcgrath hasn't thought about that much.
20:27:56 I am going to keep it simple in version one because we don't have a lot of time.. but dual homed would be later
20:28:33 mmcgrath, I think in either case they will be able to do zone transfers from our servers
20:28:37 smooge: wise idea
20:28:44 Ok, so any questions about this?
20:28:50 so we could call it fedoraproject.int and it should be good
20:29:12 at present there is no firewall preventing access to or from any of the 3 networks I've listed.
20:29:17 we'll be doing that later, just ran out of time.
20:29:20 i would suggest not using .int because .int is a real TLD
20:29:30 nb it was a joke sorry
20:29:36 smooge, oh ok :)
20:29:45 Ok, so anyone else have any questions about this?
20:29:51 I really have no estimate for how long this will take.
20:30:08 I've never moved this many machines before and I have no idea how much is already done and how many people we'll have helping
20:30:19 mmcgrath, you are leaving on Tuesday correct?
20:30:41 smooge will be leaving on Wednesday so we wont lose coverage afterwards
20:30:52 and we have problems like "Oh crap that disk drive didn't make it."
20:30:58 smooge: I'm not sure actually I need to look at my trip info.
20:31:28 in case of real emergency I will just get back in the car and drive back til we are stable.
20:31:46 so that's really all I have to discuss
20:31:51 anyone else have any questions or comments?
20:31:59 if not I'll get back to my IP list and let smooge get back to DNS stuff.
20:32:14 anything non-main people can do to help?
20:32:20 #topic open floor
20:32:33 nb: I'm hoping the non-main people will help with logistics and troubleshooting
20:32:38 ok
20:32:47 when the services start coming back online and interacting with each other I suspect there will be lots of little bugs
20:32:56 like say, app2's network is behaving poorly
20:33:18 nb: but also keeping watch in #fedora-admin and #fedora
20:33:42 * stickster wants to say a big "thank you" from all the people in Fedora that won't know to whom they should say it... for working so hard to keep us in business during the move.
20:34:01 nb: but yeah, when stuff comes back online smooge and I will probably say stuff like "the wiki is back"
20:34:05 we'll need verification of that :)
20:34:26 ok
20:34:28 i know i've missed most of the meeting.. but its this weekend, right?
20:34:33 sijis_afk: correct.
20:34:46 stickster: :)
20:34:49 any other questions?
20:34:50 i should be around and help if possible
20:35:07 when I'm done with this IP list I could use some extra eyes
20:35:07 * nb will be around saturday and sunday afternoon/evening
20:35:23 Anyone that'll be around in the next hour or so just look, try to find duplicates or where I've done something stupid.
20:35:29 mmcgrath, ok
20:35:47 If no one has anything else, I'll close the meeting in 30
20:36:39 alllrighty
20:36:46 #endmeeting
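
Editor's note on the 20:35:23 request to look over the IP list for duplicates: a check like that could be sketched roughly as below. This is only an illustration, not the script anyone in the meeting used. The log gives the networks as 10.5.126, 10.5.125 and 10.5.127; treating each as a /24 is an assumption, and every hostname and address in SAMPLE is hypothetical, since the real list is not part of this log.

```python
import ipaddress
from collections import defaultdict

# Assumption: each network named in the meeting is a /24. The log only
# gives the first three octets of each one.
NETWORKS = {
    "public": ipaddress.ip_network("10.5.126.0/24"),
    "build": ipaddress.ip_network("10.5.125.0/24"),
    "storage": ipaddress.ip_network("10.5.127.0/24"),
}


def network_for(ip):
    """Return the name of the network containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    for name, net in NETWORKS.items():
        if addr in net:
            return name
    return None


def find_duplicates(entries):
    """Map each IP assigned more than once to the hosts that claim it."""
    hosts_by_ip = defaultdict(list)
    for host, ip in entries:
        hosts_by_ip[ip].append(host)
    return {ip: hosts for ip, hosts in hosts_by_ip.items() if len(hosts) > 1}


# Hypothetical sample rows for illustration only.
SAMPLE = [
    ("app1", "10.5.126.11"),
    ("app2", "10.5.126.12"),
    ("releng1", "10.5.125.20"),
    ("releng1-nfs", "10.5.127.20"),
    ("builder1", "10.5.125.20"),  # deliberate duplicate of releng1
    ("oddball", "10.5.124.5"),    # outside all three networks
]

if __name__ == "__main__":
    for ip, hosts in find_duplicates(SAMPLE).items():
        print(f"duplicate {ip}: {', '.join(hosts)}")
    for host, ip in SAMPLE:
        if network_for(ip) is None:
            print(f"{host} ({ip}) is not on a known network")
```

Addresses are compared as plain strings here, so the same address written two different ways would not be caught; a real list exported from a spreadsheet might need normalizing first.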