20:00:23 #startmeeting
20:00:23 Meeting started Thu Aug 27 20:00:23 2009 UTC. The chair is mmcgrath. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:23 Useful Commands: #action #agreed #halp #info #idea #link #topic.
20:00:28 #topic who's here?
20:00:31 * ricky
20:00:49 * tmz is in the cheap seats
20:00:57 * nirik is in the back as well.
20:01:05 * jpwdsm is here
20:01:09 * bochecha is just lurking, first infra meeting :)
20:01:09 fchiulli is in nose-bleed territory
20:01:30 smooge: ping
20:01:32 abadger1999: ping
20:01:34 dgilmore: ping
20:01:37 skvidal: ping
20:01:39 here
20:01:40 whoever else isn't here ping
20:01:42 yah
20:01:43 I'm here
20:01:48 first infra meeting for me
20:01:52 Welcome!
20:01:54 cdelpino: hello
20:01:55 pong
20:02:01 Hi
20:02:08 * hiemanshu is here
20:02:09 Hello
20:02:19 Ok, so this is going to be a little bit of an odd meeting. I'd like to cover everything normal in the first 10 minutes, then leave the rest to this PHX1->PHX2 move
20:02:26 which is the first time we've talked about it really.
20:02:28 so here we go
20:02:36 #topic Infrastructure -- Alpha Release
20:02:40 how'd this go?
20:02:58 AFAIK it went fine, I think releng is going to bit flip the night before for the next release.
20:03:07 ricky: how'd it look from your view?
20:03:11 * sijis is here
20:03:19 Good, thanks to a lot of help from tmz and sijis :-)
20:03:42 it seemed to go well from the emails on mirror-list
20:03:45 * mmcgrath is happy to hear that.
20:04:01 Ok, next topic
20:04:08 #topic Infrastructure -- qpidd
20:04:19 what, not hornetq?
20:04:20 I'm continuing work on qpidd, it's moving along and should be in staging fairly shortly.
20:04:33 smooge: ewww
20:04:44 mmcgrath: We going with the multiple-broker setup J5 proposed?
20:04:51 abadger1999: probably not at first.
20:04:54
20:05:02 abadger1999: I've gotten messaging speeds in the tens of thousands per second.
20:05:12 what is qpidd?
20:05:14 I like the sound of that :-)
20:05:31 If we're hitting that sort of traffic in staging then I might look closer.
20:05:34 but one step at a time.
20:05:39 sijis: It's an AMQP message broker. I'll dig up some links for you later.
20:05:42 sijis: a messaging system
20:05:43 mmcgrath: why not?
20:05:56 ah... for that request that came in earlier this week.. gotcha
20:05:58 J5: well, simplicity is the why not. The other question is why would we?
20:06:00 mmcgrath: it is more about security and is easy to setup
20:06:27 mmcgrath: you don't want external clients hooking up to the main AMQP broker
20:06:33 J5: yeah there's still some questions in my mind about all of this stuff for you guys so I may just not fully get the requirements yet.
20:06:41 J5: I thought we weren't doing external clients at first?
20:06:54 mmcgrath: browsers are considered external
20:06:57 * mmcgrath thought 1.0 was all us -> us communication.
20:07:19 K, well I'll re-look at the requirements and see how things go.
20:07:31 Any other questions on this?
20:07:44 k
20:07:46 not for now
20:07:49 #topic Infrastructure -- Zikula
20:08:03 I've got zikula in staging now, ke4qqq is going to re-package the fas module for it which seems to be broken at the moment.
20:08:11 as I understand it, HEAD works.
20:08:16 not really anything major to report there.
20:08:22 #topic Infrastructure -- Open Floor
20:08:31 Anyone have anything else they'd like to talk about over the next 2 minutes?
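For anyone new to qpidd (discussed under the qpidd topic above): it is an AMQP message broker, and clients publish and consume messages through it. Below is a minimal sketch of what publishing looks like from Python, assuming the qpid.messaging client shipped in python-qpid; the broker hostname, queue name, and message content are made up for illustration and are not Fedora Infrastructure's actual setup.

    from qpid.messaging import Connection, Message

    # Hypothetical broker host -- not the real Fedora Infrastructure broker.
    connection = Connection("amqp-broker.example.org:5672")
    try:
        connection.open()
        session = connection.session()
        # "create: always" asks the broker to create the queue if it does not exist.
        sender = session.sender("fedora.test.events; {create: always}")
        sender.send(Message("package foo-1.0-1 built"))  # placeholder payload
    finally:
        connection.close()

A consumer would follow the same pattern with session.receiver() and receiver.fetch().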
20:08:39 cdelpino: you can say hello now and introduce yourself
20:08:43 Everyone who's new, say hi :-)
20:08:44 as can anyone else that is new to the group
20:08:56 Hello everyone
20:09:00 * hiemanshu is not new, but most people don't know me
20:09:02 hi !
20:09:03 Hello everyone
20:09:04 mmcgrath: does that have some of the off-licensed modules still? I need an instance to check on the intended behavior while replacing bits.
20:09:12 I just joined the list earlier this week
20:09:28 J5: you talking about zikula or qpid?
20:09:33 zikula
20:09:42 Excellent -- I know some of you were looking to do web application development?
20:09:54 I'm helping replace the javascript components which are cc-by-nc
20:09:55 we should talk after the meeting in #fedora-admin.
20:10:06 abadger1999, I am
20:10:07 J5: I'm not totally sure actually, I've followed this - https://fedoraproject.org/wiki/How_to_set_up_a_Zikula_sandbox
20:10:09 * abadger1999 is toshio
20:10:26 k, I'll have a look around then
20:10:27 Cool
20:10:27 J5: mchua is probably the best person to tap there, I'm largely just a tool in that process.
20:10:32 mmcgrath, i got oVirt build with SELinux disabled
20:10:36 :)
20:10:37 hiemanshu: heh
20:10:50 Ok, hard stop guys, we're at the 10 minute mark.
20:10:53 now for the juicy stuff.
20:10:59 #topic Infrastructure -- PHX1 -> PHX2.
20:11:00 * hiemanshu stops
20:11:00 nothing from me on the open floor
20:11:07 ah darn
20:11:14 Ok, so we've kind of been talking about this for the last year or so
20:11:19 lots of stuff has been up in the air.
20:11:34 but the current requirement is that we be out of PHX1 by the end of November.
20:11:48 This wouldn't be much of a problem except that we have a small window between when F12 ships and the end of November.
20:12:13 *if* we don't slip, we'll be unfrozen on November 11th.
20:12:29 leaving only a 19 day window to do this.
20:12:42 Which, if we plan it out properly, should be enough time.
20:12:50 So primary point people on this are me and smooge.
20:12:58 * nirik wonders how long travel time is between the two sites...
20:13:15 and it's seeming very likely we'll both be onsite to do this so we'll need 1 or 2 other people on the ground keeping an eye on this
20:13:16 20 minutes
20:13:21 with little traffic
20:13:27 nirik: we're moving from Mesa -> Chandler.
20:13:29 1-2 hours with traffic
20:14:00 are you gonna try and move things a bit at a time? or all at once?
20:14:06 nirik: a little of both.
20:14:12 Unfortunately our window is pretty small.
20:14:27 but we've also got a pretty redundant setup thanks to some of our sponsors... like Tummy.com
20:14:45 * nirik notes in some facilities moves it's been helpful to have an openvpn bridge link between sites, so you can not worry about routing/network until things are physically moved.
20:14:59 nirik: yeah
20:15:14 So we do have a gobby document open - Fedora_Infrastructure_PHX_Move
20:15:41 nirik: here's what I'm hoping we'll be able to do.
20:15:48 1) move db1 and db2 early.
20:15:59 as well as bastion and app1.
20:16:30 bastion, being our vpn box, can connect our remote services to the databases.
20:16:47 app1 just so transifex and the wiki stay up as long as possible.
20:17:01 I'm pretty sure the buildsystem will be completely down during this time.
20:17:09 What about the wiki/other data on the netapp?
20:17:17 I still do not have an estimate on how long (start to finish) this will take.
20:17:26 ricky: that's something I need to talk to Facilities about.
20:17:32 Our current main netapp won't change.
20:17:37 it'll stay in PHX1.
20:17:48 connected by a GE link which we have between the two sites.
20:17:56 Ah
20:17:58 that has the potential to cause a couple of issues.
20:18:01 oh thats different from what I thought
20:18:11 smooge: me too, that's something new as of yesterday.
20:18:16 I think the problem is two fold
20:18:23 1) the PHX2 mirror won't be online yet (for some reason)
20:18:37 well the problem is 1 fold
20:18:45 but not moving our current netapp causes another problem
20:18:53 in that so much of our data lives there.
20:18:54 And the xen guests we have on the netapp :-/
20:19:00 exactly.
20:19:08 ok I am now triply confused
20:19:21 smooge: whats up?
20:19:27 the reason I thought they wanted us out by November was the cage was going away then
20:19:27 question: normally freezes end at release... but I expect some things to still be busy around release time, like mirrormanager. Might be worth trying to list and make sure any apps that are busy around release move last or something?
20:19:33 so the netapp has got to go somewhere
20:19:48 smooge: ah, so I think technically RH has those two cages there.
20:19:49 mirrormanager crawling might stop for a while, but it'll definitely stay up
20:19:54 1 where our servers are, and one where the netapp is.
20:20:03 they're joined by that door in the middle.
20:20:11 I *think* only our server cage is going away right now.
20:20:30 ok again different from what I thought.
20:20:34 nirik: yeah we're going to pay special attention to that stuff.
20:20:40 but thats what happens when one assumes
20:20:49 smooge: me too, I'm not sure if eric was just unaware or if that was a change or if I just misunderstood.
20:20:59 but that netapp is something we want to verify and reverify in our meeting next week.
20:21:26 A couple of other things which will impact this.
20:21:35 jlaska: has some hardware for QA that we're going to run for him.
20:21:41 that's going to get shipped _directly_ to PHX2.
20:21:52 hopefully we can get it up, online, in our IP space and well tested before we start moving stuff.
20:22:13 ok how much and what were the power requirements on it?
20:22:15 that will also help us know what kind of speeds we're talking about between the sites and if we have to relocate trays or get new storage or something.
20:22:36 smooge: 4 1U x3550's and 2 2U ppc boxes.
20:22:41 they're going to go into their own rack for now.
20:23:20 ok cool
20:23:37 My understanding is we get 5 racks there (though I've also heard 4, but I think it's 5)
20:23:55 I'm hoping to stick secondary arch and "other" type services (like qa) into one rack.
20:23:58 and split the other 4.
20:24:04 Jon was sure only about 4.
20:24:04 we should have room for this, as well as proper power.
20:24:12 Rack 1 has Cloud stuff in it
20:24:25 smooge: yeah, we can group the cloud stuff with other things if we want.
20:24:27 Rack 2 would be good for testing
20:24:49 * mmcgrath wonders how much power is being taken up in the cloud rack right now
20:24:58 With the network switches/drops we can have about 8 machines per rack, which is good because that will be about the power limits for the newer systems
20:25:03 10 amps
20:25:24 but I could not see if that was a full 10 or 5-8 because one line is turned off when I was there
20:25:28 that's for 2 x3560 and 5 x3550's
20:25:36 smooge: full 10
20:25:44 which means our other PDU has no load on it atm.
20:25:51 correct
20:26:08 I wanted to know if we turned on the other PDU what we have for it
20:26:12 but still, the rule we're looking to follow is half of the overload on each PDU right?
20:26:34 that way I can get an idea of what will happen if one PDU fails
20:26:52 yeah
20:27:04 I need to do that this week with you/luke to see what our estimates
20:27:04 oh, actually this is probably a better way to view it
20:27:29 we don't want it to go over 16 amps with only one power strip.
20:27:30 total used A between the 2 PDUs must not be greater than the overload A on either of the PDUs
20:27:35 yeah
20:27:45 man, it's amazing how limiting that is.
20:27:50 can you guys get current (ha) info from your existing racks on usage?
20:28:04 nirik: yeah
20:28:17 nirik: if we lose either PDU in our main rack, we're in trouble.
20:28:19 yeah.. it's why you see special 1U horizontals with short power cords in some 1U racks
20:28:22 in our other rack, it's just got one PDU anyway.
20:28:32 bummer. thats no good.
20:28:42 nirik: yeah, *but* the new place is freaking amazing
20:28:47 so we won't have to worry about silly stuff like that.
20:28:48 mmcgrath: pong
20:28:56 excellent.
20:28:58 mmcgrath, if we turned off the old CVS hardware and possibly get one of the PPC's off :).. we would be ok
20:29:11 smooge: we should give a good hard look at it and see exactly how many additional horizontal PDU's we'd have to put in to fill up a rack.
20:29:14 smooge: :)
20:29:17 * ricky wonders what kind of stuff makes a "freaking amazing" datacenter :-)
20:29:20 dgilmore: hey, just talking about this move we've got coming up.
20:29:52 ricky: well, the dc itself is great, and the people doing the wiring to the racks and stuff makes me feel all warm and fuzzy inside when i see it.
20:29:54 mmcgrath: cool. sorry, was doing $DAYJOB stuff
20:30:00 dgilmore: silly $DAYJOB :)
20:30:01 Ah, cool
20:30:01 ricky: the security guards are all norwegian supermodels.
20:30:07 Hahaha
20:30:14 ricky: and it's cold in there... real cold.
20:30:20 :: cough cough ::
20:30:21 anyway
20:30:30 The IBM machine heat issues? :-) Hehe
20:30:33 spot: so you know that im needed to help physically move things right ;)
20:30:39 smooge: oh, another thing that came up was confusion over open ports in the racks.
20:30:55 dgilmore: *laugh*
20:30:56 I want to re-verify that because if we are stuck with the 24 port dropdowns that are in the racks already that might impact our plans.
20:31:37 mmcgrath, ah
20:31:40 but my understanding as of yesterday is that we're going to throw a dedicated switch into each rack.
20:31:55 maybe 2 if we have really high density in the rack.
20:32:23 If we move to having a separate storage network, we're looking at a minimum of 3 ports / server.
20:32:27 and that fills up a switch pretty quick.
20:32:57 and that makes a good segue into another thing we need to figure out...
20:32:59 network design.
20:33:03 Woah :-) Why all the separate networks?
20:33:20 ricky: one for normal data, one for storage and one for the RSA-II (management) ports
20:33:33 dgilmore: we've talked about it in the past very briefly but did we still want to put the builders on their own network?
20:33:37 Oxf13: ping
20:34:35 * mmcgrath assumes they're busy.
20:34:35 mmcgrath, I don't think we will need that many ports.
20:34:45 smooge: how many do you think we'll need?
20:34:48 Oxf13: is at a dentist, I think :-/
20:34:52 one sec I am working it out
20:35:04 We've already got 2 ports / server, if we add another that's 3 ports / server.
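As a back-of-the-envelope illustration of the two constraints being discussed here (the redundant-PDU power rule and ports per server), here is a rough sketch in Python. The 16 A single-strip limit comes from the meeting; the per-server amperage and the 48-port switch size are guesses for the sake of the example, not measured or confirmed numbers for these racks.

    # Rough sketch of the rack-planning math from the discussion above.
    SINGLE_PDU_LIMIT_AMPS = 16.0   # total draw must fit on one PDU if the other fails
    SWITCH_PORTS = 48              # assumed size of the per-rack switch
    PORTS_PER_SERVER = 3           # data + storage + RSA-II management
    AMPS_PER_1U = 1.3              # hypothetical draw for an x3550-class 1U box

    def check_rack(servers):
        """Return (power_ok, ports_ok, amps, ports) for a rack of 1U boxes."""
        amps = servers * AMPS_PER_1U
        ports = servers * PORTS_PER_SERVER
        return amps <= SINGLE_PDU_LIMIT_AMPS, ports <= SWITCH_PORTS, amps, ports

    for n in (8, 12, 16):
        power_ok, ports_ok, amps, ports = check_rack(n)
        print("%2d servers: %4.1f A (%s), %2d ports (%s)"
              % (n, amps, "ok" if power_ok else "over", ports, "ok" if ports_ok else "over"))

With these guessed numbers, power becomes the limiting factor before the switch fills up, which is consistent with the "amazing how limiting that is" comment above.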
20:35:25 mmcgrath: id be ok if they were
20:35:36 dgilmore: would we want the releng stuff on that network or on their own network
20:35:42 * mmcgrath knows thats more of a Oxf13 question
20:35:52 mmcgrath: likely the same network
20:36:00 they need access to the same resources
20:36:11 mmcgrath: we probably want to put cvs on it also
20:36:12 yeah
20:36:21 we can put 12 1U servers in a rack. If we went with a NFS, network traffic, maintenance topology we would need 3 connections per server.
20:36:26 36 ports
20:36:44 smooge: so that's 12 / server with 2 PDU's or did that not take power into account?
20:36:59 and there's some exceptions to that rule
20:37:03 db3 for example
20:37:07 and the builders really
20:37:26 large numbers of disk drives and such use more power and count as 1.5-2 1U's
20:37:36 yeah, that's true too.
20:37:44 *and* we've got that blade center to figure out.
20:38:25 is it an IBM blade center?
20:38:29 sijis: yeah
20:38:35 and we need to replace it next year.
20:38:40 perhaps with another blade center.
20:38:54 what model? (we have one in our office that we aren't sure what to do with in a few months (i think) )
20:39:31 sijis: an H series.
20:39:45 i'll have to check what we have.. i'll ping you later on it.
20:40:33 I forget the db3/builders use less power? or more power?
20:40:50 db3 would use more, the ppc builders more
20:41:01 dgilmore: is xenbuilder4 the last non-blade x86 builder?
20:41:09 mmcgrath: yup
20:41:20 smooge: and xenbuilder4 which likely uses more.
20:41:20 we have 9 ppc builders in use
20:41:30 and 9 x86_64
20:42:00 ppc 2,3,4 are non blade boxes
20:42:13 and 5-10 are blades
20:42:42 I'd like to get 2 x3550's installed in PHX2 prior to the move.
20:42:44 we have xenbuilder4 non-blade and x86-1-8 which are blades
20:42:53 as well for testing and temporary locations for things.
20:43:07 dgilmore: so we have the same number of ppc and x86_64 builders now?
20:43:08 * mmcgrath didn't realize
20:43:18 mmcgrath: we do
20:43:26 mmcgrath: well not really
20:43:32 x86-8 is the compose box
20:43:39 yeah
20:43:45 and ppc1 is a compose box
20:43:56 so there is 1 more ppc than x86_64
20:44:10
20:44:43 smooge: so as it stands there's still a lot of outstanding questions I have for RHIT, how about you?
20:45:18 smooge: also I think we'll gain a U for each rack since I don't think we'll have KVM there though I could be wrong.
20:45:29 I think they're starting to move towards preferring crash carts.
20:45:37 there were KVM's in some of the racks
20:45:46 in PHX2?
20:45:50 perhaps that's up to us.
20:45:51 no crash carts allowed in the place that I could tell
20:45:58 I'm conflicted about it.
20:46:06 they've got them.
20:46:14 unless they had them and had to get rid of them.
20:46:18 I'll add that to the question list.
20:46:27 I think I'd almost prefer a crash cart to having KVM hooked up.
20:46:35 hidden then. Most of the racks I saw were KVM'd..
20:47:00 we want networked KVM :)
20:47:31 smooge: we have it in PHX now actually
20:47:34 but we almost never use it
20:47:46 because it's pretty kludgy and the RSA-II + cyclades have always been enough.
20:48:11 mmcgrath: RSA, ALOM, iLO etc and cyclades is all we need i think
20:48:29 smooge: just so I know, what is your opinion of blade centers?
20:48:39 welll.....
20:48:44 I have only dealt with 3
20:48:53 smooge: you can feel free to be honest, some people love them some people hate them.
20:48:57 pain, blood, and this one
20:49:07 That's been my experience as well.
20:49:17 mmcgrath: HP's blade centres are nice
20:49:17 although our current bladecenter hasn't been too terrible over the last year or so.
20:49:19 the main issues I have had with them were getting them
20:49:22 IBM's is ok
20:49:26 to work
20:49:36 smooge: yeah, I totally hear you there.
20:49:37 once you got them to work they were pretty easy.
20:49:44 well we as a team have some decisions to make then.
20:49:58 because BC's would help us augment our PDU and network issues.
20:50:13 perhaps mitigate is the word I wanted there :)
20:50:21 but the x3550's have been very good to us
20:50:24 the other issue is that I usually end up with ones that have been End of Lifed right after you buy them
20:50:53 smooge: hahaha, ours sat on the ground for 2 years or so before it got racked.
20:50:56 because the company making them (Compaq I think) couldn't handle the returns
20:50:58 though I don't think that will be a problem going forward.
20:51:12 Wow, 2 years!
20:51:17 IBM is more of a company that supports stuff for centuries
20:51:25 (computer centuries that is)
20:51:35 ricky: we couldn't figure out how to get power to it, the datacenter we were at was refusing new power drops because they didn't have it.
20:51:45 Crazy.
20:52:08 The Dell blade thing they got at UNM when I was leaving looked really nice and didn't have any issues.. but I only dealt with it after setup
20:53:04 Well, most of our environment is going to grow soon, not as much replacement as last year.
20:53:22 so we should decide if we want to continue with the x35[56]0's or look at something else.
20:53:27 But that's in the future.
20:53:36 smooge: do you have anything else you want to discuss on the move at the moment?
20:53:54 no. I will try to get a wrtie up of ideas out tomororw.
20:54:04 man that was badly spelled
20:54:24 :)
20:54:31 anyone else have any questions or concerns about this?
20:54:38 it's going to be really bad if we slip the release :(
20:54:57 So spread the word, if F12 slips, it'll make baby mmcgrath cry.
20:55:15 mmcgrath: you mean baby rpm
20:55:22 heh
20:55:34 Hahah
20:55:46 Ok, if no one has anything else we'll close the meeting in 30
20:55:48 he will probably cry anyway
20:55:53 * dgilmore has nothing
20:56:31 ok
20:56:36 #endmeeting