20:01:40 #startmeeting infrastructure
20:01:40 Meeting started Thu Mar 10 20:01:40 2011 UTC. The chair is smooge. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:01:40 Useful Commands: #action #agreed #halp #info #idea #link #topic.
20:01:47 #meetingname infrastructure
20:01:47 The meeting name has been set to 'infrastructure'
20:01:57 #chair nirik skvidal
20:01:57 Current chairs: nirik skvidal smooge
20:02:15 ok, over here?
20:02:35 I think this will be short and it seemed what people wanted from the emails
20:02:54 my left hemisphere is pounding so let's just do this
20:03:38 ok. ;)
20:04:05 goozbach should have the agenda.
20:04:14 #chair goozback
20:04:14 Current chairs: goozback nirik skvidal smooge
20:04:18 #chair goozbach
20:04:18 Current chairs: goozbach goozback nirik skvidal smooge
20:04:41 ok #topic roll call
20:04:49 #topic roll call
20:04:49 here
20:04:51 * skvidal is here
20:04:58 here
20:04:58 * nirik is around
20:05:42 ok goozbach could you run the agenda. my head hurts too much to be on top of it
20:06:33 smooge: on it
20:06:37 * abadger1999 here but will be back in session in a moment
20:07:02 #topic serverbeach outage
20:07:11 who has the update about the outage?
20:07:30 it's serverbeach1/2/3 machines.
20:08:01 and what's on them?
20:08:07 which is ns1, asterisk1/2, and collab1
20:08:07 who needs to be notified?
20:08:12 * nirik doublechecks.
20:08:15 has the notification happened?
20:08:36 I started writing the outage last night and got
20:08:39 sidetracked
20:08:41 serverbeach2.fedoraproject.org:asterisk1:running
20:08:41 serverbeach2.fedoraproject.org:collab1:running
20:08:41 serverbeach2.fedoraproject.org:ns1:running
20:08:42 serverbeach3.fedoraproject.org:asterisk02:running
20:08:42 serverbeach3.fedoraproject.org:collab2:running
20:08:52 serverbeach1 is a backup downloader for some things
20:09:36 the email was going to go to fedora-announce, fedora-devel, and fedora-infrastructure.
20:09:41 so, mailing lists (collab*) and talk will be down.
20:09:51 ns2 should be able to handle our ns needs?
20:10:01 nirik, I don't know.
20:10:19 nirik: in terms of scale - I bet so
20:10:21 and we probably don't have time to add a ns3 and add it to whois?
20:10:36 how long is the outage?
20:10:37 the actual outage is pretty short... 15min somewhere in the 6 hour window.
20:10:37 nirik: correct - it wouldn't make it out to the world
20:10:42 goozbach: theoretically?
20:10:44 not that much
20:10:48 what nirik said
20:10:55 I see
20:11:13 who's gonna be on hand to hold hands to get it back up if things go south?
20:11:14 * CodeBlock here - why are we not in -admin?
20:11:20 CodeBlock: last min change
20:11:22 :)
20:11:32 I'm being the slave driver today so we'll end on time
20:11:35 :)
20:12:03 what kind of outage is it? reboot? power? networking?
20:12:08 networking
20:12:10 goozbach: networking.
20:12:22 goozbach: it's not ours - it's serverbeach's
20:12:24 so the hosts themselves don't need babysat to reboot
20:12:24 upgrading switches.
20:12:27 gotcha
20:12:54 so, we send outage announce today and moving forward we look at adding a ns3 that is elsewhere?
20:13:00 nod
20:13:11 #info serverbeach outage affects asterisk1, collab1, ns1, asterisk02, and collab2
20:13:15 collab and asterisk will be harder to multihome, but we should look at that long term
20:13:39 #info outage is switch upgrade so servers don't need babysitter other than check they are alive
20:14:10 * nirik thinks that's all on this.
20:14:14 #action -- multi-home asterisk
20:14:21 umm
20:14:22 no
20:14:27 kick asterisk in the face
20:14:30 and be done with it
20:14:33 well, or remove. ;) but that's another topic.
20:14:39 * skvidal watches the clock
20:14:44 tic tic tic
20:14:56 #action blow asterisk away
20:14:59 topic++ ?
20:15:05 +++
20:15:07 next
20:15:12 +++ATH0
20:15:15 NOCARRIER
20:15:18 ha.
20:15:19 #topic kick cvs in the head
20:15:28 nirik: you again? updates?
20:15:30 ok, so I looked at updates on this...
20:15:48 there are 2 CVS using projects that don't want to go to fedorahosted because they love the cvs.
20:15:59 I think we should tell them to go to sourceforge or whoever still does cvs...
20:16:05 and set a deadline.
20:16:05 #action multi-home collab
20:16:17 goozbach: multihoming collab isn't happening this week
20:16:26 .ticket 1519
20:16:26 so it is kinda a silly action item
20:16:28 nirik: #1519 (Add packages to infrastructure repo) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/1519
20:16:29 no, but I'm recording minutes of things we brought up
20:16:44 should prob be an #idea instead
20:16:46 nirik: nod. I suspect we'll get some other push back from them
20:16:46 goozbach: might be better to "#info look into ways of multihoming collab"
20:16:52 nirik: but here's another option
20:17:07 nirik: if they do not need cvsweb
20:17:07 I'm still getting the hang of this meetingbot thing
20:17:07 also, there are 'elvis' changes?
20:17:08 or something
20:17:15 goozbach: no worries.
20:17:19 nirik: put the cvs repo on people01
20:17:21 in a shared dir
20:17:25 and tell them to do it via ssh
20:17:34 #idea look into building ns3 and getting it out to the world
20:17:51 nirik: I think I mentioned this before - possibly in the ticket
20:18:02 skvidal: yeah, that could work, but someone who doesn't want to move from cvs probably doesn't want to change their workflow at all.
20:18:26 nirik: this is not our problem
20:18:34 but we don't want to keep cvs forever, and times change. ;)
20:18:35 nirik: we are not here to provide perpetuity of their workflow
20:18:54 I'll ask in the ticket
20:18:56 * nirik nods.
20:19:01 tell them CVS over ssh, or move to git?
20:19:08 or to svn
20:19:09 or bzr
20:19:12 or hg
20:19:14 * skvidal doesn't care
20:19:15 yeah
20:19:18 just NOT CVS
20:19:22 There is one more project to move to git, the 2 in cvs, and the elvis stuff and then I think we can turn out the lights. ;)
20:19:41 #info two projects in CVS, Elvis, then lights out
20:19:52 #info one more to move to git
20:20:03 nirik, skvidal anything else?
20:20:10 nope. Move on.
20:20:18 nope
20:20:23 #topic server updates
20:20:28 skvidal: take it away!
20:20:51 okay
20:20:53 so here you go
20:21:03 very basic update info we have
20:21:03 http://skvidal.fedorapeople.org/hidden/update-info-2011-03-10/
20:21:12 these boxes get a kernel update
20:21:12 http://skvidal.fedorapeople.org/hidden/update-info-2011-03-10/kernel-updates
20:21:26 these boxes have something that someone has labeled as 'security'
20:21:27 http://skvidal.fedorapeople.org/hidden/update-info-2011-03-10/security
20:21:35 and here is what they are
20:21:35 http://skvidal.fedorapeople.org/hidden/update-info-2011-03-10/security-updates-by-host
20:21:41 a lot are just git-related
20:22:06 and here are all the boxes with updates
20:22:07 http://skvidal.fedorapeople.org/hidden/update-info-2011-03-10/update
20:22:21 all the boring ones will get applied
20:22:29 like x86##
20:22:33 ;)
20:22:35 and the xen## boxes
20:22:49 when I get to the app## and db## servers I'll be pinging people
20:22:55 to find out what I'm about to light on fire
20:23:07 I'll probably wait until toshio is back in town to light the app servers on fire
20:23:17 yeah. where is he?
20:23:23
20:23:26 Atlanta for pycon
20:23:27 smooge: pycon
20:23:30 what about reboots?
20:23:44 nirik: well all the kernels will need them - but not right away
20:23:44 cries a tear at not being able to go
20:23:49 I'll be back at the end of next week.
20:23:52 nirik: it's not like we remove the old ones
20:23:58 abadger1999: END of next week?
20:23:58 yeah, should we schedule those sometime?
20:24:01 abadger1999: pycon is a week long?
20:24:15 nirik: well - that's the reason to apply the updates - to schedule them all
20:24:22 nirik: our biggest problem is so many of them are on xen instances
20:24:26 and we need to reboot those....
20:24:28 yeah...
20:24:32 skvidal: conference is 3 days. Then a language summit before and hackfests afterwards.
20:24:36 so we end up in a reboot cacophony of doom
20:24:38 #info skvidal to start applying updates and will ping when hitting special boxes
20:24:40 abadger1999: ha
20:24:58 also, it would be cool if someday we could get to the point where we could do mass reboots of everything and not have it service impacting at all...
20:25:05 ie, enough redundancy.
20:25:20 #idea get enough redundancy to have mass reboots without service impact
20:25:22 :)
20:25:22 nirik: it would be
20:25:27 nirik: but that's not today most of the time
20:25:37 nirik: a lot of that is b/c we're doing our own hosting
20:25:41 yeah... with the db boxes, and lots of stuff on xen, etc.
20:25:44 so we have to reboot the xen boxes
20:25:47 and not just the host
20:25:51 s/host/guest/
20:26:00 if I only had to reboot guests this would be much simpler
20:26:20 well, if we could migrate guests around that would do it too...
20:26:28 but that usually wants shared storage. ;)
20:26:43 nirik: for some of the hosts
20:26:46 it is not important
20:26:54 but for a bunch of that - migrating would be expensive
20:26:57 super-duper expensive
20:27:00 b/c of all the disk they have
20:27:05 anyhow, back to updates... should we send an outage announcement on this? make a window? or ?
20:27:11 for the updates?
20:27:13 nah
20:27:23 just applying them is normally danger-free for most of the cases
20:27:26 well, if they had sheepdog or cloudfs in the backend, it would be not expensive. ;)
20:27:28 the weird cases are app*
20:27:42 nirik: I meant time-expensive - but I agree
20:27:50 nirik: which is another reason to pursue those paths in the future
20:27:54 * nirik nods.
20:28:03 I meant outage for the reboots.
20:28:03 and as time goes on
20:28:10 I think right this moment
20:28:14 we can delay on the reboots a bit
20:28:28 until we have something to pair it with
20:28:32 or a better time
20:28:39 alternatively we can do them a bit at a time
20:28:47 invariably, whenever we do reboots
20:28:52 we get a new kernel update the next day
20:28:59 sure, the kernel update doesn't seem too critical to me.
20:29:06 https://rhn.redhat.com/errata/RHSA-2011-0303.html
20:29:26 it's not
20:29:36 https://rhn.redhat.com/errata/RHSA-2011-0329.html
20:29:50 go forth and update. ;)
20:29:53 * nirik has nothing more on this
20:30:01 nod
20:30:06 skvidal: probably want to reboot before next freeze though right?
20:30:15 seems reasonable
20:30:19 that's 3/28 iirc
20:30:22 yeah
20:30:26 #idea reboot before next freeze
20:30:37 ツ
20:30:42 skvidal: anything further?
20:30:50 not at the moment
20:30:54 very well
20:30:54 thx
20:31:09 #topic "gather community feedback"
20:31:21 do we want to pursue this further?
20:31:40 or did we leave it as "give us a RFR and we'll setup a team on it"?
20:31:45 I don't think there is anything for infra to do yet here.
20:32:01 agreed there
20:32:04 I thought as much, but didn't want it to fall off the radar
20:32:12 If someone wants to run with a webapp that gathers feedback, great...
20:32:23 #agreed community feedback will wait till we receive a RFR
20:32:29 but until then, I think we have enough to keep busy.
20:32:29 good good
20:32:40 fresh cup move down
20:32:44 #topic meeting tickets
20:33:05 I'm not certain which of these tickets are most pressing, but I've picked the ones which look the most important to me
20:33:23 #url https://fedorahosted.org/fedora-infrastructure/query?status=new&status=assigned&status=reopened&group=milestone&keywords=~Meeting&order=priority
20:33:33 .ticket 2502
20:33:36 goozbach: #2502 (Retrace Server) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2502
20:34:02 retrace server RFR
20:34:04 Ok this is basically going to need hardware to be bought by the team to meet their larger goals
20:34:26 #info needs more hardware to meet the goals
20:34:32 smooge: so it's in a holding pattern?
20:35:15 yes
20:35:21 ok
20:35:28 I suspect they need lots and lots of disk
20:35:33 .ticket 2517
20:35:35 goozbach: #2517 (Need mod_evasive for EL6) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2517
20:35:56 anyone working on this? what's the priority?
20:35:59 do we actually need this?
20:36:15 * nirik doesn't know.
20:36:52 we'll move on then
20:37:04 .ticket 2531
20:37:05 goozbach: #2531 (DB03 update) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2531
20:37:42 this is the postgres upgrade, the plan has been laid out in the ticket, but nothing further
20:38:03 skvidal?
20:38:11 is this something we can get to before the next freeze?
20:39:01 I don't know
20:39:05 the plan went away
20:39:28 mainly due to notenoughtime
20:39:30 * nirik notes the schedule of freezes is at http://rbergero.fedorapeople.org/schedules/f-15/f-15-infrastructure.html
20:39:40 I don't think we can do this by next freeze
20:39:55 It requires dgilmore who is busy with family for a bit
20:40:08 We are going to have enough issues with moving systems next week
20:40:11 if we have reached NOTENOUGHTIME and the big part of help is busy, then it's gonna have to hold
20:40:12 yeah.
20:40:23 so, let's push it out to between beta and final sometime?
20:40:27 #agreed needs to be held as not enough time before freeze
20:40:37 and/or ask dgilmore when
20:40:37 #idea move between beta and final
20:40:46 nirik, since it affects the db for builds.. I think it will have to wait til after final
20:41:02 smooge: good point
20:41:09 yeah, possibly... although it's never a good time to stop builds. ;)
20:41:17 someone will always want to be building something.
20:41:35 nirik: but more so when we are alpha or beta
20:41:41 ok next
20:41:54 .ticket 2539
20:41:55 goozbach: #2539 (decom xb-01 and reallocate bxen01) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2539
20:42:07 * nirik shrugs. It's always the case... security issue comes up, or a bad bug is found. :) There's never a good time.
20:42:23 ok this needs to be another dgilmore needs time issue.
20:42:50 #info another dgilmore item
20:42:53 I can rebuild xb-01 on bvirthost01 but I don't know how useful that will be.
20:42:59 nod
20:43:04 got it, and moving on...
20:43:04 what is xb-01 for?
20:43:09 or not...
20:43:09 good question
20:43:12 :)
20:43:20 action item to find out?
20:43:22 no idea. it is a xen build host.
20:43:59 ok.
20:44:01 I don't know if it's there for testing whether builds work on xen or something else. I don't think it is used much
20:44:10 because it's ssssllllloooooowwwww
20:44:49 ok, move on, we will find more info. ;)
20:44:51 well we'll look later
20:44:59 #action find info xb-01
20:45:12 .ticket 2574
20:45:13 goozbach: #2574 (Perform regular inactive account prunings and possibly a password reset policy.) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2574
20:45:16 that's the pruning ticket
20:45:18 any updates?
20:45:56 I keep getting sidetracked
20:46:23 I need to talk with abadger after meeting on a better way to get the info I need out of fas because it's all using zodbot at the moment
20:47:02 smooge: You can start here: https://fedorahosted.org/releases/p/y/python-fedora/doc/existing.html#fas
20:47:06 is 2544 and 2542 on your list
20:47:19 yup
20:47:22 let's do those
20:47:28 .ticket 2544
20:47:31 * nirik notes we have about 13min left.
20:47:35 goozbach: #2544 (migrate autoqa01 elsewhere) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2544
20:47:39 ok 2544 goes down next week
20:47:42 last two tickets, then open floor
20:47:53 I will not be at the meeting most likely as it will be my whole focus
20:48:00 [if I get network data ever]
20:48:14 #info ticket 2544 will be happening next week, smooge will be doing that during meeting time
20:48:18 we have racks, we have people to move things. I just need ips and firewall changes
20:48:36 smooge: you going out on site?
20:48:42 or just wrangling from there?
20:48:42 no. no money to do so.
20:48:46 ok.
20:48:56 I would feel better if I could but it would be out of pocket
20:49:05 if something goes wrong, I will do so though
20:49:31 #info just needs IP address and firewall changes
20:49:42 things that I am waiting on to get this done: 1) network ips 2) word from the s390 and ppc community people that I can get their help
20:50:06 other than that it's ready
20:50:26 as it is tied in with autoqa as they are going to the same network
20:50:32 ok I am done on that.
20:50:39 .ticket 2542
20:50:40 goozbach: #2542 (reinstall fas01 on rhel6 kvm host) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2542
20:51:04 we have 3 rhel6 kvm hosts now, right?
20:51:06 We got the hardware finally (it was supposed to be ready before fudcon) I am hoping to get it networked and ready next week
20:51:15 then we will have 3.
20:51:25 cool.
20:51:58 I am really hoping it can happen tomorrow as I would like to have monday and tuesday set up of moving stuff off of old xen boxes.
20:52:14 #info hardware finally here, will have three RHEL6 kvm hosts when finished
20:52:24 and I am done
20:52:27 the other two are internetx and ?
20:52:43 very well
20:52:47 #topic open floor
20:52:51 wait
20:52:55 #topic meeting time
20:53:08 next week is DST change for the US
20:53:23 oh yeah, I was going to suggest we move back an hour (or two) and meet here again moving forward.
20:53:25 we also have been bumping up against the cloud meeting here in this channel
20:53:39 yeah, thus I was suggesting we move back an hour. ;)
20:53:41 so moving the meeting to 1900 UTC
20:53:51 +1 from me
20:53:56 +1 from nirik
20:54:03 anyone else want to chime in?
20:54:17 +1 as long as I get nirik to take over soon
20:54:34 nirik: you want to run meeting next week?
20:54:55 * skvidal is fine
20:55:00 whatever
20:55:02 I'll still do agenda unless I'm at the hospital delivering my baby
20:55:03 goozbach: doesn't matter to me...
20:55:03 my criteria are
20:55:05 not interrupted
20:55:10 and not pushed out of a virtual space
20:55:29 skvidal: having someone shepherd the meeting helps with the second
20:55:34 having an agenda helps too
20:55:37 ok
20:55:39 so it's
20:55:42 no I meant more in nirik taking over infra team soon
20:55:44 goozbach: I can... is your baby due soon?
20:55:53 goozbach: I don't like timelines
20:56:09 #agreed meeting moved to Fedora-meeting at 1900UTC
20:56:24 nirik: any day now actually
20:56:35 #topic open floor
20:56:38 anything else?
20:57:07 * nirik looks forward to working with folks more moving forward. ;)
20:57:38 * CodeBlock looks forward to working with nirik more too :D
20:57:58 oh, I note that I will probably be somewhat offline the first week of april.
20:58:33 * CodeBlock wishes nirik the best of luck with the new position btw
20:58:34 a minute to spare for the cloud folks. ;)
20:58:39 yay!
20:58:46 ok nothing else I move to
20:58:50 #endmeeting