19:59:45 #startmeeting Infrastructure
19:59:45 Meeting started Thu Sep 17 19:59:45 2009 UTC. The chair is mmcgrath. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:59:45 Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:59:51 #topic Who's here?
19:59:52 * ricky
19:59:56 * nirik is in the back.
19:59:59 * SmootherFrOgZ is
20:00:02 * skvidal nods
20:00:07 * collier_s here
20:00:09 * a-k is here
20:01:25 Ok, doesn't look like we have any meeting tickets so we'll just get right into it
20:01:51 #topic FAS accounts and captcha
20:02:05 One thing I wanted to make known is we recently implemented a captcha in FAS.
20:02:08 by we I mean abadger1999
20:02:18 the problem was mostly caused by spammers, we think.
20:02:39 Still more research to do, but as many as 2 in every 3 accounts we have in FAS were created by a spammer.
20:02:46 or at least someone who didn't verify their email.
20:02:54 But! no more :)
20:03:01 :-)
20:03:06 here
20:03:11 we should re-consider freeing up those account names as well. Something worth discussing.
20:03:20 they'd not have been in any groups or anything so it might be safe.
20:03:24 might not be though, more research required.
20:03:37 * ricky is all for freeing up accounts that have never been verified (ie never logged on)
20:03:48 yo
20:03:53 mdomsch: hey
20:03:59 * ianweller rolls in for da lulz
20:04:08 anyone have any questions or comments on that?
20:04:28 wow, that's crazy
20:04:52 mdomsch: indeed :)
20:04:55 so, does this screw up all our account/registration growth stats?
20:05:11 notting: it does as far as what we have done in the past, but we can get new stats I think.
20:05:12 mmcgrath: It'll be interesting if our initial registration drops
20:05:22 abadger1999: I'm almost positive it will.
20:05:23 Did we ever really assume that # accounts = # of active people?
20:05:33 ricky: depends on 'we'
20:05:37 It'll only drop significantly if the accounts are being created by a machine.
20:05:46 I really don't consider a contributor a contributor unless they have a fedorapeople account.
20:05:54 I guess the question is if those specific numbers are usually quoted to the press or anything
20:06:13 Same here - I get my counts from ls /home/fedora | wc -l on fedorapeople.org :-)
20:06:18 ricky: some are. *but* generally we encourage people to use 'contributor' when citing those numbers.
20:06:23 People that sign up but never verify email would still get through the captcha.
20:06:25 and we define a contributor as cla_done + one group
20:06:28 which is the fedorapeople count.
20:06:37 and that count will stay the same
20:07:31 So that's really all there is on that.
20:07:43 One thing I wanted to talk about was with affix
20:08:04 but I don't see him around right now so we'll wait.
20:08:09 he's looking for search engines for us.
20:08:24 Ok, so next topic
20:08:28 #topic PHX1 to PHX2 move
20:08:55 So smooge and I have been working to get the various network maps, inventory and other related directions to RH's IT department.
20:09:04 Much of that stuff is available in gobby.
20:09:21 makes gobby gravy
20:09:25 smooge also put some network diagrams up http://smooge.fedorapeople.org/ideas/
20:09:43 It looks like we're going to be moving into 5 racks.
20:09:49 and spread properly so power isn't an issue.
20:10:00 we might expand into the 6th rack but I'm not counting on it, at least not for November.
20:10:05 and look at smaller PDUs as we expand
20:10:13 * Oxf13
20:10:15 here
20:10:18 smooge: yeah, are you a fan of those 1U horizontally mounted PDUs?
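[Editor's note: the contributor count ricky describes earlier (cla_done plus one group, approximated by one home directory per fedorapeople account) is just a directory count. A minimal sketch, run against a throwaway directory here rather than the real /home/fedora on fedorapeople.org:

```shell
# Count "contributors" the way ricky describes: one home directory per
# fedorapeople account.  Demonstrated on a scratch directory so it can
# run anywhere; on fedorapeople.org the path would be /home/fedora.
home=$(mktemp -d)
mkdir "$home/ricky" "$home/mmcgrath" "$home/mdomsch"
count=$(ls "$home" | wc -l | tr -d ' ')
echo "$count contributors"
rm -rf "$home"
```

This only approximates "cla_done + one group"; the authoritative number lives in the FAS database, not the filesystem.]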
20:10:45 mmcgrath, more that I am a fan of PDUs that aren't treated like checkbooks :)
20:11:02 most PDUs have too many plugs for what you can use with modern equipment
20:11:35 maybe we should start getting servers with 4 PSUs in them :)
20:11:43 then double the PDUs
20:11:49 oh wait, that's the same as just doubling the PDUs
20:12:07 anywho :)
20:12:20 So I still have some outstanding questions, I'm sure smooge does too
20:12:27 but it looks like this is going to be about a 48 hour outage.
20:12:39 mmcgrath, when?
20:12:40 no we should look at DC racks
20:12:53 mdomsch: after F12 ships, before the end of November.
20:12:54 mdomsch, depends on how much F12 slide there is
20:13:01 smooge, DC still doesn't buy much
20:13:12 Oxf13: what's the latest "what are the odds of F12 slipping"?
20:13:21 no change
20:13:25 k
20:13:30 still looking good
20:13:36 So does anyone have any questions or concerns about this move?
20:13:40 mmcgrath, can we migrate bapp1 to another datacenter before then?
20:13:44 it's going to be a lot of planning, then a week or two of hell.
20:13:51 mdomsch: yeah we can add that to our list.
20:14:12 I've moved stuff I had budgeted for the last quarter of this FY into a more major purchase just this week.
20:14:21 * ricky will probably be in his own hell week at that point :-(
20:14:23 we'll have some servers in PHX2 hopefully in a couple of weeks that we can use for that.
20:14:36 ricky: sucker, at least we don't get graded :)
20:14:55 mdomsch: so right now the list of things we want to 'pre-move' is db1, db2, bastion3 (new), app1 and bapp1.
20:14:57 mmcgrath, sure we do
20:15:19 mdomsch: I always thought this was more pass/fail :)
20:15:27 ricky it's just midterms... you should skip them and come on out
20:15:29 failure is not an option
20:16:00 failure is always an option.. in fact most of my academic career is based on that.
20:16:11 but then again one should not take Zonker Harris as a role model
20:16:18 if we slip, does that mean we relocate fudcon to mesa and do the move then?
20:16:24 Haha
20:17:29 notting: actually not the worst idea I've ever heard :)
20:17:30 hey, it'll be warmer than Toronto
20:18:00 smooge: zipper, then?
20:18:01 One other thing we wanted to make people aware of is in PHX2 we're going to start separating our network segments.
20:18:12 We're basically going to move to a 3 network system
20:18:14 nah zipper is a real loser
20:18:21 1) Buildsystem network
20:18:28 2) Combined nfs / storage type network
20:18:31 3) public network.
20:18:47 For those familiar with the environment, it's not too hard to figure out what will go where.
20:19:04 Where does something not public like bapp1 go?
20:19:11 ricky: on the public network
20:19:15 Oh, OK :-)
20:19:24 smooge: You're just too old to appreciate him.
20:19:32 the public network isn't so much "public IP space" as it is just for all general network stuff, like the 834 network is now in PHX.
20:19:48 Ah, OK
20:19:50 Ok, anyone have any questions on this? If not we'll move on.
20:19:55 actually general network sounds better
20:20:10 smooge: and sargent router?
20:20:15 Ok
20:20:17 ricky, http://fpaste.org/oPMp/
20:20:20 #topic Favicon.ico
20:20:25 a-k: what's the poop on this?
20:20:59 There are 4 favicons in puppet, but only 2 get sourced, and none of them are actually used by existing html, as far as I can tell so far
20:21:15 a-k: what ticket number was that? I forget?
20:21:26 * a-k checks
20:21:53 .ticket 1669
20:21:54 a-k: #1669 (the old favicon must die.) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/1669
20:22:06 danke.
20:22:10 a-k: and you saw the note about smolts.org?
20:22:37 Yes. Did you want to have no favicon at all on smolts?
20:22:49 We don't have one so just leave it null.
20:22:56 OK
20:23:00 a-k: anything else on that?
20:23:30 a-k: favicon doesn't have to get referenced in our html
20:23:38 I had a couple suggestions in the ticket and Ricky liked one of them. If anybody has other ideas, let me know.
20:23:38 it's automatically looked for by browsers
20:23:59 mdomsch: yes. In document root as favicon.ico.
20:24:09 a-k: alrighty, thanks.
20:24:11 That was one of my suggestions.
20:24:20 #topic mod_evasive
20:24:40 So, some of you have probably seen me make the mod_evasive module a bit mo' betta
20:24:58 This is largely because cvs1 has had some load issues recently
20:25:07 RH's internal search engine has been very aggressive.
20:25:18 and google's crawler doesn't honor Crawl-Delay
20:25:34 stupid bots
20:25:35 you can force google's crawler to crawl more slowly but it only lasts for 90 days.
20:25:39 So
20:25:49 I've decided to set up mod_evasive to be a bit more aggressive about banning people.
20:26:16 Does it have whitelists like denyhosts? ;-)
20:26:20 It's actually pretty hard to predict how or when someone will get banned because the various apache children don't talk to each other.
20:26:22 ricky: it does.
20:26:25 Ah, cool
20:26:48 I still want the content on cvs.fedoraproject.org to be searchable or I'd have banned it altogether.
20:26:56 it might be worth looking at alternatives to viewvc though.
20:27:09 or at least figuring out why it's so freaking slow.
20:27:12 s/viewvc/cvs/ :-D
20:27:13 migrate all projects away from cvs?
20:27:19 cough get rid of cvs cough
20:27:22 not our problem
20:27:33 One side effect of viewvc is that it creates a bunch of junk rcs* files in /tmp
20:27:39 I wasn't able to figure out why though
20:27:56 ouch
20:28:05 It's probably time for someone with time and courage to try starting the SCM SIG back up again.
20:29:02 Anywho, any questions about that?
20:29:42 svn!
20:29:54 Or should that be !svn ?
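[Editor's note: the mod_evasive tightening mmcgrath describes above would be expressed through the standard mod_evasive 2.0 directives. The values below are illustrative, not the thresholds actually deployed on cvs1:

```apache
# Hypothetical mod_evasive tuning -- numbers are examples only.
<IfModule mod_evasive20.c>
    DOSHashTableSize    3097
    DOSPageCount        5       # hits on the same URI per page interval
    DOSPageInterval     1       # page interval, in seconds
    DOSSiteCount        50      # total hits per client per site interval
    DOSSiteInterval     1       # site interval, in seconds
    DOSBlockingPeriod   60      # seconds a blocked client gets 403s
    DOSWhitelist        192.0.2.*   # the denyhosts-style whitelist ricky asked about
</IfModule>
```

The unpredictability mmcgrath mentions follows from the design: each Apache child keeps its own hash table of counters, so a client's requests spread across children may not trip any single child's thresholds.]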
20:29:56 :-)
20:30:04 moving on :)
20:30:07 #topic pgpool
20:30:08 bitkeeper
20:30:21 welp, we tested pgpool in staging and it's been working on db2 for a while with no issues (and no connections)
20:30:33 mdomsch: after the meeting do you want to enable pgpool in production with mirrormanager?
20:30:40 no connections?
20:30:52 It could be good to start off with making the crawlers use pgpool
20:30:57 smooge: it's been deployed but the firewall's been active.
20:31:00 Then the mirrormanager and transifex apps
20:31:02 mmcgrath, if you wish
20:31:12 mmcgrath, I'll have to go to another meeting, so can't be there if it breaks
20:31:19 mdomsch: you won't have to do anything for that but it'd be good to have you around in case the sky falls.
20:31:23 the revert is simple enough and all.
20:31:34 and I've got a minor MM update to push, to fix the can't-create-a-netblock bug
20:31:45 ah, well goodie.
20:31:51 trying to test that on stg
20:32:07 but that's separate; so do your thing
20:32:21 mdomsch: will do.
20:32:24 and remember it won't take effect for all the crawlers for a couple hours
20:32:49 FWIW, I saw a measurable (not major) performance increase in my tests.
20:32:53 I can't explain that
20:32:56 but it was there.
20:33:17 perhaps the logging-in phase of psql is slower than it could be :)
20:33:25 Ok, anyway. With that
20:33:28 #topic Open Floor
20:33:34 Anyone have anything they'd like to discuss?
20:33:38 Can we get a quick overview of how the ipv6 stuff went?
20:33:50 Sure.
20:33:52 Are all the major problems ironed out?
20:33:54 I'll do the quick version
20:34:11 we started, some people couldn't reliably use TCP traffic (and probably other types of traffic)
20:34:22 the iptables rule on the list won't work for us because RHEL5 doesn't support it yet.
20:34:32 but by specifying a lower MTU, I've not heard a single complaint since.
20:34:38 and we're *STILL* waiting on the glue record AFAIK.
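[Editor's note: the IPv6 fix mmcgrath summarizes above amounts to one of two standard approaches; a sketch, where the interface name is an assumption and 1280 is chosen as the IPv6 minimum MTU:

```shell
# Option used here: lower the MTU on the IPv6-facing interface so large
# packets are never emitted (interface name is illustrative; 1280 is
# the IPv6 minimum MTU and therefore a safe floor).
ip link set dev tun6to4 mtu 1280

# The alternative from the mailing list -- clamp TCP MSS to the path
# MTU -- which RHEL5's ip6tables did not yet support:
ip6tables -t mangle -A POSTROUTING -p tcp --tcp-flags SYN,RST SYN \
    -j TCPMSS --clamp-mss-to-pmtu
```

The MTU route has the advantage that fixing it once on the server spares every client from needing the change, which matches the "by doing it on our server, others don't have to" comment that follows.]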
20:34:43 I'm not sure why that is though
20:34:47 Cool - that's something that happened on the person's side and not our side?
20:34:59 mmcgrath, right, no glue yet
20:35:11 ricky: well it could be done in either place actually.
20:35:17 but by doing it on our server, others don't have to do it.
20:35:38 Ah, good
20:35:43 oh and mdomsch did some neat stuff as he mentioned on the list wrt MM and geo ip
20:36:01 mdomsch: I saw you prodding warthog9 again about geoip dns. We starting to think that over again?
20:36:19 mmcgrath, not really; I was just looking for how to implement a better backend for bind
20:36:30 the zonefile for doing BGP lookups takes 1GB RAM
20:36:49 ah
20:36:49 so I didn't do that; I custom-coded it in MM, takes 7MB
20:36:58 wow that's big..
20:37:07 mdomsch: when you figured out how to do it in 7MB did you do a dance?
20:37:07 and then it's small
20:37:08 * mmcgrath would have.
20:37:12 oh yeah
20:37:18 celeste was scared
20:37:22 Heheh
20:37:26 hahah, it's like you've invented fire
20:37:34 well good work on that too
20:37:40 removed all the 0's?
20:37:41 that's not in production yet
20:37:47 but coming along
20:37:49 related to dns and geoip, I'd still like to see that as a TODO sometime.
20:37:51 but not urgent.
20:38:01 ok, on that note
20:38:07 we have an offer for hosting from China Unicom
20:38:14 we're probably getting to the point where we need to re-think our content distribution network.
20:38:15 cool.
20:38:20 they're looking for size estimates for servers
20:38:32 can I go do the buildout?
20:38:33 what do we want to put there, and what server resources do we need?
20:38:59 mmcgrath: wikipedia apparently uses powerdns for geodns
20:39:07 mdomsch: do you think they were thinking about providing servers as well?
20:39:08 smooge, catch Ivory on IRC
20:39:16 Maybe something to take a look at (and it has a bind-style zone file backend)
20:39:23 mmcgrath, yes, I believe so.
2 asks:
20:39:26 ricky: yeah I think those are the big winners right now in that market. pdns or bind + a patch
20:39:28 1) they set up and run a mirror
20:39:36 2) they give us dedicated hosting and Xen guests
20:40:01 So we wouldn't have to worry about the hardware at all with the xen guests?
20:40:04 I'd totally go for that
20:40:11 that's the ask. We'll see.
20:40:11 it's worked out well for us in BU
20:40:14 I think we would like xen dom0 if possible. domUs are nice
20:40:17 yeah
20:40:31 so if anyone sees Ivory on IRC, be nice
20:40:37 heheh
20:40:39 np.
20:40:39 and if anyone speaks Chinese, that would be a big plus!
20:40:45 as I don't
20:40:49 crap. neither do i
20:40:51 yeah, do we have any native chinese speakers in the house?
20:40:55 I can barely order it off a menu
20:41:01 * ricky wishes he could read/write, but nope :-(
20:41:02 his english is pretty good, but he's concerned about it.
20:41:11 mdomsch: I'd know how to ask Ivory to order spicy tofu.
20:41:16 and then tell him it was good :)
20:41:18 his english is probably better than mine
20:41:32 Something like "la dou fu" :-)
20:41:34 mdomsch: I'll get some specs to you about what would be good to have over there.
20:41:52 on another unrelated note
20:41:57 I'll be at LinuxCon all next week
20:42:18 mdomsch, cool
20:42:27 ditto
20:42:41 where is that this year?
20:42:48 Portland
20:43:23 cool. and wet
20:43:48 blackberries are really good right now I have been told
20:44:08 on my note, xen13 is now RHEL-5.4 and should not reboot as often
20:44:22 smooge: cheers!
20:44:23 bastion1 should also really really think it's bastion1
20:44:34 Nice :-)
20:45:03 Did it go pretty smoothly, or were there any bumps along the way?
20:45:28 rhel 5.4 domU went pretty smoothly
20:45:29 xen13 needed /etc/grub hand edited.
20:45:39 Ah, yow
20:45:49 so I had to reboot twice.. as I forgot the first time
20:46:00 and fas1 took 3 xen creates to start up
20:46:11 smooge: what needed to be altered in grub?
20:46:11 and I didn't see why in the logs
20:46:27 it was booting by default to an old kernel.
20:46:31 Strange
20:46:37 Ah, then I take my "yow" back :-)
20:46:38 so when I updated, it moved from booting from 1 to 2
20:46:50 so I moved it to 0
20:46:58 ahh
20:47:19 smooge: /etc/sysconfig/kernel ?
20:47:47 It's set to yes on xen13
20:48:02 ricky: what about DEFAULTKERNEL
20:48:04 Not sure why it wouldn't have gotten updated automatically. Maybe we had manually edited things before?
20:48:05 is it kernel or kernel-xen
20:48:08 That's set to kernel
20:48:15 that should probably get set to kernel-xen
20:48:22 Ahh
20:48:30 Ok, anyone have anything else to discuss?
20:48:40 If not we'll close the meeting in 30
20:48:46 ricky, I think it was manually edited in the past when trying to figure out which kernel worked
20:48:52 nothing else from me.
20:49:08 One more thing
20:49:20 Should we email mirror-list-d about the i2 mirror soon?
20:49:32 what i2 mirror?
20:49:32 ricky: good question
20:49:32 The networking issues there seem to be resolved for the most part
20:49:41 at rdu?
20:49:43 mdomsch: the RDU i2 mirror seems to be up and running well
20:49:44 Yup
20:49:50 ah, good to know
20:49:50 mmcgrath: how's kvm testing going?
20:49:51 ricky: what was the last speed test?
20:49:51 (And same question for sync1/sync2)
20:50:03 abadger1999: oh, good question. I'll get to it in just a sec.
20:50:11 ricky: sync1/2 isn't ready for public consumption yet.
20:50:36 so we don't need static routes for RDU i2 anymore?
20:50:46 mdomsch: that's my impression
20:50:49 ricky: what IP was it at?
20:50:59 download-i2.fedora.redhat.com?
20:51:18 I'm getting excellent speeds from osuosl1 right now
20:51:49 (it showed >140 MB/s in the beginning and eventually returned to >7 MB/s which is still good)
20:51:50 mdomsch: ^^
20:51:58 Oh, and I wanted to talk about cloud stuff too
20:52:01 Ok, real quick.
20:52:13 abadger1999: the kvm stuff is going ok.
We were having horrible IO performance issues on app7
20:52:23 Although starting a download from ibiblio1 at the same time caused the osuosl1 one to drop down to <500 KB/s :-(
20:52:24 after changing some drivers around it's better but still not as fast as some of the app servers.
20:52:27 await has been a big problem.
20:52:32 So it's not very balanced it seems
20:52:45 abadger1999: but research continues :)
20:52:56 the good news is we're in no rush to switch to it so we can make it behave exactly as we want to.
20:53:02 SmootherFrOgZ: you around?
20:53:09 The cloud stuff has been going well, SmootherFrOgZ has been hard at work.
20:53:14 mmcgrath: yep
20:53:14 we keep running into lots of little issues.
20:53:26 a million paper cuts causing us to be unable to use cumulus.fedoraproject.org
20:53:41 SmootherFrOgZ: any luck with the nics?
20:54:17 we're currently debugging them to get them working properly
20:54:36 actually the nics show up with the wrong interface name
20:54:45 SmootherFrOgZ: ok, well good work on it so far, I know this issue in particular has been quite irksome
20:54:59 Ok, anyone have anything else they'd like to discuss?
20:55:03 if not we'll close the meeting in 30.
20:56:40 Alright
20:56:42 #endmeeting
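[Editor's footnote on the xen13 grub discussion: on RHEL5, new-kernel-pkg consults DEFAULTKERNEL in /etc/sysconfig/kernel when deciding which grub entry a newly installed kernel becomes, so a xen dom0 wants "kernel-xen" there. A sketch of the one-line fix proposed in the meeting, demonstrated on a scratch copy rather than the real file:

```shell
# Make a scratch copy of what xen13's /etc/sysconfig/kernel looked like.
cfg=$(mktemp)
printf 'UPDATEDEFAULT=yes\nDEFAULTKERNEL=kernel\n' > "$cfg"

# The proposed fix: point DEFAULTKERNEL at the xen kernel package so
# future updates default to the right grub entry automatically.
sed -i 's/^DEFAULTKERNEL=kernel$/DEFAULTKERNEL=kernel-xen/' "$cfg"

val=$(grep '^DEFAULTKERNEL=' "$cfg")
echo "$val"
rm -f "$cfg"
```

]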