20:00:42 #startmeeting Infrastructure
20:00:42 Meeting started Thu May 27 20:00:42 2010 UTC. The chair is mmcgrath. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:42 Useful Commands: #action #agreed #halp #info #idea #link #topic.
20:00:45 #topic Who's here?
20:00:48 * ricky
20:00:51 * nirik is lurking around.
20:00:52 here
20:01:15 * abadger1999 here
20:01:33 * sgallagh lurking
20:02:32 Ok, lets get started
20:02:37 #topic Infrastructure Fedora 13 release
20:03:07 #link https://fedorahosted.org/fedora-infrastructure/report/9
20:03:13 lets go through and make sure these are all closed
20:03:17 .ticket 2137
20:03:18 mmcgrath: #2137 (New website) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2137
20:03:23 ricky: mind closing that one? sijis has it currently
20:03:43 Closed
20:03:56 .ticket 2138
20:03:57 mmcgrath: #2138 (Verify Mirror Space) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2138
20:04:00 smooge: mind closing that one?
20:04:02 .ticket 2139
20:04:03 mmcgrath: #2139 (Release day ticket.) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2139
20:04:07 this one's mine, closing
20:04:08 done
20:04:24 .ticket 2141
20:04:25 mmcgrath: #2141 (Mirrormanager redirects) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2141
20:04:25 here
20:04:27 mdomsch: that one all done?
20:04:30 * mmcgrath assumes so
20:04:38 .ticket 2146
20:04:39 mmcgrath: #2146 (Enable wiki caching.) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2146
20:04:41 .ticket 2147
20:04:43 mmcgrath: #2147 (Disable wiki caching) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2147
20:04:46 smooge: those two done?
20:04:50 * mmcgrath assumes so :)
20:04:53 .ticket 2166
20:04:54 mmcgrath: #2166 (mirrors.fp.o/releases.txt) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2166
20:04:56 yep. and should be closed
20:04:59 that one's done, closing now
20:05:17 And that leave just one more thing
20:05:20 .ticket 2145
20:05:21 mmcgrath: #2145 (Lessons Learned) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2145
20:05:37 sorry i'm late
20:05:42 i'm split between this and $dayjob
20:05:44 so all in all I think this release went well, we did launch pretty close to the target time.
20:05:47 jokajak: no worries.
20:05:55 ricky: how'd things go from your view with the website?
20:06:18 Good, had to stay up the night before fixing up some docs stuff
20:06:31 yeah, I think docs was the one wildcard that wasn't great.
20:06:32 Went smoothly on release day though, didn't delay release with the build for once :-)
20:06:40 but it didn't seem to impact users that much so not a big deal.
20:06:51 One thing we did see this time around that we had not seen in previous releases is http code 103.
20:06:59 mmcgrath, yes
20:07:03 what is that you ask? It's not well defined but related to apache and caching.
20:07:13 It's not clear to me what the users were seeing when issued a 103
20:07:22 I do know it generally happened with larger(ish) content like images.
20:07:30 I couldn't recreate it.
20:07:33 Images on fp.o?
20:07:38 ricky: yeah.
20:07:39 Or do you mean ISO images?
20:07:42 pictures
20:07:44 Yow, didn't know about that
20:07:55 it was reasonably rare compared to the number of served requests.
20:08:24 based on what I saw on the boxes, and from googling around my theory was that these were images in the process of loading that users either clicked the "stop" button killing the transfer.
20:08:35 or more likely, found the link they wanted before the whole page loaded.
20:08:48 Ahh
20:09:05 but not being able to completely recreate it, it's hard to say.
20:09:44 anyone have any other questions or comments on this, the release, or anything?
20:09:55 When I investigated a bunch of 103s happen for a while at work; that's what caused it. Recreating it is more reliable when you have an extremely short trip between a testing script and the server.
20:10:10 gholms: good to know.
20:10:23 I suspect it also helps if the server's under a lot of load and is slower than normal to load those things.
20:10:50 I would try doing some testing from a machine in the datacenter if you can; it might be more reliably reproduced.
20:10:59
20:11:19 Ok.
20:11:21 so next topic
20:11:23 * ricky will write a test script in a bit
20:11:26 #topic CDN
20:11:39 Thanks to nb I think our dnssec issues are re-fixed.
20:11:48 I think the next step is going to be geodns implementation.
20:11:52 which we've been testing on publictest8.
20:12:19 The CDN though is going to be a much more detailed project requiring quite a bit of our time to get in place and maintain.
20:12:22 but I think it will be worth it.
20:12:30 our end users should see much better performance then previously.
20:13:11 Anyone have any questions on this?
20:13:16 concerns?
20:13:18 want to help out?
20:13:30 one sec
20:13:48 CDN?
20:13:51 Any idea what kind of technologies this will involve other than geodns?
20:13:54 content distribution network.
20:13:59 ricky: everything we have now
20:14:06 geodns was the last bit that'd make it worth while.
20:14:16 the work though is going to be making sure our caching layer is functioning properly.
20:14:21 so an opensource akamai?
20:14:22 Ah, cool, I'd definitely be happy to help out
20:14:24 smooge: yeah
20:14:28 the big thing is metrics.
20:14:45 for example, we may want to look closely at serving static content from the proxy servers directly
20:14:56 like pkgdb's images and css.
20:15:15 we've had some issues in the past wrt caching our admin.fp.o content when someone is logged in.
20:15:35 I'd be happy to help out as well, just tell me what I'm doing
20:16:30 CodeBlock: k
20:16:37 does anyone have any experience actually setting these up?
20:17:05 not for production use :-)
20:17:15 :)
20:17:20 well this will be an adventure for all of us.
20:17:30 k, moving on if no one has anything else.
20:17:42 mmcgrath, no my experience has been more in breaking them
20:17:48 I think we've fixed the caching/logged in problem
20:17:55 smooge: :-D we'll need some of that too.
20:18:05 (By setting no cookies on the particular content)
20:18:13 abadger1999: k, I may work with you on that to verify. Because when that happens... that is some scary crap.
20:18:19 :)
20:18:20
20:18:26 we have a decent staging environment now too so that should help.
20:18:32 For those that don't know what I'm talking about....
20:18:59 when we first enabled caching on admin.fedoraproject.org. Login cookies were getting cached. So if toshio logged in before me. Then I tried to log in. It's possible i'd find myself logged in as toshio.
20:19:14 Boy was that a fun day.
20:19:22 Hehe
20:19:24 ok, anyone have any other questions or comments on that?
20:19:26 I wanna be toshio!
20:19:40 ok, that transitions into
20:19:45 #topic Internetx -- new sponsor
20:19:59 I'm happy to say we have a new machine in the EU (hydh hooked us up)
20:20:05 so are they all ipv6 all the time?
20:20:07 it has ipv6, good connection.
20:20:15 if so then it's nice to see puppet (and func) both work sanely
20:20:25 they have both ipv6 and ipv4.
20:20:37 native ipv6
20:20:37 skvidal, I tried to be toshio but my feet grew 3 feet
20:20:44 and 1gbit uplink
20:20:45 I'm thinking at a minimum we're going to need to move noc2 out there.
20:20:54 Yeees!
20:20:54 yes
20:20:55 because with 2 ipv6 connections, it's time to start actually monitoring them.
20:21:06 and I'm honestly not sure if nagios supports it.
20:21:10 though I'd imagine it does.
20:21:35 nagios 3 does.. [isn't that the default line for anything that nagios currently doesnt?]
20:21:41 hahahahah
20:21:46 does nagios bake cookies?
20:21:48 nagios 3 does...
20:21:55 mmmmmm coookies
20:22:13 * gholms has a nagios plugin alerting him to lunch time as it approaches
20:22:14 mmcgrath: if you combine nagios 3 with butrfs it cures cancer!
20:22:19 Ok, so we'll be getting that brought online soon.
20:22:19 mmcgrath: InterNetX from Germany is sponsoring Fedora?
20:22:21 smooge: Ponies!
20:22:31 rsc: yup
20:22:43 rsc: so we can finally close some of your "X is slow to load" tickets :)
20:22:46 ricky: how's proxy2 going btw?
20:22:57 So far so good, done with puppet, getting it on func
20:23:08 mmcgrath: that would be great. Because InterNetX has powerful infrastructure and even well IPv6 connectivity :)
20:23:11 ricky: ever figure out what was causing that error?
20:23:21 Nope, but I'm definitely not done looking :-)
20:23:27 rsc: it's a done deal. It's already handed over to us, we're just still in the process of building it :)
20:23:32 ponies and cookies. can my day get any better
20:23:34 ricky: did you just end up using my.. eh hem... hack?
20:23:41 Yeah
20:23:51 Shocking how fast it still was..
20:25:03 yeah
20:25:07 anyone have anything else on this?
20:25:22 not me
20:25:32 ok. next one
20:25:34 #topic koji backups
20:25:48 so I've been working to move koji backups from the tape drives to dedicated storage.
20:25:57 in this case 6U of storage including a 2U server and 2 2U disk trays.
20:25:57 ricky: getting it on func? doesn't puppet do that automagically now?
20:26:04 ricky: it's the same cert
20:26:28 skvidal: Thank you - you just saved me the time I was going spend debugging nothing :-)
20:26:31 It's been backing up for about a week and a half now.
20:26:46 ricky: if you login to puppet1
20:26:48 you can run
20:26:57 is it because of disk speed OR just churn of whats in koji?
20:27:01 sudo func 'proxy02*' call test ping
20:27:09 ricky: which will tell you if it works
20:27:14 smooge: well it doesn't seem to be the disks on the backup server. Must just be the /mnt/koji speed.
20:27:18 damn it
20:27:24 I need to write up the docs on using func in FI
20:27:26 * skvidal makes a note
20:27:44 skvidal, the sticky note from last week fell off?
20:27:47 anyone have any questions or comments on that?
20:28:21 k
20:28:23 next topic
20:28:26 #topic /mnt/koji
20:28:31 from backups, onto the real deal.
20:28:33 smooge: be nice
20:28:36 :)
20:28:38 dgilmore: when do you want to move /mnt/koji over?
20:28:38 mmcgrath: it took me over a week to rsync /mnt/koji/packages onto the equalogix
20:28:47 I assume we're going to want to wait until the other equalogicx is on site?
20:28:52 mdomsch: you know we own two of those now right?
20:28:55 mmcgrath: need to do a full sync again
20:29:05 mmcgrath: and i want to test my rm -rf
20:29:17 mmcgrath, no, that's great!
20:29:17 that will free up 1.2T or so
20:29:29 mdomsch: they're not both on site yet but they will be.
20:29:32 mmcgrath: and id kinda like to wait till we get the second unit in place
20:29:34 dgilmore: I have some port concerns.
20:29:40 * CodeBlock needs to head out for a while, sorry. back later
20:29:42 dgilmore: makes me wonder if we can use crossover cables.
20:29:45 CodeBlock: no worries.
20:29:52 mmcgrath: how we are using it we will only use a single port ever
20:30:11 mmcgrath: it doesnt do port bonding
20:30:16 dgilmore: k, we may need to communicate that to the network team because last time I was there I swear we had like 6 ports plugged in.
20:30:25 mmcgrath: with a single client we are really only using a single port
20:30:27 oh that's right, or it does do bonding, just not the type our network team can work with.
20:30:30 yeah.
20:30:40 it doesnt do bonding period
20:31:02 dgilmore: it'd be nice to have the two units talking to eachother over a dedicated link though
20:31:06 its designed to balnace client load by sending different clients to different ports
20:31:12 or is that not how it's to be setup?
20:31:27 mmcgrath: not really how its designed
20:31:43 k, well when the time comes I'll leave it to you.
20:32:00 I would like to have this all up and running asap. only because it takes so long to get going, I'd hate for this to bump up into the alpha.
20:32:09 our useage is really outside of there normal use case
20:32:10 dgilmore: do you know when the equalogic will ship?
20:32:18 mmcgrath: let me ask my boss
20:32:20 when will we get it out there? and do you need me to be there physically?
20:33:08 smooge: someone will need to rack it
20:33:14 smooge: I don't think we will, jonathan setup the last one. I suspect he will this time too.
20:33:18 and hook up serial port
20:33:32 okit dokit
20:33:35 but i hope johnathan can do it
20:34:27 k, any other questions on that?
20:34:31 with the second unit i want to do raid10 over the 32 1tb drives in the 2 units
20:34:32 if not we'll move on.
20:34:41 but im done now
20:35:07 alrighty
20:35:08 with that
20:35:11 #topic open floor
20:35:15 anyone have anything they'd like to discuss?
20:35:34 mmcgrath, did you ever #endmeeting the meetbot in -admin ?
20:35:43 * ricky checked this morning and it was gone
20:35:49 ok
20:35:59 mdomsch: I did :)
20:36:13 towards the end of the day after traffic was nearing back down to normal I ended it.
20:36:21 I should have sent the logs to the list, but they are available
20:36:26 np
20:36:35 it wasn't a very exciting day - just like I like it
20:36:52 either our processes have gotten so good that release day is a non-event
20:36:56 or our traffic was down, or both
20:37:10 no doubt.
20:37:17 * mmcgrath thinks it was a little of both.
20:37:18 thanks to Oxf13 for getting the bits to the mirrors so early
20:37:24 but if things aren't broke, we're doing what we can :)
20:37:35 yeah, that helps too I bet quite a bit
20:37:37 I didn't hear anyone complain about not getting the bits; I did hear about slow torrents
20:37:45 when people show up from the release and can't get to it, I suspect there's a lot more re-loading on our servers.
20:37:48 but that's also because they aren't advertised so much anymore
20:37:59 mdomsch: I had good luck with the torrents but I wasn't paying attention to all of them.
20:38:29 is so used to having torrents blocked that I forgot to do anything with them
20:38:38 :)
20:38:47 ok, anyone have anything else to discuss?
20:38:50 not me
20:38:55 all quiet here
20:39:03 going to just shoot the telemarketers who keep calling
20:39:04 alrighty, we'll close in 30
20:39:26 smooge: ive had a bunch of them recently
20:39:48 :)
20:39:49 ok
20:39:50 election people wanting me to vote for someone.. but I unregistered from parties last year so they are wasting their time
20:39:50 #endmeeting