20:01:09 #startmeeting Infrastructure
20:01:09 Meeting started Thu Feb 4 20:01:09 2010 UTC. The chair is mmcgrath. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:01:10 Useful Commands: #action #agreed #halp #info #idea #link #topic.
20:01:11 who's here?
20:01:15 * lmacken
20:01:22 I'm here, but distracted with lunch and another meeting
20:01:22 * jaxjax is here
20:01:28 here
20:01:35 here
20:01:57 * dgilmore is present
20:01:59 * hiemanshu
20:02:00 * skvidal is here
20:02:05 * nirik is around in the cheap seats.
20:02:11 is here
20:02:21 * a-k is here
20:02:23 no one creates meeting tickets anymore so we can skip that :)
20:02:24 * abadger1999 here
20:02:29 #topic /mnt/koji
20:02:35 So I'm just making everyone aware what's going on here.
20:02:42 1) we have a try'n buy from Dell on an equallogic
20:02:54 2) /mnt/koji is 91% full
20:02:57 so we need to get cracking.
20:03:07 the equallogic is in a box in the colo so hopefully it won't take long.
20:03:08 I still have a script running from last night that is cleaning /mnt/koji/mash/updates
20:03:21 lmacken: oh that's good to know, any estimate on how much it'll clean up?
20:03:43 mmcgrath: I'm not quite sure...
20:03:48 no worries.
20:03:50 there are a *ton* of mashes to clean up
20:03:53 hard to say with the hardlinks
20:03:53 i expect little
20:03:59 * ricky is around
20:04:02 since it should be mostly hardlinks
20:04:05 I'm *hoping* to have that thing installed and pingable by early next week.
20:04:08 the plan is going to be this.
20:04:17 once it's up and running I'm going to drop everything I'm doing and try to get it up and going.
20:04:26 dgilmore has promised a significant portion of his time as well.
20:04:30 * dgilmore will be focusing on it also
20:04:45 we're going to be focusing on testing, speed, what works, virtualized, unvirtualized, etc.
20:05:02 the equallogic will be exporting an iscsi interface, it's our job to figure out what to do with it.
20:05:05 I've also promised some of my time to help generate traffic for testing
20:05:12 Oxf13: excellent
20:05:30 mdomsch: I know you enjoy the equallogics, do you have any interest in being involved in this?
20:06:04 mmcgrath, I probably shouldn't, just so you feel it's a fair eval
20:06:16 but if you have questions, hit me and I'll try to help
20:06:25 mdomsch: fair enough :)
20:06:36 Ok, so that's really all I have on that for the moment, any other questions?
20:06:42 * mdomsch wants to bias the decision, but :-)
20:07:04 mdomsch: you just want to make it a 'n buy? :)
20:07:16 Ok, moving on.
20:07:23 #topic PHX2 network issues
20:07:41 so there's been just a lot of strange things at the network layer in PHX2.
20:07:47 our data layer traffic has been fine so that's good.
20:07:55 mmcgrath: scooby doo sees odd things - phx2 is downright haunted
20:08:02 It seems much of that has been fixed at least as of right now.
20:08:03 skvidal: :)
20:08:15 Still, the way things are is just too bad for releng and QA to do their work
20:08:22 here..
20:08:25 so I'm working on setting up alternate sites to grab their snapshots and test info.
20:08:43 one example is on sb1: http://serverbeach1.fedoraproject.org/pub/alt/stage/
20:09:00 * mmcgrath thanks the websites-team and likely ricky for getting that all properly branded.
20:09:13 * ricky passes the thanks onto sijis :-)
20:09:24 The good news is so far this setup hasn't required a change in releng's workflow.
20:09:42 the eh' news is that we haven't fully tested it yet but over time as people use it if it's working we'll keep it and/or add additional sites.
20:09:47 Any questions on that?
20:10:24 alllrighty
20:10:29 #topic Fedora Search Engine
20:10:32 a-k: what's the latest?
20:10:43 The news this week is that I've made Nutch available at
20:10:50 #link http://publictest3.fedoraproject.org/nutch
20:11:06 I intend to see how much both Xapian and Nutch can crawl before they break
20:11:15 With Nutch, I expect the time it takes will just become unacceptable eventually
20:11:24 Nutch takes longer than Xapian to crawl
20:11:36 I still intend to keep looking for/at other candidates, too
20:11:36 a-k: what content are you pointing it at right now?
20:11:39 Does Nutch make any smart decisions about crawling?
20:12:02 I point both at just http://fedoraproject.org
20:12:07 a-k: FWIW, one of the test things I've been doing is searching for "UTC". I've found it's a good way to determine a good engine from a bad one on the wiki
20:12:11 for example:
20:12:12 https://fedoraproject.org/wiki/Special:Search?search=UTC&go=Go
20:12:14 CRAP
20:12:16 mmcgrath: what do you mean by smart?
20:12:20 http://publictest3.fedoraproject.org/nutch/search.jsp?lang=en&query=UTC
20:12:21 not bad
20:12:34 well, nutch found the UTCHowto
20:12:41 instead of all the ones below it.
20:12:53 * mmcgrath just sayin.
20:12:59 cool
20:13:00 It's important not to confuse searching with indexing
20:13:00 a-k: how long are we talking about for crawling with nutch?
20:13:15 a-k: you might also try meetbot.fedoraproject.org and see how it does with irc logs.
20:13:46 Nutch crawled in about 16 hours what Xapian crawled in 8
20:14:03 Neither crawl covers the complete site yet
20:14:17 are there tunables? is this as simple as 'add more processes' ?
20:14:57 Nothing is especially tunable. It might be limited by bandwidth.
20:15:02 yeah
20:15:03 crawler needs more systems badly...
20:15:17 don't shoot the url! ;)
20:15:20 16 hours is a lot but might be acceptable.
20:15:40 Although part of Nutch's problem could be an inherent inefficiency in its Java code
20:15:50 Xapian is compiled C
20:16:12 a-k: what did we get with that 16 hours exactly?
20:16:29 About 44k documents indexed
20:16:50 and Xapian crawled the same thing?
20:17:05 Nutch and Xapian crawl differently
20:17:09 a-k: where was the Xapian url again?
20:17:41 #link http://publictest3.fedoraproject.org/cgi-bin/omega
20:17:56 As always, I keep notes on the wiki page
20:18:02 #link http://fedoraproject.org/wiki/Infrastructure/Search
20:18:23 a-k: You also had the unicode thing you posted in #fedora-admin
20:18:32 Were you able to find a fix for that?
20:19:01 No fixes. Non-Latin characters haven't really been something for which there's a requirement yet.
20:19:07 I thought Nutch was a little funky with non-Latin characters, e.g., переводу, compared to Xapian
20:19:15 But I've found Xapian examples that handle non-Latin just as bizarrely
20:19:28 Neither Xapian nor Nutch claims to handle non-Latin characters
20:20:22 We briefly mentioned non-Latin (non-UTF8) in a previous meeting
20:20:32 Should there be a requirement around it?
20:20:46 mmcgrath: What do you think?
20:20:53 I suspect any requirement would eliminate ALL candidates
20:21:01 We have a lot of non-native English users.
20:21:26 it probably should be a requirement.
20:21:41 ññ
20:21:42 a-k: I'd think most engines have support for it, if not we should contact them and find out why
20:22:15 A requirement as opposed to something we take into consideration when choosing finally?
20:22:22 actually it's really really really slow to do non-ascii at times
20:22:43 a-k: handling all languages should be a requirement
20:23:04 Both seem to handle searching by expanding DBCS into hex
20:23:16 Most of the time it seems to work
20:23:26 Some of the time the results look screwed up
20:24:31 Anyway I don't think I've got much more to add right now
20:24:42 a-k: thanks
20:24:55 We'll move on for now
20:25:00 a-k: try to find out what the language deal is
20:25:02 a-k I remember old search engines had problems where language formats got combined on the same page.
20:25:16 mmcgrath: ok
20:25:18 Anyone have anything else on that?
20:25:40 k
20:25:44 #topic Our 'cloud'
20:25:52 so I'm trying to get our cloud hardware back in order.
20:26:02 I've been rebuilding the environment and getting it prepared for virt_web
20:26:06 yeah
20:26:08 which should be at or near usable at this point.
20:26:10 what can I do to help
20:26:17 oh you already did it
20:26:27 smooge: not sure yet, we have a new volunteer working with me, sheid
20:26:34 and I'm sure SmootherFrOgZ as well.
20:26:38 cool
20:26:38 setting things up initially won't take long
20:26:51 it's getting them working and coming up with a solid maintenance plan that will be the tricky part.
20:27:04 mmcgrath: what base are we using?
20:27:14 dgilmore: RHEL
20:27:17 and xen at first
20:27:23 mmcgrath: ok
20:27:35 though the conversion to kvm should be quick
20:27:41 did we sort out the libvirt-qpid memory leaks?
20:27:50 dgilmore: nope, I've got a ticket submitted upstream
20:27:56 mmcgrath: any reason not to start with kvm?
20:27:58 * mmcgrath is hoping to find some C coders to submit patches for me.
20:28:04 dgilmore: not really
20:28:21 * dgilmore set up new box in colo with centos 5.4 and kvm
20:28:25 it's working great
20:28:47 i use kvm with my rhel 5.4 box and it works much better than xen ever did
20:28:48 the memory leak *might* be limited only to libvirt-qpid installs that can't contact the broker.
20:29:13 jokajak: that's weird, we've had generally the opposite experience. Performance has either been terrible or as good as xen but never better.
20:29:24 i never got the deps sorted out to get libvirt-qpid running on my new box
20:29:27 i had stability problems with xen
20:29:42 mmcgrath: I think it depends on what you install into the vm, and whether or not virtio is used
20:29:49 dgilmore: yeah, I need to come up with a long term plan for that too.
20:29:51 without virtio, kvm is going to be slower than paravirt xen
20:30:00 Oxf13: yeah for us most issues were cleared with different drivers
20:30:10 I think we have most of it figured out now, our app7 is kvm
20:30:16 kvm works great here, but I am using fedora hosts. ;)
20:30:23 nirik: yeah
20:30:28 so anyone have any questions on this for now?
20:30:48 the most recent rhel kernel fixed some clock issues i was having
20:31:06 k
20:31:10 #topic Hosted automation
20:31:14 jaxjax: you want to talk about this?
20:31:30 yep
20:32:03 I'm currently in the process of installing a full environment on a kvm virtual machine
20:32:39 testing on my desktop was a bit crap and I expect to have it ready by end of this week so I can test properly
20:32:48 some questions about fas integration
20:32:58 sure, what's up?
20:33:39 Can I work in the automatic creation of groups when required?
20:33:52 or would we have to do it manually?
20:33:56 jaxjax: yeah, and it'll be required almost every time.
20:34:00 ricky: you still around?
20:34:03 Yup
20:34:27 ricky: would you be interested in writing a CLI based fas client that creates groups?
20:34:31 I don't think we have write methods exposed in FAS yet, so that will require FAS extra
20:34:35 **extra FAS support
20:34:53 once you're logged in couldn't you just post?
20:35:24 Well... I guess you can use the normal form and skip past having a JSON function for it
20:35:36 You will probably just have hacky error handling in that case.
20:35:50 Ricky: Do you mind if I contact you tomorrow or Sat for this?
20:36:01 ricky: well, should we focus on getting SA0.5 out the door so we can continue working on stuff like that?
20:36:20 Yes
20:36:45 jaxjax: Sure, either of those is fine
20:36:53 thx, will do.
20:37:01 k
20:37:10 we'll have to meet up and figure out exactly what is still busted
20:37:33 There's currently a privacy branch in the git repo
20:37:53 (privacy filtering is the current main broken thing)
20:38:34 There's basically one design decision I'd like to make before we can refactor all privacy stuff :-)
20:38:47 ricky: is that something you can work on in the coming week?
20:39:18 Yeah, I'll get started on that this weekend
20:39:26 ricky: excellent, happy to hear it
20:39:30 anyone have anything else on this topic?
20:39:52 k, we'll move on
20:39:53 jaxjax: thanks
20:39:58 #topic Patch Wed.
20:40:01 smooge: want to take this one?
20:40:02 Haha
20:40:13 * sijis is here late. sorry
20:40:24 yes
20:40:46 Ok I would like to make every second Wednesday of the month patch day
20:41:16 we would run yum update on the systems and reboot as needed
20:41:33 which lately has been, we will be rebooting every 2nd wednesday of the month
20:41:47 smooge: do you want to alter when our yum nag mail gets sent to us?
20:41:54 right now I think it's on the first day of the month
20:42:12 yes. I will change it to the first weekend of the month
20:42:19 close enough for government work
20:42:32 in the case of emergency security items, we will patch as needed
20:42:43 yeah
20:42:47 * mmcgrath is fine with that
20:42:50 anyone have any issues there?
20:43:02 usually systems will need to be rebooted per xen/kvm server
20:43:06 smooge: It'd be good to get this in an SOP
20:43:13 now that we're getting some actual structure around it.
20:43:14 yes
20:43:43 I have two in mind
20:43:56 update strategy, server layout strategy
20:44:12 Just curious, is this roughly the way big companies, etc. do updates?
20:44:13 making sure we have services on different boxes so we don't screw up things too much
20:44:21 it depends
20:44:42 some big companies will do them at something like 2am every saturday morning
20:44:55 some big companies will do them once a month
20:45:10 and some will rely on their sub-parts to do it appropriately (eg never)
20:45:12 But nothing like "reboot the db server automatically once a month," right?
20:45:23 nop
20:45:26 depends on the db server
20:45:36 if it has a memory leak then yes
20:45:43 heh
20:45:45 you don't do the updates for all servers at the same time
20:45:46 Hahaa
20:45:47 why not use something like spacewalk to better manage updates?
20:45:57 jaxjax: because then we'd be using spacewalk?
20:46:09 doesn't that still require oracle anyway?
20:46:17 we might when its postgres support is ready
20:46:23 yes it does
20:47:03 jokajak, it is a good idea. we are just having to wait for things we have little knowledge of to help with
20:47:20 smooge: got anything else on that?
20:47:28 how does spacewalk help?
20:47:32 jaxjax, yeah you usually schedule the servers into classes and do them per 'class' so that services stay up
20:47:38 sorry was at phone
20:47:52 skvidal, knowledge of what boxes are in what state.
20:47:59 skvidal: it makes it easy to track what servers need updates, send the 'do the update' requirement and see how it went afterward.
20:48:04 smooge: and massive infrastructure to do that
20:48:07 yes normally what you want is to avoid downtime because some patches make the system not work properly
20:48:10 I have to say that aspect of satellite did appeal
20:48:10 Is it necessary to reboot the xen machines as often as the other ones?
20:48:18 skvidal: yeah it does have a cost
20:48:23 mmcgrath: a huge cost
20:48:23 ricky: they keep releasing kernel updates.
20:48:28 They don't seem to touch as much user data, so it's nice to avoid rebooting them if we can :-)
20:48:31 I wouldn't call it a huge cost
20:48:33 mmcgrath: and for more or less 'yum list updates' that's a lot of crap to sift
20:48:42 it's pretty minimal compared to some of the beasts I have had to deal with
20:48:44 Ah, I was thinking about the value of security updates on those vs. on proxies, etc.
20:48:45 smooge: you have to run an entire infrastructure and communications mechanism
20:48:51 * nirik notes some of the kernel updates lately don't pertain to all machines.
20:48:52 skvidal: updating all of our hosts monthly has become expensive though too.
20:49:08 ie, driver fixes where the machine doesn't use that driver at all.
20:49:10 mmcgrath: how would spacewalk help that, then?
20:49:21 it's just a couple of clicks and it'll go do the rest.
20:49:23 mmcgrath: I'm not arguing against patch wednesday
20:49:31 I'm arguing against spacewalk being the answer
20:49:40 yeah I'm not so sold on spacewalk either
20:49:45 but the way we do updates now is pretty expensive.
20:49:48 skvidal, I didn't say it was the answer. I said it "might" be the answer
20:50:07 smooge: let's talk about other solutions
20:50:21 when the time comes it will be evaluated against what other frankenstein we can come up with to do it better
20:50:27 skvidal: FWIW, no one's actually said "we should use spacewalk"
20:50:41 jaxjax just asked why we don't and we told him :)
20:50:43 I am not against frankensteins.. It's the Unix way
20:50:52 smooge: I'm not talking about frankensteins, either
20:50:59 oh I am.
20:51:03 I see.
20:51:05 s/jaxjax/jokajak ;-)
20:51:08 I'm talking about using the tools we have
20:51:17 oh jokajak
20:51:23 jokajak: jaxjax: wait, you two aren't the same person?
20:51:28 * mmcgrath only just realized that
20:51:28 smooge: do you have a rough set of requirements?
20:51:34 I kept thinking jaxjax was changing his nick to jokajak :)
20:51:34 :D
20:51:43 not at all
20:51:51 skvidal, yes.. and when you assemble them together they become a frankenstein of parts. talk off channel after meeting
20:51:56 smooge: ok
20:52:15 Ok, anyone have anything else on that? if not we'll open the floor
20:52:52 alrighty
20:52:54 #topic Open Floor
20:52:59 anyone have anything else they'd like to discuss?
20:53:04 any new people around that want to say hello?
20:53:25 I think OpenID might (finally) be ready for some testing
20:53:37 hello, i'm new ;)
20:53:39 jpwdsm: oh that's excellent news.
20:53:52 jpwdsm: how far away from it being packaged and whatnot
20:53:59 I can log into StackOverflow and LiveJournal with it, but that's all I've done
20:54:01 jpwdsm: is it directly tied to FAS or is it its own product?
20:54:10 jpwdsm: test opensource.com
20:54:10 mmcgrath: own product
20:54:10 Nice :-) What publictest are you on again?
20:54:16 mmcgrath: will do
20:54:25 ricky: pt6.fp.o/id
20:54:33 For what it's worth, I haven't had luck with opensource.com and google or livejournal's openid :-(
20:54:37 sheid: welcome
20:54:51 Good to hear - I look forward to dropping openid out of FAS :-)
20:55:00 mmcgrath: I haven't done much packaging, so I'll probably need some help with that
20:55:29 ricky: It uses FasProxyClient, but that's it :)
20:55:37 abadger1999 is our python/packaging guru, and we're all around if you have any questions on it
20:56:02 We'll also want to ask abadger1999 and lmacken about using the FAS identity provider (and if the TG2 one works with pylons)
20:56:44 (disclaimer if you're not aware - this is written in pylons, which is kind of a subset of TG2 I guess)
20:56:57 yeah
20:57:12 Ok, anyone have anything else they'd like to discuss? If not we can close the meeting.
20:58:16 I'd like to point out something for no frozen rawhide
20:58:32 Oxf13: have at it
20:58:36 oh, Infra meeting?
20:58:38 my initial tests of doing two composes on two machines at once were favorable. there was not a significant increase in the amount of time necessary to compose
20:58:39 Oxf13: sure
20:58:43 G: hey
20:58:45 damn, I was awake for it too
20:58:48 heheh
20:58:49 Oxf13: :) nice
20:58:54 this combined with lmacken's testing of bodhi means I think we can move forward with no frozen rawhide
20:58:56 Have koji01.stg and releng01.stg been good for you and lmacken's testing?
20:59:01 which means we will be stressing things more in the near future
20:59:08 and it's going to cause a lot of confusion amongst the masses
20:59:13 (and cvs01.stg)
20:59:18 Oxf13: that should be fine and we should have more hardware for you
20:59:19 The good news guys, when I'm back in NZ I'll be able to attend them more often
21:00:29 ricky: it was for luke. I wasn't using .stg for my testing
21:00:34 G: excellent :)
21:00:44 ricky: I will be using .stg for dist-git testing soon, but that will require modifications to koji.stg
21:00:48 Oxf13: Ok, well I'm glad that's working out for you
21:01:28 Cool
21:02:00 ok, if that's it I'll close the meeting
21:02:02 #endmeeting
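
Note on the "Patch Wed." topic: the proposal in the log is to patch on the second Wednesday of each month. A minimal sketch of how that date could be computed, e.g. for a cron wrapper or a reminder mail, is below. The function names `second_wednesday` and `is_patch_day` are hypothetical and not part of any Fedora Infrastructure tooling; only the Python standard library is assumed.

```python
import datetime


def second_wednesday(year: int, month: int) -> datetime.date:
    """Return the date of the second Wednesday of the given month."""
    first = datetime.date(year, month, 1)
    # date.weekday(): Monday is 0, so Wednesday is 2.
    days_until_first_wed = (2 - first.weekday()) % 7
    # One more week past the first Wednesday gives the second one.
    return first + datetime.timedelta(days=days_until_first_wed + 7)


def is_patch_day(day: datetime.date) -> bool:
    """True if `day` is the second Wednesday of its month (hypothetical helper)."""
    return day == second_wednesday(day.year, day.month)


if __name__ == "__main__":
    # The meeting was held on Thu Feb 4 2010, so the next patch day would be:
    print(second_wednesday(2010, 2))  # 2010-02-10
```

A job that runs every Wednesday could call `is_patch_day(datetime.date.today())` and exit early on the other weeks; emergency security updates would still be handled out of band, as agreed in the meeting.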