18:01:33 #startmeeting Infrastructure (2012-10-04)
18:01:33 Meeting started Thu Oct 4 18:01:33 2012 UTC. The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:01:33 Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:01:34 #meetingname infrastructure
18:01:34 #topic Aloha!
18:01:34 #chair smooge skvidal CodeBlock ricky nirik abadger1999 lmacken dgilmore mdomsch threebean
18:01:34 The meeting name has been set to 'infrastructure'
18:01:34 Current chairs: CodeBlock abadger1999 dgilmore lmacken mdomsch nirik ricky skvidal smooge threebean
18:01:43 * skvidal is here
18:01:44 here
18:01:58 hello
18:02:05 * nirik will wait a few for folks to wander in.
18:02:07 * lmacken
18:02:10 * jds2001
18:02:18 hey
18:02:53 it was good to work with CodeBlock at OLF
18:02:56 * threebean
18:03:09 /me is here
18:03:20 * akshaysth is here
18:04:15 * pingou
18:04:19 ok, lets go ahead and dive on in...
18:04:22 #topic New folks introductions and Apprentice tasks
18:04:33 well may I?
18:04:35 any new folks want to introduce themselves, or apprentices have questions or comments?
18:04:40 miguelcnf: go ahead. ;)
18:05:54 * _love_hurts_ late here
18:06:03 no worries.
18:06:39 all right... so the name is Miguel and I'm from Portugal. As I wrote to the list I've recently been working with l10n pt team and now I'm looking to help out the infrastructure team. I'm a system engineer and work with red hat/centos on a daily basis. It would be great if someone could invite me to the fi-apprentice group so I could start poking around and hopefully get some easyfix work. And I think thats pretty much it... Hi!
18:06:47 * abadger1999 here
18:07:06 miguelcnf: welcome! :)
18:07:09 miguelcnf: welcome! I can add you to that group after the meeting... just see me in #fedora-admin
18:07:22 cool thanks guys
18:07:44 excellent.
18:07:48 not much of an update here but am still working on ticket 3293
18:07:51 .ticket 3293
18:07:55 akshaysth: #3293 ([easyfix] add staging monitoring script) – Fedora Infrastructure - https://fedorahosted.org/fedora-infrastructure/ticket/3293
18:08:02 cool.
18:08:19 still trying to figure out how to get the diffs to show correctly when being emailed
18:08:22 Hopefully most of our staging changes will be gone before we go into the next freeze, but there will be more piling up I'm sure.
18:08:30 * nirik nods.
18:08:46 ok, any other apprentice questions or new folks?
18:09:08 ok, moving on...
18:09:11 #topic Applications Maint/Development status / discussion
18:09:37 any new application devel / maint news this week?
18:10:00 working on datanommer in stg now. should be good to go by the end of the day.
18:10:19 threebean: and hopefully we land that in prod before next tuesday?
18:10:35 I got a running version of a) fedocal b) election
18:10:40 I'd really like to. If it's not possible, in the short time, that's cool.
18:10:52 * lmacken has been doing a lot of fedmsg hacking this week, mostly client side stuff though
18:11:01 well, happy to help try and do so... we can see. :)
18:11:01 pingou: also got some jenkins nodes running in the euca cloud.
18:11:02 lmacken++
18:11:08 https://209.132.184.101/election - http://209.132.184.101/fedocal
18:11:21 oh yeah, the new jenkins slaves are awesome. http://jenkins.turbogears.org
18:11:24 err -- "pingou also got"
18:11:33 :-)
18:11:41 what is fedocal?
18:11:42 we have an EL6 and an F17 node running jenkins
18:11:47 still have to hook a bunch of projects into them, but it'll let us test everything across py2.4 - 3.x
18:11:49 smooge: fedora calendar :)
18:11:50 pingou: looks really clean
18:12:07 there was talk about a python build instance a while back... would that be similar to this jenkins thing?
18:12:13 pingou, thanks
18:12:14 pingou: I have a new el6 img I'd like to test - but if what you have is working we can just leave it alone :)
18:12:15 nirik: yes it would
18:12:16 wrt jenkins the question is, do we want to run our own master in the longer term
18:12:35 right now the master is run by me on a RH box at RIT
18:12:44 jenkins is some CI thing, right?
18:12:51 * nirik would like a sop on jenkins setup... and playbooks to do it, so we could easily redo it.
18:12:53 nirik: Talk to dmalcolm about it -- I'm pretty sure it's different software so it might require more than just ssh access on the build-nodes.
18:12:54 skvidal: it's not really working, it complains about the small space available on /
18:13:06 jds2001: yes
18:13:15 I think a master would be good if we want this to be something we do long term.
18:13:18 if we want to give it a proper host, I'm totally down for that. It's all dead simple for me to keep this one up (as long as the power on computer science house stays on)
18:13:21 .ticket 1717
18:13:23 nirik: #1717 (buildbot for upstream python code) – Fedora Infrastructure - https://fedorahosted.org/fedora-infrastructure/ticket/1717
18:13:34 pingou: okay - then we'll talk after the meeting
18:13:54 nirik: I have a "sop" for the nodes, it's dead simple
18:13:59 Having a third-party master with a password to login to our boxes doesn't seem quite right for something we want to keep tight control of.
18:14:06 nirik: for the master I'd need to do more testing
18:14:34 * nirik thinks it would be good to run the master too if we can... because we might want to script this for other projects too, or reuse it for them or the like.
18:14:42 I'm thinking, master in a vm "bare-metal" and nodes on the clouds as we want/need
18:14:52 skvidal: was offering to use ansible to deploy them
18:14:56 if we could get fas login to jenkins that would be nice
18:15:07 lmacken: euh.. :D
18:15:30 pingou: we can write a playbook very easily if you have the steps you took to install it recorded somewhere
18:15:32 abadger1999: note that we can use ssh keys, it's just that I don't have a login on the master box
18:15:35 pingou: can you send me your notes?
18:15:51 skvidal: http://www.fpaste.org/siAm/
18:15:54 It would be both exciting and... somewhat scary to support this for more people, I think.
18:16:21 yeah, we don't want to bite off too much, but I think it would be perhaps a nice feature/thing.
18:16:23 skvidal: most of the things are due to FedoraReview which requires quite some packages to be installed (mock, rpmlint...)
18:16:23 abadger1999: do we want to stick all of it on instances in the cloudlets and just reroll it all the time?
18:16:57 skvidal: definitely for the build nodes. For the master we might want to be more formal about it.
18:16:57 threebean: thanks btw :)
18:17:06 skvidal: I think pingou had some ideas.
18:17:11 :)
18:17:15 abadger1999: cool.
18:17:33 abadger1999, pingou: let me know what I can do to facilitate the systems creation/[re]deployment
18:17:46 skvidal: ansible for the nodes sounds really nice
18:17:48 despite what you've heard I can be helpful at times ;)
18:17:48 I don't know that I would want to run master nodes in our regular internal net unless we are sure they are safe to do so... also is everything packaged? I think no?
18:17:53 for the master, I need to set one up
18:18:10 nirik: indeed, it is not
18:18:26 nirik: not a problem for the nodes (jenkins adds its stuff on the nodes) but it will be for the master
18:18:32 nirik: jenkins isn't packaged -- on the jenkins.tg.o master, I think we started with the upstream rpm and then have been using the application's update feature since then
18:18:33 right
18:18:53 abadger1999: jenkins.tg.o == java -jar
18:18:54 we would definitely want backup of the master
18:19:03 nirik: upstream provides an rpm and a repo
18:19:06 lmacken: k
18:19:29 must be the jenkins server I experimented with at home that started with the upstream rpm.
18:19:35 lmacken: which is what the init script in the rpm does ;)
18:20:02 pingou: oh, nice :)
18:20:28 so, I'd say lets keep discussing things and see what all we need and where would be good to have it... but not advertize or promise anything now. ;)
18:20:32 lmacken: the rpm is just a big .jar and an init script, files are then extracted the first time you load it
18:20:37 I setup that instance at pycon like 3 or 4 years ago, and really haven't had to mess with it much since.
18:20:40 perhaps a thread on the mailing list on it would be good for some general discussion
18:20:46 I do the auto upgrades every few weeks, and it's been very smooth
18:20:48 lmacken, pingou: Can we prevent the master from running any build jobs on itself?
18:20:54 nirik: imho, this is/should be infra restricted
18:21:00 that would make it safer to keep the master around.
18:21:01 abadger1999: yeah, I think so. and go purely with slaves
18:21:05 Cool.
18:21:31 pingou: to start with for sure.
18:22:08 so, lets explore some more and discuss...
18:22:17 abadger1999: seems yes, not true for the nodes though (for obvious reason)
18:23:20 any other applications news this week? hows the fas and pkgdb releases looking? ;)
18:23:41 * pingou wouldn't mind feedback on fedocal
18:23:55 note: I put sysadmin-web as admin of fedocal
18:24:02 pingou: I've been meaning to play with it, but haven't gotten the time.
18:24:05 I'm pretty sure I can get pkgdb out before freeze.
18:24:13 nirik: ok thanks :)
18:24:17 what is the url?
18:24:30 smooge: https://209.132.184.101//fedocal
18:24:32 abadger1999: that would be good.
18:24:46 there are some changes to production that I'm trying to track down and make sure they are applied in the repo first.
18:25:15 that calendar is too sethie for this world :)
18:25:22 fas CodeBlock's been handling but everything I've seen looks like it's on track for going to prod too.
18:25:30 excellent.
18:25:42 oh, one other app news: I disabled raffle in production.
18:25:55 how about smolt? disabled, too?
18:25:56 we were not using it and I wanted to rule it out of our httpd issues.
18:26:20 skvidal: I think we have a timetable for smolt... going to announce that in a few days.
18:26:23 Has disabling that changed things yet?
18:26:28 nirik: cool
18:26:36 abadger1999: it's not died since then, but it is very sporadic.
18:26:47 okay.
18:26:53 https://fedorahosted.org/fedora-infrastructure/ticket/3495
18:27:02 if anyone has debugging ideas or thoughts on that ^
18:27:12 If that turns out to be it... we'll have to think about how we deploy things in the future.
18:27:16 it's annoying and I want to track it down and fix it.
18:27:32 yeah, less mixing things on general app servers, more specific application servers.
18:27:41 might be we'll need to keep tg2 stuff separated from tg1 stuff.
18:27:47
18:28:09 that's even better (from a developer pov) :-)
18:28:25 yeah, we kinda started moving to that with packages/tagger...
18:28:41 although packages and tagger are on the same host.
18:28:58 well, hosts, but yeah... since they were somewhat intertwined.
18:29:03 #topic Sysadmin status / discussion
18:29:15 abadger1999: packags & tagger are also in the same puppet module :\
18:29:15 so, we completed a mass reboot this week... went pretty smoothly.
18:29:27 skvidal reinstalled all our builders.
18:29:34 lmacken: :-( That is something to look at cleaning up.
18:29:46 abadger1999: yup, packags also has legacy community stuff in it too :P
18:30:09 smooge: whats the status on new boxes? hopefully soonish?
18:30:16 lmacken: File an easyfix ticket for separating those three? (It's all puppet so a lot could be cargo culted by a relatively new person)
18:30:20 sean got the new keys.
18:30:35 abadger1999: will do
18:30:35 so someone can configure virthost12 at the moment
18:30:43 nirik: and releng01, too
18:30:56 I am working my way through the new IMM UI and key activation for the bkernel boxes
18:30:58 skvidal: oh yeah. which is working fine. ;)
18:31:02 nirik: excellent
18:31:12 nirik: when/how do we want to move the rest of the releng infra?
18:31:37 skvidal: I'd like to wait until we have the private/public setup and everything more organized.
18:31:48 I hope to have bkernel01/bkernel02 up later today
18:32:07 ousosl02 is waiting on network fixes on their side
18:32:17 but I got a sort of install sort of working
18:32:20 nirik: understood
18:32:43 skvidal: so, perhaps we get that all figured and we can start doing more after beta goes out.
18:32:53 nirik: not a problem
18:32:53 does that cover it or am I talking over another conversation?
18:32:54 I concur
18:32:59 smooge: sounds good.
18:33:09 I think thats all the new machines accounted for.
18:33:18 yep
18:33:42 ok, any other new or upcoming sysadmin side stuff?
18:34:24 moving on...
18:34:27 #topic Private Cloud status update
18:34:36 we have clouds
18:34:39 and they are private
18:34:41 w00t
18:34:46 * skvidal wants his own private sun
18:34:54 maybe a private snowstorm or tornado
18:35:06 heh
18:35:07 but not too often
18:35:16 and we have jenkins nodes on the cloud
18:35:16 we've been using/abusing the cloudlets in the last week
18:35:24 so, the openstack cloud has a glusterfs backend... it seems like it's not very fast. ;(
18:35:26 and found places where both cloudlets are suboptimal
18:36:08 euca 3.1.2 is out - I'm going to see how well the cloud fares with an upgrade while running :)
18:36:18 nirik: what's the folsom timeline?
18:36:35 I'm not sure... will be a bit longer for rhel packages I think.
18:36:49 I'll try and find out.
18:36:56 nirik: cool
18:37:09 we could also drop the gluster backend and see how that changes performance. I think reasonably easily.
18:37:35 nirik: but then we lose live migrate, right?
18:37:47 yeah
18:37:48 nirik: can we tug the netapp in there via iscsi?
18:37:51 and have less overall space
18:38:00 ha ha ha ha.
18:38:07 nirik: :)
18:38:30 we will have some more storage options there if our vfilers show up and work as we would hope...
18:39:09 anyhow, lets keep testing and poking at them...
18:39:16 I concur
18:39:28 nirik: things we need to think about with the clouds
18:39:36 - long term HA
18:39:40 - medium term - backups
18:39:55 monitoring
18:39:56 - medium/short - AuthN/AuthZ
18:40:41 nirik: agreed
18:40:55 I hear ldap
18:41:54 there's lots of cute things we could do to make them more nifty for people too on setup... ie, could we have a fas group(s) setup for ssh, a automatic git repo setup to save off data, etc.
18:42:38 anyhow, lets keep poking at them...
18:42:49 #topic Security FAD update
18:43:12 So, some progress on this... we have a conference room reserved now in the tower.
18:43:26 we need to gather final numbers and run it by numbers people.
18:44:07 we should also ping people one last time... make sure they can or cannot make it or ask people who are important to the goal.
18:44:47 so, I will send to the list, and ping folks, and we will send out budget request monday?
18:44:56 anything else we should plan on this?
18:45:08 wfm
18:45:28 wfm
18:46:06 I am trying to figure out if I can fly out early to SC see my parents, drive up for the FAD and then leave from SC. The flights were actually cheaper that way... but have to find out if I can borrow one of the parents cars
18:46:10 the fad faq thing suggests we should also plan some activities/etc... perhaps we could tour skvidal's bike collection. ;)
18:46:36 nirik: I took a survey... no one likes you
18:46:37 :)
18:46:49 ha.
18:46:51 hey his bike shed is better off than some of our houses
18:47:01 skvidal: all the one person you asked agreed ?
18:47:05 skvidal, got to that line in Portal2 again last night
18:47:06 pingou: yep
18:47:09 anyhow... anything else on this? or shall we move on?
18:47:21 nirik: skvidal
18:47:30 nirik: skvidal's bike collection or the bike shed itself ? :)
18:47:41 pingou: I have lots of various colors of paint
18:47:42 for the shed
18:47:45 cmon down
18:47:57 #topic Upcoming Tasks/Items
18:48:10 get ready for flood. ;)
18:48:12 #info 2012-10-08 purge inactive fi-apprentices
18:48:12 #info 2012-10-08 - announce smolt retirement
18:48:12 #info 2012-10-09 to 2012-10-23 F18 Beta Freeze
18:48:12 #info 2012-10-23 F18 Beta release
18:48:12 #info 2012-11-01 nag fi-apprentices
18:48:13 #info 2012-11-07 - switch smolt server to placeholder code.
18:48:14 #info 2012-11-13 to 2012-11-27 F18 Final Freeze
18:48:16 #info 2012-11-20 FY2014 budget due
18:48:18 #info 2012-11-22 to 2012-11-23 Thanksgiving holiday
18:48:20 #info 2012-11-26 to 2012-11-29 Security FAD
18:48:22 #info 2012-11-27 F18 release.
18:48:24 #info 2012-11-30 end of 3rd quarter
18:48:28 #info 2012-12-24 to 2013-01-01 Red Hat Shutdown for holidays.
18:48:30 #info 2013-01-18 to 2013-01-20 FUDCON Lawrence
18:48:32 anything else we should note or schedule?
18:48:39 * nirik will try putting this into fedcal. ;)
18:49:23 #topic Open Floor
18:49:29 Any items for open floor?
18:50:27 * nirik will close out the meeting in a minute if nothing more.
18:51:51 ok, thanks for coming everyone.
18:51:53 #endmeeting