19:00:04 #startmeeting Infrastructure (2014-02-27)
19:00:04 Meeting started Thu Feb 27 19:00:04 2014 UTC. The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:04 Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:00:04 #meetingname infrastructure
19:00:04 #topic Greetings starfighter!
19:00:04 #chair smooge relrod nirik abadger1999 lmacken dgilmore mdomsch threebean pingou puiterwijk
19:00:04 The meeting name has been set to 'infrastructure'
19:00:04 Current chairs: abadger1999 dgilmore lmacken mdomsch nirik pingou puiterwijk relrod smooge threebean
19:00:10 howdy
19:00:34 hi
19:00:35 * adimania is here
19:00:41 * pingou
19:00:44 * lmacken
19:01:03 * threebean is here
19:01:05 here
19:01:08 * ausmarton is here
19:01:19 * docent is here
19:01:25 * willo is here
19:01:32 here
19:01:33 morning everyone. ;)
19:01:43 but not for long, need to reload F20 :(
19:01:48 #topic New folks introductions and Apprentice tasks
19:01:54 danofsatx-work: fun times. ;)
19:01:58 * kushalk124 is here
19:02:09 any new folks like to introduce themselves? or apprentices with questions or comments?
19:02:22 new system - old load isn't optimized for it, so I get to redo it, yet again....
19:03:55 ok, moving along then... as always feel free to chime in with questions or comments anytime.
19:04:02 #topic Applications status / discussion
19:04:14 any application side news this week?
19:04:26 * mirek is here
19:04:26 new fedocal in prod
19:04:32 new nuancier in prod
19:04:33 * fchiulli is here. Sorry for being late.
19:04:46 (and new(er) nuancier in stg w/ fedmsg integration -- need to test this)
19:04:58 pingou: cool. ;)
19:05:08 those problems with copr (caused by createrepo_c) are hopefully solved
19:05:09 Cool.
19:05:13 pingou: this is full nuancier now right? not lite?
19:05:23 nirik: yup :)
19:05:24 all features included
19:05:30 https://apps.fedoraproject.org/nuancier
19:05:39 and with a nicer frontpage :)
19:05:48 mirek: cool.
So it was createrepo_c sucking up all memory? or ?
19:05:50 mirek: nice!
19:06:21 nirik: yes, I even seen process with 10GB RAM.
19:06:29 and with threebean we pushed some commits to summershum (support .gem, more info in the logs and on fedmsg)
19:06:42 thats a pile. ;(
19:06:43 I find the cause and give it to upstream with reproducer
19:07:16 once createrepo_c is semi-stable, getting mash to use it could be a great easyfix task
19:07:16 mirek: I should have arm socs for you before too long... need to get dhcpd to not mess up the cloud dhcp and setup a pxe server to install them, etc.
19:07:42 * mirek is happy
19:07:51 lmacken: theres still stuff missing tho I think... no deltarpms?
19:08:11 (no deltarpms was mentioned at devconf)
19:08:16 nirik: ah, yeah I haven't looked at it too closely, but that's a blocker for sure :)
19:09:02 mirek: those ansible module issues you ran into are really weird. it's like something is modifying your pythonpath, but only sometimes?
19:10:24 yes, it really puzzle me, I want to spend some time, but today it happen on prod so I was in hurry to return it back online
19:10:34 sure, understand
19:11:32 oh, I had one thing to note...
19:11:56 a while back puiterwijk got our reviewboard on fedorahosted back up and running
19:12:14 I keep not having time to poke around on it more... but we should see if it's usable for us for any needs...
19:12:18 https://fedorahosted.org/reviewboard/dashboard/
19:12:29 it is much faster than before...
19:12:57 not accessible w/o fas account?
19:13:03 nirik: I can assist with administration if there are questions
19:13:09 ah, it's the dashboard link
19:13:10 pingou: openid
19:13:15 nice
19:13:18 pingou: https://fedorahosted.org/reviewboard/r/ is available directly
19:13:28 So you can read but not edit.
19:13:45 puiterwijk elected to make the login page automatically bounce to OpenID.
19:13:49 oh reminds me I need to file a bug on the openid part...
19:14:30 it makes a local FirstnameLastname user for reviewboard.
19:14:47 but... it can't handle users with neat utf8 stuff in name. ;)
19:15:06 ^^
19:15:23 I'm sure we are shocked. ;)
19:15:35 * pingou looks at abadger1999
19:15:49 I personally wish he'd just elected to mangle the openid for the username
19:15:56 yeah, seems easier.
19:16:00 sgallagh-id-fedoraproject-org would have worked better.
19:16:20
19:16:34 And guaranteed not to be overloaded if we have two John Smiths
19:17:00 anyhow, I know we have github for many application reviewing needs, but if it's nice enough we could look at it for ansible changes during freeze or the like.
19:17:03 That's fixable. Please CC me on the bug report.
19:17:17 sgallagh: where's the best place to file?
19:17:24 nirik: FWIW, I'm working on Git hooks to be able to manage pull requests through Review Board.
19:17:42 So you get the nice review UI of RB alongside the process management of github
19:17:52 cool.
19:17:57 nirik: Just use the Infra trac for instance-specific ones
19:18:02 k
19:18:39 ok, any other application news?
19:18:47 kinda application-y:
19:19:04 pushed out a nice error logging config for fedmsg this morning -> http://infrastructure.fedoraproject.org/cgit/ansible.git/commit/?id=1b875b543fa414c0971941b3bb2951e7523035aa
19:19:17 it's *nice*!
19:19:21 so, we'll get error emails from the badges awarder and the notifications daemon. from summershum too.
19:19:39 oh nice. these are when it can't send? or ?
19:19:52 well, whenever log.error('blah blah') is called.
19:19:59 so its up to each app to catch its own problems and log them.
19:20:16 pkgdb2, fedocal and nuancier also send emails
19:20:31 I was wondering if we should create an alias to receive these emails
19:20:40 yeah, where do they go now?
19:20:51 (the fedmsg ones go to sysadmin-datanommer-members@fp.o
19:20:53 pkgdb2, fedocal and nuancier to me (only)
19:20:56 we do have sysadmin-logs, but thats more sysadminy than applicationy
19:21:14 sysapp-logs? :D
19:21:40 * pingou doesn't dare to propose appy-logs
19:21:48 we had exceptions from fedora-packages coming to lmacken and I for a while.. but there were just too many.
19:22:05 a group is often nice because it's easy to manage who's in it, etc.
19:22:14 no aliases to change, etc
19:22:18 +1
19:23:01 on the app side, I've been working a little on FAS3 today https://github.com/fedora-infra/fas/pull/56
19:23:03 could we, create an alias for each app so you don't have to choose the firehose or nothing?
19:23:28 threebean: we could, but if they are aliases, that means updating them via puppet (or ansible) and more pain in freezes, etc.
19:23:35 maybe use gitproject as for fedorahosted?
19:23:36 * threebean nods
19:24:35 how about: fedmsglogs-applicationname? just tracking groups
19:24:57 wfm too
19:25:18 the git ones might be folks who dont want our specific error logs
19:25:32 oo, wait. I'm not sure how to distinguish fedmsglogs between applications. :/
19:26:10 then just -logs, fedmsg-logs being just one of them
19:26:14 * threebean nods
19:26:29 sure.
19:27:06 ok, any other apps news? ;)
19:27:30 #topic Sysadmin status / discussion
19:27:43 on the sysadmin side, smooge and I have been having fun with download servers.
19:27:57 turns out the load on them has been at least part of the thing slowing our netapp storage down. ;(
19:27:59 download download
19:28:29 we tried cachefilesd the other day, but it made the machines unstable sadly.
19:28:42 so, now we are limiting rsyncs per download server
19:28:57 we also have a iptables hashlimit to limit ips that hit rsync too much
19:29:00 how big is the traffic (or data transfers)
19:29:02 ?
19:29:19 * danofsatx-work makes a note to alter his rsync scripts
19:29:37 in bytes/packets? a lot.
;)
19:29:53 we did have some ip's hitting 100's of times a day
19:30:20 for the record, that wasn't me ;) I hit it once every 7-10 days
19:30:31 we are likely going to be moving storage for them next week.
19:31:07 i'll be making progress on migration of those servers to ansible this weekend
19:31:10 looks like around 10TB a day or so as a ballpark
19:31:43 perhaps 15
19:31:47 willo: great. ;)
19:32:03 migration of paste module to ansible should be good to go.
19:32:21 I'll pick up another one this weekend probably.
19:32:22 adimania: thanks for working on it. ;)
19:32:40 nirik, thanks for all the help :)
19:32:52 I wonder if we should track a list of remaining module to port to ansible?
19:33:27 pingou, that would be really helpful.
19:33:33 pingou: we could start doing that yeah... it's a bit of a mess tho due to puppet having old junk in it that we arent actually using anymore.
19:33:47 like for example I think talk.fedoraproject.org/asterisk module is still there.
19:34:07 but we could perhaps list machines in puppet only and extrapolate ?
19:34:28 track on a wiki page maybe?
19:34:54 I would go with a trac wiki page :)
19:35:12 sorry my humour is off. rebooting
19:35:16 we could. would someone like to write up at least part of such a thing? I'd be happy to edit it and add info, etc.
19:35:22 smooge: :)
19:35:55 i'll take a stab
19:36:18 willo: cool!
19:36:33 lets see...
19:36:34 i'll email list when outline is done for input
19:36:50 #info download servers and netapp i/o has been a big issue this week. ongoing.
19:36:53 willo: sounds great.
19:37:09 #info more puppet -> ansible conversions are ready to go
19:37:31 I have one of our arm chassis up in the cloud network, just need to get dhcpd working and pxe server to install them...
19:38:01 Oh, I gave a talk to boulder devops monday night on ansible. My slides are at:
19:38:25 http://fedorapeople.org/~kevin/ansible-20140224.odp
19:38:42 for anyone who wants them.
Not sure how much sense they make without me gibbering over them, but there they are. ;)
19:38:49 cool
19:39:10 so no vid of the gibbering for posting to youtube ;)
19:39:11 we have some new machines arriving tomorrow (I think)
19:39:24 willo: sadly no, they are looking for a a/v person, but didn't have one.
19:39:51 I have one idea... write something like rbac-playbook but for cloud, so sysadmin-cloud would be able to run euca-* and nova commands. Can someone send me current source of rbac-playbook so I can base it on that please. I just find some old version on seth site
19:40:40 mirek: not a bad idea...
19:41:04 we aren't setup to run nova commands from there... but the euca ones would work after you source a eucarc...
19:41:20 wonder if that is possible to just do in sudo?
19:41:41 ie "source this, then run command" ?
19:41:57 source is not command
19:42:02 it is bash internals
19:42:32 yeah, but it looks like sudo you can pass a env_file for this.
19:42:33 I normally write a shell script if I have to source stuff
19:42:33 but it can be very similar to rbac-playbook, and easy, I just want to reuse some recent code
19:43:03 http://skvidal.fedorapeople.org/misc/rbac-playbook I find just this
19:43:15 right, I can send you the current one...
19:43:24 it's pretty primitive tho.
19:43:29 thanks
19:43:31 for example, command line args aren't supported.
19:43:41 which may break it for ec2 stuff.
19:43:47 I will keep it primitive for sure :)
19:44:24 ok, feel free to look, but I think env_file with the ec2rc and allowing euca* might be easier...
19:44:43 thats just a change to sudoers
19:45:04 or... hum.
19:45:38 what if we add a acl to the ec2rc file to allow sysadmin-cloud to read it. Then you can just source it and run commands as you. They shouldn't need any privs
19:46:07 it would mean all sysadmin-cloud folks would have the credentials
19:46:42 ahh chacl(1) yes, that should work
19:47:01 anyhow, will ponder on it and try and get something that works.
;)
19:47:28 anything else sysadmin related?
19:47:31 * lbazan here late..
19:48:40 #topic Upcoming Tasks/Items
19:48:41 https://apps.fedoraproject.org/calendar/list/infrastructure/
19:48:55 anything upcoming folks would like to note or schedule?
19:49:14 I still haven't done much on FAD organizing. Hopefully more news by next week
19:49:24 nirik: same here
19:50:03 #topic Open Floor
19:50:17 I have been playing with the lookaside cache
19:50:19 anyone have anything for open floor? questions, comments, ideas, favorite pies?
19:50:29 I was wondering how often we have 1 tarball with multiple md5
19:50:36 the results are interesting: http://paste.fedoraproject.org/80881/39350540/
19:50:57 wow.
19:51:07 but that on all the current tree, so there are some old versions in there
19:51:10 I wonder if thats indicative of uploads that fail...
19:51:25 or if it's upstreams that change stuff and re-release.
19:51:34 I'm afraid for the latter
19:51:36 holy..
19:51:53 I'm not sure yet what to do with this, mail on devel, blog post?
19:52:13 maybe it might be worth asking people to watch out for this
19:52:18 I wonder if we could find out more by looking at commits on those spec files?
19:52:21 accident happens but...
19:52:24 how do they get 2 different md5s?
19:52:28 What does that mean? E.tgz has 10 md5 sums -- does that mean that 10 packages have the same tar.gz?
19:52:35 smooge: two different tarball with the same name
19:52:44 mirek: yup
19:52:47 ah ok.
19:52:51 it means you upload foo-1.0.tar.gz
19:53:00 then upload it again, but with a different md5
19:53:05 ahhhhh
19:53:20 http://pkgs.fedoraproject.org/lookaside/pkgs/389-admin/389-admin-1.1.12.tar.bz2/ for example
19:53:23 either upstream did it, which is bad
19:53:30 so I guess a timestamp,md5sum would be needed
19:53:42 or someone did regenerate the tarball from git, this kind of stuff
19:53:42 misc: yeah, but does sadly happen
19:53:49 smooge: we kinda have the timestamp on the apache page ;-)
19:53:57 or someone modified the tarball, cause patch is too mainstream :)
19:54:03 misc: regenerate the tarball w/o renaming it
19:54:25 I wonder, could we grab all those, then unpack and diff -Nur on them to see how they are different? I guess so, but might take a long time to figure out all of them.
19:54:38 5569 packages had multiple md5 for at least 1 of their version
19:54:43 might be a little much :)
19:54:58 nirik: skip texlive, this will reduce the time to see :)
19:55:01 nirik: smooge but that's the output from my demand from yesterday (install tree on pkgs01)
19:55:04 :)
19:55:15 so what is the problem? the build system grabs the wrong one? we are worried people are uploading different ones
19:55:36 smooge: the build system will grab whatever is in the source file, so we should be fine there
19:55:36 pingou: I'd say devel list I guess. Ask people if they are hitting upload issues (which we could try and fix) or other?
19:55:49 it's more about packager/upstream behavior
19:55:52 well, if it's a upload issue, we should try and fix it.
19:56:01 if it's a upstream issue, we should be very sad, but ok.
19:56:09 if it's a packager issue, we should tell them not to do that.
;)
19:56:21 * pingou is on the list
19:56:43 pingou, but the source file says 389-admin-1.1.12.tar.bz2 and the lookaside cache has 3 of them
19:56:58 smooge: the sources file has md5 too
19:57:03 smooge: 3 different md5, and the source file has the md5
19:57:07 duh
19:57:09 thanks
19:57:11 :)
19:57:19 I deal with budgets and PPC for a week
19:57:31 you're in pretty good shape then! :D
19:57:32 well mostly budgets
19:57:34 anyhow, perhaps devel list and hope we can get folks interested in investigating more so we don't have to?
19:57:46 nirik: ok we'll do that :)
19:58:04 crowdsource all the things! :)
19:58:26 nirik: but I doubt texlive are bad upload, I'm pretty sure they are small sources :)
19:58:46 oh... I wonder if that data would be nice too... size ?
19:58:53 ?
19:59:02 because if there are 4 of them and 3 of them are really small, it sounds like an upload problem?
19:59:19 i was thinking spec file
19:59:22 if all are close to the same size, it sounds more like upstream re-released or packager messed up
19:59:56 pingou: so, for each of those on your list, a 'ls -l' of the same md5sum one?
20:00:09 ls -lR
20:00:13 actually even then it could be a bad upload. We had someone complaining a while back and it turned out about being a bad proxy in front
20:00:31 sure, but it might give some more indications.
20:00:38 nirik: good idea
20:01:38 ok, if nothing else, will close out in a minute.
20:02:03 quick update from me
20:02:28 willo: sure, whats up?
20:02:42 I'm about half way through collating a list of networks before I start on the diagrams
20:03:10 willo: sorry I dropped off on helping you with this - school became harder than I anticipated for this semester.
20:03:22 cool. please do ask me if you have questions.
20:03:29 i'll have something shortly to get input on assumptions
20:03:42 assumptions about purpose etc
20:03:46 but things are stabilizing, so feel free to ping me for anything
20:03:47 nirik: no prob
20:04:00 danofsatx-work: no probs will do
20:04:18 great. ;)
20:04:36 listing it out in spreadsheet and i'll stick it up on fedorapeople and ping mailing list
20:04:48 ok, thanks for coming everyone! Lets get back to it in #fedora-admin, #fedora-apps, #fedora-noc.
20:04:55 willo: sounds goodly.
20:05:11 #endmeeting
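
Editor's sketch (not part of the meeting): the open-floor discussion asked how often one tarball name appears under several md5 sums, and whether comparing file sizes could hint at failed uploads versus re-released or regenerated tarballs. The following is a minimal Python sketch of that check, not the script that produced the pasted results. It assumes the lookaside tree is laid out as <root>/<package>/<filename>/<md5>/<filename>, which is only inferred from the 389-admin URL quoted in the log. The function names and the 0.5 size ratio are hypothetical choices made for illustration.

```python
import os
from collections import defaultdict


def find_multi_checksum_sources(lookaside_root):
    """Return {(package, filename): [(checksum, size), ...]} for every
    source file that is stored under two or more checksum directories.

    ASSUMED layout: <root>/<package>/<filename>/<checksum>/<filename>
    """
    found = defaultdict(list)
    for package in sorted(os.listdir(lookaside_root)):
        package_dir = os.path.join(lookaside_root, package)
        if not os.path.isdir(package_dir):
            continue
        for filename in sorted(os.listdir(package_dir)):
            file_dir = os.path.join(package_dir, filename)
            if not os.path.isdir(file_dir):
                continue
            for checksum in sorted(os.listdir(file_dir)):
                path = os.path.join(file_dir, checksum, filename)
                if os.path.isfile(path):
                    found[(package, filename)].append(
                        (checksum, os.path.getsize(path))
                    )
    # Keep only the names that were uploaded under more than one checksum.
    return {key: uploads for key, uploads in found.items() if len(uploads) > 1}


def looks_like_failed_upload(uploads, ratio=0.5):
    """Rough heuristic from the discussion: if one copy is much smaller
    than the largest copy, a truncated upload is a plausible cause.
    The ratio is an arbitrary illustrative threshold.
    """
    sizes = [size for _, size in uploads]
    return min(sizes) < ratio * max(sizes)
```

Calling find_multi_checksum_sources() on a local copy of the tree and passing each value to looks_like_failed_upload() would give a first split between "probably a bad upload" and "probably upstream or the packager regenerated the tarball". As noted in the meeting, similar sizes do not rule out a bad upload, so the result is only an indication.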