18:00:04 #startmeeting Infrastructure (2014-08-21)
18:00:04 Meeting started Thu Aug 21 18:00:04 2014 UTC. The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:04 Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:00:05 #meetingname infrastructure
18:00:05 #topic aloha
18:00:05 #chair smooge relrod nirik abadger1999 lmacken dgilmore mdomsch threebean pingou puiterwijk
18:00:05 The meeting name has been set to 'infrastructure'
18:00:05 Current chairs: abadger1999 dgilmore lmacken mdomsch nirik pingou puiterwijk relrod smooge threebean
18:00:10 * threebean is here
18:00:14 * puiterwijk is around
18:00:20 * relrod here
18:00:25 is here
18:00:29 * banas is here as well
18:00:44 * bochecha is here
18:01:04 * lmacken
18:01:19 * adimania is here
18:01:51 * lanica is here for the infra meeting.
18:02:02 hello everyone. ;) let's go ahead and get started...
18:02:05 #topic New folks introductions and Apprentice tasks
18:02:14 any new folks? or apprentices with questions/comments/ideas?
18:03:08 can there be an IRC class on ansible?
18:03:15 I can't remember if this is the slot I'm supposed to update - but work on GG is going good, as per schedule :)
18:03:17 is here in an un-italicized fashion
18:03:43 mpduty: we had one a while back, we could do another if there's interest, sure. ;)
18:04:49 we had to skip two gg meetings because I was at Flock, and now the whole team is sick, so we just decided to do it next week.
18:04:52 nirik, well I don't know but some newcomers would be able to learn things faster I feel
18:05:08 including myself
18:05:20 * mirek-hm is late, but here
18:05:26 mpduty: sure, we could ask on the list if folks are interested. I'm happy to do some intro one. We also had a nice intro at flock that was recorded...
;)
18:05:29 <-- long-time newcomer
18:05:35 * adimania thinks that it might be helpful to everyone
18:06:04 https://www.youtube.com/watch?v=sCXCgsmQuSY&feature=youtu.be
18:06:18 * lbazan here
18:06:25 thanks I shall go through that
18:06:33 welcome sart. ;)
18:07:06 thanks and ty for the intro vid
18:08:11 ok, any other new folks or questions?
18:08:31 * pjones waves hello
18:08:48 hey pjones. :)
18:09:04 #topic Applications status / discussion
18:09:18 pingou wasn't able to make it but had a few things for me to pass on:
18:09:28 #info new pkgdb2 release in production this week.
18:09:50 #info critpath lists are now working again.
18:10:22 #info infrastructure jenkins is updated to the latest plugins, etc.
18:10:29 also, tflink wanted to share:
18:10:46 #info taskotron in stg/dev is going along fine, probably won't be in production until after alpha now.
18:11:06 any other applications news?
18:11:32 #info we're almost at 10 million fedmsg messages all time
18:11:36 https://apps.fedoraproject.org/datagrepper/
18:11:38 :P
18:11:56 #info GlitterGallery release process planned to be started within this week
18:12:09 threebean: nice. :)
18:12:13 also, we started a scratch pad for discussing promoting FMN from an opt-in service to an opt-out service (for packagers)
18:12:15 http://piratepad.net/FMN-opt-out
18:12:20 threebean: woot!
18:12:47 threebean: 10mil & 1TiB+ of traffic? :)
18:12:50 do we need bodhi2 in place?
18:13:07 optimizations pushed out last friday seem to have FMN keeping its workload under control.
18:13:11 .tiny https://admin.fedoraproject.org/collectd/bin/graph.cgi?hostname=notifs-backend01.phx2.fedoraproject.org;plugin=fedmsg;plugin_instance=hub;type=queue_length;type_instance=FMNConsumer_backlog;begin=-706400
18:13:12 threebean: http://tinyurl.com/le9xywt
18:13:35 threebean: and *very* few of those are me trying to game badges
18:13:44 pjones: almost none
18:13:58 ;)
18:14:35 threebean: didn't the optimizations DoS FAS though?
18:15:22 they did :/ (although only once at startup)
18:15:28 we have been seeing fas servers complain the last few days...
18:15:36 but I haven't had a chance to look at what might be causing it.
18:15:57 many threads all trying to cache fas at the same time - fixed in git http://da.gd/BW60H
18:16:21 nirik: I think only some of those are related to fmn, which should only be doing this at startup.
18:16:39 I'd think the fas problems would have gone down due to the fedmsg-fasclient stuff pingou pushed out.. :/
18:16:54 yeah. there's a lot less runs of that for sure.
18:17:06 but something is still causing them to start swapping.
18:17:14 more investigation needed
18:18:39 we have also seen recently openvpn hitting cpu limits.
18:18:45 might be related, not sure.
18:19:02 nirik: were you thinking that was related to outbound email?
18:19:18 thought so at first, but I was looking at the wrong thing. ;)
18:19:20 i've been seeing a lot more fedmsg error spam, along with the usual batch of fedora-packages 200k mails :P
18:19:23 oh, okay.
18:19:36 If you sniff the traffic on the openvpn tun device you only see traffic going to the node you are on.
18:19:50 if you sniff the eth device you see all the vpn streams, etc.
18:20:05 it looked last night like proxy02 was pushing a lot more than the others, but that might have just been at that time
18:20:33 I'll keep trying to isolate it.
18:21:16 Any other application news? :)
18:21:21 when do we freeze?
18:21:31 * threebean forgot about freeze
18:22:14 when we have a viable test compose. ;)
18:22:16 some time after dgilmore winds up having working trees for all the stuff.
18:22:23 which is... taking a while.
18:22:24 yeah... hopefully soon.
18:22:32 so many bugs.
18:22:56 nirik: looks like createImage is still busting on cloud-atomic?
18:23:03 http://koji.fedoraproject.org/koji/taskinfo?taskID=7435960 <-- the screenshot is horrifying
18:23:11 pjones: yeah, boggling.
;(
18:23:20 but now I'm just interrupting
18:23:46 we all like a good horror sideshow. ;)
18:24:00 #topic Sysadmin status / discussion
18:24:06 so, on the sysadmin side of things...
18:24:22 #info retrace servers have been handed off to retrace folks. They are setting things up on them now.
18:24:32 thanks to smooge for getting those all installed.
18:25:14 There's the openvpn and fas hiccups we have been seeing, but we already mentioned those.
18:25:25 pjones: if only that used chained exceptions in py3, that screenshot might be a bit more useful :\
18:25:34 We still have qa09 and virthost-comm03 to set up as new machines.
18:25:50 lmacken: Eh, I think it's just saying the image the python loaded from is corrupt. I doubt the exception is meaningful at all.
18:26:03 pjones: yeah, looks like an exception in the exception handler :\
18:27:37 there's also a bit of discussion ongoing about netapp space (since we are getting very dangerously full on our koji storage)
18:27:47 hopefully there will be some good news there sometime soon.
18:27:58 hm, turns out that our staging and production environments weren't firewalled off from one another anymore after the switch to ansible. that's fixed up now.
18:28:23 oh yeah, thanks much for fixing that.
18:28:39 np. things might behave funny while it shakes out.
18:28:50 I've so far not seen any fallout from that...
18:28:56 but that doesn't mean we won't hit some
18:30:11 #info memcached is now set up to restart on exit and also is monitored in nagios
18:30:35 #info staging and prod are now once again blocked from talking to each other via firewall rules.
18:31:28 #topic nagios/alerts recap
18:31:33 this should be fun this week...
18:31:37 * nirik digs up link
18:32:11 .tiny https://admin.fedoraproject.org/nagios/cgi-bin//summary.cgi?alerttypes=3&displaytype=3&eday=15&ehour=24&emin=0&emon=5&esec=0&eyear=2014&host=all&hostgroup=all&hoststates=3&limit=25
18:32:11 nirik: http://tinyurl.com/ms5u5qm
18:32:17 I think that's right.
18:32:59 no, not right.
18:33:30 .tiny https://admin.fedoraproject.org/nagios/cgi-bin//summary.cgi?report=1&displaytype=3&timeperiod=last7days&smon=8&sday=1&syear=2014&shour=0&smin=0&ssec=0&emon=8&eday=21&eyear=2014&ehour=24&emin=0&esec=0&hostgroup=all&servicegroup=all&host=all&alerttypes=3&statetypes=3&hoststates=7&servicestates=120&limit=25
18:33:31 nirik: http://tinyurl.com/koaopnc
18:34:03 almost all of those are due to the vpn issue.
18:34:15 the server hits 100% cpu and starts dropping packets.
18:35:00 #topic Upcoming Tasks/Items
18:35:00 https://apps.fedoraproject.org/calendar/list/infrastructure/
18:35:12 anything upcoming folks would like to schedule or note?
18:35:18 the freeze is hopefully coming up soon.
18:36:19 Also, we are in the early planning stages for another FAD in december...
18:36:45 https://fedoraproject.org/wiki/FAD_MirrorManager2_FAS3_2014
18:36:54 would remote participation be possible?
18:36:59 absolutely.
18:37:41 cool. Flying is very expensive in my part of the world :(
18:37:42 it's very subject to change right now, just sounding out if it will be possible and useful.
18:37:56 just to note: i'm on some on-again, off-again vacation for the next two weeks. (I'll be around on tuesdays and thursdays)
18:38:10 yeah, there's a good chance that I won't be in Denver during that time
18:38:18 threebean: cool.
18:38:24 I'm definitely interested if I'm not at work (or possibly if so but work is completely dead, which it may be that time of year... )
18:38:29 so a backup location is definitely needed
18:38:29 lmacken: fun. :) So, we might want to pick another place or time
18:39:10 yeah, we will come up with something. ;)
18:39:44 Oh, I should mention that I am likely going to be traveling the latter part of september...
18:40:23 #topic Open Floor
18:40:38 anyone have anything for open floor? comments, suggestions, ideas, favorite cookies?
18:41:47 mmm... cookies.
18:42:10 * adimania eats
18:42:16 encrypted browser cookies?
18:42:16 * adimania nom nom nom
18:42:28 chocolate chip. ;) Or molasses sugar. nom.
18:42:39 oatmeal butterscotch
18:43:46 alrighty. If nothing else will close out in a minute... (to go find cookies)
18:44:11 Good meeting... later all!
18:44:39 Thanks for coming everyone. Let's all continue in #fedora-admin, #fedora-apps and #fedora-noc.
18:44:41 #endmeeting
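
Note on the 18:15:57 item ("many threads all trying to cache fas at the same time"): that pattern is commonly called a cache stampede. The sketch below shows one common guard, a lock with a second check inside it, so that only the first thread performs the expensive fill. It is not the fix referenced at http://da.gd/BW60H, whose contents are not shown in this log; `SharedCache`, `fake_fas_lookup` and `run_demo` are hypothetical names used only for illustration.

```python
import threading


class SharedCache:
    """Cache whose expensive fill runs at most once, however many threads ask."""

    def __init__(self, loader):
        self._loader = loader        # hypothetical callable that fetches the data
        self._lock = threading.Lock()
        self._data = None
        self._loaded = False
        self.load_count = 0          # how many times the loader actually ran

    def get(self):
        # Fast path: the cache is already populated.
        if self._loaded:
            return self._data
        with self._lock:
            # Re-check inside the lock: another thread may have filled
            # the cache while this one was waiting.
            if not self._loaded:
                self._data = self._loader()
                self.load_count += 1
                self._loaded = True
        return self._data


def fake_fas_lookup():
    # Stand-in for an expensive remote query (assumption: the real
    # service call and its data shape are not shown in the log).
    return {"alice": "alice@example.org"}


def run_demo(workers=8):
    """Start several threads that all ask for the cache at once."""
    cache = SharedCache(fake_fas_lookup)
    threads = [threading.Thread(target=cache.get) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache.load_count
```

Calling `run_demo()` starts eight threads against an empty cache and returns how many times the loader ran; with the lock in place the other threads wait for the first fill instead of each issuing their own query.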