18:00:00 <nirik> #startmeeting Infrastructure (2014-10-02)
18:00:00 <zodbot> Meeting started Thu Oct 2 18:00:00 2014 UTC. The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:00 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:00:01 <nirik> #meetingname infrastructure
18:00:01 <nirik> #topic aloha
18:00:01 <nirik> #chair smooge relrod nirik abadger1999 lmacken dgilmore mdomsch threebean pingou puiterwijk
18:00:01 <zodbot> The meeting name has been set to 'infrastructure'
18:00:01 <zodbot> Current chairs: abadger1999 dgilmore lmacken mdomsch nirik pingou puiterwijk relrod smooge threebean
18:00:14 * threebean is here
18:00:14 * pingou
18:00:18 * tflink is here
18:00:19 * relrod here
18:00:26 * lmacken
18:00:28 * danielbruno is here
18:00:46 * oddshocks here
18:01:22 * RogerBTX present @ laundromat
18:01:33 * mpduty is here
18:01:58 <Neldogz> neldogz is here
18:02:01 <nirik> #topic New folks introductions and Apprentice tasks.
18:02:13 <nirik> any new folks like to introduce themselves today?
18:02:20 <nirik> or apprentices with questions, comments or ideas?
18:03:24 <nirik> ok. ;) moving along then...
18:03:38 <nirik> #topic Applications status / discussion
18:03:45 <nirik> any application news this week? :)
18:03:52 <pingou> new pkgdb in prod today
18:04:04 <nirik> nice. any list of changes?
18:04:11 <smooge> mgoodmday
18:04:12 <pingou> https://github.com/fedora-infra/pkgdb2/blob/master/utility/pkgdb2.spec#L114
18:04:23 * nirik wonders if we shouldn't try and mail at least infra list on updates? just a short 'pkgdb2 updated, here's changes'
18:04:41 <pingou> I added the update_package_info script that was written a while ago
18:05:01 <pingou> whose goal is to update the package's info (summary and description) from yum's info (thus the rpm)
18:05:20 <pingou> also, now the user orphaning a package will lose his/her commit and approveacl ACL
18:05:34 <nirik> ok. I guess if they need it back they can unorphan. ;)
18:05:40 <pingou> yes :)
18:05:48 <pingou> the rest is mostly cosmetic changes in the UI
18:06:07 <pingou> there will be another release tomorrow to fix something in the emails pkgdb send when it has a problem
18:06:16 <pingou> and I have a new fedocal release coming after that
18:06:23 <nirik> busy busy. ;)
18:06:32 <pingou> (w/ for example the possibility to send a reminder to multiple email addresses)
18:06:40 <tflink> taskotron production is making progress
18:06:52 <pingou> and anitya is ready in prod, just waiting for the DNS to sync in:
18:07:06 <pingou> but I'll call for tester before calling it 1.0 :)
18:07:13 <tflink> waiting on two bits of code, learning the hard way about bits that need to be monitored after restart etc.
18:07:18 <nirik> #info pkgdb2 update in prod today, another minor one tomorrow.
18:07:45 * dgilmore is kinda here but really not
18:07:52 <nirik> #info anitya almost ready for production. needs dns to finish getting setup.
18:08:03 <nirik> #info taskotron almost ready for production.
18:08:28 <nirik> tflink: aside from those issues everything looking ok from a infra standpoint for you? I know we need monitoring still.
18:08:54 <tflink> looking into some vpn issues ATM
18:09:04 * pingou needs to add a nagios check for anitya
18:09:10 <tflink> monitoring, backup (ticket hasn't been filed yet)
18:09:28 <nirik> ah yeah I am already backing up the db...
18:09:34 <nirik> but there's possibly more.
18:09:51 <tflink> there are some files on taskotron01.qa that will need to be backed up
18:10:00 <nirik> ok.
18:10:07 * lanica is here for the infra meeting (sorry I'm late!)
18:10:14 <nirik> welcome lanica
18:10:57 <nirik> ok, anything else pending on the applications side?
18:11:29 <threebean> eh, I've been doing lots of monitoring stuff this week and will be pushing out some other changes for anitya later.
18:11:41 <nirik> cool.
18:12:18 <oddshocks> general improvements to fedimg and ostree/atomic stuff. nothing outstanding.
18:12:23 <oddshocks> making progress
18:12:37 <pingou> cool
18:12:46 <nirik> abompard: whats the current word on hyperkitty (if you happen to be around). I'd be fine moving some lists if there's nothing we are waiting on.
18:12:52 <nirik> oddshocks: cool.
18:13:19 <abompard> nirik: actually there's currently lots going on around Mailman's SQLAlchemy port
18:13:26 <abompard> it may be happening soon
18:13:42 <lmacken> I got bodhi2 test suite back up and running on jenkins this week. Lots of masher hacking as well, but that hasn't hit infra yet. http://jenkins.cloud.fedoraproject.org/job/bodhi/
18:13:54 <abompard> so I'm focusing on that, the windows of availability for barry to review code are small
18:13:56 <nirik> abompard: nice. :) do we want to wait for that?
18:14:15 <abompard> nirik: yes, if it happens as soon as I think it can
18:14:25 <nirik> lmacken: nice.
18:14:25 <oddshocks> abompard: as in warsaw?
18:14:30 <abompard> oddshocks: yep
18:14:33 <oddshocks> abompard: niceee
18:14:48 <nirik> lmacken: no weird disappearances on jenkins lately? did we ever figure out what was happening with that?
18:15:03 <pingou> abompard: champagne \ó/
18:15:14 <abompard> pingou: hell yeah
18:15:33 <nirik> cool.
18:15:38 <lmacken> nirik: as far as plugins disappearing, I think the ansible playbook overwrites them all, and we don't have the fedmsg plugin in git
18:15:47 <nirik> #info hyperkitty is working on sqlalchemy changes. Will deploy after those land.
18:15:54 <nirik> lmacken: ah, that would do it.
18:15:56 <lmacken> as for projects disappearing, I have no idea how that happened to begin with, but it has yet to happen again
18:16:19 <nirik> #info bodhi2 is in jenkins now and running tests on commits
18:16:21 <oddshocks> jenkins is just trying to get out of work. slacker.
18:16:22 <pingou> yeah the plugins need to be installed via the ansible playbook
18:16:52 <pingou> not via jenkins' UI (or done in both place, the playbook in addition to the UI)
18:17:30 <nirik> ok. I think relrod setup the fedmsg plugin there? relrod and pingou: can you see if you can get that in ansible (if it's not)
18:17:56 <relrod> yeah, we need to ansibleize it
18:18:03 <nirik> cool
18:18:09 <pingou> relrod: there is also a problem with the yum repo that was added
18:18:11 <pingou> (the copr one)
18:18:33 <relrod> pingou: yeah, I know. Jenkins can't talk to copr for some reason. Short term solution is put the RPM from that copr in the infra repo
18:18:36 <nirik> yeah, we talked about that...
18:18:37 <relrod> I just need to actually do it
18:18:44 * nirik nods.
18:18:49 <pingou> relrod: ok cool
18:19:16 <nirik> ok, anything else on the applications side?
18:19:43 <nirik> pingou: oh, side question: does anitya file bugzilla bugs on things? or was that a different layer of our old cnucnu thing?
18:19:44 <relrod> one fedora mobile related thing I guess:
18:20:01 <pingou> nirik: it'll be a different layer
18:20:23 <nirik> pingou: ok, was that something tyll was running? does he know to update? or is that something we want to run?
18:20:30 <relrod> I am working on a system for pushing out nightly updates to it via the Google Play alpha track. This means we can lose the ugly self-updating code that is in Mobile right now.
18:20:49 <nirik> relrod: cool.
18:20:51 <relrod> We were going to do this via Jenkins (https://github.com/fedora-infra/mobile/issues/41) but ran into some issues
18:21:02 <pingou> nirik: we'll probably want a fedmsg-cnucnu as we have fedmsg-fasclient
18:21:09 <relrod> so it's probably going to be a separate process that just pulls the latest APK from Jenkins and sends it to Google
18:21:12 <relrod> which is fine
18:21:20 <pingou> nirik: I'll coordinate w/ tyll
18:21:49 <nirik> relrod: ok
18:21:53 <nirik> pingou: sounds good
18:22:02 <relrod> (the issue is basically "the APK has to be signed, and we don't really want the signing key for it on Jenkins")
18:22:52 <nirik> relrod: hum, ok... where would that live then/
18:22:53 <nirik> ?
18:23:36 <nirik> or TBD?
18:23:40 <relrod> nirik: Probably ansible private repo, and the nightly pusher can be a cloud node or something? Doesn't need to be very powerful
18:23:47 <nirik> ok
18:24:23 <relrod> If we don't want it in the ansible private repo, open to better suggestions
18:24:40 <nirik> #action pingou to coordinate with tyll on cnucnu bugzilla filing under the new anitya setup.
18:24:47 <nirik> I think that would be ok
18:25:05 <nirik> #info fedora mobile getting setup to be updated from the google play alpha track.
18:25:11 <relrod> I'm not sure if the alpha signing key can be different than the production track signing key (if anyone has Android experience and knows, poke me? :))
18:25:27 <nirik> no idea off hand.
18:25:39 * nirik waits for the ffos html 5 version. ;)
18:25:48 <relrod> :P
18:25:56 <relrod> anyway, that's all I have.
18:26:00 <nirik> cool. thanks.
18:26:12 <nirik> anything else application wise? or shall I move on?
18:26:27 <nirik> #topic Sysadmin status / discussion
18:26:35 <nirik> so, we did a mass reboot earlier this week.
18:26:47 <nirik> there's a few stragglers we still need to do, but mostly everything is done.
18:26:59 <nirik> #info mass reboot earlier this week. Most things are rebooted/updated
18:27:18 <nirik> I migrated db04 (rhel6) to db-koji01 (rhel7)
18:27:47 <nirik> I killed db04 and keys01. We now have 0 guests at telia01
18:28:04 <smooge> <<plays taps>>
18:28:10 <pingou> I need to check what specific settings we are putting on the postgresql.conf file we install with the postgresql_server role, as this seems to be closely related to the memory of the host
18:28:20 <nirik> I'm also going to retire mirrorlist-serverbeach, because it keeps having problems keeping up.
18:28:26 <pingou> by telia01
18:28:31 <pingou> bye*
18:28:45 <nirik> pingou: yeah, I think we should be more dynamic on it.
18:29:03 <nirik> pingou: also, I think we just copied our old postgresql.conf in... we should get the rhel7 one and adjust it.
18:29:09 <pingou> make it a template and use host_vars or so
18:29:12 <nirik> because we might be missing other things the 7 one can do better
18:29:18 <pingou> also true
18:29:27 <nirik> if you want to work on that that would be great. ;)
18:29:33 <pingou> anitya-backend should have the default el7 one
18:29:41 <pingou> I can at least provide a diff :)
18:29:59 <nirik> ok
18:30:15 <nirik> #info kernel01/02 now have a bunch more memory. Thanks smooge for getting that installed.
18:30:33 <nirik> #info db04 (rhel6) migrated to db-koji01 (rhel7)
18:30:39 <smooge> well thanks jesse. all I did was watch from afar :)
18:30:43 <nirik> #info 0 guests left at telia01
18:30:58 <pingou> nirik: how many guests left in puppet?
18:31:35 <nirik> 78
18:31:45 <pingou> k
18:31:48 <nirik> I'm going to try and move some more next week...
18:32:06 <nirik> a number are virthosts, so we will have to move all the guests, update, move back
18:32:23 <pingou> that'll be fun
18:32:34 <nirik> and of course we still have proxies to convert. thats 8 there
18:33:03 <nirik> but I am going to try and get there before the end of the year if we can.
18:33:28 <nirik> #info 78 hosts left in puppet
18:33:44 <nirik> #topic nagios/alerts recap
18:33:56 * nirik digs up url. where's puiterwijk with it handy when you need him. ;)
18:35:21 <nirik> .tiny https://admin.fedoraproject.org/nagios/cgi-bin//summary.cgi?report=1&displaytype=3&timeperiod=last7days&smon=10&sday=1&syear=2014&shour=0&smin=0&ssec=0&emon=10&eday=2&eyear=2014&ehour=24&emin=0&esec=0&hostgroup=all&servicegroup=all&host=all&alerttypes=3&statetypes=3&hoststates=7&servicestates=120&limit=25
18:35:23 <zodbot> nirik: http://tinyurl.com/pcpaya7
18:35:39 <nirik> download01.mgmt was fixed, was a IMM going wonky
18:36:08 <nirik> the datagrepper I think was the datagrepper db migration that threebean did
18:36:27 <nirik> the collab03 mail queue is due to the way that check is written. We should fix it.
18:36:47 <nirik> it looks for anything more than 3 or 4 emails in queue, but thats our list server, so sometimes it has a bunch in there it's sending out.
18:36:57 <smooge> I rebuilt bvirthost10 on the UCS cisco. much swearing and pain
18:37:02 <smooge> oops sorry
18:37:05 <nirik> thanks for that smooge
18:37:26 <nirik> mirrorlist-serverbeach is going away, so that should disappear.
18:38:01 <smooge> <<taps on lone bugle>>
18:38:15 <nirik> not sure about the bodhi02 ones... lmacken: any errors you have seen from it lately?
18:38:26 <nirik> might have been the koji outage for the database move?
18:39:29 <nirik> anything else sysadminy?
18:40:02 <nirik> #topic Upcoming Tasks/Items
18:40:03 <nirik> https://apps.fedoraproject.org/calendar/list/infrastructure/
18:40:11 <nirik> anything upcoming anyone would like to note or schedule?
18:40:33 <nirik> I'll note we go into Beta freeze 2014-10-14
18:40:45 <nirik> so do try and get anything done before then thats big/disruptive.
18:41:05 <pingou> 2 weeks from now
18:41:08 <nirik> yep.
18:41:09 <pingou> that'll be there soon
18:41:23 <nirik> I'm going to try and get some more rhel7/ansible migrations done, but we will see
18:41:42 <nirik> might do an outage and move db01 over.
18:41:47 <nirik> will have to look.
18:41:58 <nirik> #topic Open Floor
18:42:15 <nirik> anyone have items for open floor? ideas, comments, cookie recipes?
18:42:17 <smooge> I have two cloud servers to rebuild today.
18:42:31 <smooge> and I will be out tomorrow after the EPEL meeting
18:42:55 <nirik> smooge: cool. Let me know when 09 is done and I can play with running the setup on it.
18:43:31 <smooge> will do so
18:43:47 <smooge> next week I will be focusing on I2 and budgets
18:43:48 <nirik> If we can get that all working soon we can look at cloud migrating. ;)
18:43:56 <smooge> or cloud
18:44:09 <lmacken> nirik: no I haven't seen any errors from bodhi02 aside from the koji outage.
18:44:31 <nirik> I think thats mostly just a matter of telling everyone: you have until xxxx-xx-xx to terminate your instance(s) and bring them up in the new cloud.
18:44:43 <nirik> and us doing that on the persistent ones we care about.
18:44:54 <nirik> oh, we still need to get storage working tho.
18:44:58 <nirik> lmacken: ok
18:45:09 <nirik> lmacken: just odd that bodhi02 was in the nagios alerts, but not 01...
18:45:32 <lmacken> nirik: weird, I'll take a look
18:46:11 <nirik> thanks.
18:46:18 <nirik> ok, if nothing else, will close out in a minute or so...
18:47:41 <nirik> Thanks for coming everyone! Lets all continue in #fedora-admin, #fedora-apps and #fedora-noc.
18:47:43 <nirik> #endmeeting
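
Note on the collab03 mail queue alert (18:36): nirik says the check fires on anything over 3 or 4 queued emails, which is too low for a list server that routinely holds a batch while sending. One possible fix is to choose thresholds per host role. The sketch below is only an illustration, not the check that is actually deployed: the function name, the role names, and every threshold number are assumptions. Only the 0/1/2 exit codes follow the usual Nagios plugin convention.

```python
# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL = 0, 1, 2

# Hypothetical (warn, crit) thresholds per host role.
# The numbers are illustrative only; they do not come from the meeting.
THRESHOLDS = {
    "default": (5, 20),
    "listserver": (200, 1000),
}


def check_mail_queue(queued, role="default"):
    """Return (exit_code, status_line) for a given mail queue length."""
    # Unknown roles fall back to the default thresholds.
    warn, crit = THRESHOLDS.get(role, THRESHOLDS["default"])
    if queued >= crit:
        return CRITICAL, f"MAILQ CRITICAL - {queued} queued (crit {crit})"
    if queued >= warn:
        return WARNING, f"MAILQ WARNING - {queued} queued (warn {warn})"
    return OK, f"MAILQ OK - {queued} queued"
```

With these assumed numbers, a queue of 10 messages would warn on an ordinary host but stay OK on a host tagged "listserver", which is the behaviour the 18:36 discussion asks for.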