15:00:24 <bowlofeggs> #startmeeting Infrastructure (2019-09-19)
15:00:24 <zodbot> Meeting started Thu Sep 19 15:00:24 2019 UTC.
15:00:24 <zodbot> This meeting is logged and archived in a public location.
15:00:24 <zodbot> The chair is bowlofeggs. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:24 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:00:24 <zodbot> The meeting name has been set to 'infrastructure_(2019-09-19)'
15:00:26 <bowlofeggs> #meetingname infrastructure
15:00:26 <zodbot> The meeting name has been set to 'infrastructure'
15:00:28 <bowlofeggs> #topic aloha
15:00:30 <bowlofeggs> #chair nirik pingou puiterwijk relrod smooge tflink cverna mizdebsk mkonecny abompard bowlofeggs
15:00:30 <zodbot> Current chairs: abompard bowlofeggs cverna mizdebsk mkonecny nirik pingou puiterwijk relrod smooge tflink
15:00:36 <nirik> morning
15:01:45 <mizdebsk> hello
15:01:50 <dustymabe> .hello2
15:01:51 <bowlofeggs> three is a party
15:01:51 <zodbot> dustymabe: dustymabe 'Dusty Mabe' <dusty@dustymabe.com>
15:02:00 <dustymabe> 4 is a ??
15:02:05 <bowlofeggs> bigger party
15:02:06 <dustymabe> crowd
15:02:07 <nirik> hopefully not the donner party
15:02:23 <marc84> Hi
15:02:24 <smooge> party of 5
15:02:27 <smooge> party of 4
15:02:34 <bowlofeggs> #topic New folks introductions
15:02:36 <bowlofeggs> #info This is a place where people who are interested in Fedora Infrastructure can introduce themselves
15:02:38 <dustymabe> klump party of 5 please
15:02:38 <bowlofeggs> #info Getting Started Guide: https://fedoraproject.org/wiki/Infrastructure/GettingStarted
15:03:03 <bowlofeggs> any new folks around who want to say "hi", or "bonjour", or "hola"?
15:03:20 <bowlofeggs> ni hao you doin'
15:03:41 * bowlofeggs only knows hello in like 3 languages
15:04:08 * marcdeop is here
15:04:09 * relrod here, sorry was grabbing late breakfast (early lunch?)
15:04:54 <nirik> bunch? lunfast?
15:05:12 <mboddu> brunch - which is actually a thing
15:05:48 <bowlofeggs> maybe no new folks today ☹
15:05:49 <bcotton> elevensies
15:05:59 <bowlofeggs> #topic announcements and information
15:06:01 <bowlofeggs> #info We are looking for people to maintain Fedocal and Nuancier - mkonecny
15:06:17 <mboddu> bowlofeggs: How can you forget "dobry den"?
15:06:21 <bowlofeggs> but if there are no new folks, maybe that announcement is not for this audience ☺
15:06:40 <bowlofeggs> mboddu: heh, well i'm not sure i ever knew that one
15:06:42 <nirik> They might read it in minutes later? we can hope
15:06:51 <bowlofeggs> oh that's true
15:06:57 <bowlofeggs> zodbot: tell 'em
15:07:00 <nirik> #info Fedora 31 beta is released, go get it now.
15:07:00 <zodbot> bowlofeggs: (tell <nick> <text>) -- Tells the <nick> whatever <text> is. Use nested commands to your benefit here.
15:07:06 <bowlofeggs> haha what
15:07:10 <bowlofeggs> i didn't know that was a command
15:07:12 <bowlofeggs> that's great
15:07:18 <bowlofeggs> zodbot: tell nirik this is neat
15:07:19 <mizdebsk> #info Koschei has been ported from fedmsg to fedora-messaging (staging) - mizdebsk
15:07:20 <zodbot> bowlofeggs: Kneel before zod!
15:07:28 <bowlofeggs> mizdebsk: nice!
15:07:35 <nirik> #info Beta freeze is over.
15:08:06 * bowlofeggs hangs his winter coat on the rack but keeps it close for the coming release freeze
15:08:08 <nirik> bowlofeggs: FYI, tell is a private message. it just sent me "bowlofeggs wants me to tell you: this is neat"
15:08:17 <bowlofeggs> haha
15:08:22 <bowlofeggs> that's kinda silly but i love it
15:08:52 <relrod> #info our s3 mirror now syncs releases/test content (and our isos, though that was done a while ago)
15:09:10 <mboddu> zodbot: tell bowlofeggs "dobry den" is hello in Czech :D
15:09:10 <zodbot> mboddu: Kneel before zod!
15:09:52 <dustymabe> nice work @relrod
15:09:53 <bowlofeggs> alright, moving along
15:10:07 <bowlofeggs> #topic Oncall
15:10:09 <bowlofeggs> #info https://fedoraproject.org/wiki/Infrastructure/Oncall
15:10:11 <bowlofeggs> #info Summary of last week: (from relrod )
15:10:40 <relrod> It's actually been mostly quiet, probably due to freeze... let me think
15:12:27 <relrod> I can't think of anything major that's happened. I acked some fedmsg alerts... whatcanidoforfedora.org's ssl is expiring soon and spammed us
15:12:36 <bowlofeggs> cool
15:12:38 <bowlofeggs> quiet is good
15:13:11 <bowlofeggs> #info ???? is oncall from 2019-09-19 — 2019-09-26
15:13:14 <bowlofeggs> i'm willing to take that one
15:13:19 <bowlofeggs> haven't been oncall in months
15:13:21 <mizdebsk> retrace server hung up and had to be rebooted
15:13:29 <relrod> ah yeah that
15:13:49 <bowlofeggs> #info ???? is oncall from 2019-09-26 — 2019-10-03
15:14:00 <relrod> bowlofeggs: ok. Note that I was planning to keep it the rest of today because I had to trade off with smooge the second half of Tuesday
15:14:08 <bowlofeggs> #info ???? is oncall from 2019-10-03 — 2019-10-10
15:14:11 <relrod> so I was going to make up the time by keeping it today
15:14:31 <smooge> .takeoncallus
15:14:34 <bowlofeggs> relrod: sure no objections from me, though that will help me and not smooge ☺
15:14:39 <smooge> .oncalltakeus
15:14:39 <zodbot> smooge: Kneel before zod!
15:14:59 <bowlofeggs> anybody want either of those weeks i listed?
15:15:02 <smooge> oh wait..
15:15:08 <smooge> sorry.. bowlofeggs is taking it.
15:15:14 <smooge> bowlofeggs, put me down for next week
15:15:17 <bowlofeggs> .oncalltakeus
15:15:17 <zodbot> bowlofeggs: Kneel before zod!
15:15:26 * nirik calls bowlofeggs and asks "Is your refrigerator running?"
15:15:38 <bowlofeggs> #info bowlofeggs is oncall from 2019-09-19 — 2019-09-26
15:15:39 <smooge> no
15:15:51 <bowlofeggs> #info smooge is oncall from 2019-09-26 — 2019-10-03
15:16:00 <bowlofeggs> any takers for oct 3-10?
15:16:19 <bowlofeggs> we can force jcline to do it…
15:16:25 <dustymabe> +1
15:16:51 <relrod> bowlofeggs: you just want to force someone to do something because we forced you to take the meeting this week ;)
15:17:07 <bowlofeggs> hahaha
15:17:08 <smooge> did someone write the lottery program
15:17:09 <bowlofeggs> true
15:17:15 <bowlofeggs> well i guess ???? is the winner then
15:17:17 <bowlofeggs> moving along
15:17:19 <bowlofeggs> #topic Monitoring discussion
15:17:21 <bowlofeggs> #info https://nagios.fedoraproject.org/nagios
15:17:23 <bowlofeggs> #info Go over existing out items and fix
15:18:03 * nirik looks
15:18:15 <bowlofeggs> there are some red things
15:18:19 <bowlofeggs> and some orange things
15:18:34 <nirik> the koschei ones are likely because mizdebsk was moving to openshift?
15:18:36 <mizdebsk> koschei stg alerts are because of move to openshift - nagios playbook run should clear them
15:18:51 <mizdebsk> but we were in freeze until recently, so i didn't run it
15:19:26 <bowlofeggs> cool
15:19:34 <nirik> the osbs ones I think I can fix...
15:19:35 <bowlofeggs> should we file a ticket for the whatcanidoforfedora cert?
15:19:35 <relrod> proxy playbook should maybe probably fix the whatcanidoforfedora stuff, and there's an open easyfix to have someone make just one box check that cert (because if it's okay on one proxy, it's almost certainly okay on the others)
15:19:53 <nirik> I am running a master playbook run now.
15:19:56 <nirik> it should clear that
15:20:00 <relrod> cool
15:20:03 <bowlofeggs> sweet
15:20:31 <nirik> 2 machines have drives out
15:20:45 <nirik> autocloud-backend-libvirt2.phx2.fedoraproject.org and qa09
15:20:49 <smooge> yeah.. and one system has a broken fan
15:20:56 <smooge> it is on my list
15:21:15 <nirik> cool.
15:21:38 <bowlofeggs> anything else to discuss on this topic? would it be helpful to file tickets for a few of these things?
15:21:42 <nirik> and we still need to make fas send fedmsgs again... (there's a ticket on it)
15:21:58 <nirik> I think all of them that need tickets have them...
15:22:02 <bowlofeggs> cool
15:22:06 <relrod> what about the badges-backend fedmsg stuff
15:22:10 <bowlofeggs> shall we move along?
15:22:12 <relrod> I've just been acking them
15:22:18 * nirik notes also our ticket count has spiked of late. Need to start actually solving/closing some. ;)
15:22:36 * bowlofeggs files a ticket about how we need to fix some tickets so we remember to fix tickets
15:22:46 <smooge> oh there was one thing.. someone was on #fedora-admin complaining about 10,000 emails from notifications this morning
15:22:47 <nirik> relrod: well, thats a good question... I guess we are kinda waiting to see if people are taking it over or not?
15:23:06 <nirik> yeah, they filed a ticket, someone should investigate. ;)
15:23:27 <bowlofeggs> there are some people who have agreed to take it over, but it sounds like they are going to rewrite it from scratch
15:23:53 <bowlofeggs> which is good, but i'm concerned that will take a long time and i think we should not host the existing badges for a long time
15:23:56 <nirik> hum... and we are going to keep running this one until they have a running app?
15:24:00 <nirik> yeah
15:24:04 <bowlofeggs> yeah that's my concern
15:24:27 <smooge> i laugh, i cry, i hit my head against the table until the pain stops
15:24:36 <bowlofeggs> it would be more ideal if we could get someone to take over the existing badges in parallel to them writing a replacement (doesn't have to be the same people)
15:25:03 <smooge> agreed
15:25:06 <nirik> I think if we can just get someone(s) to move it to communishift it would help a lot.
15:25:13 <bowlofeggs> agreed
15:25:24 <nirik> but... perhaps we should just remove the nagios checks for now... since they are causing us a lot of work.
15:25:35 <nirik> but then it will be broken a lot
15:25:36 <bowlofeggs> retiring apps is (one of) my new project(s)
15:25:53 <bowlofeggs> and this one is high on my list since it alerts a lot in monitoring
15:26:10 <nirik> or... remove monitoring and add a 'restart hourly' cron job?
15:26:30 <bowlofeggs> as a side note: i'd like to prioritize app retirements that we spend the most time on, so i may ask for input from you all on what apps steal the most of your time
15:26:33 <smooge> nirik, so currently we aren't fixing it when the errors occur. we are just acking
15:26:34 <nirik> or perhaps this is a good topic for the list (to get more input)
15:26:38 <bowlofeggs> nirik: lol
15:26:52 <smooge> so I think the nagios checks going away would keep it as broken as current
15:26:57 <nirik> smooge: I have been going to it and wiping its memory of where it is and restarting it.
15:27:08 <smooge> yeah I was doing that every now and then
15:27:13 <smooge> so I think that is what we do
15:27:13 <nirik> ssh badges-backend01, up arrow, exit
15:27:29 <bowlofeggs> nirik: yeah I will follow up on this. i'm +1 to rewriting badges, but i'm -1 to us hosting badges until that's ready. i'll find a way to get it off our hands ☺
15:27:32 <smooge> we set a cron job which does that and remove the nagios
15:27:54 <nirik> smooge: yeah, thats what I was thinking... " remove monitoring and add a 'restart hourly' cron job?"
15:28:02 <smooge> agreed
15:28:19 <bowlofeggs> nirik: haha i love when you have a ssh up arrow thing - my music server fails to start httpd on boot for some reason, so i have to do that every now and then on it
15:28:53 <smooge> sudo -i ssh-up-arrow needs to be a command
15:29:01 <bowlofeggs> hahah
15:29:12 <bowlofeggs> cool, anything else to discuss on monitoring, or shall we proceed?
15:30:07 <mizdebsk> smooge, ansible -m shell -a '$(tail -1 ~/.bash_history)' hostspec
15:30:15 <relrod> smooge: $(tail -n 1 .bash_history)
15:30:15 <relrod> yeah
15:30:49 <nirik> move on is fine
15:31:31 <bowlofeggs> #topic Tickets discussion
15:31:33 <bowlofeggs> #info https://pagure.io/fedora-infrastructure/report/Meetings%20ticket
15:31:39 <smooge> no please move on. I am just here to derail
15:31:47 <bowlofeggs> just one from mizdebsk https://pagure.io/fedora-infrastructure/issue/8210
15:32:02 <mizdebsk> so, due to removal of python2 from Fedora, fas-clients is no longer installable on f31+
15:32:09 <smooge> yep
15:32:18 <mizdebsk> what is our plan? how should we address this issue? i have mentioned a couple of ideas in the ticket
15:32:34 <nirik> so, I looked at this a bit the other day
15:32:38 <mizdebsk> for now this problem affects f31-test only (that i know of), but this may become bigger problem after time as more systems try to move to f31
15:32:41 <bowlofeggs> i'm not familiar with fas-clients - what is it needed for?
15:32:55 <bowlofeggs> i.e., what breaks if we ignore this? ☺
15:32:56 <nirik> I think someone who knows python might be able to convert it to python3 without too much pain...
15:33:05 <mizdebsk> bowlofeggs, it creates user accounts on almost all of our machines
15:33:08 <smooge> fas-clients is what sets up who is allowed on each system, gets their ssh-keys, updates aliases, etc
15:33:09 <mizdebsk> syncs ssh keys from fas etc.
15:33:13 <bowlofeggs> oh ok
15:33:15 <bowlofeggs> so that's important
15:33:42 <nirik> I ran 2to3 on it and got most of the way, but then it uses pickles...
15:33:43 * marcdeop thinks that this can become nasty
15:33:47 <bowlofeggs> and so we can't move our hosts to F31 until we fix this
15:33:48 <smooge> it is also code originally written for python1
15:34:08 <bowlofeggs> pickles often raise both of my eyebrows
15:34:10 <marcdeop> do we know where exactly the code is hosted?
15:34:15 <nirik> yes:
15:34:53 <nirik> https://github.com/fedora-infra/fas
15:35:03 <nirik> note: this is just the client script/one file
15:35:14 <nirik> porting the rest to python3 is... not advised.
15:35:31 <smooge> but if you do.. please get on the liver transplant list right away
15:35:54 <bowlofeggs> nirik: which file is the client script?
15:36:35 <nirik> https://github.com/fedora-infra/fas/blob/master/client/fasClient
15:37:36 <bowlofeggs> odd, it seems that develop is the default branch and doesn't have a client folder
15:38:11 <nirik> right, thats the 3.0 version
15:38:13 <smooge> I think that is because that is fas3
15:38:29 <smooge> sorry
15:38:31 <bowlofeggs> develop is fas3? and master is fas2?
15:38:48 <nirik> yes
15:39:03 <bowlofeggs> well that's confusing hahaha
15:39:07 <nirik> yes
15:39:11 <bowlofeggs> ok so we need someone to port it
15:39:13 <bowlofeggs> any takers?
15:39:19 <marcdeop> if I knew python I could give it a try... however 940 lines of code doesn't sound too terrible to migrate to python3, does it?
15:39:26 <smooge> please remember that it was also a rcs->cvs->git conversion
15:39:48 <smooge> marcdeop, it is the fact that it uses certain concepts which are not happy in Py3
15:39:50 <nirik> I got pretty far... if someone can help me with the last failures that would be cool.
15:39:50 <relrod> the issue is I think this also involves porting python-fedora modules to python3, right?
15:40:14 <bowlofeggs> i think python-fedora works in python3, or at least the bits that bodhi uses are
15:40:20 <bowlofeggs> not sure all of it is or not
15:40:25 <relrod> ah ok
15:40:28 <smooge> and kitchen
15:40:40 <smooge> bowlofeggs loves kitchen
15:40:49 <bowlofeggs> kitchen should be removed
15:40:51 <bowlofeggs> seriously
15:40:56 <bowlofeggs> it can only do harm
15:41:00 <nirik> I should go to the kitchen and get some more coffee.
15:41:04 <bowlofeggs> like it really does do the wrong thing
15:41:08 <bowlofeggs> hahaha
15:41:33 <bowlofeggs> i removed it from bodhi and it didn't take too long (maybe an hour or two)
15:42:01 <bowlofeggs> replaced everything with stdlib calls that were easier to understand for python programmers, and more importantly, didn't corrupt data
15:42:15 <smooge> so anyway
15:42:34 <smooge> you have to convert a lot of things and also look at the logic
15:42:42 <bowlofeggs> yeah
15:42:48 <bowlofeggs> and it very likely does not have tests
15:43:11 <marcdeop> i cannot volunteer for this as i am not familiar enough with the system and my python knowledge is almost nonexistent... 😥
15:43:13 <bowlofeggs> which means it will be difficult to know if the changes are correct (unless i'm wrong and it does have tests? so much of our code is not tested, so i'm assuming it doesn't…)
15:43:21 <smooge> hahahaaha
15:43:30 <smooge> no it does not
15:43:40 <bowlofeggs> yeah i see no tests at all
15:43:44 <bowlofeggs> that makes this 10x harder
15:43:54 <bowlofeggs> because now you are just guessing that the changes are ok
15:44:06 <smooge> it is the lowest level of our infrastructure and only gets time when everything that is above is working
15:44:14 <nirik> my current failure is:
15:44:18 <bowlofeggs> which is the opposite of what makes sense…
15:44:20 <bowlofeggs> haha
15:44:29 <bowlofeggs> i want the lowest level stuff to be the most tested
15:44:33 <nirik> https://paste.centos.org/view/1314d653
15:45:31 <bowlofeggs> nirik: ah ok - so a string is an encoding aware list of bytes. a bytes is just a list of bytes (could be a string if you know the encoding, but could also just be binary data of any sort)
15:46:01 <bowlofeggs> you can turn strs into bytes and vice versa with .encode('utf8') and .decode('utf8'), respectively
15:46:13 <bowlofeggs> assuming you know the encoding, and that it is utf-8
15:46:20 <bowlofeggs> (this is the error that kitchen makes)
15:46:30 <smooge> I am guessing ssh_dir is a string and 'authorized_keys' is bytes
15:46:40 <bowlofeggs> if you don't know the encoding, well, you really shouldn't be doing operations like that
15:47:01 <nirik> indeed.
15:47:03 <smooge> nope
15:47:06 <smooge> ssh_dir = to_bytes(os.path.join(home_dir_base, username, '.ssh'))
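The str/bytes mismatch discussed above can be illustrated with a minimal Python 3 sketch. The traceback in the linked paste is not reproduced in this log, so this is an assumption about the kind of failure involved, not the actual fasClient code. The names home_dir_base, username, and ssh_dir come from the line smooge quoted; the example values and the to_text helper are hypothetical, and the standard library's encode/decode stand in for kitchen's to_bytes. posixpath is used instead of os.path only so the result is the same on every platform.

```python
import posixpath


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return value as str, decoding bytes if needed."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Example values (assumptions, not taken from fasClient).
home_dir_base = "/home"
username = "example"

# Mirrors the quoted line: the joined path is converted to bytes.
ssh_dir = posixpath.join(home_dir_base, username, ".ssh").encode("utf-8")

# Python 3 refuses to join a bytes path with a str component.
try:
    posixpath.join(ssh_dir, "authorized_keys")
    mixed_join_failed = False
except TypeError:
    mixed_join_failed = True

# Decoding with a known encoding first keeps every component a str.
key_file = posixpath.join(to_text(ssh_dir), "authorized_keys")
print(mixed_join_failed, key_file)
```

Keeping paths as str throughout and encoding only at the point where bytes are actually required is one way to avoid this class of error; whether that suits fasClient would need checking against the real code.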
15:47:50 <bowlofeggs> i can't volunteer to do the port, but i'd be willing to answer questions for the person doing it ☺
15:47:55 <bowlofeggs> shall we move along?
15:47:58 <bowlofeggs> only 13 min left
15:48:09 <smooge> i will take you up on that
15:48:13 <smooge> move along
15:48:23 <mizdebsk> so the conclusion is to wait more for someone to do the python3 port
15:48:43 <nirik> yeah, I think so. we are also looking at replacements again... but thats not going to be quick
15:49:11 <bowlofeggs> #topic Fedora CoreOS related tickets
15:49:13 <bowlofeggs> #info https://github.com/coreos/fedora-coreos-tracker/blob/master/Fedora-Requests.md#existing-requests-for-fedora-infra
15:49:47 * dustymabe waves
15:49:55 <bowlofeggs> let's talk about https://pagure.io/fedora-infrastructure/issue/8142 first
15:50:18 <bowlofeggs> dustymabe: it sounds like the ticket is waiting on you
15:50:21 <dustymabe> sure yep. we're unblocked on the immediate issue
15:50:43 <dustymabe> bowlofeggs: oops - i'll respond in ticket
15:50:59 <bowlofeggs> ok, so let's move on to https://pagure.io/fedora-infrastructure/issue/8218
15:51:04 <dustymabe> there is a short term "get us working" and a longer term "how do we work together in the future"
15:51:21 <dustymabe> i think nirik is still working on the longer term
15:51:25 <nirik> there is no sync script
15:51:30 <dustymabe> nirik: any luck with tagging resources?
15:51:31 <bowlofeggs> yeah it's manual only
15:51:38 * bowlofeggs just wrote an SOP for it, in fact ☺
15:52:07 <bowlofeggs> we could use automation here, but not sure who has the knowledge/time to do it
15:52:11 <dustymabe> hmm
15:52:13 <nirik> dustymabe: nope. I have some questions in to people, but no answers back yet
15:52:17 <bowlofeggs> supposedly we could write an openshift operator to do it
15:52:18 <dustymabe> i was still talking about #8142
15:52:43 <dustymabe> nirik: +1 - who did you ask? RH people or community people?
15:52:46 <nirik> it may just not be possible, which is sad, but...
15:53:26 <nirik> our aws contact... patrick, I'm happy to ask someone else if you think they will know a way to do what I want
15:53:54 <dustymabe> ok cool. so auto resource tagging might be hard to achieve.. if we can't achieve that maybe we can pursue some other solution
15:54:11 <dustymabe> nirik: david duncan might have some ideas
15:54:14 <dustymabe> he works for AWS
15:54:22 <dustymabe> might be worth setting up some time to chat with him
15:54:24 <bowlofeggs> i'm gonna switch to open floor in 45 seconds, so we at least get 5 min for that
15:54:27 <nirik> yes, he is who I meant when I said "our aws contact"
15:54:38 <nirik> I have been talking to him, yes
15:54:38 <dustymabe> bowlofeggs: earlier were you talking about 8218?
15:54:39 <bowlofeggs> (skipping the docs topic for meeting notes due to time)
15:54:46 <bowlofeggs> dustymabe: yes
15:54:47 <dustymabe> nirik +1
15:54:57 <dustymabe> ok bowlofeggs you've been working on groups setup in communishift ?
15:55:00 <bowlofeggs> #topic open floor
15:55:22 <bowlofeggs> dustymabe: i just documented how we do it. we don't have a way to automate it, nor do we currently have plans or resources to do so
15:55:40 <dustymabe> bowlofeggs: ok. if you don't mind take a look at my last comment in that ticket and respond
15:56:04 <dustymabe> it's a proposed solution halfway between automation and manual
15:56:13 * dustymabe quiet now
15:56:46 <bowlofeggs> anything for open floor?
15:57:04 <bowlofeggs> i thought board.net worked ok for today's meeting
15:57:17 <nirik> It has some warts, but it was okish.
15:57:20 <bowlofeggs> no ToS to sign which i like
15:58:49 <bowlofeggs> closing in 1 min
15:59:46 <nirik> oh, should we try and find next weeks chair?
15:59:48 <bowlofeggs> #endmeeting