15:00:24 #startmeeting Infrastructure (2019-09-19)
15:00:24 Meeting started Thu Sep 19 15:00:24 2019 UTC.
15:00:24 This meeting is logged and archived in a public location.
15:00:24 The chair is bowlofeggs. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:24 Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:00:24 The meeting name has been set to 'infrastructure_(2019-09-19)'
15:00:26 #meetingname infrastructure
15:00:26 The meeting name has been set to 'infrastructure'
15:00:28 #topic aloha
15:00:30 #chair nirik pingou puiterwijk relrod smooge tflink cverna mizdebsk mkonecny abompard bowlofeggs
15:00:30 Current chairs: abompard bowlofeggs cverna mizdebsk mkonecny nirik pingou puiterwijk relrod smooge tflink
15:00:36 morning
15:01:45 hello
15:01:50 .hello2
15:01:51 three is a party
15:01:51 dustymabe: dustymabe 'Dusty Mabe'
15:02:00 4 is a ??
15:02:05 bigger party
15:02:06 crowd
15:02:07 hopefully not the donner party
15:02:23 Hi
15:02:24 party of 5
15:02:27 party of 4
15:02:34 #topic New folks introductions
15:02:36 #info This is a place where people who are interested in Fedora Infrastructure can introduce themselves
15:02:38 klump party of 5 please
15:02:38 #info Getting Started Guide: https://fedoraproject.org/wiki/Infrastructure/GettingStarted
15:03:03 any new folks around who want to say "hi", or "bonjour", or "hola"?
15:03:20 ni hao you doin'
15:03:41 * bowlofeggs only knows hello in like 3 languages
15:04:08 * marcdeop is here
15:04:09 * relrod here, sorry was grabbing late breakfast (early lunch?)
15:04:54 bunch? lunfast?
15:05:12 brunch - which is actually a thing
15:05:48 maybe no new folks today ☹
15:05:49 elevensies
15:05:59 #topic announcements and information
15:06:01 #info We are looking for people to maintain Fedocal and Nuancier - mkonecny
15:06:17 bowlofeggs: How can you forget "dobry den"?
15:06:21 but if there are no new folks, maybe that announcement is not for this audience ☺
15:06:40 mboddu: heh, well i'm not sure i ever knew that one
15:06:42 They might read it in minutes later? we can hope
15:06:51 oh that's true
15:06:57 zodbot: tell 'em
15:07:00 #info Fedora 31 beta is released, go get it now.
15:07:00 bowlofeggs: (tell <nick> <text>) -- Tells the <nick> whatever <text> is. Use nested commands to your benefit here.
15:07:06 haha what
15:07:10 i didn't know that was a command
15:07:12 that's great
15:07:18 zodbot: tell nirik this is neat
15:07:19 #info Koschei has been ported from fedmsg to fedora-messaging (staging) - mizdebsk
15:07:20 bowlofeggs: Kneel before zod!
15:07:28 mizdebsk: nice!
15:07:35 #info Beta freeze is over,
15:08:06 * bowlofeggs hangs his winter coat on the rack but keeps it close for the coming release freeze
15:08:08 bowlofeggs: FYI, tell is a private message. it just sent me "bowlofeggs wants me to tell you: this is neat"
15:08:17 haha
15:08:22 that's kinda silly but i love it
15:08:52 #info our s3 mirror now syncs releases/test content (and our isos, though that was done a while ago)
15:09:10 zodbot: tell bowlofeggs "dobry den" is hello in Czech :D
15:09:10 mboddu: Kneel before zod!
15:09:52 nice work @relrod
15:09:53 alright, moving along
15:10:07 #topic Oncall
15:10:09 #info https://fedoraproject.org/wiki/Infrastructure/Oncall
15:10:11 #info Summary of last week: (from relrod )
15:10:40 It's actually been mostly quiet, probably due to freeze... let me think
15:12:27 I can't think of anything major that's happened. I acked some fedmsg alerts... whatcanidoforfedora.org's ssl is expiring soon and spammed us
15:12:36 cool
15:12:38 quiet is good
15:13:11 #info ???? is oncall from 2019-09-19 — 2019-09-26
15:13:14 i'm willing to take that one
15:13:19 haven't been oncall in months
15:13:21 retrace server hung up and had to be rebooted
15:13:29 ah yeah that
15:13:49 #info ???? is oncall from 2019-09-26 — 2019-10-03
15:14:00 bowlofeggs: ok. Note that I was planning to keep it the rest of today because I had to trade off with smooge the second half of Tuesday
15:14:08 #info ???? is oncall from 2019-10-03 — 2019-10-10
15:14:11 so I was going to make up the time by keeping it today
15:14:31 .takeoncallus
15:14:34 relrod: sure no objections from me, though that will help me and not smooge ☺
15:14:39 .oncalltakeus
15:14:39 smooge: Kneel before zod!
15:14:59 anybody want either of those weeks i listed?
15:15:02 oh wait..
15:15:08 sorry.. bowlofeggs is taking it.
15:15:14 bowlofeggs, put me down for next week
15:15:17 .oncalltakeus
15:15:17 bowlofeggs: Kneel before zod!
15:15:26 * nirik calls bowlofeggs and asks "Is your refrigerator running?"
15:15:38 #info bowlofeggs is oncall from 2019-09-19 — 2019-09-26
15:15:39 no
15:15:51 #info smooge is oncall from 2019-09-26 — 2019-10-03
15:16:00 any takers for oct 3-10?
15:16:19 we can force jcline to do it…
15:16:25 +1
15:16:51 bowlofeggs: you just want to force someone to do something because we forced you to take the meeting this week ;)
15:17:07 hahaha
15:17:08 did someone write the lottery program
15:17:09 true
15:17:15 well i guess ???? is the winner then
15:17:17 moving along
15:17:19 #topic Monitoring discussion
15:17:21 #info https://nagios.fedoraproject.org/nagios
15:17:23 #info Go over existing out items and fix
15:18:03 * nirik looks
15:18:15 there are some red things
15:18:19 and some orange things
15:18:34 the koschei ones are likely because mizdebsk was moving to openshift?
15:18:36 koschei stg alerts are because of move to openshift - nagios playbook run should clear them
15:18:51 but we were in freeze until recently, so i didn't run it
15:19:26 cool
15:19:34 the osbs ones I think I can fix...
15:19:35 should we file a ticket for the whatcanidoforfedora cert?
15:19:35 proxy playbook should maybe probably fix the whatcanidoforfedora stuff, and there's an open easyfix to have someone make just one box check that cert (because if it's okay on one proxy, it's almost certainly okay on the others)
15:19:53 I am running a master playbook run now.
15:19:56 it should clear that
15:20:00 cool
15:20:03 sweet
15:20:31 2 machines have drives out
15:20:45 autocloud-backend-libvirt2.phx2.fedoraproject.org and qa09
15:20:49 yeah.. and one system has a broken fan
15:20:56 it is on my list
15:21:15 cool.
15:21:38 anything else to discuss on this topic? would it be helpful to file tickets for a few of these things?
15:21:42 and we still need to make fas send fedmsgs again... (there's a ticket on it)
15:21:58 I think all of them that need tickets have them...
15:22:02 cool
15:22:06 what about the badges-backend fedmsg stuff
15:22:10 shall we move along?
15:22:12 I've just been acking them
15:22:18 * nirik notes also our ticket count has spiked of late. Need to start actually solving/closing some. ;)
15:22:36 * bowlofeggs files a ticket about how we need to fix some tickets so we remember to fix tickets
15:22:46 oh there was one thing.. someone was on #fedora-admin complaining about 10,000 emails from notifications this morning
15:22:47 relrod: well, thats a good question... I guess we are kinda waiting to see if people are taking it over or not?
15:23:06 yeah, they filed a ticket, someone should investigate. ;)
15:23:27 there are some people who have agreed to take it over, but it sounds like they are going to rewrite it from scratch
15:23:53 which is good, but i'm concerned that will take a long time and i think we should not host the existing badges for a long time
15:23:56 hum... and we are going to keep running this one until they have a running app?
15:24:00 yeah
15:24:04 yeah that's my concern
15:24:27 i laugh, i cry, i hit my head against the table until the pain stops
15:24:36 it would be more ideal if we could get someone to take over the existing badges in parallel to them writing a replacement (doesn't have to be the same people)
15:25:03 agreed
15:25:06 I think if we can just get someone(s) to move it to communishift it would help a lot.
15:25:13 agreed
15:25:24 but... perhaps we should just remove the nagios checks for now... since they are causing us a lot of work.
15:25:35 but then it will be broken a lot
15:25:36 retiring apps is (one of) my new project(s)
15:25:53 and this one is high on my list since it alerts a lot in monitoring
15:26:10 or... remove monitoring and add a 'restart hourly' cron job?
15:26:30 as a side note: i'd like to prioritize app retirements that we spend the most time on, so i may ask for input from you all on what apps steal the most of your time
15:26:33 nirik, so currently we aren't fixing it when the errors occur. we are just acking
15:26:34 or perhaps this is a good topic for the list (to get more input)
15:26:38 nirik: lol
15:26:52 so I think the nagios checks going away would keep it as broken as current
15:26:57 smooge: I have been going to it and wiping its memory of where it is and restarting it.
15:27:08 yeah I was doing that every now and then
15:27:13 so I think that is what we do
15:27:13 ssh badges-backend01, up arrow, exit
15:27:29 nirik: yeah I will follow up on this. i'm +1 to rewriting badges, but i'm -1 to us hosting badges until that's ready. i'll find a way to get it off our hands ☺
15:27:32 we set a cron job which does that and remove the nagios
15:27:54 smooge: yeah, thats what I was thinking... " remove monitoring and add a 'restart hourly' cron job?"
15:28:02 agreed
15:28:19 nirik: haha i love when you have a ssh up arrow thing - my music server fails to start httpd on boot for some reason, so i have to do that every now and then on it
15:28:53 sudo -i ssh-up-arrow needs to be a command
15:29:01 hahah
15:29:12 cool, anything else to discuss on monitoring, or shall we proceed?
15:30:07 smooge, ansible -m shell -a '$(tail -1 ~/.bash_history)' hostspec
15:30:15 smooge: $(tail -n 1 .bash_history)
15:30:15 yeah
15:30:49 move on is fine
15:31:31 #topic Tickets discussion
15:31:33 #info https://pagure.io/fedora-infrastructure/report/Meetings%20ticket
15:31:39 no please move on. I am just here to derail
15:31:47 just one from mizdebsk https://pagure.io/fedora-infrastructure/issue/8210
15:32:02 so, due to removal of python2 from Fedora, fas-clients is no longer installable on f31+
15:32:09 yep
15:32:18 what is our plan? how should we address this issue? i have mentioned a couple of ideas in the ticket
15:32:34 so, I looked at this a bit the other day
15:32:38 for now this problem affects f31-test only (that i know of), but this may become bigger problem after time as more systems try to move to f31
15:32:41 i'm not familiar with fas-clients - what is it needed for?
15:32:55 i.e., what breaks if we ignore this? ☺
15:32:56 I think someone who knows python might be able to convert it to python3 without too much pain...
15:33:05 bowlofeggs, it creates user accounts on almost all of our machines
15:33:08 fas-clients is what sets up who is allowed on each system, gets their ssh-keys, updates aliases, etc
15:33:09 syncs ssh keys from fas etc.
15:33:13 oh ok
15:33:15 so that's important
15:33:42 I ran 2to3 on it and got most of the way, but then it uses pickles...
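The log does not say which pickle problem came up after running 2to3. One common stumbling block, shown here as a hedged sketch and not as a diagnosis of fasClient, is that Python 3 decodes Python 2 `str` objects found inside old pickles as ASCII by default. The byte string below is a hand-written protocol 2 pickle used purely for illustration, and `load_py2_pickle` is a hypothetical helper name.

```python
import pickle

# Hand-written protocol 2 pickle of the Python 2 str '\xc3\xa9'
# (the UTF-8 bytes for "é"). Illustrative data, not taken from FAS.
PY2_PICKLE = b"\x80\x02U\x02\xc3\xa9q\x00."


def load_py2_pickle(data, encoding="ASCII"):
    """Unpickle Python 2 data, choosing how old str objects are decoded."""
    return pickle.loads(data, encoding=encoding)


# With the default ASCII decoding, non-ASCII Python 2 strings fail to load.
try:
    load_py2_pickle(PY2_PICKLE)
    default_ok = True
except UnicodeDecodeError:
    default_ok = False

# encoding="bytes" keeps the raw bytes; a real encoding name decodes them.
as_bytes = load_py2_pickle(PY2_PICKLE, encoding="bytes")
as_text = load_py2_pickle(PY2_PICKLE, encoding="utf-8")
```

Keeping the data as bytes and decoding it explicitly later is the safer choice when the original encoding is not known for certain.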
15:33:43 * marcdeop thinks that this can become nasty
15:33:47 and so we can't move our hosts to F31 until we fix this
15:33:48 it is also code originally written for python1
15:34:08 pickles often raise both of my eyebrows
15:34:10 do we know where exactly the code is hosted?
15:34:15 yes:
15:34:53 https://github.com/fedora-infra/fas
15:35:03 note: this is just the client script/one file
15:35:14 porting the rest to python3 is... not advised.
15:35:31 but if you do.. please get on the liver transplant list right away
15:35:54 nirik: which file is the client script?
15:36:35 https://github.com/fedora-infra/fas/blob/master/client/fasClient
15:37:36 odd, it seems that develop is the default branch and doesn't have a client folder
15:38:11 right, thats the 3.0 version
15:38:13 I think that is because that is fas3
15:38:29 sorry
15:38:31 develop is fas3? and master is fas2?
15:38:48 yes
15:39:03 well that's confusing hahaha
15:39:07 yes
15:39:11 ok so we need someone to port it
15:39:13 any takers?
15:39:19 if I knew python I could give it a try... however 940 lines of code doesn't sound too terrible to migrate to python3, does it?
15:39:26 please remember that it was also a rcs->cvs->git conversion
15:39:48 marcdeop, it is the fact that it uses certain concepts which are not happy in Py3
15:39:50 I got pretty far... if someone can help me with the last failures that would be cool.
15:39:50 the issue is I think this also involves porting python-fedora modules to python3, right?
15:40:14 i think python-fedora works in python3, or at least the bits that bodhi uses are
15:40:20 not sure all of it is or not
15:40:25 ah ok
15:40:28 and kitchen
15:40:40 bowlofeggs loves kitchen
15:40:49 kitchen should be removed
15:40:51 seriously
15:40:56 it can only do harm
15:41:00 I should go to the kitchen and get some more coffee.
15:41:04 like it really does do the wrong thing
15:41:08 hahaha
15:41:33 i removed it from bodhi and it didn't take too long (maybe an hour or two)
15:42:01 replaced everything with stdlib calls that were easier to understand for python programmers, and more importantly, didn't corrupt data
15:42:15 so anyway
15:42:34 you have to convert a lot of things and also look at the logic
15:42:42 yeah
15:42:48 and it very likely does not have tests
15:43:11 i cannot volunteer for this as i am not familiar enough with the system and my python knowledge is almost none existing... 😥
15:43:13 which means it will be difficult to know if the changes are correct (unless i'm wrong and it does have tests? so much of our code is not tested, so i'm assuming it doesn't…)
15:43:21 hahahaaha
15:43:30 no it does not
15:43:40 yeah i see no tests at all
15:43:44 that makes this 10x harder
15:43:54 because now you are just guessing that the changes are ok
15:44:06 it is the lowest level of our infrastructure and only gets time when everything that is above is working
15:44:14 my current failure is:
15:44:18 which is the opposite of what makes sense…
15:44:20 haha
15:44:29 i want the lowest level stuff to be the most tested
15:44:33 https://paste.centos.org/view/1314d653
15:45:31 nirik: ah ok - so a string is an encoding aware list of bytes. a bytes is just a list of bytes (could be a string if you know the encoding, but could also just be binary data of any sort)
15:46:01 you can turn str's into byte's and vice versa by .encode('utf8') and .decode('utf8'), respectively
15:46:13 assuming you know the encoding, and that it is utf-8
15:46:20 (this is the error that kitchen makes)
15:46:30 I am guessing ssh_dir is a string and 'authorized_keys' is bytes
15:46:40 if you don't know the encoding, well, you really shouldn't be doing operations like that
15:47:01 indeed.
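The str/bytes point above can be tried in a few lines of Python 3. This is a minimal sketch and not the fasClient code: the paths, the user name, and the `join_consistently` helper are hypothetical, and it assumes UTF-8 throughout, which the discussion flags as something to know rather than guess.

```python
import os


def join_consistently(base, name, encoding="utf-8"):
    """Join two path parts after coercing both to str (hypothetical helper)."""
    if isinstance(base, bytes):
        base = base.decode(encoding)
    if isinstance(name, bytes):
        name = name.decode(encoding)
    return os.path.join(base, name)


# Round trip between str and bytes with a known encoding.
text = "authorized_keys"
raw = text.encode("utf8")
round_trip = raw.decode("utf8")

# A bytes directory joined with a str file name raises TypeError in Python 3.
ssh_dir = os.path.join("/home", "example_user", ".ssh").encode("utf8")
try:
    os.path.join(ssh_dir, "authorized_keys")
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Coercing both parts to one type first avoids the error.
keys_path = join_consistently(ssh_dir, "authorized_keys")
```

Picking one type for every path component, and converting only at the edges where data enters or leaves the program, tends to be easier to reason about than converting at each call site.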
15:47:03 nope
15:47:06 ssh_dir = to_bytes(os.path.join(home_dir_base, username, '.ssh'))
15:47:50 i can't volunteer to do the port, but i'd be willing to answer questions for the person doing it ☺
15:47:55 shall we move along?
15:47:58 only 13 min left
15:48:09 i will take you up on that
15:48:13 move along
15:48:23 so the conclusion is to wait more for someone to do the python3 port
15:48:43 yeah, I think so. we are also looking at replacements again... but thats not going to be quick
15:49:11 #topic Fedora CoreOS related tickets
15:49:13 #info https://github.com/coreos/fedora-coreos-tracker/blob/master/Fedora-Requests.md#existing-requests-for-fedora-infra
15:49:47 * dustymabe waves
15:49:55 let's talk about https://pagure.io/fedora-infrastructure/issue/8142 first
15:50:18 dustymabe: it sounds like the ticket is waiting on you
15:50:21 sure yep. we're unblocked on the immediate issue
15:50:43 bowlofeggs: oops - i'll respond in ticket
15:50:59 ok, so let's move on to https://pagure.io/fedora-infrastructure/issue/8218
15:51:04 there is a short term "get us working" and a longer term "how do we work together in the future"
15:51:21 i think nirik is still working on the longer term
15:51:25 there is no sync script
15:51:30 nirik: any luck with tagging resources?
15:51:31 yeah it's manual only
15:51:38 * bowlofeggs just wrote an SOP for it, in fact ☺
15:52:07 we could use automation here, but not sure who has the knowledge/time to do it
15:52:11 hmm
15:52:13 dustymabe: nope. I have some questions in to people, but no answers back yet
15:52:17 supposedly we could write an openshift operator to do it
15:52:18 i was still talking about #8142
15:52:43 nirik: +1 - who did you ask? RH people or community people?
15:52:46 it may just not be possible, which is sad, but...
15:53:26 our aws contact... patrick, I'm happy to ask someone else if you think they will know a way to do what I want
15:53:54 ok cool. so auto resource tagging might be hard to achieve.. if we can't achieve that maybe we can pursue some other solution
15:54:11 nirik: david duncan might have some ideas
15:54:14 he works for AWS
15:54:22 might be worth setting up some time to chat with him
15:54:24 i'm gonna switch to open floor in 45 seconds, so we at least get 5 min for that
15:54:27 yes, he is who I meant when I said "our aws contact"
15:54:38 I have been talking to him, yes
15:54:38 bowlofeggs: earlier were you talking about 8218?
15:54:39 (skipping the docs topic for meeting notes due to time)
15:54:46 dustymabe: yes
15:54:47 nirik +1
15:54:57 ok bowlofeggs you've been working on groups setup in communishift ?
15:55:00 #topic open floor
15:55:22 dustymabe: i just documented how we do it. we don't have a way to automate it, nor do we currently have plans or resources to do so
15:55:40 bowlofeggs: ok. if you don't mind take a look at my last comment in that ticket and respond
15:56:04 it's a proposed solution halfway between automation and manual
15:56:13 * dustymabe quiet now
15:56:46 anything for open floor?
15:57:04 i thought board.net worked ok for today's meeting
15:57:17 It has some warts, but it was okish.
15:57:20 no ToS to sign which i like
15:58:49 closing in 1 min
15:59:46 oh, should we try and find next weeks chair?
15:59:48 #endmeeting