15:00:18 #startmeeting Infrastructure (2020-05-28)
15:00:18 Meeting started Thu May 28 15:00:18 2020 UTC.
15:00:18 This meeting is logged and archived in a public location.
15:00:18 The chair is siddharthvipul. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:18 Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:00:18 The meeting name has been set to 'infrastructure_(2020-05-28)'
15:00:19 #meetingname infrastructure
15:00:19 The meeting name has been set to 'infrastructure'
15:00:19 #chair nirik pingou smooge cverna mizdebsk mkonecny abompard siddharthvipul
15:00:19 #info Agenda is at: https://board.net/p/fedora-infra
15:00:19 #info About our team: https://docs.fedoraproject.org/en-US/cpe/
15:00:19 Current chairs: abompard cverna mizdebsk mkonecny nirik pingou siddharthvipul smooge
15:00:19 #topic aloha
15:00:35 morning.
15:00:47 nirik: Good morning :D
15:01:05 let's see who is around today
15:01:17 .hello mobrien
15:01:18 mobrien[m]: mobrien 'Mark O'Brien'
15:01:26 .hello siddharthvipul1
15:01:31 siddharthvipul: siddharthvipul1 'Vipul Siddharth'
15:01:56 .hello austinpowered
15:01:57 austinpowered: austinpowered 'T.C. Williams'
15:02:01 .hello zlopez
15:02:02 mkonecny: zlopez 'Michal Konečný'
15:02:09 my IRC time and system time are out of sync.. hmm. Will fix it post meeting
15:03:43 #topic New folks introductions
15:03:54 alright! anyone new today :)
15:04:11 #info This is a place where people who are interested in Fedora Infrastructure can introduce themselves
15:04:11 #info Getting Started Guide: https://fedoraproject.org/wiki/Infrastructure/GettingStarted
15:04:23 .hello2
15:04:24 pingou: pingou 'Pierre-YvesChibon'
15:05:21 looks like we just have experienced people today (and me) :)
15:05:42 #topic Next chair
15:05:42 #info magic eight ball says:
15:05:42 #info 2020-05-28 - siddharthvipul
15:05:42 #info 2020-06-04 - mkonecny
15:05:42 #info 2020-06-11 - siddharthvipul
15:05:53 Any Volunteers for 2020-06-18?
15:06:00 not me
15:07:04 smooge: haha, I was volunteering you but you saved yourself right on time :P
15:07:50 * nirik suspects he will still be fixing problems from the dc move then...
15:08:30 so if no one volunteers today, we can decide it in next meeting (since we are already running 2 weeks ahead and I can always volunteer in next one).. someone else might want to chair
15:08:39 is it alright?
15:08:44 +1
15:08:52 +1
15:08:55 sure
15:09:06 awesome, moving ahead then
15:09:19 #topic announcements and information
15:09:19 #info CPE Sustaining EU-hours team has standups on Tuesday and Thursday at 1400 UTC in #fedora-admin - please join
15:09:19 #info CPE Sustaining NA-hours team has a Monday through Friday 30 minute meeting going through tickets at 1800 UTC in #fedora-admin
15:09:19 #info Fedora Infrastructure will be moving in 2020-06 from its Phoenix Az datacenter to one near Herndon Va. A lot of planning will be involved on this. Please watch out for announcements on changes.
15:09:21 #info Fedora Communishift move has started but will take longer than expected. Current estimate for bringing back into production is TBD
15:09:31 anyone has anything else to announce?
15:10:07 we will wait for a couple of minutes for folks to fetch links (if they want to share something)
15:10:13 #info Cverna will be moving to CoreOS
15:10:27 #info the-new-hotness 0.13.1 is out, deployed to staging
15:10:58 #info please help test things in iad2, see doc in infrastructure list
15:12:04 congratulations cverna :)
15:12:25 moving ahead now
15:12:28 #topic Oncall
15:12:28 #info https://fedoraproject.org/wiki/Infrastructure/Oncall
15:12:43 #info mboddu was oncall 2020-05-21 -> 2020-05-28
15:12:49 any volunteer for oncall 2020-05-28 -> 2020-06-04? and for 2020-06-04 -> 2020-06-11?
15:12:53 NOT IT
15:13:08 I think I should take 06-04-06-11
15:13:18 or wait...
15:13:20 noted
15:13:28 nirik: waiting :)
15:13:28 misread. I thought that was after the move.
15:13:33 I'd take the one after that.
15:13:37 I can take this week
15:13:41 nirik: sure
15:13:44 cverna: thank you
15:13:44 For the move we might want to do something special
15:13:45 i figure I will be sort of on-call for the move weeks but not really
15:13:53 anyone after cverna?
15:14:04 like have oncall say we are moving, please file a ticket and we will get to it when we can.
15:14:35 .oncalltakeeu
15:14:35 cverna: Kneel before zod!
15:15:02 but I guess it could be nice to have someone tell people that and reassure them and not irq people trying to fix things.
15:15:04 #info cverna is oncall for 2020-05-28 -> 2020-06-04
15:15:44 since I don't see anyone here, I will take the week after cverna
15:15:56 mobrien do you want to shadow me during this week oncall duties ?
15:16:08 let's wait then
15:16:15 what if mobrien[m] is interested :)
15:16:20 I ll ping you when I get pinged and we can look at stuff together
15:16:27 we could always sign up mboddu, since he's not here. ;)
15:16:33 hehe
15:16:34 cverna: Ya thats sounds good. Thanks
15:16:37 cverna: +1, good idea
15:16:49 hahaha, fair.. he did skip one week when he couldn't do takeoncall :P
15:17:13 okay, so just to make it official, anyone taking the week after?
15:17:37 I am here
15:17:49 nirik: What are you throwing me at?
15:17:55 mboddu: will you like to take oncall duty for next week?
15:18:00 Sure
15:18:02 s/will/would
15:18:04 ha. :)
15:18:06 nirik: :P
15:18:11 #info mboddu is oncall for 2020-06-04 -> 2020-06-11
15:18:26 #info Summary of last week: (from current oncall )
15:18:39 mboddu: time to shine, go ahead :)
15:18:41 Nothing much other than the yesterday's fire
15:18:46 we could also throw tomas under that bus :)
15:19:12 mboddu: was docs fire related?
15:19:14 But I didn't do much yesterday, it was all nirik pingou and smooge
15:19:19 Yes
15:19:44 and cverna and abompard
15:19:47 mboddu: the goal of oncall is to identify if the ping is worth a ticket or interrupting someone
15:19:58 just a group of super amazing people *.*
15:20:00 * pingou didn't do much yesterday :(
15:20:25 Yeah, forgot about cverna and abompard
15:20:41 I guess I forgot cverna intentionally :P
15:20:43 one thing to remember is if everything is slow or broken it is most likely rabbitmq
15:20:46 :)
15:20:50 cverna: hahaha
15:21:10 it's the second time this year
15:21:18 I really wonder if we couldn't improve the koji fedora-messaging plugin for rabbitmq being down
15:21:20 honestly I have no idea how our rabbitmq ever worked
15:21:20 we may want/need to document howto fix that though :(
15:21:41 yeah, but note we are moving to the much newer version...
15:21:43 Seems like bodhi is still having issues with creating updates? https://status.fedoraproject.org/
15:21:52 nirik: short of it having itself a queueing mechanism, it'll be a little hard
15:21:58 nirik: There are issues with it?
15:22:27 pingou: well, it could timeout/error and continue gracefully. right now, it just hangs and koji builds get in all kinds of bad states.
15:22:39 like the build worked, but tagging it at the end failed.
15:22:50 or the build failed but the parent task is still running
15:22:54 mboddu I think it is better but I have not updated status yet
15:23:08 if only we had a dev environment where we could test koji's behavior more easily :-p
15:23:15 cverna: Ah okay, thanks
15:23:17 cverna: I think it's probably ok for now... as we continue to investigate?
15:23:39 pingou: if only we had less files. They cause all the problems
15:24:07 Yeah
15:24:35 nirik: not more than one open file at a time!
15:24:54 we should know that by now ^^
15:25:04 I hope we can setup a nicer staging after the move, but it turns out to be surprisingly hard to do. ;(
15:25:42 fire fire fire.. if anyone of you think I can help, let me know.. I am around on weekends as well (if it's needed)
15:26:09 siddharthvipul: watch out, we may call you on that!
15:26:09 nirik: seems this needs a longer decision and planning.. are we ready to move to next topic for now?
15:26:11 Same here, I am happy to help if needed even during weekends
15:26:17 pingou: I will be counting you on that :)
15:26:44 sure, we can move on. I appreciate help... we are going to likely need it the week of the 8th. ;)
15:27:00 nirik: oh, that's the week I am away
15:27:01 LOL
15:27:02 jk
15:27:16 #topic Monitoring discussion [nirik]
15:27:25 nirik: please do your thing :)
15:27:32 lets take a look here...
15:27:58 we have 3 hosts down. 2 of them are taskotron hosts that are no more... I guess we need to just run the noc playbook?
15:28:14 the other is a aarch64 box that had problems and I rebooted it and it never came back up. ;)
15:28:24 we can try power cycling it.
15:28:42 there is still one taskotron related PR pending for our ansible repo
15:28:52 siddharthvipul: "do your thing" reminded me of https://www.youtube.com/watch?v=ojhTu9aAa_Y
15:29:11 pingou: yeah, I think tflink may have already done that one via direct push, but I am not sure.
15:29:21 proxy05 seems out/low on disk space again...
15:29:43 pagure01 and torrent02 also
15:30:08 * mboddu will take a look at torrent02
15:30:16 But I did clean up for f33 release
15:30:19 mboddu: can you nuke the f30 images if they are still there?
15:30:20 Maybe I missed something
15:30:31 since we are after eol
15:30:42 nirik: Yes, I didn't clean up f30 yet and sure, I will take care of it
15:30:44 pingou: you can look at pagure01? is that just normal usage? or ?
15:31:24 proxy05 is at 87% versus 100%
15:31:25 thats about it, some low swap...
15:31:29 nirik: I'll check disk space on pagure01
15:31:44 #action mboddu to clean f30 images from torrents
15:31:44 we can probably move along unless anyone sees anything else.
15:31:45 currently docs takes up a lot of space
15:31:51 smooge: yeah. ;(
15:32:02 #action pingou to investigate disc space on pagure01
15:32:17 but hey we got that report that proxy05 had an open http port .. so maybe we can look elsewhere
15:32:36 smooge: I think that was one of the amazon ones...
15:33:00 but I could have misread. In any case it was a stupid report. :)
15:33:11 no that was how i read it also
15:33:42 "Your server is serving things!!!!!! OMG QTF BBQ"
15:34:00 :)
15:34:04 next topic?
15:34:20 yep
15:34:23 #topic Data-Center Move update
15:34:35 smooge: nirik, hope it's not *that* bad :)
15:34:49 hopefully not. ;)
15:34:56 so, lets see:
15:35:22 We have a few more bare metal machines to install/configure... power8/9 aarch64 and qa virthosts.
15:35:46 There's lots of various breakage I was going to fix yesterday, but got sucked into fires. ;(
15:36:14 my plan is to keep working on checking/fixing things today. Possibly get new x86_64 builders installed.
15:36:22 smooge is working on the baremetal installs.
15:36:42 Tomorrow I think we will do a mass reboot... and make sure things all come back up ok.
15:37:12 I also hope to finish working on the plan for the migration week.
15:37:17 what moves when, etc...
15:37:31 nirik: we don't have builders yet, right?
15:37:32 next week is testing testing, fixing anything we can see that we can fix.
15:37:53 then it's move week. ;) I'll buy some deathwish coffee next store trip. ;)
15:38:20 pingou: nope, but I have all I need now (they setup a iscsi volume for buildvm-01-32), so I just need to install some
15:38:41 we need to take the staging koji adjustment sql script and re-work it for the migration
15:39:00 if someone wants to take that on, that would be great (I can file a ticket)
15:39:50 migration week is gonna be fun. It's hard to split things up... so a few days are just going to be really long.
15:39:59 any questions?
15:40:00 nirik: is that the one we use to sync to stg?
15:40:10 if so I've adjusted it not long ago for the latest sync
15:40:29 but it may need some adjustments still as this is prod -> prod vs prod -> stg
15:40:35 pingou: yeah... but it might need some adjustments for iad2 instead of stg
15:40:38 yeah
15:40:51 I was just going to make all new builders.
15:41:39 but it adjusts things like krb for people, and it should adjust them to iad2 not stg
15:42:17 hopefully puiterwijk will have our iad2 ipa working soon.
15:43:10 hopefully it won't be too much adjustments
15:44:34 there are a lot of moving pieces it seems.. :) let's move to openfloor and see if someone has something to discuss :)
15:44:49 #topic Open Floor
15:45:19 https://pagure.io/fedora-infrastructure/issue/8960
15:45:34 * tflink will look at nagios again, thought he fixed it already :-/
15:46:12 tflink: ah, it's still in inventory/backups
15:46:25 * jednorozec would like to become infra padawan
15:46:41 welcome young padawan jednorozec
15:46:45 jednorozec: same here *nods head* same here
15:46:50 I cleaned up torrent02.fp.o
15:46:57 nirik: cool, I'll get it fixed when I do the HW machines
15:47:08 siddharthvipul: come on, you're already an old padawan
15:47:27 pingou: haha, let's say.. a "useful padawan"
15:47:34 or "more useful" :P
15:47:49 * mboddu is still a padawan too
15:48:21 mboddu: sigh, you all keep raising the bar and you will scare me off :P
15:48:23 siddharthvipul / jednorozec: sure! happy to add you, I thought we already had a long time ago? or do you mean more than the apprentice group?
15:48:40 * tflink will also deal with the rest of kparal's PR for the taskotron stuff - was just waiting for everything to be gone
15:48:48 nirik: I am not in apprentice program if it helps
15:49:22 but I would like to shadow someone and learn more (I did cverna once but that week was awfully quiet) (I am not sad about that)
15:49:40 s/apprentice program/apprentice group
15:50:22 nirik, not sure I may have a lot of rights to do stuff, but would really appreciate someone guiding me through some ticket.
15:50:23 siddharthvipul: your account is siddharthvipul right?
15:50:31 nirik: siddharthvipul1
15:50:50 apologies about that last 1.. sigh! I wish, I wish we could change it easily
15:51:30 nirik: thank youuu
15:51:37 added you both.
15:52:08 cool. Yeah, after dc move hopefully we can have time to start cross training...
15:52:19 awesome
15:52:23 nirik: looking forward to that :)
15:52:36 * nirik wants to give mobrien[m] more fun things to do too...
15:52:39 :)
15:53:02 Do we have any other topic/openfloorworthy discussions?
15:53:17 I will wait for a few more minutes and then close the meetings if there is nothing :)
15:53:22 nirik mobrien is looking forward to the fun :)
15:53:55 mobrien[m]: uh ohh.. it was a warning I guess :P
15:55:00 and if you all want to hang out in #fedora-noc I am sure we will have dc move stuff going on all the time now. ;)
15:55:14 2 minutes more before meeting ends.
15:55:26 maybe after a lots of training we can help enough to let nirik and smooge work only 12 hour days
15:55:35 * nirik plays 2 minutes to midnight
15:55:39 nirik: I am there but whenever I open it, it's a lot of logs and I get lost immediately
15:55:52 mobrien[m]: I look forward to it!
15:56:11 mobrien[m]: long shot but I can see that's possible in a couple of years
15:56:41 1 minute
15:56:58 until there is another dc move next year....
15:57:15 Thank you everyone for coming, it was a good meeting :)
15:57:23 #endmeeting