15:00:10 <mkonecny> #startmeeting Infrastructure (2020-02-06)
15:00:10 <zodbot> Meeting started Thu Feb  6 15:00:10 2020 UTC.
15:00:10 <zodbot> This meeting is logged and archived in a public location.
15:00:10 <zodbot> The chair is mkonecny. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:10 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:00:10 <zodbot> The meeting name has been set to 'infrastructure_(2020-02-06)'
15:00:10 <mkonecny> #meetingname infrastructure
15:00:10 <mkonecny> #chair nirik pingou relrod smooge tflink cverna mizdebsk mkonecny abompard bowlofeggs
15:00:10 <mkonecny> #info Agenda is at: https://board.net/p/fedora-infra
15:00:10 <mkonecny> #topic aloha
15:00:10 <zodbot> The meeting name has been set to 'infrastructure'
15:00:10 <zodbot> Current chairs: abompard bowlofeggs cverna mizdebsk mkonecny nirik pingou relrod smooge tflink
15:00:17 <mkonecny> .hello zlopez
15:00:18 <zodbot> mkonecny: zlopez 'Michal Konečný' <michal.konecny@packetseekers.eu>
15:00:36 <tflink> morning
15:00:39 <smooge> hello
15:00:45 <nirik> morning everyone
15:00:48 <bowlofeggs> .hello2
15:00:49 <zodbot> bowlofeggs: bowlofeggs 'Randy Barlow' <rbarlow@redhat.com>
15:01:15 <cverna> hello
15:01:23 <jrichardson> Hi all
15:01:26 <austinpowered> hello
15:01:35 <mkonecny> #topic Next chair
15:01:35 <mkonecny> #info magic eight ball says:
15:01:35 <mkonecny> #info  2020-02-06 - mkonecny
15:01:35 <mkonecny> #info 2020-02-13 - smooge
15:01:35 <mkonecny> #info 2020-02-20 - ???
15:01:50 <mkonecny> Ok, who wants to drive next meeting?
15:02:43 <mkonecny> The next next one
15:02:55 <bowlofeggs> i'll do 2 weeks from now
15:03:07 <bowlofeggs> the 20th
15:03:29 <mkonecny> #info 2020-02-20 - bowlofeggs
15:03:34 <cverna> a reminder that anyone can chair the meeting
15:03:44 <mkonecny> bowlofeggs: It's yours
15:03:49 <cverna> all the instructions are here https://board.net/p/fedora-infra
15:03:58 <mkonecny> #topic New folks introductions
15:03:58 <mkonecny> #info This is a place where people who are interested in Fedora Infrastructure can introduce themselves
15:03:58 <mkonecny> #info Getting Started Guide: https://fedoraproject.org/wiki/Infrastructure/GettingStarted
15:04:11 <mkonecny> Anybody new here?
15:04:53 <austinpowered> Not really new, but it's been some time since I attended
15:04:53 <jrichardson> I'm fairly sure I've introduced myself before, but just in case...
15:05:17 <jrichardson> austinpowered, go ahead
15:05:38 <puiterwijk> Not really new, but it's been a while since I've shown up to these.
15:05:57 <austinpowered> Not much to add - just trying to watch and learn
15:06:03 <cverna> hey puiterwijk o/
15:06:03 * nirik waves at everyone.
15:06:14 <mkonecny> puiterwijk: Good to see you
15:06:22 <jrichardson> I'm new to Leigh's AAA/Sustaining team here in Waterford
15:06:53 <mkonecny> It looks like our Waterford team base is growing :-)
15:07:04 * pingou here but in a meeting still
15:07:08 <jrichardson> with one or two more to go it sounds like
15:07:10 <nirik> soon, they will take over the world! :)
15:07:19 <jrichardson> :)
15:07:34 <cverna> :)
15:07:45 <mkonecny> Welcome everybody
15:08:01 <mkonecny> #topic announcements and information
15:08:01 <mkonecny> #info ops folks are trying a 30min ticket triage every day at 19UTC in #fedora-admin
15:08:01 <mkonecny> #info https://xkcd.com/1334/
15:09:15 <mkonecny> Anything else that was missing on the board?
15:09:48 <cverna> #info OSBS supports aarch64 in staging
15:10:30 <cverna> #info we have a lot of tickets https://pagure.io/fedora-infrastructure/issues :)
15:10:44 <nirik> #info old OpenStack decommissioning work is beginning
15:11:09 <nirik> cverna: also releng and bugzilla bugs too. :)
15:11:29 <cverna> nirik: yeah :)
15:11:45 <puiterwijk> \o/
15:12:10 <puiterwijk> Openstack has served us long and .... okay, just long. Good to see that specific (haunted) cluster go :)
15:12:41 <nirik> I contacted people with instances yesterday. There's a few folks that are really needing something, but we will get it sorted.
15:13:06 <relrod> #info we have an APAC proxy now in AWS and probably a few more to come if this one ends up working out well
15:13:27 * puiterwijk hasn't seen an email yet. But I bet iddev has come up. If people still want that, I can move it to communishift, but I guess people should start developing against Keycloak if Fedora Infra is ever to move to that
15:13:52 <nirik> puiterwijk: yeah, that one I mentioned in the ticket...
15:14:02 <nirik> .ticket 8614
15:14:03 <zodbot> nirik: Issue #8614: Retire old OpenStack Cloud - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8614
15:14:08 <relrod> puiterwijk: KC isn't in scope for the AAA project right now, it's a "later, maybe, some day" -- so ipsilon is still needed
15:14:41 <puiterwijk> relrod: okay. Then I can look at iddev at some point probably
15:14:43 <cverna> iddev is really useful  IMHO would be good to keep it
15:14:50 * nirik nods.
15:15:00 <nirik> in related news: I looked at kubevirt/CNV again.
15:15:19 <nirik> I got an instance to bootup... but still can't figure out how to ssh into it from the outside.
15:15:54 <puiterwijk> on iddev: will see what I can do next week
15:16:43 <mkonecny> Ok, moving to next topic
15:16:51 <mkonecny> #topic Oncall
15:16:51 <mkonecny> #info https://fedoraproject.org/wiki/Infrastructure/Oncall
15:16:51 <mkonecny> #info bowlofeggs is oncall 2020-01-30 -> 2020-02-06
15:16:51 <mkonecny> #info smooge is oncall 2020-02-06 -> 2020-02-13
15:16:52 <mkonecny> #info ??? is oncall 2020-02-13 -> 2020-02-20
15:16:53 <smooge> .takeoncallus
15:17:10 <mkonecny> ## .oncalltakeeu .oncalltakeus
15:17:36 <mkonecny> Who wants to take oncall for 13-20
15:17:39 <mkonecny> ?
15:17:50 * nirik can
15:18:15 <cverna> I can take after that 20 - 27
15:18:15 <smooge> .oncalltakeus
15:18:15 <zodbot> smooge: Kneel before zod!
15:18:17 <nirik> so, related here... whatever happened to the CPE 'how to work with us' doc?
15:18:53 <cverna> it is on docs.fp.o but hidden
15:18:54 <mkonecny> nirik: It's on the docs.fp.o, but it's not accessible from main page
15:19:05 * cverna looks for it
15:19:07 <nirik> ok, we should try and finish that someday.
15:19:21 <nirik> would be nice to point people to.
15:19:28 <mkonecny> #info nirik is oncall 2020-02-13 -> 2020-02-20
15:19:28 <mkonecny> #info cverna is oncall 2020-02-20 -> 2020-02-27
15:19:33 <cverna> #link https://docs.fedoraproject.org/en-US/cpe/
15:19:52 <nirik> possibly needs some rework now
15:20:18 <mkonecny> nirik: It was created year ago, so probably
15:20:21 <mkonecny> #info Summary of last week: (from current oncall )
15:20:48 <mkonecny> bowlofeggs: Your call
15:21:15 <bowlofeggs> not too much to report - most of my oncall pings i missed due to them landing in my early morning
15:21:20 <bowlofeggs> and a few because i was in meetings
15:21:29 <bowlofeggs> so i think i only attended to one
15:21:37 <bowlofeggs> <end_of_report>
15:22:04 <cverna> I think people got used to filing a ticket now which is nice :)
15:22:38 <mkonecny> Yeah, not too many pings
15:22:47 <nirik> its getting better yeah
15:22:57 <mkonecny> #topic Monitoring discussion [nirik]
15:22:57 <mkonecny> #info https://nagios.fedoraproject.org/nagios
15:22:57 <mkonecny> #info Go over existing outstanding items and fix
15:23:01 <nirik> might just be because everyone is recovering from devconf tho. ;)
15:23:14 <nirik> so, lets see
15:23:18 <cverna> haha maybe :)
15:23:28 <nirik> 3 down hosts:
15:23:44 <nirik> one of those is a cloud instance that I shutdown (regcfp2... hope thats ok, puiterwijk )
15:23:58 <nirik> 2 are stg builders, but I need new aarch64 hw to reinstall them...
15:24:07 <nirik> the noc playbook is currently broken
15:24:15 <puiterwijk> nirik: thanks!
15:24:27 <nirik> 2 busgateway ones that have been around for a long time:
15:24:32 <nirik> 1) no fas messages
15:24:34 <puiterwijk> I wasn't aware that was still running :)
15:24:36 <nirik> 2) no planet messages
15:25:14 <nirik> qa-stg01 is now low on space. need to ping tflink on that most likely
15:25:33 <tflink> I really need to get around to archiving the data on that and deleting it
15:25:36 <nirik> thats everything
15:26:44 <mkonecny> Ok, next topic
15:26:53 <mkonecny> #topic Tickets discussion [nirik]
15:26:53 <mkonecny> #info https://pagure.io/fedora-infrastructure/report/Meetings%20ticket
15:27:13 <nirik> no meeting tickets. we could discuss any other ones if there's something people want to bring up
15:27:41 <nirik> if people are looking for things to do... it sure would be nice to finish retiring fedmsg before the datacenter move
15:28:00 <nirik> and a replacement for FMN would be a nice thing. ;)
15:28:16 <cverna> .ticket 8040
15:28:17 <zodbot> cverna: Issue #8040: Credentials for https://id.fedoraproject.org/openidc/ - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8040
15:28:45 <cverna> I saw a ping on that and it has been sitting there for a while
15:28:59 <mkonecny> nirik: I think FMN can't be replaced until every fedora-messaging application has its own schema
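[editor's note: mkonecny's point is that a notifier like FMN can only filter messages reliably once each application publishes with a known schema. The sketch below is purely illustrative and self-contained — the real fedora-messaging API has applications subclass `fedora_messaging.message.Message` with a JSON `body_schema`; the class and field names here are hypothetical.]

```python
# Illustrative stand-in for a per-application fedora-messaging schema.
# Real schemas are Message subclasses with a JSON body_schema; this
# toy version only shows why FMN needs them: to know which topic
# prefixes and body fields it can safely filter on.

class MessageSchema:
    """A tiny stand-in for a per-application message schema."""
    topic_prefix = ""          # e.g. "org.fedoraproject.prod.bodhi."
    required_body_fields = ()  # fields a notifier like FMN can rely on

    def validate(self, topic, body):
        if not topic.startswith(self.topic_prefix):
            raise ValueError("unexpected topic: %s" % topic)
        missing = [f for f in self.required_body_fields if f not in body]
        if missing:
            raise ValueError("missing body fields: %s" % missing)
        return True


class BodhiUpdateSchema(MessageSchema):
    # hypothetical example schema for one application's messages
    topic_prefix = "org.fedoraproject.prod.bodhi.update."
    required_body_fields = ("update_id", "user")
```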
15:29:00 <nirik> huh, I don't understand the last comment
15:29:11 <puiterwijk> The ticket title is too broad
15:29:15 <puiterwijk> And the last reply is totally off-base
15:29:30 <puiterwijk> The ticket is for https://github.com/Fedora-dotnet/verification-fas-discord-reddit, but the last comment is talking about general stuff.
15:29:38 <cverna> yeah but the original request still stands
15:29:38 <nirik> yeah, that confused me
15:29:49 <puiterwijk> cverna: not quite
15:29:57 <puiterwijk> Because last I heard they're not ready for launch yet
15:30:09 <cverna> ah ok
15:30:27 <puiterwijk> At least, I've not seen a comment back from the original requestor
15:30:40 <nirik> the 'until we are ready to launch' makes it sound like 'until we have OIDC in fedora infrastructure...'
15:30:45 <cverna> https://pagure.io/fedora-infrastructure/issue/8040#comment-588998
15:31:11 <cverna> yes sounds that they have been waiting for a while
15:31:55 <puiterwijk> ...
15:31:55 <puiterwijk> Potential milestones could be:
15:31:55 <puiterwijk> Simple graphical interface as described above, without the actual functionality.*
15:31:55 <puiterwijk> FAS login using ipsilon.
15:32:01 <nirik> I am not clear on the current state of their development
15:32:02 <puiterwijk> From their github README.
15:32:16 <puiterwijk> Which sounds like "we want to implement this", for which I pointed them to iddev.
15:32:45 <nirik> so I think we should explicitly ask them if they are ready to move their app to production / if it's working with iddev fine now?
15:32:50 <puiterwijk> Anyway, commented asking if they're ready for deployment.
15:32:55 <cverna> yes this is all very confusing
15:33:13 <nirik> we have some script or template or sop for this right?
15:33:44 <puiterwijk> Yes
15:34:32 <cverna> cool
15:34:36 <puiterwijk> https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/ipsilon.html#create-openid-connect-secrets-for-apps
15:34:51 <nirik> cool. lets see what they say. :)
15:35:31 <mkonecny> Any other ticket?
15:35:34 <puiterwijk> I'm also relatively sure we had a template $somewhere with the questions to ask for OIDC credentials
15:36:18 <nirik> yeah
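[editor's note: the SOP linked above covers creating OpenID Connect secrets for apps. The snippet below is a generic, hypothetical sketch of what issuing such credentials amounts to — the real procedure uses an Ipsilon helper script, and the function name and dict shape here are invented for illustration.]

```python
import secrets


def new_oidc_client(name):
    """Generate a client_id/client_secret pair for a new OIDC app.

    Hypothetical sketch only: Fedora Infra's actual SOP registers the
    client via an Ipsilon script. This just shows the shape of the
    credentials handed to an application like the one in ticket 8040.
    """
    return {
        "client_id": name,
        # 32 random bytes, base64url-encoded: high-entropy shared secret
        "client_secret": secrets.token_urlsafe(32),
    }
```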
15:38:04 <nirik> next item I guess?
15:38:07 <mkonecny> #topic ResultsDB Ownership Going Forward
15:38:07 <mkonecny> #link https://pagure.io/fedora-infrastructure/issue/8415 (revisit post DevConf)
15:38:19 <cverna> we can skip that
15:38:24 <mkonecny> Ok
15:38:27 <mkonecny> #topic backlog discussion
15:38:27 <mkonecny> #info go over our backlog and discuss and determine priority
15:38:37 <cverna> and remove it from the board, pingou is tracking this
15:39:01 <mkonecny> Removed
15:39:13 <cverna> #link https://pagure.io/fedora-infrastructure/issues?status=Open&tags=backlog&priority=0&close_status=
15:39:20 <nirik> so yeah, not sure how to really approach things...
15:39:32 <nirik> I did mark N things backlog a while back
15:39:44 <nirik> (and we also have copies in jira which are probably out of sync now)
15:39:54 <nirik> .ticket 8516
15:39:56 <zodbot> nirik: Issue #8516: fedmsg-gateway on Python3 needs monitoring fixes - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8516
15:40:03 <nirik> I think this is done now. relrod ^
15:40:38 <cverna> .ticket 8167
15:40:41 <zodbot> cverna: Issue #8167: Adding topic authorization to our RabbitMQ instances - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8167
15:40:52 <cverna> this would be nice to have
15:41:18 <nirik> that needs rhel8? /me looks again
15:41:32 <cverna> yeah I think
15:42:00 <nirik> I wonder how hard it will be to migrate things...
15:42:04 <cverna> do we have any rhel8 host already ?
15:42:10 <nirik> yes, a number of them.
15:42:16 <cverna> cool :)
15:42:57 <nirik> I wonder if we can take down the old cluster, save the queues and somehow restore them on the new cluster?
15:43:27 <cverna> yes, we might need to get a bit of abompard's time, if he knows
15:43:27 <mkonecny> You mean the rabbitMQ queues?
15:43:31 <cverna> yes
15:43:34 <nirik> yeah.
15:43:54 <nirik> or just say 'there's an outage, all queues are new after this'
15:44:03 <mkonecny> I think you could create duplicate instance of rabbitMQ server
15:44:03 <nirik> but not sure if that will upset people
15:44:11 <abompard> there aren't that many messages in the queues
15:44:36 <nirik> I guess if the entire cluster is down things just keep retrying to send?
15:44:36 <mkonecny> Most of them are processed immediately
15:44:39 <abompard> Also, RabbitMQ has an upgrading guide
15:45:03 <abompard> https://www.rabbitmq.com/upgrade.html
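[editor's note: one way to "save the queues" for the RHEL 7 → 8 reinstall nirik describes is the management plugin's definitions export (`GET /api/definitions`), which captures vhosts, users, permissions, queues, exchanges, bindings and policies — but not the queued messages themselves. A hedged sketch of trimming such an export for import into a fresh cluster; the key names assume the management-plugin JSON format:]

```python
def portable_definitions(defs):
    """Strip cluster-specific metadata from a RabbitMQ /api/definitions
    export so the rest can be POSTed to a freshly installed cluster.

    Note: definitions cover topology only; in-flight messages are not
    included, so delivery during the outage still relies on producers
    retrying (as discussed below).
    """
    keep = ("vhosts", "users", "permissions", "topic_permissions",
            "queues", "exchanges", "bindings", "policies", "parameters")
    return {key: defs[key] for key in keep if key in defs}
```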
15:45:40 <nirik> yeah...
15:45:43 <cverna> the thing is that we will need new hosts, if we need rhel8
15:45:46 <nirik> just pondering the best way to do it.
15:46:05 <nirik> yes. there's no way to in place upgrade rhel7-8 (well, there is, but I don't trust it)
15:46:50 <cverna> could we bring up a new host and try to hot swap? using the proxy?
15:46:53 <nirik> I guess ideally for me from the ops side would be: 1) take down all current cluster. 2) reinstall with rhel8, 3) bring up, 4) reconfigure for our queues, etc...
15:47:16 <nirik> if we need to try and swap over or move data it makes it more complex
15:47:52 <cverna> yes and that might not be worth the trouble
15:48:05 <abompard> message producers will retry during downtime, with exponential backoff
15:48:25 <abompard> but if they use an older version of the library they may need to be restarted
15:48:42 <cverna> ha nice :)
15:48:46 <abompard> also, for web application that can look like a page that is stuck
15:48:54 <nirik> well, the producers shouldn't change any right?
15:48:57 <abompard> so, not ideal
15:49:02 <pingou> with a resulting timeout for the web-app
15:49:09 <nirik> oh, I see what you are saying
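[editor's note: abompard's point about producers retrying with exponential backoff during the outage can be sketched as below. This is a generic illustration, not fedora-messaging's actual retry code; the function name and parameters are invented.]

```python
import random
import time


def publish_with_backoff(publish, message, max_tries=6, base=0.5, cap=30.0):
    """Retry a publish callable with capped, jittered exponential backoff.

    If the broker is down for the whole window, the final attempt's
    exception propagates — which, in a web app, is what surfaces as the
    stuck page / timeout mentioned above.
    """
    for attempt in range(max_tries):
        try:
            return publish(message)
        except ConnectionError:
            if attempt == max_tries - 1:
                raise
            delay = min(cap, base * (2 ** attempt))
            # jitter avoids every producer reconnecting at the same instant
            time.sleep(delay + random.uniform(0, delay / 2))
```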
15:49:44 <cverna> 10 min till the end of the meeting
15:49:45 <nirik> Anyhow, I am game to try in stg first if we want... not sure when would be a good time to do it...
15:49:59 <cverna> +1 to try in stg :)
15:50:26 <abompard> happy to help, as long as I can leave AAA for a while ;-)
15:50:27 <mkonecny> cverna: only open floor is waiting for us
15:50:59 <nirik> abompard: I can do the reinstall, etc... but would be good for you to be available after to tweak it/fix any upgrade issues...
15:51:08 <nirik> perhaps we can pick sometime next week?
15:51:29 <abompard> I think we have flagged fridays for the sustaining team
15:51:55 <abompard> but we need to be conscious of the timezone difference
15:52:03 <nirik> yeah. ;(
15:52:17 <nirik> I can try and get up early sometime... just need notice in advance.
15:52:23 <abompard> me jumping in after you is not in the right direction ;-)
15:52:37 <mkonecny> The timezones are always in our way
15:52:52 <abompard> or, if staging can be left broken for a few hours, you do it at the end of your day and I pick it up in the morning
15:53:35 <mkonecny> I don't think anybody will be mad, if staging will be broken for some time
15:53:41 <cverna> isn't staging always in a broken state ? :P
15:53:42 <nirik> could work. Not sure how bad it would mess with things there.
15:54:06 <pingou> it'll impact the monitor_gating script
15:54:07 <nirik> I'll look at my schedule next week and see if I can do it last thing in my day
15:54:07 <cverna> we can announce the outage like for prod
15:54:19 <pingou> so if we have it running in openshift, we may want to stop it for the duration of the outage
15:54:24 <pingou> no need to test a known downtime
15:54:25 <mkonecny> cverna: Staging is always stable /me does a Jedi trick
15:54:26 <abompard> your thursday evening would be ideal I think
15:54:51 <nirik> abompard: cool... so 13th?
15:55:02 <abompard> we're in luck ;-)
15:55:06 <mkonecny> 5 minutes till end of the meeting
15:55:12 <nirik> missing friday 13th. ;)
15:55:20 <nirik> abompard: I'll plan on it/post to list
15:55:21 <abompard> oh, 13th is the thu. too bad
15:55:31 <abompard> Thanks!
15:55:59 <mkonecny> #topic Open Floor
15:56:14 <mkonecny> Does anybody have something quick to discuss?
15:56:20 <nirik> if it goes smoothly we can do prod soon after.
15:56:58 <nirik> oh, I had one quick thing!
15:57:09 <mkonecny> Go ahead
15:57:40 <nirik> so f32 branching is next week... can we work out somewhere how to set up f32 branched so it has gating? last branching we didn't... it was just like pre-gating rawhide until bodhi was enabled later.
15:58:07 <smooge> no idea
15:58:12 <nirik> perhaps pingou could look at the branching stuff and see...
15:59:03 <nirik> we have a releng ticket on this we can use to coordinate.
15:59:06 <nirik> https://pagure.io/releng/blob/master/f/scripts/branching
15:59:07 <pingou> we need to create the release in bodhi like we do for rawhide
15:59:10 <cverna> I think that should work, at least I don't see any reason why not
15:59:20 <nirik> https://pagure.io/releng/issue/9228
15:59:44 <cverna> https://pagure.io/releng/blob/master/f/docs/source/sop_rawhide_bodhi.rst
15:59:46 <nirik> we also need to make it so it stops processing new builds until we have a compose... but that should be pretty easy
15:59:52 <mkonecny> Ok, I'm ending the meeting now
15:59:55 <mkonecny> #endmeeting