<@zlopez:fedora.im>
17:00:05
!startmeeting Infrastructure (2025-12-04)
<@meetbot:fedora.im>
17:00:06
Meeting started at 2025-12-04 17:00:05 UTC
<@meetbot:fedora.im>
17:00:06
The Meeting name is 'Infrastructure (2025-12-04)'
<@zlopez:fedora.im>
17:00:12
!info About our team: https://docs.fedoraproject.org/en-US/cle/
<@zlopez:fedora.im>
17:00:12
!info Agenda is at: https://board.net/p/fedora-infra
<@zlopez:fedora.im>
17:00:12
!topic Hello and welcome
<@zlopez:fedora.im>
17:00:12
!info Fedora Infra documentation: https://docs.fedoraproject.org/en-US/infra
<@zlopez:fedora.im>
17:00:12
!meetingname infrastructure
<@zlopez:fedora.im>
17:00:12
!chair @nirik:matrix.scrye.com @zlopez:fedora.im @jnsamyak:matrix.org @james:fedora.im @gwmngilfen:fedora.im
<@meetbot:fedora.im>
17:00:12
The Meeting Name is now infrastructure
<@zlopez:fedora.im>
17:00:29
!hi
<@zodbot:fedora.im>
17:00:29
Michal Konecny (zlopez)
<@zlopez:fedora.im>
17:00:42
Welcome everyone to the weekly Fedora Infra meeting
<@nirik:matrix.scrye.com>
17:00:58
morning everyone
<@gwmngilfen:fedora.im>
17:02:16
!hi
<@zodbot:fedora.im>
17:02:19
Greg Sutcliffe (gwmngilfen) - he / him / his
<@zlopez:fedora.im>
17:02:51
How is everyone?
<@nirik:matrix.scrye.com>
17:03:22
better today than in a while. ;)
<@gwmngilfen:fedora.im>
17:03:53
mostly i'm struggling to believe it is actually December already
<@zlopez:fedora.im>
17:04:39
Christmas time soon 🎄 🙂
<@gwmngilfen:fedora.im>
17:04:50
day off to go shopping tomorrow 😉
<@nirik:matrix.scrye.com>
17:04:51
yeah.
<@zlopez:fedora.im>
17:05:30
I'm making apple jam as gifts 🙂
<@zlopez:fedora.im>
17:05:44
Let's go to first topic
<@zlopez:fedora.im>
17:05:50
!topic New folks introductions
<@zlopez:fedora.im>
17:05:50
!info This is a place where people who are interested in Fedora Infrastructure can introduce themselves
<@zlopez:fedora.im>
17:05:50
!info Getting Started Guide: https://docs.fedoraproject.org/en-US/infra/gettingstarted/
<@zlopez:fedora.im>
17:05:57
Anybody new around?
<@zlopez:fedora.im>
17:07:31
It doesn't seem like we have anybody
<@zlopez:fedora.im>
17:07:37
Let's go to next quest
<@zlopez:fedora.im>
17:07:43
!topic Next chair
<@zlopez:fedora.im>
17:07:43
!info magic eight ball says:
<@zlopez:fedora.im>
17:07:43
!info chair 2025-12-04 - zlopez
<@zlopez:fedora.im>
17:07:43
!info chair 2025-12-11 - ???
<@zlopez:fedora.im>
17:07:43
!info chair 2025-12-18 - ???
<@zlopez:fedora.im>
17:07:43
!info chair 2025-12-25 - Christmas
<@zlopez:fedora.im>
17:07:43
!info chair 2026-01-01 - New Year
<@zlopez:fedora.im>
17:07:43
!info chair 2026-01-08 - ???
<@zlopez:fedora.im>
17:07:59
I prepared the chairing schedule a little
<@nirik:matrix.scrye.com>
17:08:41
I will be gone 18th and 8th i think...
<@nirik:matrix.scrye.com>
17:08:48
I can take next week tho
<@gwmngilfen:fedora.im>
17:09:13
i'll be here but as ever it's hard to commit to the 2nd half of the hour
<@zlopez:fedora.im>
17:09:35
I will be off from 19th to 5th, we can decide to just skip the 18th as well
<@nirik:matrix.scrye.com>
17:09:55
+1 to skip 18th...
<@nirik:matrix.scrye.com>
17:10:07
8th folks might be back (but not me)
<@zlopez:fedora.im>
17:10:16
nirik: Let's give you the 11th
<@zlopez:fedora.im>
17:10:28
!info chair 2025-12-11 - nirik
<@nirik:matrix.scrye.com>
17:10:32
k
<@zlopez:fedora.im>
17:11:05
I can take the first meeting in 2026
<@zlopez:fedora.im>
17:11:15
!info chair 2026-01-08 - zlopez
<@nirik:matrix.scrye.com>
17:11:26
👍
<@zlopez:fedora.im>
17:11:36
Let's skip the meeting on 18th
<@zlopez:fedora.im>
17:11:58
!info chair 2025-12-18 - Cancelled
<@zlopez:fedora.im>
17:12:16
!topic announcements and information
<@zlopez:fedora.im>
17:12:16
!info rdu2-cc to rdu3 datacenter move next Monday (2025-12-08)
<@zlopez:fedora.im>
17:12:16
!info CLE Infra&Releng NA-hours team has a Monday through Thursday 30 minute meeting going through tickets at 1900 UTC in https://matrix.to/#/#meeting-3:fedoraproject.org
<@zlopez:fedora.im>
17:12:16
!info pagure.io has been migrated, please report any issues
<@nirik:matrix.scrye.com>
17:12:41
There have been a few issues with the pagure migration, but hopefully they will be handled soon...
<@zlopez:fedora.im>
17:12:58
!info OpenID authentication is now separated from rest of authentication. See https://communityblog.fedoraproject.org/end-of-openid-in-fedora/
<@nirik:matrix.scrye.com>
17:13:19
also, I am going to need help from some EU folks with the migration on Monday. Basically I will need someone(s) to shut down things in rdu2-cc before they start moving them...
<@nirik:matrix.scrye.com>
17:14:20
They are planning on starting that at 13:00 UTC on Monday (8am EST).
<@gwmngilfen:fedora.im>
17:14:44
i'm sure I can help there
<@gwmngilfen:fedora.im>
17:14:55
do you have a doc for it, like we did in June?
<@nirik:matrix.scrye.com>
17:14:59
cool. I can write up the steps in the outage ticket?
<@nirik:matrix.scrye.com>
17:15:06
I don't, but there's a ticket. ;)
<@zlopez:fedora.im>
17:15:08
I should be around hopefully, I have a dentist appointment in the morning
<@gwmngilfen:fedora.im>
17:15:11
either works
<@nirik:matrix.scrye.com>
17:15:16
well, 2 tickets
<@nirik:matrix.scrye.com>
17:15:19
https://pagure.io/fedora-infrastructure/issue/12955
<@nirik:matrix.scrye.com>
17:15:25
(for the outage)
<@nirik:matrix.scrye.com>
17:15:26
and
<@nirik:matrix.scrye.com>
17:15:39
https://pagure.io/fedora-infrastructure/issue/12818
<@nirik:matrix.scrye.com>
17:15:43
for the overall migration
<@gwmngilfen:fedora.im>
17:16:07
got both open, i can refresh on monday (out tomorrow)
<@nirik:matrix.scrye.com>
17:16:17
ok
<@nirik:matrix.scrye.com>
17:16:30
I'll update 12818
<@gwmngilfen:fedora.im>
17:17:31
if you can do that today, i'll try to find a few min tomorrow to review in case i have questions
<@gwmngilfen:fedora.im>
17:17:52
but on first glance it seems like a small enough list 🙂
<@nirik:matrix.scrye.com>
17:20:27
comment added. ;)
<@zlopez:fedora.im>
17:21:09
Let's go for the next quest 🙂
<@zlopez:fedora.im>
17:21:10
!topic Oncall
<@zlopez:fedora.im>
17:21:10
!info https://docs.fedoraproject.org/en-US/infra/day_to_day_fedora/#_the_oncall_role_in_our_team
<@zlopez:fedora.im>
17:21:10
!info on call from 2025-12-05 to 2025-12-11 - ???
<@zlopez:fedora.im>
17:21:10
!info on call from 2025-12-12 to 2025-12-18 - ???
<@zlopez:fedora.im>
17:21:10
!info on call from 2025-12-19 to 2026-01-08 - Holidays
<@zlopez:fedora.im>
17:21:10
!info on call from 2026-01-09 to 2026-01-15 - ???
<@zlopez:fedora.im>
17:21:40
I noticed that Gwmngilfen is on call now, although I didn't find it in the agenda
<@zlopez:fedora.im>
17:21:53
I can take the next week
<@gwmngilfen:fedora.im>
17:21:59
i didnt see any pings
<@zlopez:fedora.im>
17:22:20
Or the week after
<@nirik:matrix.scrye.com>
17:22:23
note that for the holidays we should change it to have no one on call and suggest people file tickets
<@nirik:matrix.scrye.com>
17:22:42
I'll of course be around and check in on tickets from time to time
<@zlopez:fedora.im>
17:22:42
This is why I have the holidays in schedule 🙂
<@gwmngilfen:fedora.im>
17:23:07
> Mention in internal slack that everything is shutdown and ready to move.
<@gwmngilfen:fedora.im>
17:23:07
nirik the same chat we've been using for all the dc-move stuff?
<@nirik:matrix.scrye.com>
17:23:16
yes
<@gwmngilfen:fedora.im>
17:23:34
ok, i can deal with that on monday then
<@gwmngilfen:fedora.im>
17:23:40
sorry, back to the topic 😉
<@nirik:matrix.scrye.com>
17:23:42
cool
<@zlopez:fedora.im>
17:23:57
So any volunteer for next week or should I take it?
<@nirik:matrix.scrye.com>
17:24:05
I can take it...
<@nirik:matrix.scrye.com>
17:24:09
either way
<@zlopez:fedora.im>
17:24:24
!info on call from 2025-12-05 to 2025-12-11 - nirik
<@zlopez:fedora.im>
17:24:24
!info on call from 2025-12-12 to 2025-12-18 - zlopez
<@zlopez:fedora.im>
17:24:55
!oncall
<@zodbot:fedora.im>
17:24:55
The following people are oncall:
<@zodbot:fedora.im>
17:24:55
● @nirik:matrix.scrye.com (kevin) Current Time for them: 09:24 (US/Pacific)
<@zodbot:fedora.im>
17:24:55
If they do not respond, please file a ticket (https://pagure.io/fedora-infrastructure/issues)
<@zlopez:fedora.im>
17:24:59
Set
<@zlopez:fedora.im>
17:25:38
I will set the oncall to nobody before leaving on 18th 🙂
<@nirik:matrix.scrye.com>
17:25:55
sounds good.
<@gwmngilfen:fedora.im>
17:26:05
i'm working until the 23rd so I'll be keeping an eye out, but sounds good all the same
<@nirik:matrix.scrye.com>
17:26:15
we can leave a tap dripping to prevent the pipes from freezing while we are gone. ;)
<@zlopez:fedora.im>
17:26:59
!info Summary of last week: (from current oncall)
<@gwmngilfen:fedora.im>
17:27:10
nothing afaik
<@zlopez:fedora.im>
17:27:21
Let's continue on our road then
<@zlopez:fedora.im>
17:27:26
!topic Monitoring discussion [nirik]
<@zlopez:fedora.im>
17:27:26
!info Go over existing items and fix them
<@zlopez:fedora.im>
17:27:26
!info https://nagios.fedoraproject.org/nagios & https://zabbix.fedoraproject.org (top 100 triggers: https://zabbix.fedoraproject.org/zabbix.php?action=toptriggers.list)
<@gwmngilfen:fedora.im>
17:27:54
nagios first?
<@zlopez:fedora.im>
17:29:35
Go for it
<@gwmngilfen:fedora.im>
17:30:09
not my area 😎
<@nirik:matrix.scrye.com>
17:30:28
sorry... sidetracked.
<@nirik:matrix.scrye.com>
17:30:58
so nagios has a few pagure.io things... network related I am sure... need to sort those out
<@nirik:matrix.scrye.com>
17:31:41
I am seeing bodhi alerts from time to time and also vmhost max procs.
<@zlopez:fedora.im>
17:31:41
You mean the certs alerts?
<@nirik:matrix.scrye.com>
17:31:46
we should adjust those up
<@nirik:matrix.scrye.com>
17:32:10
also, log01 is close to alerting... so it alerts, compresses things, nagios recovers, etc... every day. We should change that limit
<@zlopez:fedora.im>
17:32:45
I started processing the bodhi queue on staging, it has around 1.5 million messages
<@nirik:matrix.scrye.com>
17:33:08
this is prod, when people submit large updates. ;)
<@nirik:matrix.scrye.com>
17:33:18
but what was the problem with bodhi in staging?
<@nirik:matrix.scrye.com>
17:33:24
(we have a ticket on this also)
<@gwmngilfen:fedora.im>
17:34:03
hmm, we have proc alerts in zabbix which did not trigger, I should check we have the same thresholds for those hosts
<@zlopez:fedora.im>
17:34:22
There is only one consumer and the messages are coming in faster than it can consume them
<@nirik:matrix.scrye.com>
17:35:27
Gwmngilfen: the ones I mean are rabbitmq queues
<@gwmngilfen:fedora.im>
17:35:39
i was referring to the second part of this
<@nirik:matrix.scrye.com>
17:35:48
ah, ok
<@nirik:matrix.scrye.com>
17:36:01
so you added consumers?
<@zlopez:fedora.im>
17:36:56
Yes, there are 4 consumers now, hopefully the queue will drop and I will lower that number
<@gwmngilfen:fedora.im>
17:37:54
we have graphs to check!
<@gwmngilfen:fedora.im>
17:38:37
ok, quick zabbix update before I have to run?
<@zlopez:fedora.im>
17:38:48
Go for it
<@gwmngilfen:fedora.im>
17:39:21
so first the top 100 / 7 days:
<@gwmngilfen:fedora.im>
17:39:21
- load avg on zabbix / bmvhosts / proxies - planning to up the vCPU on zabbix, the others need better triggers, which will have to wait
<@gwmngilfen:fedora.im>
17:39:21
- some disk space issues with log01 / openqa / ipa, not sure if they are just what they are and should have a higher threshold (log01 is definitely that)
<@gwmngilfen:fedora.im>
17:39:21
- HAProxy alerts for src.fpo - definitely scrapers
<@gwmngilfen:fedora.im>
17:39:21
other updates:
<@gwmngilfen:fedora.im>
17:39:21
- I just enabled the Rabbit template in prod, so lots of new data coming in for that
<@gwmngilfen:fedora.im>
17:39:21
- IT have just given me inbound ip/ports for zabbix so i should be able to sort the copr hosts next week
<@gwmngilfen:fedora.im>
17:39:32
forward-looking, I'm hoping to have certs & datagrepper checks done next week, at which point I think we can make a call on switching to Zabbix
<@gwmngilfen:fedora.im>
17:40:19
maybe we can make next week's meeting a go/no-go on that?
<@gwmngilfen:fedora.im>
17:40:35
we won't have 100% coverage but i think we'll have most of what we need
<@nirik:matrix.scrye.com>
17:40:36
sure.
<@nirik:matrix.scrye.com>
17:40:40
sounds good
<@gwmngilfen:fedora.im>
17:42:01
thats all, most of the regular alerts are load on various things and I just need to rip out that trigger and write a bunch of new ones based on server role
<@gwmngilfen:fedora.im>
17:42:25
(such is the result of just grabbing the default upstream template to get us rolling 😛)
<@zlopez:fedora.im>
17:42:43
Would be nice to have datagrepper checks for staging as well, as the bodhi queue was kind of under radar
<@gwmngilfen:fedora.im>
17:43:00
yep, we can do that
<@gwmngilfen:fedora.im>
17:43:26
if we end up on a call on Monday for rdu2-cc shutdown I can talk it through with you
<@zlopez:fedora.im>
17:43:35
Sounds good
<@zlopez:fedora.im>
17:43:57
Let's switch to open floor, as we have only 17 minutes left
<@zlopez:fedora.im>
17:44:06
!topic Open Floor
<@zlopez:fedora.im>
17:44:12
nirik: Go for it
<@nirik:matrix.scrye.com>
17:45:54
1. Moving to forge. Does anyone have cycles to help and look at that soon? or should we punt to next year?
<@zlopez:fedora.im>
17:46:28
I can probably start looking at it
<@nirik:matrix.scrye.com>
17:46:29
I think moving infra-docs-fpo -> docs in forge should be pretty easy.
<@nirik:matrix.scrye.com>
17:46:49
For tickets, we would need to make more announcements, since so many people file tickets.
<@zlopez:fedora.im>
17:47:07
Are there any other docs already moved or will this be a PoC?
<@nirik:matrix.scrye.com>
17:47:13
There is a ticket with info on it.
<@nirik:matrix.scrye.com>
17:47:22
yes, other groups have already moved, releng, epel, etc
<@mwinters:fedora.im>
17:47:51
Do contributors such as myself need any additional perms to be able to fork / PR? That would be my only potential hangup.
<@james:fedora.im>
17:48:03
I'm much less worried about moving docs than ansible, or the main infra/releng tickets.
<@nirik:matrix.scrye.com>
17:48:30
Michael Winters: shouldn't, no...
<@nirik:matrix.scrye.com>
17:48:40
I keep hoping to look at it, but...
<@nirik:matrix.scrye.com>
17:49:27
anyhow, we can look at it and see. I guess we should punt tickets to next year... but can try for docs?
<@zlopez:fedora.im>
17:49:55
That is why we are starting with docs
<@nirik:matrix.scrye.com>
17:50:27
anyhow, second item... scrapers. I'd really like to get to the point where I don't have to spend a lot of my time manually doing anything to block them.
<@nirik:matrix.scrye.com>
17:50:47
also, during the holidays fewer people will be around to do so.
<@nirik:matrix.scrye.com>
17:51:08
A super easy thing I would like to do is to just give pkgs01 more resources.
<@nirik:matrix.scrye.com>
17:51:20
needs a reboot when it's quiet, but can double cpus or something.
<@nirik:matrix.scrye.com>
17:51:39
But failing that... I guess this might be better as a discussion thread.
<@james:fedora.im>
17:52:26
More resources seems like an easy workaround, so +1 ... you think it just needs 2x CPUs?
<@nirik:matrix.scrye.com>
17:52:54
I think that would help and is easy.
<@nirik:matrix.scrye.com>
17:53:17
it doesn't seem to hit bw, io, etc... limits. db might be an issue also, not sure.
<@nirik:matrix.scrye.com>
17:54:48
anyhow, I'll start a discussion about it...
<@zlopez:fedora.im>
17:57:10
Anything else we want to talk about in the last 3 minutes?
<@nirik:matrix.scrye.com>
17:57:46
Michael Winters had something?
<@mwinters:fedora.im>
17:58:21
Not with 2 minutes left :) . But thanks -- I'll maybe file a Discussion.
<@zlopez:fedora.im>
17:59:04
Or ask in #admin:fedoraproject.org
<@zlopez:fedora.im>
17:59:18
Time is up
<@zlopez:fedora.im>
17:59:27
Thanks everybody for coming today
<@zlopez:fedora.im>
17:59:33
Let's see each other next week
<@zlopez:fedora.im>
18:00:02
!endmeeting