18:00:43 <nirik> #startmeeting Fedora Infrastructure Ops Daily Standup Meeting
18:00:43 <zodbot> Meeting started Tue May 26 18:00:43 2020 UTC.
18:00:43 <zodbot> This meeting is logged and archived in a public location.
18:00:43 <zodbot> The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:43 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:00:43 <zodbot> The meeting name has been set to 'fedora_infrastructure_ops_daily_standup_meeting'
18:00:43 <nirik> #chair cverna mboddu nirik smooge
18:00:43 <nirik> #meetingname fedora_infrastructure_ops_daily_standup_meeting
18:00:43 <nirik> #info meeting is 30 minutes MAX. At the end of 30, it stops
18:00:43 <zodbot> Current chairs: cverna mboddu nirik smooge
18:00:43 <zodbot> The meeting name has been set to 'fedora_infrastructure_ops_daily_standup_meeting'
18:00:44 <nirik> #info agenda is at https://board.net/p/fedora-infra-daily
18:00:45 <nirik> #topic Tickets needing review
18:00:46 <nirik> #info https://pagure.io/fedora-infrastructure/issues?status=Open&priority=1
18:00:49 <pingou> someone brought the post-it?
18:00:55 * mboddu is kinda here
18:00:55 * siddharthvipul is here to observe.. don't mind it :)
18:01:02 <siddharthvipul> s/it/him sigh
18:01:04 * pingou will take note
18:01:18 <pingou> siddharthvipul: we do not mind you observing, but feel free to participate :)
18:01:21 * cverna waves
18:01:23 <nirik> +1
18:01:31 <siddharthvipul> pingou: definitely :D
18:01:45 <nirik> nice pile of tickets today due to long weekend and me filing iad2 ones. ;)
18:01:52 <nirik> .ticket 8904
18:01:53 <zodbot> nirik: Issue #8904: Please provide copr frontend/keygen backups from 2020-05-08 - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8904
18:02:01 <nirik> does someone want to edit tickets today
18:02:02 <nirik> ?
18:02:11 * pingou 
18:02:23 <nirik> This one, let's move to waiting on assignee, low-trouble, medium-gain, groomed.
18:02:31 <nirik> and we can do it later when we are not crazy busy
18:02:46 <fm-admin> pagure.issue.edit -- pingou edited the priority fields of ticket fedora-infrastructure#8904 https://pagure.io/fedora-infrastructure/issue/8904
18:02:48 <nirik> .ticket 8941
18:02:49 <zodbot> nirik: Issue #8941: get ipad01/02.iad2 replicating with ipa01/02.phx2 - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8941
18:02:52 <pingou> they say they're fine with waiting
18:02:58 <nirik> this one puiterwijk is going to look into. hurray.
18:03:08 <nirik> This one, let's move to waiting on assignee, medium-trouble, medium-gain, groomed.
18:03:17 <nirik> .8942
18:03:23 <fm-admin> pagure.issue.tag.added -- pingou tagged ticket fedora-infrastructure#8941: groomed, medium-gain, and medium-trouble https://pagure.io/fedora-infrastructure/issue/8941
18:03:24 <nirik> .ticket 8942
18:03:24 <fm-admin> pagure.issue.edit -- pingou edited the priority fields of ticket fedora-infrastructure#8941 https://pagure.io/fedora-infrastructure/issue/8941
18:03:27 <zodbot> nirik: Issue #8942: rabbitmq cluster in iad2 not clustering - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8942
18:03:36 <nirik> this one abompard was going to look into. :)
18:03:46 <pingou> \ó/
18:03:51 <nirik> I had to get sudo working, but that should be the case now (non yubikey)
18:03:53 <pingou> more help!
18:03:59 <nirik> This one, let's move to waiting on assignee, medium-trouble, medium-gain, groomed.
18:04:08 <nirik> .8943
18:04:14 <nirik> sigh
18:04:16 <fm-admin> pagure.issue.tag.added -- pingou tagged ticket fedora-infrastructure#8942: groomed, medium-gain, and medium-trouble https://pagure.io/fedora-infrastructure/issue/8942
18:04:17 <fm-admin> pagure.issue.edit -- pingou edited the priority fields of ticket fedora-infrastructure#8942 https://pagure.io/fedora-infrastructure/issue/8942
18:04:18 <nirik> .ticket 8943
18:04:20 <zodbot> nirik: Issue #8943: sigul rpm for epel8/python3 - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8943
18:04:27 <nirik> this one also puiterwijk is going to look into.
18:04:32 <nirik> This one, let's move to waiting on assignee, medium-trouble, medium-gain, groomed.
18:04:43 <fm-admin> pagure.issue.tag.added -- pingou tagged ticket fedora-infrastructure#8943: groomed, medium-gain, and medium-trouble https://pagure.io/fedora-infrastructure/issue/8943
18:04:43 <fm-admin> pagure.issue.edit -- pingou edited the priority fields of ticket fedora-infrastructure#8943 https://pagure.io/fedora-infrastructure/issue/8943
18:04:46 <nirik> .ticket 8944
18:04:48 <zodbot> nirik: Issue #8944: odcs: choose new deployment os - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8944
18:05:00 <nirik> after discussion on ticket we are going to try for rhel8...
18:05:10 <nirik> so I set up some rhel8 instances just a few minutes ago.
18:05:31 <nirik> so, I think we can just close this one now and if we need to reevaluate open a new one/reopen
18:05:33 <pingou> med-med-groomed?
18:05:39 <pingou> ok :)
18:05:53 <pingou> closed as?
18:06:06 <nirik> fixed I guess, since we decided the question in the title
18:06:18 <nirik> .ticket 8945
18:06:19 <zodbot> nirik: Issue #8945: mbs fails to deploy in iad2 - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8945
18:06:29 <fm-admin> pagure.issue.edit -- pingou edited the close_status and status fields of ticket fedora-infrastructure#8944 https://pagure.io/fedora-infrastructure/issue/8944
18:06:29 <fm-admin> pagure.issue.comment.added -- pingou commented on ticket fedora-infrastructure#8944: "odcs: choose new deployment os" https://pagure.io/fedora-infrastructure/issue/8944#comment-654446
18:06:37 <nirik> this needs looking into. I am not sure who to ping about mbs these days... anyone have ideas?
18:06:50 <pingou> the koji folks
18:07:00 <pingou> so Mike McLean and Tomas Kopececk?
18:07:15 * pingou hopes the spelling isn't too far from the reality
18:07:16 <nirik> well, they are taking it over, but are they also helping us run the old instance we have?
18:07:26 <nirik> but we can try pinging them sure.
18:07:35 <nirik> anyone who can get it working is fine with me.
18:07:38 <pingou> if they aren't, then I honestly don't know
18:07:50 <mboddu> We can try pinging them, then if not we go back to the guys who deployed it for us
18:08:00 <mboddu> mprahl, contyk ?
18:08:12 * pingou needs to step out, can someone take over the tickets/tagging?
18:08:12 <nirik> ok, so ping them, and med/med/groomed
18:08:19 * mboddu can take over
18:08:34 <nirik> pingou: will you be back? had a thing for you... but later is fine.
18:08:39 <nirik> thanks mboddu
18:08:51 <mboddu> I think it's high-trouble and high-gain?
18:08:53 <contyk> Hmm.
18:09:01 <nirik> yeah, probably pretty important
18:09:07 <mboddu> Yup
18:09:08 <contyk> Well, Mike McLean would be the contact person.
18:09:23 <nirik> contyk: ok, will give him a ring...
18:09:24 <fm-admin> pagure.issue.tag.added -- mohanboddu tagged ticket fedora-infrastructure#8945: groomed, high-gain, and high-trouble https://pagure.io/fedora-infrastructure/issue/8945
18:09:25 <fm-admin> pagure.issue.edit -- mohanboddu edited the priority fields of ticket fedora-infrastructure#8945 https://pagure.io/fedora-infrastructure/issue/8945
18:09:34 <mboddu> Thanks contyk
18:09:37 <nirik> .ticket 8946
18:09:38 <zodbot> nirik: Issue #8946: copr backend needs larger volume, but we miss AWS some permissions - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8946
18:10:00 <nirik> I can try and do this later today if I can find time.
18:10:16 <nirik> waiting on assignee, medium trouble, high gain, groomed
18:10:25 <mboddu> ack
18:10:35 <nirik> .ticket 8947
18:10:40 <zodbot> nirik: Issue #8947: Rawhide builds are not pushed to stable - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8947
18:10:46 <fm-admin> pagure.issue.tag.added -- mohanboddu tagged ticket fedora-infrastructure#8946: groomed, high-gain, and medium-trouble https://pagure.io/fedora-infrastructure/issue/8946
18:10:48 <fm-admin> pagure.issue.edit -- mohanboddu edited the priority fields of ticket fedora-infrastructure#8946 https://pagure.io/fedora-infrastructure/issue/8946
18:11:00 <nirik> so, I am not sure what our service level is here... I mean I know we want it to be fast, but...
18:11:19 <mboddu> And also, it's random
18:11:27 <cverna> I had a quick look: the celery worker in openshift gets killed OOM
18:11:33 <mboddu> By the time we got to it, it's fixed :(
18:11:42 <cverna> which I think explains why it is random and sometimes slow
18:11:53 <nirik> ah...
18:12:00 <cverna> it depends on the os-node the pod is allocated to
18:12:01 <mboddu> cverna: Can we close the ticket and ask them to reopen if they notice it again?
18:12:04 <nirik> so hopefully the bigger pods in iad2 will help this.
18:12:12 * pingou back, sorry
18:12:30 * mboddu can hand over the duty to pingou if he wants to
18:12:35 <cverna> yeah the celery worker seems to take a lot of mem, like 22% of the os-node
18:12:39 <nirik> I think we should explain that it's an OOM issue most likely and we are moving to bigger pods during the move, so not much we can do right now
18:12:43 <pingou> mboddu: go for it, I'm catching up
18:12:50 <mboddu> pingou: Okay
18:12:55 <cverna> so I want to look at it to understand if this is normal or not
18:12:55 <nirik> unless you can see a way to clear its memory or something?
18:13:10 <nirik> +1... so let's leave it open for cverna to look?
18:13:18 <mboddu> Okay
18:13:26 <mboddu> cverna: Can you comment on the ticket with your findings?
18:13:33 <cverna> yeah I need a bit more time to look at it
18:13:39 <mboddu> Okay
18:13:44 <nirik> so waiting on assignee, med/med, groomed?
18:13:58 <cverna> +1
18:14:06 <mboddu> ack, but cverna is kinda working on it, so assign it to him?
18:14:22 <nirik> he's not working on it _now_... so no, leave unassigned.
18:14:27 <mboddu> Okay
18:14:33 <nirik> then he assigns it when he actually sits down to work on it. :)
18:14:42 <fm-admin> pagure.issue.tag.added -- mohanboddu tagged ticket fedora-infrastructure#8947: groomed, medium-gain, and medium-trouble https://pagure.io/fedora-infrastructure/issue/8947
18:14:43 <nirik> .ticket 8949
18:14:43 <fm-admin> pagure.issue.edit -- mohanboddu edited the priority fields of ticket fedora-infrastructure#8947 https://pagure.io/fedora-infrastructure/issue/8947
18:14:44 <zodbot> nirik: Issue #8949: bodhi times out on update with many (almost 300) packages - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8949
18:14:52 <nirik> so, this might in fact be related. ;)
18:14:52 <cverna> yeah I'll assign it to myself when I can focus on it
18:15:08 <pingou> this may be memory related, trying to load too many things
18:15:08 <nirik> cverna: what do you want to do with this one?
18:15:11 <mboddu> nirik: I feel so
18:15:21 <pingou> but, wild guess
18:15:21 <cverna> no, different issue: OpenShift haproxy kills the request because it takes too long
18:15:33 <cverna> I have changed the timeout to 120s and it seems to help
18:15:53 <cverna> This can be assigned to me since I'm looking at it now :)
18:16:10 <nirik> ok.
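For reference, a minimal sketch of one way a per-route timeout like the 120s cverna mentions could be set, assuming the route is served by the OpenShift HAProxy router and honours the `haproxy.router.openshift.io/timeout` annotation. The log does not say how the timeout was actually changed; the route name `bodhi-web` and the namespace `bodhi` are hypothetical placeholders. The helper only builds the `oc annotate` command line and does not run it.

```python
from typing import List

# Route-level annotation read by the OpenShift HAProxy router.
TIMEOUT_ANNOTATION = "haproxy.router.openshift.io/timeout"


def build_timeout_command(route: str, namespace: str, seconds: int) -> List[str]:
    """Return the argv for an `oc annotate` call that sets a route timeout.

    `route` and `namespace` are placeholders supplied by the caller; nothing
    in the meeting log names the real objects.
    """
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return [
        "oc", "annotate", "route", route,
        "--namespace", namespace,
        "--overwrite",
        f"{TIMEOUT_ANNOTATION}={seconds}s",
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(" ".join(build_timeout_command("bodhi-web", "bodhi", 120)))
```

Building the argument list separately from executing it keeps the sketch safe to run anywhere and makes the chosen timeout easy to review before it is applied.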
18:16:14 <fm-admin> pagure.issue.assigned.added -- cverna assigned ticket fedora-infrastructure#8949 to cverna https://pagure.io/fedora-infrastructure/issue/8949
18:16:27 <nirik> and move to assignee, etcetc
18:16:27 <mboddu> cverna: You got to it before me :)
18:16:31 <fm-admin> pagure.issue.tag.added -- cverna tagged ticket fedora-infrastructure#8949: low-trouble and medium-gain https://pagure.io/fedora-infrastructure/issue/8949
18:16:32 <fm-admin> pagure.issue.edit -- cverna edited the priority fields of ticket fedora-infrastructure#8949 https://pagure.io/fedora-infrastructure/issue/8949
18:16:41 <nirik> .ticket 8950
18:16:42 <zodbot> nirik: Issue #8950: OpenShift build stuck — a node (or just Docker?) needs restarting, I think - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8950
18:16:49 <nirik> and... possibly more related. ;)
18:16:54 <cverna> :D
18:17:01 <nirik> shall I restart docker on all the nodes?
18:17:01 <pingou> I can take this one if it's just about restarting docker
18:17:11 <nirik> or sure, pingou can or whoever wants
18:17:33 <pingou> do we want to restart on all or see if we can pinpoint which one?
18:17:37 <cverna> yeah we could maybe add a daily cron job that does a docker restart
18:17:40 <mboddu> low trouble, medium gain, groomed?
18:17:43 <nirik> I'd just do them all for now.
18:17:47 <nirik> mboddu: ack
18:17:52 <pingou> nirik: roger on it
18:17:57 <pingou> mboddu: assign it to me
18:18:03 <fm-admin> pagure.issue.tag.added -- mohanboddu tagged ticket fedora-infrastructure#8950: groomed, low-trouble, and medium-gain https://pagure.io/fedora-infrastructure/issue/8950
18:18:04 <fm-admin> pagure.issue.edit -- mohanboddu edited the priority fields of ticket fedora-infrastructure#8950 https://pagure.io/fedora-infrastructure/issue/8950
18:18:18 <fm-admin> pagure.issue.assigned.added -- mohanboddu assigned ticket fedora-infrastructure#8950 to pingou https://pagure.io/fedora-infrastructure/issue/8950
18:18:24 <mboddu> pingou: Done
18:18:25 <nirik> .ticket 8951
18:18:26 <zodbot> nirik: Issue #8951: yubikey auth isn't working in iad2 - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8951
18:18:35 <nirik> puiterwijk: was going to look at this one too.
18:18:46 <nirik> waiting on assignee, med/med/groomed
18:18:53 <mboddu> High gain?
18:19:09 <nirik> well, otp works... so there's somewhat of a work around
18:19:12 <nirik> but sure
18:19:28 <pingou> hm, what's the difference b/w os_infra_nodes and os_nodes?
18:19:30 <fm-admin> pagure.issue.tag.added -- mohanboddu tagged ticket fedora-infrastructure#8951: groomed, high-gain, and medium-trouble https://pagure.io/fedora-infrastructure/issue/8951
18:19:47 <pingou> note that noggin doesn't support yubikey atm
18:19:53 <mboddu> It's security, so it's always a high gain for me :)
18:19:58 <pingou> so we may lose that in a soonish future
18:20:16 <nirik> infra nodes are ones that run routers/infra jobs, normal nodes can run other non infra tagged tasks
18:20:28 <smooge> oh fudge sorry.. was focusing on something
18:20:42 <nirik> so, that's all the needs-reviews in infra.
18:20:51 <mboddu> Now the releng side
18:21:00 <pingou> nirik: so I want to restart os_nodes then, correct?
18:21:07 <nirik> pingou: yep.
18:21:10 <pingou> thanks
18:21:11 <nirik> mboddu: go for it
18:21:24 <mboddu> .releng 9473
18:21:26 <zodbot> mboddu: Issue #9473: Fedora Python Classroom Lab container images not available @ candidate-registry.fedoraproject.org - releng - Pagure.io - https://pagure.io/releng/issue/9473
18:21:38 <mboddu> cverna: Any thoughts?
18:21:44 <mboddu> I didn't get a chance to look at it
18:21:45 <fm-admin> pagure.issue.comment.added -- cverna commented on ticket fedora-infrastructure#8949: "bodhi times out on update with many (almost 300) packages" https://pagure.io/fedora-infrastructure/issue/8949#comment-654468
18:21:54 <nirik> I didn't think we made any containers from labs/spins?
18:22:03 * cverna clicks
18:23:05 <cverna> mboddu: candidate-registry is garbage collected; I think we delete all images that are older than 30 days
18:23:23 <cverna> so that is likely the reason why this image is not there anymore
18:23:42 <cverna> nirik:  this is just a normal layered container image available in dist-git
18:23:52 <nirik> ah... ok
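A minimal sketch of the age-based cleanup cverna describes, assuming a simple rule of "delete anything older than 30 days". The real candidate-registry garbage collection is not shown in the log; the `expired_images` helper, the image names, and the timestamps below are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

MAX_AGE = timedelta(days=30)  # retention window mentioned in the meeting


def expired_images(created: Dict[str, datetime], now: datetime) -> List[str]:
    """Return names of images created more than MAX_AGE before `now`, sorted."""
    return sorted(name for name, ts in created.items() if now - ts > MAX_AGE)


if __name__ == "__main__":
    now = datetime(2020, 5, 26, tzinfo=timezone.utc)
    # Hypothetical image names and build dates.
    images = {
        "example/old-image": datetime(2020, 4, 1, tzinfo=timezone.utc),
        "example/new-image": datetime(2020, 5, 20, tzinfo=timezone.utc),
    }
    print(expired_images(images, now))
```

Under this rule, an image that has not been rebuilt within the retention window would be selected for deletion, which is consistent with the explanation given for the missing image.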
18:23:58 <fm-admin> pagure.issue.tag.removed -- pingou removed the groomed, low-trouble, and medium-gain tags from ticket fedora-infrastructure#8950 https://pagure.io/fedora-infrastructure/issue/8950
18:23:59 <fm-admin> pagure.issue.assigned.reset -- pingou reset the assignee of ticket fedora-infrastructure#8950 https://pagure.io/fedora-infrastructure/issue/8950
18:24:00 <fm-admin> pagure.issue.edit -- pingou edited the priority fields of ticket fedora-infrastructure#8950 https://pagure.io/fedora-infrastructure/issue/8950
18:24:01 <fm-admin> pagure.issue.comment.added -- pingou commented on ticket fedora-infrastructure#8950: "OpenShift build stuck — a node (or just Docker?) needs restarting, I think" https://pagure.io/fedora-infrastructure/issue/8950#comment-654470
18:24:04 <pingou> rahg!
18:24:05 <mboddu> Okay, I will comment on the ticket
18:24:31 <fm-admin> pagure.issue.tag.added -- pingou tagged ticket fedora-infrastructure#8950: groomed, low-trouble, and medium-gain https://pagure.io/fedora-infrastructure/issue/8950
18:24:32 <fm-admin> pagure.issue.assigned.added -- pingou assigned ticket fedora-infrastructure#8950 to pingou https://pagure.io/fedora-infrastructure/issue/8950
18:24:33 <fm-admin> pagure.issue.edit -- pingou edited the priority fields of ticket fedora-infrastructure#8950 https://pagure.io/fedora-infrastructure/issue/8950
18:25:08 <mboddu> .releng 9472
18:25:09 <zodbot> mboddu: Issue #9472: update stuck because bodhi thinks it's not signed again - releng - Pagure.io - https://pagure.io/releng/issue/9472
18:25:23 <mboddu> nirik: Is it the same as the other ticket in infra?
18:25:43 <nirik> no. that was a rawhide one, this is a f32 one.
18:25:43 <cverna> mboddu:  for some context https://pagure.io/ContainerSIG/container-sig/issue/33
18:26:17 <cverna> yeah this is upstream https://github.com/fedora-infra/bodhi/issues/4032
18:26:54 <cverna> I can manually fix the update, but I have an upstream fix since yesterday
18:27:36 <mboddu> Okay, I will comment on the ticket and add the groomed tag
18:27:57 <mboddu> cverna: Can you fix this manually for now?
18:28:00 <cverna> mboddu: I'll fix it now, it takes 2 min
18:28:08 <mboddu> Thanks cverna++
18:28:26 <cverna> I'll comment on the ticket how to fix that
18:28:58 <mboddu> cverna: Sure and close the ticket once it's fixed as well
18:29:17 <mboddu> .releng 9469
18:29:18 <zodbot> mboddu: Issue #9469: Block nuvola-app-google-calendar from koji - releng - Pagure.io - https://pagure.io/releng/issue/9469
18:29:44 <pingou> cverna: maybe the howto repo?
18:29:44 <mboddu> So, I fixed a bunch of pdc entries a few days back but it seems some of them are still sneaky and got missed
18:29:52 <pingou> we should also note how to check if a build is signed
18:30:20 <cverna> yeah I usually check the tags in koji
18:30:22 <mboddu> Generally I will look at build history to check if a build is signed or not
18:30:40 <nirik> I usually call koji write-signed-build. :)
18:30:47 <nirik> if it's not signed, that errors.
18:31:16 <mboddu> Coming back to 9469, I will work on them tomorrow, as I got EOL work going on.
18:31:23 <mboddu> So, adding groomed tag to it
18:31:26 <nirik> sounds good. +1
18:31:51 <nirik> I know we are over time, but I wanted to bring up a quick item...
18:31:59 <nirik> mboddu: oh, did you have any more?
18:32:13 <mboddu> nirik: well, unretirement stuff, not important
18:32:17 <mboddu> nirik: Go ahead
18:32:19 <nirik> ok.
18:32:49 <nirik> so, resultsdb is currently in the qa network... but we need it. it's currently f31 I think...
18:33:03 <nirik> so, should we just move it into our normal prod network with the move?
18:33:15 <nirik> and should we keep it at f31? or try and upgrade?
18:33:51 <nirik> it also currently uses qa-db01... but if we move it into our prod network we can just use db01...
18:34:01 <nirik> or should I ask this on the list? :)
18:34:13 <pingou> I'd be ok w/ it in the main network
18:34:22 <mboddu> Maybe check with qa before you do that, but generally +1
18:34:25 <pingou> I would like us not to make a decision on the OS
18:34:42 <pingou> nirik: can you give me a date/deadline for this?
18:34:45 <nirik> the fewer changes the better right now. ;)
18:34:49 <pingou> I'd like to apply a little more pressure on this
18:34:55 <cverna> can we run it in OpenShift ?
18:35:14 <nirik> well, the virthost it's on will be turned off the week of june 8th? :)
18:35:20 <nirik> so we have to move it before then...
18:35:26 <nirik> cverna: I don't know
18:35:45 <nirik> and I am not sure we have time... but I'll go with whatever people want
18:36:00 <pingou> technically yes, but it'll require some changes to the clients that load data in it
18:36:14 <cverna> ok so yeah we don't have time :)
18:36:28 <pingou> nirik: thanks I'll raise this again
18:36:41 <pingou> worst case I think I know the answer and how to proceed
18:36:42 <nirik> I guess it's clear we are moving it...
18:36:51 <pingou> I just don't want us to do it
18:36:53 <nirik> I just want to know where it would make the most sense to move it to.
18:37:03 <pingou> lol
18:37:09 <pingou> a then b vs b then a :)
18:37:43 <nirik> I think the chances of someone else doing it are... low.
18:37:56 <pingou> I'd like them to at least be there
18:38:04 <pingou> even if it's only to look over our shoulders
18:38:15 <nirik> threebean did say he was willing to help with it on the move week...
18:39:38 <nirik> so, give it more time and ask around a bit more?
18:40:10 <pingou> let's give it until early next week
18:40:21 <nirik> well, that's cutting it very very very close
18:40:24 <nirik> but ok
18:40:32 <pingou> thanks
18:40:46 <nirik> I'm hoping to send out later today a link to a doc for testing/validating things in iad2.
18:41:06 <nirik> most things are up and running there, in various levels of working.
18:41:21 <cverna> we have overrun our 30 min time slot, do we want to close the meeting? and then we can continue the convo if needed
18:41:34 <nirik> yeah, let's end, that's fine.
18:41:43 <nirik> thanks cverna
18:41:46 <cverna> #endmeeting