18:00:43 <nirik> #startmeeting Fedora Infrastructure Ops Daily Standup Meeting
18:00:43 <zodbot> Meeting started Tue May 26 18:00:43 2020 UTC.
18:00:43 <zodbot> This meeting is logged and archived in a public location.
18:00:43 <zodbot> The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:43 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:00:43 <zodbot> The meeting name has been set to 'fedora_infrastructure_ops_daily_standup_meeting'
18:00:43 <nirik> #chair cverna mboddu nirik smooge
18:00:43 <nirik> #meetingname fedora_infrastructure_ops_daily_standup_meeting
18:00:43 <nirik> #info meeting is 30 minutes MAX. At the end of 30, it stops
18:00:43 <zodbot> Current chairs: cverna mboddu nirik smooge
18:00:43 <zodbot> The meeting name has been set to 'fedora_infrastructure_ops_daily_standup_meeting'
18:00:44 <nirik> #info agenda is at https://board.net/p/fedora-infra-daily
18:00:45 <nirik> #topic Tickets needing review
18:00:46 <nirik> #info https://pagure.io/fedora-infrastructure/issues?status=Open&priority=1
18:00:49 <pingou> someone brought the post-it?
18:00:55 * mboddu is kinda here
18:00:55 * siddharthvipul is here to observe.. don't mind it :)
18:01:02 <siddharthvipul> s/it/him sigh
18:01:04 * pingou will take note
18:01:18 <pingou> siddharthvipul: we do not mind you observing, but feel free to participate :)
18:01:21 * cverna waves
18:01:23 <nirik> +1
18:01:31 <siddharthvipul> pingou: definitely :D
18:01:45 <nirik> nice pile of tickets today due to long weekend and me filing iad2 ones. ;)
18:01:52 <nirik> .ticket 8904
18:01:53 <zodbot> nirik: Issue #8904: Please provide copr frontend/keygen backups from 2020-05-08 - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8904
18:02:01 <nirik> does someone want to edit tickets today
18:02:02 <nirik> ?
18:02:11 * pingou
18:02:23 <nirik> This one let's move to waiting on assignee, low-trouble, medium-gain, groomed.
18:02:31 <nirik> and we can do it later when we are not crazy busy
18:02:46 <fm-admin> pagure.issue.edit -- pingou edited the priority fields of ticket fedora-infrastructure#8904 https://pagure.io/fedora-infrastructure/issue/8904
18:02:48 <nirik> .ticket 8941
18:02:49 <zodbot> nirik: Issue #8941: get ipad01/02.iad2 replicating with ipa01/02.phx2 - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8941
18:02:52 <pingou> they say they're fine with waiting
18:02:58 <nirik> this one puiterwijk is going to look into. hurray.
18:03:08 <nirik> This one let's move to waiting on assignee, medium-trouble, medium-gain, groomed.
18:03:17 <nirik> .8942
18:03:23 <fm-admin> pagure.issue.tag.added -- pingou tagged ticket fedora-infrastructure#8941: groomed, medium-gain, and medium-trouble https://pagure.io/fedora-infrastructure/issue/8941
18:03:24 <nirik> .ticket 8942
18:03:24 <fm-admin> pagure.issue.edit -- pingou edited the priority fields of ticket fedora-infrastructure#8941 https://pagure.io/fedora-infrastructure/issue/8941
18:03:27 <zodbot> nirik: Issue #8942: rabbitmq cluster in iad2 not clustering - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8942
18:03:36 <nirik> this one abompard was going to look into. :)
18:03:46 <pingou> \ó/
18:03:51 <nirik> I had to get sudo working, but that should be the case now (non yubikey)
18:03:53 <pingou> more help!
18:03:59 <nirik> This one let's move to waiting on assignee, medium-trouble, medium-gain, groomed.
18:04:08 <nirik> .8943
18:04:14 <nirik> sigh
18:04:16 <fm-admin> pagure.issue.tag.added -- pingou tagged ticket fedora-infrastructure#8942: groomed, medium-gain, and medium-trouble https://pagure.io/fedora-infrastructure/issue/8942
18:04:17 <fm-admin> pagure.issue.edit -- pingou edited the priority fields of ticket fedora-infrastructure#8942 https://pagure.io/fedora-infrastructure/issue/8942
18:04:18 <nirik> .ticket 8943
18:04:20 <zodbot> nirik: Issue #8943: sigul rpm for epel8/python3 - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8943
18:04:27 <nirik> this one also puiterwijk is going to look into.
18:04:32 <nirik> This one let's move to waiting on assignee, medium-trouble, medium-gain, groomed.
18:04:43 <fm-admin> pagure.issue.tag.added -- pingou tagged ticket fedora-infrastructure#8943: groomed, medium-gain, and medium-trouble https://pagure.io/fedora-infrastructure/issue/8943
18:04:43 <fm-admin> pagure.issue.edit -- pingou edited the priority fields of ticket fedora-infrastructure#8943 https://pagure.io/fedora-infrastructure/issue/8943
18:04:46 <nirik> .ticket 8944
18:04:48 <zodbot> nirik: Issue #8944: odcs: choose new deployment os - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8944
18:05:00 <nirik> after discussion on ticket we are going to try for rhel8...
18:05:10 <nirik> so I set up some rhel8 instances just a few minutes ago.
18:05:31 <nirik> so, I think we can just close this one now and if we need to reevaluate open a new one/reopen
18:05:33 <pingou> med-med-groomed?
18:05:39 <pingou> ok :)
18:05:53 <pingou> closed as?
18:06:06 <nirik> fixed I guess, since we decided the question in the title
18:06:18 <nirik> .ticket 8945
18:06:19 <zodbot> nirik: Issue #8945: mbs fails to deploy in iad2 - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8945
18:06:29 <fm-admin> pagure.issue.edit -- pingou edited the close_status and status fields of ticket fedora-infrastructure#8944 https://pagure.io/fedora-infrastructure/issue/8944
18:06:29 <fm-admin> pagure.issue.comment.added -- pingou commented on ticket fedora-infrastructure#8944: "odcs: choose new deployment os" https://pagure.io/fedora-infrastructure/issue/8944#comment-654446
18:06:37 <nirik> this needs looking into. I am not sure who to ping about mbs these days... anyone have ideas?
18:06:50 <pingou> the koji folks
18:07:00 <pingou> so Mike McLean and Tomas Kopececk?
18:07:15 * pingou hopes the spelling isn't too far from the reality
18:07:16 <nirik> well, they are taking it over, but are they also helping us run the old instance we have?
18:07:26 <nirik> but we can try pinging them sure.
18:07:35 <nirik> anyone who can get it working is fine with me.
18:07:38 <pingou> if they aren't, then I honestly don't know
18:07:50 <mboddu> We can try pinging them, then if not we go back to the guys who deployed it for us
18:08:00 <mboddu> mprahl, contyk ?
18:08:12 * pingou needs to step out, can someone take over the tickets/tagging?
18:08:12 <nirik> ok, so ping them, and med/med/groomed
18:08:19 * mboddu can take over
18:08:34 <nirik> pingou: will you be back? had a thing for you... but later is fine.
18:08:39 <nirik> thanks mboddu
18:08:51 <mboddu> I think it's high-trouble and high-gain?
18:08:53 <contyk> Hmm.
18:09:01 <nirik> yeah, probably pretty important
18:09:07 <mboddu> Yup
18:09:08 <contyk> Well, Mike McLean would be the contact person.
18:09:23 <nirik> contyk: ok, will give him a ring...
18:09:24 <fm-admin> pagure.issue.tag.added -- mohanboddu tagged ticket fedora-infrastructure#8945: groomed, high-gain, and high-trouble https://pagure.io/fedora-infrastructure/issue/8945
18:09:25 <fm-admin> pagure.issue.edit -- mohanboddu edited the priority fields of ticket fedora-infrastructure#8945 https://pagure.io/fedora-infrastructure/issue/8945
18:09:34 <mboddu> Thanks contyk
18:09:37 <nirik> .ticket 8946
18:09:38 <zodbot> nirik: Issue #8946: copr backend needs larger volume, but we miss AWS some permissions - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8946
18:10:00 <nirik> I can try and do this later today if I can find time.
18:10:16 <nirik> waiting on assignee, medium trouble, high gain, groomed
18:10:25 <mboddu> ack
18:10:35 <nirik> .ticket 8947
18:10:40 <zodbot> nirik: Issue #8947: Rawhide builds are not pushed to stable - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8947
18:10:46 <fm-admin> pagure.issue.tag.added -- mohanboddu tagged ticket fedora-infrastructure#8946: groomed, high-gain, and medium-trouble https://pagure.io/fedora-infrastructure/issue/8946
18:10:48 <fm-admin> pagure.issue.edit -- mohanboddu edited the priority fields of ticket fedora-infrastructure#8946 https://pagure.io/fedora-infrastructure/issue/8946
18:11:00 <nirik> so, I am not sure what our service level is here... I mean I know we want it to be fast, but...
18:11:19 <mboddu> And also, it's random
18:11:27 <cverna> I had a quick look: the celery worker in openshift gets killed OOM
18:11:33 <mboddu> By the time we got to it, it's fixed :(
18:11:42 <cverna> which I think explains why it is random and sometimes slow
18:11:53 <nirik> ah...
18:12:00 <cverna> it depends on the os-node the pod is allocated
18:12:01 <mboddu> cverna: Can we close the ticket and ask them to reopen if they notice it again?
18:12:04 <nirik> so hopefully the bigger pods in iad2 will help this.
18:12:12 * pingou back, sorry
18:12:30 * mboddu can hand over the duty to pingou if he wants to
18:12:35 <cverna> yeah the celery worker seems to take a lot of mem like 22% of the os-node
18:12:39 <nirik> I think we should explain that it's an OOM issue most likely and we are moving to bigger pods during the move, so not much we can do right now
18:12:43 <pingou> mboddu: go for it, I'm catching up
18:12:50 <mboddu> pingou: Okay
18:12:55 <cverna> so I want to look at it to understand if this is normal or not
18:12:55 <nirik> unless you can see a way to clear its memory or something?
18:13:10 <nirik> +1... so let's leave it open for cverna to look?
18:13:18 <mboddu> Okay
18:13:26 <mboddu> cverna: Can you comment on the ticket with your findings?
18:13:33 <cverna> yeah I need a bit more time to look at it
18:13:39 <mboddu> Okay
18:13:44 <nirik> so waiting on assignee, med/med, groomed?
18:13:58 <cverna> +1
18:14:06 <mboddu> ack, but cverna is kinda working on it, so assign it to him?
18:14:22 <nirik> he's not working on it _now_... so no, leave unassigned.
18:14:27 <mboddu> Okay
18:14:33 <nirik> then he assigns it when he actually sits down to work on it. :)
18:14:42 <fm-admin> pagure.issue.tag.added -- mohanboddu tagged ticket fedora-infrastructure#8947: groomed, medium-gain, and medium-trouble https://pagure.io/fedora-infrastructure/issue/8947
18:14:43 <nirik> .ticket 8949
18:14:43 <fm-admin> pagure.issue.edit -- mohanboddu edited the priority fields of ticket fedora-infrastructure#8947 https://pagure.io/fedora-infrastructure/issue/8947
18:14:44 <zodbot> nirik: Issue #8949: bodhi times out on update with many (almost 300) packages - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8949
18:14:52 <nirik> so, this might in fact be related. ;)
18:14:52 <cverna> yeah I'll assign it to myself when I can focus on it
18:15:08 <pingou> this may be memory related, trying to load too many things
18:15:08 <nirik> cverna: what do you want to do with this one?
18:15:11 <mboddu> nirik: I feel so
18:15:21 <pingou> but, wild guess
18:15:21 <cverna> no different issue, OpenShift haproxy kills the request because it takes too long
18:15:33 <cverna> I have changed the timeout to 120s and it seems to help
18:15:53 <cverna> This can be assigned to me since I am looking at it now :)
18:16:10 <nirik> ok.
18:16:14 <fm-admin> pagure.issue.assigned.added -- cverna assigned ticket fedora-infrastructure#8949 to cverna https://pagure.io/fedora-infrastructure/issue/8949
18:16:27 <nirik> and move to assignee, etcetc
18:16:27 <mboddu> cverna: You got to it before me :)
18:16:31 <fm-admin> pagure.issue.tag.added -- cverna tagged ticket fedora-infrastructure#8949: low-trouble and medium-gain https://pagure.io/fedora-infrastructure/issue/8949
18:16:32 <fm-admin> pagure.issue.edit -- cverna edited the priority fields of ticket fedora-infrastructure#8949 https://pagure.io/fedora-infrastructure/issue/8949
18:16:41 <nirik> .ticket 8950
18:16:42 <zodbot> nirik: Issue #8950: OpenShift build stuck — a node (or just Docker?) needs restarting, I think - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8950
18:16:49 <nirik> and... possibly more related. ;)
18:16:54 <cverna> :D
18:17:01 <nirik> shall I restart docker on all the nodes?
18:17:01 <pingou> I can take this one if it's just about restarting docker
18:17:11 <nirik> or sure, pingou can or whoever wants
18:17:33 <pingou> do we want to restart on all or see if we can pinpoint which one?
18:17:37 <cverna> yeah we could maybe add a daily cron job that does a docker restart
18:17:40 <mboddu> low trouble, medium gain, groomed?
18:17:43 <nirik> I'd just do them all for now.
18:17:47 <nirik> mboddu: ack
18:17:52 <pingou> nirik: roger on it
18:17:57 <pingou> mboddu: assign to me
18:18:03 <fm-admin> pagure.issue.tag.added -- mohanboddu tagged ticket fedora-infrastructure#8950: groomed, low-trouble, and medium-gain https://pagure.io/fedora-infrastructure/issue/8950
18:18:04 <fm-admin> pagure.issue.edit -- mohanboddu edited the priority fields of ticket fedora-infrastructure#8950 https://pagure.io/fedora-infrastructure/issue/8950
18:18:18 <fm-admin> pagure.issue.assigned.added -- mohanboddu assigned ticket fedora-infrastructure#8950 to pingou https://pagure.io/fedora-infrastructure/issue/8950
18:18:24 <mboddu> pingou: Done
18:18:25 <nirik> .ticket 8951
18:18:26 <zodbot> nirik: Issue #8951: yubikey auth isn't working in iad2 - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/8951
18:18:35 <nirik> puiterwijk: was going to look at this one too.
18:18:46 <nirik> waiting on assignee, med/med/groomed
18:18:53 <mboddu> High gain?
18:19:09 <nirik> well, otp works... so there's somewhat of a workaround
18:19:12 <nirik> but sure
18:19:28 <pingou> hm, what's the difference b/w os_infra_nodes and os_nodes?
18:19:30 <fm-admin> pagure.issue.tag.added -- mohanboddu tagged ticket fedora-infrastructure#8951: groomed, high-gain, and medium-trouble https://pagure.io/fedora-infrastructure/issue/8951
18:19:47 <pingou> note that noggin doesn't support yubikey atm
18:19:53 <mboddu> It's security, so, it's always a high gain for me :)
18:19:58 <pingou> so we may lose that in a soonish future
18:20:16 <nirik> infra nodes are ones that run routers/infra jobs, normal nodes can run other non infra tagged tasks
18:20:28 <smooge> oh fudge sorry.. was focusing on something
18:20:42 <nirik> so, that's all the needs-reviews in infra.
18:20:51 <mboddu> Now the releng side
18:21:00 <pingou> nirik: so I want to restart os_nodes then, correct?
18:21:07 <nirik> pingou: yep.
18:21:10 <pingou> thanks
18:21:11 <nirik> mboddu: go for it
18:21:24 <mboddu> .releng 9473
18:21:26 <zodbot> mboddu: Issue #9473: Fedora Python Classroom Lab container images not available @ candidate-registry.fedoraproject.org - releng - Pagure.io - https://pagure.io/releng/issue/9473
18:21:38 <mboddu> cverna: Any thoughts?
18:21:44 <mboddu> I didn't get a chance to look at it
18:21:45 <fm-admin> pagure.issue.comment.added -- cverna commented on ticket fedora-infrastructure#8949: "bodhi times out on update with many (almost 300) packages" https://pagure.io/fedora-infrastructure/issue/8949#comment-654468
18:21:54 <nirik> I didn't think we made any containers from labs/spins?
18:22:03 * cverna clicks
18:23:05 <cverna> mboddu: candidate-registry is garbage collected I think, we delete all images that are older than 30 days
18:23:23 <cverna> so that is likely the reason why this image is not there anymore
18:23:42 <cverna> nirik: this is just a normal layered container image available in dist-git
18:23:52 <nirik> ah... ok
18:23:58 <fm-admin> pagure.issue.tag.removed -- pingou removed the groomed, low-trouble, and medium-gain tags from ticket fedora-infrastructure#8950 https://pagure.io/fedora-infrastructure/issue/8950
18:23:59 <fm-admin> pagure.issue.assigned.reset -- pingou reset the assignee of ticket fedora-infrastructure#8950 https://pagure.io/fedora-infrastructure/issue/8950
18:24:00 <fm-admin> pagure.issue.edit -- pingou edited the priority fields of ticket fedora-infrastructure#8950 https://pagure.io/fedora-infrastructure/issue/8950
18:24:01 <fm-admin> pagure.issue.comment.added -- pingou commented on ticket fedora-infrastructure#8950: "OpenShift build stuck — a node (or just Docker?) needs restarting, I think" https://pagure.io/fedora-infrastructure/issue/8950#comment-654470
18:24:04 <pingou> rahg!
18:24:05 <mboddu> Okay, I will comment on the ticket
18:24:31 <fm-admin> pagure.issue.tag.added -- pingou tagged ticket fedora-infrastructure#8950: groomed, low-trouble, and medium-gain https://pagure.io/fedora-infrastructure/issue/8950
18:24:32 <fm-admin> pagure.issue.assigned.added -- pingou assigned ticket fedora-infrastructure#8950 to pingou https://pagure.io/fedora-infrastructure/issue/8950
18:24:33 <fm-admin> pagure.issue.edit -- pingou edited the priority fields of ticket fedora-infrastructure#8950 https://pagure.io/fedora-infrastructure/issue/8950
18:25:08 <mboddu> .releng 9472
18:25:09 <zodbot> mboddu: Issue #9472: update stuck because bodhi thinks it's not signed again - releng - Pagure.io - https://pagure.io/releng/issue/9472
18:25:23 <mboddu> nirik: Is it the same as the other ticket in infra?
18:25:43 <nirik> no. that was a rawhide one, this is a f32 one.
18:25:43 <cverna> mboddu: for some context https://pagure.io/ContainerSIG/container-sig/issue/33
18:26:17 <cverna> yeah this is upstream https://github.com/fedora-infra/bodhi/issues/4032
18:26:54 <cverna> I can manually fix the update, but I have an upstream fix since yesterday
18:27:36 <mboddu> Okay, I will comment on the ticket and add the groomed tag
18:27:57 <mboddu> cverna: Can you fix this manually for now?
18:28:00 <cverna> mboddu: I'll fix it now it takes 2 min
18:28:08 <mboddu> Thanks cverna++
18:28:26 <cverna> I'll comment on the ticket how to fix that
18:28:58 <mboddu> cverna: Sure and close the ticket once it's fixed as well
18:29:17 <mboddu> .releng 9469
18:29:18 <zodbot> mboddu: Issue #9469: Block nuvola-app-google-calendar from koji - releng - Pagure.io - https://pagure.io/releng/issue/9469
18:29:44 <pingou> cverna: maybe the howto repo?
18:29:44 <mboddu> So, I fixed a bunch of pdc entries a few days back but it seems some of them are still sneaky and got missed
18:29:52 <pingou> we should also note how to check if a build is signed
18:30:20 <cverna> yeah I usually check the tags in koji
18:30:22 <mboddu> Generally I will look at build history to check if a build is signed or not
18:30:40 <nirik> I usually call koji write-signed-build. :)
18:30:47 <nirik> if it's not signed, that errors.
18:31:16 <mboddu> Coming back to 9469, I will work on them tomorrow, as I got EOL work going on.
18:31:23 <mboddu> So, adding groomed tag to it
18:31:26 <nirik> sounds good. +1
18:31:51 <nirik> I know we are over time, but I wanted to bring up a quick item...
18:31:59 <nirik> mboddu: oh, did you have any more?
18:32:13 <mboddu> nirik: well, unretirement stuff, not important
18:32:17 <mboddu> nirik: Go ahead
18:32:19 <nirik> ok.
18:32:49 <nirik> so, resultsdb is currently in the qa network... but we need it. it's currently f31 I think...
18:33:03 <nirik> so, should we just move it into our normal prod network with the move?
18:33:15 <nirik> and should we keep it at f31? or try and upgrade?
18:33:51 <nirik> it also currently uses qa-db01... but if we move it into our prod network we can just use db01...
18:34:01 <nirik> or should I ask this on the list? :)
18:34:13 <pingou> I'd be ok w/ it in the main network
18:34:22 <mboddu> Maybe check with qa before you do that, but generally +1
18:34:25 <pingou> I would like us not to make a decision on the OS
18:34:42 <pingou> nirik: can you give me a date/deadline for this?
18:34:45 <nirik> the fewer changes the better right now. ;)
18:34:49 <pingou> I'd like to apply a little more pressure on this
18:34:55 <cverna> can we run it in OpenShift ?
18:35:14 <nirik> well, the virthost it's on will be turned off the week of june 8th? :)
18:35:20 <nirik> so we have to move it before then...
18:35:26 <nirik> cverna: I don't know
18:35:45 <nirik> and I am not sure we have time... but I'll go with whatever people want
18:36:00 <pingou> technically yes, but it'll require some changes to the clients that load data in it
18:36:14 <cverna> ok so yeah we don't have time :)
18:36:28 <pingou> nirik: thanks I'll raise this again
18:36:41 <pingou> worst case I think I know the answer and how to proceed
18:36:42 <nirik> I guess it's clear we are moving it...
18:36:51 <pingou> I just don't want us to do it
18:36:53 <nirik> I just want to know where it would make the most sense to move it to.
18:37:03 <pingou> lol
18:37:09 <pingou> a then b vs b then a :)
18:37:43 <nirik> I think the chances of someone else doing it are... low.
18:37:56 <pingou> I'd like them to at least be there
18:38:04 <pingou> even if it's only to look over our shoulders
18:38:15 <nirik> threebean did say he was willing to help with it on the move week...
18:39:38 <nirik> so, give it more time and ask around a bit more?
18:40:10 <pingou> let's give it until early next week
18:40:21 <nirik> well, that's cutting it very very very close
18:40:24 <nirik> but ok
18:40:32 <pingou> thanks
18:40:46 <nirik> I'm hoping to send out later today a link to a doc for testing/validating things in iad2.
18:41:06 <nirik> most things are up and running there, in various levels of working.
18:41:21 <cverna> we have overrun our 30min time slot do we want to close the meeting ? and then we can continue the convo if needed
18:41:34 <nirik> yeah, let's end, that's fine.
18:41:43 <nirik> thanks cverna
18:41:46 <cverna> #endmeeting