18:00:14 #startmeeting Fedora Infrastructure Ops Daily Standup Meeting
18:00:14 Meeting started Thu Jul 23 18:00:14 2020 UTC.
18:00:14 This meeting is logged and archived in a public location.
18:00:14 The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:14 Useful Commands: #action #agreed #halp #info #idea #link #topic.
18:00:14 The meeting name has been set to 'fedora_infrastructure_ops_daily_standup_meeting'
18:00:14 #chair mboddu nirik smooge pingou
18:00:14 #meetingname fedora_infrastructure_ops_daily_standup_meeting
18:00:14 #info meeting is 30 minutes MAX. At the end of 30, its stops
18:00:14 Current chairs: mboddu nirik pingou smooge
18:00:14 The meeting name has been set to 'fedora_infrastructure_ops_daily_standup_meeting'
18:00:15 #info agenda is at https://board.net/p/fedora-infra-daily
18:00:16 #topic Tickets needing review
18:00:17 #info https://pagure.io/fedora-infrastructure/issues?status=Open&priority=1
18:00:35 .tiicket 9165
18:00:58 .ticket 9165
18:00:58 smooge: Issue #9165: src.fedoraproject.org/user/churchyard/requests always timeouts - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/9165
18:01:34 I am not sure what timeouts this is.. most of the ones I looked at are like 5 minutes long
18:01:46 I guess pingou might look into this... improve perf?
18:01:50 yeah, not sure either.
18:01:58 med/med/groomed/waiting?
18:02:39 pagure.issue.tag.added -- smooge tagged ticket fedora-infrastructure#9165: groomed, high-trouble, and 2 others https://pagure.io/fedora-infrastructure/issue/9165
18:02:40 pagure.issue.edit -- smooge edited the priority fields of ticket fedora-infrastructure#9165 https://pagure.io/fedora-infrastructure/issue/9165
18:02:58 so it loaded here in 45seconds
18:03:33 * mboddu is here
18:04:10 .ticket 9166
18:04:11 nirik: Issue #9166: Fedora mailman doesn't seem to be receiving messages thru email - fedora-infrastructure - Pagure.io - https://pagure.io/fedora-infrastructure/issue/9166
18:04:24 I guess this just needs looked into... I'm not aware of any problems tho
18:04:36 we might ask for message-id's from sent messages...
18:06:12 pagure.issue.comment.added -- smooge commented on ticket fedora-infrastructure#9166: "Fedora mailman doesn't seem to be receiving messages thru email" https://pagure.io/fedora-infrastructure/issue/9166#comment-667061
18:06:37 pagure.issue.tag.added -- smooge tagged ticket fedora-infrastructure#9166: groomed, lists, and 2 others https://pagure.io/fedora-infrastructure/issue/9166
18:06:38 pagure.issue.edit -- smooge edited the priority fields of ticket fedora-infrastructure#9166 https://pagure.io/fedora-infrastructure/issue/9166
18:07:12 so how is the mass rebuild going?
18:07:49 it's not.
18:07:56 delayed until monday.
18:08:18 mboddu: you have any relengy things? thats it from the infra side
18:08:44 None from releng, only couple of unretirement requests
18:10:03 I have let matt know about the firewall port changes between prod and build which should allow mbs messages
18:10:05 cool.
18:10:31 smooge: cool... you think that was/is its issue?
18:10:49 I think there are multiple issues
18:11:22 1. fedmsg is sending messages or expecting messages from boxes not on the fedmsg bus anymore
18:11:42 it would do that if they were anywhere in the fedmsg config...
18:12:08 that would be most of the hosts listed.. none of them had listeners
18:12:37 2. firewall was not allowing connections from busgateway to build servers
18:13:30 alright, well, hope the changes make things happier until we can move everything off it
18:13:35 3. on several of the boxes they are not setting up the number of listeners they should be
18:14:28 so messages might get sent to 3005 but the box is only listening on 3000 and 3001 even though the config says it should be listening on 3005
18:15:36 yeah, thats really odd. I don't understand what would cause that
18:15:51 so.. I am not sure how anything 'worked' for the last months as the problems would have been happening in PHX2 unless there were ghost configs on boxes
18:17:02 currently busgateway is trying to send traffic to pkgs01, bodhi-backend01, odcs-backend01, mbs-backend01 and pdc-web01/02
18:17:52 pagure.issue.tag.removed -- kevin removed the groomed, high-trouble, and 2 others tags from ticket fedora-infrastructure#9165 https://pagure.io/fedora-infrastructure/issue/9165
18:17:53 pagure.issue.edit -- kevin edited the priority fields of ticket fedora-infrastructure#9165 https://pagure.io/fedora-infrastructure/issue/9165
18:17:54 pagure.issue.comment.added -- kevin commented on ticket fedora-infrastructure#9165: "src.fedoraproject.org/user/churchyard/requests always timeouts" https://pagure.io/fedora-infrastructure/issue/9165#comment-667063
18:18:15 pagure.issue.tag.added -- kevin tagged ticket fedora-infrastructure#9165: groomed, medium-gain, and medium-trouble https://pagure.io/fedora-infrastructure/issue/9165
18:18:16 pagure.issue.edit -- kevin edited the priority fields of ticket fedora-infrastructure#9165 https://pagure.io/fedora-infrastructure/issue/9165
18:18:17 pagure.issue.comment.added -- kevin commented on ticket fedora-infrastructure#9165: "src.fedoraproject.org/user/churchyard/requests always timeouts" https://pagure.io/fedora-infrastructure/issue/9165#comment-667065
18:19:29 so... I wonder if we need to somehow tell it to send those first 3 to the messaging bridges
18:20:22 well, wait, busgateway shouldn't send... it should listen? or am I confused there...
18:21:04 I don't know.. it seems to send to anything listed as an endpoint in /etc/fedmsg.d
18:21:33 to find out which hosts it could not talk to I ran the following on the server:
18:21:35 lsof -P| grep ':30[0-9][0-9].*(SYN_SENT)' | awk '{for (i=1; i<=NF;i++){ if ($i~/busgateway01/){split($i,a,":"); split(a[2],b,">"); print b[2]; if (length(b[2])==0){print $0}}}}'| sort | uniq -c | sort -bnr
18:21:56 if the connection works then it is in a different state
18:22:20 (ESTABLISHED)
18:22:22 * nirik isn't sure we are going to solve this here and now.
18:22:33 nope sorry.. I am breaking standup rule
18:24:19 really we need to get everything moved over so we can stop dealing with it.
18:24:27 anyhow, anything else to discuss?
18:25:33 ok, then will close out.
18:25:36 #endmeeting
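
Note on the 18:21:35 command: the lsof/grep/awk pipeline counts, per destination host, the outbound connections to ports 3000-3099 that are stuck in SYN_SENT (no reply from the far side). The following is a rough Python sketch of the same idea, not the command that was actually run. It assumes lsof prints TCP sockets as "source:port->destination:port (STATE)"; the function name, the sample lines, and the ".example" hostnames are hypothetical. Unlike the original grep, it checks only the destination port.

```python
import re
from collections import Counter

# NAME column of a TCP socket as lsof prints it (assumed layout):
#   sourcehost:sourceport->desthost:destport
CONNECTION = re.compile(r"\S+:\d+->(?P<dst>\S+):(?P<dport>\d+)")


def stuck_destinations(lsof_lines):
    """Count SYN_SENT connections to ports 3000-3099, grouped by destination host."""
    counts = Counter()
    for line in lsof_lines:
        # A working connection shows (ESTABLISHED); only unanswered ones are wanted.
        if "(SYN_SENT)" not in line:
            continue
        match = CONNECTION.search(line)
        if match is None:
            continue
        # Keep only ports in the 30xx range the pipeline grepped for.
        if not re.fullmatch(r"30\d\d", match.group("dport")):
            continue
        counts[match.group("dst")] += 1
    # Highest count first, like `sort | uniq -c | sort -bnr`.
    return counts.most_common()


# Hypothetical sample input; real input would come from `lsof -P`.
SAMPLE = [
    "fedmsg 101 fedmsg 10u IPv4 1 0t0 TCP busgateway01.example:41000->pkgs01.example:3005 (SYN_SENT)",
    "fedmsg 101 fedmsg 11u IPv4 2 0t0 TCP busgateway01.example:41001->pkgs01.example:3006 (SYN_SENT)",
    "fedmsg 101 fedmsg 12u IPv4 3 0t0 TCP busgateway01.example:41002->bodhi-backend01.example:3000 (SYN_SENT)",
    "fedmsg 101 fedmsg 13u IPv4 4 0t0 TCP busgateway01.example:41003->mbs-backend01.example:3001 (ESTABLISHED)",
    "fedmsg 101 fedmsg 14u IPv4 5 0t0 TCP busgateway01.example:41004->pkgs01.example:443 (SYN_SENT)",
]

if __name__ == "__main__":
    for host, count in stuck_destinations(SAMPLE):
        print(count, host)
```

Hosts that appear in the result are ones the gateway keeps trying to reach without getting an answer, which fits either a firewall block or a missing listener on that port.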