14:00:00 #startmeeting Infrastructure (2018-08-23)
14:00:00 #meetingname infrastructure
14:00:00 Meeting started Thu Aug 23 14:00:00 2018 UTC.
14:00:00 This meeting is logged and archived in a public location.
14:00:00 The chair is smooge. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:00 Useful Commands: #action #agreed #halp #info #idea #link #topic.
14:00:00 The meeting name has been set to 'infrastructure_(2018-08-23)'
14:00:00 The meeting name has been set to 'infrastructure'
14:00:00 #topic aloha
14:00:00 #chair nirik pingou puiterwijk relrod smooge tflink threebean
14:00:01 Current chairs: nirik pingou puiterwijk relrod smooge tflink threebean
14:00:09 morning
14:00:19 * nirik waves and coughs.
14:00:23 hi
14:00:51 Hi
14:01:07 Will there be apprentice hours today?
14:01:16 hello
14:01:27 x3mboy, as in the meeting or in something else?
14:01:29 guten tag!
14:01:53 smooge, in any of those
14:01:54 * pingou here
14:01:55 xD
14:02:00 .hello2
14:02:01 x3mboy: x3mboy 'Eduard Lucena'
14:02:05 .hello sayanchowdhury
14:02:10 sayan: sayanchowdhury 'Sayan Chowdhury'
14:02:11 .hello2
14:02:13 bowlofeggs: bowlofeggs 'Randy Barlow'
14:02:24 there should be a short one. The main open office hours are on Tuesday? and run by cverna
14:02:43 hello o/
14:02:48 hello all who are here
14:02:58 #topic New folks introductions
14:02:58 #info This is a place where people who are interested in Fedora Infrastructure can introduce themselves
14:02:58 #info wilsonj introduced themselves on list and IRC this week.
14:03:06 Any other introductions?
14:05:45 #topic announcements and information
14:05:46 #info tflink is on extended PTO
14:05:46 #info PHX2 visit notes: https://gobby.fedoraproject.org/cgit/infinote/tree/infrastructure-phx2-2018Q2-visit
14:05:46 #info Beta Freeze starts next Tuesday. All changes to Infrastructure will require +1
14:05:46 #info bodhi 3.9.0 released https://bodhi.stg.fedoraproject.org/docs/user/release_notes.html - deployment planned for Monday
14:06:28 I think there are some other things people have gotten into place lately.. but I have been catching up with things
14:06:57 openshift in stg is updated to 3.10
14:07:50 #info openshift in stg is updated to 3.10
14:08:22 #info old docker-registry boxes renamed to oci-registry and rebuilt as f28
14:09:02 puiterwijk abompard cverna clime pingou jcline (and my apologies for any others I am forgetting) any items you have to announce?
14:09:26 #info pagure 5 in staging
14:09:29 working up a second beta of pagure 5 :)
14:09:34 I am good, I added 2 topics for later :)
14:09:35 cool
14:09:38 I'll be gone the coming 2/3 weeks
14:09:57 Hrmm. Nothing much, other than authentication with fedora-messaging is working now, so in the next week or two the bridges can be set up in stg
14:09:57 that means 2 to 3 weeks, not 2/3 of a week :)
14:09:58 nirik: any issues getting the apps back up and running after the stg update to 3.10?
14:10:30 threebean: nope, but there is an issue re-running the deploy... was going to look more at that today.
14:10:53 #info we had a problem with leakage of webhook secrets and inserted pagure api tokens on the integration page in Copr: https://lists.fedoraproject.org/archives/list/copr-devel@lists.fedorahosted.org/thread/VOOOVQ4VOZIB4GKXZWSX7REWCX3WVTLN/
14:11:17 thanks guys
14:11:30 thanks everyone
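For context on the fedora-messaging announcement above: once the broker credentials and TLS certificates are in place (fedora-messaging reads those from its TOML config, typically /etc/fedora-messaging/config.toml, not from code), publishing looks roughly like the minimal sketch below. The topic and body are illustrative placeholders, not an existing message schema.

```python
from fedora_messaging import api, message

# A plain Message with an ad-hoc topic and body; real applications
# usually define their own schema classes. Broker URL, credentials,
# and TLS certificates come from the TOML config file, not from code.
msg = message.Message(
    topic="org.example.infra.test",  # illustrative topic, not a real schema
    body={"note": "hello from the stg bridge test"},
)
api.publish(msg)  # raises on connection or publish failure
```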
14:11:42 #topic Oncall
14:11:42 #info Pingou is on call from 2018-08-16->2018-08-23
14:11:42 #info Smooge is on call from 2018-08-23->2018-08-30
14:11:42 #info ??? is on call from 2018-08-30->2018-09-06
14:11:42 #info ??? is on call from 2018-09-06->2018-09-13
14:11:43 #info Summary of last week: (from Pingou)
14:11:56 pingou, anything big happen while you were on call?
14:12:12 clime: note that puiterwijk can provide you a CVE number if you ever want one
14:12:23 things were quite quiet during my time oncall
14:12:25 pingou: ye, that would be cool
14:12:34 I tried to reply to the tickets as they were created
14:12:44 I tried helping cverna w/o much success :)
14:12:45 clime: just PM me if you need one
14:12:52 puiterwijk: ok thx
14:13:07 overall, I have no idea if I did a good job or not :D
14:13:22 * pingou saw some people pinging nirik directly
14:13:26 ok I would like to get the next 2 weeks of oncall dealt with
14:13:28 pingou++ for the help :)
14:14:42 relrod, nirik, bowlofeggs, puiterwijk: who can do next week of oncall? and the week afterwards?
14:15:02 i am again travelling during both of those windows, so i cannot
14:15:11 people keep taking the weeks i *can* do :)
14:15:29 I can take next
14:15:42 bowlofeggs, please make sure you are on the vacations calendar
14:16:30 I can take the one after, but there are two 1-hour chunks on m/w/f I won't be available due to class
14:16:34 so if that's okay
14:17:01 smooge: i prefer not to use the vacations calendar due to privacy reasons
14:17:04 that is fine.. they are during the day
14:17:45 ok
14:17:48 which of course makes it a little silly to type into IRC
14:17:48 bowlofeggs, ok thanks
14:17:54 but at least that's not as easy to query :)
14:18:14 except for all those telegram/discord etc bots :)
14:18:19 bowlofeggs: yeah not like these meetings are logged or anything ;)
14:18:24 * relrod hides
14:18:26 hahaha
14:18:27 yeah
14:18:29 #topic fedora-tagger sunset (cverna)
14:18:43 ok I take it
14:19:01 just don't tell them you live at 1060 W Addison St, Chicago, IL...
14:19:10 hahaha
14:19:24 did you move?
14:19:34 * pingou updates his address book
14:19:41 so we tried earlier this year to hand over the maintenance of fedora-tagger to the community. I think we can say that this did not really work. So I am proposing to sunset this service
14:20:14 +1
14:20:15 +1.
14:20:19 :-(
14:20:22 +1
14:20:23 +1 too
14:20:37 +1
14:20:54 i accidentally turned on bodhi's "integration" with that service once… and it went… great!
14:20:57 (right patrick?)
14:21:02 bowlofeggs: lol
14:21:10 what are the steps forward?
14:21:20 (it made it so yum/dnf did not work for anybody on both Fedora and EPEL)
14:21:28 email to devel-announce
14:21:33 I can send an email to the list but we need to have a sunset date
14:21:34 wait post-freeze to turn it off?
14:22:08 I would go with post beta freeze
14:22:28 yes sounds good to me too (post freeze)
14:22:41 * misc goes to quickly tag stuff to get the badge before it is too late
14:22:46 sunset on 2018-10-01?
14:22:58 that gives us one week before final freeze
14:23:10 +1
14:23:20 and it is the week after the beta freeze ends
14:23:28 +1
14:23:42 * pingou just realized something, for the open floor
14:24:15 ok I take the action to send an email to the infra list, do we need to add the devel list too?
14:24:24 I would say you are going to do it the week after freeze ends. Since freezes tend to drift
14:24:29 devel-announce & infra I'd think
14:24:37 I would say you are going to do it the week the freeze ends. Since freezes tend to drift
14:24:52 smooge: that day is the week after the slipped date of the freeze end
14:25:04 but being more vague would allow doing it earlier if we don't slip
14:25:16 aka freeze ends on a wednesday so the friday after
14:25:38 always make world-breaking changes on a friday. puiterwijk will thank you for it
14:25:47 ok anything else on this?
14:25:52 ok we can be more vague now, and once the freeze ends communicate a specific date
14:26:08 I am done
14:26:11 #topic statscache sunset (cverna)
14:26:17 me again :)
14:26:18 no, no you are not
14:26:34 same thing?
14:26:46 I was wondering if anybody is using statscache, and if it was worth looking at a sunset
14:26:55 not quite :)
14:26:59 my issue with this is that it's really meant to solve some questions that mattdm has
14:27:15 so the first person to ask would be mattdm
14:27:18 it's basically meant to provide a significant part of the stats from his state of the union talk
14:28:03 on the other side, I'm not sure we have many people who worked on it
14:28:13 maybe that's something we should resurrect with fedora-messaging
14:28:15 * nirik thought it was for caching specific long datagrepper queries.
14:28:19 so I believe the commops team is looking at using something else for that
14:28:36 cverna: they are the last ones who worked on it iirc
14:28:42 by the way, in the documents for applications would it help if we listed a primary consumer? So we can try to get people on what they need earlier?
14:28:42 * cverna looks for a ticket
14:29:12 https://pagure.io/fedora-commops/issue/114
14:30:33 A way to move this forward is maybe to send an email to the list, to ask if anybody is using it or is planning to use it?
14:30:52 +1
14:31:15 yeah.
14:31:18 that ticket is long
14:32:13 Ok I take the action to send the email too for that
14:32:20 thanks cverna
14:32:41 I think I am done now :)
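As a reference for nirik's point above: the kind of long datagrepper query statscache was built to pre-compute can be issued directly against datagrepper's /raw endpoint. A minimal sketch; the topic and one-week window are illustrative, and queries like this are exactly what gets slow as the requested window grows.

```python
import requests

# Fetch recent messages for one topic from datagrepper's /raw endpoint.
resp = requests.get(
    "https://apps.fedoraproject.org/datagrepper/raw",
    params={
        "topic": "org.fedoraproject.prod.bodhi.update.comment",  # illustrative
        "delta": 7 * 24 * 3600,  # look back one week, in seconds
        "rows_per_page": 10,
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["total"], "matching messages")
```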
14:33:01 #topic looking for alternatives to gluster (smooge)
14:33:01 #info this isn't to say gluster is bad.. just that we keep using it in non-supportable ways
14:33:54 well, we always have nfs...
14:34:05 which has its own share of issues.
14:34:29 yeah i had wondered this when we made the registry use it
14:34:37 gluster is a lot like mongodb
14:35:04 yes agreed. The issue is that upstream says if we want to use it we need at least 3, and more like 5, systems, not 2
14:35:16 indeed
14:35:30 so for the registry we could just make one more registry node
14:35:34 or... nfs
14:35:45 the registry is going to be read-heavy
14:36:07 well they also say that the systems running gluster should be dedicated to running gluster, and the clients either get it via network fuse or nfs
14:36:17 (i'm guessing, don't have data, but we only plan to publish during bodhi runs)
14:36:26 oh that's interesting
14:36:32 the registry def doesn't do that
14:36:32 hahaha
14:36:54 well, as far as I can tell it's working ok now... there was a slight issue with the ansible module calls to gluster_volume
14:36:58 what is/are the use case(s) we need a distributed FS for?
14:37:11 i don't really have true experience with NFS or gluster to inform an opinion of my own, so i will defer to those who know what they are talking about :)
14:37:18 * abompard is very late, reading back
14:37:40 aikidouke: i think the real need for the registry is a "shared" volume, not necessarily a distributed one
14:37:52 the registry nodes need to see the same data upon reads
14:38:02 yes. Most of the cases where we are using this, it is because drbd is not distributed in the kernel
14:38:09 but they won't care how that happens
14:38:42 and yeah, i'd say that what we've been doing has been "mostly working ok"
14:38:46 even if not good
14:39:10 i.e., there haven't been a lot of outages due to it so far (and it's been humming along for over 1.5 years iirc)
14:39:22 and we can't add a 3rd node?
14:39:32 not that i'm opposed to changing it, just that i think it's probably not dire
14:39:47 we also have a plan, in the possibly near future, to move to quay.io; that means we would not have to host a registry
14:40:14 queue-tip: technically nothing stops us other than maybe resources (?)
14:40:17 cverna: true
14:40:21 cverna: are they open-sourcing it?
14:40:31 there is at least one other project using gluster - can't remember what it is though
14:40:32 cverna: how does that affect flatpak support?
14:40:42 bowlofeggs: at least odcs is. unhappily using it.
14:40:44 nirik: we need them to add OCI support for flatpak
14:40:45 for that we need quay.io to support multi-arch and osbs to support quay.io
14:40:54 nirik: oh, and we need them to add multi-arch support too
14:41:00 nuancier uses gluster iirc
14:41:07 threebean: hahaha
14:41:17 threebean: so there are problems due to outages or something?
14:41:36 bowlofeggs: oh, I was talking about gluster (odcs uses gluster, and it's a pain)
14:41:47 quay, otoh, has been super nice in my experience.
14:41:58 yeah we are missing a few things but I don't have much info on the roadmap of quay.io
14:42:01 threebean: yeah, i was more asking what sorts of pain?
14:42:07 afaik, the manifest list and osbs support are on the way there.
14:42:17 threebean, odcs could simply use nfs, i think
14:42:25 is this something that iscsi wouldn't be good for?
14:42:29 no
14:42:37 iscsi is 1:1 disk, not shared
14:42:52 you share iscsi and you end up with BOOOM
14:43:07 bowlofeggs: general setup. the config format changed entirely at some point, invalidating our setup. the new setup involves issuing commands to the nodes in a certain order that seems to be unreliable. I haven't seen any particular issues after it finally gets set up.
14:43:13 mizdebsk: ack, yeah.
14:43:19 ah interesting
14:43:31 i wonder if we hit any of that when we moved the registry from RHEL 7 to F28
14:43:48 what would be the pros/cons of switching to nfs for the registry? i assume maybe nfs is a bit slower?
14:43:50 well, dunno if these are packaged or the right solution, but there is OrangeFS and MooseFS
14:44:10 note I fixed one issue we had: if you pass the role a brick name with a trailing / it would not handle it well at all...
14:44:13 aikidouke, they don't fix the issue. The problem is they are distributed and want 3 nodes
14:44:26 bowlofeggs: major advantage of NFS for the registry: we don't have to manage gluster
14:44:30 hahah
14:44:46 that's funny, but actually a good point :)
14:45:04 The point here isn't specifically gluster. It just means we don't have to manage the storage system
14:45:06 i would expect that the registry doesn't really see that much load, though i don't have data about it
14:45:18 ah yeah
14:45:26 bowlofeggs: the multi-node is more for high availability than for the load
14:45:42 isn't there a way to get some more nodes for gluster?
14:45:43 I think we should give NFS a try if the switch is easy
14:45:58 Kal3ssiN: I'm sure there is. But that still means we need to manage it
14:46:40 and what's the issue? not a big fan of it, but it works.
14:46:42 ok I think we aren't going to reach any consensus here and we need to move to other items
14:47:06 Kal3ssiN, we can talk about it later
14:47:15 Kal3ssiN: the manpower we have available to deal with it if it dies at 3am
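A note on the gluster_volume trailing-slash fix mentioned above: this is not the actual role code, just a sketch of the failure mode, assuming gluster reports bricks without a trailing slash, so a brick passed as /bricks/brick1/ never matches the existing volume.

```python
def normalize_brick(brick: str) -> str:
    """Strip trailing slashes from the path part so bricks compare consistently."""
    host, _, path = brick.partition(":")
    return f"{host}:{path.rstrip('/') or '/'}"

# Bricks as gluster itself reports them (no trailing slash)...
existing = {"host01:/bricks/brick1", "host02:/bricks/brick1"}
# ...versus a brick passed to the role with a trailing slash.
requested = "host01:/bricks/brick1/"

print(requested in existing)                   # False: spurious mismatch
print(normalize_brick(requested) in existing)  # True after normalizing
```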
14:47:24 #topic Apprentice Open office minutes
14:47:24 #info A time where apprentices may ask for help or look at problems.
14:47:47 x3mboy, I think you had something to ask here?
14:47:56 Yes
14:48:00 What can I do?
14:48:22 I mean, I've logged into batcave01
14:48:27 And from there, nothing to do
14:48:30 what are you interested in doing?
14:48:51 so have you looked through the 98 open infrastructure tickets?
14:48:55 Well, I'm an operations guy, I like to solve problems
14:49:22 smooge, not through all of them
14:49:23 does zodbot know about infra ticket tags?
14:49:52 sorry - not wanting to derail
14:50:12 Just the easyfix
14:50:16 aikidouke: not tags, no, but it can list tickets...
14:50:32 x3mboy, maybe check out ticket 7105?
14:50:42 it's a problem to solve, i think we can give you the access you need
14:50:47 Just about every ticket I look at seems to require more knowledge of the infrastructure. Haven't looked since the last meeting though
14:50:49 I'm looking at it now
14:51:36 hi guys, I'm also new here =]
14:51:41 Well, I'm going to investigate it
14:51:41 :D
14:51:45 Kal3ssiN, welcome!
14:51:50 ty
14:52:04 yw
14:52:13 f28-test
14:52:59 queue-tip: sapo and I were looking at ways to help infra apprentices (like me) figure out the basics of the fedora-infra setup
14:53:41 I think I was supposed to message mizdebsk or whoever created a ticket regarding this, with the idea of creating an apprentice workday to devise/brainstorm a solution
14:54:28 also cverna hosts office-hours for questions by apprentices
14:54:43 also if you're python or javascript savvy https://github.com/fedora-infra is a good place to look for issues to solve
14:55:04 if there was an easyfix you wanted to look at now, someone could probably give you an idea to start research
14:55:17 I need to go, but I will look over 7105
14:55:28 aikidouke: ty
14:55:28 there is also https://github.com/release-monitoring
14:55:50 http://fedoraproject.org/easyfix
14:56:04 creaked: I will look at issues. That's a great idea. I could contribute there.
14:56:12 I also have a quick Q about something I'm working on, but can ask in the admin chnl if someone else needs time
14:56:35 Also if you can join the office-hours, then you can work together with other apprentices
14:56:41 like aikidouke and sapo are doing
14:56:41 eh, actually, will ask after meeting
14:57:22 cverna: Will do. I thought it was for help with tickets, not help _finding_ tickets as well
14:57:49 queue-tip: it is help :) for any topic, even personal life if needed :)
14:58:03 cverna: does office hours mean office hours for real, or is it a codename for something else...
14:58:25 oh snap cverna - I have all kinds of things to ask now...LOL...
14:58:30 Kal3ssiN: for real :)
14:58:35 aikidouke: :D
14:58:47 cverna: is there a link for that?
14:58:53 pingou: release-monitoring is missing on the page
14:59:14 Kal3ssiN: https://apps.fedoraproject.org/calendar/infrastructure/
14:59:27 cverna: ack, thx
14:59:38 mkonecny: there is a doc at the top on how to add it ;-)
14:59:47 Thanks, I will add it
14:59:59 * cverna has to go to another meeting
15:00:03 thanks all
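As a follow-up to the ticket-hunting discussion above (the zodbot tags question and the 98 open tickets): the pagure API can list and tag-filter issues on the fedora-infrastructure tracker. A minimal sketch; "easyfix" is one of the tags used on that tracker.

```python
import requests

# List open issues on the fedora-infrastructure tracker, filtered by tag.
resp = requests.get(
    "https://pagure.io/api/0/fedora-infrastructure/issues",
    params={"status": "Open", "tags": "easyfix"},
    timeout=30,
)
resp.raise_for_status()
for issue in resp.json()["issues"]:
    print(f"#{issue['id']}: {issue['title']}")
```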
15:00:24 ok I am going to have to wrap this up
15:00:38 nirik, did you have any tickets you wanted looked at this week?
15:00:46 naw. nothing urgent
15:00:58 there is one ticket on the agenda, but yeah it can wait
15:01:23 ok.
15:01:28 #topic open floor
15:01:47 ok thank you all for coming. Our meetings are weekly.
15:01:53 and usually within 1 hour
15:01:55 ty smooge!
15:01:58 I've just realized that the pagure 5.0 release is likely going to come in the middle of the beta freeze
15:02:01 I hope you all have a good week
15:02:06 thanks
15:02:09 and also with u
15:02:16 pingou, well better beta than final :)
15:02:17 thanks all!
15:02:21 smooge: sure
15:02:25 thanks smooge
15:02:29 #endmeeting