15:01:48 #startmeeting Infrastructure (2019-02-21)
15:01:48 Meeting started Thu Feb 21 15:01:48 2019 UTC.
15:01:48 This meeting is logged and archived in a public location.
15:01:48 The chair is smooge. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:48 Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:01:48 The meeting name has been set to 'infrastructure_(2019-02-21)'
15:01:48 #meetingname infrastructure
15:01:48 #topic aloha
15:01:48 The meeting name has been set to 'infrastructure'
15:01:48 #chair nirik pingou puiterwijk relrod smooge tflink threebean cverna mkonecny mizdebsk
15:01:48 Current chairs: cverna mizdebsk mkonecny nirik pingou puiterwijk relrod smooge tflink threebean
15:02:01 morning
15:02:03 morning.
15:02:06 .hello zlopez
15:02:07 mkonecny: zlopez 'Michal Konečný'
15:02:07 morning everyone
15:02:10 .hello
15:02:12 pingou: (hello ) -- Alias for "hellomynameis $1".
15:02:18 meh
15:02:21 .hello pingou
15:02:22 .hello2
15:02:22 pingou: pingou 'Pierre-Yves Chibon'
15:02:25 mizdebsk: mizdebsk 'Mikolaj Izdebski'
15:05:02 #topic New folks introductions
15:05:02 #info This is a place where people who are interested in Fedora Infrastructure can introduce themselves
15:05:02 #info Getting Started Guide: https://fedoraproject.org/wiki/Infrastructure/GettingStarted
15:05:10 Hello, any new people this week?
15:05:47 I'm so old I could be new again? :)
15:05:49 * nirik runs
15:06:40 nirik, you got hit by the 1970-01-01 fedmsg bus?
15:07:09 I even predate that. :) but I digress...
15:07:17 #topic announcements and information
15:07:17 #info nirik will have sparse hours due to house move
15:07:17 #info Mass branching happened and it was ...
15:07:17 #info Mass update/reboots planned for 2019-02-26 -> 2019-02-28
15:07:17 #info Beta Freeze Begins 2019-03-05
15:07:18 #info Anitya (release-monitoring.org) 0.15.0 was released and deployed on staging
15:07:20 #info Pagure 3.2.90 was deployed to stg.pagure.io. Please test
15:07:22 #info Pagure 3.2.90 was deployed to src.stg.fedoraproject.org. Some manual work was needed
15:07:24 #info Taskotron Staging re-deployed on F29, mostly working
15:07:38 5.2.93 even :)
15:07:53 you can tell I wrote this earlier this week
15:08:03 oh cool.
15:08:11 #info Koji 1.16.2 was deployed today. No new features, just a CVE fix.
15:08:13 did that include repospanner 0.4?
15:08:18 no
15:08:22 ok
15:08:32 repospanner 0.4 may be tomorrow or Monday
15:09:08 we should schedule a mass update/reboot next week... tue/wed or just marathon wed.
15:09:28 Yep. I would prefer Tue/Wed
15:09:42 puiterwijk: nice cve
15:09:43 Unless we are going to do it during working hours
15:10:20 because I am finding working after 00:00 UTC hard
15:10:28 I probably will finally be moving next week (I sure hope), but still unclear when, and I can help as time permits
15:11:37 might know more tomorrow, or we could just schedule tue/wed and I can try and help as much as I can.
15:11:45 .helo2
15:11:47 .hello2
15:11:48 bowlofeggs: bowlofeggs 'Randy Barlow'
15:11:54 (sorry i'm late)
15:12:22 ok, that does change the calculations a bit
15:12:28 puiterwijk, how does your next week look?
15:12:44 smooge: hopefully better than the last few?
15:12:47 you are on call, so it would be your say on what days and times you want to deal with outages
15:12:50 #info bodhi-3.12.{0,1,2} were deployed this week (1 and 2 were sad)
15:12:55 * tflink plans to be around to help, at least with the qa stuff
15:13:14 #undo
15:13:16 and I realized.. I need to do this later in the meeting
15:13:17 #info bodhi-3.13.{0,1,2} were deployed this week (1 and 2 were sad)
15:13:18 smooge: meh, whatever. I'll end up dealing with it anyway probably.
15:13:38 so let's put off scheduling to the section where we do that
15:13:52 #topic Oncall
15:13:52 #info smooge is on call from 2019-02-14 -> 2019-02-21
15:13:52 #info puiterwijk is on call from 2019-02-21 -> 2019-02-28
15:13:52 #info ?????? is on call from 2019-02-28 -> 2019-03-07
15:13:53 #info ?????? is on call from 2019-03-07 -> 2019-03-14
15:13:54 #info ?????? is on call from 2019-03-14 -> 2019-03-21
15:13:55 #info Summary of last week: (from smooge)
15:14:09 So last week had a couple of issues.. almost all of them were fixed by mizdebsk
15:14:11 * relrod gets home from class in time for meeting for once! Ooh yay, just in time to be given on-call duties :P
15:14:44 thank you very much mizdebsk
15:14:55 mizdebsk++
15:14:55 pingou: Karma for mizdebsk changed to 17 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
15:15:07 yes indeed. appreciated. ;)
15:15:13 nice
15:15:18 my pleasure
15:15:37 mizdebsk++
15:16:14 mizdebsk is going to help me get something in place to restart fedmsg-hub and fedmsg-hub3 if they grow too large in memory
15:16:26 and then that will take care of our biggest source of pages
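For illustration, a minimal sketch of the restart-on-memory-growth watchdog being discussed here, assuming psutil is available and the hubs run as systemd units; the 2 GiB ceiling and the exact unit names are illustrative guesses, not values Fedora Infrastructure settled on:

    #!/usr/bin/env python3
    """Restart fedmsg-hub if its resident memory grows past a threshold."""
    import subprocess
    import psutil

    LIMIT_BYTES = 2 * 1024 ** 3             # hypothetical 2 GiB ceiling
    UNITS = ["fedmsg-hub", "fedmsg-hub3"]   # assumed systemd unit names

    def rss_for(unit):
        """Sum resident set size over processes whose argv matches the unit."""
        total = 0
        for proc in psutil.process_iter(["cmdline", "memory_info"]):
            args = proc.info["cmdline"] or []
            if any(a == unit or a.endswith("/" + unit) for a in args):
                total += proc.info["memory_info"].rss
        return total

    for unit in UNITS:
        if rss_for(unit) > LIMIT_BYTES:
            # A restart drops the hub back to a fresh memory footprint.
            subprocess.run(["systemctl", "restart", unit], check=True)

In practice something like this would run from cron or a systemd timer, so the nagios memory pages never fire in the first place.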
15:16:41 #topic Monitoring discussion
15:16:41 #info https://nagios.fedoraproject.org/nagios
15:16:41 #info Go over existing outstanding items and fix
15:16:59 I think it's mostly similar to last week.
15:17:02 * nirik looks
15:17:30 someone could power back up kvm01 (not that we use it much)
15:18:08 osbs-master01.stg is firewalled (fixing)
15:18:35 perhaps we should move back to autosign01 later today?
15:18:40 should we turn swap back on on pkgs02.phx2.fedoraproject.org?
15:19:08 nirik: autosign01 is on my list, will work on it today
15:19:17 relrod: cool.
15:19:34 smooge: if you can. I tried to clear it, but it won't go back on because memory is too fragmented.
15:19:45 ah ok.. that is a reboot then
15:19:48 swapon: /dev/vda2: swapon failed: Cannot allocate memory
15:20:00 which it will get next week
15:20:13 we should remove the askbot fedmsg check
15:20:34 ok, will do that
15:20:50 ppc8-02/03 and virthost06 have disks to replace
15:21:01 smooge, did you have time to look at the openshift nagios plugin?
15:21:31 no, sadly the proxy 503 issue ate all my time
15:21:38 which reminds me
15:21:47 no problem
15:22:03 #info mirrorlist proxies are much lower on 503's, but proxy01 still generates many more than any other
15:22:14 #topic Tickets discussion
15:22:14 #info https://pagure.io/fedora-infrastructure/report/Meetings%20ticket
15:22:38 No one marked a ticket for discussion, so we will move to the next topic.
15:22:44 nice work on the 503s smooge :)
15:23:07 how much did it decrease on non-proxy01?
15:23:09 Please, people (and audience members): if you have a ticket that you want us to work on or talk about before anything else, please mark it
15:23:22 from several thousand to 0
15:23:45 but I want to talk about that next
15:23:55 and if you can't mark the ticket for meeting, you can add a comment asking us to do so
15:23:58 #topic Priorities for next week?
15:23:58 #info please put tickets needing to be focused on here
15:24:06 mizdebsk++
15:24:21 OK, the top priority for next week will be updates/reboots
15:24:42 https://pagure.io/fedora-infrastructure/issue/7523 may be nice to have
15:24:54 so we can move ahead with testing some of the already-ported apps in stg
15:25:03 +1
15:25:25 I was going to do that, but I had questions... which folks have replied to, but I haven't gotten back around to it.
15:25:28 Anitya and the-new-hotness are waiting for this already :-)
15:25:57 There are a TON of packages queued up for systems.. so I am actually going to ask to update staging hosts sooner and test that we don't have a BOOM
15:25:58 basically, how we want to name them: $service, $fqdn, something else
15:26:01 i wouldn't mind seeing some movement on the rabbitmq certificates
15:26:05 nirik: thanks for that :)
15:26:17 bodhi in production *should* be able to publish with fedora-messaging (but we should test in stg first, of course ☺)
15:26:33 pagure 5.3 comes with fedora-messaging support as well
15:26:34 it sounds like we want to do $service I suppose...
15:26:35 nirik: let's call them $service.
15:26:41 (to be tested as well)
15:26:47 that seems... less exact, but ok
15:27:02 $service.$fqdn ?
15:27:13 is there a schema? perhaps we could name them by that?
15:27:18 smooge: well, the thing is that we don't often have an fqdn...
15:27:30 org.fedoraproject.bodhi or whatever
15:27:32 Due to the fact that many things are moving to openshift. And in addition, as Jeremy said, per-host certs add nothing
15:27:41 Sure, that's fine.
15:28:00 $domainname.$service ?
15:28:05 io.pagure.pagure?
15:28:24 I would suggest not making it harder than need be...
15:28:24 if there's a source of truth for that, I can use that...
15:28:47 or we can just do servicename I guess.
15:28:51 we can always redo it
15:28:55 +1
15:28:58 ok, I am going to give people 1 more minute to call out shed colours
15:28:59 +1
15:29:03 Just servicename is fine I'd say. Nobody else will see it in the messages anyway
15:29:08 well, $servicename$env
15:29:13 +1
15:29:15 It's just the username for rabbit.
15:29:22 pagure.stg - srcfp.stg
15:29:22 we do want separate stg and prod certs, right? they are different CAs I think?
15:29:43 so we could just name them the same, but that might be confusing
15:30:15 nirik, puiterwijk: would any eventual leakage between environments cause problems?
15:30:16 How is it done now for fedmsg?
15:30:21 because we know they will mix
15:30:23 Different CAs, yeah. But adding .stg or .prod might be useful for quick determination
15:30:31 smooge: there are entirely separate CA chains.
15:30:38 So the stg cert just will not work for prod
15:30:48 mkonecny: mostly fqdn, but with openshift... that's not very useful
15:31:07 right, so I will do $servicename.stg for stg and $servicename for prod
15:31:15 with fedmsg, staging services sometimes listen to prod messages; we don't want to allow it with fedora-messaging?
15:31:16 puiterwijk, someday someone will sign something wrong.. OR we will have one of our systems which needs to be on both
15:31:17 (so we can use the ansible $env thing)
15:31:25 Sure
15:31:36 smooge: if it needs to be on both, it'll need to connect to two brokers anyway
15:31:54 i.e. two different entire connection chains, which includes certs
15:31:55 mizdebsk: we can... I guess for those we would do specific, more descriptive certs?
15:32:12 I *think* that we had plans to do a broker-shovel.
15:32:17 openqa-stg-listening-on-prod
15:32:24 So have the prod broker send stuff to the stg broker
15:32:25 or that could work
15:32:43 ok, anything more on this?
15:32:47 And then apps can just ignore specific topics or not
15:32:51 * nirik will get these today
15:33:12 OK, I marked the ticket as next-meeting just to say it was this meeting
15:33:32 thanks nirik and everyone else
15:33:39 nirik: let's deploy to prod at EOB friday in your TZ
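To make the agreed naming concrete, here is a hedged sketch of how a $servicename (with a .stg suffix in staging) client certificate might be presented to the broker from Python with pika. The broker hostname, certificate paths, and the use of EXTERNAL (certificate-based) authentication are illustrative assumptions, not the actual Fedora Infrastructure configuration:

    #!/usr/bin/env python3
    """Connect to RabbitMQ with a per-service TLS client certificate."""
    import ssl
    import pika

    ENV = "stg"                                    # "" for production
    SERVICE = "bodhi"                              # just the service name, no fqdn
    CERT = f"{SERVICE}.{ENV}" if ENV else SERVICE  # e.g. "bodhi.stg"

    # Staging and production use entirely separate CA chains, so a stg
    # certificate simply will not validate against the prod broker.
    context = ssl.create_default_context(
        cafile=f"/etc/pki/rabbitmq/{ENV or 'prod'}-ca.crt"  # assumed path
    )
    context.load_cert_chain(
        certfile=f"/etc/pki/rabbitmq/{CERT}.crt",
        keyfile=f"/etc/pki/rabbitmq/{CERT}.key",
    )

    params = pika.ConnectionParameters(
        host="rabbitmq.stg.example.org",  # hypothetical broker hostname
        port=5671,
        ssl_options=pika.SSLOptions(context),
        # With EXTERNAL auth the broker derives the username from the cert,
        # which is why the cert name is effectively "just the username for rabbit".
        credentials=pika.credentials.ExternalCredentials(),
    )
    pika.BlockingConnection(params).close()

A service that needs both environments would repeat this with two entirely separate connection chains, as puiterwijk notes above.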
15:33:51 Any other high priority items for next week?
15:34:07 smooge: yeah, is next week going to be the week of the linux desktop?
15:34:20 that can wait for open floor
15:34:29 bowlofeggs: isn't it always?
15:34:37 #topic Discuss: Is the Fedora pastebin still useful? - relrod
15:34:37 #info how many users are using it?
15:34:37 #info should we look at converging with the CentOS one for a simpler setup?
15:34:41 relrod, you are up
15:34:42 haha
15:35:00 i use the pastebin sometimes, fwiw
15:35:04 So our pastebin has become a cesspit of spam
15:35:04 I use it all the time
15:35:08 so i'd say yes
15:35:12 * pingou uses it most frequently
15:35:15 and we are also duplicating efforts with centos, who run one
15:35:26 i would also say that there's no reason people couldn't just use another public one though
15:35:37 and yeah, duplicating with centos is not ideal either
15:35:39 relrod: can fpaste work with the centos one?
15:35:40 I would like to offer the following option
15:35:43 but why should either project host one?
15:35:48 I'm fine with consolidating on one
15:35:49 and can the centos one be branded/answer for ours?
15:35:57 Also upstream is... somewhat dead? They talked about doing a rewrite/version 2 of it and then went silent
15:36:04 just run two instances of the same thing
15:36:09 sadly the way of such things.
15:36:25 relrod: isn't this the second or third upstream that we've used?
15:36:31 tflink: yeah :(
15:36:42 we run with Bahhumbug's playbooks and modify fpaste to work with the PHP Stikked
15:36:46 yikes, short lifetimes
15:36:49 nirik: fpaste right now can't, but I think it would be pretty easy to get it to hit the centos one
15:37:12 relrod: the cli?
15:37:17 pingou: yeah
15:37:20 ok
15:37:27 the CentOS one is set up to flush everything after 1 day to deal with spam
15:37:32 is there a reason for either project to host one of these though?
15:37:43 there are other hosted ones on the interweb, right?
15:37:46 if there was a nice one we could run in openshift that could be an option, but of course we would still be running it.
15:37:46 A LOT of people use them
15:37:58 bowlofeggs: branding I guess?
15:38:06 even spammers, and that's telling something :)
15:38:07 i know people use them, but why do they need to use ours and not someone else's?
15:38:14 yeah, branding
15:38:15 show your pride in fedora by using paste.fedoraproject.org or something?
15:38:30 it's fun to host stuff to an extent, but this doesn't seem like a wheel house to me
15:38:34 bowlofeggs, the reason the Fedora community asked us was for 2 reasons: 1 branding and 2 they trusted us more than most of the paste sites
15:38:38 what even is a wheel house…
15:38:38 also we have a cli and I don't know how some places would feel about allowing it?
15:38:50 must be a water wheel thing?
15:39:02 bowlofeggs: the top cabin of the ship with the wheel that directs it and the captain.
15:39:04 the cli is nice for sure
15:39:07 i use it
15:39:08 a lot of those other sites go dodgy with ads and other stuff
15:39:12 bowlofeggs: http://taylormarshall.com/wp-content/uploads/2013/09/hamster-ball.jpg
15:39:14 nirik: oooh
15:39:16 haha
15:39:30 pingou: haha
15:39:33 yeah, there's a bunch with ads and trackers and other malware
15:39:44 well, i'm not against hosting it, just thought it was worth asking the question
15:39:50 doesn't affect me really ☺
15:39:52 so we were asked to set this up to avoid that for fedora qa and other related tasks
15:39:57 i do use ours for sure and i like it
15:40:00 if the centos one could be branded, or we could both run the same one, that might be good
15:40:17 info bowlofeggs is now the PM for the paste project because it didn't affect him before
15:40:17 what does the centos one look like?
15:40:21 lol
15:40:24 hahahah
15:40:25 https://paste.centos.org/
15:40:36 I suspect that this question has been asked, but would it be possible to re-purpose something like spamassassin to help filter out some of the spam?
15:40:44 is anyone from centos here today? bstinson?
15:40:50 tflink, stikked has several plugins
15:40:52 tflink: possible, but work...
15:41:04 tflink: possibly, but then we diverge from upstream and have to maintain a fork of modernpaste
15:41:06 but it is a constant job for Bahhumbug on that
15:41:22 relrod: we might be able to contribute to upstream
15:41:24 tflink, it is a lot of work.
15:41:26 nirik: yeah, I hadn't gotten to the "would it be worth the effort" part :)
15:41:26 i'm sure that'd be generally beneficial
15:41:27 proposal: gather info for a week and discuss next time?
15:41:35 of course, writing it at all takes $resources
15:41:52 nirik: +1
15:41:52 +1 to nirik's proposal
15:42:00 nirik: sure. I usually have class at meeting time, but if someone else can bring it up next week, +1
15:42:10 we can and should discuss this on the list then
15:42:15 relrod: can you do the gathering of info? or would you like someone else to?
15:42:16 the branding on the CentOS paste is minimal
15:42:25 relrod, I think you opened up an email thread on this previously?
15:42:26 so we could likely just replace CentOS by Fedora and be done
15:42:47 smooge: Not that I recall, at least not recently
15:42:52 if we run a Stikked in openshift we could make it use EmptyDir for storage, and it just gets wiped every time we deploy.
15:43:16 nirik: :s may make pastes disappear sooner than a day
15:43:16 it might be even possible to do some proxy server tricks where it's the same paste instance, but depending on the CNAME you use you see a different logo
15:43:20 pingou, the centos site is an incredibly simple setup and most of the options are turned off to make it as fast as possible
15:43:32 they don't seem really active either...
15:43:41 no paste project is active
15:44:03 smooge: do we need/want any of the options that are actually off?
15:44:06 bowlofeggs: or just do a theme that has both logos somehow and make paste.fp.o and paste.c.o both work, if they are amenable to that
15:44:13 pingou, I don't think so
15:44:20 we could use twitterfs to store the pastes in twitter as tweets! then we don't have to solve storage issues in openshift!
15:44:29 there are a couple of things we do to guard against crawling/indexing too
15:44:33 ok guys.. we have a motion
15:44:42 but smooge is better connected about this than i am
15:45:00 well, I just pinned Bahhumbug down last week on it
15:45:11 nirik: I can try to gather some info...
15:45:37 cool. If you like I can assist, or stay out of your way
15:45:50 I think we have more options than we did before now...
15:45:50 relrod, what nirik just said
15:46:14 because if we just run it in openshift we can avoid packaging it
15:46:40 so we could even look at hastebin or something
15:46:48 #info relrod will gather more info on pastes and we will discuss options next week
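As a sketch of tflink's spamassassin idea above: a paste-submission hook could pipe the paste body through spamc and reject on a spam verdict. This assumes a local spamd is running, and rejecting outright (rather than flagging for review) is an illustrative choice:

    #!/usr/bin/env python3
    """Reject spammy pastes by piping them through SpamAssassin's spamc."""
    import subprocess

    def paste_is_spam(body: str) -> bool:
        # spamc -c prints "score/threshold" and exits 1 when the input
        # scores as spam, 0 when it is ham.
        result = subprocess.run(
            ["spamc", "-c"],
            input=body.encode("utf-8"),
            stdout=subprocess.PIPE,
        )
        return result.returncode != 0

    if __name__ == "__main__":
        sample = "BUY CHEAP WATCHES http://example.com/not-a-real-paste"
        print("spam" if paste_is_spam(sample) else "ham")

Whether wiring this into Stikked (or any replacement) is worth the divergence from upstream is exactly the "would it be worth the effort" question raised above.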
15:47:06 My main thing is the spam problem. The 7 day thing hacks around that, but makes it less useful imo (there are pastebins online with no expiry). So I just wanted to toss the question out.
15:47:30 I'm fine with up to 2 or 3 days
15:47:32 yeah, might be worth looking at what other ones do for that...
15:47:42 centos offers 1 day, which may be a little short
15:47:43 it must be a common problem for them all
15:47:56 most of the pastebins don't care about spam
15:48:12 the ones with ads actually like it, because they make money off it
15:48:20 many of them have anti-features we don't want either, like password-protected or client-encrypted pastes
15:48:26 perhaps we could offer 1 day to anonymous users and more to authenticated users
15:49:04 ok, let's put those ideas on the future email thread
15:49:07 mizdebsk: well, that's the other thing, we never got auth working on modernpaste. puiterwijk had a patch that integrated with oidc stuff, but ran out of cycles to work on it
15:49:21 we are coming up to an hour and I would like to see if we have other items to discuss
15:49:27 smooge: +1
15:49:41 #topic mirrorlist 503's
15:50:00 #info we had a big increase in 503's on proxy01 starting 2019-01-20
15:50:31 #info we worked on a couple of wsgi patches which cut the number in half on proxy01
15:51:19 was 2019-01-20 the day we moved it to f29?
15:51:20 #info the other hosts which have large numbers are in the EU, where they get hit more due to DNS
15:51:52 nirik, I don't know.. I was focusing on some other items at the time
15:52:02 nope, it was moved in december
15:52:16 when did the mirrorlist image get updated?
15:53:16 at the moment I am mostly writing about this to document it in the meeting minutes
15:53:22 "2 months ago"
15:53:47 is it worth checking if we have a single ip or network hitting things hard? (that showed up on 01-20...)
15:53:59 nirik: "docker image inspect"?
15:54:20 "created": "2018-10-15T11:04:36Z",
15:54:28 when I did it.. it was all over the place
15:54:30 then the next change was the ones we did about a week ago
15:54:45 basically at the beginning of the hour 20,000 systems check in
15:55:36 they hit the server from the end of the :59 minute to the beginning of the :00 minute, and then some number of them get a BOOM 503
15:55:49 i've g2g to another meeting - have a great day everyone!
15:55:56 see you bowlofeggs
15:56:22 I will put more info in another email. I think we have run out of time to discuss this longer
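A hedged sketch of how the top-of-the-hour theory could be checked from the proxy logs: bucket 503 responses by minute-of-hour and see whether :59 and :00 dominate. The log path and the combined-log layout are assumptions about the proxy configuration:

    #!/usr/bin/env python3
    """Count mirrorlist 503s per minute-of-hour to spot a checkin storm."""
    import re
    from collections import Counter

    LOG = "/var/log/httpd/mirrors-access.log"  # hypothetical path
    # Matches e.g. [21/Feb/2019:14:59:58 +0000] "GET /mirrorlist?... HTTP/1.1" 503
    LINE = re.compile(r'\[\d+/\w+/\d+:\d+:(\d+):\d+ [^\]]+\] "[^"]*" (\d{3}) ')

    per_minute = Counter()
    with open(LOG) as fh:
        for line in fh:
            m = LINE.search(line)
            if m and m.group(2) == "503":
                per_minute[m.group(1)] += 1  # bucket by minute-of-hour

    # If 20,000 clients check in at the top of the hour, :59 and :00
    # should dominate this list.
    for minute, count in per_minute.most_common(10):
        print(f":{minute}  {count} x 503")

The same bucketing by client IP instead of minute would answer nirik's question about whether a single network started hammering things on 01-20.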
15:56:29 #topic Open Floor
15:57:30 anyone have anything for the floor..
15:57:47 if not, thank you all for coming this week. I hope to see you all next week
15:57:48 I was previously a member of the fedora-apprentice group but was removed due to inactivity. I think I'm going to have some spare time coming up and was hoping I could get re-added to that group.
15:58:15 ok, get with me after the meeting in #fedora-admin
15:58:17 I'll hopefully have some items next week on community openshift, but we can save that for next week. ;)
15:58:29 I wanted to look into #7376 among others.
15:58:30 nirik, I was going to say 2 weeks
15:58:37 because next week is going to be boxes
15:58:41 sooo many boxes
15:58:58 yeah, and me moving probably, or imploding due to not moving
15:59:31 #endmeeting