15:03:42 #startmeeting FESCO (2020-05-11)
15:03:42 Meeting started Mon May 11 15:03:42 2020 UTC.
15:03:42 This meeting is logged and archived in a public location.
15:03:42 The chair is ignatenkobrain. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:03:42 Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:03:42 The meeting name has been set to 'fesco_(2020-05-11)'
15:03:44 .hello2
15:03:46 zbyszek: zbyszek 'Zbigniew Jędrzejewski-Szmek'
15:03:49 .hello2
15:03:49 hey
15:03:50 #meetingname fesco
15:03:50 The meeting name has been set to 'fesco'
15:03:50 dcantrell: dcantrell 'David Cantrell'
15:03:58 #chair nirik, ignatenkobrain, decathorpe, zbyszek, bookwar, sgallagh, contyk, mhroncok, dcantrell
15:03:58 Current chairs: bookwar contyk dcantrell decathorpe ignatenkobrain mhroncok nirik sgallagh zbyszek
15:04:07 #topic init process
15:04:08 .hello2
15:04:09 .hello2
15:04:09 bcotton: bcotton 'Ben Cotton'
15:04:12 sgallagh: sgallagh 'Stephen Gallagher'
15:04:13 .hello2
15:04:14 bookwar1: Sorry, but you don't exist
15:04:15 I sent a reminder on #fedora-devel and then got disturbed at home
15:04:15 morning
15:04:21 sorry for delay :)
15:04:24 .hello2
15:04:26 decathorpe: decathorpe 'Fabio Valentini'
15:04:29 .hello2
15:04:29 bookwar: bookwar 'Aleksandra Fedorova'
15:04:30 .hello psabata
15:04:32 contyk: psabata 'Petr Šabata'
15:04:34 I've been disturbed for a long time. I understand
15:05:08 I think I see everyone here
15:05:20 so let's start
15:05:24 #topic #2372 F33 Self-contained Change: Network Time Security
15:05:26 .fesco 2372
15:05:27 ignatenkobrain: Issue #2372: F33 Self-contained Change: Network Time Security - fesco - Pagure.io - https://pagure.io/fesco/issue/2372
15:06:00 I can be +1 now.
15:06:11 * dcantrell reading again quickly
15:06:35 * mhroncok was +1 in the ticket
15:06:43 yeah, still +1 for me
15:06:47 Most people voted.
15:06:53 +1 (although has anaconda team signed up to do the changes there or ?)
15:06:59 +1
15:07:01 +1
15:07:01 We're at like 6.75 days on this one.
15:07:30 Well, we have +6 right here, so that's approved.
15:07:41 nirik: my understanding is that the anaconda part is postponed too.
15:07:53 But indeed, that wasn't clarified really.
15:07:57 hum, I still see it on the change page?
15:08:45 I see that too, so I think I will reach out to the change owner to clarify that, otherwise I think we are good
15:09:00 It says "Proposal owners" will do this (under "scope").
15:09:02 so I guess let's wait for dcantrell and have +9,0,-0?
15:09:37 I said I was +1
15:09:55 there, added it again to the ticket
15:10:02 I think we are +9 now
15:10:07 (sorry)
15:10:14 #action ignatenkobrain to contact change owners and clarify anaconda part of the change.
15:10:17 #agree APPROVED (+9, ±0, -0)
15:10:20 #topic #2381 F33 System-Wide Change: systemd-resolved
15:10:24 .fesco 2381
15:10:25 ignatenkobrain: Issue #2381: F33 System-Wide Change: systemd-resolved - fesco - Pagure.io - https://pagure.io/fesco/issue/2381
15:11:00 did we see more discussion on list?
15:11:28 No
15:11:31 no, I think there was no discussion so I was hoping that RH Security Team will put something or sgallagh but that did not happen
15:11:41 Sorry, my week was out of control
15:11:57 Anyway, with Michael's last message, I can be 0 I suppose
15:12:05 so shall we punt this another week for that discussion?
15:12:28 zbyszek: I'd prefer if you would coordinate any nsswitch.conf changes with me and/or sbose though.
15:13:34 nirik: if you know what you wait for, then probably. in this case, I think it is waiting for nothing
15:13:55 btw, I was running the proposed configuration for some time and did not notice any regressions
15:13:57 Yeah, I'm the only one hesitating here, so just vote it in and I'll reserve the right to say I Told You So later :)
15:14:08 well, I was thinking there was going to be some input from the security folks?
15:14:15 I'm also counting myself as a 0 here
15:14:25 and sgallagh was going to bring up the nsswitch.conf changes and see if there was a better way to do them?
15:14:32 sgallagh: sorry, I dropped off the net for a second.
15:14:46 I don't think we need to hold it up for that. I can work with them on the implementation
15:14:54 I'm running it here too... the dnssec support doesn't work with dnf's key verify is the only thing I have noticed.
15:14:54 sgallagh: changes to nsswitch.conf will most likely be in authselect profiles.
15:15:04 Proposal: Approve change, zbyszek to make sure coordinating nsswitch changes with sgallagh or sbose
15:15:18 sgallagh: is it enough if I cc you on any PR or ticket so that you can review it?
15:15:27 Yeah, that'll work
15:16:01 +1 from me
15:16:28 +1
15:17:01 mhroncok: contyk bookwar decathorpe vote?
15:17:03 +0 (I wasn't able to dedicate time to get the details here, sorry)
15:17:06 zbyszek: I suppose you are +1 here :)
15:17:23 zbyszek is 0
15:17:28 Abstain, I guess.
15:17:28 I've been +1 over a week :)
15:17:30 ignatenkobrain: no, I'm not voting on my own ticket ;)
15:17:54 ok, bookwar ?
15:17:58 added my vote to the ticket
15:18:03 I'll set my +1, but i assume that if the security implications will be found during testing, we will go back to revisit
15:18:24 bookwar: well, sure.
15:18:40 #agree APPROVED (+4, ±5, -0)
15:18:48 #topic Next week's chair
15:18:49 In the sense that if security issues are found, I hope we'll fix them immediately.
15:18:56 But in the worst case...
15:19:05 I can take next week if nobody else wants
15:19:17 Hmm, that's actually never happened before; I'm not sure our policy covers (+5, 5, -0)
15:19:26 err, +4
15:19:39 "Because Fedora is not prepared to handle an influx of DNSSEC-related bug reports, we will disable this feature altogether."
15:19:42 ignatenkobrain: wait
15:19:50 "A majority of the committee (that is, five out of nine) is required to pass a proposal in a meeting."
15:19:59 APPROVED (+4, ±5, -0) is not a thing
15:20:12 oh
15:20:15 #undo
15:20:15 Removing item from minutes:
15:20:17 #undo
15:20:18 Oh, for heaven's sake.
15:20:24 +1
15:20:33 you have it, and if this backfires, it's on me
15:20:40 people with 0, do you need more time?
15:20:42 mhroncok: thanks mhroncok
15:20:49 last chance to say it
15:20:51 #agree APPROVED (+5, ±4, -0)
15:20:53 I'll give it a +1 as well, since zbyszek said he'd coordinate with me
15:20:56 :)
15:20:57 mhroncok: we could've just let it sit in the ticket for a few days and it would have been approved :)
15:21:02 okaay
15:21:03 #undo
15:21:05 decathorpe: I know
15:21:14 thanks sgallagh
15:21:23 #agree APPROVED (+6, ±3, -0)
15:21:25 decathorpe: I just want to make sure we follow our rules
15:21:27 #topic Next week's chair
15:21:46 mhroncok: yeah, thanks for catching that.
15:21:52 I can do it next week, but I'll be away the week after
15:22:02 I'll take meeting chair if no one else wants to
15:22:03 mhroncok++
15:22:03 ignatenkobrain: Karma for churchyard changed to 6 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
15:22:13 Either way is fine with me
15:22:18 sgallagh: was first
15:22:24 #action sgallagh will chair next meeting
15:22:31 #topic Open Floor
15:22:42 anybody has anything for open floor?
15:22:48 yes
15:23:06 Proposal: Revise the voting rules to state that "0 (abstain) votes reduce the number of positive votes required to pass a measure"
15:23:07 not necessarily fesco, but what is next for the ticket I opened requesting 'sponsor' on my FAS account
15:23:20 sgallagh: hey! I was just typing this!
15:23:20 I know some of you may have seen that ticket or commented on it
15:23:28 * nirik can give his typical DC move update. Can wait until others are done.
15:23:35 sgallagh: I don't like such proposal to be just proposed on a meeting and voted on
15:23:46 sgallagh: -1
15:23:50 mhroncok: +1
15:24:03 mhroncok: Well, it's a meeting where we actually have all 9 members, so I figured it was reasonable
15:24:14 Plus, I think rule changes require unanimity?
15:24:16 sgallagh: no community involvement
15:24:16 I think this needs to be reworded, because we want to subtract from total too.
15:24:18 if half of the fesco says 0 - this shouldn't be approved, but rather discussed in more details, i think
15:24:19 sgallagh: wait, so you want to reduce the passing requirement to be the majority of only those voting?
15:24:34 bookwar: agreed
15:24:47 dcantrell: No, only those *not abstaining*. It's different.
15:24:47 bookwar: in that case somebody should say -1, no?
15:25:45 dcantrell: that should be approved in a few hours... it has to wait 7 days for votes, and I think it's up later this morning?
15:25:54 sgallagh: so I guess for your proposal - open a ticket and let's discuss it as usual.
15:25:58 nirik: noted, thanks for the update
15:26:01 Very well
15:26:19 #action sgallagh to open a ticket to discuss abstention policy
15:26:28 i have a topic on gcc10
15:26:31 Yes, I think people should either say "+1", or "-1" or "I need more time for discussion" or "0, I don't have an opinion because ...", and in that last case, this shouldn't block a ticket.
15:26:56 zbyszek++
15:26:56 ignatenkobrain: Karma for zbyszek changed to 3 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
15:26:57 zbyszek: I kinda agree, but let's have this discussion on devel?
15:27:02 mhroncok: sure
15:27:23 zbyszek: the - and + could be more formalized on 0 votes. -0 == need more time/discussion, +0 == no opinion
15:27:28 zbyszek: if half of the fesco couldn't find time to get the opinion - we are doing something wrong, imho
15:27:35 dcantrell: -1
15:27:56 dcantrell: I prefer to spell it out.
15:28:02 bookwar: is gcc10 broken entirely? :)
15:28:03 yeah that's fine too
15:28:19 I'll write up some ideas and we can work it through on the list.
15:28:32 sgallagh++
15:28:32 zbyszek: Karma for sgallagh changed to 2 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
15:29:04 sgallagh: thanks
15:29:10 dc move status: RDU - communishift is still down, hopefully we can get some traction to get the cabling and network changes we need done this week, hard to estimate however. IAD2: we have "found" all the machines that are new/were shipped. We have a number of virthosts installed and we have a few vm's up. This week will be getting external ips sorted out, firewall/nat rules, getting everything re-installed that needs it, and hopefully bringing up a
15:29:10 proxy or two and our auth stack (which means an openshift 3.11 cluster).
15:31:02 nirik: i've sent you e-mail on the requirements for fedora ci jenkins, which is supposed to replace taskotron. We can discuss there. I currently don't understand if we should just sit and wait, or we can have some other options
15:31:29 I can look... I just got up before the meeting, so I haven't processed mail yet
15:31:34 sure
15:32:50 dcantrell also had something...
15:33:00 I'll also try and find out more about this week's schedule when I can from networking/dcops
15:33:01 already mentioned, it was just my ticket question that nirik answered
15:33:21 nirik: is https://status.fedoraproject.org/CY2020-inframove.html being updated as the schedule changes or is it diverging?
15:33:47 that was more for the actual moving of services.
15:33:52 bookwar: do you want to talk about something related to gcc10?
you've mentioned this some lines above
15:34:15 https://hackmd.io/Eqsf5wFoQRGYhAVwSKJvIA is our running planning/tasks (but might be too detailed)
15:34:27 ignatenkobrain: let's answer dcantrell's question if possible, and then enter the flame )
15:34:47 ah, if it will be flame - I'll mentioned one thing
15:34:54 I have created script which to some extent automates and enforces FTI policy - https://pagure.io/releng/pull-request/9443
15:34:57 it is not perfect, but is quite good at setting needinfos and opening/closing bugs…
15:35:19 s/mentioned/mention/
15:35:50 nirik: ack. i'll try to parse that to update f33-infra schedule and run it by you
15:36:14 bcotton: sure.
15:36:24 ignatenkobrain: cool.
15:37:19 ignatenkobrain: should this be scheduled to run periodically from some infra job?
15:37:55 zbyszek: well, I guess once I add FTBFS support there and it is merged, that would be something that would be nice
15:38:37 ignatenkobrain: just make sure to stop / replace the current cron job for ftbfs
15:38:39 while you are at it, add the security ones. :)
15:38:43 :)
15:38:57 I was just thinking exactly that ;)
15:39:02 but really, great work...
15:40:24 so, let's move to bookwar topic?
15:41:03 so, approved gcc10 pre-release version for Fedora 32 caused major failures in many projects (ex things like rabbitmq, freecad) These projects are not in the workstation default, but having crashing `systemctl rabbitmq-server start` command is not nice
15:41:21 afaik updated gcc10 was merged last week
15:42:16 so, 1) do we need mass rebuild? 2) do we need a retrospective on the gcc10 change ? 3) should we extend release criteria to cover some more packages and by which criteria?
15:42:28 It's also causing some build-time issues, as there are a bunch of packages that use poor version-detection
15:42:30 hard questions
15:42:47 e.g.
comparing the first digit
15:43:03 I have plenty of feedback on 2), but it's an improvement, this time we at least had a change proposal before the update happened
15:43:37 mhroncok: let's try the retrospective then?
15:43:39 1) do we know how much things are broken because of that? does rebuild fix the problem?
15:43:58 bookwar: 2) we can, though I am not sure what we would discuss there.
15:44:14 bookwar: re 1: I don't think so. The ABI is not changed, so a mass rebuild will most likely only cause problems.
15:44:35 is there a tracker for these issues?
15:44:39 the problem is that packages with broken gcc were built successfully, but they failed randomly when running
15:44:42 the problem with such changes is that you don't really have any leverage unless the change has landed. -- for example an ignored "your package doesn't build" bug may have consequences, but an ignored "we plan to upgrade gcc and your package won't build" bug has none
15:44:43 A mass rebuild on the side, to see if there are build issues, would be useful. But it shouldn't be merged into rawhide.
15:45:16 zbyszek: so this is exactly the issue - we check only if we can build packages
15:45:22 but they started to fail on run
15:45:22 I also think that mass rebuilding *everything* on F32 is more dangerous than helpful
15:45:23 bookwar: how is that even possible? Are there no tests in those packages?
15:45:38 hehe
15:45:43 bookwar: I know that fweimer has some database which he uses to find what needs to be rebuilt if some glibc thing breaks
15:45:57 zbyszek: if you wanted to say - gating, i am with you on that :)
15:46:06 probably GCC folks can do similar thing?
15:46:13 Unfortunately I don't know more details to say anything here
15:46:13 is this affecting both 32 and rawhide? or just the version in 32?
15:46:55 I don't see any mention of crash details, just that "things are crashing". suggesting things like mass rebuilds or whatever is just throwing darts at the wall and hoping something works.
is there any analysis as to what's actually crashing? it'd make determining a course of action easier
15:47:04 Does anyone have the bug numbers handy?
15:48:58 https://bodhi.fedoraproject.org/updates/FEDORA-2020-2c6c85202d this is the update, which has upstream fixes
15:49:37 this is the example bug https://bugzilla.redhat.com/show_bug.cgi?id=1827357
15:50:00 the issue i hit was rabbitmq-server[13887]: *** stack smashing detected ***: terminated
15:50:42 we don't have to decide here really, and mass rebuild is probably too large of an action anyway
15:50:56 bookwar: but after upgrading gcc, the problem is gone, right? so no rebuild of rabbitmq was needed.
15:51:00 we should deal with the problems on case by case basis, mass rebuild is too dangerous
15:51:02 but i was wondering which measures we should consider to prevent this
15:51:13 bookwar: have more tests
15:51:14 ignatenkobrain: if you rebuild everything - yes
15:51:39 bookwar: the issue seen in rabbitmq-server would suggest it needs fixing since SSP caught something. it's not always 100% perfect, but it's reasonable
15:51:44 mhroncok: that was my default answer :) the other part was - don't update gcc to pre-release version maybe ?
15:52:06 dcantrell: no, that's already resolved issue, and it was on gcc side
15:52:18 Wait, did upgrading libgcc solve the problem?
15:52:26 Or did it require a rebuild with 10.1.1?
15:52:35 bookwar: we cannot eat the cake and have it
15:52:41 bookwar: in that case, nobody in this world will ever properly test gcc 🙂 if we don't use pre-release GCC for our packages
15:52:48 that
15:52:55 bookwar: I'm also seeing on the gcc changes page that the very first thing mentioned is fixing a C++ ABI compatibility issue. That likely is affecting things right now
15:53:01 bookwar: Yeah, Fedora is pretty much the only way GCC gets tested
15:53:16 Not carrying pre-releases just moves the fixes to post-release.
15:53:24 https://bodhi.fedoraproject.org/updates/FEDORA-2020-073c252157
15:53:36 (ie, rabbitmq-server was rebuilt with the fixed gcc)
15:53:40 bookwar: some failure will always slip through. I don't think we need to change the process on the basis of the few bugs that slip through.
15:54:00 looks like the rabbit thing is not really severe, if people are not superkarming it up
15:54:03 It would be great to catch "service crashes at start" much earlier though.
15:54:21 I don't know, it just seems to me that GCC 10 / fedora 32 was worse than other releases in that regard :/
15:54:29 zbyszek: it is not "some" this time, people in the community are literally switching to clang for some projects now
15:54:34 we should create an even rawer rawhide :)
15:54:49 really-raw-rawhide
15:54:50 dcantrell: cowskin?
15:55:10 i think it was due to the fact that packages were compiling, rather than failing in mass rebuild, and it let more errors through our process
15:55:12 there you go. get your gcc snapshots and hourly kernels and everything
15:55:13 bookwar: what people? where is the mob with pitchforks?
15:55:28 previously we got more build failures, which we were able to catch
15:55:43 bookwar: dunno, for me the gcc10 transition was easier than the gcc9 transition.
15:55:57 (In the sense of upstream work in various projects.)
15:56:16 would this have been caught by a gating test that did 'program --help' and confirmed it exited 0?
15:56:23 anyway, I think the real solution here would be to run tests of whole distro after libgcc update, and in more generic words, on a dependency change - run tests of a component and report results to the one who made an update
15:56:24 zbyszek: that's what i think happened, we got errors hidden under seemingly nice package rebuilds
15:56:29 * sgallagh needs to drop in three minutes
15:56:49 ignatenkobrain: yes please
15:56:53 nirik: Probably, yes
15:56:58 nirik: yes, this is what makes me think about covering mass rebuilds with additional testing
15:57:09 bookwar: how hard would it be to implement?
15:57:20 I mean my idea
15:57:27 currently i think mass rebuilds are excluded from gating
15:57:42 They are, because the load is enormous
15:57:45 yeah... they are.
15:57:58 sgallagh: thanks for the fesco voting email
15:58:05 Any time
15:58:05 I don't think bodhi can handle 25000 package side tag update :)
15:58:10 ignatenkobrain: your idea is the revdeps pipeline, we will set it up once we get jenkins running :)
15:58:16 bookwar: but if I update foo (which bar depends on), will gating tests run on bar?
15:58:22 I think rather than everything with a libgcc update, there be a short list of packages we test with it (_somehow_) and over time we just grow that list as necessary
15:58:48 so we probably need to include adamw in this conversation
15:59:02 he'd have a good idea of where to start with that
15:59:04 I'd pretty much prefer to have this async
15:59:05 we could possibly run everything thru gating, but it might take a while.
15:59:15 mhroncok: +1
15:59:18 OK, I have to drop. Thanks, folks.
15:59:19 nirik: I would be -1 for that
15:59:24 let me actually take an action on this
15:59:34 mhroncok: oh? why? due to the time?
15:59:43 nirik: I don't want for example a python 3.9 rebuild of 3000 packages be gated on a valgrind check in one of them
15:59:48 i need to talk with Fedora QA on how can we cover these cases better
16:00:35 mhroncok: well, I was thinking just the mass rebuild... and not to file it as some gigantic update, but to send each package in in a stream... like it would if everyone just rebuilt their packages
16:00:46 nirik: the gating land is hic sunt leones
16:01:06 #action bookwar to discuss with Fedora QA how to cover runtime testing better
16:01:07 nirik: oh, I misunderstood
16:01:12 * nirik can only parse that based on context. :)
16:01:38 so, anything else to discuss?
16:01:41 the only way to handle runtime testing better is to... have some runtime tests
16:01:46 no tests? no handling it better
16:02:01 mhroncok: well, even if you have them - something should trigger them :)
16:02:14 i meant mass-rebuild case, not runtime test in general :)
16:02:41 Also, packages should have a 'foo --test >/dev/null || exit 1' in %check.
16:02:53 Not hard to implement, but really helps with many many stupid bugs.
16:03:30 zbyszek: yes, maybe the answer to any gcc10 complaints is just go and submit the check like this
16:03:38 In particular writing to stderr from --help.
16:03:39 i should do it for rabbitmq right now..
16:03:59 :)
16:04:42 so, I'll close meeting in 10 if nobody has anything else to discuss
16:04:43 so, thank you all for the discussion :)
16:04:56 ignatenkobrain: 10 seconds?
16:05:03 ignatenkobrain++ for filing FTI bugs again
16:05:04 years
16:05:10 smooge++
16:05:10 mhroncok: Karma for smooge changed to 10 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
16:05:21 * nirik wonders if we could make it generic... for everything in _bindir and _sbindir ... run --help
16:05:39 nirik: if only that would actually work :D
16:05:41 #endmeeting
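
The generic smoke test floated at the end of the meeting (run `--help` for every program in a bin directory, then flag non-zero exits and anything written to stderr) could look roughly like the sketch below. This is only an illustration under stated assumptions, not an existing Fedora or RPM tool: the function name `smoke_test_help`, its `timeout` parameter, and the failure strings are all hypothetical, and it assumes every program accepts `--help`, which, as the last reply in the log hints, is often not the case.

```python
import os
import subprocess
from pathlib import Path


def smoke_test_help(bindir, timeout=10):
    """Run `<program> --help` for every executable file in bindir.

    Returns a dict mapping program name to a failure reason. An empty
    dict means every program exited 0 and wrote nothing to stderr.
    (Hypothetical helper, written for illustration only.)
    """
    failures = {}
    for path in sorted(Path(bindir).iterdir()):
        # Skip directories and files without an execute bit.
        if not path.is_file() or not os.access(path, os.X_OK):
            continue
        try:
            result = subprocess.run(
                [str(path), "--help"],
                stdin=subprocess.DEVNULL,
                stdout=subprocess.DEVNULL,  # help text itself is not checked
                stderr=subprocess.PIPE,     # but stderr output is a failure
                timeout=timeout,
            )
        except (OSError, subprocess.TimeoutExpired) as exc:
            failures[path.name] = f"could not run: {exc}"
            continue
        if result.returncode != 0:
            failures[path.name] = f"exit status {result.returncode}"
        elif result.stderr:
            failures[path.name] = "wrote to stderr"
    return failures
```

A packager might call something like `smoke_test_help("/usr/bin")` against an installed build root and fail the check if the returned dict is non-empty. In practice an allow-list of programs known to support `--help` would probably be needed.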