16:29:09 #startmeeting fedora_coreos_meeting
16:29:09 Meeting started Wed Jul 15 16:29:09 2020 UTC.
16:29:09 This meeting is logged and archived in a public location.
16:29:09 The chair is dustymabe. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:29:09 Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:29:09 The meeting name has been set to 'fedora_coreos_meeting'
16:29:12 #topic roll call
16:29:24 .hello2
16:29:25 slowrie: slowrie 'Stephen Lowrie'
16:29:33 .hello2
16:29:34 bgilbert: bgilbert 'Benjamin Gilbert'
16:29:40 .hello2
16:29:41 dustymabe: dustymabe 'Dusty Mabe'
16:29:53 .hello2
16:29:54 jlebon: jlebon 'None'
16:29:55 .hello2
16:29:57 lucab: lucab 'Luca Bruno'
16:30:11 .hello cverna
16:30:12 cverna: cverna 'Clement Verna'
16:30:24 .hello2
16:30:25 jdoss: jdoss 'Joe Doss'
16:30:29 o/
16:30:32 #chair slowrie bgilbert lucab cverna jdoss jlebon cyberpear
16:30:32 Current chairs: bgilbert cverna cyberpear dustymabe jdoss jlebon lucab slowrie
16:30:33 .hello2
16:30:34 cyberpear: cyberpear 'James Cassell'
16:32:42 #topic Action items from last meeting
16:32:51 * bgilbert to enable LogBot
16:32:54 * jdoss to open a ticket to pin podman to 1.x series version in
16:32:56 testing-devel with dustymabe's help
16:32:59 .hello2
16:33:00 lorbus: lorbus 'Christian Glombek'
16:33:06 #info bgilbert enabled LogBot
16:33:42 #info jdoss and dustymabe were able to pin podman to 1.9.3 in testing-devel stream. the next stream will contain podman 2.x and we can gather feedback on the update from users there.
16:34:06 is someone managing security patches for the pinned version? or is it just a temporary pin until 2.x has baked in next?
16:34:21 The PRs that I have been watching got merged so 2.0.3 most likely will be what we want to push to Next as soon as it drops.
16:34:54 cyberpear: ideally a temporary pin. we just want to make sure there isn't any large fallout and also wanted to get a few things fixed first
16:35:30 major version bumps are major version bumps for a reason :) figured there should be some soak time even if they have the goal of full backwards compat
16:35:54 anything else before we dive into tickets
16:36:03 +1
16:36:05 thanks bgilbert for enabling logbot!
16:36:37 I like that logbot automatically trims the join/left messages
16:37:14 #topic Discussion: OKD release schedule and blocking FCOS releases on OKD e2e tests
16:37:21 #link https://github.com/coreos/fedora-coreos-tracker/issues/562
16:37:28 lorbus: you're up
16:37:34 this is two things really
16:37:42 the first thing is an announcement:
16:38:06 OKD will release roughly bi-weekly, alternating with the FCOS releases
16:38:22 i.e. an FCOS image gets an additional week of bake time for OKD
16:38:45 #info OKD will release roughly bi-weekly, alternating with the FCOS releases
16:39:16 * dustymabe waits for 2nd bullet point before diving in to discussion
16:39:32 the second part is a discussion item: Whether or not we want to block FCOS releases on working for OKD
16:39:45 I would say no - for a few reasons:
16:40:19 First of all changes in the base OS may require changes in the cluster/operator code
16:40:55 changes like that would make an OKD e2e test fail of course
16:41:10 so I don't think we can block on that
16:41:34 right, anything else?
16:41:36 well it was really just that one reason :D
16:41:44 ok
16:41:57 so the points are:
16:42:01 1. OKD will release roughly bi-weekly, alternating with the FCOS releases
16:42:02 what I would propose is that we do run e2e OKD tests with any given FCOS testing image
16:42:10 2. Whether or not we want to block FCOS releases on working for OKD
16:42:26 but make them non-blocking - that will give us time to fix things on the OKD side
16:43:19 when we say "FCOS releases" we are only talking about `stable`?
16:43:39 yes, OKD releases will only use images from the stable stream
16:43:40 right. I agree that we should not "block" by default on OKD tests passing, but we do care about the OKD use case and we should inspect those failing tests. They may be indicative of a bigger problem that could affect more than just OKD.
16:44:14 lorbus: i'm assuming OKD also has the same promotion pipeline as OCP, right? so e.g. a broken-for-OKD FCOS release normally wouldn't get auto-promoted
16:45:00 definitely +1 for OKD tests in f-c-c though
16:45:11 yes, OKD releases are done manually - we wouldn't promote a broken image to an OKD release
16:45:48 lorbus: in that case maybe we should run the tests on the testing-devel stream and not `testing`
16:45:51 (broken as in broken for the OKD use-case)
16:45:55 or both
16:46:07 maybe both to be sure?
16:46:23 meh, i'd just turn it on in CI for everything
16:46:25 but that's debatable, maybe testing-devel will suffice
16:46:38 jlebon: +1
16:46:42 ("for everything" for all the branches)
16:46:45 if we can do that, I think we should
16:47:29 maybe, though, we might want to be able to turn it off easily in case there is a known issue that we're working on and don't want to get notified all the time about it
16:47:42
16:47:56 i think we can zoom out and try to write something down we agree on
16:48:37 +1 re. not blocking releases overall
16:48:39 first off, does anyone have concerns about triggering OKD tests whenever we have a new build of FCOS?
16:48:58 to be clear, the tests get started, but they don't block anything if they fail
16:50:15 fwiw I do not have concerns about that
16:50:48 #proposed the group agrees it would be a good idea to trigger OKD tests on new builds of FCOS so we can get feedback about breakage sooner than later. It's possible the breakage is a regression in the base OS OR a desired change that OKD needs to adapt for. Either way running tests will help both FCOS and OKD.
16:51:20 +1
16:51:38 +1
16:51:40 +1
16:52:04 +1
16:52:35 this effectively means as a CI hook for the fcos-config? Or what is the trigger and where does the notification go back to?
16:53:11 lucab: I would run them as a trigger on the pipeline (so not triggered on every PR to our git repos)
16:53:28 but only when we have a new development build or prod build
16:54:15 probably something like what we have for kola AWS and GCP tests right now
16:54:52 does that align with your thinking ?
16:55:09 +1
16:55:15 hmm, though prow hooked up to the github repo is probably the lowest friction path to enable testing
16:55:45 I can open an issue to discuss the Details on
16:55:50 i think we can discuss the details outside
16:55:50 I was just asking in order to get an idea. The good thing about PR hooks is that the red/green marks are quick and ubiquitous feedback
16:55:51 +1
16:56:00 ack
16:56:06 #agreed The group agrees it would be a good idea to trigger OKD tests on new builds of FCOS so we can get feedback about breakage sooner than later. It's possible the breakage is a regression in the base OS OR a desired change that OKD needs to adapt for. Either way running tests will help both FCOS and OKD.
16:56:25 ok real quick lorbus - I did want to talk about your first point briefly
16:56:36 1. OKD will release roughly bi-weekly, alternating with the FCOS releases
16:57:19 since the bi-weekly schedule for FCOS is not a hard guarantee (we loosely abide by it) you might want to be prepared for the week that we push a stable release to the following week for whatever reason
16:57:27 what do you do in that case, etc..
16:57:31 that means we're planning to release OKD on any given FCOS version approx 1 week after the FCOS version has been released
16:57:47 do you indefinitely adjust the release schedule of OKD or do you try to get back on schedule?
16:58:14 hence the roughly - it's more about the cadence than the actual day
16:58:48 we're independently doing the releases, but we'll follow FCOS
16:59:09 right if the OKD community is OK with "approximately one week after FCOS stable" and it doesn't matter if it's 2 weeks or 3 weeks then that's cool
16:59:11 ofc in case there's an issue with the OpenShift code, we might do a second OKD release on top of the same FCOS version
16:59:19 just want to make sure we manage expectations there
16:59:31 yep, that sounds good
16:59:34 cool
16:59:54 #topic Change deadline for FCOS releases
17:00:03 #link https://github.com/coreos/fedora-coreos-tracker/issues/571
17:00:14 bgilbert: want to intro this one?
17:00:29 +1
17:00:38 this is mostly a question of our process internally, but wanted to put it out to the broader community
17:00:52 we have a nominal two-week cadence with well-defined release dates
17:01:33 for planning purposes, anyway. we've never made any promises about when new releases will land :-)
17:01:47 "RHST"?
17:01:48 sometimes we hold releases to fix a late-breaking bug. that seems fine
17:01:53 Red Hat Standard Time
17:01:57 EST/EDT
17:02:03 +1
17:02:12 but sometimes we hold them for features we'd like to land
17:02:44 and that doesn't seem like a great tradeoff. it pushes releases later in the week, which requires us to either set a shorter rollout or potentially roll out over a weekend
17:03:06 and sometimes those features have bugs, as with the releases this week, causing further holds
17:03:16 so if we can set an expectation that we will not hold releases for features, that would make things more predictable
17:03:44 +1
17:03:47 and would also give developers some clarity about when to get their PRs in :-)
17:04:08 +1
17:04:39 sounds good
17:04:52 I think of this more as a forcing function for us to do a little better job of thinking ahead about features we'd like to land
17:04:57 dustymabe: +1
17:05:05 I'll emphasize that the current proposal is not for a code freeze
17:05:19 imo: +1; the short 2 week cycle means that even if we slip getting the feature into a release it's not very long till we can get it in. Obviously we'd need the ability to apply discretion in cases where something like CVEs hitting.
17:05:25 it's okay to merge after the change deadline, for now. we can revisit that if we need to tighten things up further
17:05:58 slowrie: yup, we'd still apply judgment for bugfixes and CVEs
17:06:59 thoughts/objections?
17:07:21 +1
17:07:40 bgilbert: let's see what you think of this:
17:07:45 #proposed In order to get a little more clarity and reliability for our releases we'd like to implement a loose guideline that if changes don't land by the previous Friday of a scheduled release then we won't hold the release for that change. This gives the release wrangler some clear guidelines about when it's safe to proceed with running the release.
17:08:11 s/loose //
17:08:18 "guideline" still doesn't mean "rule" :-)
17:08:22 +1
17:08:33 will make that edit for agreed if everyone agrees :)
17:08:38 ack
17:08:45 ack
17:08:46 +1
17:09:07 +1
17:09:38 #agreed In order to get a little more clarity and reliability for our releases we'd like to implement a guideline that if changes don't land by the previous Friday of a scheduled release then we won't hold the release for that change. This gives the release wrangler some clear guidelines about when it's safe to proceed with running the release.
17:10:05 bgilbert: could you take an action to write this down somewhere? or is the ticket decision good enough?
17:10:18 #action bgilbert to update the ticket
17:10:38 I think that's good enough. release wranglers will know the guideline
17:10:57 also it might be nice to have a hackmd or something where we list out high level features that we may or may not want to track for the upcoming release(s)
17:11:04 but maybe that's too heavyweight, don't know
17:11:08 GH milestone?
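Note on the agreed change deadline: the guideline ("changes land by the previous Friday of a scheduled release") can be sketched as a small date calculation. This is only an illustration, not project tooling. The function names are hypothetical, and treating a release scheduled on a Friday as having its deadline one week earlier is an assumption the meeting did not discuss.

```python
from datetime import date, timedelta

FRIDAY = 4  # date.weekday() numbers Monday as 0, so Friday is 4


def change_deadline(release_day: date) -> date:
    """Return the last Friday strictly before the scheduled release day."""
    days_back = (release_day.weekday() - FRIDAY) % 7
    if days_back == 0:
        # Assumption: a Friday release uses the Friday one week earlier.
        days_back = 7
    return release_day - timedelta(days=days_back)


def landed_in_time(landed: date, release_day: date) -> bool:
    """True if a change landed on or before the change deadline."""
    return landed <= change_deadline(release_day)
```

For example, for a release scheduled on Tuesday 2020-07-14, `change_deadline` gives Friday 2020-07-10, so a change merged on Monday 2020-07-13 would not hold that release under the guideline.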
17:11:16 i'm sure we'll incorporate whatever works best
17:11:24 yeah GH milestone might work
17:11:29 I guess milestones are per-repo
17:11:35 so there'd have to be a tracker bug for each thing
17:11:49 org-level GH projects exist, but those are a bit heavy
17:11:55 hmm or the kanban board type thing we had for our roadmap
17:11:57 yeah
17:12:02 we'll figure out what works best I'm sure
17:12:05 yup
17:12:20 #topic forwarding NIC renaming udev rules from the initramfs
17:12:23 imo works as tribal knowledge with a working link to point new people at
17:12:26 #link https://github.com/coreos/fedora-coreos-tracker/issues/553
17:12:50 slowrie: +1
17:13:20 jlebon: had an update to the ticket in https://github.com/coreos/fedora-coreos-tracker/issues/553#issuecomment-657579430
17:13:42 more or less we still need all of the networking customizations we've done for the time being
17:14:07 should we draw the line at `ifname=` when it comes to propagating that info into the real root or should we not?
17:16:16 if no one is clamoring for it, i'd say just leave it be for now
17:17:04 i.e. /hold and see what else pops up?
17:17:29 are those kargs only meant to be there on first boot specifically?
17:17:54 lucab: they are added similar to the way `ip=` kargs are added
17:18:03 so typically not persistently applied
17:19:23 dustymabe: yeah. let's see if this is something that other people will run into. at least there's an easy fix for it
17:19:41 then I guess an option is just to tell people to persist them directly, if they care about them?
17:20:00 lucab: exactly :) except we've already set precedent with the other network kargs
17:21:03 right
17:21:38 I have a feeling we are going to overhaul all of this anyway once we get to tackle the "unlock LUKS with a remote key on subsequent boots" story
17:21:53 i think what I brought up in the side discussion was: If i'm a user and I specify `ip=infra:192.168.1.101:xxxx ifname=infra:12:34:56:78:9a:bc` should I expect only half of it to get propagated?
17:22:03 so I'd side with jlebon and not pile more magic on it right now
17:22:20 lucab: right
17:22:28 let me put up a proposal
17:22:43 #proposed We haven't seen a lot of users needing this functionality just yet so we'd prefer to wait to add that funcionality in to see if it has greater demand. What we have for propagating networking karg information a bit hacky, but is required for now in order to not require every user who needs static networking to specify networking information twice. We'd prefer to not further the
17:22:46 behavior if we can get away with it.
17:22:58 hmm weird that it got broken up in to multiple lines
17:23:51 #proposed We haven't seen a lot of users needing this functionality just yet so we'd prefer to wait to add that funcionality. What we have for propagating networking karg information a bit hacky, but is required for now in order to not require every user who needs static networking to specify networking information twice. We'd prefer to not further the behavior for now.
17:23:55 ok i shortened it
17:24:12 and i'll try to fix spelling mistakes too
17:24:18 +1 from me
17:24:33 ack/nack ?
17:24:39 ack
17:24:43 +1
17:24:57 ack
17:24:58 ack
17:25:22 #agreed We haven't seen a lot of users needing this functionality just yet so we'd prefer to wait to add that functionality. What we have for propagating networking karg information a bit hacky, but is required for now in order to not require every user who needs static networking to specify networking information twice. We'd prefer to not further the behavior for now.
17:25:23 +1
17:25:56 hmm 5 minutes for last topic or open floor ?
17:26:20 it seems like a longer than 5 minutes topic
17:26:21 console= probably needs more than 5 minutes
17:26:22 yeah
17:26:30 #topic open floor
17:26:46 how about the fact that we came to some sort of resolution on each of the topics we discussed today!
17:26:49 woot!
17:27:05 🎉
17:27:29 \o/
17:27:59 wanted to highlight the excellent presentation that @dustymabe @bgilbert and @lorbus did yesterday at the community central meeting
17:28:13 #info if you want to see OKD running on FCOS (and a general overview of FCOS) check out the presentation we did yesterday: https://www.youtube.com/watch?v=ErF_0xQmxrU
17:28:28 thanks miabbott! link above ^^
17:28:35 yes, loved the presentation you all, thank you :) dustymabe++ bgilbert++ lorbus++
17:28:35 siddharthvipul: Karma for bgilbert changed to 3 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
17:28:38 siddharthvipul: Karma for lorbus changed to 3 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
17:28:43 great presentation dustymabe++
17:28:47 thanks siddharthvipul
17:28:53 new coreos-installer release is out with a bunch of nice stuff: --ignition-url, --{append,delete}-karg, clearer reporting of busy partitions
17:28:56 dustymabe++
17:28:59 bgilbert++
17:28:59 lorbus: Karma for bgilbert changed to 4 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
17:29:00 #link https://github.com/coreos/coreos-installer/releases/tag/v0.3.0
17:29:10 dustymabe++ lorbus++ bgilbert++
17:29:12 cverna: Karma for dustymabe changed to 10 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
17:29:15 cverna: Karma for lorbus changed to 4 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
17:29:16 container image available now; will land in testing in two weeks
17:29:18 bgilbert: hats off to you for running the releases for ignition and coreos-installer
17:29:18 cverna: Karma for bgilbert changed to 5 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
17:29:34 bgilbert++
17:29:42 agreed, that was really nice. and it reached a big audience too!
17:29:55 dustymabe++
17:29:55 bgilbert: Karma for dustymabe changed to 11 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
17:30:18 * dustymabe will close out the meeting in a few minutes once conversation dies down
17:30:32 gonna drop here (lunch time)
17:31:17 thanks all!
17:31:40 #endmeeting
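Note on the NIC renaming topic: the question of whether only "half" of a pair like `ip=... ifname=...` gets propagated is easier to reason about with the two kargs pulled apart. The sketch below is illustrative only and is not how FCOS or dracut implement propagation. The function names are hypothetical, and the sketch assumes whitespace-separated kargs with no quoting and an `ifname=` value of the form `<interface>:<MAC>`.

```python
def parse_network_kargs(cmdline: str) -> dict:
    """Collect the values of ip= and ifname= kargs from a kernel command line.

    Assumption: arguments are whitespace-separated and unquoted.
    """
    found = {"ip": [], "ifname": []}
    for arg in cmdline.split():
        key, sep, value = arg.partition("=")
        if sep and key in found:
            found[key].append(value)
    return found


def split_ifname(value: str) -> tuple:
    """Split '<interface>:<MAC>' at the first colon only.

    The MAC address contains colons itself, so a plain split(':')
    would break it apart.
    """
    name, _, mac = value.partition(":")
    return name, mac
```

Applied to the example from the discussion, `parse_network_kargs("ip=infra:192.168.1.101:xxxx ifname=infra:12:34:56:78:9a:bc")` keeps the (elided) `ip=` value untouched and returns the `ifname=` value separately, which `split_ifname` turns into the interface name `infra` and the MAC `12:34:56:78:9a:bc`.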