16:30:47 #startmeeting fedora_coreos_meeting
16:30:47 Meeting started Wed Aug 11 16:30:47 2021 UTC.
16:30:47 This meeting is logged and archived in a public location.
16:30:47 The chair is dustymabe. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:30:47 Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:30:47 The meeting name has been set to 'fedora_coreos_meeting'
16:30:52 #topic roll call
16:30:55 .hi
16:30:56 dustymabe: dustymabe 'Dusty Mabe'
16:31:04 .hello2
16:31:05 jaimelm: jaimelm 'Jaime Magiera'
16:31:09 .hi
16:31:10 darkmuggle: darkmuggle 'None'
16:32:31 .hello siosm
16:32:32 travier: siosm 'Timothée Ravier'
16:32:48 .hi
16:32:49 bgilbert: bgilbert 'Benjamin Gilbert'
16:32:53 .hello2
16:32:54 miabbott: miabbott 'Micah Abbott'
16:34:12 .hello jasonbrooks
16:34:13 jbrooks: jasonbrooks 'Jason Brooks'
16:35:13 #chair jaimelm darkmuggle travier bgilbert miabbott jbrooks
16:35:13 Current chairs: bgilbert darkmuggle dustymabe jaimelm jbrooks miabbott travier
16:35:45 #topic Action items from last meeting
16:35:54 * dustymabe to re-index and look for newly submitted change proposals
16:35:55 for f35 that we need to consider
16:35:57 * - dustymabe to figure out how the cloud edition is handling the
16:35:59 ipv6.addr-gen-mode=stable-privacy problem
16:36:19 #info dustymabe re-indexed and found new items. See https://github.com/coreos/fedora-coreos-tracker/issues/856#issuecomment-896976066
16:37:00 #info dustymabe added info about how fedora cloud base handles ipv6 addr gen mode in https://github.com/coreos/fedora-coreos-tracker/issues/907#issuecomment-895455009
16:37:17 hopefully I don't just give myself action items this meeting
16:37:43 #topic Differing behavior for aarch64 vs x86_64 disk images
16:37:48 #link https://github.com/coreos/fedora-coreos-tracker/issues/855
16:37:57 * dustymabe waves at bgilbert
16:38:15 right
16:38:36 so, we decided to handle this via documentation, and then a bug happened.
16:38:51 🐛
16:39:06 it turns out that on aarch64 and ppc64le, Butane generates boot-disk RAID templates that don't match what we actually ship
16:39:29 specifically, they don't skip unused partition numbers.
16:40:08 that would be a trivial fix, _except_ for another Ignition behavior. if Ignition sees a partition number, it stops matching against the partition label when doing config merges.
16:40:46 so, if we explicitly specify partition numbers in the RAID templates, everyone who is overriding the RAID template to set the partition size of the root partition would need to add "number: 4" to their override.
16:41:10 both OCP (as of 4.8) and FCOS have existing docs telling people to do the override by label only.
16:41:37 and changing that Ignition behavior is a breaking change.
16:42:18 so our options are:
16:43:04 1. break existing RAID template overrides (probably needs a Butane spec bump?)
16:43:22 2. change the Ignition matching behavior (needs Ignition spec 4 AFAICT)
16:43:53 3. ship empty partitions on aarch64/ppc64le so we don't skip any partition numbers
16:44:02 4. live with the inconsistency and update kola tests
16:44:11 questions?
16:44:51 What is your recommendation bgilbert
16:44:57 of course.. I was advocating for 3. before we found the bug bgilbert mentioned :)
16:45:03 Could 3 also enable us to do https://github.com/coreos/fedora-coreos-tracker/issues/855#issuecomment-889946742? Sorry I'm not familiar enough with aarch64/uboot to answer that
16:45:14 https://github.com/coreos/fedora-coreos-tracker/issues/855#issuecomment-889946742 (fixed link)
16:45:31 travier: yes
16:46:30 darkmuggle: I raised this because I'm wondering whether we should just ship the empty partitions. it's hacky and I don't like it, but it's the smallest fix and it's backward-compatible
16:47:29 is there a follow-on change to fix this "for real", if we decide to choose a shorter term workaround?
16:47:44 and..
bgilbert wanted to make dusty happy anyway so killing two birds with one stone
16:47:45 Right... 3 is the simplest fix. I thought I was missing some nuance, hence why I asked. That said, 3 seems reasonable with a backlog to fix in Spec4.
16:47:49 i.e. do the empty partitions now and then something more complete later
16:48:36 in the long run I do think we need to fix the Ignition matching behavior
16:49:15 Ignition treats labels as second-tier identifiers but our messaging has consistently been that people should use them over partnums
16:49:30 if we have a way out of option 3 in the long term, then it seems like that is the best choice
16:49:47 miabbott: but I don't assume we'd want to change out of option 3
16:49:54 it has other benefits
16:50:01 (I'm trying to figure out if we need some small empty partition space for u-boot to UEFI in arm32/64 setups but my knowledge is lacking)
16:50:22 once we start shipping the empty partitions, it'll be hard to justify the work to remove them, even without other benefits
16:50:35 dustymabe: understood. i just don't want us to get locked into something we can't escape
16:50:59 if we were to ship empty partitions, there's the question of how to do it
16:51:28 so.. official turn in conversation focusing on option 3
16:51:29 it wouldn't affect x86_64 at all, but Butane would need to be updated for the new partmaps in aarch64 and ppc64le
16:51:57 dustymabe: maybe not, just exploring the option space
16:52:04 +1
16:52:16 yeah we can go back to the higher level too, just wanted to note the change
16:52:24 +1
16:52:42 bgilbert: and we assume that butane update would be less disruptive?
16:52:51 so now we'd have a situation where newer Butane can't be used with older FCOS on those arches
16:52:57 (FCOS and RHCOS)
16:53:06 *can't be used for boot disk RAID
16:53:17 which, for FCOS, is a non-issue
16:53:21 yup
16:53:24 since we don't ship those arches yet
16:53:47 in OCP, well
16:54:01 so the problem set is OCP 4.8 (first time butane was supported) + aarch64/ppc64le + boot disk RAID
16:54:21 we don't bind Butane releases to OCP releases, but that also doesn't matter because users don't update their bootimages so they're stuck with old Ignition forever
16:54:38 dustymabe: Butane was supported in 4.7 specifically for the boot RAID case
16:54:46 ahh, I didn't know that
16:54:53 thought it was new with 4.8
16:54:59 it was generalized for 4.8
16:55:20 but.. was aarch64 supported with 4.7 ?
16:55:30 that's relatively new
16:55:39 no, but I think ppc64le was
16:55:44 got ya
16:55:44 aarch64 isn't supported in OCP until 4.9
16:55:58 we can do backports, but again, old bootimages
16:56:12 new image/old Butane should be fine. the problem is only old image/new Butane
16:56:19 https://docs.openshift.com/container-platform/4.7/installing/install_config/installing-customizing.html
16:56:37 maybe we decide not to care about old bootimages, since we've sorta done that for every new Ignition feature as-is
16:57:09 okay, end of tangent I think
16:57:26 back to the higher level
16:57:29 1. break existing RAID template overrides (probably needs a Butane spec bump?)
16:57:31 2. change the Ignition matching behavior (needs Ignition spec 4 AFAICT)
16:57:33 3. ship empty partitions on aarch64/ppc64le so we don't skip any partition numbers
16:57:35 4. live with the inconsistency and update kola tests
16:57:43 interested in people's general thoughts among those options
16:57:46 want to remove an option or two from the list because ETOOHARD ?
16:58:06 I don't think we should try to do 2 right now
16:58:29 ❌ 2.
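For readers following the options list, the merge behavior described at 16:40:08 can be modeled roughly as below. This is a toy Python sketch and not Ignition's actual code: the `Partition` class, `merge_key`, `merge`, and the 8192 MiB size are all hypothetical names and values chosen for illustration. The only rule taken from the discussion is that an entry carrying a partition number is matched by number alone, so a label-only override no longer lines up with a numbered template entry.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Partition:
    # Hypothetical, simplified stand-in for a partition entry in a config.
    label: Optional[str] = None
    number: Optional[int] = None
    size_mib: Optional[int] = None


def merge_key(p: Partition) -> Tuple[str, object]:
    # Toy version of the rule from the discussion: once a number is
    # present, the label is no longer used for matching.
    if p.number is not None:
        return ("number", p.number)
    return ("label", p.label)


def merge(parent: List[Partition], child: List[Partition]) -> List[Partition]:
    # Entries with the same key are combined; unmatched child entries
    # are appended as separate partitions.
    result = list(parent)
    index = {merge_key(p): i for i, p in enumerate(result)}
    for c in child:
        key = merge_key(c)
        if key in index:
            base = result[index[key]]
            result[index[key]] = Partition(
                label=c.label or base.label,
                number=c.number or base.number,
                size_mib=c.size_mib if c.size_mib is not None else base.size_mib,
            )
        else:
            index[key] = len(result)
            result.append(c)
    return result


# Label-only override, as the existing docs tell users to write it.
override_by_label = [Partition(label="root", size_mib=8192)]

# Template without numbers: the override matches by label.
merged_ok = merge([Partition(label="root")], override_by_label)

# Template with an explicit number: the same override no longer matches.
merged_broken = merge([Partition(label="root", number=4)], override_by_label)

# Override updated to carry the number (the "number: 4" workaround).
merged_fixed = merge(
    [Partition(label="root", number=4)],
    [Partition(label="root", number=4, size_mib=8192)],
)
```

In this model, `merged_broken` ends up with two entries instead of one resized root partition, which is the breakage option 1 would impose on existing overrides; option 3 sidesteps it by leaving the templates unnumbered.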
16:59:48 i'll abstain from commenting because I originally opened the $topic issue and my desires were clear
17:00:15 also interested in jlebon's thoughts but he's AFK
17:00:34 yeah - we can take this to next meeting, though.. how time sensitive is it?
17:01:36 then again.. there seems to be a lot of good reason to go with 3.
17:01:44 the consequences of the current behavior are: 1) we had to disable kola tests on aarch64/ppc64le; 2) boot disk RAID puts the rootfs on partition 3
17:02:06 even more so if we could accommodate the uboot stuff that travier mentioned (though that shouldn't be a primary motivator IMO)
17:02:06 some tools hardcode a partition 4 assumption but I don't think anything in the OS does
17:02:42 i.e., not that time sensitive AFAIK, except for the 4.9 code freeze
17:03:17 is there a good reason to solve this in another way than 3. considering the tangential benefits ?
17:03:21 I think we need a chat with arm-aware folks to figure out if this is needed / if that will help
17:03:31 (will set that up)
17:03:31 ^^
17:03:32 travier: yeah
17:03:50 unfortunately if what we're proposing wouldn't help I don't see us making any other changes
17:04:51 dustymabe: I don't hear anyone arguing for breaking existing Butane configs, so
17:05:01 dustymabe: I think it comes down to fix bug vs. live with bug
17:05:11 s/fix/work around/
17:05:42 travier: +1
17:05:52 bgilbert: and to be clear.. 3. would break existing butane configs
17:06:23 dustymabe: no. 3 would break existing releases of Butane.
17:06:37 ...except
17:07:17 wait, sorry, notional %undo
17:07:45 ok I'll try to update the issue with this discussion (unless bgilbert wants to)
17:08:28 using this feature with new Butane releases would not work with old OSes
17:09:03 old rendered configs would not change behavior, and old Butane configs would seamlessly switch to the new behavior when recompiled
17:09:14 dustymabe: +1, thanks
17:09:18 hopefully it would fail hard :)
17:09:23 dustymabe: yes
17:09:31 #topic F35: CHANGE: CompilerPolicy Change
17:09:32 at provisioning time
17:09:36 #link https://github.com/coreos/fedora-coreos-tracker/issues/872
17:09:46 jaimelm: want to speak to this one?
17:11:17 Well, your description pretty much lays it out in a nutshell. There will be more leeway now in terms of Clang/LLVM
17:11:36 Cursory look shows no downside.
17:12:10 i'm assuming there's no issue there for us (our tools specifically)
17:12:21 preferring clang/llvm isn't really a thing
17:12:30 right.
17:12:43 and for other tools we pull from the rest of Fedora, we don't have control over that other than reporting new bugs we find
17:12:48 so we should be good there
17:13:14 ok i'll move on to the next issue
17:13:33 #topic F35: CHANGE: More flexible use of SSSD fast cache for local users
17:13:37 #link https://github.com/coreos/fedora-coreos-tracker/issues/875
17:13:42 darkmuggle: let me know if you want me to punt
17:14:05 BTW, the bugzilla was updated yesterday for code complete deadline
17:14:12 no update after that
17:15:17 ok we'll punt
17:15:29 #topic tracker: Fedora 35 changes considerations
17:15:35 #link https://github.com/coreos/fedora-coreos-tracker/issues/856
17:15:42 this is the high level rollup issue
17:15:53 I just updated the description with all of the new accepted changes
17:16:04 the ones we haven't looked at yet are marked with a ❌
17:16:39 dustymabe is really making use of that red x.
17:16:40 we'll spend time here going through to see if we can skip or need to investigate them
17:16:53 item: 1.7 x TRIAGE Boost 1.76 upgrade
17:17:04 skip or investigate ?
17:18:14 I can't think of anything this would touch on
17:18:36 yeah I think this is mostly introducing the new change and getting all dependent packages to compile/build
17:19:02 #info skipping Boost 1.76 upgrade because it should be contained to the build system (making sure dependent packages compile)
17:19:17 item: 1.8 x TRIAGE MinGW environment and toolchain update
17:20:03 i don't think I've ever come across MinGW - anyone know?
17:20:23 nope
17:21:02 created to support the GCC compiler on Windows systems
17:21:22 yup, it's a Windows cross-compiler
17:21:35 not relevant to us
17:21:44 #info skipping MinGW environment and toolchain update because it's a Windows cross-compiler, not relevant.
17:22:02 item: 1.10 x TRIAGE Make btrfs the default file system for Fedora Cloud
17:22:33 #info skipping "Make btrfs the default file system for Fedora Cloud" as it is only for Cloud edition.
17:22:48 item: 1.11 x TRIAGE Build Fedora Cloud Images with Hybrid BIOS+UEFI Boot Support
17:22:58 :-D
17:23:04 #info skipping "Build Fedora Cloud Images with Hybrid BIOS+UEFI Boot Support" as it is only for Cloud edition.
17:23:18 item: 1.12 x TRIAGE Adding Selected Flathub Applications
17:23:42 #info skipping "Adding Selected Flathub Applications" as we don't use flatpaks
17:23:52 item: 1.13 x TRIAGE Update firewalld to v1.0.0
17:24:21 while i'm personally interested in this one.. not relevant to FCOS
17:24:25 nope
17:24:26 #info skipping "Update firewalld to v1.0.0" as we don't use firewalld
17:24:35 item: 1.15 x TRIAGE Gconv package split in glibc
17:25:23 do we use glibc-gconv-extra for anything ?
17:26:10 weird.
I don't see it installed on my FCOS machine
17:27:20 i'll look a little more into this one
17:27:41 we'll pick the item list up next time
17:27:45 #topic open floor
17:27:49 sorry that was a bit dry at the end
17:28:21 Well, better to go through it to be sure.
17:28:21 #info multi-arch pipeline work is ongoing. The first bits landed and the first runs are going now (finding and fixing issues along the way)
17:28:33 jaimelm++
17:28:56 cool
17:29:10 bgilbert: you were just a day or two ahead of me enabling metal image builds for aarch64 and hitting the raid test failures
17:29:41 i guess it would happen for qemu too, but I'm just running basic tests for now (not the full suite)
17:29:46 dustymabe: wasn't me; the multi-arch folks hit it after RHCOS was rebased to cosa main
17:30:01 *switched to
17:30:01 dang - i was so close to finding it first in FCOS
17:30:24 oh well
17:30:26 we're at time
17:30:35 will close out in 30s unless more topics come up
17:32:14 #endmeeting