16:30:51 #startmeeting fedora_coreos_meeting
16:30:51 Meeting started Wed Oct 13 16:30:51 2021 UTC.
16:30:51 This meeting is logged and archived in a public location.
16:30:51 The chair is lucab. Information about MeetBot at https://fedoraproject.org/wiki/Zodbot#Meeting_Functions.
16:30:51 Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:30:51 The meeting name has been set to 'fedora_coreos_meeting'
16:31:02 #topic roll call
16:31:09 .hi
16:31:10 lorbus: lorbus 'Christian Glombek'
16:31:17 .hello2
16:31:18 jaimelm: jaimelm 'Jaime Magiera'
16:32:09 .hi
16:32:10 dustymabe: dustymabe 'Dusty Mabe'
16:32:15 .hi
16:32:16 ravanelli: ravanelli 'Renata Andrade Matos Ravanelii'
16:32:50 .hello siosm
16:32:51 travier: siosm 'Timothée Ravier'
16:32:54 .hello jasonbrooks
16:32:56 jbrooks: jasonbrooks 'Jason Brooks'
16:33:21 #chair lorbus jaimelm ravanelli dustymabe travier jbrooks
16:33:21 Current chairs: dustymabe jaimelm jbrooks lorbus lucab ravanelli travier
16:33:29 .hi
16:33:30 bgilbert: bgilbert 'Benjamin Gilbert'
16:34:28 #chair bgilbert
16:34:28 Current chairs: bgilbert dustymabe jaimelm jbrooks lorbus lucab ravanelli travier
16:34:37 .hi
16:34:39 scorreia_: Sorry, but user 'scorreia_' does not exist
16:34:58 ok I'll start
16:35:11 scorreia_: you can say ".hello2 your-FAS-account-name" if you have a FAS account
16:35:17 #topic Action items from last meeting
16:35:26 * ".hello FAS-account"
16:35:48 lorbus and jlebon to reach out to the containers team to discuss what cri-o versions will be supported and how at the modular level to pull it off (context: https://github.com/coreos/fedora-coreos-tracker/issues/767)
16:35:50 dustymabe to talk to sumantro to try to get FCOS testing on the week of oct 11
16:35:57 dustymabe to schedule some time on Monday to identify docs test cases and prepare for F35 test day
16:36:00 .hello scorreia
16:36:01 scorreia: scorreia 'Sergio Correia'
16:36:32 jlebon to add some design details to the proposal in the ticket (#676) and also reach out to dghubble to see if this can be handled in typhoon
16:36:45 lucab: yep all of those were handled (see notes from last week's video meeting in the hackmd: https://hackmd.io/vMWlKGH5TAOsLKYiqce61Q?view)
16:36:56 uhm, although the last meeting was actually a video one
16:36:57 from last week I think the only tangible thing we had was...
16:37:05 #info jlebon created #990 to track running kubernetes node e2e tests in our CI
16:37:07 #link https://github.com/coreos/fedora-coreos-tracker/issues/990
16:37:28 oh yes sorry, I just went directly to the meetbot archive
16:37:37 +1 - yeah
16:37:52 we might ought to try to force some creation of meeting info in meetbot for video meetings
16:38:16 we'll improve over time
16:38:29 np, I only realized at the end of my copy-pasting
16:39:18 #info re k8s and cri-o: https://github.com/coreos/fedora-coreos-tracker/issues/767
16:39:19 I actually had an action item myself, let me find it and link it
16:39:36 Copying Jonathan's comment over:... (full message at https://libera.ems.host/_matrix/media/r0/download/libera.chat/972e3e6feb0ec3739c2c35fdc478ba6f7c5c5d53)
16:40:06 (I hope all clients can see properly what I pasted)
16:40:22 it's a link - but yeah, it's clickable
16:40:41 #info FCOS auto-updates are now disabled on k8s e2e CI https://github.com/kubernetes/test-infra/pull/23926
16:42:07 ok, I think that was all
16:43:05 I'll pick the EFI FS ticket first
16:43:14 thanks for doing that lucab
16:43:18 #topic Change EFI-System partition format from fat16 to fat32
16:43:37 #link https://github.com/coreos/fedora-coreos-tracker/issues/993
16:43:45 bgilbert: ^^^
16:44:00 so, the UEFI spec technically requires that the EFI System Partition (ESP) use FAT32. ours uses FAT16.
16:44:40 most firmware doesn't care, though. I'm aware of two pieces of hardware that have failed to boot with a FAT16 ESP:
16:45:10 an ASUS laptop in https://github.com/coreos/bugs/issues/2246, and a Raspberry Pi with older firmware in https://github.com/coreos/fedora-coreos-tracker/issues/993.
16:45:19 presumably there are others that go unreported.
16:45:42 "it's a one-line fix", except it turns out it isn't.
16:46:24 does seem to imply either a) they've written their own fat driver (like lunatics) or b) they're using a really, really old binary fat driver from EDK I (like lunatics)
16:46:25 a filesystem is defined to be FAT12/FAT16/FAT32 based on the number of clusters in the filesystem, which is related to filesystem size.
16:46:40 * bgilbert waves at pjones
16:47:15 our disk image uses a 127 MB ESP, but on 4K-sector disks, the minimum FS size for FAT32 is 257 MB.
16:47:22 I could swear I fixed the default for this in mkfs.vfat like... 10 years ago
16:47:36 ah, right, it can't do it in that case.
16:48:05 so fixing this completely requires a partition layout change, which is difficult because the layout is hardcoded in Butane to handle the boot-disk RAID case.
16:48:32 we can do it, but it's non-trivial. so the question is what strategy we want to use here.
16:48:46 pjones, I'd be very interested in your thoughts. how much effort is it worth to switch the ESP to FAT32?
16:50:47 bgilbert: if we were to bump size, we'd want to do it everywhere?
16:51:25 we could do it for only the 512b image. but then in the boot-disk-RAID case, the ESP would be reverted to FAT16.
16:51:28 and what would the size be? `257 MB` or would we go higher to a nicer round number like 512?
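Editor's note on the size figures above: a minimal sketch of the arithmetic, not the group's tooling. It assumes the cluster-count thresholds from Microsoft's FAT specification (fewer than 4085 clusters is FAT12, fewer than 65525 is FAT16, otherwise FAT32) and one sector per cluster as the smallest possible cluster. The function names are made up for illustration, and the estimate ignores reserved sectors and the FAT tables themselves.

```python
MIB = 1024 * 1024

def fat_variant(cluster_count: int) -> str:
    """Classify a FAT filesystem purely by its number of data clusters."""
    if cluster_count < 4085:
        return "FAT12"
    if cluster_count < 65525:
        return "FAT16"
    return "FAT32"

def min_fat32_data_bytes(sector_size: int, sectors_per_cluster: int = 1) -> int:
    """Smallest data area, in bytes, that holds the 65525 clusters FAT32 needs.

    Reserved sectors and the FAT copies are not counted, so a real
    filesystem has to be slightly larger than this.
    """
    return 65525 * sector_size * sectors_per_cluster

def variant_for_partition(partition_bytes: int, sector_size: int) -> str:
    """Rough upper bound: treat the whole partition as data clusters."""
    return fat_variant(partition_bytes // sector_size)

if __name__ == "__main__":
    for sector_size in (512, 4096):
        needed = min_fat32_data_bytes(sector_size) / MIB
        best = variant_for_partition(127 * MIB, sector_size)
        print(f"{sector_size}-byte sectors: FAT32 data area >= {needed:.2f} MiB; "
              f"a 127 MiB ESP can be at most {best}")
```

With 4096-byte sectors the data area alone comes to just under 256 MiB, and the reserved sectors plus FAT copies add a little on top, which is consistent with the 257 MB minimum quoted in the discussion. With 512-byte sectors the same 127 MiB partition has enough clusters for FAT32.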
16:51:41 I know you are going to hate this because of the Butane pain, but a larger ESP (uniform for all images) would be a good thing IMHO
16:51:44 I don't think we need our ESP to be large, so I'd be inclined to go for the minimum
16:52:05 Butane can't split the difference and select FAT16/FAT32 at runtime because it doesn't know enough
16:53:22 so.. I guess some of this depends..
16:53:34 lucab: do you have a use case in mind?
16:53:38 Is anything here a breaking change?
16:53:41 rationale is, I do expect sooner or later somebody to come up with a use case where they hit the ceiling at 127MiB
16:53:48 how much of a problem we really think it is.. and how painful changing the sizes $everywhere would be
16:53:53 travier: kinda-sorta?
16:54:20 bgilbert: not a sane one and specific right now, no. Mostly gut feeling.
16:54:28 bgilbert: I guess it depends how much you care about the devices you've found? "there are presumably more" is true but also they're probably similar classes of devices as the ones you've already found.
16:54:33 yeah if we change the partition sizes I imagine the "re-install over previous install of FCOS and don't overwrite some FS" use case would be affected
16:54:38 there's the issue of version skew between Butane and cosa, which isn't breaking so much as surprising/inconsistent. but in principle, partition layouts in user Ignition configs could make assumptions about where the rootfs starts
16:54:42 Could we do a "manual" workaround as a butane config option?
16:55:24 that we would flip to be the default the first time we do a truly breaking change?
16:55:46 like "ESP: FAT32" (FAT16 being the default
16:55:47 ahh, yes, https://docs.fedoraproject.org/en-US/fedora-coreos/storage/ requires the _rootfs_ to be 8 GiB or larger
16:55:48 )
16:56:48 so the expansion buffer we've defined would not be used to cover this case
16:57:31 (not a butane config option but a butane spec entry)
16:57:45 travier: there's a "layout" field; we could define e.g. "layout: x86_64_fat32"
16:58:02 my bad, we can not do that as we need this to work *before* we boot up
16:58:07 boot_device.layout
16:59:16 I think we should consider doing nothing
16:59:20 or, as a middle ground, doing nothing for now
16:59:34 yeah, "doing nothing" is an option
16:59:45 for the specific case of the Pi, it sounds like a firmware upgrade will avoid the issue
16:59:52 Is there an easy manual workaround for people hitting this issue?
17:00:09 we've gone multiple years without any reports, and when this happened in Container Linux, it also went multiple years without reports
17:00:16 If there is one, documenting it could be enough for now
17:00:56 travier: booting in BIOS mode, or copying out the contents of the ESP + reformatting FAT32 + copying back. both of those only work on 512b sectors.
17:01:24 the middle-ground option is: make a note of this, and fix it next time we have to redo the partition layout :-P
17:01:31 s/the/a/
17:01:33 bgilbert: I think it's fine if we want to do nothing for now, but maybe nice to 1. document what we would do if we did want to fix the problem (requires discussion) 2. add docs for working around the problem
17:02:06 ^^
17:02:22 If it's "just" a "dd partitions and re-create one larger and dd back" then this is a reasonable workaround
17:02:29 I do expect the RPi case to be common enough to justify a short FAQ with "update your FW first"
17:02:38 travier: filesystem-level copy, not dd
17:02:46 lucab: i'm actually working on that
17:02:49 lucab: +1
17:02:50 copy in case of the ESP one instead of dd but dd for the rootfs?
17:03:00 dustymabe++
17:03:00 lucab: Karma for dustymabe changed to 9 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
17:03:14 dustymabe: +1
17:03:29 i've got some draft notes, but trying to get them polished and a PR submitted to the docs
17:03:30 That would be a good addition
17:03:36 travier: I wouldn't be surprised if no one ever hits this for the 4K-sector case
17:03:59 technically, are we breaking the EFI spec? Or is it a case of firmwares blindly assuming FAT32?
17:04:10 lucab: I _believe_ we are technically breaking the spec
17:04:11 bgilbert: i.e. one option is to fat32 on 512b and fat16 on 4kn ?
17:04:30 dustymabe: no, I'm saying I think we only need to document a workaround for 512b
17:04:35 bgilbert: for the removable vs non-removable point?
17:04:40 lucab: yes
17:04:45 I'm +1 for documenting, providing a workaround and noting that somewhere for the future if we ever change the layout
17:05:10 dustymabe: right, but i'm saying - if you think no one will ever hit this on 4kn, then maybe "fat32 on 512b and fat16 on 4kn" would be an option
17:05:21 dustymabe: changing 512b to FAT32 for everyone means the Butane RAID case will change it back, which seems weird and obscure
17:05:26 It would be nice if down the road FCOS wasn't "technically" the spec. So, a set time to revisit might be good.
17:05:49 bgilbert: but we can update that in a follow up, right?
17:05:57 dustymabe: how do you mean?
17:06:08 can we not update the butane RAID definition?
17:06:23 If we make a change, I think we should change everything at once
17:06:26 Butane doesn't know whether the target disk is 512b or 4Kn
17:06:32 i see
17:06:33 and we can't ask the user because the user may not know either
17:06:39 true
17:06:40 (but I don't think it's worth changing for that)
17:06:52 ok, yeah. that foils the plan
17:07:02 otherwise would have been a nice compromise
17:07:05 :)
17:07:16 yeah, it's a messy situation
17:07:30 ok so.. 1. let's do nothing for now and document how to get out of the situation if it's hit
17:07:40 I don't think you're technically breaking the spec, but that's because you're not producing an artifact that the spec speaks to directly.
17:07:51 2. let's agree that if we were to change it we'd bump the ESP size everywhere to XYZ MiB
17:08:14 pjones: could you expand on that?
17:08:28 I noticed that the spec language was careful not to forbid FAT12/16, just to declare it out of scope
17:08:32 dustymabe: yes, I think that's the sanest plan right now
17:08:35 bgilbert: the spec says what the firmware should do, and provides interfaces to talk to it.
17:08:53 bgilbert: so this could be a thing a specific firmware doesn't support, but there's no /violation/ either way
17:08:59 got it
17:09:21 (although 2.8 does say "EFI encompasses the use of FAT32 for a system partition, and FAT12 or FAT16 for removable media." in section 13.3)
17:09:31 heh
17:09:33 I would not assume this part of the spec is, erm, very well written.
17:10:04 yeah, "encompasses" is what I was referring to ^. I read it as 'FAT12/16 is not contemplated but not forbidden'.
17:10:29 Oh, no there's another bit that's normative about it: "The EFI firmware must support the FAT32, FAT16, and FAT12 variants of the EFI file system. What variant of EFI FAT to use is defined by the size of the media."
17:11:21 so those firmwares are broken with respect to that (though a careful reading of chapter two may tell you this is all optional...)
17:11:27 pjones: (rules lawyering) those aren't in conflict, though, right? "must support" because of removable media, and naturally a FAT32 ESP must be large enough to use FAT32.
17:11:35 righty
17:12:06 I also don't know the last time I saw removable media that was smaller than 260MB, but... I mean sure.
17:12:24 heh
17:12:42 * jaimelm pulls out a 100MB zip drive
17:12:50 yeah, it's been a while
17:13:01 jaimelm:
17:13:04 LOL
17:13:10 I was literally just going to type that
17:13:18 :-D
17:13:31 sorry but I'll try to close the topic here
17:13:42 +1 to dustymabe's proposal
17:13:49 +1
17:14:01 I think dustymabe basically proposed to do nothing now and agree on what to possibly do in the future
17:14:13 and docs
17:14:38 which would be uniformly enlarge ESP so that we get FAT32 everywhere
17:14:50 if we want to define the target size now, I propose 257 MB (or 260 if we want; that's what MS recommends). but I'm okay deferring details until we get there.
17:15:34 260M seems reasonable
17:16:02 I think "at least minimum to meet FAT32 according to spec" would do for now, we'll likely discover some new fun quirk at that point
17:16:25 +1
17:17:03 #proposed For now, we will document workarounds for systems that can't boot from FAT16, including older Raspberry Pi firmware. Next time we change the partition layout, we will switch to FAT32 in both the 512b and 4Kn disks, selecting an ESP size of at least 257 MB to fit a valid FAT32 filesystem.
17:17:14 s/disks/images/
17:17:17 +1
17:17:17 +!
17:17:26 +1
17:17:27 +1
17:17:46 well done
17:18:20 dustymabe: self-action the doc writing?
17:18:25 dustymabe?
17:20:02 oops sorry
17:20:04 bgilbert: I think we lost him, but I'd say there is overall consensus
17:20:19 #action dustymabe to write docs for rpi4 including updating eeprom
17:20:26 #agreed For now, we will document workarounds for systems that can't boot from a FAT16 ESP, including older Raspberry Pi firmware. Next time we change the partition layout, we will switch the ESP to FAT32 in both the 512b and 4Kn images, selecting a size of at least 257 MB to fit a valid FAT32 filesystem.
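Editor's note on the workaround to be documented: the discussion describes it as a filesystem-level copy of the ESP contents, a reformat as FAT32, and a copy back, workable only on 512b-sector disks. The sketch below only builds and prints that command sequence; it does not run anything. The function name, device path, mount point and backup directory are hypothetical placeholders, and the sequence has not been validated on real hardware, so treat it as an illustration of the idea rather than the documented procedure.

```python
import shlex
from typing import List

def esp_reformat_plan(esp_device: str, mountpoint: str, backup_dir: str) -> List[List[str]]:
    """Return the commands for copy-out, reformat as FAT32, copy-back.

    All three arguments are placeholders chosen by the caller; nothing is
    executed here. The steps assume root privileges, an ESP that is not
    mounted, and existing mountpoint and backup directories.
    """
    return [
        ["mount", esp_device, mountpoint],
        ["cp", "-r", f"{mountpoint}/.", backup_dir],   # filesystem-level copy, not dd
        ["umount", mountpoint],
        ["mkfs.fat", "-F", "32", esp_device],          # force a FAT32 filesystem
        ["mount", esp_device, mountpoint],
        ["cp", "-r", f"{backup_dir}/.", mountpoint],
        ["umount", mountpoint],
    ]

if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for argv in esp_reformat_plan("/dev/sdX2", "/mnt/esp", "/tmp/esp-backup"):
        print(shlex.join(argv))
```

Printing the plan instead of executing it keeps the sketch safe to run anywhere and leaves the choice of device to the reader.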
17:20:29 .hello2
17:20:30 ^ minor copyediting
17:20:30 jlebon: jlebon 'None'
17:20:37 I think we need a more generic FAQ entry for this, though and would prefer not to write that
17:20:39 #chair jlebon
17:20:39 Current chairs: bgilbert dustymabe jaimelm jbrooks jlebon lorbus lucab ravanelli travier
17:20:45 thanks for the input, pjones!
17:20:54 hope it was helpful
17:21:06 absolutely
17:21:31 anybody else want to sign up for writing a FAW entry for this? including steps on how to convert to FAT32 if needed?
17:21:42 also does everyone agree we need this ^^
17:21:48 FAQ*
17:22:18 dustymabe: I don't have the storage docs open now, but we can just directly put a NOTE there
17:22:20 I think it'll be difficult for a user to know that the case applies to them
17:22:26 agree with putting a NOTE in /storage/
17:22:40 the FAQ is kind of a wasteland
17:22:41 bgilbert: that's true
17:22:54 i think of it more of a "user opens issue, we point them at FAQ" workflow
17:22:56 bgilbert: I'll leave the wording of that to you
17:23:10 yeah, I was starting on my 3rd party repo FAQ entry and saw how unorganized it is.
17:23:14 I really doubt they'll find this in the "storage" docs.. their system just won't boot and they'll open an issue
17:23:14 lucab: heh
17:23:14 pjones++
17:23:36 #action bgilbert to write docs for switching to FAT32
17:23:44 categories would be nice
17:23:50 bgilbert++
17:23:57 thanks for the discussion, all!
17:24:19 jaimelm: a bunch of entries should probably be dropped
17:24:27 ok it went a bit longer than expected
17:24:37 it was good discussion though
17:24:37 jaimelm: xref https://github.com/coreos/fedora-coreos-docs/issues/292
17:25:00 do we want to touch another ticket (quickly)?
17:25:11 probably out of time
17:25:14 I don't think we have time
17:25:15 open floor?
17:25:20 bgilbert: yeah
17:25:24 yes, same feeling
17:25:51 #topic Open floor
17:25:57 for visibility: I think we may have a path forward on VirtualBox support
17:25:57 #info Fedora CoreOS test day/week happening this week - please help execute test cases especially if you have access to obscure hardware platforms https://testdays.fedoraproject.org/events/122
17:26:00 #link https://github.com/coreos/ignition/pull/1269
17:26:26 bgilbert: nice
17:26:41 ueno, travier: looks like there is not enough time today, so can we discuss this one next meeting, perhaps?
17:26:44 I'll be testing on vSphere and an old Mac Pro trash can tomorrow :)
17:26:46 https://github.com/coreos/fedora-coreos-tracker/issues/982
17:27:15 scorreia: whoops I'm really sorry, had not realized you were there
17:27:25 scorreia: yep - we can make sure we get to it next time - sorry to have wasted your time
17:27:30 yes, let's make sure we discuss it next time
17:27:49 bgilbert: does this pave way for vagrant support at all - was thinking on that a bit over the weekend
17:27:51 same here sorry, I could have started with that otherwise
17:28:22 No worries :)
17:28:25 dustymabe: in principle, yes. I think we should have a discussion about whether we want to support Vagrant.
17:28:40 fair
17:28:52 probably for another day
17:28:55 yup
17:28:57 scorreia: personally, I'm missing the "at which point of the boot should this run" part
17:30:29 can we extend to be respectful to the guest? Does anyone need to be anywhere?
17:30:54 i can stay
17:31:13 actually I gotta leave. lucab: sure, I will check with ueno to address that before the discussion.
17:31:16 scorreia: because if the goal is to run this through a systemd unit after network bootup, it may as well be run through a privileged podman
17:31:42 scorreia: next time for sure. maybe the meeting will open with it to not make you wait.
17:32:05 alright, that sounds great. thanks everyone
17:32:06 (this assumes the agent is tweaked so that it can comfortably run from within a container first)
17:32:18 scorreia: np, I'll also note that in the ticket
17:33:00 ok, I don't see anything else raised, closing now
17:33:19 (I'll send notes/followups tomorrow)
17:33:19 #endmeeting