16:30:33 #startmeeting fedora_coreos_meeting
16:30:33 Meeting started Wed Sep 8 16:30:33 2021 UTC.
16:30:33 This meeting is logged and archived in a public location.
16:30:33 The chair is jlebon. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:30:33 Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:30:33 The meeting name has been set to 'fedora_coreos_meeting'
16:30:52 #topic roll call
16:31:02 .hello jasonbrooks
16:31:03 jbrooks: Something blew up, please try again
16:31:05 .hi sohank2602
16:31:06 jbrooks: An error has occurred and has been logged. Please contact this bot's administrator for more information.
16:31:09 skunkerk: Something blew up, please try again
16:31:12 skunkerk: An error has occurred and has been logged. Please contact this bot's administrator for more information.
16:31:14 .hello2
16:31:19 jaimelm: Something blew up, please try again
16:31:22 jaimelm: An error has occurred and has been logged. Please contact this bot's administrator for more information.
16:31:22 hmm
16:31:41 .hi
16:31:42 dustymabe: Something blew up, please try again
16:31:45 dustymabe: An error has occurred and has been logged. Please contact this bot's administrator for more information.
16:31:50 .hello2
16:31:51 jlebon: Something blew up, please try again
16:31:55 jlebon: An error has occurred and has been logged. Please contact this bot's administrator for more information.
16:31:59 zodbot: help
16:31:59 Please see https://fedoraproject.org/wiki/Zodbot for general help and information about this Supybot - If you want information about a specific command, type .misc help
16:32:02 #chair jbrooks skunkerk dustymabe jaimelm
16:32:02 Current chairs: dustymabe jaimelm jbrooks jlebon skunkerk
16:32:04 might have to check with nb to see why zodbot is going cray
16:32:24 .hi siosm
16:32:26 travier: Something blew up, please try again
16:32:29 travier: An error has occurred and has been logged. Please contact this bot's administrator for more information.
16:32:33 :/
16:32:36 dustymabe: https://github.com/coreos/fedora-coreos-tracker/issues/856 is tagged -- is there anything to bring up?
16:33:00 i noticed the last edit was 7 days ago, so I think we still need to sync with the latest first in case there are any new ones
16:33:16 #chair travier
16:33:16 Current chairs: dustymabe jaimelm jbrooks jlebon skunkerk travier
16:33:24 no progress on my side so far :/
16:33:29 jlebon: mainly we just need to iterate over the remaining tickets (with f35-changes label) and ask for status updates from responsible parties
16:33:47 dustymabe: ack ok
16:33:55 let's wait a little bit more
16:34:28 .hi
16:34:29 bgilbert: Something blew up, please try again
16:34:32 bgilbert: An error has occurred and has been logged. Please contact this bot's administrator for more information.
16:34:33 #chair bgilbert
16:34:33 Current chairs: bgilbert dustymabe jaimelm jbrooks jlebon skunkerk travier
16:34:50 .hi
16:34:51 zodbot: wasn't me
16:34:52 ravanelli: Something blew up, please try again
16:34:55 ravanelli: An error has occurred and has been logged. Please contact this bot's administrator for more information.
16:35:18 #chair ravanelli
16:35:18 Current chairs: bgilbert dustymabe jaimelm jbrooks jlebon ravanelli skunkerk travier
16:35:29 ok cool, let's start!
16:36:13 #topic Action items from last meeting
16:36:18 dustymabe will open a ticket and look for volunteers to investigate "Third-party Software Mechanism"
16:36:21 dustymabe will open a ticket and look for volunteers to investigate "Remove authselect-compat package"
16:36:24 bgilbert to add empty partitions, update Butane, re-enable tests
16:36:43 jlebon: are those from last week?
16:36:50 hmm
16:37:12 there is an issue (i think) where logs aren't getting copied to that directory
16:37:17 i think kevin is aware
16:37:17 doesn't seem like it's there, yeah
16:37:24 https://meetbot-raw.fedoraproject.org/teams/fedora_coreos_meeting/
16:37:30 i think we'll have to go off the email to the ML
16:37:36 k let me check
16:37:57 Action Items:
16:37:58 * jaimelm to write documentation on fedora-third-party rpm for our FAQ
16:38:15 ^^ in-progress
16:38:44 jaimelm: cool, i won't re-action it then
16:38:51 * jaimelm gives thumbs up
16:39:10 ok, we got a bunch of tickets!
16:39:22 #topic tracker: Fedora 35 changes considerations
16:39:25 #link https://github.com/coreos/fedora-coreos-tracker/issues/856
16:39:42 let's just quickly go over the pending ones
16:40:08 travier: any updates on SSSD?
16:40:34 jlebon: any updates on authselect? Answer: no
16:40:41 no updates sorry
16:41:07 hmm, looks like two of them still need volunteers
16:41:11 https://github.com/coreos/fedora-coreos-tracker/issues/935
16:41:14 https://github.com/coreos/fedora-coreos-tracker/issues/936
16:41:39 i'll take the libvirt one
16:41:50 jlebon++
16:41:50 dustymabe: Karma for jlebon changed to 5 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
16:42:04 the LUKS one should be trivial
16:42:15 i can take it too, though would be nice to have someone else :)
16:43:17 ok, let's keep it unassigned for now and if it's still that way in the next few days, i'll take it
16:43:41 i want to volunteer, but feel like it would be best if I complete the other tasks on my plate first
16:43:48 will next week if no one else picks it up
16:44:22 dustymabe: ack, cool
16:44:35 moving on unless there's anything else on this topic
16:44:55 one thing
16:45:05 on f35 in general - I think we are getting closer to beta
16:45:16 I don't know where to start with this one, but I can help with the LUKS
16:45:17 which means we need to switch over our `next` stream IIUC
16:45:26 ravanelli++
16:45:26 dustymabe: Karma for ravanelli changed to 1 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
16:45:51 ravanelli++
16:45:51 jlebon: Karma for ravanelli changed to 2 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
16:46:28 jlebon: am I right about switching over `next` to f35 ?
16:46:29 oh wow yeah, beta release is 2021-09-14
16:46:35 that's... next week
16:46:52 dustymabe: yup, agreed
16:47:14 which means we should switch next-devel now
16:47:41 yep. though I don't think we necessarily have to ship next week (can wait for our normal updates the following week) WDYT?
16:48:08 yup indeed, but might as well prime it and shake things out early
16:48:18 dustymabe: thanks for bringing this up
16:48:22 +1
16:48:32 i think that's already captured in the rebase checklist right? (i.e. no need for a separate ticket)
16:48:55 right - just need a victim/volunteer
16:49:14 hmm, no doesn't seem to be actually (looking at https://github.com/coreos/fedora-coreos-tracker/issues/884)
16:49:28 ahh - maybe something to add
16:49:32 +1
16:49:37 for some reason I thought it was in there
16:50:46 ok to pick that up if you'd like. it should be pretty smooth since the content is tracked in branched already, which is green AFAIK
16:51:13 +1 - though anyone, please feel free to volunteer to do this along with jlebon and learn the process a bit
16:51:33 +1
16:51:42 ok, let's move on
16:51:58 #topic console defaults for x86_64 qemu platform
16:52:02 #link https://github.com/coreos/fedora-coreos-tracker/issues/954
16:52:11 dustymabe: want to introduce this?
16:52:55 i don't have links handy, but...
16:53:22 basically a while back when we discussed removing the console= entry from bare metal we put off discussing what the defaults should be on other platfomrs
16:53:26 platforms
16:53:44 now that we are implementing it, it's a good time for discussion.. what should the defaults on other platforms be
16:54:02 this one is specifically for qemu (i.e. directly launched with qemu or libvirt)
16:54:10 or virt-manager, etc..
16:54:21 jlebon: I can be the victim to learn the process above with you
16:54:46 basically i still want my workflow `virt-install --graphics=none` to work
16:54:55 i.e. if there is no VGA console then give me serial please :)
16:55:07 ravanelli: cool, sounds good :)
16:56:00 so - I guess the open questions are
16:56:18 1. what are the ways we can make the fallback to serial console work?
16:56:35 ideally it wouldn't require a console= entry on the kernel command line
16:56:46 discuss...
16:57:58 dustymabe: based on bgilbert's comment, there might not be a way to do that
16:58:28 dustymabe: are you sure that "console=ttyS0 console=tty0" delivers the correct behavior? I haven't tested but that doesn't match my understanding
16:58:34 indeed - which means we'd need to explore something like `console=ttyS0 console=tty0`
16:58:56 or just say that we don't care about this use case at all,
16:59:11 it'd be nice to get this use case working if there's a clean way to do it
16:59:22 i don't think that will work either
16:59:30 jlebon: no?
17:00:11 we have the inverse right now, no? so we'd just be flipping where the problem happens
17:00:52 so instead of the primary console being serial, it'll be tty0
17:01:13 and if a virtual tty0 is always allocated even when there's no video card, then we'll have the same issue
17:01:22 but on serial
17:01:41 I just tested, and it doesn't
17:01:46 you only get secondary console
17:01:54 because of https://github.com/coreos/fedora-coreos-tracker/issues/954#issuecomment-914613360
17:02:10 ^ what jlebon said
17:02:14 from my experience and now looking into the aws aarch64 issues there seems to be no clean solution for this. I'm worried if we want nice UX, we will have to set some default consoles for each architecture/platform combination
17:02:21 sorry for being late
17:02:22 bgilbert: :)
17:03:02 if we think secondary console is better than nothing, we could still set it
17:03:13 jcajka: yeah we now have the flexibility to set the console for each arch/platform individually
17:03:35 dustymabe: well, kinda. metal and qemu are still special cases that would require code changes if we want to support that
17:03:47 it seemed to me that should go away, eventually
17:03:58 what should go away?
17:04:13 setting the default console during the image build
17:04:20 why?
17:04:32 oh, you mean enabling it by default
17:04:37 it = serial
17:04:53 I thought that was the idea behind moving to console=, sorry if I have misunderstood that
17:05:30 on AWS, for example, we still need serial console. so the idea is to have per-platform defaults.
17:05:58 but in most cases IMO we should set no console= kargs
17:06:17 * dustymabe wishes the virtual console thing wasn't an issue and `console=ttyS0 console=tty0` in the absence of a VGA console would work
17:06:28 dustymabe: for your use case, do you think it's better to have secondary serial console (which might be misleading), or no serial console (makes it clear that kargs need to be changed)
17:06:30 ?
17:06:46 bgilbert: i'm torn
17:07:11 usually at least secondary console can give you some indication of the problem
17:07:25 but then you can't do things like type on the emergency shell (right?)
17:07:43 you can't type on the emerg shell and you can't see Ignition failures
17:07:54 if Ignition dies, you just get silence
17:07:58 (and no login prompt)
17:08:10 nothing goes to the journal when ignition dies?
17:08:22 journal doesn't go to secondary console
17:08:25 only kernel messages do
17:08:31 ahh
17:08:36 is there an argument for still making serial primary given FCOS' use case? we don't have a GUI, and are heavily automation-oriented
17:09:10 i'd prefer it, but i've proven to be not the majority I think
17:09:23 IOW, leaving the current defaults as is just for QEMU
17:09:37 jlebon: I think the one use case for console is debugging, so then it's a question of how users expect to do that
17:09:52 (debugging = bad Ignition configs / boot failures / etc.)
17:10:29 and I think the answer is probably: look at the graphics card in virt-manager / GNOME Boxes / qemu popup window / VNC session
17:10:49 i think most qemu wrappers (libvirt, GNOME Boxes) allow access to the serial console, so we could print something on failure to make it clear to look there
17:11:02 bgilbert: I think you are *probably* right.. but man does debugging on the VGA console versus the serial console suck
17:11:08 at least IMO
17:11:33 eh, there's Shift-PgUp. it's never really bothered me fwiw
17:11:35 huge text, questionable scrollback, no copy/paste
17:11:46 Time check. It does not look like we need consensus on this one, mostly debug/investigation work AFAIU.
17:12:03 we do need a decision eventually
17:12:11 the per-platform console work is blocked on this
17:12:16 oh ok
17:12:19 because it governs whether I need to refactor the code
17:12:19 (sorry)
17:12:33 bgilbert: would it be reasonable for now to keep the current defaults for QEMU in this rework?
17:12:44 it doesn't preclude us from changing it in the future
17:13:13 it's doable. but IMO the current defaults are user-hostile and I've really been hoping to change them :-/
17:13:55 let's dig into that more next time maybe? the user-hostile bit
17:14:17 so, we've uncoupled this from the aarch64 work, so there's not really urgency
17:14:18 also - does anyone else not copy/paste text when debugging?
17:14:24 if we want to follow up in a week, the work can wait
17:14:24 Maybe we should make the QEMU serial debug case a special COSA feature: drop the command line changes in the QEMU image directly to avoid shipping that as default for everybody?
17:14:47 travier: cosa has already been changed to do that for 'cosa run' etc.
17:15:05 we could ship a helper command, sure
17:15:10 dustymabe: I usually do test/debug with cosa run to be able to copy/paste
17:15:27 worth mentioning it's a non-trivial operation that cosa does
17:15:55 dustymabe: I do, but I'm also one that runs qemu from cli or use cosa run
17:15:56 but definitely not a super big deal
17:16:00 it's also worth mentioning that the more we rely on COSA the more we're not testing what our users are doing
17:16:06 dustymabe: true
17:16:06 if it's easy to do once and reuse that tweaked qemu image for debugging that would help you dustymabe?
17:16:25 maybe we can add caching somehow so that when you rerun the exact same image, it doesn't need to go through the libguestfs dance again
17:16:42 ftr, I know how to work around this, it's not ideal but basically I extract the kernel arguments out of the image and do a direct kernel launch
17:17:15 https://github.com/coreos/fedora-coreos-config/blob/testing-devel/tests/manual/coreos-network-testing.sh#L533-L537
17:17:22 jlebon and I were musing about a coreos-installer kargs embed command that worked on the qemu image
17:17:34 it's... doable, and is one possible general solution
17:17:45 I would vote for our defaults to be useful / best for production workloads and then make it easy to debug as much as possible in this order
17:18:01 travier: +1
17:18:12 travier: yeah
17:18:13 that's the question though... do production workloads really prefer vga over serial?
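A minimal sketch of the console ordering debated in this topic, assuming the documented Linux kernel rule that, when several `console=` arguments are given, the last one backs `/dev/console` (the primary console) and the earlier ones are secondary. The helper names `parse_consoles` and `console_roles` are hypothetical and exist only for this illustration; they are not part of FCOS, COSA, or coreos-installer.

```python
from typing import List, Optional, Tuple


def parse_consoles(cmdline: str) -> List[str]:
    """Return the console= devices in command-line order, with options stripped."""
    devices = []
    for arg in cmdline.split():
        if arg.startswith("console="):
            # "ttyS0,115200n8" -> "ttyS0"
            devices.append(arg[len("console="):].split(",", 1)[0])
    return devices


def console_roles(cmdline: str) -> Tuple[Optional[str], List[str]]:
    """Split console devices into (primary, secondaries).

    Assumption: the last console= entry is the primary console; all
    earlier entries are secondary.
    """
    devices = parse_consoles(cmdline)
    if not devices:
        return None, []
    return devices[-1], devices[:-1]


if __name__ == "__main__":
    for cmdline in (
        "console=tty0 console=ttyS0,115200n8",  # serial listed last
        "console=ttyS0 console=tty0",           # VGA listed last
        "quiet rw",                             # no console= kargs at all
    ):
        primary, secondary = console_roles(cmdline)
        print(f"{cmdline!r}: primary={primary}, secondary={secondary}")
```

With `console=ttyS0 console=tty0` the serial port ends up secondary, which matches the "you only get secondary console" result reported above: per the discussion, a secondary console receives kernel messages but not the journal, Ignition failures, or an emergency shell prompt.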
17:18:22 anyway, maybe let's table this for now
17:18:31 there's 12 minutes left
17:18:32 i think we still have time for another ticket at least
17:18:35 jlebon: yeah
17:18:52 bgilbert: +1 embed kargs via installer
17:18:53 jlebon: lots of production workloads on Windows, so the world can't assume serial is useful
17:19:22 +1 to table
17:19:34 #topic design journal persistence for early provisioning
17:19:38 #link https://github.com/coreos/fedora-coreos-tracker/issues/955
17:19:49 walters: around? :)
17:19:58 sorry maybe the other one as it is older?
17:20:07 (sorry should have raised that earlier)
17:20:13 was thinking that, but i think it'd be nice to have lucab for that one
17:20:19 oh, true
17:20:22 missed that
17:20:28 go on!
17:20:40 +1
17:20:53 hmm ok, doesn't seem like walters is around, so i can do a TL;DR
17:21:29 when using Ignition kargs, journal messages of the truly first boot are lost because we obviously never reach real root
17:21:54 so the RFE is to add some custom goop to persist it somehow
17:22:38 this is useful for example if you want to look at the logs from ignition-kargs.service on a successful boot
17:23:08 +1 on this
17:23:28 I'm not 100% clear on the security implications here as we could be storing things unencrypted in /boot
17:23:34 are those logs expected to be useful?
17:24:08 bgilbert: yeah, that's my primary concern as well. there's a complexity vs utility tradeoff there
17:24:08 (in some cases where people expect those logs to never be stored on an unencrypted partition)
17:24:23 travier: good point
17:24:40 (writing that in the ticket)
17:25:05 to play devil's advocate, those logs are unlikely to be sensitive
17:25:20 true
17:26:05 We could run a few tests, verify, and make a note for users in the docs.
17:26:08 but this is another untrusted thing that we would feed the final system / the journal
17:26:43 which makes measured boot for FCOS harder
17:26:48 honestly I think people developing FCOS and troubleshooting user issues would want this a lot!
17:26:55 ^^
17:27:05 also, when using kargs to configure cluster nodes
17:27:24 (not saying that we should not do this)
17:27:25 for live troubleshooting, you can just look at the serial console :P
17:27:32 heh
17:27:33 the story here is: "user complains something isn't working, provides ignition config, no evidence in logs $thing didn't work"
17:27:46 It's a sad story.
17:27:51 one of despair.
17:28:06 2 minute warning.
17:28:10 dustymabe: why? the kargs boot is successful by definition (or we wouldn't have rebooted) and there's very little there that doesn't run again on the next boot
17:28:13 it'd have to be something which somehow originates from the first boot, but breaks the second boot
17:28:20 which seems unlikely
17:28:58 what bgilbert said :)
17:29:17 It can be a sanity check for that first boot ("Did I pass...")
17:29:31 the disk GUID randomization might happen during the first boot, but also I don't remember that ever failing
17:29:36 bgilbert: yeah, maybe the failure cases are limited in number
17:29:49 i can think of a case where it failed
17:30:06 we're at time
17:30:08 ok, we're at the hour. i'll write up some of the thoughts from here into the ticket and let's do a quick open floor
17:30:22 #topic
17:30:22 #topic Open Floor
17:30:27 #undo
17:30:27 Removing item from minutes:
17:30:30 #undo
17:30:30 Removing item from minutes:
17:30:32 #topic Open Floor
17:30:52 anything anyone wants to mention?
17:31:37 i'll mention: latest FCOS testing/next releases now have experimental support for modules
17:31:40 #info aarch64 images have shipped in the most recent set of releases
17:32:01 #info you can now do e.g. `rpm-ostree ex module install cri-o:1.20/default` in the latest testing release
17:32:05 jlebon, we need to figure out what we need to do to the main website to get it to show content
17:32:20 dustymabe: indeed
17:32:34 i guess another drop-down
17:33:11 closing in 45 seconds
17:33:23 i guess I should open an issue and ask for any volunteers
17:33:43 +1
17:33:53 As I've said before we should make sure that the wording on our aarch64 FCOS announcement is clear. Maybe this would be best to add to Fedora 35 release notes as well even though it's not strictly related (but would give us more publicity)
17:33:55 /me misses allen
17:34:05 travier: +1
17:34:21 travier: +1
17:34:25 #endmeeting