16:29:14 #startmeeting fedora_coreos_meeting
16:29:14 Meeting started Wed Jun 28 16:29:14 2023 UTC.
16:29:14 This meeting is logged and archived in a public location.
16:29:14 The chair is dustymabe. Information about MeetBot at https://fedoraproject.org/wiki/Zodbot#Meeting_Functions.
16:29:14 Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:29:14 The meeting name has been set to 'fedora_coreos_meeting'
16:29:19 #topic roll call
16:29:43 .hi
16:29:44 ravanelli: ravanelli 'Renata Ravanelli'
16:29:49 .hi
16:29:50 jdoss: jdoss 'Joe Doss'
16:30:02 .hi
16:30:03 gshomo: gshomo 'greg shomo'
16:30:04 .hi
16:30:07 quentin9696[m]: Sorry, but user 'quentin9696 [m]' does not exist
16:30:59 #chair ravanelli jdoss gshomo quentin9696[m]
16:30:59 Current chairs: dustymabe gshomo jdoss quentin9696[m] ravanelli
16:31:33 .hello marmijo
16:31:34 marmijo[m]: marmijo 'Michael Armijo'
16:32:08 #chair marmijo[m] Guidon
16:32:08 Current chairs: Guidon dustymabe gshomo jdoss marmijo[m] quentin9696[m] ravanelli
16:32:25 .hello c4rt0
16:32:26 apiaseck: c4rt0 'Adam Piasecki'
16:32:38 #chair apiaseck
16:32:38 Current chairs: Guidon apiaseck dustymabe gshomo jdoss marmijo[m] quentin9696[m] ravanelli
16:33:23 ok let's get started
16:33:28 #topic Action items from last meeting
16:33:35 #info there were no action items from last meeting
16:33:57 but I do know that last time we did report that travier opened https://github.com/coreos/fedora-coreos-tracker/issues/1512
16:34:08 .hello siosm
16:34:09 travier: siosm 'Timothée Ravier'
16:34:13 ravanelli: I know you were working on that with him.. were you able to get a change proposal submitted?
16:34:41 .hello mnguyen
16:34:42 mnguyen_: mnguyen 'Michael Nguyen'
16:34:47 #chair travier mnguyen_
16:34:47 Current chairs: Guidon apiaseck dustymabe gshomo jdoss marmijo[m] mnguyen_ quentin9696[m] ravanelli travier
16:34:55 dustymabe: I created https://fedoraproject.org/wiki/Changes/EnableFwupdRefreshByDefault
16:35:24 I would appreciate some reviews/input on it
16:35:26 Nice!
16:35:30 So we can move it to page complete
16:35:54 ravanelli: Can you paste the link to that page in the tracking issue?
16:36:00 #info please review https://fedoraproject.org/wiki/Changes/EnableFwupdRefreshByDefault and provide feedback for travier and ravanelli
16:36:16 We'll have to share it with the server & iot groups after that
16:36:41 issue: https://github.com/coreos/fedora-coreos-tracker/issues/1512
16:36:56 nice work to you both
16:37:02 i'll try to give my feedback
16:37:20 👍
16:37:26 #topic F39 Change: No default fedora-repos-modular [Consideration]
16:37:26 yes, I meant: could you paste the link to the wiki in the github issue?
16:37:40 travier: ooh, yes
16:37:43 #link https://github.com/coreos/fedora-coreos-tracker/issues/1513
16:37:47 👍
16:38:09 .hi
16:38:10 spresti: spresti 'Steven Presti'
16:38:23 We discussed the current $topic a bit last week (summarized in https://github.com/coreos/fedora-coreos-tracker/issues/1513#issuecomment-1601239985)
16:38:55 any thoughts on the path forward for us on the two options presented in that comment?
16:39:05 A) try to detect when someone has layered packages and add in fedora-repos-modular as a layered package itself
16:39:08 I would lean towards communication only
16:39:12 B) send communications and give users commands to use to convert the package to a layered package before the change lands
16:39:16 so B
16:40:06 The only problem with B (IMO) is that it is kind of a silent fail, right?
16:40:18 spresti: indeed
16:40:21 It should be easy to fix (?) and we can't always guarantee that arbitrary layered packages will always work on updates
16:40:35 travier: correct
16:41:02 I think the only semi-common case of layering a modular package may be cri-o
16:41:23 I think this is part of the things that we can not guarantee. We can not guarantee updates will always work once you layer packages
16:41:45 Could we create a unit that detects this and logs warnings (saying that repo x is no longer included), then roll back to fix their update path?
16:42:37 spresti: not sure I understand - the machine won't need rolling back (IIUC) because it failed the update and it's just happy sitting there on the old version
16:43:23 Yeah, I used bad verbiage with "rollback". I mean default the setup, aka option A
16:43:51 I think I am leaning more to A (from my perspective)
16:43:58 Sorry for the confusion.
16:44:32 yeah, option A is an option. The question is mostly whether it's worth the effort here (i.e. developing a solution, testing the solution, making sure it keeps working over time)
16:45:02 and also trying to make it only apply the workaround if actually needed (i.e. modular packages exist)
16:45:12 modular layered packages*
16:45:53 I think I'm leaning towards B here too.
16:46:21 but would be willing to revisit our solution if beta testing (i.e. when `next` switches to F39) yields bad results or new information
16:46:33 So at what point is a non-updated server a breaking problem?
16:47:12 define "breaking problem"?
16:48:23 Breaking being not able to update after applying the change
16:48:44 I think there is some nuance here..
16:49:06 in this particular case the core OSTree doesn't have any problem updating
16:49:24 it's IF you applied layered packages AND they were modular packages coming from modular repos
16:49:47 when you layer packages you assume some risk (i.e. we don't test that upgrade path in FCOS CI)
16:50:22 so I don't see it as a big deal breaker that some machines could stop updating.. after all there are multiple ways people could ID the problem and resolve it
16:50:26 ^ that's what I needed, thank you; yeah B makes most sense.
16:50:36 1 - watch coreos-status email announcements (where we will tell them about this)
16:51:02 2 - run `next` and preview what's coming. If your `next` nodes stop updating then you should try to figure out why
16:52:09 +1 to B
16:52:22 B++
16:53:00 #proposed We will communicate the removal of modular repos from Fedora CoreOS in F39 and give users steps to resolve the problems on their systems. If users are running `next` or `testing` streams they will also notice updates not coming in before their `stable` systems stop receiving updates.
16:53:59 votes?
16:54:02 +1
16:54:06 dustymabe: Would you mind explaining how users would solve it?
16:54:28 would it be before trying to update?
16:54:48 +1
16:55:17 uninstalling the package and re-installing the non-modular version should do it
16:55:21 +1
16:55:30 agree that we should give an example with the announcement
16:55:35 ravanelli: assuming the modular package that the user was layering is still a modular package, then I think they could just resolve the problem by re-layering in the `fedora-repos-modular` package (i.e. if it is currently a base package then it will be tracked as an inactive base package, but the request will exist).
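
(A rough sketch of the kind of commands such an announcement could include, using cri-o only as a hypothetical example of a layered modular package; actual package names will vary per system:)

    $ rpm-ostree status                          # shows which packages are currently layered
    $ rpm-ostree install fedora-repos-modular    # keep the modular repos as an explicitly layered package
    $ # or drop the modular package and re-layer the non-modular build instead:
    $ rpm-ostree uninstall cri-o && rpm-ostree install cri-o
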
16:55:56 or indeed re-adding the repo works
16:56:21 apparently modules will be going away in Fedora at some point given the lack of traction
16:56:31 Thanks dustymabe and travier
16:56:35 +1 so
16:56:35 ravanelli: but yes, there will be a way for the user to run a command on their current system (non-updated) to put their machine in a position to auto-update again
16:56:56 #agreed We will communicate the removal of modular repos from Fedora CoreOS in F39 and give users steps to resolve the problems on their systems. If users are running `next` or `testing` streams they will also notice updates not coming in before their `stable` systems stop receiving updates.
16:57:03 yep.. I think examples will be useful
16:57:09 ok moving on to the next ticket
16:57:17 #topic Adding an LVM devices file by default
16:57:21 #link https://github.com/coreos/fedora-coreos-tracker/issues/1517
16:57:41 * dustymabe will give some time here for people to read the description
16:58:46 TL;DR both Host (CoreOS) and VM Guest (could be any Linux) are trying to control/access block devices with LVM signatures created by the guest.
16:59:49 This seems like a good addition. I am using FCOS with qemu to run VMs. It makes a great VM host node OS. Anything that makes running VMs easier is a +1 from me. I wish qemu could be included too so I didn't have to layer it, but I know that is a bigger discussion.
17:00:11 jdoss: +1
17:01:09 It doesn't help that qemu pulls the kitchen sink in with it.
17:01:38 ha - but that's part of what makes qemu great.. it is a swiss army knife
17:01:52 yeah for sure
17:02:22 The approach listed in the ticket is to:
17:02:29 1) create an empty devices file for new installs
17:02:38 2) migrate existing systems to use an lvmdevices file
17:03:17 looks good to me
17:03:29 the goal is that 2) is a one-time migration and 1) is something we do for new installs for now, but hopefully have the lvm2 RPM own that piece in the future (I opened an RFE at https://bugzilla.redhat.com/show_bug.cgi?id=2217510)
17:03:35 not sure we should provide the transition script but we can try
17:04:24 any glaring problems with this proposal (before I do #proposed)?
17:05:15 actually - a good question here is when we should roll this out? immediately? do we need to give some warnings to users?
17:05:44 the migration script should behave such that existing LVM devices attached to a system (i.e. if someone was using a block device with LVM on it for /var/) should continue to work
17:06:14 what wouldn't work is if say someone did a new install and then attached an existing block device to the newly installed system with a previously created LV
17:07:02 I don't think that's something we can reasonably support
17:07:04 but they could be easily imported with a single command
17:07:45 for PXE / ephemeral setups, it would mean that we would need to keep the migration script forever
17:08:00 (maybe?)
17:08:22 well.. it would just mean the user would have to change their provisioning to include a file or an import step
17:08:26 🤔
17:09:01 or they could remove the file via Ignition?
17:09:05 so they could write the lvmdevices file via Ignition with the contents they want or run an import command (provided by LVM) to do the same
17:09:14 travier: correct.. they could remove the file too
17:09:41 to get back to the current behavior
17:09:50 hum, can we actually remove files with Ignition?
17:09:57 I believe we can
17:10:16 spresti: might recall off the top. I'd need to look up docs
17:10:20 https://github.com/coreos/ignition/issues/739
17:10:45 ahh - ok then :)
17:11:03 Sorry, not off the top of my head
17:11:06 so this needs a tmpfiles.d entry or a script but it's doable
17:11:12 right
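
(A minimal sketch of what was just discussed, assuming the devices file ends up at the usual /etc/lvm/devices/system.devices location. A Butane config could ship a tmpfiles.d entry that removes the file again, restoring today's "scan everything" behavior:)

    variant: fcos
    version: 1.5.0
    storage:
      files:
        - path: /etc/tmpfiles.d/no-lvmdevices.conf
          contents:
            inline: |
              # remove the default LVM devices file so LVM scans all block devices again
              r /etc/lvm/devices/system.devices

(Alternatively, a volume created elsewhere can be added to the allow list with LVM's own tooling, e.g. `lvmdevices --adddev /dev/sdb` for a single device or `vgimportdevices -a` to import everything visible; /dev/sdb is just a placeholder here.)
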
17:12:10 travier: so given that limitation, what do you still think of 1) and 2)?
17:12:41 I think we should do 1
17:12:52 2 is more complex, not sure if we should do it
17:13:20 but if we end up doing it for RHCOS, then might as well have it in FCOS
17:13:40 well, we kind of have to do 2 (at least a one-time migration) if we do 1, because otherwise people's upgrading systems stop working
17:14:10 we kind of don't support LVM in ignition but fair
17:14:12 or I see.. 1) only applies to new installs
17:14:42 yes, only for new installations, but that's maybe harder to do
17:15:24 we could definitely do it - it just wouldn't be using a file (there would have to be a unit with some logic)
17:16:42 those kinds of automated migrations are really hard to do, as we don't know what users have put in place
17:16:53 what if there is already a file there?
17:17:02 maybe we can go with this proposal for now and if we find any new information during testing we can bring it back to the ticket?
17:17:08 #proposed we will ship an empty lvmdevices file for new installs and also add a migration script for existing systems that will populate an lvmdevices file with appropriate content so that existing systems using LVM will continue to work.
17:17:22 +1
17:17:27 +1
17:18:25 any other votes?
17:18:38 I need time to digest it
17:18:52 🥪
17:18:57 +1
17:19:08 +1
17:19:15 +1
17:19:23 #agreed we will ship an empty lvmdevices file for new installs and also add a migration script for existing systems that will populate an lvmdevices file with appropriate content so that existing systems using LVM will continue to work.
17:19:46 Of course, if new information comes out or if we find a better way to achieve the goal, we'll bring that information to the ticket and pivot.
17:19:59 #topic tracker: Fedora 39 changes considerations
17:20:04 #link https://github.com/coreos/fedora-coreos-tracker/issues/1491
17:20:09 👍
17:20:23 ok - I updated the description this morning with new changes that have come in
17:20:35 subtopic 117. Retire AWS CLI version 1 package awscli
17:21:44 I think maybe the only thing that this implies for us is that we need to make sure the new package and not the old package is getting installed in COSA...
17:21:50 and... I just checked
17:21:58 [coreos-assembler]$ rpm -qa | grep aws
17:22:00 python3-awscrt-0.16.19-1.fc38.x86_64
17:22:02 awscli2-2.11.18-1.fc38.noarch
17:22:07 so we already have `awscli2`
17:22:33 so no action for us?
17:23:08 quick question about that
17:23:09 Well that turned out to be easy.
17:23:31 👍
17:23:32 quentin9696[m]: go for it
17:23:32 why doesn't FCOS include awscli by default on the AWS AMI?
17:23:54 quentin9696[m]: maybe let's cover that in open floor
17:24:06 dustymabe: sure
17:24:08 subtopic 118. No fedora-repos-modular in default installation
17:24:26 ok this is the topic we discussed earlier - covered by https://github.com/coreos/fedora-coreos-tracker/issues/1513
17:24:39 subtopic 119. LIBFFI 34 static trampolines
17:25:18 I think this one should be transparent to us. No changes for us to implement since we just consume packages.
17:25:31 👍/👎 ?
17:26:06 If it's just consumed then +1
17:26:14 do we ship that?
17:26:27 we do
17:26:45 subtopic 120. Flatpaks without Modules
17:26:55 we don't ship flatpak, nothing for us to do.
17:27:05 subtopic 121. Increase vm.max_map_count value
17:27:08 119: should be transparent
17:27:51 increasing vm.max_map_count should be transparent to us. I don't know of a reason this would cause problems
17:28:09 +1
17:28:11 subtopic 122. Make Toolbx a release-blocking deliverable and have release-blocking test criteria
17:29:06 We ship the toolbox software but not the toolbox image (gets pulled from registry). Overall this shouldn't affect us (other than making sure the toolbox container images are in good shape before release).
17:29:30 subtopic 217. Aspell Deprecation
17:29:42 I'm pretty sure we don't ship this.
17:29:53 subtopic 218. Automatic Cloud Reboot On Updates
17:30:06 This has to do with the Fedora Cloud image and cloud-init specifically. Doesn't apply to us.
17:30:15 subtopic 219. Vagrant 2.3
17:30:19 We don't ship vagrant
17:30:27 :)
17:30:45 Anything on those that we should go over before moving to open floor?
17:30:53 Wow, awesome, thank you for running through them so fast
17:31:10 spresti: we were running out of time and the last few were easy :)
17:31:14 #topic open floor
17:31:17 quentin9696[m]: you had something?
17:31:55 yes, can we add the AWS CLI by default on the AWS AMI provided by FCOS?
17:31:58 LGTM for the other changes
17:32:05 FYI: I won't be around next week for the meeting. I know there will be a bunch of other people out too for the holiday. Someone could volunteer to run the meeting OR we could cancel the next one.
17:32:34 we don't include any cloud tools in the image (all images have the same packages)
17:32:54 quentin9696[m]: we ship the same FCOS image everywhere, so that would mean we'd need to ship the AWS CLI packages and gcloud packages and Azure client packages everywhere
17:33:01 the list goes on
17:33:01 I should be able to run the meeting
17:33:08 travier++
17:33:14 thank you travier
17:33:21 thank you, everyone!
17:33:41 quentin9696[m]: so it gets rather large if you try to include them all.. running them from a container however - pretty easy to do
17:34:02 you could even do it in toolbox if you like (just install the packages you want/need)
17:34:13 (gcloud is very big from memory)
17:34:27 dustymabe: oh ok, in my mind you shipped a modified version to adapt to the cloud provider. That's ok for me, I was just curious about that
17:34:54 quentin9696[m]: the only real modification we make to images is to ship a different platform ID in the kernel arguments for the image
17:34:56 dustymabe: actually we run a container with the AWS CLI
17:35:14 all the software inside the image can then "differ" in behavior based on that platform ID kernel argument
17:35:25 so it's all the same bits inside, they just may behave slightly differently
17:35:49 any other topics for open floor?
17:36:01 For aws-cli: just so you know the issue: running a container is very heavy (RAM/CPU). It's really a pain to manage.
17:36:13 Especially on low-RAM machines
17:36:16 thanks for the explanations
17:37:05 Guidon: it depends on what that container is doing :) after all a container is mostly just a Linux process
17:37:46 maybe grabbing the container image and laying it out on the filesystem is a bit compute/network intensive, but once the setup happens and the process is running I wouldn't expect it to have a different footprint than running the process natively
17:38:32 * dustymabe will close the meeting now
17:38:38 #endmeeting
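
(A rough post-meeting sketch of the container approach discussed above; the image name and mount path are the upstream AWS CLI defaults, not anything shipped by FCOS:)

    $ podman run --rm -it -v ~/.aws:/root/.aws:z docker.io/amazon/aws-cli s3 ls
    $ # or install it inside a toolbox instead:
    $ toolbox create && toolbox enter
    $ sudo dnf install awscli2    # run inside the toolbox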