16:30:50 #startmeeting fedora_coreos_meeting
16:30:50 Meeting started Wed Feb 26 16:30:50 2020 UTC.
16:30:50 This meeting is logged and archived in a public location.
16:30:50 The chair is dustymabe. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:30:50 Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:30:50 The meeting name has been set to 'fedora_coreos_meeting'
16:30:54 #topic roll call
16:30:58 .hello2
16:30:59 bgilbert: bgilbert 'Benjamin Gilbert'
16:31:01 .hello mnguyen
16:31:02 mnguyen_: mnguyen 'Michael Nguyen'
16:31:05 .hello2
16:31:06 dustymabe: dustymabe 'Dusty Mabe'
16:31:34 .hello2
16:31:35 miabbott: miabbott 'Micah Abbott'
16:31:38 .hello2
16:31:39 cyberpear: cyberpear 'James Cassell'
16:31:52 .hello sinnykumari
16:31:52 ksinny: sinnykumari 'Sinny Kumari'
16:32:43 .hello2
16:32:44 jlebon: jlebon 'None'
16:32:44 #chair mnguyen_ miabbott cyberpear ksinny darkmuggle jlebon
16:32:44 Current chairs: cyberpear darkmuggle dustymabe jlebon ksinny miabbott mnguyen_
16:33:09 .hello2
16:33:10 darkmuggle: darkmuggle 'None'
16:33:53 #chair bgilbert
16:33:53 Current chairs: bgilbert cyberpear darkmuggle dustymabe jlebon ksinny miabbott mnguyen_
16:33:58 #topic Action items from last meeting
16:34:35 ok only topic is the one about the infographic.. i've got that on my todo, i'm going to stop re-actioning it now as it's a reminder of my failure :)
16:34:38 .hello2
16:34:39 walters: walters 'Colin Walters'
16:34:43 welcome walters
16:34:48 #chair walters
16:34:48 Current chairs: bgilbert cyberpear darkmuggle dustymabe jlebon ksinny miabbott mnguyen_ walters
16:36:06 #topic cosa/mantle integration
16:36:19 #link https://github.com/coreos/coreos-assembler/issues/163
16:36:47 .hello2
16:36:48 jdoss: jdoss 'Joe Doss'
16:36:49 +1 to shrink the 3G cosa container
16:37:15 Our teams have been working with the coreos-assembler and mantle repositories to build and deliver Fedora CoreOS and Red Hat CoreOS
16:37:36 mantle is a repo that was built up in the CoreOS Inc days to deliver container linux
16:37:53 #chair jdoss
16:37:53 Current chairs: bgilbert cyberpear darkmuggle dustymabe jdoss jlebon ksinny miabbott mnguyen_ walters
16:38:01 we've borrowed many things from it, but now that Container Linux is going away we've decided to evaluate merging COSA and mantle together
16:38:29 here is a summary of a discussion the team had yesterday: https://github.com/coreos/coreos-assembler/issues/163#issuecomment-590995844
16:38:39 cyberpear: i don't think this will change the cosa img size much :)
16:38:58 and there is a WIP PR to merge mantle into COSA: https://github.com/coreos/coreos-assembler/pull/1152
16:39:41 This should help us with the "does this new feature belong in mantle or COSA" dilemma we've had
16:40:09 anyone have any questions or comments or things to add to what I said above ?
16:40:40 The motivation is important to highlight
16:40:52 * lorbus says whoops he's late
16:40:53 .hello2
16:40:54 lorbus: lorbus 'Christian Glombek'
16:41:00 #chair lorbus
16:41:00 Current chairs: bgilbert cyberpear darkmuggle dustymabe jdoss jlebon ksinny lorbus miabbott mnguyen_ walters
16:41:24 We have been slouching towards a tight coupling and we realized that the divide between them is no longer useful to developers and users.
16:42:12 #info we're evaluating merging the mantle repo into coreos-assembler as the line between the two is blurring. With container linux going away the need to have them separate is even less. See https://github.com/coreos/coreos-assembler/issues/163
16:42:25 darkmuggle: agreed
16:42:46 * dustymabe waits briefly for more comments
16:44:04 #topic Request for usbguard package inclusion
16:44:11 #link https://github.com/coreos/fedora-coreos-tracker/issues/326
16:44:27 ok, we've bounced this one back and forth a bit
16:45:22 assuming it doesn't add a bunch of deps and doesn't run anything by default, should we include it so we can scratch this itch ?
16:46:37 I think every time we do that other projects (wireguard, openvswitch, kata, ...) are going to ask why we don't do the same for them since they need to support traditional systems too and don't want to maintain a container too
16:46:53 did anyone have opinions on my "semi-supported packages to layer" approach?
16:47:34 walters: i've wanted something like that for a while
16:47:37 I think it'd be good to have a list of "if you need these packages, it's okay to layer them rather than try containerizing them"
16:47:57 except it'd be nice if it was just some sort of 'addon' you enabled
16:48:10 if we filtered the fedora repo down it would also likely cut the rpm-md size down from like 70MB to 1-2
16:48:11 and maybe eventually have a "fat" version of the distro and a "skinny" version
16:48:19 i.e. we have the base ostree.. and then we have ONE other layer that includes stuff that fits in this grey area
16:48:30 As an end user I have heard the "Don't add a layer or install things" because it will make automated updates not work? I am not sure how much that is true but it makes me not want to deal with it.
16:48:36 the issue with package layering isn't really "whitelisting" good ones as much as it's about changing the nature of automatic updates
16:48:50 jlebon: my suggestion would help there I think
16:49:42 dustymabe: hmm, so e.g. we'd ship two OSTrees per stream each release?
16:49:54 it does in *some* cases introduce https://github.com/projectatomic/rpm-ostree/issues/415 - in others though, e.g. wireguard would be extremely unlikely (IMO) to break anything
16:49:54 jlebon: something like that, yes
16:50:16 jlebon: and we'd try to test them both
16:50:49 i.e. rather than having a yum repo of "ok to layer packages" we have a base ostree with an addons layer
16:51:07 so we just have 1 extra set of tests
16:51:18 in https://blog.verbum.org/2019/12/23/starting-from-open-and-foss/ I argue to think of RPM layering like "Firefox extensions for the OS" - the problem domains around updates breaking extensions are quite analogous
16:51:20 rather than N extra sets of possible tests
16:51:31 hmm
16:51:52 * dustymabe notes this is mostly a random idea
16:52:02 hmm, ok so you're saying an "overlay" OSTree rather than a whole other tree?
16:52:17 right, would prefer it derive from the real thing
16:52:38 i think the problem root is lifecycle binding/versioning/testing extensions (rpms) with the OS - we could also (as discussed in that thread) teach rpm-ostree how to find an rpm-md repo whose "version" corresponds with the OS
16:52:42 but, of course. there would be a lot of details to work out too
16:52:47 shipping an overlay ostree is one implementation of binding
16:53:26 walters: so if we solved the yum repo versioning problem you think it would solve this problem ?
16:53:35 walters: yeah agreed. i think if we had the binding story figured out, IMO it'd be much easier to recommend pkglayering
16:53:51 i'm not opposed to the "extras ref"; if e.g. we say that nothing in there does anything by default (e.g. perhaps one has to explicitly turn on the units via Ignition) then it seems safer
16:54:10 i know otaylor was looking at this a while back, as it's relevant to Silverblue, though not sure if something came of that
16:54:17 (I'd be surprised if e.g. wanting wireguard suddenly started running openvswitch)
16:54:24 jlebon: yeah and lorbus just saw a customer hit it too
16:54:59 tangent: appropriately the solution we've been talking about applies to FCOS
16:55:12 I do believe the request for usbguard has also been made for RHCOS
16:55:31 so if we were to say "package layer it" in FCOS, how would that apply to FCOS?
16:55:51 on a side note: SUSE also has a way of doing transactional rpm installs client-side...not sure if we can learn anything from the way they're doing it
16:55:52 sorry, RHCOS
16:56:12 lorbus: they're using btrfs snapshots IIUC
16:56:20 lorbus: right, the microOS thing is much closer to traditional RPM - i.e. depsolving *always* happens client side
16:56:23 so they're just using the package manager (zypper)
16:56:33 it means you can get state drift; it's not really an "image system" by default
16:56:50 true.. https://github.com/openSUSE/transactional-update
16:57:02 jlebon: Nothing came of it, my attention went elsewhere, still an outstanding problem :-) ... most approaches would require somewhat significant Fedora infrastructure work
16:57:03 nvmd thenso
16:57:04 https://github.com/openSUSE/transactional-update#caveats is basically all stuff rpm-ostree fixes
16:57:09 dustymabe: right, it'd probably end up having to be baked in or just shipped as RPMs a-la-kernel-rt
16:57:18 imo, since RHCOS use case is much more narrow than FCOS, we have different requirements/expectations from customers. it's good to have the discussion about including new pkgs in the RHCOS base, but i don't think the decision tree is going to be the same as FCOS
16:58:05 Well, the other big RHCOS distinction is that it's part of OpenShift 4 which is oriented around container images entirely - so shipping extensions there would likely be via container images
16:58:17 i mean, it already is with kernel-rt
16:58:19 otaylor: :( i think this is going to become much more relevant again once silverblue starts slowing down its cadence
16:59:02 ok, let's try to wrangle this discussion back in
16:59:13 anybody with a suggested way forward here ?
16:59:28 we obviously have a macro problem, which is we can't include the world in the host
16:59:52 but we do have a lot of small needs that aren't easily suited or desired to be run in a container
16:59:53 hmm, one path is to own the side yum repo ourselves
16:59:59 I really think for FCOS we need a good framework for helping users customize the OS in a best practice kind of way. I am still not sure what that really is from my angle besides shove it in a container and do a bunch of work on my end.
17:00:30 so basically every release also dumps known to match "good" RPMs into a yumrepo. and we ship the repo config in the host and disable the others by default
17:00:58 jlebon: define "good" ?
17:01:35 essentially walters' idea of whitelisting, and the whitelist controls what goes in the repo
17:01:35 something like a yaml file with a list of packages, we run a service that depsolves vs base OS and maintains multiple versions of them?
17:01:43 I think the kernel-rt way walters mentioned may be a good way to go here? expand that to work for other RPMs as well
17:02:00 to me good could mean two different things
17:02:01 the crucial part is that we *know* it layers/depsolves successfully at compose time
17:02:13 1. packages in this blessed list can be installed
17:02:23 2. packages in this blessed list have been tested to work
17:02:42 right, this would unlock meaningful testing too
17:02:50 and of course we could even run tests associated with that package but before we run there we need to integrate with the existing gating for the base probably...
17:03:38 basically, similar to RHCOS' approach of shipping the kernel-rt RPMs in the container, but we ship it in a yum repo :)
17:04:12 ok. I like the brainstorming we're doing here, but I do think implementing anything like this is going to be some time off
17:04:15 would anyone disagree ?
17:04:15 (For reference: https://docs.google.com/document/d/1yS0PTaUPmD-CkQkdlJ9OvY-Z5hBOIMlRKDyqutEG_ks/edit?usp=sharing was my analysis of fixing the issue on Fedora)
17:04:35 (Of course for OKD...raises the interesting question of whether OKD would match OCP and mirror extras into a container)
17:05:10 I think that'd be preferable..
17:05:27 dustymabe: agreed, though i think implementation wise it's probably the easiest path
17:05:43 it's the rpm-ostree vs pivot way of updating the OS
17:05:46 (while still addressing the lifecycle problem)
17:05:51 jlebon: we probably need to have more discussion about the implementation
17:06:23 but just in general we think we'd like to hold off on usbguard because we'd like to implement some sort of "reliable extensions" framework to handle use cases like this
17:07:14 +1, IMO i think it's worth thinking more on this before we ship it
17:08:05 +1, like the new approach of shipping additional whitelisted packages in separate repo
17:08:06 i think my bottom line take on usbguard is for now anyone who wants it on FCOS can pkg layer and that should just work; for OpenShift 4...we could bake it into RHCOS short term but hopefully the fact that it's on the host is an implementation detail and e.g. we could switch to having it be a daemonset mostly transparently or so if we later decide to do that?
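The compose-time check discussed above (a list of allowed packages that is depsolved against the base OS, with only the packages that resolve ending up in a side repo) could be sketched roughly as follows. This is a toy model in Python and not how rpm-ostree or any Fedora tooling actually works: real RPM dependencies are capabilities rather than plain package names, and the function names, the `base_os` set, and the dependency lists in `fedora_repo` are all made up for illustration.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple


def resolve_extension(pkg: str, base: Set[str],
                      repo: Dict[str, List[str]]) -> Optional[Set[str]]:
    """Return the extra packages needed to layer `pkg` on `base`.

    Returns None if some dependency is in neither the base OS nor the repo.
    """
    needed: Set[str] = set()
    stack = [pkg]
    while stack:
        name = stack.pop()
        if name in base or name in needed:
            continue  # already on the host, or already scheduled
        if name not in repo:
            return None  # unresolvable: catch this at compose time
        needed.add(name)
        stack.extend(repo[name])
    return needed


def build_extensions_repo(allowlist: Iterable[str], base: Set[str],
                          repo: Dict[str, List[str]]) -> Tuple[Set[str], List[str]]:
    """Split the allowlist into packages to publish and packages that fail."""
    publish: Set[str] = set()
    broken: List[str] = []
    for pkg in allowlist:
        closure = resolve_extension(pkg, base, repo)
        if closure is None:
            broken.append(pkg)
        else:
            publish |= closure
    return publish, broken


# Hypothetical data: package names are real, the dependency lists are invented.
base_os = {"glibc", "systemd"}
fedora_repo = {
    "usbguard": ["glibc", "systemd", "libqb"],
    "libqb": ["glibc"],
    "wireguard-tools": ["glibc"],
    "example-broken": ["no-such-lib"],
}
publish, broken = build_extensions_repo(
    ["usbguard", "wireguard-tools", "example-broken"], base_os, fedora_repo)
print(sorted(publish))
print(broken)
```

Running the check once per release against that release's base package set is what would tie the side repo's contents to the OS version, which is the lifecycle binding concern raised earlier in the discussion.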
17:09:45 walters: not to derail too much, but for RHCOS, hopefully this would be another install-config.yaml knob? if so, then yeah we can easily swap alternatives down the road
17:09:50 #proposed usbguard fits into a category of small OS utility/daemon that is not easy or desirable to containerize but is also not something we immediately want to include in the host because if we include every utility/daemon we end up with a kitchen sink OS. We'd like to develop a framework for "reliable extensions" that we can use to deliver usbguard and other utilities/daemons
17:10:02 the other related thing here is where Ubuntu is going with https://snapcraft.io/ - they've made it a nice experience to have "containerized" apps that work for CLI,GUI,Servers but are not aligned with the Docker/Kubernetes ecosystem and do require people to create both .deb and snaps (if relevant)
17:11:12 how does my statement look ?
17:11:30 ack
17:11:49 ack
17:12:01 ack
17:12:31 #agreed usbguard fits into a category of small OS utility/daemon that is not easy or desirable to containerize but is also not something we immediately want to include in the host because if we include every utility/daemon we end up with a kitchen sink OS. We'd like to develop a framework for "reliable extensions" that we can use to deliver usbguard and other utilities/daemons
17:12:58 #action dustymabe to create issue to discuss possibilities for a "reliable extensions" framework
17:13:05 ack
17:13:23 ok moving on to the next topic
17:13:47 #topic Add factory reset capability
17:13:51 #link https://github.com/coreos/fedora-coreos-tracker/issues/399
17:14:09 I think this idea has been floated before
17:14:23 but basically: we encourage users to reprovision from scratch whenever they have config changes
17:14:42 that's straightforward on VMs and clouds, and also on bare metal when there's PXE install infrastructure in place
17:15:04 but for a real single-node setup, like an air-gapped embedded appliance, it's not.
17:15:25 so I wonder whether it'd 1) make sense and 2) be feasible to offer a factory reset capability.
17:15:52 run a command, give it your new Ignition config, and it'll 1) put the Ignition config in /boot, 2) re-enable first-boot kargs, and 3) reboot.
17:16:04 the initramfs would not only do all the first-boot stuff, but delete any customizations first.
17:16:18 now, that probably doesn't work for **all** customizations.
17:16:28 if we committed the ignition config into the ostree repo by default this would be a bit easier
17:16:31 e.g. if you've moved the root FS, we probably won't move it back
17:16:54 ah yeah, resetting the rootfs would be...interesting
17:16:57 and so, conceivably, this feature could get bogged down in a pile of special cases
17:17:40 coreos-installer did use to be in the initramfs - so technically every system could be re-installed by just changing kernel CLI args in grub
17:17:43 but it seemed like it might be plausible. the work to support moving the root FS might help, since maybe we could reuse some of that infra.
17:18:21 dustymabe: yeah
17:18:27 thoughts?
17:19:00 seems good to me, though I'd be interested to know how much work we think it would be
17:19:22 dustymabe: that doesn't work if it's running from the disk it needs to install to though
17:19:59 jlebon: hence why I mentioned it used to be in the initRAMfs
17:20:02 In general we can reboot into
17:20:14 The initrd and run from ram
17:20:17 but yeah, maybe a higher-level approach is figuring out a way to rerun coreos-installer while still targeting that disk
17:20:50 dustymabe: ahh sorry, your "could" there is past tense :)
17:21:15 ...so actually, the stage2 discussion fits in here
17:21:22 could we "stage" the node somehow to reboot into live mode?
17:21:45 if there's a way to use our existing kernel and initrd in /boot to pivot into a live system by fetching the root FS from the network
17:21:50 and then run coreos-installer from that
17:22:07 that fixes _some_ cases. not the air-gapped single system, though.
17:22:48 we could even fetch before the reboot and stash somewhere
17:23:12 (uhh, kinda. if the install fails, and wipes the install image from ROOT, you don't get a second chance)
17:23:14 coreos-installer is a binary right? I wonder if all external deps are already in our initramfs
17:23:14 the initrd can mount the rootfs and copy out stuff like /usr/bin/coreos-installer into RAM, then blow it away
17:23:18 don't need to redownload
17:23:42 I've been assuming we don't actually want to run c-i from the initramfs proper
17:23:50 it's probably easier now than with the old shell script, true
17:23:54 only has a couple external deps
17:23:58 (but one of them is GPG)
17:25:11 oh wow. rube goldberg device:
17:25:33 * dustymabe notes time
17:25:35 1. before reboot, assemble a live squashfs from the ostree we already have
17:25:52 2. from the initramfs, mount the old root filesystem, copy the squashfs into ram
17:26:04 Yeah, that is the generalization
17:26:10 3. create a temporary partition at the end of the disk, copy the squashfs into it for safety
17:26:22 4. run coreos-installer. if it fails, that's okay, we still have the safety partition
17:26:31 5. reboot into new system
17:27:00 yeah.. seems like there are a lot of moving parts though
17:27:12 I'm not sure it's a serious proposal :_)
17:27:14 :-)
17:27:20 i think we'd have to consider how important something like this is
17:27:30 depending on the complexity of the proposed solution
17:27:35 yeah, agreed
17:27:37 obviously more complex == harder to maintain
17:27:43 to be clear, the original ticket wasn't contemplating rerunning the installer, just going through and deleting user customizations
17:27:56 which is messier in its own way, since we're probably not getting 100% back to pristine state
17:28:17 i'm confused though, even for air-gapped systems, they were installed somehow, right?
17:28:33 sure. maybe before the hardware was emplaced.
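The five-step "rube goldberg" sequence above can be walked through as a small simulation, mainly to show why the safety partition in step 3 matters when the installer fails in step 4. This is only a toy state model in Python: the `Node` fields, the `factory_reset` function, and the returned status strings are hypothetical, and nothing here calls coreos-installer or touches a disk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy model of the machine state that the five steps touch."""
    image_on_root: bool = False    # live squashfs stored on the old root filesystem
    image_in_ram: bool = False     # copy held in RAM by the initramfs
    image_on_safety: bool = False  # copy in the temporary partition at the end of the disk
    installed: bool = False
    log: List[str] = field(default_factory=list)


def factory_reset(node: Node, installer_succeeds: bool) -> str:
    """Run the five steps in order; report whether a retry is still possible."""
    node.image_on_root = True
    node.log.append("1: assemble squashfs from the existing ostree")

    node.image_in_ram = node.image_on_root
    node.log.append("2: copy squashfs into RAM from the initramfs")

    node.image_on_safety = node.image_in_ram
    node.log.append("3: copy squashfs into the safety partition")

    node.image_on_root = False  # the installer overwrites the old root filesystem
    node.log.append("4: run coreos-installer")
    if not installer_succeeds:
        # The image on the old root is gone, but the safety copy survives.
        return "retry-from-safety-partition"

    node.installed = True
    node.log.append("5: reboot into new system")
    return "rebooted"


failed = Node()
print(factory_reset(failed, installer_succeeds=False), failed.image_on_safety)
```

In the failure case the image on the old root filesystem has been wiped while the safety copy is still present, which is the "second chance" that the earlier network-fetch idea lacked.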
17:29:04 or maybe by hand
17:29:15 Folks, sometimes air-gapped systems just install themselves...
17:29:37 spontaneous installation
17:29:54 factory-installed FCOS
17:29:57 skynet is that you?
17:30:24 bgilbert: would it get us half-way there if we just fix the "keep state while rerunning coreos-installer" path?
17:30:24 bgilbert: I don't see anyone against factory reset
17:30:36 there are a few different ways to do it that we discussed
17:30:45 dustymabe: yup, we got some discussion out of it, which was my goal here
17:30:51 jlebon: sorry, which path?
17:30:56 should we add a summary to the ticket and see if we get any more discussion?
17:31:01 dustymabe: +1
17:31:35 bgilbert: being able to rerun coreos-installer, while keeping e.g. /var
17:31:48 on the same disk
17:31:57 #info all seem to be in agreement that factory reset would be nice to have.. there are a few different options for how to go about doing that. bgilbert will add them to ticket #399
17:31:59 jlebon: that should already be fixed as of the next c-i release. but no, doesn't help here
17:32:21 we're over time so I'm going to skip to open floor to see if there is anything
17:32:24 #topic open floor
17:32:39 jlebon: I think the primary point is having a way to reprovision without needing anything off-machine
17:32:59 ack
17:33:21 #info we did a new FCOS release for testing and stable starting today
17:33:30 thanks ksinny for running the release
17:33:52 thanks ksinny!
17:33:58 enjoyed working on the release :)
17:34:42 #info we migrated our production fedora ostree repos into a netapp volume to be accessed via various openshift projects that our teams will use to do automated imports and prunes of OSTree commits for Fedora CoreOS
17:34:53 anything else ?
17:35:00 \o/
17:35:06 dustymabe: *awesome*
17:35:18 nice work
17:35:30 jlebon: still working on one more issue with permissions https://pagure.io/releng/issue/8811#comment-628901
17:35:49 but we're getting close
17:35:56 I haven't unleashed the importer just yet
17:36:04 dustymabe: sorry i couldn't chat this morning, let's discuss after food! :)
17:36:09 ok will end meeting in one minute
17:37:06 #endmeeting