16:29:29 <dustymabe> #startmeeting fedora_coreos_meeting
16:29:29 <zodbot> Meeting started Wed Feb 15 16:29:29 2023 UTC.
16:29:29 <zodbot> This meeting is logged and archived in a public location.
16:29:29 <zodbot> The chair is dustymabe. Information about MeetBot at https://fedoraproject.org/wiki/Zodbot#Meeting_Functions.
16:29:29 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:29:29 <zodbot> The meeting name has been set to 'fedora_coreos_meeting'
16:29:31 <dustymabe> #topic roll call
16:29:32 <dustymabe> .hi
16:29:33 <zodbot> dustymabe: dustymabe 'Dusty Mabe' <dusty@dustymabe.com>
16:30:13 <travier> .hello siosm
16:30:14 <zodbot> travier: siosm 'Timothée Ravier' <travier@redhat.com>
16:30:31 <bgilbert> .hi
16:30:31 <zodbot> bgilbert: bgilbert 'Benjamin Gilbert' <bgilbert@backtick.net>
16:30:59 <jmarrero> .hi
16:31:00 <zodbot> jmarrero: jmarrero 'Joseph Marrero' <jmarrero@redhat.com>
16:31:17 <jlebon> .hello2
16:31:18 <zodbot> jlebon: jlebon 'None' <jonathan@jlebon.com>
16:31:40 <aaradhak> .hi
16:31:41 <zodbot> aaradhak: aaradhak 'Aashish Radhakrishnan' <aaradhak@redhat.com>
16:32:08 <dustymabe> #chazir travier bgilbert jmarrero jlebon aaradhak
16:32:53 <copperi[m]> .hello copperi
16:32:54 <zodbot> copperi[m]: copperi 'Jan Kuparinen' <copper_fin@hotmail.com>
16:33:41 <dustymabe> #chair copperi[m]
16:33:41 <zodbot> Current chairs: copperi[m] dustymabe
16:33:44 <dustymabe> welcome all
16:33:48 <travier> #chair travier bgilbert jmarrero jlebon aaradhak
16:34:06 <travier> there was a spurious 'z' in the previous one :)
16:34:15 <dustymabe> travier: haha
16:34:18 <dustymabe> oops
16:34:27 <dustymabe> #topic Action items from last meeting
16:34:36 <dustymabe> * dustymabe will communicate our feedback on the website redesign
16:34:37 <travier> but I think you need to do it
16:34:47 <dustymabe> #chair travier bgilbert jmarrero jlebon aaradhak copperi[m]
16:34:47 <zodbot> Current chairs: aaradhak bgilbert copperi[m] dustymabe jlebon jmarrero travier
16:35:12 <dustymabe> #info dustymabe took the feedback from last meeting to the websites team: https://gitlab.com/fedora/websites-apps/fedora-websites/fedora-websites-3.0/-/issues/89#note_1271079731
16:35:57 <dustymabe> #topic New Package Request: audit
16:35:59 <spresti[m]> .hello spresti
16:36:01 <zodbot> spresti[m]: spresti 'Steven Presti' <spresti@redhat.com>
16:36:03 <spresti[m]> sorry I am late all!
16:36:04 <dustymabe> #link https://github.com/coreos/fedora-coreos-tracker/issues/1362
16:36:07 <dustymabe> #chair spresti[m]
16:36:07 <zodbot> Current chairs: aaradhak bgilbert copperi[m] dustymabe jlebon jmarrero spresti[m] travier
16:36:12 <dustymabe> welcome spresti[m]
16:36:54 <travier> Will introduce this one
16:37:19 <travier> The idea is that we want to include the audit package in the system as it's a base system tool
16:37:43 <travier> The problem is that it comes with the legacy `service` command line
16:38:25 <travier> it's required for "compliance" reasons as we can not use systemd to stop/restart the audit daemon directly
16:38:44 <travier> it has to be traceable which user asked for audit to stop
16:38:46 <mnguyen> .hello mnguyen
16:38:47 <zodbot> mnguyen: mnguyen 'Michael Nguyen' <mnguyen@redhat.com>
16:39:46 <travier> systemctl/systemd "by-passes" that as it uses a daemon/control model that does not directly link to the user via the audit id stored in the kernel for each process and assigned on login
16:40:10 <travier> so in the end, it needs to use a legacy script to perform operation on the service
16:40:29 <travier> So we have several options to move forward:
16:41:00 <travier> The audit script already includes the legacy scripts that are run by the service command
16:41:13 <travier> /usr/libexec/initscripts/legacy-actions/auditd/restart, etc.
16:41:24 <travier> Option A: The short option is thus just to remove the service binary and man page in a post-script.
16:41:34 <travier> Option B: The long option is to rewrite those as a proper standalone script that is not correlated to the service binary.
16:41:44 <travier> Option C: Another option is to move the service binary somewhere else and include a wrapper script that only accepts auditd as an option for calls to service auditd <stop|restart|...> and rejects everything else.
16:41:53 <travier> (eoi)
16:41:56 <travier> end of intro
16:42:43 <jlebon> ideally we'd fix the audit package itself, so not A or C
16:42:52 <dustymabe> travier: mind if I ask.. what has changed since the last time we discussed this?
16:43:37 <travier> The last time was a while ago and there was things that needed to be removed from the package. We're now down to just this issue
16:43:44 <travier> were* things
16:44:43 <jlebon> maybe audit can rework the scripts so they're shipped in /usr/sbin instead and then make the service pkg a weak dep
16:44:50 <travier> Option B has the problem that we might be told to "just ship service"
16:44:50 <dustymabe> i.e. there were some other things (like python scripts) that were removed from the package ?
16:45:10 <travier> and the docs everywhere on the net mention service as a workaround
16:45:22 <travier> not problem -> downside
16:46:49 <travier> dustymabe: yes, there were some python deps that got removed / split
16:47:00 <dustymabe> travier: 👍
16:47:18 <travier> and the full initscript package got split into scripts & service sub packages
16:48:12 <dustymabe> Option A isn't ideal, because we've said in the past we wanted to minimize postprocess hacking and slashing
16:48:36 <jlebon> so e.g. have a `/usr/sbin/auditdctl [verb]` which is called by the service wrapper to not break people who still want to use it, but could also be called directly (which we'd recommend on FCOS)
16:48:37 <dustymabe> I think I like Option B the best, but maybe not move it somewhere else
16:48:41 <travier> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/security_guide/sec-starting_the_audit_service
16:48:57 <dustymabe> sorry Option C
16:49:11 <travier> https://access.redhat.com/solutions/2664811
16:49:12 <dustymabe> (given that option B is #harder)
16:49:37 <jmarrero> If we need this fast, I would be OK with A temporarily while B is implemented.
16:50:46 <jmarrero> *but no A if B is never gonna be implemented.
16:51:02 <dustymabe> i advocate for option C
16:51:10 <jlebon> i think we should get together with the maintainer and chat about options
16:51:14 <dustymabe> leave service binary in place but patch it to only support `audit`
16:51:55 <jlebon> i don't think we should do any processing FCOS-side without trying to do this in at the packaging level first
16:52:05 <travier> jlebon: agree, I should reach out to the audit maintainers to see which approach they would accept upstream
16:52:22 <jlebon> we're not the only ones who want to drop the dep on service
16:53:10 <dustymabe> jlebon: I fully support engagement upstream.. but that's kind of where we were two years ago (I think it was that long)
16:53:18 <dustymabe> and we're back here
16:53:47 <jlebon> dustymabe: IIUC, i think two years ago what was attempted was advocating for systemctl, which led to no movement
16:54:28 <jlebon> obviously that'd still be ideal fix, but barring that, there's still room for cleaner solutions
16:54:29 <dustymabe> i think at the time the author was open to an auditctl (some controlling utility)
16:54:37 <dustymabe> but wasn't willing to work on it
16:54:42 <dustymabe> which is OK
16:54:49 <dustymabe> let me see if I can find a link
16:57:16 <dustymabe> hmm. can't seem to find it right now
16:57:20 <dustymabe> maybe it was in an email
16:57:50 <travier> it was likely in the service split bz
16:58:09 <travier> https://bugzilla.redhat.com/show_bug.cgi?id=1768815 maybe
16:59:35 <travier> let's move to something else
16:59:43 <travier> I'll reach out to the audit maintainer
16:59:51 <dustymabe> travier: +1
16:59:52 <jlebon> travier: +1
17:00:01 <dustymabe> i think after that discussion then we can make a decision
17:00:10 <dustymabe> but the added (new) context will help
17:00:12 <travier> #proposed We'll reach out to the audit maintainer to try option B, while keeping option C as a backup
17:00:30 <dustymabe> #topic Ship the shimx64 binary in the CoreOS ISO image kind/enhancement
17:00:36 <dustymabe> #link https://github.com/coreos/fedora-coreos-tracker/issues/1413
17:00:46 <dustymabe> cc bgilbert
17:01:14 <bgilbert> there's a PXE use case that we don't really document
17:01:27 <bgilbert> which is: netbooting from UEFI with Secure Boot enabled
17:01:36 <bgilbert> (I'm not sure if that's still literally PXE but anyway)
17:02:00 <bgilbert> it requires shim, to chain from the MS signing keys in the firmware to the Fedora keys
17:02:33 <bgilbert> and of course, any random shim won't work. it needs to be a Fedora one with the Fedora keys
17:02:40 <bgilbert> (unlike, say, pxelinux.0, which can come from anywhere)
17:03:07 <bgilbert> and we don't currently provide a way to get the Fedora shim, without resorting to things like "finding the RPM"
17:03:29 <bgilbert> for a while it was accidentally possible to extract efiboot.img from the ISO and shim from efiboot.img
17:03:56 <bgilbert> but that was a redundant copy and we removed it. shim is still in the ISO, under the unhelpful name BOOTX64.EFI
17:04:20 <bgilbert> (and technically there's no contract that that file is shim)
17:04:27 <dustymabe> this is a problem :( but I wonder if it's one we can just throw a sledge hammer at rather than giving an elegant solution
17:04:38 <bgilbert> yeah, the question is the size of the hammer
17:05:00 <dustymabe> podman cp quay.io/fedora/fedora-coreos:stable:/path/to/shim-binary ./shim-binary
17:05:00 <bgilbert> any of the proposed solutions aren't very much work
17:05:18 <bgilbert> hmm
17:05:30 <dustymabe> it's downloading a huge amount of data for a tiny piece of it
17:05:49 <bgilbert> the furthest we could go is: add shim to the stream metadata as a fourth PXE artifact, and to the ISO image in /images/pxeboot for the same reason
17:05:56 <bgilbert> the latter so that `coreos-installer iso extract pxe` will work
17:06:19 <bgilbert> and the former so that `coreos-installer download -f pxe` will work, and the website will list it
17:06:56 <dustymabe> bgilbert: yeah, that sounds ideal. that would be the solution I would go with if we had infinite resources
17:07:08 <bgilbert> dustymabe, as you say, we could document a hack for extracting from the image. shim doesn't change very much
17:07:18 <dustymabe> though we shouldn't discount the size of the binary (i.e. extra size in ISO)
17:07:24 <bgilbert> shim is small
17:07:30 <dustymabe> 👍
17:07:32 <bgilbert> the main reason I'm hesitating is user confusion
17:07:39 <bgilbert> "what's this artifact? what should I do with it?"
17:07:52 <dustymabe> "If you have to ask, you can't afford it" :)
17:07:56 <bgilbert> and there might be scripts that would be confused by the installer downloading an extra thing
17:08:38 <dustymabe> i'm not too worried about user confusion on this front, but maybe I should be
17:08:52 <jlebon> bgilbert: you mentioned we're already shipping it in the ISO? could we just document how to extract it from there?
17:09:14 <bgilbert> there's multiple reasons not to document extracting it from its current path
17:09:25 <travier> I don't think it's likely that BOOTX64.efi would not be shim any time soon
17:09:26 <bgilbert> it's inside a VFAT image file inside the ISO, and it has a generic name
17:09:36 <bgilbert> but we could put a second copy inside the ISO directly
17:09:52 <bgilbert> if we put it in /images/efiboot, `coreos-installer iso pxe extract` will automatically extract it
17:10:00 <bgilbert> */images/pxeboot
17:10:33 <bgilbert> it's 925K
17:10:51 <jlebon> that sounds reasonable to me
17:10:55 <bgilbert> which?
17:10:56 <dustymabe> ok so order of preference:
17:11:06 <dustymabe> 1. make `coreos-installer iso pxe extract` work
17:11:23 <dustymabe> 2. add to pxe artifacts so downloading pxe using coreos-installer gives you shim
17:11:29 <jlebon> bgilbert: putting it in /images/efiboot
17:11:44 <dustymabe> we could always do 2. later if demand increases
17:12:10 <bgilbert> `iso pxe extract` is "supposed" to deliver the same artifacts as `coreos-installer download -f pxe` fwiw
17:12:15 <bgilbert> I'm not sure how important that is
17:12:32 <dustymabe> yeah, would be nice to keep it consistent
17:12:39 <jlebon> so the argument for 2. is that right now users don't need the ISO at all for pxe booting, and this would change that?
17:12:46 <jlebon> and consistency with `coreos-installer download`
17:12:54 <jlebon> gotcha
17:13:06 <bgilbert> eh, I'm not so concerned about needing the ISO to extract shim
17:13:18 <bgilbert> it's a relatively obscure use case (though maybe it shouldn't be) and shim changes seldom
17:13:25 <bgilbert> unlike the other artifacts, which change on every release
17:14:14 <bgilbert> re compat, it does feel odd to constrain ourselves not to add artifacts
17:14:17 <dustymabe> for me it's 1. - and then do 2. if you have time
17:14:32 <dustymabe> or really - we could just document this
17:14:33 <travier> We could also have a one liner dnf download from a container to get the RPM and extract the binary from it
17:14:43 <dustymabe> i.e. tell the users how to get shim from the RPM
17:14:56 <dustymabe> travier: yeah
17:15:07 <dustymabe> i think I highlighted an easier way above with my `podman cp`
17:15:20 <bgilbert> (I should maybe mention that the original reporter actually wants this for RHCOS, so FCOS docs don't 100% solve the problem)
17:15:26 <travier> I agree that the other options are "cleaner" but are they worth it?
17:15:34 <jlebon> and since it changes rarely, it's a one time cost when setting up your PXE server
17:15:45 <dustymabe> bgilbert: yeah, that's important - things aren't as easily accessible in that scenario
17:15:57 <travier> If we ship it as an artifact then this adds up in storage, etc. for everybody
17:16:09 <jlebon> but it's awkward to add a requirement on a new tool where before just `coreos-installer` sufficed. 1. maintains that property which is nice
17:16:15 <travier> but agree it's not much storage compared to the rest
17:16:56 <dustymabe> bgilbert: thoughts on a #proposed here?
17:17:35 <bgilbert> none, really. there's benefits to any of the approaches
17:17:45 <bgilbert> anyone feel strongly about an option?
17:18:15 <dustymabe> well - `podman cp` or `download the RPM and extract` aren't really good options for RHCOS
17:18:19 <jlebon> i'm ok with either 1 or documenting hacks to get it, but the UX for 1 is much nicer
17:19:02 <dustymabe> I'm ok with 1. (and 2. being optional TBH, though consistency is nice)
17:19:09 <jmarrero> downloading the rpm is less data but the podman example seems like most people will get right away.
17:19:09 <dustymabe> either way we need docs to mention this use case
17:19:33 <bgilbert> jlebon: views on 2?
17:19:40 <travier> in the RHCOS case you already have the container image from the release image
17:19:56 <travier> well, not on your system
17:19:57 <jlebon> bgilbert: seems premature for now
17:20:41 <dustymabe> maybe we do 1. and then open an issue for 2. detailing steps and rationale (and link from the code implementing 1.) - then we can implement it later if we want to
17:21:08 <bgilbert> I'm vaguely uncomfortable with `iso extract pxe` not matching `download`
17:21:12 <bgilbert> seems gratuitous
17:21:23 <bgilbert> (though we could put shim elsewhere in the ISO for manual extraction)
17:21:31 <bgilbert> `iso extract shim` :-P
17:21:35 <dustymabe> :)
17:21:56 <dustymabe> i'm cool with 2. too - just from my end didn't want to require it, but also cool if we as a group do decide to require it
17:23:02 <jlebon> bgilbert: that's an interesting idea actually. it emphasizes the fact that you shouldn't normally need this
17:23:15 <bgilbert> yeah, it's tempting
17:23:25 <dustymabe> jlebon: am I correct in understanding that you're not opposed to 2. - just maybe don't see the benefits of the extra effort ?
17:23:49 <dustymabe> jlebon: I guess that depends - "you shouldn't normally need this" - are we encouraging people use secureboot or not?
17:24:03 <jlebon> dustymabe: that, and also whether we're ok with the messaging it implies
17:24:07 <travier> dustymabe: it's needed only for PXE Secure Boot
17:24:12 <bgilbert> that's the thing. we "should" probably be rewriting our docs to encourage UEFI SB
17:24:31 <dustymabe> bgilbert: in which case we'd want to encourage this workflow more?
17:24:38 <bgilbert> yeah
17:24:58 <dustymabe> seems like supporting arguments for making it more a part of the "normal PXE workflow"
17:25:28 <jlebon> indeed :)
17:25:44 <bgilbert> maybe we should defer user-visible changes until we have draft docs
17:25:55 <bgilbert> so we don't get ahead of our understanding of the workflow
17:26:25 <dustymabe> bgilbert: WFM - do you have proposed next steps?
17:26:49 <bgilbert> find someone to work on rewriting the PXE docs to add and emphasize a SB section
17:27:04 <bgilbert> I'm not planning to work on it soon
17:27:27 <bgilbert> and put a note in the issue to consult this discussion re ways to expose shim to users
17:27:36 <dustymabe> ok so what should we take to the ticket?
17:27:58 <dustymabe> should we doc a workaround in the ticket for now?
17:28:27 <bgilbert> container extraction workaround seems fine as a workaround
17:28:36 <dustymabe> +1
17:28:36 <jlebon> +1
17:28:37 <bgilbert> first draft of the docs can use that too
17:28:54 <jmarrero> +1
17:28:56 <dustymabe> bgilbert: do you want to update the ticket or should I?
17:29:07 <bgilbert> do you want it?
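A minimal sketch of the container extraction workaround mentioned above, assuming podman is installed. As far as I know, `podman cp` copies from a container rather than from an image reference, so the sketch creates a stopped container first and removes it afterwards. The image name and the `/path/to/shim-binary` placeholder are taken from the log as-is; the function names and the container name `fcos-shim-extract` are hypothetical.

```python
import subprocess


def shim_extract_commands(image, path_in_image, dest, name="fcos-shim-extract"):
    """Return the podman invocations that copy one file out of an image."""
    return [
        # Create (but do not start) a container so there is something to copy from.
        ["podman", "create", "--name", name, image],
        # Copy the single file from the container filesystem to the host.
        ["podman", "cp", f"{name}:{path_in_image}", dest],
        # Clean up the temporary container.
        ["podman", "rm", name],
    ]


def extract_shim(image, path_in_image, dest):
    """Run the three commands, removing the container even if the copy fails."""
    create, copy, remove = shim_extract_commands(image, path_in_image, dest)
    subprocess.run(create, check=True)
    try:
        subprocess.run(copy, check=True)
    finally:
        subprocess.run(remove, check=True)
```

Calling `extract_shim("quay.io/fedora/fedora-coreos:stable", "/path/to/shim-binary", "./shim-binary")` would still pull the whole image, which is the "huge amount of data for a tiny piece" trade-off raised earlier in the discussion.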
17:29:21 <dustymabe> not really, but I am running the meeting so I will
17:29:26 <bgilbert> okay, ty
17:29:29 <bgilbert> I think this discussion was still useful to explore the option space, even if we're deferring a long-term decision
17:29:34 <dustymabe> +1
17:29:39 * dustymabe will update the ticket
17:29:41 <jlebon> yeah, agreed
17:29:48 <dustymabe> #topic open floor
17:30:30 <dustymabe> i went to open floor because we're out of time but I do want to point out that we seem to be at a dead end for our two paths we were pursuing for https://github.com/coreos/fedora-coreos-tracker/issues/1247 (which blocks ppc64le being released in our prod streams)
17:30:59 <dustymabe> the kernel package itself looks unlikely to change because of limitations with various possible boot firmwares for ppc64le
17:31:28 <dustymabe> and jmarrero reported that the opportunistic cleanup in rpm-ostree is sufficiently complex that we don't want to pursue that path either
17:32:02 <dustymabe> so I guess we're #opentoideas on next steps to take, I guess we can start to explore resizing our boot partition again
17:32:12 <dustymabe> i was just hoping not to have to wait for that to ship ppc64le
17:32:14 <jlebon> "Getting petitboot updated so it can boot a gzipped vmlinux could be done, but AFAIK petitboot is mostly unmaintained these days." is sad
17:32:25 <dustymabe> jlebon: indeed
17:32:37 <dustymabe> maybe we should just call it a day and drop the arch altogether
17:32:56 <bgilbert> can we drop the others too?
17:33:04 <jlebon> why again is this not a problem in RHCOS?
17:33:06 <jmarrero> +1
17:33:19 <dustymabe> jlebon: I think it is
17:33:53 <dustymabe> https://bugzilla.redhat.com/show_bug.cgi?id=2104619
17:33:55 <jlebon> dustymabe: ok. so we do have to find a solution for this regardless
17:33:58 <jmarrero> RHCOS is where we saw it initially IIRC
17:34:20 <jlebon> oh right, it was hacked around in the MCO
17:34:30 <jlebon> which... is something zincati could do too
17:34:40 <jlebon> cleanup the rollback before staging a new deployment
17:34:40 <dustymabe> what's the hack?
17:35:11 <dustymabe> yeah, that's kind of what we wanted rpm-ostree to do, but only if it was needed
17:35:12 <jmarrero> MCO cleans up the old deployment
17:35:14 <jlebon> obviously the inconsistency there is not great, and it has reliability implications
17:35:43 <dustymabe> jmarrero: does the MCO only do that on ppc64le ?
17:35:57 <jlebon> doing it in zincati would be worse because it increases the window between the point of no return and finding out the deployment you're on is broken
17:36:19 <jmarrero> mmm let me dig the PR
17:36:21 <dustymabe> jlebon: indeed - i.e. if your upgrade window isn't until the weekend - zincati stages today
17:36:47 <dustymabe> though I guess that wouldn't help even if it was implemented in rpm-ostree would it?
17:36:57 <jlebon> https://github.com/openshift/machine-config-operator/pull/3243/
17:36:58 <dustymabe> or does this step only happen in the finalize-staged step ?
17:38:01 <jlebon> dustymabe: not sure i follow
17:38:12 <jlebon> maybe we can continue in #fedora-coreos
17:38:14 <travier> the MCO does it on all arches AFAIK
17:38:25 <jmarrero> It happens for all Arches
17:38:44 <travier> https://github.com/openshift/machine-config-operator/pull/3243/#issuecomment-1180668694
17:38:47 <dustymabe> i.e. if we implemented "opportunistic cleanup" in rpm-ostree (like we started to try) would that have the same problem of "staged update today but not going to reboot to apply until saturday"
17:39:31 * dustymabe closes this meeting soon and we'll head over to #fedora-coreos to discuss more
17:39:34 <travier> I think we need to consider the "increase" /boot option
17:39:42 <dustymabe> travier: yeah
17:39:49 <jlebon> dustymabe: yeah let's chat there :)
17:39:54 <dustymabe> #endmeeting
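
A minimal sketch of the Option C wrapper from the audit topic: accept only `service auditd <verb>` and reject everything else. The relocated binary path `/usr/libexec/service.real`, the function names, and the allowed verb set are assumptions for illustration; only `stop` and `restart` are named in the log, and nothing here reflects what the audit maintainers would accept.

```python
import os
import sys

# Hypothetical location of the relocated `service` binary (not from the log).
REAL_SERVICE = "/usr/libexec/service.real"
ALLOWED_SERVICE = "auditd"
# Assumption: only the verbs named in the discussion; a real wrapper would
# list whichever legacy actions the audit package actually ships.
ALLOWED_VERBS = frozenset({"stop", "restart"})


def build_command(args):
    """Return the argv to exec, or raise ValueError if the call is rejected."""
    if len(args) != 2:
        raise ValueError("usage: service auditd <verb>")
    name, verb = args
    if name != ALLOWED_SERVICE:
        raise ValueError(f"only '{ALLOWED_SERVICE}' is supported, got '{name}'")
    if verb not in ALLOWED_VERBS:
        raise ValueError(f"unsupported verb '{verb}'")
    return [REAL_SERVICE, name, verb]


def main(argv):
    """Validate the arguments, then replace this process with the real binary."""
    try:
        command = build_command(argv[1:])
    except ValueError as err:
        print(f"service: {err}", file=sys.stderr)
        return 2
    # exec in place rather than delegating to a daemon, so the action stays
    # in the invoking user's own process.
    os.execv(command[0], command)
```

Installed as the `service` entry point, `main(sys.argv)` would pass `service auditd restart` through and exit with status 2 for any other service name or verb.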