16:30:33 #startmeeting fedora_coreos_meeting
16:30:33 Meeting started Wed Apr 1 16:30:33 2020 UTC.
16:30:33 This meeting is logged and archived in a public location.
16:30:33 The chair is dustymabe. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:30:33 Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:30:33 The meeting name has been set to 'fedora_coreos_meeting'
16:30:38 #topic roll call
16:30:42 .hello2
16:30:45 dustymabe: dustymabe 'Dusty Mabe'
16:31:05 .hello2
16:31:06 cyberpear: cyberpear 'James Cassell'
16:31:29 .hello2
16:31:30 jlebon: jlebon 'None'
16:31:32 .hello2
16:31:34 jdoss: jdoss 'Joe Doss'
16:33:00 #chair cyberpear jlebon jdoss skunkerk
16:33:00 Current chairs: cyberpear dustymabe jdoss jlebon skunkerk
16:33:26 * dustymabe thinks we're missing quite a few people today
16:34:07 yeah, looks like :|
16:34:09 kaeso[m]: bgilbert: around today?
16:34:09 .hello lucab
16:34:10 kaeso[m]: lucab 'Luca Bruno'
16:34:17 \o/ - yay
16:34:29 (i almost didn't make it myself; just had a super long meeting right before this :) )
16:34:31 welcome back kaeso[m] ! you've been missed
16:34:45 #chair kaeso[m]
16:34:45 Current chairs: cyberpear dustymabe jdoss jlebon kaeso[m] skunkerk
16:35:05 #topic Action items from last meeting
16:35:26 no action items from last meeting - woot!
16:35:36 #topic news
16:36:01 #info we came out with a testing release last week that moved us to using networkmanager in the initrd
16:36:28 #link https://github.com/coreos/fedora-coreos-tracker/issues/394
16:36:41 .hello2
16:36:42 lorbus: lorbus 'Christian Glombek'
16:36:43 that build introduced a regression on some cloud platforms
16:36:59 #link https://github.com/coreos/fedora-coreos-tracker/issues/440
16:37:30 .hello sohank2602
16:37:31 skunkerk: sohank2602 'Sohan Kunkerkar'
16:37:34 dustymabe: nice work on pushing the NM work all the way through!
16:37:38 #info we put out another testing release yesterday (31.20200323.2.1) that fixed the regression
16:37:46 thanks jlebon
16:37:52 #chair lorbus
16:37:52 Current chairs: cyberpear dustymabe jdoss jlebon kaeso[m] lorbus skunkerk
16:38:14 the NM in initrd work should land in next week's stable release
16:38:39 #info please test out the latest testing release to test NM in the initrd well before we release it to stable in one week
16:38:58 ok i'll move on to meeting itmes
16:39:05 * dustymabe can't spell today
16:39:07 #topic
16:39:11 #undo
16:39:11 Removing item from minutes:
16:39:39 #topic Missing libsss_sudo
16:39:45 #link https://github.com/coreos/fedora-coreos-tracker/issues/445
16:40:06 this one was just opened today but it seems like it's worth a quick look
16:40:41 jlebon: do you know if the user's proposed solution is the right thing to do (i.e. is just including that package sufficient) ?
16:41:04 it seems reasonable, yes
16:41:28 ok, the other question is - is this something we'd want to do in the base ?
16:41:44 we already ship sssd. i would've expected that plugin to already be there
16:41:50 or relegate to https://github.com/coreos/fedora-coreos-tracker/issues/401
16:42:32 ok, so this is more along the lines of "oops, that should have been included already" ?
16:43:16 i think so, but interested in other opinions
16:43:31 i haven't used sssd much myself
16:44:04 anyone else have an opinion here?
16:44:07 .hello2
16:44:08 bgilbert: bgilbert 'Benjamin Gilbert'
16:44:12 #chair bgilbert
16:44:12 Current chairs: bgilbert cyberpear dustymabe jdoss jlebon kaeso[m] lorbus skunkerk
16:44:28 or "no opinion" :)
16:45:19 I was checking the package content earlier today, and I think it's just a small .so which we could ship
16:45:39 FWIW, it's on silverblue at least by default :)
16:46:04 it sounds like it was also in CL
16:46:18 I haven't used that plugin, but it seems reasonable to include, assuming not many deps
16:46:19 size is ~40kB, voting to include :)
16:46:36 +1 from me for including
16:47:15 #proposed This fucntionality provided by libsss_sudo is missing functionality that we should include by default. The fact that it is currently missing was an oversight.
16:47:34 i'm lukewarm on that second sentence
16:48:05 yeah, i think i'm more "lean towards including it, but defer to sssd SMEs"
16:48:05 and I'll spell better in the #agreed
16:48:47 jlebon: that implies we would need to ask another team?
16:49:04 or just wait for somebody who knows better to come along?
16:49:11 I'd ask for a sample fcct snippet in exchange, as I have no idea how it can be configured via Ignition
16:49:12 dustymabe: i meant more the latter, but the former works too
16:49:40 jlebon: is it a strong enough reason to include it if CL had it?
16:49:50 anybody have CL booted up right now?
16:50:20 IOW, it *seems* like obvious missing functionality we want
16:50:26 dustymabe: probably, yeah
16:51:27 #proposed The functionality provided by libsss_sudo appears to be missing functionality that we want in FCOS. We are leaning towards including it in the base unless good reasoning for not doing so surfaces.
16:51:45 +1
16:51:49 ack/nack?
16:52:18 kaeso[m]: we can also ask for the sample fcct
16:52:21 +1
16:52:37 I use sssd, but not this plugin. +1 to #proposed, +1 to asking for snippet.
16:53:35 +1
16:53:44 I think it'd be good to collect use cases and fcct samples whenever we add a package
16:53:54 #agreed The functionality provided by libsss_sudo appears to be missing functionality that we want in FCOS. We are leaning towards including it in the base unless good reasoning for not doing so surfaces.
16:54:08 ok I'll update the ticket with this after the meeting
16:54:28 #topic Don't bring up networking in the initramfs on first boot by default
16:54:36 #link https://github.com/coreos/fedora-coreos-tracker/issues/443
16:55:14 ahh yup, i can expand on this
16:55:35 (libsss_sudo.so is 24kb btw, no other deps)
16:56:12 so, now we have the live ISO and are promoting it as one of the primary paths for doing bare metal installs
16:57:10 it works great, but it doesn't mesh well with the network configuration UX we want
16:57:53 right now the live ISO matches all the other FCOS deliverables and tries to bring up networking during the initrd
16:58:16 this is an issue if part of your install process *is* to set up networking on the bare metal machine
16:58:50 so we have a primary goal to break that cycle
16:59:01 however, there's a higher-level issue here
16:59:21 which is that the only reason we bring up networking in the initrd is because ignition *may* require it to fetch resources
16:59:56 jlebon: right. so we have a more narrowly scoped issue about not requiring network to boot the Live ISO (presumably for an install): https://github.com/coreos/fedora-coreos-tracker/issues/349
17:00:00 so a bigger win is to fix that instead by only bringing up networking when it's required
17:00:28 what you're talking about is to solve this problem more generically, not just for the live ISO
17:00:44 and have network only be brought up in the initramfs if it's needed?
17:00:54 this is not really a big difference for most cloud images, since the majority of them definitely always need networking for ignition to even fetch its config
17:01:45 but e.g.
qemu, vmware, s390x, and of course the live ISO for example don't inherently need networking
17:02:29 so the approach we're suggesting is to (1) provide an easy fix to get the live ISO working offline, and (2) fix the general problem long-term
17:03:24 +1
17:03:44 I have a short term fix for the live ISO in https://github.com/coreos/fedora-coreos-config/pull/326
17:04:38 one might say "meh, who cares about qemu always needing networking", but i for one think it'd be really nice to be able to take FCOS for a spin entirely offline
17:05:18 it also fixes the live ISO path more correctly, since one actually *may need* networking if the embedded ignition config has remote resources
17:06:04 anyone have any questions/concerns for jlebon?
17:06:15 (on subsequent boots, does initrd do net?)
17:06:27 (no, that doesn't change)
17:06:40 (hehe)
17:07:09 cyberpear: not currently. but there are cases in the future where it might need to
17:07:16 e.g. cryptroot
17:07:39 bgilbert: right, that would probably be case by case, though, right?
17:07:42 yup
17:07:46 +1
17:07:47 right. essentially, whatever needs networking needs to declare it
17:07:54 right, clevis etc
17:08:12 this issue is specifically about ignition only declaring it when it needs it
17:09:25 thanks jlebon for bringing this up. seems like a reasonable idea and a good thing to do long term. maybe let's carry on the discussion in the ticket?
17:09:33 yup, +1
17:09:51 #topic Publish the initrd and rootfs images separately for PXE booting
17:09:58 #link https://github.com/coreos/fedora-coreos-tracker/issues/390
17:10:49 so we talked about this a few weeks ago
17:10:57 bgilbert: re. https://github.com/coreos/fedora-coreos-tracker/issues/390#issuecomment-605370179 is this something about iPXE itself, or the link was just bad?
17:11:01 ahh sorry
17:11:15 dustymabe asked me to bring it up again, because of one new data point:
17:11:59 I did some work on iPXE-booting FCOS on Packet, and indeed, iPXE takes ~5 minutes to fetch the initrd over HTTPS from our release bucket
17:12:12 I did not try HTTP, though I wonder whether that would help
17:12:23 depending how bad the iPXE crypto implementation is
17:13:02 I don't think it changes my view on the problem, but FWIW. /shrug
17:13:02 i'm not going to lie it's taking me several minutes to download a qcow from our bucket before (just curl on my laptop) - is this iPXE or is it the remote ?
17:13:06 asking the same question again: to sanity-check, we're sure this is about iPXE sucking at HTTPS and not just the connection being slow, right?
17:13:25 we are not
17:13:30 there's also cloudfront
17:13:42 if we want to run more comprehensive tests, we certainly can
17:13:47 but in any event it's not a one-off
17:14:03 +1 for more testing
17:14:16 if it's the remote then it won't matter what we do
17:14:54 maybe we could work with the packet people to place a file in their infra and see how long it takes then
17:16:06 anyone opposed to asking for more testing on this?
17:16:21 so essentially: (1) do more testing to sanity check the current approach is viable, then (2) if it's viable, go ahead with splitter tool
17:16:50 (1) do more testing to make sure it's not the remote server that is causing the slowdown
17:17:01 (2) we'd need to re-evaluate probably
17:17:13 dustymabe: +1
17:17:22 because we'd kind of want iPXE from packet to just work against our published artifacts
17:17:23 i think we're saying the same thing :)
17:17:35 +1
17:18:02 #info we'll do more testing for #390 to see if it's the remote server causing the slowdown or actually a slowdown in iPXE
17:18:31 #topic Booting on OpenStack does not retrieve SSH keys from MD service
17:18:36 #link https://github.com/coreos/fedora-coreos-tracker/issues/422
17:18:45 cyberpear: I think you opened this issue
17:18:57 yes
17:19:22 * dustymabe will let cyberpear frame the topic
17:20:32 every "cloud image" for other OSes I've found supports the MD service on OpenStack, same as on ec2, etc
17:20:50 FCOS image does not, unless you pass special cmdline options
17:21:25 ubuntu, fedora, RHEL, CentOS, and probably CL, but I didn't actually test CL on OpenStack
17:21:50 "how hard would it be to automatically detect the MD service?"
17:22:06 hmm, i wonder if https://github.com/coreos/ignition/issues/903 could help here
17:22:08 is the problem that we have a single openstack image but two possible places where metadata could come from?
17:22:35 cc kaeso[m]
17:22:49 basically instead of doing the same hack as ignition to opportunistically try, have afterburn key off of what ignition found out
17:22:49 I don't know how non-MD supporting OpenStack installs otherwise provide MD, but maybe someone here knows?
17:23:14 via a config drive
17:23:22 for ignition user-data
17:23:26 it's not a question of whether it's hard, it's a question of whether it's safe
17:23:41 why is it safe for Fedora to do it, but not for us?
17:23:54 and RHEL?
17:24:03 CL has the same behavior as FCOS.
The user flow would be to write (via Ignition) a systemd dropin to specify the openstack-flavor.
17:24:04 just because other people are doing it doesn't mean that it's safe :-)
17:24:19 it's harder to justify different behavior between Ignition and Afterburn though
17:24:35 bgilbert: ignition goes against that argument too though
17:24:40 jlebon: yup
17:24:55 WDYT about keying off of ignition here?
17:24:56 so does ignition try to fetch from either source today?
17:25:07 but afterburn doesn't ?
17:25:33 that way, the discovery is centralized in ignition and it and afterburn are consistent between each other
17:25:45 Afterburn updates ssh keys on each boot though
17:25:47 last time we discussed this internally, we were going to talk to OpenStack folks about whether in 2020 we can just assume a metadata service exists
17:26:48 I would be in favor of updating the baseline so that platform `openstack` actually means "an openstack platform with a metadata endpoint"
17:26:49 bgilbert: is the metadata service part of swift?
17:27:15 I don't think it is
17:27:16 that means dropping some code from Ignition though, I guess
17:27:17 because we did recently have an ask to be able to support swiftless openstack :)
17:27:37 kaeso[m]: right, it might be more appropriate to flip the default to metadata and have users who want to use the configdrive source to add the configuration?
17:27:37 nova-api or nova-api-metadata
17:27:41 (my OpenStack doesn't have swift, but does have Ceph which might be providing something similar)
17:27:50 ok, ack
17:28:12 when was the metadata service introduced?
17:28:28 i'd definitely be in favour of shedding the config drive code if feasible
17:28:44 jlebon: one step in that direction is to change the default
17:28:58 from looking at the docs real quick, it appears that the service is still optional
17:29:34 dustymabe: there is no default today though :)
17:29:47 jlebon: oh? you have to configure afterburn either way?
17:30:06 dustymabe: that's what we're discussing
17:30:09 * cyberpear is +1 for swiftless OpenStack support (it's my current blocker for OKD on OpenStack)
17:30:15 afterburn doesn't support config-drive and needs to be told to use metadata
17:30:26 hmm ok
17:30:45 so the "harm" here is that afterburn would run and not reach the metadata service?
17:31:02 or reach some other node on the network
17:31:25 where that node happened to be using the same network that the metadata service usually ran as
17:31:26 which can provide whatever metadata it wants, even though it's not "the" metadata server
17:31:31 pull SSH keys from some node that acquired that particular link-local IP
17:31:49 but that is a problem for every other cloud distro already
17:31:56 as cyberpear mentioned
17:32:27 wait, but OKD/OCP anyway are not supposed to pull SSH keys from the cloud
17:32:30 IOW, yes that's a problem but is it one we really need to worry about
17:32:48 dustymabe: I'd be in favor of making good decisions, not typical ones :-) but I agree that the current situation isn't really tenable
17:32:49 (If some node has 169.254.169.254 address, such network is broken IMO)
17:32:59 cyberpear: why?
17:33:12 nodes can pick their own link-local addresses
17:33:57 right. so my suggestion is make afterburn key off of ignition, so at least we're consistent. but also ask the openstack devs if it's unreasonable nowadays to not support config drive anymore
17:34:24 jlebon: so we'd need to find some way for ignition to write out a file that afterburn can then key off of on every boot
17:34:33 since it runs on every boot
17:34:33 bgilbert: it'd only be a problem in the case w/ MD but w/o DHCP
17:34:53 w/ DHCP, the link-local address wouldn't be routable
17:35:00 jlebon: something like `/var/lib/ignition/providers/openstack/metadata` and `/config-drive`?
17:35:27 cyberpear: it's not a question of routability.
it's a security vulnerability to allow a random node on the same network segment to inject SSH keys into instances.
17:35:46 * dustymabe notes we're running over time, but do think this discussion is currently productive
17:36:12 dustymabe, kaeso[m]: i linked to https://github.com/coreos/fedora-coreos-tracker/issues/390 earlier since it might help, but maybe not either since we want something more permanent
17:36:24 bgilbert: is there a better solution? -- Do we convince the OpenStack folks to support the qemu way of passing ign?
17:36:40 via firmware_cfg?
17:36:55 heh, well, fw_cfg is itself controversial
17:37:07 (since most non-VMware OpenStack is just qemu-kvm in the end)
17:37:32 I'd personally prefer the metadata service to be mandatory but I get why that may not be possible
17:37:35 do we want to end the discussion here and investigate the "queue off of ignition" route ?
17:37:35 i think introducing another way to get metadata is counter productive :)
17:38:09 sigh - i think I used the wrong word
17:38:13 jlebon dustymabe: FWIW I don't think keying off Ignition is useful
17:38:20 Maybe someone can ask Red Hat security folks to evaluate it for the RHEL case, and we can inherit their solution?
17:38:28 s/queue/cue/
17:38:36 as long as Ignition unconditionally tries metadata, the horse has left the barn
17:38:46 and IMO Afterburn might as well do the same thing
17:39:02 bgilbert: so you think we should change Ignition to not do that?
17:39:10 cyberpear: RHEL (RHCOS really) does not use cloud ssh keys at all, this is for FCOS really
17:39:10 * dustymabe assumes Ignition currently does that
17:39:28 kaeso[m]: it does use MD for ign, doesn't it?
17:39:34 RHEL does, RHCOS doesn't
17:39:37 bgilbert: that's my point though.
if we're going to auto-discover, at least we should auto-discover once only
17:39:57 so we're sure whatever we settled on is what we roll with
17:40:27 that's entirely separate from whether we should auto-discover in the first place, which i agree we shouldn't
17:40:34 jlebon: sigh. yes... but also it's another piece of state we're talking about having Ignition write out
17:40:34 I think jlebon's one is a reasonable improvement
17:41:27 bgilbert: are you thinking about re-running Ignition cases?
17:42:07 kaeso[m]: no, I'm just worried about the pressure on Ignition to write out all sorts of flags and internal state. it's growing tendrils into all parts of the system
17:42:09 bgilbert: yeah... that's why i alluded to https://github.com/coreos/ignition/issues/903 earlier
17:42:57 i.e. let's not have ignition write 10 separate state files
17:43:08 well, sure
17:43:19 the least resistance path is to write FCOS docs / fcct sugar to write the systemd dropin
17:43:20 it's becoming too tightly coupled, is my point
17:43:40 bgilbert: yeah, i don't disagree :(
17:44:18 i'm going to try a proposal
17:44:29 FWIW, git master cloud-init still has the configdrive code btw
17:44:31 i'm not going to get offended if it gets shot down
17:44:39 #proposed we'd like to solve this itch. Ignition already does auto-discovery of the metadata service. We should investigate the feasibility of having Afterburn key off of Ignition's discovery
17:45:08 * dustymabe should mention openstack somewhere in there
17:45:28 +1
17:45:34 if this is not what we want we can punt to next meeting or whatever, I know some of us probably need to leave soon
17:45:42 "OpenStack metadata service"?
17:45:47 #proposed Modify Afterburn to read from the metadata service by default. Document that Afterburn requires the metadata service (or implement config-drive support).
17:46:08 ...since Ignition already gives a chance for 169.254.169.254 to compromise the system
17:46:24 I'm fine with that bgilbert
17:46:37 dustymabe: I think it's feasible, I just think we shouldn't keep adding flags'n'such to Ignition
17:46:53 how hard would it be to add config-drive support?
17:47:07 i suspect not very
17:47:28 to afterburn?
17:47:34 yeah
17:47:36 I think not very
17:47:44 kaeso[m]: thoughts on the proposals?
17:47:49 basically, if we're doomed to support both, then let's support both
17:47:52 if it was go then we could just copy the ignition code probably
17:48:13 it depends on the format, ibmcloud has 3x similar "ssh keys on config drive" and they go from "very simple" to "who came up with this"
17:48:32 hehe
17:48:44 bgilbert: so basically aliasing openstack-metadata to openstack in afterburn?
17:48:48 yeah
17:49:02 ohhhh there's one key difference btw between ignition and afterburn
17:49:30 if it's config-drive, then afterburn can assume that it's already ready and available because ignition must've gotten it from there too
17:49:48 not on subsequent boots
17:49:51 so afterburn would be more of a "check config-drive first, then fall back to metadata server"
17:49:51 but it does run much later, so
17:50:01 more time for the config-drive to settle
17:50:22 hmmm, true. but right, yeah
17:50:27 jlebon: it can still do even without Ignition, yes
17:50:50 ok, that makes me feel better
17:51:02 (re. not hard binding the two)
17:51:05 #proposed Modify Afterburn to read from the metadata service by default. Document that Afterburn requires the metadata service. Later possibly add support for config drive and update documentation
17:51:22 so basically 1) wire the `openstack` platform id 2) if no config-drive found, fetch from metadata
17:51:45 ack/nack?
17:51:54 dustymabe: i think i want something more like kaeso[m]'s proposal :)
17:52:07 where is that proposal?
17:52:48 #proposed modify afterburn to only know about `openstack` and in that mode, read from the config drive first then fallback to fetch from metadata server
17:53:37 +1
17:53:42 right, that's pretty much what i was proposing but doesn't require config drive support to exist before we fix the problem
17:53:53 ack
17:53:55 +1
17:54:00 seems like the best long-term solution
17:54:03 +1
17:54:19 #agreed modify afterburn to only know about `openstack` and in that mode, read from the config drive first then fallback to fetch from metadata server
17:54:24 dustymabe: the config-drive bit is key though because it helps ensure it matches ignition
17:54:28 it's more like "use metadata if there is no config-drive though"
17:54:33 +1
17:54:38 #topic open floor
17:54:42 *" though
17:54:45 I really apologize if anyone has been waiting for this
17:54:52 anything for open floor?
17:55:00 the latest release of the testing stream now has a DigitalOcean image
17:55:07 and the next stable release will have one also
17:55:12 #info the latest release of the testing stream now has a DigitalOcean image
17:55:22 this is suitable for the DigitalOcean "custom image" flow
17:55:26 bgilbert: we should probably add a docs page to show people how to use it
17:55:27 i.e. users will need to upload it themselves
17:55:30 dustymabe: +1
17:55:46 nice!
17:55:46 DigitalOcean can't ship FCOS directly to users yet because that requires more technical work on our end
17:55:54 (custom images get DHCP, standard images don't)
17:56:00 #undo
17:56:00 Removing item from minutes: INFO by dustymabe at 17:55:12 : the latest release of the testing stream now has a DigitalOcean image
17:56:07 #info the latest release of the testing stream now has a DigitalOcean "custom image"
17:56:12 (i find that different intriguing)
17:56:18 s/different/difference/
17:56:51 will close out meeting in 60 seconds
17:56:54 bgilbert++
17:56:54 jlebon: Karma for bgilbert changed to 4 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
17:57:57 #endmeeting
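
The #agreed item for OpenStack (read from the config drive first, then fall back to the metadata server) can be sketched roughly as below. This is only an illustration of the ordering, written in Python for brevity; it is not Afterburn's code (Afterburn is written in Rust), and the names `fetch_ssh_keys`, `KeySource`, and the stub providers are hypothetical and were not discussed in the meeting.

```python
from typing import Callable, List, Optional

# Hypothetical type: a provider returns a list of SSH public keys,
# or None when that source is not available on this instance.
KeySource = Callable[[], Optional[List[str]]]


def fetch_ssh_keys(config_drive: KeySource, metadata_service: KeySource) -> List[str]:
    """Prefer the config drive; fall back to the metadata service."""
    keys = config_drive()
    if keys is not None:
        # A config drive was found, so use it and never touch the network.
        return keys
    keys = metadata_service()
    if keys is not None:
        return keys
    # Neither source answered; report no keys instead of guessing.
    return []


if __name__ == "__main__":
    # Stub providers standing in for real config-drive and HTTP lookups.
    def no_drive() -> Optional[List[str]]:
        return None

    def from_metadata() -> Optional[List[str]]:
        return ["ssh-ed25519 AAAA... core@example"]

    print(fetch_ssh_keys(no_drive, from_metadata))
```

Passing the two sources in as callables keeps the ordering logic separate from how each source is actually read, which matches the point raised in the discussion that the config-drive check should not be hard-bound to Ignition.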