16:30:16 #startmeeting fedora_coreos_meeting
16:30:16 Meeting started Wed Jul 29 16:30:16 2020 UTC.
16:30:16 This meeting is logged and archived in a public location.
16:30:16 The chair is dustymabe. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:30:16 Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:30:16 The meeting name has been set to 'fedora_coreos_meeting'
16:30:21 #topic roll call
16:30:25 .hello2
16:30:27 dustymabe: dustymabe 'Dusty Mabe'
16:30:47 ..hello2
16:30:53 .hello2
16:30:54 cyberpear: cyberpear 'James Cassell'
16:30:56 .hello sohank2602
16:30:59 skunkerk: sohank2602 'Sohan Kunkerkar'
16:31:08 .hello2
16:31:09 nasirhm: nasirhm 'Nasir Hussain'
16:32:17 .hello2
16:32:17 #chair cyberpear skunkerk nasirhm
16:32:17 Current chairs: cyberpear dustymabe nasirhm skunkerk
16:32:17 darkmuggle: darkmuggle 'None'
16:32:21 #chair darkmuggle
16:32:21 Current chairs: cyberpear darkmuggle dustymabe nasirhm skunkerk
16:32:26 .hello2
16:32:27 .hello2
16:32:29 davdunc: davdunc 'David Duncan'
16:32:32 jdoss: jdoss 'Joe Doss'
16:32:34 I know jlebon said he would miss the first half of the meeting
16:32:42 👋
16:33:05 .hello2
16:33:06 bgilbert: bgilbert 'Benjamin Gilbert'
16:33:10 #chair davdunc jdoss dghubble bgilbert
16:33:10 Current chairs: bgilbert cyberpear darkmuggle davdunc dghubble dustymabe jdoss nasirhm skunkerk
16:33:11 * bgilbert waves at dghubble
16:33:36 * dustymabe gives a two hand wave 👋👋
16:33:58 .hello mnguyen
16:33:59 mnguyen_: mnguyen 'Michael Nguyen'
16:34:04 #topic Action items from last meeting
16:34:08 #chair mnguyen_
16:34:08 Current chairs: bgilbert cyberpear darkmuggle davdunc dghubble dustymabe jdoss mnguyen_ nasirhm skunkerk
16:34:24 we had no declared action items from last week's meeting :)
16:34:40 we can move on to meeting tickets
16:34:55 #topic Publish the initrd and rootfs images separately for PXE booting
16:35:01 #link https://github.com/coreos/fedora-coreos-tracker/issues/390
16:35:08 this is an FYI topic
16:35:09 .hello2
16:35:09 lucab: lucab 'Luca Bruno'
16:35:35 bgilbert has been working hard to act on the plan we laid out to deliver the rootfs separately
16:35:51 the current migration timeline is in https://github.com/coreos/fedora-coreos-tracker/issues/390#issuecomment-661986987
16:36:33 I believe we'll be sending out a coreos-status post in the next release cycle to give people pertinent information
16:36:53 end of FYI... will pause briefly before moving to the next topic
16:37:15 guess I should have used PSA rather than FYI
16:37:22 yup, we're holding the coreos-status post until there are actually artifacts to use
16:37:35 👍
16:37:56 #topic K8s CoreDns not starting on 32.20200629.3.0 (Pods on master can't talk to the apiserver)
16:38:03 #link https://github.com/coreos/fedora-coreos-tracker/issues/574
16:38:14 nice work dghubble, digging into the problem on this one
16:38:29 👍
16:38:41 i'll try to add some context
16:39:27 There is a problem that was introduced recently in our stream history that caused a lack of pod-to-pod connectivity across node boundaries.
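For context, the mismatch at the core of this issue can be observed by comparing the MAC address flannel chose with the one the kernel currently reports. A rough sketch, assuming the vxlan backend (the annotation key is flannel's standard one; <node> is a placeholder):

    # MAC address currently on the flannel VXLAN device
    ip -d link show flannel.1
    # MAC address flannel recorded and advertised to its peers
    kubectl get node <node> -o jsonpath='{.metadata.annotations.flannel\.alpha\.coreos\.com/backend-data}'

If the two differ, peers keep addressing VXLAN traffic to a MAC the device no longer has, which matches the cross-node connectivity loss described above.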
16:39:57 dghubble tracked the problem down to where we re-added the 99-default.link file back into our base OSTree
16:40:06 https://github.com/coreos/fedora-coreos-config/compare/de77a65dc5d424adea2b6be78d6a9c317c89dad3..2a5c2abc796ac645d705700bf445b50d4cda8f5f
16:40:24 which he was able to track to an upstream flannel bug
16:40:43 where flannel and systemd conflict over the choice of mac address for the flannel devices
16:41:20 so users can work around it by adding their own .link file that matches flannel interfaces
16:41:37 OR we could possibly ship something in the OS so they wouldn't have to do that
16:42:02 It looks like luca found where CL was doing something that also kept this problem from affecting users
16:42:22 we should add the link/network units to the flannel RPM
16:42:33 Should we attempt to ship some configuration for this type of issue, or should we leave it to the user?
16:42:43 but I think most people are not consuming it that way
16:42:50 it=flannel
16:43:15 lucab: yeah, probably. I don't know much about flannel, unfortunately
16:43:24 it's software installed on the host?
16:43:50 pro: fairly unobtrusive to the user to add flannel.link
16:43:50 con: another cni provider might expect the same in future
16:44:19 flannel runs as a host network daemonset and creates the link interface on the host
16:44:53 dghubble: got ya.
16:45:30 so I think luca touched on a valid point. Should the .link configuration be delivered with flannel (however it is delivered)? i.e. via RPM if someone installed it that way, or maybe the daemonset lays it down.
16:45:53 re: daemonset, would that work?
16:46:27 it maybe could, I'd probably just bake it into the host ignition
16:47:24 I think I'd be in favor of shipping something in the host (it's relatively small and only applies in cases where it's needed). It's similar, to me, to us shipping udev rules for AWS-specific disks.
16:47:40 though obviously we don't want it to go too far and scope creep
16:47:54 I don't feel strongly either way, though
16:48:18 * dustymabe certainly interested in other input here
16:49:07 that would be nice so flannel users don't run into this
16:49:25 meh, I wouldn't ship it in the host unless people install the RPM
16:50:14 lucab: yeah, if we knew most users were using the rpm then putting it in the rpm would suffice (wouldn't need it in our base layer)
16:50:44 lucab++
16:50:46 nasirhm: Karma for lucab changed to 4 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
16:50:47 I'm not using the rpm for that, but I buy the argument and can add it via ignition
16:51:25 I'd lean toward shipping the config for the reasons dustymabe said, but not a strong preference
16:51:30 dustymabe: indeed, the larger rationale is "whatever brings in flannel is also responsible for tweaking the environment as it needs"
16:51:44 it's small and reduces friction
16:52:05 but IMO lucab is correct in the general case :-)
16:52:27 lucab: right. I think if we could find the 2, 3, 4 ways that flannel is typically delivered and get the change applied there, then I would favor that approach (keeps the separation of concerns nice)
16:53:06 if we can't, then I think I favor adding the snippet with heavy comments
16:53:36 changing the rpm should be easy
16:53:46 what about the daemonset?
16:54:04 are flannel daemonsets usually homegrown or do they typically follow one recipe from upstream?
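For reference, the .link workaround under discussion would look roughly like this; the file name and interface glob below are illustrative assumptions, not something the meeting settled on:

    # /etc/systemd/network/50-flannel.link (hypothetical name and path)
    [Match]
    OriginalName=flannel*

    [Link]
    # don't let udev replace the MAC address flannel assigned to the device
    MACAddressPolicy=none

Because 50-flannel.link sorts before the 99-default.link shipped in the OS, it wins the match for flannel interfaces and suppresses the default MACAddressPolicy=persistent behavior.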
16:54:23 I've not tried giving the DaemonSet an init container that mounts /etc/systemd/network and adds a .link file before flannel starts
16:55:06 do we think flannel upstream would be opposed?
16:55:23 * dustymabe mostly talking about things he is not super familiar with
16:55:37 I'd guess at this point they're homegrown. Upstream is rather inactive; flannel was the main choice in Container Linux days, but less so now
16:55:50 got ya
16:56:06 so we've got two options
16:56:12 anyone with strong preferences here?
16:56:47 no strong preference, happy to add the link via ignition for now and think about other ways
16:57:25 bgilbert: no strong preference from you, but you lean to include it
16:57:35 lucab: no strong preference from you, but you lean to exclude it
16:57:44 anybody else in the jury?
16:57:52 IIRC the flannel pod already has some glue to copy CNI configs to the host
16:58:08 It does, to /opt/cni generally
16:58:20 * nasirhm has no strong preferences.
16:58:32 lucab: is there a single flannel pod that *most* users use? i.e. could we change one source and capture a large set of users?
16:58:43 if so, that would be my preference
16:58:50 Need to check on the details of mounting /etc/systemd/network and make sure perms/selinux are ok; it runs as a privileged pod (spc_t)
16:59:45 example https://github.com/poseidon/terraform-render-bootstrap/blob/master/resources/flannel/daemonset.yaml
17:00:50 ok, maybe I'll take a summary of this discussion back to the ticket and we can re-visit next week. maybe some new info/investigation will come to light between now and then.
17:01:04 sound good?
17:01:06 dustymabe: I guess most people are using the coreos one or the typhoon one
17:01:14 lucab: +1
17:01:35 if we can get it down to 2 or 3 sources it would be nice to change the sources
17:01:47 i'll summarize and push to next week
17:01:51 next topic
17:02:09 #topic Dynamically sized bare-metal image can clobber data partition on reprovision
17:02:15 #link https://github.com/coreos/fedora-coreos-tracker/issues/586
17:02:21 bgilbert: i'll let you introduce the problem here
17:02:51 the FCOS (and RHCOS) bare-metal images are dynamically sized at build time.
17:03:23 we compute the ostree size, add some padding, and that's the size of the root filesystem.
17:03:46 AIUI this is done to avoid the extra I/O at install time of writing out gigabytes of zeroes.
17:03:47 .hello2
17:03:48 jlebon: jlebon 'None'
17:04:06 #chair jlebon lucab
17:04:06 Current chairs: bgilbert cyberpear darkmuggle davdunc dghubble dustymabe jdoss jlebon lucab mnguyen_ nasirhm skunkerk
17:04:08 it doesn't affect the shipped artifact size much, since it's shipped compressed
17:04:25 the problem is that the size of the root filesystem is a contract with the user.
17:04:46 if the user puts a data partition (or /var) immediately after the root FS, and later reprovisions the machine...
17:04:59 ...and the reprovision is done with a newer, larger image...
17:05:11 we'll clobber the superblock of their data filesystem.
17:05:30 interesting.. I have questions when you're done :)
17:05:41 and we want users to reprovision often. immutable infrastructure, after all. :-)
17:05:43 ahh yes. related to this is the "trapping of the rootfs" problem
17:06:00 we seem to have gotten away with this so far, in the sense that I haven't heard any complaints
17:06:29 but the failure mode is bad, and we're getting more interest in data partitions and separate /var, so we should get this fixed.
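To make the failure mode concrete, this is the kind of layout at risk, sketched as an FCC snippet (the device, label, and numbers are illustrative): a data partition that starts wherever the dynamically sized root partition happens to end.

    variant: fcos
    version: 1.1.0
    storage:
      disks:
        - device: /dev/sda
          partitions:
            # no start_mib, so this partition begins immediately after the
            # root partition (number 4) of whatever image was installed; a
            # newer, larger image will overlap it on reprovision.
            # size_mib 0 means "use all remaining space".
            - label: var
              number: 5
              size_mib: 0
      filesystems:
        - device: /dev/disk/by-partlabel/var
          format: xfs
          path: /var
          with_mount_unit: true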
17:06:42 EOM
17:06:59 ok, so you say the rootfs is dynamically sized, but what about the actual partition?
17:07:04 the partition, yeah
17:07:11 it's also dynamically sized
17:07:17 ?
17:07:18 yup
17:07:29 yeah, ok, I don't think we should do that
17:07:41 I thought about fancy tricks where we make the partition larger than the disk image
17:07:46 but I think that's too dangerous
17:07:59 we can dynamically size the rootfs (to avoid writing the zeroes), but we should probably leave the partition size the same
17:08:27 does that make sense ^^ ?
17:08:36 I don't think that helps
17:08:46 we have to write every bit of the image we ship
17:08:59 and the backup GPT goes at the end of the disk
17:09:13 so a 10 GB partition -> 10 GB of uncompressed data
17:09:26 oh sorry, I thought you meant preventing writing the zeroes in our pipeline when we build the image, not when coreos-installer writes them out
17:09:27 ignoring the fancy tricks I mentioned
17:09:37 no, coreos-installer is the issue
17:09:45 can we hardcode the minimum partition size in a base ignition config?
17:09:58 and consider any changes to that to be a breaking change
17:10:19 though we should address https://github.com/coreos/ignition/issues/924 first
17:10:20 committing to a long-term cap forever is a tough call
17:10:21 hmm, and have Ignition resize the partition first thing?
17:10:40 walters: not "forever". we just have to treat changes as a breaking change
17:10:47 i think we leave the door open to changing the cap. we just need to be noisy
17:10:51 announcement, deprecation period, etc. basically people have to reprovision at that point
17:11:17 +1 to re-provision (we won't make the decision lightly, of course)
17:11:20 that's ironic though, because it only breaks *if* they try to reprovision-but-keep-data, right?
17:11:34 also, having ignition or coreos-installer fail with an appropriate error message, rather than wipe data, would be good
17:11:48 walters: yeah
17:12:08 do we actually care about this case? can we just hard-fail in coreos-installer if it detects it doesn't have enough room?
17:12:12 dustymabe: by the time Ignition runs, it's too late
17:12:18 dustymabe: and coreos-installer doesn't know enough to enforce it
17:12:22 i do feel generally the people with nontrivial "pet" data they want to preserve are also going to have large disks, and should feel fine allocating something large like 20G to the OS
17:12:31 lucab: how can it detect that?
17:12:41 if our default install is ever over 20G we've clearly failed ;)
17:14:06 having coreos-installer error out on existing partitions now would be a breaking change, though maybe that's OK
17:14:20 bgilbert: there is an existing non-OS partition that starts before the end of the last partition of the new image, perhaps?
17:14:28 what's a non-OS partition?
17:15:11 I think it is currently being defined as "index >= 5"
17:15:17 lucab: no
17:15:25 (but I could be wrong)
17:15:54 if we start erroring out whenever there's any existing partition, we just induce people to pass --force every time
17:16:27 we could detect that it was a coreos install, and in that case we know what the "OS partitions" are
17:16:28 and I don't know how to detect which partitions people "want" to keep from arbitrary previous partition tables
17:16:41 hmm
17:16:55 so e.g. non-CoreOS -> ask if existing partitions, CoreOS -> ask if existing non-OS partitions
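For context on "OS partitions": the FCOS bare-metal image ships exactly four, which is where the "index >= 5" heuristic above comes from. Roughly (a sketch; labels per the shipped image, purposes paraphrased):

    Number  Label       Purpose
    1       boot        /boot filesystem
    2       EFI-SYSTEM  EFI System Partition
    3       BIOS-BOOT   BIOS/GPT bootloader stage
    4       root        root filesystem (the dynamically sized one)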
17:17:12 there's no opportunity to ask if c-i is running noninteractively
17:17:25 and it still doesn't solve the underlying problem
17:17:29 not clobbering data is an improvement
17:17:41 but if we tell the user "sorry, can't reprovision without losing data", we've still broken them
17:18:10 I gather no one is in favor of going back to a fixed-size image?
17:18:11 whatever cap we choose, that could theoretically happen though
17:18:46 bgilbert: i was proposing a fixed root partition end point at least, but I think you said that's not ideal because we have to write all the zeroes
17:19:03 right, sorry, I meant the simplest approach of just writing all the zeroes
17:19:07 on first boot the root partition would get extended
17:19:13 I think there are a couple possible alternatives, just wanted to probe that one
17:19:31 bgilbert: so the current proposal is:
17:19:53 1. fixed root partition endpoint - we might find a creative way to not have to write all the zeroes
17:19:57 discuss
17:20:35 does "fixed" imply not auto-growing even if there's no partition after the rootfs?
17:20:39 no
17:20:49 ok, just checking :)
17:20:57 I think fixed implies that is the value you can guarantee the shipped image comes with
17:21:06 on first boot it can be grown
17:21:51 positives/negatives?
17:21:56 i think it's an improvement over the moving target we have today
17:22:06 also, does it solve the problem you describe, bgilbert? (just confirming)
17:22:09 dustymabe: I'm confused by the 1.
17:22:30 bgilbert: if we get other proposals we'll increment the counter so we can refer to each one in discussion
17:22:33 ah, okay
17:22:53 can we maintain the current dynamic size, but enforce that any extra partitions leave room to grow?
17:22:53 I think we need to define some safe offset for user data
17:23:33 bgilbert: is the "data partition preservation" opt-in or default-on, from the point of view of coreos-installer logic?
17:23:36 keep the small partition, create extra ones at a +20GiB offset?
17:23:39 I think it's okay to just brainstorm approaches for now, and not decide on one until we experiment
17:23:58 +1
17:24:08 lucab: I'm hoping to avoid c-i logic entirely, because I don't think there are good heuristics
17:24:33 i.e. stick to a contract instead
17:24:55 cyberpear: the Ignition base config proposal would do that
17:25:01 so it would be the semantics of each `install` run
17:25:27 what do we think is a safe size for a root filesystem contract?
17:25:34 10G?
17:25:56 conservative
17:26:20 hmm, was thinking more something like 5G
17:26:40 jlebon: i'm thinking of the case where the partition doesn't get extended
17:26:48 then we've got 5G, should be enough
17:26:54 I'd take what we have today, add back the docs and languages, add python and other interpreters and commonly layered packages, then double it
17:27:10 but we do keep old ostree commits around.. I think 10G would be safer
17:27:11 jlebon: a random build I have sitting here has 2.1 GB in root
17:27:17 and we need 2x for Fedora major bumps, right?
17:27:18 more or less
17:27:33 sort of a "worst case" for users doing heavy layering
17:27:41 bgilbert: yeah, let's say
17:27:45 maybe nVidia stuff too...
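Putting proposal 1 in concrete terms: a base Ignition snippet shipped with the OS could pin the root partition to a fixed minimum (say the 10G floated above) and grow it to that size on first boot. A sketch only; it assumes the partition-resize support discussed in ignition/issues/924 lands, so the spec version, the resize field, and the device alias below are illustrative rather than current behavior:

    {
      "ignition": { "version": "3.2.0" },
      "storage": {
        "disks": [{
          "device": "/dev/disk/by-id/coreos-boot-disk",
          "partitions": [{
            "number": 4,
            "label": "root",
            "sizeMiB": 10240,
            "resize": true
          }]
        }]
      }
    }

User configs would then place extra partitions at or beyond that contracted boundary; cyberpear's "+20GiB offset" idea is the same contract expressed as a fixed start_mib for the first data partition.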
17:28:19 cyberpear: users can always allocate more than the minimum
17:28:45 since layering is discouraged I'm inclined not to factor that in
17:29:06 * dustymabe notes the time
17:29:19 #action bgilbert to summarize discussion in the ticket
17:29:20 any other proposals we'd like to spitball (and then summarize in the ticket)?
17:29:27 I think this was a good start for brainstorming
17:29:39 cool.. we'll investigate more on 1. then?
17:29:44 +1
17:29:49 SGTM
17:29:51 #topic open floor
17:29:52 thanks all!
17:29:59 anyone with open floor topics?
17:30:27 I have one
17:30:33 #info dustymabe started discussion with IoT/Silverblue/Releng groups to discuss possible solutions to the package layering problems from #400
17:30:44 one from me on the docs side
17:30:55 (I'd pick a "safe offset" for extra partitions and allow it to be customized in ign)
17:30:58 #link https://github.com/coreos/fedora-coreos-tracker/issues/401#issuecomment-664759768
17:31:14 nasirhm: go ahead
17:31:33 I was going through the provisioning docs today; we have pages covering every single platform except for `aliyun` and `openstack`
17:32:33 lucab: proposing to add those?
17:32:40 i'm very much in favor
17:32:45 On this docs ticket about the tutorial: do we need to put all the scenarios in a single doc or break it into 3 (Basic, Intermediate, and Advanced)?
17:33:04 #link https://github.com/coreos/fedora-coreos-docs/issues/106
17:33:36 nasirhm: we might be better off making little mini tutorials
17:33:40 dustymabe: yes, I'll maybe take the former at some point, the latter is up
17:34:14 if I get access to an openstack env I could try to do the latter (anyone know of a public cloud that offers vanilla openstack, i.e. can use openstack tools etc)?
17:34:27 nasirhm: I'd prefer smaller tutorials showing some real topics; the bottom one with zincati and pkg diff has a good flow
17:34:31 lucab: we could do those for the docs hackfest
17:35:16 dustymabe: as well, I've opened tickets for both (but I don't expect people with aliyun access to show up)
17:35:18 Will create a Tutorial category and add the 3 mini tutorials.
17:35:34 nasirhm: sounds great!
17:35:50 dustymabe: packstack can spin one up on a laptop if you need it (I've got instructions if you want)
17:35:55 lucab dustymabe: Thank You
17:35:56 nasirhm: +1 - we might be able to name them something more descriptive than Basic, Intermediate, and Advanced, but we can start with those names for now and iterate to make everything better
17:35:57 nasirhm: you can even start with a PR with a basic one and we incrementally go from there
17:36:16 cyberpear: that would be good
17:36:36 * cyberpear will dig up the gist
17:36:41 i am really surprised there is no one just offering vanilla openstack out there though
17:36:42 Awesome, will make the PR today.
17:36:56 maybe rackspace
17:37:01 i'd be inclined to try to add docs for them as a cloud provider in our docs
17:37:05 dustymabe: I think I've seen people using vexxhost for that in the past
17:37:13 lucab: cool. I'll look into it
17:37:22 (but I don't know what the vanilla/baseline for openstack is)
17:37:39 #info we are planning to try to do a docs hackfest at Flock/Nest next weekend. Look out for details!
17:38:06 lucab: to me the baseline would be that I can use the openstack client tools/api
17:38:15 and not something specific to a cloud provider
17:38:17 #link https://pagure.io/flock/issue/274
17:38:26 any more topics for open floor?
17:39:11 Nothing from my side.
17:39:12 #endmeeting
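As a footnote to the packstack pointer above, the usual RDO all-in-one quickstart is roughly the following (a sketch; the repo package name tracks the current OpenStack release, Ussuri at the time of this meeting):

    # on a CentOS/RHEL host; repo package varies by OpenStack release
    sudo dnf install -y centos-release-openstack-ussuri
    sudo dnf install -y openstack-packstack
    sudo packstack --allinone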