15:59:18 #startmeeting fedora_cloud_meeting
15:59:18 Meeting started Tue Jun 22 15:59:18 2021 UTC.
15:59:18 This meeting is logged and archived in a public location.
15:59:18 The chair is dustymabe. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:59:18 Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:59:18 The meeting name has been set to 'fedora_cloud_meeting'
15:59:24 #topic roll call
15:59:27 .hi
15:59:28 dustymabe: dustymabe 'Dusty Mabe'
16:00:41 .hi
16:00:42 dcavalca: dcavalca 'Davide Cavalca'
16:00:52 #chair dcavalca
16:00:52 Current chairs: dcavalca dustymabe
16:01:05 davdunc: around today?
16:02:36 .hello ngompa
16:02:37 Eighth_Doctor: ngompa 'Neal Gompa'
16:03:38 #chair Eighth_Doctor
16:03:38 Current chairs: Eighth_Doctor dcavalca dustymabe
16:04:00 .hello2
16:04:01 davdunc: davdunc 'David Duncan'
16:04:09 ah there you are :D
16:04:13 I just tried to ping you :D
16:04:40 ok let's get started
16:04:55 #chair davdunc
16:04:55 Current chairs: Eighth_Doctor davdunc dcavalca dustymabe
16:04:58 #topic Action items from last meeting
16:05:18 I don't see any action items in the minutes from last meeting 🎉
16:05:32 #topic new meeting time
16:05:37 #link https://pagure.io/cloud-sig/issue/333
16:05:49 davdunc: has a standing conflict at this time
16:05:56 can we sort out a new time?
16:06:11 i have an hour long meeting at 9:00 PDT
16:06:22 * dustymabe notes that we stick with UTC so we'd need to consider daylight savings time adjustments when choosing the new time
16:06:33 The OKD WG meeting is right after this one
16:06:40 so it'd be kind of crappy to conflict with that
16:06:52 earlier?
16:07:02 would earlier work?
16:07:13 I could do an hour earlier
16:07:16 yes
16:07:24 davdunc: ?
16:07:29 I can do 10:30am-11:30am EDT
16:07:40 I can do an hour earlier.
16:07:58 Eighth_Doctor: the meetings sometimes run longer than 30
16:08:01 I assume that's OK
16:08:18 11:45am-12pm is daily standup with my team
16:08:20 sorry
16:08:20 other than that, it's fine
16:08:28 #chair cmurf[m]
16:08:28 Current chairs: Eighth_Doctor cmurf[m] davdunc dcavalca dustymabe
16:08:54 #proposed we'll move the bi-weekly meeting earlier by one hour to accomodate a conflict for davdunc
16:09:02 hmm. ok let's think about this
16:09:14 my brain hurts when it comes to daylight savings
16:09:27 when the time changes back in the fall, will we be back in this situation again?
16:09:36 yeah
16:09:47 because we set times in UTC, we'll have this collision again and have to fix it again
16:10:03 having to fix this twice a year isn't too bad though
16:10:05 davdunc: does your meeting move with UTC (I doubt it)
16:10:24 sorry I should have phrased that "stick with UTC"
16:10:29 :D I am good with UTC. just talking to engineering team at the same time.
16:10:43 couldn't convert and talk at the same time. :(
16:10:52 all of my meetings are EDT, which makes this painful
16:11:02 err US/Eastern
16:11:07 davdunc: yeah but I think we're saying if we only move the meeting an hour then we'll have the same problem in a few months
16:11:07 so they float across EDT and EST
16:11:37 I see your point.
16:11:47 let's maybe consider a new option.. how about the same current meeting time, but on thursday
16:11:55 I'd push it even earlier, but the Workstation WG meeting is at 9:30am EDT
16:11:58 that would work for me.
16:11:59 * cmurf[m] bans daylight saving time by global edict, problem solved
16:12:06 cmurf[m]: +1
16:12:11 oops, small problem is cmurf isn't the benevolent ruler
16:12:13 this is the one immovable meeting I have.
16:12:19 can't do Thursday, because FPC meeting at that time
16:12:46 Eighth_Doctor: Thursday + one hour earlier?
16:12:48 can't do Wednesday either because of other meetings
16:12:48 i'm flexible btw
16:12:52 Thu an hour earlier?
16:12:56 dustymabe: let me check that
16:12:57 yeah I can't do Wed either
16:13:16 Eighth_Doctor: /me hopes FPC meeting sticks with UTC :)
16:13:20 dustymabe: standup conflict
16:13:37 Eighth_Doctor: same standup conflict as tuesday - which you could live with
16:13:43 dustymabe: actually most of my Fedora meetings are increasingly no longer using UTC
16:13:45 dustymabe: yep
16:13:52 yea. UTC++
16:14:02 Eighth_Doctor: does the FPC one use UTC?
16:14:07 let me check
16:14:49 yes
16:15:30 #proposed we'll move the meeting one hour earlier and to thursday
16:15:43 fedocal is running nice and slow this morning
16:16:08 anyone want to weigh in ? ^^
16:16:13 I wonder what day is going to be my meeting heavy day
16:16:29 The answer is - the day we want to move the meeting to
16:16:31 I used to be able to just "screw Tuesday", but it may change to Thursday now...
16:17:38 Eighth_Doctor: that's a plan I support.
16:17:53 #proposed we'll move the meeting one hour earlier and to thursday (bi-weekly on thursday at 1500 UTC)
16:17:57 please vote ^^
16:18:11 +1
16:18:16 If we could move it to 9am EDT on Thursday, I'd be good with that
16:18:24 no conflicts and no manager complaints
16:18:35 but that might be hard for some other folks
16:19:04 (9am EDT ~ 1pm UTC)
16:19:10 yeah that's too early for the west coast
16:19:12 let's try for current proposed if possible
16:19:18 I could make it as a one off, but not reliably
16:19:44 what about 17:00 UTC (1pm EDT)?
16:19:47 davdunc: good with ^^
16:20:01 but otherwise, current proposal is fine
16:20:12 +1
16:20:22 as long as we're okay with me not being there for part of it consistently
16:20:43 Eighth_Doctor: hopefully the meetings are shortish
16:20:45 Eighth_Doctor: I have a conflict for 17-1730 UTC :(
16:20:58 #agreed we'll move the meeting one hour earlier and to thursday (bi-weekly on thursday at 1500 UTC)
16:20:59 though I could make that work if needed
16:21:04 (which is better than when I was triple booked with FESCo + OKD + work)
16:21:05 You are important enough that we can work with what time you do have.
16:21:11 let's see how the current new time works and then adjust as needed
16:21:15 thanks!
16:21:16 aww
16:21:32 well, davdunc is more important than I am in this
16:21:34 I'm cloud-dumb after all :)
16:21:44 #topic Fedora-34 nightly images in AWS do not boot
16:21:50 #link https://pagure.io/cloud-sig/issue/331
16:22:22 this is an odd duck for sure
16:22:49 cmurf[m]: do you want to give a summary
16:22:54 adamw is going to look into it but hasn't had time yet; maybe mboddu has an idea what controls the firmware selection for an image build VM
16:23:13 cmurf[m]: that should be in imagefactory
16:23:39 the gist is that only f34 x86_64 nightly cloud base images are being created in a UEFI VM, therefore have a UEFI partition layout and UEFI GRUB
16:23:58 but the image is tested in openqa using a BIOS VM so all the tests fail, it'll actually work in a UEFI VM
16:24:24 but f34 GA is built for BIOS VM, so are the nightlies for rawhide and f33
16:24:33 confused yet? :D
16:24:49 cmurf[m]: do we know if rawhide works?
16:24:55 it does
16:25:00 it's passing openqa tests
16:25:01 and f33 does too
16:25:04 weird
16:26:00 any other ideas for debugging?
16:26:13 https://pagure.io/pungi-fedora/blob/f34/f/fedora-final.conf#_374
16:26:20 that comment looks like our problem, but it's the wrong image
16:26:27 that's for GCP
16:27:03 cmurf[m]: yeah but let me check the actual config that gets used for the nightlies
16:27:38 https://pagure.io/pungi-fedora/blob/f34/f/fedora-cloud.conf
16:28:47 it's got a different distro entry
16:29:21 I honestly think this was just bad branching
16:29:39 if you compare the config to what's in rawhide it never got set to `Fedora-34`
16:29:58 I think we can fix it by setting it back
16:30:12 but honestly I question whether we should be building nightlies IMHO
16:30:24 we have no bandwidth for testing them or fixing issues
16:30:38 we need them because we can't build the images locally
16:30:48 at least, I can't
16:30:56 :)
16:30:58 sure you can
16:31:06 it's just complicated - which sucks
16:31:17 nope, the imgfac tooling crashes when I try to use it
16:31:39 and when davdunc and I paired on it, we got different results than what the koji builders got
16:32:10 unless davdunc made progress on that, it's basically the only way I have to see how changes get implemented
16:32:47 so this is the opposite of reproducible
16:32:59 yup
16:33:13 I'm this close to proposing we switch to kiwi for building images
16:33:20 because of this insanity
16:33:47 the amount of effort it'd take to make the switch is what stops me
16:33:53 ok this should fix it: https://paste.centos.org/view/29d7cb42
16:33:56 ok this is odd..
16:33:56 but I'm tempted
16:33:57 https://pagure.io/pungi-fedora/blob/f34/f/fedora-cloud.conf#_225
16:34:04 https://pagure.io/pungi-fedora/blob/main/f/fedora-cloud.conf#_226
16:34:15 cmurf[m]: see fpaste
16:34:37 haha ok
16:34:39 i'll post up the patch
16:34:45 but yeah why does main says Fedora-22?
16:34:48 and yet it works as expected
16:34:56 cmurf[m]: it's a "profile"
16:35:15 it's like a common set of characteristics when starting the VM
16:35:30 it looks odd, I agree
16:35:38 but that's more or less what it is
16:35:42 ok and these profiles are elsewhere which is why i couldn't figure it out
16:35:45 yeah, that's normal
16:35:48 why isn't it just Fedora-BIOS and Fedora-UEFI?
16:35:55 cmurf[m]: yeah they're in the imagefactory source code I think
16:35:55 +1
16:36:24 #info we think we should be able to fix this problem with a patch to pungi-fedora
16:36:25 and Fedora-hybrid now
16:36:26 anyhow, on stopping the nightlies - wasn't the point of doing nightlies that some other thing uses them as inputs? they are useful as just 'updated base fedora disk images" for some testin workflow or something
16:36:42 adamw: yeah I think some CI processes use them
16:36:46 adamw: well, QA stuff uses them, and we use them for test days
16:36:53 and there are some CI thingies too
16:37:07 and at some point we did want to release updated images to the public, but we haven't made any progress on that
16:37:52 anybody want to submit that PR (with the diff from the fpaste) to pungi-fedora repo?
16:38:20 I can do it
16:39:26 #action Eighth_Doctor to submit PR to pungi-fedora to fix BIOS booting issues for f34 cloud nightlights
16:39:36 sigh
16:39:41 at least it's a cool typo
16:39:51 haha
16:40:03 #topic updates on f35 changes
16:40:04 hah.
16:40:20 davdunc: Eighth_Doctor: do you want to summarize and highlight the f35 change proposals and status?
16:40:23 that's one way to title this topic
16:40:36 yeah sure
16:40:49 so... we've implemented the change to Btrfs for the Cloud images
16:41:03 they've been spinning and booting quite nicely as Rawhide nightlies for a couple of days now
16:41:40 we've run into *issues* trying to implement hybrid GPT (which is where davdunc and I attempted to pair program it and couldn't get imgfac to work right)
16:41:53 #info change to btrfs has been implemented in rawhide cloud image builds - working as expected
16:42:10 well, hybrid boot, not hybrid GPT
16:42:17 it's GPT with hybrid boot
16:43:19 Eighth_Doctor: if you want to schedule some time on Friday I can try to walk the beaten path I've traveled in the past and work with you to get you unblocked
16:43:42 note one thing is "off" that we haven't firmly decided: ext4 images are "spare" files as a result of anaconda running e2fsck -E discard following umount. The btrfs images are mostly allocated. davdunc says it's probably preferred images aren't sparse, basically as a bug avoidance strategy. I guess some things don't like sparse files.
16:43:59 dustymabe: let's work out something with davdunc and we can schedule a call on Friday
16:44:02 sigh the typos :\
16:44:32 cmurf: I wasn't aware that we were sparsifying the ext4 based images
16:44:48 if that's the case, we probably should fix this in anaconda to be fs-agnostic
16:45:10 yeah i've got a bug/rfe for anaconda for using fstrim if it should be used
16:45:13 Eighth_Doctor: in the past I used to set up everything in vagrant
16:45:24 here's my vagrantfile: https://paste.centos.org/view/8a44a7dd
16:45:37 should be able to run the steps on any VM (with nested virt enabled)
16:45:48 the question here is whether they should be sparse or preallocated, then we can make sure the right thing is happening
16:46:12 #info hitting some issues getting hybrid boot + GPT working - still working through the issues
16:46:28 #topic sparse disk images or not
16:46:41 hmm. why on earth would you not want the disk image to be sparse ?
16:47:04 my mind interprets that as you downloading multiple GB of zeroes when you download a cloud image
16:47:50 what currently happens by i guess imagefactory (?) is after it deletes all the rpms in /home that were downloaded and installed, it writes a big file of zeros using dd if=/dev/zero
16:47:52 then deletes that
16:48:02 so it in effect zeros all the garbage that was deleted
16:48:43 in my testing it's about a 10MiB difference between xz compressed zeros, and xz compressed holes (sparse)
16:48:53 favoring holes
16:49:00 cmurf[m]: https://pagure.io/fedora-kickstarts/blob/main/f/fedora-cloud-base.ks#_109
16:49:40 haha yes that's it, what I saw in the oz log was "(Don't worry -- that out-of-space error was expected.)"
16:49:43 and had a laugh
16:50:27 so what's the proposal?
16:50:39 followed by facepalm; a better way to do this is fstrim the file system just before umount; umount it; then fallocate the image file if you don't want it to be sparse.
16:50:57 well my proposal would be, decide whether they should be sparse or preallocated.
16:50:58 i don't know the answer to that question
16:51:15 the files should be as small as possible for network transfer
16:51:26 I guess that means sparsing them and compressing the resulting file
16:51:31 yeah, honestly fstrim is tricky - my understanding is that it only works if you configure things correctly
16:51:57 https://dustymabe.com/2013/06/11/recover-space-from-vm-disk-images-by-using-discardfstrim/
16:51:59 it'll pass through loop driver by default
16:52:15 dcavalca: do you think we could get offline discard added to btrfs-progs?
16:52:23 similar to what exists for ext4?
16:52:38 loop? i thought we were mounting emulated scsi disks
16:52:43 i don't really want offline discard added, and esandeen mentioned to me a while ago it probably shouldn't be in e2fsck either
16:52:54 so then what?
16:52:56 what do?
16:52:59 Eighth_Doctor: I can certainly ask if needed
16:53:00 it should just be done on a mounted fs, using fstrim, which calls a kernel ioctl for it and works on all file systems
16:53:17 could we call fstrim from the kickstart?
16:53:29 I think it has more to do with the backing storage
16:53:32 calling fstrim from %post should work
16:53:47 for example in the blog post I linked I needed to add `discard='unmap'` to the disk XML description
16:54:12 oh that's a good point since this happening in a VM
16:54:49 cmurf[m]: and now you get to go dig through imagefactory/oz to see if you can wire everything through
16:54:56 :)
16:54:56 ok so it's not on loop, it's a qemu device?
16:55:15 i thought i saw the logs showing the image on a loop device, hmmm
16:55:22 yeah I mean it's just a disk image I assume
16:55:37 * dustymabe can't confirm 100% - that's just what I assume
16:56:10 well my idea would be (a) fallocate the image instead of using truncate and (b) do not use that file's filesystem, i.e. /home, as a staging area for rpms that create a log of garbage to clean up later on
16:56:33 if we did that, we don't have to worry about fstrim, fallocate, dd zeros, etc. later on
16:56:44 hmm. i'm not familiar with the "/home, as a staging area for rpms that create a log of garbage" issue
16:56:56 Is that specific to anaconda?
16:56:58 i don't know why the image itself is used for staging rpms, seems kinda sloppy honestly
16:57:01 yeah it's also in the log
16:57:05 yes
16:57:09 netinstallers do it too
16:57:27 ahh, ok. I leave that issue to the anaconda team then :)
16:57:37 i think it's based on netinstalls on baremetal
16:57:47 where else to stage them?
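
Aside on the truncate versus fallocate distinction raised at 16:50:39 and 16:56:10: the following is a minimal Python sketch, assuming a Linux host whose file system supports sparse files. The file names and the 16 MiB size are made up for illustration, and nothing here reproduces what imagefactory, oz, or anaconda actually do. It only shows that a truncated file and a preallocated file report the same apparent size while differing in how many blocks are really backed by storage.

```python
import os
import tempfile

# 16 MiB stand-in for a multi-GB cloud image (size chosen only for illustration).
IMAGE_SIZE = 16 * 1024 * 1024


def allocated_bytes(path):
    """Bytes actually backed by storage; on Linux st_blocks counts 512-byte units."""
    return os.stat(path).st_blocks * 512


def make_sparse(path, size):
    """Set the file length without writing data, similar to `truncate -s`."""
    with open(path, "wb") as f:
        f.truncate(size)


def make_preallocated(path, size):
    """Ask the file system to reserve space for the whole length, similar to `fallocate -l`."""
    with open(path, "wb") as f:
        os.posix_fallocate(f.fileno(), 0, size)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names; any writable directory works.
    sparse_path = os.path.join(tmp, "sparse.raw")
    full_path = os.path.join(tmp, "preallocated.raw")
    make_sparse(sparse_path, IMAGE_SIZE)
    make_preallocated(full_path, IMAGE_SIZE)
    # Record (apparent size, allocated size) for each file before cleanup.
    report = {
        "sparse": (os.path.getsize(sparse_path), allocated_bytes(sparse_path)),
        "preallocated": (os.path.getsize(full_path), allocated_bytes(full_path)),
    }

for name, (apparent, allocated) in report.items():
    print(f"{name:>12}: apparent={apparent} allocated={allocated}")
```

The exact allocated figures depend on the file system, so none are quoted here; the apparent sizes should match in both cases.
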
16:57:57 the image itself is probably used to stage rpms because systems with low memory wouldn't be able to do it otherwise
16:58:15 it doesn't distinguish between the netinstall on baremetal case, and netinstall to create an image case
16:58:25 it's an anaconda thing that can't be fixed
16:58:29 right a build system could just use /var/tmp
16:58:44 * Eighth_Doctor has had this argument before
16:58:48 cmurf[m]: correct. a system with lots of RAM could just use that
16:58:52 you'd need a "build image" mode for anaconda
16:59:16 ok I have to run - cmurf[m] maybe create an issue for this with some relevant details
16:59:20 #topic open floor
16:59:21 they don't want to build one and Red Hat is generally reducing investment in Anaconda's ability to do this stuff outside of the absolutely required effort
16:59:24 k
16:59:33 Eighth_Doctor: can you close the meeting out when open floor discussion has finished
16:59:37 I have to go
16:59:41 dustymabe: sure
16:59:48 yeah so in that case i'd say, we need to focus this functionality on osbuild
16:59:58 and not worry about what we think will be legacy ways of building images
17:00:10 that's going to be very hard
17:00:21 which part
17:00:23 because there's no infrastructure to build images with osbuild
17:00:37 and osbuild does not support most of the customizations we do today for images
17:01:00 but that's something I would like to explore with the osbuild team.
17:01:10 ok well we might have to have a higher level discussion about all those customizations and if they are even really needed and then which ones osbuild needs to learn
17:01:27 i prefer standardization whenever possible
17:01:36 there's a larger conversation to have about image building
17:01:46 the customizations are why we have 8 image building tools, or whatever the count is up to
17:01:51 but that's hugely out of scope for now
17:01:54 (9 tools)
17:02:00 i was close! :D
17:02:10 i thought i was exaggerating too :\
17:02:38 anyway let's fix this when it's a problem
17:03:48 right now the btrfs raw image is about 4.1G/5G (about 1G of holes) compared to ext4 625M/5G. But xz smashes them down to ~180M and ~300M respectively.
17:04:09 The big difference there isn't the sparseness issue either. I can explain it if anyone cares. But it's just the way it is.
17:04:43 certainly something we should figure out if we can shrink
17:04:43 I'd consider it not a problem but room for optimization.
17:04:56 Not by much.
17:05:10 the issue is that btrfs is compressing with zstd:1 and xz can't compress it further.
17:05:52 ahhh
17:06:03 we'd need a way to create images with a different (higher) zstd level than the fstab that's in it
17:07:23 I think we can go on and on for this topic, so let's table it
17:07:36 cmurf: can you make a ticket about this?
17:08:09 a cloud-sig ticket for tracking?
17:08:10 yeah
17:08:20 #action cmurf to make a ticket about image creation optimizations
17:08:50 So finally, one last thing for me to bring up...
17:09:04 #topic New lead for Cloud WG
17:09:23 I'm bringing this up because it's increasingly clear dustymabe is being stretched too thin on this
17:09:49 his efforts and job around Fedora CoreOS are making harder for him to drive Fedora Cloud and that's not fair to him or everyone else
17:10:23 recently though, davdunc has been stepping up to do amazing work on Fedora Cloud, and he clearly cares *a lot* about it
17:10:49 yes. I love what Dusty has done over the years! I also care a lot.
17:11:12 so I'd like to ask davdunc if he'd like to consider taking over leading the Fedora Cloud WG to help steer us forward
17:11:30 (as for why not me? I'm stretched even thinner than Dusty, if that's even possible...)
17:12:47 so what does the group think on this?
17:13:08 no objection
17:13:09 (and davdunc too! after all, he's the subject of this!)
17:13:14 I'm all for more involvement and I think davdunc is an excellent member of our community
17:13:45 +1
17:13:53 jdoss, dcavalca, ?
17:14:05 +1
17:14:18 I would definitely do it.
17:14:29 I know mattdm is going to schedule some time to talk with all of us soon about fedora cloud - so this certainly plays into that discussion
17:14:59 I think it makes sense to have davdunc leading by that point if he wants to
17:15:03 davdunc: can you reach out to mattdm and let him know your interest as well?
17:15:53 dustymabe: will do.
17:15:55 Eighth_Doctor: WFM - let's have a ticket in our tracker anyway - to draw more attention to the proposed change
17:16:02 sure
17:16:10 then we can follow that with a ML announcement and such
17:16:10 dustymabe: do you want to make the ticket?
17:16:18 you are the leader atm :)
17:16:22 so it makes sense
17:16:23 :)
17:16:25 sure can do
17:16:49 #action dustymabe to file ticket and announce change to davdunc as Cloud WG lead
17:17:04 thanks Eighth_Doctor
17:17:08 have to leave keyboard again
17:17:10 bbiab
17:17:18 :D thanks for the vote of confidence everyone. Thanks dustymabe for everything!
17:17:19 alright, that's pretty much everything I had
17:17:29 +1
17:17:29 if there's nothing else, we can end this meeting
17:17:43 counting down...
17:17:56 3.............
17:17:57 dustymabe++
17:17:57 2........
17:18:03 not sure if that works via matrix
17:18:03 1...
17:18:19 cmurf: it does if zodbot recognizes the IRC nick from Matrix
17:18:26 #endmeeting
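
Aside on the compression point made at 17:05:10 (data already compressed inside the image leaves little for xz to gain): a rough Python sketch of the general effect, under loudly stated assumptions. zlib level 1 stands in for zstd:1 purely to stay within the widely available standard library, and the input is synthetic word-like text (the `pkgNNN` vocabulary is invented), not a real Fedora image. It is an illustration of the principle, not a measurement of the actual cloud images.

```python
import lzma
import random
import zlib

rng = random.Random(0)  # fixed seed so repeated runs see the same data
vocabulary = [f"pkg{n:03d}" for n in range(64)]  # hypothetical tokens

# Synthetic stand-in for image contents: a block of word-like text, repeated
# so that redundancy exists at long range as well as short range.
block = " ".join(rng.choice(vocabulary) for _ in range(40_000)).encode()
plain = block * 8

# zlib level 1 plays the role of a fast transparent compressor such as zstd:1
# (an assumption made for portability, not what btrfs uses).
precompressed = zlib.compress(plain, 1)

# lzma.compress() emits the .xz container format by default.
xz_of_plain = len(lzma.compress(plain))
xz_of_precompressed = len(lzma.compress(precompressed))

print(f"plain input:              {len(plain):>9} bytes")
print(f"after fast pre-compress:  {len(precompressed):>9} bytes")
print(f"xz of plain input:        {xz_of_plain:>9} bytes")
print(f"xz of pre-compressed:     {xz_of_precompressed:>9} bytes")
```

On this synthetic input, xz applied to the plain data should end up smaller than xz applied to the pre-compressed stream, which mirrors the idea of wanting a different compression level when the image is created than the one set in its fstab.
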