16:07:06 #startmeeting fedora_cloud_meeting
16:07:06 Meeting started Tue Jun 8 16:07:06 2021 UTC.
16:07:06 This meeting is logged and archived in a public location.
16:07:06 The chair is dustymabe. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:07:06 Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:07:06 The meeting name has been set to 'fedora_cloud_meeting'
16:07:12 #topic roll call
16:07:14 hey cmurf[m]
16:07:18 .hi
16:07:20 dustymabe: dustymabe 'Dusty Mabe'
16:07:42 .hi
16:07:43 cyberpear: cyberpear 'James Cassell'
16:07:46 hmm laggy
16:07:49 .hello ngompa
16:07:51 King_InuYasha: ngompa 'Neal Gompa'
16:07:52 hey cmurf
16:08:02 I'm confused. When I joined the #fedora:matrix.org rooms last week, I ended up in freenode...
16:08:18 not all rooms get converted at once I think
16:08:22 they were rehomed last week
16:08:29 oh dear, the backlog is syncing to IRC now
16:08:30 and some still haven't changed
16:08:31 * michel notes this room is not in the Fedora Space yet either
16:08:31 .hi
16:08:32 dcavalca: dcavalca 'Davide Cavalca'
16:09:05 we're almost caught up to Matrix :)
16:09:06 oddly enough, cmurf entering here stopped the bridge
16:09:14 crap
16:09:22 so... that happens when someone who's not bridged properly to IRC joined
16:09:23 And the Element UI doesn't tell me what irc server im actually chatting in.
16:09:31 it's not sending messages to IRC anymore, presumably because he's still not correctly logged into libera.chat
16:09:34 oh, there we go
16:09:46 at least that's what someone in #libera-matrix:libera.chat said as a diagnosis
16:09:55 .hello ngompa
16:10:00 .hello salimma
16:10:00 Eighth_Doctor: ngompa 'Neal Gompa'
16:10:03 michel: salimma 'Michel Alexandre Salim'
16:10:39 * michel fires zodbot
16:10:51 .fire zodbot
16:10:53 adamw fires zodbot
16:10:57 I see zodbot. How can i be bridged wrong if i see zodbot replies?
16:11:14 let me know when I should proceed with the meeting
16:11:28 I'll let you know
16:11:28 cmurf: looks like appservice finally noticed you, it just took a while
16:11:33 and until then sync was off
16:11:35 .hello chrismurphy
16:11:36 cmurf[m]: chrismurphy 'Chris Murphy'
16:12:01 Ok now no zodbot reply
16:12:09 just wait a sec, the events are synchronizing
16:13:01 .hire zodbot
16:13:01 adamw hires zodbot
16:13:08 haha
16:14:07 proceed yet?
16:14:26 Matrix is stuck it seems
16:14:37 it's getting IRC events, but now Matrix events aren't coming back
16:14:52 okay, now, let's see if we're in sync yet...
16:14:52 did zodbot drop Davide's intro?
16:14:57 let's just start anyway...
16:15:01 no, it was earlier up
16:15:10 michel: still here
16:15:46 That was fast
16:15:48 dustymabe: let's start the meeting
16:15:50 #chair King_InuYasha cmurf[m] michel dcavalca
16:15:50 Current chairs: King_InuYasha cmurf[m] dcavalca dustymabe michel
16:15:54 oh yeah, we're synced now
16:16:01 davdunc around?
16:16:05 * King_InuYasha pokes davdunc
16:16:11 #topic Action items from last meeting
16:16:14 no we're not, IRC is going to Matrix faster than the other way
16:16:17 I am in the background.
16:16:22 * davdunc will work on a Change proposal for switching to GPT with
16:16:25 hybrid BIOS+UEFI
16:16:27 * davdunc will investigate the issue with Fedora 34 Cloud nightly images
16:16:29 not booting in AWS
16:16:31 #chair davdunc
16:16:31 Current chairs: King_InuYasha cmurf[m] davdunc dcavalca dustymabe michel
16:16:32 .hello2
16:16:33 davdunc: davdunc 'David Duncan'
16:16:45 in an engineering meeting at the same time.
16:16:59 any updates on action items?
16:17:22 Please proceed.
16:17:36 dustymabe: davdunc has submitted the GPT + hybrid boot proposal: https://fedoraproject.org/wiki/Changes/FedoraCloudHybridBoot
16:17:57 #info davdunc has submitted the GPT + hybrid boot proposal: https://fedoraproject.org/wiki/Changes/FedoraCloudHybridBoot
16:18:02 yes. I think that this is a good time to seek alignment with the project as a whole.
16:18:14 I don't know much about the progress on the nightly images though
16:18:40 they are working, but the openQA is not booting them.
16:18:47 need to look at why that is exactly.
16:19:01 davdunc: so they work in AWS but not in openQA?
16:19:15 correct. I can upload and boot.
16:19:54 can you run in QEMU/Libvirt?
16:20:24 I haven't attempted. I was too keenly focused on the target environment.
16:20:44 might be a good data point
16:20:56 but there should be no blockers and I will add that as a test point.
16:21:04 #info still debugging f34 cloud nightly image boot issues
16:21:31 I do want to know how we can update the openQA to include a worker on the specific cloud environments.
16:22:04 #topics for this meeting
16:22:10 #topic topics for this meeting
16:22:23 What do you all want to discuss today. Throw out some links and I'll make a list
16:22:28 https://pagure.io/cloud-sig/issues
16:22:56 https://pagure.io/cloud-sig/issue/325
16:23:24 anything else?
16:23:59 https://pagure.io/cloud-sig/issue/329
16:24:37 https://pagure.io/cloud-sig/issue/320
16:24:58 ok
16:25:00 that last one is mostly asking if we're actually *done* implementing this yet
16:25:08 #topic Please publish Fedora AMI to af-south-1
16:25:12 #link https://pagure.io/cloud-sig/issue/325
16:25:19 King_InuYasha: I am the bottleneck on 320
16:25:56 maybe we can clear that out with the other changes for F35 images?
16:26:15 ok so issues with af-south-1 region?
16:26:45 I don't know much about this, but it's been lingering...
16:27:06 it's an "opt-in" wrapper
16:27:12 davdunc: do accounts have to be "activated" when they first try to use a new region?
16:27:26 yes the region has to be added to the valid regions.
16:27:28 i.e. "do you really want to use this region"
16:27:31 ok
16:27:41 any links you know of we could share with mobrien?
16:27:44 It's Cape Town.
16:28:20 https://aws.amazon.com/blogs/security/setting-permissions-to-enable-accounts-for-upcoming-aws-regions/
16:28:38 is there a reason we shouldn't have every region except CN ones enabled?
16:28:48 there is not.
16:29:28 can we make it so it works that way instead?
16:29:48 (I guess we could also have CN regions enabled too, but IIRC CN is "special")
16:29:49 I just added a comment for mobrien
16:30:10 some of the FCOS stuff detects new regions automatically
16:30:23 let me check to see if we have an image in that region for FCOS
16:30:31 davdunc: as you can tell, I'm cloud-dumb ^_^;;
16:30:46 CN is special, but I am working on pushing images to the Marketplace.
16:30:50 `af-south-1: ami-06c96d24608e629c0`
16:30:50 I have permissions.
16:31:11 so FCOS is using it
16:31:22 and it should be the same account
16:32:48 so not sure why fedimg is getting the error
16:33:49 should we move to the next topic?
16:34:09 maybe it was a one-time issue, but it's been fixed.
16:34:54 #topic Add Ignition support to Fedora Cloud
16:34:57 #link https://pagure.io/cloud-sig/issue/320
16:35:03 I think jdoss has been hacking on this as a side project
16:35:36 that would be awesome.
16:35:38 I personally think this is a no go unless we can figure out how to support them both
16:36:02 I think the goal is to do both
16:36:20 we need to figure out how to have cloud-init disable ignition and vice versa, though
16:36:21 in that case, It's a neat idea
16:36:41 King_InuYasha: I think they should just cue off of the metadata
16:36:52 #cloud-config -> cloud-init
16:37:18 json -> ignition
16:37:26 cloud-config is yaml, so it can be json too
16:37:33 so ship and support both but deconflict, so that they are mutually exclusive at install time?
16:37:43 cmurf: that's the goal
16:37:44 no, they'll be both installed
16:37:53 they're going to be mutually exclusive at runtime
16:38:03 got it
16:38:08 i think that's what cmurf meant
16:38:31 runtime for ignition is installation part 2, or whatever the nomenclature would become
16:39:01 cool - anything else to discuss on this ticket?
16:39:40 I think we should come up with some tests to verify sanity of a dual ign+c-i setup
16:39:46 but I don't know what that would be
16:41:01 King_InuYasha: yeah I mean there's a lot that Ignition does (reformatting disks, etc) that might not work out of the box
16:41:08 it's going to take some effort
16:41:55 anything else for this topic?
16:42:01 nope
16:42:51 #topic fedora cloud image: enable ipv6 (accept RAs by default)
16:42:54 #link https://pagure.io/cloud-sig/issue/320
16:43:28 davdunc: I think you mentioned this earlier?
16:43:35 .hi
16:43:36 jdoss: jdoss 'Joe Doss'
16:44:00 hey jdoss, you just missed us talking about ign+c-i on cloud images :)
16:44:03 i did. I said I would work on it and I have had other priorities here.
16:44:26 King_InuYasha: yah I see that. All good. I think what was talked about is spot on.
16:45:05 davdunc: no worries
16:45:15 davdunc: move to open floor then?
16:45:20 Should we bring up the btrfs stuff?
16:45:22 thanks. it's still on my list and yes.
16:45:42 I guess that's what cmurf and dcavalca are here for :)
16:45:56 actually let me hit on this other topic real quick
16:46:04 #topic systemd-oomd by default
16:46:09 #link https://pagure.io/cloud-sig/issue/324
16:46:21 what do we need to do to do this?
16:46:26 just finish implementing swap on zram?
16:46:50 I'm thinking we enable swap on zram and then systemd-oomd
16:46:55 I've added anitazha to that ticket in case she wants to chime in
16:47:00 I'm good with that :)
16:47:04 but yeah, you'll need swap for oomd to work properly
16:47:10 hilariously, I thought we already did this in F34
16:47:10 this seems like a good plan
16:47:15 and found out we didn't :o
16:47:20 but also cgroups work for processes needs to happen
16:47:22 there's a reported issue on systems with too much physical RAM - but that won't affect cloud usages normally right?
16:47:31 it's just a matter of tweaking the defaults anyway
16:47:31 since oomd clobbers things by cgroup, not by process
16:47:40 super, super, super unlikely to have _too much_ physical RAM
16:47:48 dcavalca: what are the negatives of running systemd-oomd without swap at all?
16:48:03 dustymabe: it won't work properly
16:48:06 michel: what were those "too much RAM" issues?
16:48:22 King_InuYasha: yeah, I think it was an unintentional slippage, not including Cloud there. or... what does Cloud ship now for oom?
16:48:26 it can't kill based on swap pressure because there is none; so you get a ton of reclaim which looks similar to swap thrashing, but the process won't get killed
16:48:33 dustymabe: see "setup information" in https://man7.org/linux/man-pages/man8/systemd-oomd.service.8.html
16:49:04 dcavalca: should the service require swap then (as a condition check or something)
16:49:04 michel: nothing, earlyoom is not installed or enabled
16:49:21 ok nothing is incorrect, there is still the kernel's OOM killer
16:49:46 dustymabe: that's a good point, I'll bring it up with Anita
16:49:51 cmurf: ah, then it makes sense, we were focused on replacing earlyoom. hmm.. ISTR the 'too much RAM' issue is different, but let me try and find it
16:49:57 but kernel OOM killer only triggers if the kernel thinks its life is threatened
16:50:09 ^^ eat or be eaten
16:50:20 hah
16:50:27 :)
16:50:40 too much RAM won't be an issue for zram because there is a cap by default; 100% RAM or 8G whichever is smaller
16:50:56 oomd should be fine with the zram combo, then
16:50:56 ok so to bring this back - enable swap on zram
16:51:00 so 8G /dev/zram0 device has fairly minimal overhead
16:51:07 and systemd-oomd
16:51:19 dustymabe: yup, swaponzram+systemd-oomd
16:51:25 and maybe tweak systemd-oomd such that if someone turns off swap-on-zram, it just won't start
16:51:32 which realigns us with everyone else
16:51:43 #chair Eighth_Doctor
16:51:43 Current chairs: Eighth_Doctor King_InuYasha cmurf[m] davdunc dcavalca dustymabe michel
16:51:50 we're trying to evaluate this for FCOS too
16:52:07 #chair jdoss
16:52:15 #chair jdoss
16:52:15 Current chairs: Eighth_Doctor King_InuYasha cmurf[m] davdunc dcavalca dustymabe jdoss michel
16:52:22 k8s systems would probably benefit from this by having two levels of swap
16:52:35 the first level being "fast swap" and the second level being "slow swap"
16:52:48 IIRC k8s doesn't support swap yet or at least it is something that is being looked at as it is a blocker for FCOS?
16:52:51 dcavalca: so anitazha would be able to tell us if we can have a condition check on swap for systemd-oomd service ?
16:52:52 k8s doesn't work with swap
16:53:03 dustymabe: yeah, I just pinged her
16:53:19 i'm reluctant to encourage two swaps on cloud, due to priority inversions
16:53:29 cool, could you have her update the ticket with the answer to that question?
16:53:32 cmurf: we wouldn't set up two swaps by default
16:53:36 only zram swap
16:53:38 dustymabe: yup, will do
16:53:51 https://github.com/kubernetes/kubernetes/issues/53533 tracks swap support for k8s
16:53:54 swap-ception. swap within a swap
16:54:00 looks like it's slated for 1.22
16:54:04 dcavalca: if we get that behavior then for FCOS we can just enable systemd-oomd
16:54:04 dustymabe: so use different thresholds based on if swap is enabled?
16:54:20 Is k8s a blocker for fedora cloud too?
16:54:23 then if the user set up swap, then they get systemd-oomd. If not, then it just will not start itself (condition check)
16:54:25 if you want two levels of swap you'd be looking at zswap (front cache) and a conventional swap, that would be fine due to its "least recently used" model for pushing things out of the cache and onto disk
16:54:55 because I really don't want k8s to dictate what we do for fedora cloud. It's one usecase.
16:54:58 jdoss: k8s is not a blocker for cloud
16:55:00 jdoss: no
16:55:06 michel: i'm not sure I understand the question
16:55:13 but it's a consideration for fcos, which dustymabe brought up
16:55:25 yeah, sorry for polluting the conversation
16:55:29 :)
16:55:46 but as it relates to cloud: zram based swap is probably better than no swap, and easy to disable for the k8s use case just by touching /etc/systemd/zram-generator.conf
16:55:58 #proposed for f35 fedora cloud we will enable swap-on-zram and systemd-oomd
16:56:09 +1
16:56:10 ack/nack?
16:56:10 besides, having zram swap might encourage k8s people to look into it more :)
16:56:20 +1
16:56:20 dustymabe: +1
16:56:22 (if guests can vote)
16:56:37 +1
16:56:39 Michel Alexandre Salim: we'll make you a member soon, you've been helping us :)
16:56:47 same for dcavalca
16:56:52 #agreed for f35 fedora cloud we will enable swap-on-zram and systemd-oomd
16:57:01 michel: we don't have super strict rules
16:57:04 You gotta take a few laps around the datacenter tho...
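
Two rules from the systemd-oomd discussion above can be sketched in a few lines of Python: the default zram cap described as "100% RAM or 8G whichever is smaller", and the proposed condition that systemd-oomd only starts when swap is present. This is an illustration only. The function names are hypothetical, the cap is taken from the statement in this log rather than from the zram-generator source, and the swap check is the idea floated in the meeting, not an existing systemd-oomd feature. The `SwapTotal:` line is read in the format used by /proc/meminfo.

```python
GIB = 1024 ** 3


def default_zram_size(ram_bytes: int, cap_bytes: int = 8 * GIB) -> int:
    """Size of the zram device under the rule quoted in the meeting:
    100% of RAM or 8G, whichever is smaller (assumption from the log)."""
    if ram_bytes < 0:
        raise ValueError("ram_bytes must be non-negative")
    return min(ram_bytes, cap_bytes)


def swap_total_kib(meminfo_text: str) -> int:
    """Extract SwapTotal (in KiB) from text in /proc/meminfo format."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            return int(line.split()[1])
    return 0  # no SwapTotal line found: treat as no swap


def oomd_should_start(meminfo_text: str) -> bool:
    """Hypothetical condition check: only start systemd-oomd if swap exists."""
    return swap_total_kib(meminfo_text) > 0


if __name__ == "__main__":
    print(default_zram_size(2 * GIB) // GIB)    # small instance: zram equals RAM
    print(default_zram_size(64 * GIB) // GIB)   # large instance: capped
    print(oomd_should_start("MemTotal: 2031616 kB\nSwapTotal: 0 kB\n"))
```

On a real system the text passed to `oomd_should_start` would come from reading /proc/meminfo; it is kept as a parameter here so the sketch can be run anywhere.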
16:57:06 just happy to have participation
16:57:16 😆
16:57:20 Conan Kudo: I feel voluntold :p
16:57:29 Michel Alexandre Salim: welcome to my world 😛
16:57:42 #topic Btrfs as the default filesystem
16:57:46 #link https://pagure.io/cloud-sig/issue/308
16:57:55 Eighth_Doctor: was this the one we wanted to bring up?
16:58:01 * michel will have spare time once he does not have to care about nvidia anymore
16:58:02 here we go...
16:58:04 dustymabe: yes
16:58:11 we have, uh, 2 mins?
16:58:26 yeah, sorry. We got a late start and the matrix bridge was acting funny
16:58:29 I'll get the fire proof blanket for you Eighth_Doctor
16:58:38 haha
16:58:58 anyway, davdunc has taken the charge on this and has done a great job with the change proposal
16:59:14 yes.
16:59:21 if you need a punching bag. I am here.
16:59:41 .hire davdunc
16:59:41 adamw hires davdunc
16:59:49 LOL.
16:59:50 that's as strong as it gets around here
16:59:54 anything in particular to discuss?
16:59:55 the only significant feedback was... oddly from pboy, which seems to be more about some kind of cloud+server merge that nobody agreed to
17:00:04 otherwise, everyone seems to be "meh" about it
17:00:27 the derailment in that thread was unfortunate.
17:00:39 to be fair we've (at least I was involved) been talking on and off about a server/cloud wg merge for some time
17:00:53 mostly because of lack of interest/resources
17:00:56 right
17:01:03 I think at this point we can consider the Cloud WG revived
17:01:28 with the F35 release, I will ask the Council to restore our presence on the main site
17:01:32 as an Edition
17:01:42 King_InuYasha: perhaps, I'm still interested in seeing more involvement over time. And I'd really like to see someone other than myself become the de-facto "leader"
17:01:43 (we technically haven't lost edition status, but...)
17:01:55 dustymabe: I'd promote davdunc :)
17:02:01 Dusty you can't ever leave us...
17:02:02 he's done a great job with the cloud-y things
17:02:03 dustymabe: can we arrange a joint meeting with server just so we can settle this issue (if only to say "no merger planned for immediate future")?
17:02:20 so we don't have this uncertainty hanging over our heads
17:02:25 I'll be more actively involved after july 7
17:02:30 You have to show up Dusty otherwise you are going to get fined.
17:02:33 michel: I talked with matthew miller recently about it. He's going to bring some people together to discuss the future
17:02:44 dustymabe++
17:02:44 michel: Karma for dustymabe changed to 6 (for the current release cycle): https://badges.fedoraproject.org/tags/cookie/any
17:02:59 jdoss: haha. I don't want to leave, but I do want to be more of a watcher/adviser rather than someone who does a lot
17:03:03 dustymabe: also, we love you Dusty, we don't want you to leave :)
17:03:20 dustymabe: hello management :)
17:03:35 yeah. definitely don't want to leave. I just don't want to have anxiety about cloud languishing and it being my fault because I'm not putting enough time into it
17:03:42 The problem with merging with Server IMO is it has things like Cockpit on by default and that on Cloud is not something I personally think is a good idea.
17:03:49 anyways, let's talk some butter
17:03:56 yes. btrfs:
17:03:58 well, I think the larger issue is there are different types of people
17:04:02 but again, for later
17:04:18 my two cents is merger or no merger, file system is unrelated. But if anything, btrfs makes cloud more like server's lvm+xfs -> integrated volume manager, reflinks, snapshots, subvolumes for separation.
17:04:30 they're not 1:1 but closer than just plain ext4
17:04:50 +1 to cmurf's point.
17:04:57 I've said it a few times. If it was just me I probably wouldn't make the change, but I'm happy people are here and interested in doing the work
17:04:59 I think it is a good idea to use the same FS as the other main editions.
17:05:13 and I do use btrfs, so I'm not afraid of it
17:05:27 I think there is a lot of FUD w/r/t btrfs
17:06:43 it is a very effective early warning system :)
17:06:45 yeah, I'm sure some people got scarred many years ago. I've never had an issue
17:07:08 I've been personally running it since 4.0 kernel
17:07:17 a mix of past issues and actual hardware issues that stay silent on other file systems
17:07:34 chances are we won't see the issues in cloud that we see with desktops: memory bitflips have been the most common, possibly tied with the occasional drive that just flat out drops critical writes or gets them out of order (i.e. not properly honoring flush/fua)
17:07:37 has anybody brought up any points in the ML discussion that are worth bringing up here?
17:07:57 other than cloud/server WG merge
17:08:02 but me doing desktop triage for btrfs issues, there's been in some sense surprisingly few problems
17:08:05 can we assume most Cloud usage has ECC RAM? or not so much depending on which provider
17:08:09 I think we should move forward with the change since King_InuYasha is leading the charge. We have the personpower to get it done.
17:08:49 dustymabe: I did not see anything. maybe we should do another request for comments, specifically excluding the server issue, and mention that that will be discussed separately?
17:08:49 cmurf: thanks for sticking around and helping people when they do have problems
17:09:30 michel: +1 i agree that we should draw a line in the sand on that and refresh the thread to bring up legit concerns about the change
17:09:57 no one else seconded that particular concern
17:10:40 works for me
17:10:53 So bump thread and be like for reals, we are going to move forward unless you give us a legit reason(s) not to?
17:10:54 let's just get this on the record though
17:11:14 jdoss: yeah something like that.. maybe with lighter tone :)
17:11:23 davdunc: I think you've got it
17:11:30 gotta keep for reals in there tho...
17:11:33 I can do it.
17:11:47 I'll bump the thread and declare our intention.
17:12:14 King_InuYasha: give davdunc the blanket
17:13:17 * King_InuYasha hands davdunc the blanket and his secret cape
17:13:25 #proposed the cloud wg will move to btrfs for the f35 unless technical hurdles are hit or we decide not to for organizational reasons (cloud/server working together)
17:13:26 :D
17:13:38 I added the last clause in there for a reason
17:14:10 as I said, we are in the midst of a discussion about fedora cloud's future (hopefully invite going out soon)
17:14:44 regardless of what we do I think moving to btrfs should be fine, I just want to leave an escape hatch in case we end up not doing it
17:15:01 King_InuYasha: thanks. . . I think.
17:15:29 ack/nack/discuss?
17:15:35 +1
17:15:42 dustymabe: I'd rather leave out the latter
17:15:50 honestly that's unfair to us as a group
17:15:58 yeah I guess I am in that camp too
17:16:08 ehh, i'm implying the group would be involved in that decision
17:16:29 Because we should be able to operate as a WG and move forward on things such as this without dealing with what ifs
17:16:36 i.e. we'd have to agree and say "ok yes we think the benefits outweigh XYZ, we'll delay this until f36"
17:17:15 does that make sense?
17:18:43 I think the first half of your sentence is enough
17:18:59 #proposed the cloud wg will move to btrfs for the f35 unless technical hurdles are hit
17:20:24 I have my reasons for wanting to include the latter. It's mostly for diplomatic purposes
17:20:36 but I guess we can have that conversation later
17:20:38 ack
17:21:01 any other people want to weigh in
17:21:46 +1 -1 - ack/nack ?
17:21:56 jdoss: davdunc: cmurf: ?
17:22:13 +1
17:23:20 * dustymabe waiting on at least one more vote
17:24:22 well I think jdoss voted in spirit earlier
17:24:30 #agreed the cloud wg will move to btrfs for the f35 unless technical hurdles are hit
17:24:36 #topic open floor
17:24:40 anything for open floor
17:24:47 I am good for now.
17:24:50 nothing from me
17:24:59 I'm in three meetings atm
17:25:04 that makes this a bit hard :)
17:25:13 ok will end the meeting soon
17:25:50 #endmeeting
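
For the Ignition topic earlier in this log, the suggestion was that cloud-init and Ignition "cue off of the metadata": a `#cloud-config` header selects cloud-init and JSON selects Ignition, with the caveat that cloud-config is YAML and so can be JSON too. A minimal Python sketch of that dispatch follows. The function name and the return strings are hypothetical, and this is not how either tool actually decides whether to run. The sketch checks the `#cloud-config` header first, so JSON-formatted cloud-config still goes to cloud-init, and it treats a JSON object with a top-level `ignition` section as an Ignition config.

```python
import json


def pick_provisioner(user_data: str) -> str:
    """Guess which tool should consume a user-data payload.

    Hypothetical helper illustrating the rule discussed in the meeting;
    returns "cloud-init", "ignition", "none" or "unknown".
    """
    stripped = user_data.strip()
    if not stripped:
        return "none"
    # The header wins, even if the body after it happens to be JSON.
    if stripped.startswith("#cloud-config"):
        return "cloud-init"
    try:
        doc = json.loads(stripped)
    except json.JSONDecodeError:
        return "unknown"
    # Ignition configs are JSON objects with a top-level "ignition" section.
    if isinstance(doc, dict) and isinstance(doc.get("ignition"), dict):
        return "ignition"
    return "unknown"


if __name__ == "__main__":
    print(pick_provisioner("#cloud-config\npackages: [vim]\n"))
    print(pick_provisioner('{"ignition": {"version": "3.3.0"}}'))
```

A check like this could also be a starting point for the dual ign+c-i sanity tests mentioned in the discussion: feed each kind of payload and assert that exactly one tool claims it.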