2024-04-10 16:00:29 <@davide:cavalca.name> !startmeeting CentOS Hyperscale SIG
2024-04-10 16:00:32 <@meetbot:fedora.im> Meeting started at 2024-04-10 16:00:29 UTC
2024-04-10 16:00:32 <@meetbot:fedora.im> The Meeting name is 'CentOS Hyperscale SIG'
2024-04-10 16:00:46 <@davide:cavalca.name> !topic Roll call
2024-04-10 16:00:59 <@davide:cavalca.name> !hi
2024-04-10 16:01:01 <@zodbot:fedora.im> Davide Cavalca (dcavalca) - he / him / his
2024-04-10 16:01:49 <@conan_kudo:matrix.org> !hi
2024-04-10 16:01:51 <@zodbot:fedora.im> Neal Gompa (ngompa) - he / him / his
2024-04-10 16:01:56 <@aekoroglu:matrix.org> !hi
2024-04-10 16:01:58 <@zodbot:fedora.im> Ali Erdinc Koroglu (aekoroglu)
2024-04-10 16:02:01 <@rcolebaugh:matrix.org> !hi
2024-04-10 16:02:03 <@zodbot:fedora.im> Raymond Colebaugh (rcolebaugh) - he / him / his
2024-04-10 16:02:49 <@salimma:fedora.im> !hi
2024-04-10 16:02:50 <@zodbot:fedora.im> Michel Lind (salimma) - he / him / his
2024-04-10 16:03:38 <@jonathanspw:fedora.im> !hi
2024-04-10 16:03:40 <@zodbot:fedora.im> Jonathan Wright (jonathanspw)
2024-04-10 16:03:43 <@davide:cavalca.name> welcome everyone, let's get started
2024-04-10 16:03:48 <@jonathanspw:fedora.im> I'm on mobile but sort of here
2024-04-10 16:03:51 <@davide:cavalca.name> !topic Followups
2024-04-10 16:04:05 <@davide:cavalca.name> any followups to share from the last meeting?
2024-04-10 16:05:17 <@davide:cavalca.name> ahah same here
2024-04-10 16:06:46 <@conan_kudo:matrix.org> stuff and things :)
2024-04-10 16:07:02 <@conan_kudo:matrix.org> alas we don't have the meeting logs and minutes on the sig site still :(
2024-04-10 16:07:25 <@anitazha:matrix.org> !hi
2024-04-10 16:07:27 <@zodbot:fedora.im> Anita Zhang (anitazha) - she / her / hers
2024-04-10 16:08:24 <@salimma:fedora.im> yeah, let me try and fish out a link
2024-04-10 16:08:35 <@salimma:fedora.im> we should probably send the link with every followup topic
2024-04-10 16:08:35 <@davide:cavalca.name> https://meetbot.fedoraproject.org/meeting_matrix_fedoraproject-org/2024-03-27/centos-hyperscale-sig.2024-03-27-16.02.log.html
2024-04-10 16:08:53 <@zodbot:fedora.im> salimma has already given cookies to dcavalca during the F39 timeframe
2024-04-10 16:08:57 <@davide:cavalca.name> at least the fedora meetbot has search
2024-04-10 16:09:42 <@salimma:fedora.im> no action item... hmm
2024-04-10 16:10:08 <@salimma:fedora.im> the info is ... virt stack refresh, talks, and RPM CoW
2024-04-10 16:10:20 <@salimma:fedora.im> and there's question about the 6.8 kernel
2024-04-10 16:10:37 <@salimma:fedora.im> sorry, there's info about that, and Jun Wang asked about Mellanox in the channel yesterday
2024-04-10 16:10:47 <@conan_kudo:matrix.org> I was waiting for the Fedora rebase to complete before shipping it here
2024-04-10 16:10:53 <@salimma:fedora.im> and someone from Intel is supposed to show up today?
Adenilson Cavalcanti
2024-04-10 16:10:55 <@conan_kudo:matrix.org> that's done as of end of last week
2024-04-10 16:11:05 <@conan_kudo:matrix.org> so now I'll update Hyperscale probably today
2024-04-10 16:11:12 <@conan_kudo:matrix.org> I need to make new images anyway for TXLF
2024-04-10 16:12:00 <@conan_kudo:matrix.org> Carl George asked me about demoing CentOS Hyperscale at TXLF
2024-04-10 16:14:45 <@aekoroglu:matrix.org> he's here :)
2024-04-10 16:15:18 <@aekoroglu:matrix.org> Adenilson Cavalcanti: ?
2024-04-10 16:16:05 <@conan_kudo:matrix.org> he hasn't joined the room yet
2024-04-10 16:16:40 <@davide:cavalca.name> let's move on in the meantime
2024-04-10 16:16:44 <@davide:cavalca.name> !topic Announcements
2024-04-10 16:16:58 <@davide:cavalca.name> a bunch of us will be at TXLF later this week
2024-04-10 16:17:13 <@davide:cavalca.name> as Conan Kudo mentioned we might have a demo there as well
2024-04-10 16:17:52 <@davide:cavalca.name> I was checking if SCALE videos were up already but nope not yet
2024-04-10 16:19:16 <@davide:cavalca.name> the only other thing I had on my end is that we're looking at potentially backporting conda in Hyperscale
2024-04-10 16:20:01 <@davide:cavalca.name> that would only be for el9, and I'm hopeful we can actually get the bulk of it into EPEL proper
2024-04-10 16:20:38 <@salimma:fedora.im> I hope we don't have many cases where we need to have multiple versions of a Python module
2024-04-10 16:20:40 <@salimma:fedora.im> those are a pain
2024-04-10 16:21:17 <@salimma:fedora.im> oh, something similar but smaller: someone's asking for an ed update in c9s since apparently there are regressions in the version shipped affecting search/replace (yes, someone uses ed...)
2024-04-10 16:21:32 <@salimma:fedora.im> so I'll probably do a quick backport to HS and file a JIRA to get it upgraded
2024-04-10 16:21:35 <@davide:cavalca.name> yeah, for the record the only reason we're even considering conda is because recent version reimplemented the solver and it's massively faster
2024-04-10 16:21:41 <@conan_kudo:matrix.org> ...
2024-04-10 16:22:08 <@conan_kudo:matrix.org> so is conda now using the mamba solver or something else?
2024-04-10 16:23:13 <@davide:cavalca.name> yep it's using mamba which uses libsolv
2024-04-10 16:24:08 <@conan_kudo:matrix.org> awesome
2024-04-10 16:24:16 <@conan_kudo:matrix.org> do we have that in fedora already?
2024-04-10 16:25:16 <@davide:cavalca.name> yep it's already in Fedora
2024-04-10 16:27:01 <@davide:cavalca.name> anything else for announcements?
2024-04-10 16:27:39 <@salimma:fedora.im> it was fun too that conda recently switched to using YY.MM as their version number
2024-04-10 16:27:54 <@salimma:fedora.im> making it initially seeming like EL9 is woefully out of date, but it's actually not
2024-04-10 16:30:06 <@davide:cavalca.name> next up
2024-04-10 16:30:12 <@davide:cavalca.name> !topic Tickets
2024-04-10 16:30:53 <@davide:cavalca.name> I don't think we have anything notable here this week?
2024-04-10 16:31:34 <@davide:cavalca.name> !topic Membership
2024-04-10 16:31:48 <@davide:cavalca.name> we have one membership request in https://pagure.io/centos-sig-hyperscale/sig/issue/163
2024-04-10 16:32:07 <@davide:cavalca.name> Adenilson Cavalcanti: would you like to introduce yourself?
2024-04-10 16:32:35 <@conan_kudo:matrix.org> still hasn't joined the room
2024-04-10 16:32:53 <@davide:cavalca.name> ah, would be nice if element flagged than when tagging them :)
2024-04-10 16:33:04 <@davide:cavalca.name> ah, would be nice if element flagged that when tagging them :)
2024-04-10 16:35:20 <@junwang123:matrix.org> Is there an office hour time to talk more about this one?
2024-04-10 16:36:22 <@davide:cavalca.name> if you mean talk over zoom, we have a hangout scheduled for next week
2024-04-10 16:36:35 <@davide:cavalca.name> we can also talk about it here if nobody else has stuff
2024-04-10 16:36:41 <@davide:cavalca.name> !topic Misc
2024-04-10 16:37:24 <@conan_kudo:matrix.org> I don't have anything :)
2024-04-10 16:38:08 <@davide:cavalca.name> Jun Wang: you have the floor :)
2024-04-10 16:39:44 <@junwang123:matrix.org> Hi Everyone, thanks for all the help. In the past, we go with LTS kernel and recompile nvidia kernel modules as out of the tree module, for each kernel version we use.
2024-04-10 16:40:42 <@junwang123:matrix.org> I'm looking for inputs on how it would work with the Hyperscale/Fedora kernel. Looks like the minor version update is quite frequent.
2024-04-10 16:41:43 <@conan_kudo:matrix.org> what's keeping the module out of tree?
2024-04-10 16:41:53 <@salimma:fedora.im> nvidia GPU modules, or Mellanox?
2024-04-10 16:42:05 <@junwang123:matrix.org> both
2024-04-10 16:42:20 <@salimma:fedora.im> for mellanox Conan Kudo's question holds I think
2024-04-10 16:42:36 <@conan_kudo:matrix.org> yeah, I know what's up with the GPU drivers
2024-04-10 16:42:54 <@junwang123:matrix.org> for nvidia-kmod, we were using https://github.com/elrepo/packages/tree/master/nvidia-kmod/el7. thinking about using https://github.com/elrepo/packages/tree/master/nvidia-kmod/el9 now.
2024-04-10 16:43:11 <@salimma:fedora.im> why is it out of tree (if it's because you get development versions that are not upstreamed yet, FWIW Meta might have similar drivers and we are ...
not as far as 6.8 yet internally)
2024-04-10 16:44:05 <@davide:cavalca.name> this is the proprietary nvidia driver, it can't go in-tree
2024-04-10 16:44:07 <@salimma:fedora.im> for the GPU drivers I think your best bet is participating in Fedora's kernel test days
2024-04-10 16:44:50 <@salimma:fedora.im> have Fedora installed on one of the machine with a GPU you need to work, and report if there's any issue with the driver (note that Fedora recommends the RPM Fusion driver at the moment, so if you can repro with that it will help)
2024-04-10 16:45:20 <@salimma:fedora.im> because then you can catch issues before it hits Fedora, and before we then rebase the HS kernel on it
2024-04-10 16:45:26 <@junwang123:matrix.org> there are different nvidia versions and different kernel versions, we need combinations. so we were building them.
2024-04-10 16:46:08 <@salimma:fedora.im> right. we built nvidia drivers in house too for our production kernel, and sometimes people need different versions
2024-04-10 16:46:35 <@conan_kudo:matrix.org> fwiw, test results for fedora 100% apply to hyperscale
2024-04-10 16:46:46 <@conan_kudo:matrix.org> since the code is the same and the config is only slightly different
2024-04-10 16:46:50 <@salimma:fedora.im> but... I guess if you really need this to work, you need to control your own kernel release cadence and the HS kernel might not be suitable.
I don't think we want to be blocked on making sure various Nvidia kernel versions work
2024-04-10 16:46:54 <@conan_kudo:matrix.org> fwiw, test results for fedora nearly 100% apply to hyperscale
2024-04-10 16:47:22 <@salimma:fedora.im> so yeah, test compiling your different versions during the Fedora kernel test and report any issue (I am not sure if they will consider it blocking, but you should try)
2024-04-10 16:47:47 <@davide:cavalca.name> we've talked about potentially doing a slower-moving kernel in HS as well, for similar reasons, but it's tricky in practice and I don't know if/where that will land
2024-04-10 16:48:04 <@conan_kudo:matrix.org> it also depends on how things shake out for cs10
2024-04-10 16:48:41 <@conan_kudo:matrix.org> I'm tracking the cs10 kernel development stuff now, and watching to see where things land
2024-04-10 16:48:56 <@salimma:fedora.im> yeah. it's chicken and egg... unless we can get some of us to actually dogfood it, who knows how well it will work
2024-04-10 16:49:18 <@conan_kudo:matrix.org> well the main problem is coexistence
2024-04-10 16:49:41 <@conan_kudo:matrix.org> we need to hackfest this to make it so parallel kernel tracks can be available in the repository at once
2024-04-10 16:49:44 <@conan_kudo:matrix.org> right now we don't have that
2024-04-10 16:50:32 <@salimma:fedora.im> something similar to Asahi where they used to have a differently-named kernel package would work, I guess?
2024-04-10 16:50:39 <@salimma:fedora.im> Debian and Ubuntu do it that way too
2024-04-10 16:50:52 <@salimma:fedora.im> but yeah there'll also be the question of who will maintain the other kernels :)
2024-04-10 16:51:01 <@davide:cavalca.name> we could use separate tags I suppose?
but yeah this came up in another setting with sched_ext, so it'd be worth coming up with a good solution and documenting it
2024-04-10 16:52:19 <@junwang123:matrix.org> we're using version number combination, such as kmod-nvidia-5.15.147-t3-515.65.01-1.el7.twitter.x86_64.rpm
2024-04-10 16:53:34 <@conan_kudo:matrix.org> we will definitely need separate tags if for nothing else so kmod rebuilds don't get confused
2024-04-10 16:53:43 <@salimma:fedora.im> yeah, if we need different kernel tracks it will likely embed the kernel MAJ.MIN somewhere
2024-04-10 16:54:24 <@salimma:fedora.im> but it can't just be MAJ.Min - sometimes we'll need another tag e.g. sched_ext, or something else we don't anticipate right now
2024-04-10 16:55:15 <@junwang123:matrix.org> For Mellanox OFED, we're going through the support page. https://docs.nvidia.com/networking/display/mlnxofedv24010331/general+support
2024-04-10 16:55:20 <@conan_kudo:matrix.org> the problem is that the infrastructure around the kernel packaging makes it difficult to change the basename without potentially breaking something
2024-04-10 16:56:20 <@junwang123:matrix.org> For Mellanox OFED, we're going through the support page. There is a 6.7 kernel row on the table. Not sure what that means. Is it related to Fedora kernel therefore be a moving target or was it 6.7 kernel selected to gain more support. https://docs.nvidia.com/networking/display/mlnxofedv24010331/general+support
2024-04-10 16:56:33 <@conan_kudo:matrix.org> also, changing the basename means the mainline kernel package is no longer overshadowed too
2024-04-10 16:56:51 <@salimma:fedora.im> yeah but that's probably a feature not a bug
2024-04-10 16:57:15 <@conan_kudo:matrix.org> is it? our systems have btrfs
2024-04-10 16:57:20 <@conan_kudo:matrix.org> the mainline kernel does not
2024-04-10 16:57:25 <@salimma:fedora.im> so that question is still unanswered, is this driver in the process of being upstreamed?
2024-04-10 16:57:30 <@conan_kudo:matrix.org> ("mainline" referring to what CentOS itself provides)
2024-04-10 16:57:50 <@salimma:fedora.im> good point. yeah but if you're tracking a different kernel series you should know to never boot the normal 'kernel'
2024-04-10 16:58:07 <@salimma:fedora.im> and we'll still have the normal, unrenamed HS kernel right? that one still shadows the mainline kernel
2024-04-10 16:58:21 <@conan_kudo:matrix.org> yeah
2024-04-10 16:58:22 <@salimma:fedora.im> if you need another series, remove every 'kernel' package. if you don't, use the untagged HS kernel
2024-04-10 16:58:27 <@conan_kudo:matrix.org> as long as we always have that, we should be good
2024-04-10 16:59:42 <@davide:cavalca.name> we're almost out of time
2024-04-10 17:00:13 <@davide:cavalca.name> once we get consensus on this we should document it somewhere so it doesn't get lost
2024-04-10 17:00:30 <@davide:cavalca.name> have a good one folks!
2024-04-10 17:00:34 <@conan_kudo:matrix.org> yes
2024-04-10 17:00:45 <@conan_kudo:matrix.org> I think this is something we're going to have to sort out in a hackfest
2024-04-10 17:00:59 <@davide:cavalca.name> !endmeeting