2024-11-07 17:30:36 <@tflink:fedora.im> !startmeeting fedora-ai-ml-sig
2024-11-07 17:30:37 <@meetbot:fedora.im> Meeting started at 2024-11-07 17:30:36 UTC
2024-11-07 17:30:37 <@meetbot:fedora.im> The Meeting name is 'fedora-ai-ml-sig'
2024-11-07 17:32:00 <@trix:fedora.im> quick someone else add something to meeting agenda before i start talking!
2024-11-07 17:32:41 <@man2dev:fedora.im> !hi
2024-11-07 17:32:42 <@tflink:fedora.im> yeah, I noticed that they were both from you :)
2024-11-07 17:33:04 <@zodbot:fedora.im> Mohammadreza Hendiani (man2dev)
2024-11-07 17:33:12 <@tflink:fedora.im> that being said, shall we get started?
2024-11-07 17:33:22 <@tflink:fedora.im> !topic ROCm APU
2024-11-07 17:34:00 <@trix:fedora.im> APU's are looking good.
2024-11-07 17:34:19 <@tflink:fedora.im> which ones? gfx1103?
2024-11-07 17:34:29 <@trix:fedora.im> I believe we are on track for a Fedora feature saying 'yeah pytorch on laptops'
2024-11-07 17:34:41 <@trix:fedora.im> 1035 - M680
2024-11-07 17:34:49 <@trix:fedora.im> 1103 - M780
2024-11-07 17:35:01 <@trix:fedora.im> 1151 - M880 ??
2024-11-07 17:35:14 <@trix:fedora.im> i don't have the last one yet, it builds..
2024-11-07 17:36:06 <@trix:fedora.im> its this one https://www.amd.com/en/products/processors/laptop/ryzen/300-series/amd-ryzen-ai-9-hx-370.html
2024-11-07 17:37:17 <@trix:fedora.im> i am also trying to get ollama spun up so people with laptops can get llm's to do their jobs.
2024-11-07 17:37:21 <@trix:fedora.im> or whatever.
2024-11-07 17:37:30 <@trix:fedora.im> so fun stuff .
2024-11-07 17:37:43 <@man2dev:fedora.im> Oh btw
2024-11-07 17:37:48 <@tflink:fedora.im> cool, thanks for the update
2024-11-07 17:38:41 <@man2dev:fedora.im> Vulkan is beyond unstable in nvidia since it relies on Vulkan layers
2024-11-07 17:39:38 <@tflink:fedora.im> !info progress has been made on enabling more mobile AMD GPUs to work with the pytorch stack - everything appears to be working
2024-11-07 17:39:39 <@trix:fedora.im> i am only working on ROCm things because that's what i get paid for, but if someone wants vulkan, we can do that
2024-11-07 17:40:47 <@man2dev:fedora.im> And for whatever reason I'm getting way more crashes and I believe it's a c-group issue because Nvidia bypasses c-group protection and every kind of protection and sets the limits themselves
2024-11-07 17:41:36 <@man2dev:fedora.im> Which can easily cause overflow and crash or freezing of system
2024-11-07 17:42:00 <@trix:fedora.im> no surprise, i have no nvidia hw to help out with that problem 😊
2024-11-07 17:42:20 <@man2dev:fedora.im> Especially with LLMs that use a lot of memory.
2024-11-07 17:42:44 <@trix:fedora.im> if you want to use vulkan, try it on amd, i can help there.
2024-11-07 17:43:17 <@tflink:fedora.im> anyhow, anything else on AMD APUs?
2024-11-07 17:43:27 <@trix:fedora.im> nope. it's looking great!
2024-11-07 17:44:08 <@tflink:fedora.im> ok, moving on to ...
2024-11-07 17:44:11 <@tflink:fedora.im> !topic ROCm bundled llvm
2024-11-07 17:44:30 <@tflink:fedora.im> ah, this topic coming out of the fun of F40 and F41
2024-11-07 17:44:36 <@trix:fedora.im> yes.
2024-11-07 17:44:57 <@tflink:fedora.im> !info late releases of llvm have caused problems for ROCm in both F40 and F41
2024-11-07 17:45:12 <@tflink:fedora.im> have you been able to get ROCm working in F41 now?
2024-11-07 17:45:16 <@trix:fedora.im> i am trying to mitigate the risk of another very late llvm drop that breaks F42
2024-11-07 17:46:25 <@trix:fedora.im> F41, i am not sure.
Jeremy Newton had some low parts of the stack for 6.2.1 that i +1'd with you today.
2024-11-07 17:46:39 <@trix:fedora.im> so maybe they're dribbling in soon.
2024-11-07 17:46:41 <@tflink:fedora.im> oh, that update is still sitting in testing?
2024-11-07 17:46:46 <@trix:fedora.im> yes.
2024-11-07 17:47:30 <@tflink:fedora.im> we should poke jeremy about that outside the meeting, he's the only one who can move that forward
2024-11-07 17:47:35 <@trix:fedora.im> lld fix going in at the last day before the freeze screwed rocm over.
2024-11-07 17:48:16 <@trix:fedora.im> apu == happy tom, llvm == mad tom.
2024-11-07 17:48:32 <@tflink:fedora.im> have you brought the topic up with FPC or is this more of a "we'll be prepared if stuff is broken at the last minute again" kind of thing
2024-11-07 17:48:42 <@man2dev:fedora.im> 😂
2024-11-07 17:49:12 <@trix:fedora.im> fpc ?
2024-11-07 17:49:20 <@tflink:fedora.im> fedora packaging committee
2024-11-07 17:49:37 <@mystro256:fedora.im> !hi
2024-11-07 17:49:37 <@zodbot:fedora.im> None (mystro256)
2024-11-07 17:49:39 <@tflink:fedora.im> I suspect that bundling llvm will need a waiver
2024-11-07 17:49:57 <@mystro256:fedora.im> Is that needed anymore?
2024-11-07 17:50:03 <@mystro256:fedora.im> I thought policy changed
2024-11-07 17:50:20 <@tflink:fedora.im> note - the bundling is not turned on right now but it can be enabled in the spec files
2024-11-07 17:50:24 <@man2dev:fedora.im> Tom we do have the infra set up to work with upstream I believe. For example, what if the bundle and build was based off whatever and is using at the time
2024-11-07 17:50:43 <@tflink:fedora.im> there have been changes around bundling but I don't remember the details. I don't think that all bundling was allowed, though
2024-11-07 17:50:53 <@man2dev:fedora.im> But don't know if this would follow packaging guidelines
2024-11-07 17:51:17 <@trix:fedora.im> a problem Fedora and all the distro's have is clang forks.
2024-11-07 17:51:33 <@tflink:fedora.im> yeah, there are "just a few" of those
2024-11-07 17:51:48 <@tflink:fedora.im> and by "just a few" I mean a ton
2024-11-07 17:52:00 <@mystro256:fedora.im> https://docs.fedoraproject.org/en-US/packaging-guidelines/#bundling
2024-11-07 17:52:10 <@mystro256:fedora.im> "Fedora packages SHOULD make every effort to avoid having multiple..."
2024-11-07 17:52:19 <@trix:fedora.im> When i asked about triton, i was told bundling was fine, and no, toolchains would not handle it.
2024-11-07 17:52:21 <@mystro256:fedora.im> not a MUST
2024-11-07 17:52:38 <@tflink:fedora.im> "All packages whose upstreams allow them to be built against system libraries MUST be built against system libraries"
2024-11-07 17:52:59 <@tflink:fedora.im> but there's an argument in this case that we can't build against system llcm
2024-11-07 17:53:07 <@tflink:fedora.im> but there's an argument in this case that we can't build against system llvm
2024-11-07 17:53:16 <@mystro256:fedora.im> You MUST set Provides: bundled(llvm) == 18.0.0 (and same for clang, lld, compiler-rt etc)
2024-11-07 17:53:42 <@trix:fedora.im> things like triton are built on a snapshot and plain don't work with system clang.
2024-11-07 17:53:57 <@mystro256:fedora.im> Yeah the problem is that upstream doesn't intend for upstream linking, but it can be done
2024-11-07 17:54:11 <@mystro256:fedora.im> so they don't "allow" it in an official way
2024-11-07 17:54:26 <@mystro256:fedora.im> but it's trivial to allow
2024-11-07 17:54:56 <@mystro256:fedora.im> rocm-llvm is a light fork
2024-11-07 17:55:04 <@mystro256:fedora.im> of llvm 18 (for ROCm 6.2)
2024-11-07 17:55:10 <@tflink:fedora.im> so forking of llvm is discouraged but using it for anything outside the upstream project is only not disallowed?
2024-11-07 17:55:42 <@mystro256:fedora.im> Maybe we should ask FESCo?
2024-11-07 17:56:03 <@man2dev:fedora.im> I believe there are certain exceptions that can be set.
For example, I think it was syncthing that has an exemption and is bundling basically a bunch of go language libraries inside of it?
2024-11-07 17:56:14 <@tflink:fedora.im> yeah, having a conversation with FESCo about this might be wise - either stop the last minute drops with late bugfixes or let us bundle
2024-11-07 17:56:44 <@mystro256:fedora.im> yeah honestly they need to come down hard on LLVM
2024-11-07 17:56:51 <@mystro256:fedora.im> or we need to bundle
2024-11-07 17:57:35 <@trix:fedora.im> i don't think we really have a choice, there are other clang forks, llvm is just a crappy project for allowing forks
2024-11-07 17:58:06 <@mystro256:fedora.im> Approaching FESCo will allow us to codify it then
2024-11-07 17:58:17 <@tflink:fedora.im> I can put together a ticket for FESCo unless someone else wants to do it
2024-11-07 17:58:20 <@mystro256:fedora.im> having it in writing that llvm is a forktastic project
2024-11-07 17:58:28 <@mystro256:fedora.im> Please
2024-11-07 17:58:57 <@tflink:fedora.im> !action tflink to write up FESCo ticket about the LLVM late landing problem
2024-11-07 18:00:40 <@tflink:fedora.im> anything else on this topic?
2024-11-07 18:00:46 <@trix:fedora.im> tflink: could you also include the f40 blender problem in that ?
2024-11-07 18:01:12 <@trix:fedora.im> F40 is going to be eol-ed before that thing is fixed.
2024-11-07 18:01:26 <@mystro256:fedora.im> Well the blender issue could be easily resolved if they allowed static linking
2024-11-07 18:01:32 <@mystro256:fedora.im> not sure if the symbol change fixed it
2024-11-07 18:02:06 <@trix:fedora.im> llvm guys are not really testing their dependent packages.
2024-11-07 18:02:10 <@tflink:fedora.im> I'll talk to you about the blender problem, not sure I understand that one well enough to make a coherent ticket about it
2024-11-07 18:02:42 <@tflink:fedora.im> to be fair, they really don't have the bandwidth to do all that testing. it's not an excuse to toss the hand grenade and walk away, though.
IMHO
2024-11-07 18:03:09 <@man2dev:fedora.im> I think I'm getting why the FFmpeg people really love optimizing their code base by just writing it in assembly.
2024-11-07 18:03:35 <@trix:fedora.im> no one has the time to test. but late drops invalidate all the testing i did as i rolled out F40 and F41.
2024-11-07 18:04:09 <@trix:fedora.im> i really only test blender once or twice in a cycle, its a pain to set up.
2024-11-07 18:04:39 <@trix:fedora.im> but it is part of my normal build test.
2024-11-07 18:05:03 <@tflink:fedora.im> anything else on this topic?
2024-11-07 18:05:18 <@trix:fedora.im> sorry mad tom needs a smoke break ..
2024-11-07 18:06:35 <@tflink:fedora.im> no worries, it got rather crazy and it's a shame that ROCm wasn't quite working in the F41 release
2024-11-07 18:06:58 <@man2dev:fedora.im> Tom, if the blender people, or any of the packages have a build script already made inside the repo I recall seeing something about having support in the newer RPM specs.
2024-11-07 18:07:12 <@man2dev:fedora.im> Same story for tests
2024-11-07 18:08:00 <@mystro256:fedora.im> A suggestion
2024-11-07 18:08:10 <@mystro256:fedora.im> we should have Fedora model llvm after debian
2024-11-07 18:08:19 <@mystro256:fedora.im> where all llvm packages are versioned
2024-11-07 18:08:27 <@tflink:fedora.im> what does debian do with llvm? I'm not familiar with that
2024-11-07 18:08:30 <@mystro256:fedora.im> llvm is a metapackage instead of an actual package
2024-11-07 18:08:40 <@trix:fedora.im> you mean like suse / tumbleweed ?
2024-11-07 18:08:54 <@mystro256:fedora.im> right now we have llvm and llvm18. I propose we add llvm19, and llvm is a metapackage requiring latest
2024-11-07 18:09:01 <@mystro256:fedora.im> Exactly
2024-11-07 18:09:04 <@trix:fedora.im> not that i have been looking at suse, i love you guys, really i do.
2024-11-07 18:09:24 <@mystro256:fedora.im> no need to last minute change anything, new llvm's are an addition of a new package instead of an update
2024-11-07 18:09:50 <@trix:fedora.im> yes, that would be better.
2024-11-07 18:09:51 <@mystro256:fedora.im> it would save us a lot of grief
2024-11-07 18:09:53 <@tflink:fedora.im> I think that topic came up in a bz thread or on devel@, didn't it?
2024-11-07 18:10:13 <@mystro256:fedora.im> I think we need to include this in the FESCo "demands" :)
2024-11-07 18:10:24 <@tflink:fedora.im> if I'm remembering correctly, there was resistance to doing that in Fedora but I don't remember the details
2024-11-07 18:10:36 <@trix:fedora.im> fwiw, suse has no lld-devel, so we need rocm-llvm there to get basic comgr going.
2024-11-07 18:11:28 <@trix:fedora.im> i would like rocm-llvm to be general enough that fedora-like distro could use it as is.
2024-11-07 18:11:55 <@tflink:fedora.im> I really hate my mail provider's web interface. I'll see if I can find the thread on devel@ (or wherever that was) after the meeting
2024-11-07 18:12:00 <@mystro256:fedora.im> sure, you can keep the logic in the spec file even if fedora doesn't use it
2024-11-07 18:12:43 <@trix:fedora.im> its working on tumbleweed now, you can see a few things need to be handled. but not much.
2024-11-07 18:13:11 <@trix:fedora.im> this is similar to getting fedora things working on rhel.
2024-11-07 18:13:35 <@mystro256:fedora.im> I'm assuming RHEL is probably fine with bundling LLVM
2024-11-07 18:14:02 <@trix:fedora.im> yes. i think not asking for 1/2 engineer to do the work would be a win for them.
2024-11-07 18:15:33 <@tflink:fedora.im> !info late llvm releases for Fedora continue to cause problems for ROCm, there are proposals as to how to deal with this problem but for now, the plan is to submit a ticket to FESCo and go from there once that conversation has happened
2024-11-07 18:15:45 <@tflink:fedora.im> anything else?
2024-11-07 18:15:51 <@tflink:fedora.im> on this topic
2024-11-07 18:16:14 <@trix:fedora.im> it is mostly working, i am up to the hip libs
2024-11-07 18:16:36 <@trix:fedora.im> and plan on having it functionally working for 6.3
2024-11-07 18:16:47 <@tflink:fedora.im> it?
2024-11-07 18:16:49 <@trix:fedora.im> as i think 6.3 will be the cutoff in F42
2024-11-07 18:16:55 <@trix:fedora.im> bundled llvm.
2024-11-07 18:17:17 <@tflink:fedora.im> ah. it's still disabled by default, though. right?
2024-11-07 18:17:23 <@trix:fedora.im> yes.
2024-11-07 18:18:10 <@tflink:fedora.im> ok, moving on to ...
2024-11-07 18:18:14 <@tflink:fedora.im> !topic open floor
2024-11-07 18:18:20 <@tflink:fedora.im> any other topics that folks wanted to bring up?
2024-11-07 18:19:28 <@trix:fedora.im> question in ai/ml.
2024-11-07 18:19:49 <@trix:fedora.im> heavy builders for fedora, anything you can speak of ?
2024-11-07 18:20:12 <@man2dev:fedora.im> I don't think I understand the question?
2024-11-07 18:20:19 <@tflink:fedora.im> nothing at the moment, I don't think that the budget for 25 has been fully decided yet
2024-11-07 18:21:30 <@trix:fedora.im> pytorch takes a long time to build, in past life, i asked for hw to alleviate that .. hw == heavy builders, something much better than our basic builders
2024-11-07 18:22:08 <@tflink:fedora.im> there are heavy builders in koji but those are going EOL in 25 AFAIK. the proposal was to get some machines to replace some of the heavy builders that are going away
2024-11-07 18:23:26 <@trix:fedora.im> if we had a proposal for testing infra, it would be something i could shop around at amd for support.
2024-11-07 18:23:45 <@man2dev:fedora.im> https://dvprogram.state.gov
2024-11-07 18:23:53 <@trix:fedora.im> atm saying a bunch of machines on / under my desk doesn't cut it.
2024-11-07 18:24:03 <@tflink:fedora.im> I proposed it but I don't know what's happening with it after everyone moved around organizationally
2024-11-07 18:24:28 <@tflink:fedora.im> would it help to have machines in my basement?
2024-11-07 18:24:47 <@tflink:fedora.im> :-D
2024-11-07 18:24:52 <@man2dev:fedora.im> Universal-blue.org
2024-11-07 18:25:07 <@trix:fedora.im> if we could hook them up so others could use them or report the results publicly that would be good.
2024-11-07 18:25:27 <@tflink:fedora.im> working on it but progress is slow now that it's only in my spare time
2024-11-07 18:25:28 <@trix:fedora.im> i don't have the bw to do that for my own machines.
2024-11-07 18:26:01 <@tflink:fedora.im> Mohammadreza Hendiani: I don't understand what you're getting at with the link
2024-11-07 18:27:26 <@man2dev:fedora.im> A bunch of DevOps people run this project and they do fairly big builds because they build Fedora from the ground up. I don't remember what their build system uses but I know they have a lot of infrastructure set up for big builds
2024-11-07 18:28:08 <@tflink:fedora.im> I've talked with them a bit in the past but never really got into the details of their buildsystem
2024-11-07 18:29:24 <@trix:fedora.im> i try not to get involved in other people's buildsystems, just dealing with fedora's is enough, everyone solves the same problems but in different 'best' ways.
2024-11-07 18:30:04 <@tflink:fedora.im> anyhow, we're pretty much out of time unless there are any last minute topics
2024-11-07 18:30:28 <@trix:fedora.im> good meeting guys!
2024-11-07 18:30:35 <@tflink:fedora.im> thanks for coming, everyone
2024-11-07 18:30:44 <@tflink:fedora.im> !endmeeting