<@tflink:fedora.im>
16:31:19
!startmeeting fedora-ai-ml-sig
<@meetbot:fedora.im>
16:31:20
Meeting started at 2024-09-26 16:31:19 UTC
<@meetbot:fedora.im>
16:31:21
The Meeting name is 'fedora-ai-ml-sig'
<@tflink:fedora.im>
16:31:28
there we go, matrix is cooperating
<@tflink:fedora.im>
16:31:42
Who all's here for the glorious ai-ml sig meeting?
<@tflink:fedora.im>
16:31:43
!hi
<@zodbot:fedora.im>
16:31:44
Tim Flink (tflink)
<@jsteffan:fedora.im>
16:32:15
!hi (lurking in the background)
<@zodbot:fedora.im>
16:32:15
Sorry, I can only look up one username at a time
<@man2dev:fedora.im>
16:32:24
!hi
<@zodbot:fedora.im>
16:32:24
Mohammadreza Hendiani (man2dev)
<@trix:fedora.im>
16:32:32
!hi made it on time .. yeah!
<@zodbot:fedora.im>
16:32:33
Sorry, I can only look up one username at a time
<@tflink:fedora.im>
16:32:58
the !hi macro is somewhat limited :)
<@trix:fedora.im>
16:33:20
poor little macro.. needs some ai injection :)
<@trix:fedora.im>
16:33:50
me first ?
<@tflink:fedora.im>
16:34:13
I was going to wait another minute to see if anyone else joins but I suppose we have most of the usual suspects
<@trix:fedora.im>
16:34:19
k
<@tflink:fedora.im>
16:34:35
!topic PyTorch on ROCm APUs
<@trix:fedora.im>
16:34:54
what you guys think about having this as a feature for F42 ?
<@tflink:fedora.im>
16:35:18
I like the idea but I'm a little worried about making it happen well
<@mystro256:fedora.im>
16:35:24
!hi
<@zodbot:fedora.im>
16:35:25
None (mystro256)
<@tflink:fedora.im>
16:36:22
especially since we're already having buildtime problems with certain rocm packages - if we start adding shader families, that's not going to get better
<@trix:fedora.im>
16:36:45
I have a M780 now, the gfx1103, it would be nice to get this out experimental and doing something useful.
<@trix:fedora.im>
16:38:47
maybe we don't get as far as pytorch, but if we can get rocblas going, then llama-cpp and similar can benefit
<@tflink:fedora.im>
16:39:25
yeah, I don't have any real objection to adding targets so long as we can still build everything
<@tflink:fedora.im>
16:39:34
I just wonder if it's still too early to have a feature for it, I guess
<@man2dev:fedora.im>
16:39:59
I think it's not a bad idea to start testing the waters and see how it goes
<@trix:fedora.im>
16:40:11
i am depending on other folks covering other apu's. so trying to get folks interested.
<@trix:fedora.im>
16:40:43
i guess if no objections we can say yes and move on ?
<@tflink:fedora.im>
16:40:46
I do have an external GPU that I wanted to start working with but ENOTIME and I have a hard time seeing that change any time soon :(
<@tflink:fedora.im>
16:41:12
!info at the moment, support for APUs in ROCm is experimental
<@man2dev:fedora.im>
16:41:17
Do we have the infra for APU?
<@trix:fedora.im>
16:41:36
theoretically, yes.
<@trix:fedora.im>
16:41:48
in practice there are too many of them.
<@tflink:fedora.im>
16:41:53
!info adding support for gfx1103 (Ryzen 780M amongst others) would make testing with those targets easier
<@tflink:fedora.im>
16:42:07
define having infra - we have around as much as we have for anything else
<@tflink:fedora.im>
16:42:35
it'd be harder to include in any future automated testing against real HW but since we don't have any of that now ... it's not a huge difference
<@trix:fedora.im>
16:42:41
infra == sitting on Tom's desk :)
<@man2dev:fedora.im>
16:43:16
I just never built a package that supported APU hardware, so I never checked if the infra is there or not
<@man2dev:fedora.im>
16:43:46
😂😂
<@tflink:fedora.im>
16:43:47
the build is done CPU only - you only need the specific HW when using the compiled package
<@tflink:fedora.im>
16:44:00
either that or my basement :-/
<@tflink:fedora.im>
16:44:51
I also don't see any objections
<@man2dev:fedora.im>
16:45:06
I don't know. I don't have enough experience as to have a solid opinion on this. But in theory I don't see anything wrong with it.
<@tflink:fedora.im>
16:45:40
PROPOSAL: enable gfx1103 with a future rebuild of ROCm and start poking at it. if things are working well, consider submitting that as a feature for F42
<@tflink:fedora.im>
16:45:49
ack/nak/patch?
<@tflink:fedora.im>
16:46:07
or am I being too formal with this?
<@tflink:fedora.im>
16:46:15
yeah, probably am
<@germano:fedora.im>
16:46:22
I am available to test the 680M too since it is quite new (just 2 years old)
<@tflink:fedora.im>
16:46:36
!info the plan is to enable gfx1103 with a future rebuild of ROCm and start poking at it. if things are working well, consider submitting that as a feature for F42
<@trix:fedora.im>
16:47:10
amend to gfx1103 and gfx1035
<@tflink:fedora.im>
16:47:12
the problem with 680M is that's gfx10
<@tflink:fedora.im>
16:47:20
ok
<@tflink:fedora.im>
16:47:23
!undo
<@tflink:fedora.im>
16:47:47
!info the plan is to enable gfx1103 and gfx1035 with a future rebuild of ROCm and start poking at it. if things are working well, consider submitting that as a feature for F42
<@tflink:fedora.im>
16:48:02
anything else on this?
<@trix:fedora.im>
16:48:03
groovy
<@tflink:fedora.im>
16:48:30
moving on
<@tflink:fedora.im>
16:48:37
!topic Vulkan Issue Update
<@tflink:fedora.im>
16:48:43
Mohammadreza Hendiani: you're up
<@man2dev:fedora.im>
16:49:42
I already went over the issue, but if anyone's not up to date, the problem is that we are breaking the Vulkan spec as it is.
<@man2dev:fedora.im>
16:50:05
By adding .%arch at the end of spec
<@man2dev:fedora.im>
16:52:23
Vulkan does not have official support for x86, and if you then run code it will supposedly fail silently, as far as I have researched.
<@tflink:fedora.im>
16:52:50
I'm not sure I follow 100% - it sounds like this is something that needs to be fixed upstream? or is that comment about how multiarch support would need to be fixed upstream?
<@tflink:fedora.im>
16:53:14
aren't a bunch of games ia32 only? or is that just steam itself?
<@man2dev:fedora.im>
16:53:30
No they never supported it in the first place
<@tflink:fedora.im>
16:54:06
so it's more of a "we're trying to support (on paper) something that upstream doesn't support and it's breaking things"?
<@man2dev:fedora.im>
16:54:31
It's just there because they had to write the code for Windows
<@tflink:fedora.im>
16:55:43
either way, I don't pretend to be terribly knowledgeable about vulkan and am inclined to go with whatever airlied says
<@tflink:fedora.im>
16:55:57
what ai-ml bits are trying to use vulkan?
<@man2dev:fedora.im>
16:56:09
Yes, I believe there is a solution that lets us have 32 bits and 64 bits without breaking things, but I haven't found a scalable one yet. I have found one that is doable with an environment variable, but it doesn't support elevated privileges.
<@man2dev:fedora.im>
16:56:13
https://github.com/KhronosGroup/Vulkan-Tools/blob/main/BUILD.md#building-on-linux
<@man2dev:fedora.im>
16:57:47
I haven't shipped any yet because Vulkan support isn't very great at this point, but Nvidia stuff does use Vulkan layers as well
<@tflink:fedora.im>
16:57:59
ah, ok.
<@tflink:fedora.im>
16:58:38
it sounds like there is a proposal to make things work better on at least x86_64 but there seems to be at least some hesitation to applying the proposal?
<@man2dev:fedora.im>
16:58:38
Nvidia container Toolkit if I'm not mistaken.
<@man2dev:fedora.im>
16:58:49
Does have vulkan support
<@tflink:fedora.im>
16:59:29
!info there are some (as of yet unnamed) ai-ml related bits that could use vulkan but the current state of vulkan support in Fedora makes that difficult
<@tflink:fedora.im>
16:59:44
!link https://bugzilla.redhat.com/show_bug.cgi?id=2314042
<@tflink:fedora.im>
17:00:18
!info a fix has been proposed but there have been concerns raised about that fix and more discussion will happen around the linked rhbz
<@tflink:fedora.im>
17:00:25
does that sum things up pretty well?
<@man2dev:fedora.im>
17:01:39
I don't have a solid fix yet. Other than bringing back the spec, everything else is just kind of working, but it's not really a good fix
<@man2dev:fedora.im>
17:01:45
Yeah
<@tflink:fedora.im>
17:01:57
ok. anything else to discuss on this topic?
<@man2dev:fedora.im>
17:02:14
No
<@tflink:fedora.im>
17:02:35
cool, thanks for the update and info
<@tflink:fedora.im>
17:02:48
!topic Fedora Chatbot Update?
<@tflink:fedora.im>
17:03:06
not sure any of the relevant folks are around today
<@man2dev:fedora.im>
17:03:29
If you want we can skip
<@man2dev:fedora.im>
17:03:42
And just talk in group
<@tflink:fedora.im>
17:03:54
oh, yeah - did my notes on your other topic answer your questions?
<@tflink:fedora.im>
17:04:02
I just assumed they did but should have asked
<@man2dev:fedora.im>
17:04:48
Yeah it was good
<@tflink:fedora.im>
17:05:06
ok, sorry for assuming before asking - that's why I skipped it
<@tflink:fedora.im>
17:05:35
!info none of the folks involved in the Fedora Chatbot proposal are around for the meeting
<@tflink:fedora.im>
17:05:49
that's all of the submitted topics for today which brings us to
<@tflink:fedora.im>
17:05:53
!topic Open Floor
<@tflink:fedora.im>
17:06:06
anyone have other topics to bring up today?
<@trix:fedora.im>
17:07:20
6.2.1 is coming along.
<@trix:fedora.im>
17:07:52
compat llvm18 is going to slap us in the face soon.
<@tflink:fedora.im>
17:08:02
!info rocm 6.2.1 rebuild is moving along
<@tflink:fedora.im>
17:08:22
woo, only need to resubmit that message 20 times ... sigh
<@tflink:fedora.im>
17:09:07
it's already in rawhide and as I understand the changes that Jeremy Newton made to rocm-compilersupport, the rawhide builds should already be using the llvm18 compat packages
<@trix:fedora.im>
17:09:11
is there anything to do about compat so we don't do this again in llvm19
<@tflink:fedora.im>
17:09:31
llvm18 has landed in F41 updates-testing but hasn't been pushed stable yet
<@trix:fedora.im>
17:10:03
clang18 has a problem with blender.. so i dont think that its the last rev.
<@tflink:fedora.im>
17:10:33
semi-fragile and complex dep chains are so much fun :)
<@tflink:fedora.im>
17:11:21
I think that we need to watch for the llvm20 change proposal for F42 and restart the conversation around making this less painful for folks who aren't the llvm maintainers
<@trix:fedora.im>
17:11:36
too much depends on 'soak time' in rawhide so not doing llvm18 or llvm17 in rawhide has consequences.
<@tflink:fedora.im>
17:11:50
the promised changes to make our lives less painful didn't actually happen
<@tflink:fedora.im>
17:11:55
for F41 and llvm19
<@tflink:fedora.im>
17:12:37
but that's still a little while out
<@tflink:fedora.im>
17:13:38
either way, let's keep track of the timing of the llvm19 stuff landing in F41 and what has been required to deal with that late-landing change so that we can go into the F42 change discussion with concrete "these are the exact problems" items
<@tflink:fedora.im>
17:14:11
the recent change to rocm-compilersupport should make future changes less painful, though
<@tflink:fedora.im>
17:14:16
_should_
<@trix:fedora.im>
17:14:49
as i said the last time, the problem is testing on F41 is unplanned, as the testing for F41 happened in rawhide before the branching.
<@trix:fedora.im>
17:15:19
now folks have oob testing in addition to rawhide devel.
<@tflink:fedora.im>
17:15:23
!info let's gather more concrete information on how the late-landing llvm changes are affecting us for the eventual F42 change proposal discussion - specific issues are more likely to garner change
<@tflink:fedora.im>
17:15:42
anything else on this we can discuss or fix today?
<@nirik:matrix.scrye.com>
17:16:21
Oh, hey... I had something to mention...
<@nirik:matrix.scrye.com>
17:17:18
someone mentioned (I think in devel?) the other day that llama-cpp wasn't building in rawhide... I looked at it and the problem seems to be that there are two files in sources. The current version and an old version that never was uploaded? I think fixing the sources file there might get it building again.
<@trix:fedora.im>
17:18:08
nirik, poke me after meeting, llama-cpp is mine and i haven't given it any luv recently.
<@nirik:matrix.scrye.com>
17:18:40
can do. Oddly I do see it building in koschei now that I look, so I will have to remember what the problem was.
<@man2dev:fedora.im>
17:18:42
Llama.cpp is an always moving target. It's practically unrecognizable from its original form due to fast changes in AI.
<@tflink:fedora.im>
17:19:18
!info llama-cpp is having issues building in rawhide, may need some (hopefully small) changes
<@man2dev:fedora.im>
17:19:31
It's so fast moving that they do a tag build with each commit.
<@tflink:fedora.im>
17:20:12
it sounds like that topic has been addressed for now
<@tflink:fedora.im>
17:20:16
anything else?
<@tflink:fedora.im>
17:20:36
has been, will be - whatever tense makes most sense :)
<@trix:fedora.im>
17:21:16
F41 updates seem slow, any reason ?
<@tflink:fedora.im>
17:22:03
freeze lifts today so there shouldn't be any other delays that I know of
<@tflink:fedora.im>
17:22:32
s/today/tuesday
<@tflink:fedora.im>
17:22:39
I know what day today is, I swear
<@trix:fedora.im>
17:22:56
so expected slow was the freeze, i guess i need to pay more attention to stuff.
<@man2dev:fedora.im>
17:22:59
Thu
<@trix:fedora.im>
17:23:40
6.2.0 dribbled in really really slow, sorry sorry..
<@tflink:fedora.im>
17:24:33
there shouldn't be any other issues until final freeze which I believe will be 2024-10-15
<@trix:fedora.im>
17:25:16
did we want to get 6.2.1 into F41 ?
<@tflink:fedora.im>
17:25:48
we need to rebuild for the impending llvm18 compat packages, no? or do we just need to rebuild the core bits that depend directly on llvm?
<@trix:fedora.im>
17:26:21
at a minimum.. i have trust issues with compat.
<@trix:fedora.im>
17:26:50
so will rebuild be 6.2.0 or 6.2.1 ?
<@tflink:fedora.im>
17:27:09
I think you're in the best position to make that call, honestly
<@trix:fedora.im>
17:27:48
baaaa ... ok, i'll think about it and say on list later.
<@tflink:fedora.im>
17:28:02
sounds good, let me know if you want help
<@tflink:fedora.im>
17:28:16
anyhow we're pretty much out of time at this point
<@tflink:fedora.im>
17:28:29
unless there are urgent things to bring up, I'll close out the meeting
<@trix:fedora.im>
17:28:41
sounds good, thanks!
<@tflink:fedora.im>
17:29:27
Thanks for coming, everyone. I'll send out minutes shortly
<@tflink:fedora.im>
17:29:40
!info next ai-ml SIG meeting will be Thursday, October 10
<@tflink:fedora.im>
17:29:44
!endmeeting