16:07:11 #startmeeting ELN (2023-10-20)
16:07:11 Meeting started Fri Oct 20 16:07:11 2023 UTC.
16:07:11 This meeting is logged and archived in a public location.
16:07:11 The chair is sgallagh. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:07:11 Useful Commands: #action #agreed #halp #info #idea #link #topic.
16:07:11 The meeting name has been set to 'eln_(2023-10-20)'
16:07:11 The chair is sgallagh. Information about MeetBot at https://fedoraproject.org/wiki/Zodbot#Meeting_Functions.
16:07:19 #meetingname eln
16:07:19 The meeting name has been set to 'eln'
16:07:24 #topic Init Process
16:07:33 .hi
16:07:33 sgallagh: [hellomynameis sgallagh]
16:07:34 sgallagh: sgallagh 'Stephen Gallagher'
16:07:35 .hi
16:07:36 yselkowitz: [hellomynameis yselkowitz]
16:07:37 yselkowitz: yselkowitz 'Yaakov Selkowitz'
16:08:23 Howdy
16:09:50 I don't have anything specific on the agenda for today
16:09:56 #topic Agenda Topics
16:10:59 Anyone?
16:11:32 not sure if it's a meeting topic, but I had a thought about EBS
16:12:05 #topic ELNBuildSync
16:12:08 It is now!
16:12:25 Oh, for posterity:
16:12:27 #link https://sgallagh.wordpress.com/2023/10/13/sausage-factory-fedora-eln-rebuild-strategy/
16:12:58 thanks for that btw, nice to have something to point to
16:13:01 wrt EBS's issues building sets of packages which require a particular build order...
16:13:38 I assume this is specifically related to OCAML?
16:13:40 I was wondering if it 1) is possible and 2) would help if EBS were to have certain packages which it should process first in a batch
16:13:47 ocaml, llvm, etc.
16:13:55 so iow
16:14:26 if ocaml or llvm were to be included in a batch, they should be processed first
16:15:08 iiuc this would increase the likelihood of EBS successfully handling the rest of the packages that are typically built with those
16:15:28 because it keeps retrying as long as the failures don't repeat
16:15:44 Not a bad idea, for known problem cases. Though it gets tricky if there's more than one layer like that.
16:15:44 so, taking ocaml as an example
16:15:54 (e.g. if we have to build llvm, then clang, then everything else)
16:16:17 wrt llvm, I think the version requirements take care of that
16:16:40 Meaning they'll fail and get reordered in subsequent RebuildAttempts?
16:17:01 so if you build llvm first, then the rest of that ecosystem, yes some will fail until clang has been built, but EBS will retry and they should pass the second time?
16:17:12 OK, I'll buy that.
16:17:38 no guarantees that will solve all our problems with these, but I think it would improve our chances
16:18:03 Thinking of ocaml ... isn't there a 1st, 2nd, 3rd package that needs to be done for a successful build?
16:18:04 wrt ocaml, "seeding" ocaml first will break anything that has dependencies between it and ocaml
16:18:20 I have an order that I've been using
16:18:47 I'm just wondering if we need to go at least one step further. A 1st and 2nd build. So llvm, then clang type thing.
16:18:58 Or do you think that would make things too complicated.
16:19:07 That's rapidly getting complicated.
16:19:21 so for ocaml, I do:
16:19:22 GROUP 1: ocaml
16:19:22 GROUP 2: graphviz ocaml-ocamlbuild ocaml-csexp ocaml-pp ocaml-labltk
16:19:22 GROUP 3: ocaml-dune ocaml-findlib
16:19:22 GROUP 4: brltty hivex libnbd ocaml-augeas ocaml-cppo ocaml-curses ocaml-fileutils ocaml-libvirt ocaml-re
16:19:23 GROUP 5: ocaml-calendar ocaml-gettext supermin
16:19:25 GROUP 6: libguestfs virt-top
16:19:29 GROUP 7: virt-v2v
16:19:30 With a single "first pass", I think I have a hack that would work.
16:19:47 but just getting ocaml in first assures that nothing can be mistakenly built against the previous version
16:20:34 anything with multiple ocaml dep layers will fail, but those that don't will build on the first pass, and subsequent passes should get more and more built until they're all done
16:20:37 Ah, if that will work, then that is good.
16:20:45 I *think* it will work
16:20:57 it certainly shouldn't make things *worse*
16:21:07 It's worth a shot.
16:21:48 Regarding the "hack", what I can do is check a batch for specific packages and if they exist, mark all others as "failing" the first pass so they get retried on pass two.
16:22:24 (Mark them as failed rather than firing them off, I mean)
16:22:47 I can't speak for the implementation, just the concept
16:22:52 That might work.
16:23:05 Right, I think the concept makes sense and would be an improvement.
16:23:33 I'm mentioning the implementation for two reasons: 1) to get feedback on it and 2) so I remember later what I was thinking :)
16:27:07 OK, I'm not hearing any wild disagreement here.
16:27:25 yselkowitz: Mind opening a ticket and I'll get to that probably on Monday?
16:27:50 Both the proposal, and the possible implementation sound ok to me.
16:28:26 ok
16:28:28 There's one tiny edge-case I can see with this, but it's fairly unlikely.
16:28:35 this->this hack
16:28:44 ??
16:29:19 Actually, I just thought of another. Let me write them down...
16:30:25 1) If the first pass fails to build e.g. ocaml, the second pass will still happen.
16:30:46 ouch
16:30:54 2) If the first pass succeeds, but all of the second pass fails, they won't get the one auto-retry we have in place for dealing with test flakes
16:31:03 2) is probably ignorable.
16:31:11 The first one concerns m
16:31:12 me
16:31:30 So I may need to do this differently.
16:33:11 I don't know what a good answer here would be. If ocaml or llvm fails (validly; let's set aside flakes for the moment), what should we do about the further packages?
16:33:24 Not every package in the batch is necessarily related to ocaml or llvm.
16:34:48 Sounds like a dep map is the answer, but realistically, cancel anything that is related to the failed
16:35:03 But we don't actually have that information.
16:35:23 the rest of llvm will just fail
16:35:38 Especially in a world that includes automatic BuildRequires
16:35:53 because they are version locked
16:36:27 yselkowitz I'm more concerned about things like OCAML (or python minor version bumps).
16:36:45 Where the builds would succeed, but have incorrect Requires
16:36:57 shouldn't be worse than it is now
16:37:11 python is tagged in though
16:37:21 Right, so python is probably okay.
16:37:28 yes, if ocaml fails, then ocaml-* might be built against the old version
16:37:31 OCAML we currently just outright refuse to attempt
16:38:02 (Both the main package and the ocaml-* ones
16:38:28 attempt? or tag?
16:38:38 Attempt, I think
16:39:15 * yselkowitz thinks they're just not tagged
16:39:42 No, you're right.
16:40:16 It's just skipping the tag.
16:40:47 OK, so I guess this would still be an improvement.
16:41:36 Do we have any examples of this sort of problem outside OCAML?
16:42:08 llvm and ocaml are the ones that come to mind
16:42:33 because they are not tagged (for good reason) and build order matters
16:43:14 LLVM is only an issue around major version bumps (which is twice a year). Is OCAML more frequent?
16:43:29 llvm is micro-version locked
16:43:36 What do you mean?
16:44:39 e.g. clang requires the exact same version of llvm
16:44:59 and then other components require clang(-filesystem)
16:45:31 You sure? It looks like it's only bound to the soname
16:45:38 so doing llvm first means that clang should build on the next pass, leaving the others to build on the subsequent pass
16:45:55 `BuildRequires: %{llvm_pkg_name}-devel = %{version}`
16:47:27 Are you () kidding me?
16:47:54 no kidding, I've built both stacks manually enough times to know
16:48:44 I really have to question why they are separate packages, if they're that interrelated
16:49:51 So ... big question here. and maybe I need to read your blog better. Are we not tagging in the Fedora version into the side-tag before a build? ... or is that what you are meaning by "tagging in" .... and ... now the tagging part of this conversation just made sense.
16:50:00 not for llvm and ocaml
16:50:11 they're not binary compatible between fedora and eln
16:50:11 Sorry ... meant to delete that but hit enter instead.
16:50:16 Right, certain packages with known issues are excluded.
16:50:31 Because we know the Fedora and ELN versions are incompatible
16:50:56 In the case of LLVM and OCAML, the ELN runtime cannot run anything built with the Fedora version of the compiler
16:51:04 mind you, ocaml and binary compatibility don't go in the same paragraph, if you sneeze at it, it breaks :-)
16:51:14 *laughs*
16:51:38 * yselkowitz speaks from many years of experience
16:55:21 OK, I think you're still right that we need to have a list of special packages that need to be built before we attempt any others, but it still doesn't answer the question of what to do with the rest of the batch.
16:55:57 Especially if the `ocaml` or `llvm` package ends up failing
16:57:09 I know it's not *safe* to just fire it off, but we really only have two choices:
16:57:22 the rest of the batch should wait the first time
16:57:44 1) Fire off the batch and let ocaml-* break
16:57:44 2) Abort the batch and let that mean that some unrelated packages may not get attempted.
16:58:07 I mean *after* we've tried the ocaml build on its own *and it fails*
16:58:16 #1
16:58:43 because we don't control what else is in the batch
16:58:49 Yeah, I think that's the least-bad option, but I was kind of hoping someone had a great idea for a "third way" :)
16:58:59 Technically, #1 is what we are currently doing. So ... in theory, it isn't any worse than right now.
16:59:26 I want to say #2, because it is safer, but what if it happens before a long weekend and/or holiday and nobody looks at it for a while.
16:59:47 Well, #2 only seems safer; I don't think it actually is.
16:59:48 unless EBS gains some interactiveness to prompt for and respond to intervention
17:00:14 Because what if the remainder of the batch includes a soname bump for a major library?
17:00:39 #1 is the only safe option
17:00:51 it's no worse than the current situation
17:01:07 I guess in the specific case of `ocaml-*` I could maybe skip just those packages if the `ocaml` build has failed.
17:01:10 and has the potential at least to improve things
17:01:11 correct ... it's not worse than currently, and at the same time, better than currently.
17:01:32 But that's a one-off, not a general solution
17:01:43 what's a one-off
17:01:45 ?
17:02:02 and we're out of time
17:02:06 Adding a hack so if `ocaml` fails, we auto-skip `ocaml-*`
17:02:17 But let the others run
17:02:46 The channel is free after our meeting, so we can run long if we want to keep going on this.
17:03:01 ok
17:04:38 That helps, but still doesn't skip any others which might have a BuildRequires on ocaml.
17:05:34 we could make more lists for "skip these if this doesn't build" but no idea how complicated that is
17:05:37 While auto BuildRequires does create a problem, we have the advantage that fedora just built those packages for rawhide, so we could get proper buildreqs from the rawhide srpms
17:05:43 Sure... but at this point we're mitigating, not solving.
17:06:23 jforbes: Sure... if they aren't conditionalizing BuildRequires too
17:06:42 sgallagh: if they are conditionalizing, they aren't auto BuildRequires
17:06:52 not necessarily true
17:07:21 given that the SRPM download is heavy, it would make sense to only use that when the package is using auto buildrequires
17:08:13 not sure what auto buildreqs have to do with this?
17:08:26 Oh, I see what you mean, if they conditionalize something else in the build, it might bring in a new BuildRequires that isn't in rawhide
17:08:36 Yes
17:08:55 yselkowitz: I was thinking if you build a depmap for BuildRequires, you can know truly what to cancel when a package fails.
17:08:57 auto-BR is one way that happens, but not the only one
17:09:25 jforbes: If that was possible, we'd be using it to do ordering right from the get-go. Unfortunately, it's not reliable
17:09:27 sgallagh: sure, but for non auto-BR, we can use the lighter weight spec glance
17:09:40 just because package A depends on B, and both land in the same batch, does not necessarily mean they need to be built in order
17:09:52 and besides, most packages are cross-tagged
17:10:15 And in some cases, they are circularly dependent and trying to order them will get it wrong
17:10:19 here we're talking about very specific package sets which are not tagged and are always build-order sensitive
17:10:45 Point: saying "are not tagged" is confusing.
17:10:57 not cross-tagged
17:11:03 We probably need to say "the Rawhide version cannot be in the buildroot"
17:11:46 Since that's the effect of not tagging it into the batch side-tag
17:11:55 Was just thinking about how we solved this with conary and the rPath build system a while ago, but we had the advantage of putting bootstrap build info into the recipes, so if you had a circular dep, the build order was added so that the buildsystem knew how to order
17:12:19 Yeah, that was one of the requirements we had on Modularity too.
17:12:19 * yselkowitz feels like this is going off the rails
17:12:28 Almost as if this is a hard problem!
17:13:07 jforbes: If you haven't had a chance to read it, I did a write-up of our approach the other day; I linked it earlier in the meeting.
17:14:02 OK, this clearly needs some more thought before I jump to any kind of implementation. yselkowitz, please file that ticket after the meeting.
17:14:19 We're well over time and should probably close out the meeting
17:16:05 thanks sgallagh
17:16:11 #endmeeting
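
The approach discussed under the ELNBuildSync topic can be sketched roughly as follows. This is only an illustration in Python, not EBS code: `run_batch`, `SEEDS`, and the `build` callback are hypothetical names, and the "skip `ocaml-*` if `ocaml` fails" rule is the one-off hack floated in the discussion, not an agreed design. The sketch builds the seed packages first while everything else waits, then keeps retrying the rest for as long as the list of failures keeps changing. It does not model the single auto-retry for test flakes.

```python
from fnmatch import fnmatch

# Hypothetical table: seed package -> glob of packages to skip if the seed fails.
# None means "skip nothing"; dependents are simply expected to fail on their own.
SEEDS = {"ocaml": "ocaml-*", "llvm": None}


def run_batch(batch, build, seeds=SEEDS):
    """Build seed packages first, then retry the rest until failures stop changing.

    `build` is a caller-supplied callable that takes a package name and
    returns True on success. Returns three lists: (built, failed, skipped).
    """
    built, failed, skipped = [], [], []
    pending = [p for p in batch if p not in seeds]

    # First pass: seed packages only; the rest of the batch waits.
    for seed in (p for p in batch if p in seeds):
        if build(seed):
            built.append(seed)
            continue
        failed.append(seed)
        pattern = seeds[seed]
        if pattern:
            # Seed failed: skip its namesake packages, let the others run.
            skipped += [p for p in pending if fnmatch(p, pattern)]
            pending = [p for p in pending if not fnmatch(p, pattern)]

    # Later passes: retry while the list of failing packages keeps changing.
    previous = None
    while pending and pending != previous:
        previous = pending
        pending = [p for p in previous if not build(p)]
        built += [p for p in previous if p not in pending]

    return built, failed + pending, skipped
```

Packages with several dependency layers (the GROUP 1 to GROUP 7 ordering above) fail on early passes and succeed on later ones, which matches the "more and more built until they're all done" expectation. Anything that BuildRequires ocaml without matching `ocaml-*` (supermin, libguestfs, and so on) is not caught by the skip rule, which is the gap raised near the end of the meeting.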