[PATCH] D120781: [IRLinker] materialize Functions before moving any

Duncan P. N. Exon Smith via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 10 15:43:36 PST 2022


dexonsmith added inline comments.


================
Comment at: llvm/lib/Linker/IRMover.cpp:1605-1613
+          for (Use &U : I.operands())
+            if (auto *Callee = dyn_cast<Function>(U)) {
+              // TODO: shouldLink?
+              if (!Callee->isMaterializable())
+                continue;
+              if (Error E = Callee->materialize())
+                return E;
----------------
I worry this could cause a big compile-time hit for ThinLTO, depending on the exact code paths it uses.

Here's part of a testcase for situation I'm worried about:
```
; f1.ll
define weak_odr void @f1() {
  ret void
}

; f2.ll
define linkonce_odr void @f1() {
  ret void
}
define void @f2() {
  call @f1()
  ret void
}
```
The scenario (more scaffolding needed):
- Linker decides to take `@f1` from f1.ll and `@f2` from f2.ll.
- The worklist for `f2.bc` contains just `@f2`, and `@f1` will (never) be materialized from `f2.bc`.
- IIUC, this patch changes it so that `@f1` is eagerly materialized from `f2.bc` even though it will be ignored.

In the general case, this would effectively defeat lazy-loading of templates and inline functions.

@tejohnson , do you know how to set up the above scenario for ThinLTO?


================
Comment at: llvm/lib/Linker/IRMover.cpp:1522-1526
+  for (GlobalValue *GV : Worklist)
+    if (auto *F = dyn_cast<Function>(GV))
+      if (!F->isDeclaration())
+        if (Error E = F->materialize())
+          return E;
----------------
nickdesaulniers wrote:
> nickdesaulniers wrote:
> > nickdesaulniers wrote:
> > > dexonsmith wrote:
> > > > Can `materialize()` add things to the Worklist? If so, I don't think the ranged-based for is safe.
> > > > 
> > > > I also worry if it's sufficient; can the worklist loop below add new functions to the worklist, which haven't been materialized ahead of time?
> > > > Can materialize() add things to the Worklist? If so, I don't think the ranged-based for is safe.
> > > 
> > > I don't think so (though please do triple check).  `IRLinker::maybeAdd` is the only method that adds to the worklist. That is only called from `IRLinker`'s constructor and `IRLinker::shouldLink`. While `IRLinker::shouldLink` is called from `IRLinker::materialize`, the method I'm calling here is `GlobalValue::materialize` (which through some indirection is a call to `BitcodeReader::materialize`.  While these methods are all called `materialize`, `BitcodeReader` doesn't (AFAICT) have any reference to `IRLinker`'s private `Worklist` member.
> > > 
> > > To triple check the other calls of `IRLinker::shouldLink`, I see `IRLinker::linkAppendingVarProto` and `IRLinker::linkGlobalValueProto`. The latter may call the former. The former is only called from `IRLinker::materialize`.
> > > 
> > > So check my work, but it looks to me like the answer to `Can materialize() add things to the Worklist?` is "No; `IRLinker::materialize` can, but I'm not calling that; I'm calling `GlobalValue::materialize`". So I don't think there's an iterator invalidation bug here as implied by the question.
> > > 
> > > Working on a reply to the second question...stay tuned.
> > > I also worry if it's sufficient; can the worklist loop below add new functions to the worklist, which haven't been materialized ahead of time?
> > 
> > Here's a test case that fails, with the same error I'm trying to fix here.
> > ```
> > declare void @f(i8*)
> > 
> > ; 1. @x gets parsed + moved. References to @x's %label can no longer be
> > ;    created.
> > define void @x() {
> >   br label %label
> > 
> > label:
> >   ret void
> > }
> > 
> > ; 2. @y not parsed; will be parsed lazily if referenced.
> > define linkonce_odr void @y() {
> >   br label %label
> > 
> > label:
> > ; 4. when lazy parsing, @x's %label is gone, resulting in a badref.
> >   call void @f(i8* blockaddress(@x, %label))
> >   ret void
> > }
> > 
> > ; 3. @z parsed, call to @y adds @y to worklist.
> > define void @z() {
> >   call void @y()
> >   ret void
> > }
> > ```
> > ---
> > ```
> > $ llvm-as z.ll
> > $ llvm-link -S z.bc
> > error: Never resolved function from blockaddress (Producer: 'LLVM15.0.0git' Reader: 'LLVM 15.0.0git')
> > 
> > ```
> > 
> > Let me see if I can more thoroughly separate the parsing from the linking/remapping, then I'll add the above test case to this patch.
> > I also worry if it's sufficient; can the worklist loop below add new functions to the worklist, which haven't been materialized ahead of time?
> 
> Diff 414166 now appears to handle this.  I have a TODO on the existing call to `GlobalValue::materialize()` that I'd like to remove. That causes some tests to fail, so marking this "Changes Planned" until I have more time to sort that out.  But PTAL with this approach.
>> Can materialize() add things to the Worklist? If so, I don't think the ranged-based for is safe.
> I don't think so (though please do triple check).

Is there a way to add an assertion to that effect? If so, it'd be great to do that as a prep commit (do a local bootstrap with assertions on, see if it survives the pre-commit bots, etc.) before we starting to rely on the fact. This is also important to prevent that behaviour from showing up later. As you described, the logic here is entangled. I've been surprised before by various co-recursive things in the valuemapper and irlinker.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120781/new/

https://reviews.llvm.org/D120781



More information about the llvm-commits mailing list