[PATCH] D97128: [ThinLTO] Make cloneUsedGlobalVariables deterministic

Teresa Johnson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 23 09:59:14 PST 2021


tejohnson added inline comments.


================
Comment at: llvm/lib/IR/Module.cpp:662
+GlobalVariable *llvm::collectUsedGlobalVariables(
+    const Module &M, SmallVectorImpl<GlobalValue *> &Set, bool CompilerUsed) {
+  const char *Name = CompilerUsed ? "llvm.compiler.used" : "llvm.used";
----------------
MaskRay wrote:
> tejohnson wrote:
> > MaskRay wrote:
> > > tejohnson wrote:
> > > > MaskRay wrote:
> > > > > tejohnson wrote:
> > > > > > MaskRay wrote:
> > > > > > > tejohnson wrote:
> > > > > > > > MaskRay wrote:
> > > > > > > > > MaskRay wrote:
> > > > > > > > > > tejohnson wrote:
> > > > > > > > > > > Instead of having a second very similar interface, which could get confusing, how about just change it to use a set with a deterministic iterator, like std::set? Or if it is just that one callsite that needs the deterministic ordering, perhaps just copy the SmallPtrSet into a std::set there. Otherwise, the interfaces need to be carefully documented to avoid confusion. I'd lean towards just coping into a std::set in this instance for simplicity. WDYT?
> > > > > > > > > > We probably should delete the SmallPtrSet overload. I've checked several other call sites - The call sites don't seem to benefit from a SmallPtrSet as well.  Some call sites actually have similar non-deterministic issues. (I have confirmed that if I add a `std::reverse` I can get different bitcode files.) 
> > > > > > > > > > 
> > > > > > > > > > Using `std::set` will require a comparator on name. Let's use SmallVectorImpl here and migrate SmallPtrSet separately?
> > > > > > > > > Hmm. Some call sites benefit from a SmallPtrSet (`count`), while some are better off with `SmallVectorImpl`. It looks to me that both overloads are useful.
> > > > > > > > That's true about needing to ensure std::set sorted by name not pointer value. I saw you sent a follow on patch to fix LowerTypeTests. I think there are other places that would have the same issue (e.g. in the BitcodeWriter, in EmbedBitcodeInModule - and that was just the first one I looked at).
> > > > > > > > 
> > > > > > > > My concern is that with 2 similar interfaces it will be confusing and error prone which one gets used.
> > > > > > > > 
> > > > > > > > How about using a SetVector? From https://llvm.org/docs/ProgrammersManual.html#llvm-adt-setvector-h:
> > > > > > > > "The difference between SetVector and other sets is that the order of iteration is guaranteed to match the order of insertion into the SetVector. This property is really important for things like sets of pointers."
> > > > > > > > 
> > > > > > > A SetVector is by default a composition of a SmallVector and a SmallDenseSet. Its space consumption is the two data structures combined.
> > > > > > > 
> > > > > > > Another idea is `collectUsedGlobalVariables(const Module &M, function_ref<void(GlobalValue*)>, bool CompilerUsed)`: let the call site pass in a callback. In every call site we'll do `[&](GlobalValue *X) { Set.insert(X); }` or `push_back`. That would make every call site use a bit more code.
> > > > > > > 
> > > > > > > How about adding `collectUsedGlobalVariablesImpl` for function_ref and still keep the SmallVectorImpl+SmallPtrSet interfaces. The internal implementation is not duplicated.
> > > > > > > 
> > > > > > > In terms of interface confusion... I think this scheme is no more confusing than the template solution (moving the Module.cpp implementation to Module.h and making the function a template).
> > > > > > My sense from experience is that these llvm*.used typically don't hold that many values and aren't terribly common, so I'm not sure that the overhead of the SetVector should be that much of a concern. This seems to be the case that data structure was designed for, and would keep the interface simple. I also see that in most cases the results of collectUsedGlobalVariables are usually consumed and then discarded pretty quickly.
> > > > > > 
> > > > > > Before adding new interfaces with more complexity, I'd like to understand whether the SetVector overhead is really an issue in practice. Are you seeing long lived and large sets of references from the llvm*.used globals?
> > > > > I think the size overhead of SetVector is minor, but the additional ability it provides can confuse users at all call sites. 
> > > > > 
> > > > > The user needs to instantiate a SetVector in a call site, where they just use the Set functionality or the Vector functionality. The powerful SetVector makes readers wonder whether the Set/the Vector features are both used.
> > > > > 
> > > > This seems like a lower level of confusion than figuring out which interface is used (or even realizing that a different interface may be appropriate for their use than one they might have copied from other code). We can just document the interface at the header, as to why a SetVector is used (makes both set based interaction as well as those clients relying on deterministic iteration work equally well).
> > > If the overhead is low, how about removing the `SmallPtrSet` overload in a follow-up? (That requires to refactor most call sites to use `SmallVectorImpl` instead and the one or two SmallPtrSet use cases to construct a `SmallPtrSet` from `SmallVector` after the collect* call.
> > > 
> > > If we do that, I think this is resolved:) 
> > Still have a preference for using SetVector for its simplicity in both use cases. If you are strongly opposed, then migrating to SmallVector is ok with me, if the set use case is rare.
> I have been thinking about this.
> 
> To have inline elements, we have to use SmallSetVector. SmallSetVector doubles the stack consumption. I guess I'll follow
> https://eel.is/c++draft/sequence.reqmts "When choosing a container, remember vector is best; leave a comment to explain if you choose from the rest!" and drop the SmallPtrSetImpl overload in D97257. 
Looking at the use sites in D97257, it looks like 50% want to use the result as a set and 50% as a vector. To me this, combined with the expected small size of elements, argues strongly in favor of SetVector for simplicity - it seems to be the case SetVector was designed for.

To get a sense of how large these sets get in practice, I built a large internal project with >30k input files, and printed out the size of the llvm.used and llvm.compiler.used sets in each when we go to build the summary index on a ThinLTO build. All llvm.compiler.used were of size 0 and almost all of the llvm.used were also 0, with 9 having a set of size 1. Here's the histogram:


```
      9 Used size: 1
  31198 Used size: 0
  31207 CompilerUsed size: 0
```

I then rebuilt with WPD which as of my recent patch puts available_externally vtables in the llvm.compiler.used set. Here's a histogram of the resulting sizes:

```
      1 CompilerUsed size: 40
      1 CompilerUsed size: 15
      4 CompilerUsed size: 9
     11 CompilerUsed size: 8
     26 CompilerUsed size: 7
     57 CompilerUsed size: 6
    224 CompilerUsed size: 5
    403 CompilerUsed size: 4
    471 CompilerUsed size: 3
   1306 CompilerUsed size: 2
      9 Used size: 1
   2275 CompilerUsed size: 1
  29768 Used size: 0
  24998 CompilerUsed size: 0
```

Many more are non-zero, but all but 2 are <10, the vast majority are <5 and still mostly 0.

My concern is that making the interface a vector rather than SetVector adds a more complicated usage model to solve an overhead that is not a problem in practice.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97128/new/

https://reviews.llvm.org/D97128



More information about the llvm-commits mailing list