[PATCH] D49362: [ThinLTO] Internalize read only globals

Eugene Leviant via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 25 00:45:42 PDT 2018


evgeny777 added a comment.

> I see, yes that explanation makes sense. I suppose on the importing side we could go back and update the linkage for imported variables based on the linkage in the index after the IRMover importing is complete for the module (since the linkage in the index is currently ignored on the importing side). On the exporting side it would get internalized as usual during the index-based internalization. This has the advantage of being more consistent with how we currently internalize based on thin link analysis. Would that not work as expected?

It seems to me as significantly more complex approach. At the first glance, I'll have to:

- Prevent promotion on thin link phase in both `LTO.cpp` and `ThinLTOCodeGenerator.cpp` for all read-only variables.
- On promote phase check all read-only variables with local linkage preventing them from becoming `available_externally`. Non-local vars must not be touched (or IRMover will fail to link them to declarations), so we also need step 3:
- After import phase we still need to analyze external read only globals, internalizing them when possible (checking index, instead of attributes)

So compared to current implementation we:

- add extra step (prevent promotion in index)
- must have all three steps consistent, which means spreading checks around the code. For instance, if variable is not eligible to import it (a) shouldn't be prevented from promotion, (b) should have `available_externally` linkage when being imported and (c) shouldn't be internalized if it has non-local linkage.



================
Comment at: include/llvm/IR/ModuleSummaryIndex.h:530
+  // non-volatile load instructions.
+  unsigned ImmutableRefCnt;
+
----------------
tejohnson wrote:
> evgeny777 wrote:
> > tejohnson wrote:
> > > Think it would be clearer to have a ref edge type and include a flag there.
> > What about memory usage? Also, like I said, we have all constant references grouped at the end of RefEdgeList, so this looks excessive
> I have mixed feelings on this. With your current scheme you don't add increase the in-memory index size, but it is unfortunate to have the read only flag on all ValueInfo when it only applies to reference edges. Let me take a look at some of our large indexes today and see what the extra overhead would be to make this explicit on just the ref edges.
I've already changed the way immutable references are stored. Now I'm using an extra flag in `ValueInfo` which supposedly has zero overhead. I'm still using immutable reference counter when store/load index to/from bitcode file as all read-only references are still grouped in the end of RefEdges array to save disk space.


================
Comment at: lib/Analysis/ModuleSummaryAnalysis.cpp:371
+  std::vector<ValueInfo> Refs = RefEdges.takeVector();
+  if (IsThinLTO)
+    for (; RefCnt < Refs.size(); ++RefCnt)
----------------
tejohnson wrote:
> Comment needed. I suppose this is because we need to be able to import into any module containing a ref?
We can't internalize a variable if it is referenced by a function in regular LTO module. This is because regular LTO modules are not participating in ThinLTO import, so we can't make a local copy there.


================
Comment at: lib/Analysis/ModuleSummaryAnalysis.cpp:421
+  // TODO: Currently we don't support importing GV with references,
+  // so non-empty RefEdges is equivalent to notElgibleToImport.
+  GlobalVarSummary::GVarFlags VarFlags(
----------------
tejohnson wrote:
> The initialization below seems to be a combination of conditions on whether it can be internalized and whether it can be imported. Can you better document this?
> 
> Non-empty RefEdges isn't related to the notEligibleToImport flag, but rather they both preclude importing of a global var. Don't we also need to set the read only flag based on notEligibleToImport if we want to prevent it from being set unless its def can be imported?
> 
> In general, since you presumably need to ensure that all variables marked read only at the end of the thin link are imported into every referencing module, I think it would be good to make this connection more explicit (in comments and in code). Can you enforce this during the thin link (i.e. during computeImportForReferencedGlobals)? E.g. detect when we can't import a reference and then assert if it is marked read only? Alternatively, rather than trying to detect here when it can't be imported and initializing to not read only, during the thin link importing can you clear the read only flag on any variable with a reference that we aren't able to import? Otherwise I am concerned that this relationship may prove fragile.
Makes sense.

The `RefEdges.empty()` is the only condition on whether a variable can be imported, all others are related to internalization. I think, I can safely move it (and only it) to `propagateConstants` if it looks confusing here. My concerns were ThinLTO cache key computation, but it seems such move won't affect it  , because import and export lists will be different so will be the cache key.


================
Comment at: lib/IR/ModuleSummaryIndex.cpp:36
+  // Here we take advantage of having all read-only references
+  // located in the end of the RefEdgeList.
+  auto Refs = refs();
----------------
tejohnson wrote:
> Is this comment current? It looks like you will look at all of the refs below so I don't see any advantage being taken from them being at the end. You could presumably take advantage of it by breaking out of the loop when you see the first non-readonly ref since you are walking it backwards. Not sure if it is worth it though, might be better just to look at them all for simplicity (and in case this ordering scheme ever changes).
Yes, the comment is current, but there is a bug in implementation, it should be something like:
```
for (int I = Refs.size() - 1; I >= 0 && Refs[I].isReadOnly(); --I)
    ImmutableRefCnt ++;
```


================
Comment at: lib/IR/ModuleSummaryIndex.cpp:106
+static bool hasReadOnlyFlag(const GlobalValueSummary *S) {
+  if (auto *GVS = dyn_cast<GlobalVarSummary>(S))
+    return GVS->isReadOnly();
----------------
tejohnson wrote:
> Does this need to handle S being an alias like the above method? Looks like not from the current callsite, but I think it should invoke getBaseObject() to be consistent with the above method and also to be safe in case new calls are added that might pass a generic GlobalValueSummary.
This is a helper method used by dot dumper exclusively. I think it should be moved below to a place of actual call.


================
Comment at: lib/LTO/LTO.cpp:153
       AddUint64(Fn);
+      if (auto *GVS = dyn_cast<GlobalVarSummary>(
+              Index.getGlobalValueSummary(Fn, false)))
----------------
tejohnson wrote:
> I don't think this is necessary because later on (line 221) we walk all the import summaries and invoke AddUsedThings on them.
Yep, I missed second call of AddUsedThings


================
Comment at: lib/Transforms/IPO/FunctionImport.cpp:828
+    // If import is disabled we should drop read-only attribute
+    // from all summaries to prevent internalization.
+    for (auto &P : Index)
----------------
tejohnson wrote:
> Why is this necessary? Since internalizeImmutableGVs is only invoked from importFunctions(), presumably it doesn't have any effect anyway when not importing?
On -lto-O0 ThinLTO import is entirely disabled, so we have to prevent internalization of read-only stuff or we'll get link errors.


https://reviews.llvm.org/D49362





More information about the llvm-commits mailing list