[PATCH] D49362: [ThinLTO] Internalize read only globals

Wed Oct 24 09:28:22 PDT 2018

tejohnson added a comment.

Thanks for the update. A bunch of comments below. I am a little concerned on the relationship between the read only flag on the global var and the importing requirements (see comment in computeVariableSummary). Also, it would be good to have a high level description of the algorithm and how this works somewhere (maybe above propagateConstants()).

In https://reviews.llvm.org/D49362#1268278, @evgeny777 wrote:

> > In fact I think that making this change to shouldPromoteLocalToGlobal on the importing side to get the linkage type from the index would work today
>
> Yep, this seems obvious and the very first version of this patch uses this kind of approach. Unfortunately it didn't work quite well for me. Modifying linkage in the index will work fine on promotion and
>  internalization phases. However it's the import phase where all bad things happen. When performing import ThinLTO loads source module and does specific modifications in it given list of globals to import
>  Those modifications include renaming globals to avoid naming conflicts and setting linkage of imported entities to `available_externally`. After that IRMover performs all the dirty work.
>
> The main problem in this scheme is handling read only external globals. We can't set internal linkage for them in `renameModuleForThinLTO`, because IRMover will fail to link variable definition to 
>  it's external declaration. These two variables will be considered different:
>
>   @g = internal global i32 0
>   @g = external global i32
>
>
> IRMover will rename internal definition, leaving external declaration unsatisfied. This will obviously result in link error.
>
> So while it's possible to use proposed approach to deal with promoted static variables, one still needs some processing in the backend to deal with read only external globals. Still backend approach can deal with both cases, so I used it.

I see, yes that explanation makes sense. I suppose on the importing side we could go back and update the linkage for imported variables based on the linkage in the index after the IRMover importing is complete for the module (since the linkage in the index is currently ignored on the importing side). On the exporting side it would get internalized as usual during the index-based internalization. This has the advantage of being more consistent with how we currently internalize based on thin link analysis. Would that not work as expected?

================
Comment at: include/llvm/IR/ModuleSummaryIndex.h:530
+  // non-volatile load instructions.
+  unsigned ImmutableRefCnt;
+
----------------
evgeny777 wrote:
> tejohnson wrote:
> > Think it would be clearer to have a ref edge type and include a flag there.
> What about memory usage? Also, like I said, we have all constant references grouped at the end of RefEdgeList, so this looks excessive
I have mixed feelings on this. With your current scheme you don't add increase the in-memory index size, but it is unfortunate to have the read only flag on all ValueInfo when it only applies to reference edges. Let me take a look at some of our large indexes today and see what the extra overhead would be to make this explicit on just the ref edges.

================
Comment at: lib/Analysis/ModuleSummaryAnalysis.cpp:371
+  std::vector<ValueInfo> Refs = RefEdges.takeVector();
+  if (IsThinLTO)
+    for (; RefCnt < Refs.size(); ++RefCnt)
----------------
Comment needed. I suppose this is because we need to be able to import into any module containing a ref?

================
Comment at: lib/Analysis/ModuleSummaryAnalysis.cpp:421
+  // TODO: Currently we don't support importing GV with references,
+  // so non-empty RefEdges is equivalent to notElgibleToImport.
+  GlobalVarSummary::GVarFlags VarFlags(
----------------
The initialization below seems to be a combination of conditions on whether it can be internalized and whether it can be imported. Can you better document this?

Non-empty RefEdges isn't related to the notEligibleToImport flag, but rather they both preclude importing of a global var. Don't we also need to set the read only flag based on notEligibleToImport if we want to prevent it from being set unless its def can be imported?

In general, since you presumably need to ensure that all variables marked read only at the end of the thin link are imported into every referencing module, I think it would be good to make this connection more explicit (in comments and in code). Can you enforce this during the thin link (i.e. during computeImportForReferencedGlobals)? E.g. detect when we can't import a reference and then assert if it is marked read only? Alternatively, rather than trying to detect here when it can't be imported and initializing to not read only, during the thin link importing can you clear the read only flag on any variable with a reference that we aren't able to import? Otherwise I am concerned that this relationship may prove fragile.

================
Comment at: lib/IR/ModuleSummaryIndex.cpp:36
+  // Here we take advantage of having all read-only references
+  // located in the end of the RefEdgeList.
+  auto Refs = refs();
----------------
Is this comment current? It looks like you will look at all of the refs below so I don't see any advantage being taken from them being at the end. You could presumably take advantage of it by breaking out of the loop when you see the first non-readonly ref since you are walking it backwards. Not sure if it is worth it though, might be better just to look at them all for simplicity (and in case this ordering scheme ever changes).

================
Comment at: lib/IR/ModuleSummaryIndex.cpp:99
+static void clearReadOnlyFlag(GlobalValueSummary *S) {
+  if (auto *AS = dyn_cast<AliasSummary>(S))
+    S = &AS->getAliasee();
----------------
Just invoke S->getBaseObject() for simplicity.

================
Comment at: lib/IR/ModuleSummaryIndex.cpp:106
+static bool hasReadOnlyFlag(const GlobalValueSummary *S) {
+  if (auto *GVS = dyn_cast<GlobalVarSummary>(S))
+    return GVS->isReadOnly();
----------------
Does this need to handle S being an alias like the above method? Looks like not from the current callsite, but I think it should invoke getBaseObject() to be consistent with the above method and also to be safe in case new calls are added that might pass a generic GlobalValueSummary.

================
Comment at: lib/IR/ModuleSummaryIndex.cpp:120
+    const DenseSet<GlobalValue::GUID> &GUIDPreservedSymbols) {
+  auto ForEachLiveSummary =
+      [&](llvm::function_ref<void(GlobalValue::GUID, GlobalValueSummary *)> F) {
----------------
Is there a need to perform each of the steps below in a separate walk over the whole index? I don't see that we actually propagate the flag from global var to global var, so I don't see any need for ordering. Can you do one walk over the index and perform all steps on each summary exactly once?

================
Comment at: lib/IR/ModuleSummaryIndex.cpp:128
+
+  // Step 1: Find immutable global value summaries, which we can later
+  // internalize. Preserved symbols are visible externally, so we don't
----------------
This first sentence doesn't seem to relate specifically to Step 1, but rather to the whole algorithm.

================
Comment at: lib/IR/ModuleSummaryIndex.cpp:132
+  ForEachLiveSummary([&](GlobalValue::GUID Id, GlobalValueSummary *S) {
+    if (GUIDPreservedSymbols.count(Id))
+      clearReadOnlyFlag(S);
----------------
I would expect the size of this set to be << the size of the whole index, so walking the whole index and checking each seems like it might be less efficient overall compared to doing a lookup in the index of each GUID preserved symbol. That being said, I think a better way is to walk the index once and perform all 3 steps (as mentioned above).

================
Comment at: lib/IR/ModuleSummaryIndex.cpp:143
+
+  // Step 3: All non-instruction refs are not constant
+  ForEachLiveSummary([&](GlobalValue::GUID, GlobalValueSummary *S) {
----------------
Suggest adding the explanation of why.

================
Comment at: lib/LTO/LTO.cpp:153
       AddUint64(Fn);
+      if (auto *GVS = dyn_cast<GlobalVarSummary>(
+              Index.getGlobalValueSummary(Fn, false)))
----------------
I don't think this is necessary because later on (line 221) we walk all the import summaries and invoke AddUsedThings on them.

================
Comment at: lib/Transforms/IPO/FunctionImport.cpp:828
+    // If import is disabled we should drop read-only attribute
+    // from all summaries to prevent internalization.
+    for (auto &P : Index)
----------------
Why is this necessary? Since internalizeImmutableGVs is only invoked from importFunctions(), presumably it doesn't have any effect anyway when not importing?

https://reviews.llvm.org/D49362