[llvm] r263275 - [ThinLTO] Support for reference graph in per-module and combined summary.

Teresa Johnson via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 14 05:27:05 PDT 2016


On Sat, Mar 12, 2016 at 2:04 PM, David Blaikie <dblaikie at gmail.com> wrote:

>
>
> On Sat, Mar 12, 2016 at 7:15 AM, Teresa Johnson <tejohnson at google.com>
> wrote:
>
>> Hi David,
>>
>> Thanks for the review! Responses below.
>>
>> Teresa
>>
>> On Fri, Mar 11, 2016 at 10:23 PM, David Blaikie <dblaikie at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Fri, Mar 11, 2016 at 10:52 AM, Teresa Johnson via llvm-commits <
>>> llvm-commits at lists.llvm.org> wrote:
>>>
>>>> Author: tejohnson
>>>> Date: Fri Mar 11 12:52:24 2016
>>>> New Revision: 263275
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=263275&view=rev
>>>> Log:
>>>> [ThinLTO] Support for reference graph in per-module and combined
>>>> summary.
>>>>
>>>> Summary:
>>>> This patch adds support for including a full reference graph including
>>>> call graph edges and other GV references in the summary.
>>>>
>>>> The reference graph edges can be used to make importing decisions
>>>> without materializing any source modules, can be used in the plugin
>>>> to make file staging decisions for distributed build systems, and is
>>>> expected to have other uses.
>>>>
>>>> The call graph edges are recorded in each function summary in the
>>>> bitcode via a list of <CalleeValueIds, StaticCount> tuples when no PGO
>>>> data exists, or <CalleeValueId, StaticCount, ProfileCount> pairs when
>>>> there is PGO, where the ValueId can be mapped to the function GUID via
>>>> the ValueSymbolTable. In the function index in memory, the call graph
>>>> edges reference the target via the CalleeGUID instead of the
>>>> CalleeValueId.
>>>>
>>>> The reference graph edges are recorded in each summary record with a
>>>> list of referenced value IDs, which can be mapped to value GUID via the
>>>> ValueSymbolTable.
>>>>
>>>> Addtionally, a new summary record type is added to record references
>>>> from global variable initializers. A number of bitcode records and data
>>>> structures have been renamed to reflect the newly expanded scope of the
>>>> summary beyond functions. More cleanup will follow.
>>>>
>>>> Reviewers: joker.eph, davidxl
>>>>
>>>> Subscribers: joker.eph, llvm-commits
>>>>
>>>> Differential Revision: http://reviews.llvm.org/D17212
>>>>
>>>>
>> <snipped>
>>
>>
>>>> +
>>>> +/// \brief Function summary information to aid decisions and
>>>> implementation of
>>>> +/// importing.
>>>> +class FunctionSummary : public GlobalValueSummary {
>>>>
>>>
>>> This is a slightly odd inheritance, having no virtual functions (had it
>>> had virtual functions the warnings related to missing virtual dtors
>>> would've fired).
>>>
>>
>> Ah, good to know that distinction in getting the warning.
>>
>>
>>>
>>> Should some functionality be rolled into these classes to generalize
>>> their use, that would be virtual? (ie: the only way to use this class is
>>> with a dyn_cast to access the members - perhaps the code that dyn_casts
>>> could instead be calling a virtual function to dispatch between the two
>>> cases?)
>>>
>>
>> I suspect as we add uses of the combined index in the function importing
>> and other places that we will need to add more methods that would be
>> appropriate as virtual functions. Currently there wasn't a need though (the
>> one place doing the dyn_casts below seems clearer to me with casting).
>>
>
> Fair enough, just a thought.
>
>
>>
>>
>>> Speaking of dyn casts...
>>>
>>>
>>
>> <snipped>
>>
>>
>>> -/// Emit the per-module function summary section alongside the rest of
>>>> +// Collect the global value references in the given variable's
>>>> initializer,
>>>> +// and emit them in a summary record.
>>>> +static void WriteModuleLevelReferences(const GlobalVariable &V,
>>>> +                                       const ValueEnumerator &VE,
>>>> +                                       SmallVector<uint64_t, 64>
>>>> &NameVals,
>>>> +                                       unsigned FSModRefsAbbrev,
>>>> +                                       BitstreamWriter &Stream) {
>>>> +  DenseSet<unsigned> RefEdges;
>>>> +  SmallPtrSet<const User *, 8> Visited;
>>>> +  findRefEdges(&V, VE, RefEdges, Visited);
>>>> +  unsigned RefCount = RefEdges.size();
>>>> +  if (RefCount) {
>>>> +    NameVals.push_back(VE.getValueID(&V));
>>>> +    NameVals.push_back(getEncodedLinkage(V.getLinkage()));
>>>> +    for (auto RefId : RefEdges) {
>>>> +      NameVals.push_back(RefId);
>>>> +    }
>>>> +    Stream.EmitRecord(bitc::FS_PERMODULE_GLOBALVAR_INIT_REFS, NameVals,
>>>> +                      FSModRefsAbbrev);
>>>> +    NameVals.clear();
>>>> +  }
>>>> +}
>>>> +
>>>> +/// Emit the per-module summary section alongside the rest of
>>>>  /// the module's bitcode.
>>>> -static void WritePerModuleFunctionSummary(
>>>> -    DenseMap<const Function *, std::unique_ptr<FunctionInfo>>
>>>> &FunctionIndex,
>>>> +static void WritePerModuleGlobalValueSummary(
>>>> +    DenseMap<const Function *, std::unique_ptr<GlobalValueInfo>>
>>>> &FunctionIndex,
>>>>      const Module *M, const ValueEnumerator &VE, BitstreamWriter
>>>> &Stream) {
>>>> -  Stream.EnterSubblock(bitc::FUNCTION_SUMMARY_BLOCK_ID, 3);
>>>> +  if (M->empty())
>>>> +    return;
>>>>
>>>> -  // Abbrev for FS_CODE_PERMODULE_ENTRY.
>>>> +  Stream.EnterSubblock(bitc::GLOBALVAL_SUMMARY_BLOCK_ID, 3);
>>>> +
>>>> +  // Abbrev for FS_PERMODULE.
>>>>    BitCodeAbbrev *Abbv = new BitCodeAbbrev();
>>>> -  Abbv->Add(BitCodeAbbrevOp(bitc::FS_CODE_PERMODULE_ENTRY));
>>>> +  Abbv->Add(BitCodeAbbrevOp(bitc::FS_PERMODULE));
>>>>    Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // valueid
>>>>    Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage
>>>>    Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // instcount
>>>> -  unsigned FSAbbrev = Stream.EmitAbbrev(Abbv);
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 4));   // numrefs
>>>> +  // numrefs x valueid, n x (valueid, callsitecount)
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));
>>>> +  unsigned FSCallsAbbrev = Stream.EmitAbbrev(Abbv);
>>>>
>>>> -  SmallVector<unsigned, 64> NameVals;
>>>> +  // Abbrev for FS_PERMODULE_PROFILE.
>>>> +  Abbv = new BitCodeAbbrev();
>>>> +  Abbv->Add(BitCodeAbbrevOp(bitc::FS_PERMODULE_PROFILE));
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // valueid
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // instcount
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 4));   // numrefs
>>>> +  // numrefs x valueid, n x (valueid, callsitecount, profilecount)
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));
>>>> +  unsigned FSCallsProfileAbbrev = Stream.EmitAbbrev(Abbv);
>>>> +
>>>> +  // Abbrev for FS_PERMODULE_GLOBALVAR_INIT_REFS.
>>>> +  Abbv = new BitCodeAbbrev();
>>>> +  Abbv->Add(BitCodeAbbrevOp(bitc::FS_PERMODULE_GLOBALVAR_INIT_REFS));
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // valueid
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));  // valueids
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));
>>>> +  unsigned FSModRefsAbbrev = Stream.EmitAbbrev(Abbv);
>>>> +
>>>> +  SmallVector<uint64_t, 64> NameVals;
>>>>    // Iterate over the list of functions instead of the FunctionIndex
>>>> map to
>>>>    // ensure the ordering is stable.
>>>>    for (const Function &F : *M) {
>>>> @@ -2817,9 +2974,9 @@ static void WritePerModuleFunctionSummar
>>>>      assert(FunctionIndex.count(&F) == 1);
>>>>
>>>>      WritePerModuleFunctionSummaryRecord(
>>>> -        NameVals, FunctionIndex[&F]->functionSummary(),
>>>> -        VE.getValueID(M->getValueSymbolTable().lookup(F.getName())),
>>>> FSAbbrev,
>>>> -        Stream);
>>>> +        NameVals,
>>>> dyn_cast<FunctionSummary>(FunctionIndex[&F]->summary()),
>>>>
>>>
>>> Given this is a FunctionIndex, I assume all its summaries are
>>> FunctionSummaries? So presumably this could be a cast instead of a dyn_cast?
>>>
>>
>> Good point, this particular map (unlike the full index built by the
>> reader) will always contain only function summaries. Will change to cast.
>>
>
> Actually, a second thought on this - could the FunctionIndex just be of
> the specific type in the first place? Or does it need to be of the generic
> type?
>

The FunctionIndex map holds the GlobalValueInfo type which contains a base
summary object (we also need the other field of the GlobalValueInfo for
writing, so can't hold just the summaries directly). While in this case it
is only function summaries, in the typical case this data structure is used
to build the module index used for importing which contains both summary
types.

Teresa

>
>
>>
>>
>>>
>>>
>>>> +        VE.getValueID(M->getValueSymbolTable().lookup(F.getName())),
>>>> +        FSCallsAbbrev, FSCallsProfileAbbrev, Stream, F);
>>>>    }
>>>>
>>>>    for (const GlobalAlias &A : M->aliases()) {
>>>> @@ -2830,10 +2987,25 @@ static void WritePerModuleFunctionSummar
>>>>        continue;
>>>>
>>>>      assert(FunctionIndex.count(F) == 1);
>>>> +    FunctionSummary *FS =
>>>> +        dyn_cast<FunctionSummary>(FunctionIndex[F]->summary());
>>>>
>>>
>>> Looks like this ^ should be a cast, not a dyn_cast, since the result is
>>> immediately dereferenced in the following line \/
>>>
>>
>> Will change.
>>
>>
>>>
>>>
>>>> +    // Add the alias to the reference list of aliasee function.
>>>> +    FS->addRefEdge(
>>>> +        VE.getValueID(M->getValueSymbolTable().lookup(A.getName())));
>>>>      WritePerModuleFunctionSummaryRecord(
>>>> -        NameVals, FunctionIndex[F]->functionSummary(),
>>>> -        VE.getValueID(M->getValueSymbolTable().lookup(A.getName())),
>>>> FSAbbrev,
>>>> -        Stream);
>>>> +        NameVals, FS,
>>>> +        VE.getValueID(M->getValueSymbolTable().lookup(A.getName())),
>>>> +        FSCallsAbbrev, FSCallsProfileAbbrev, Stream, *F);
>>>> +  }
>>>> +
>>>> +  // Capture references from GlobalVariable initializers, which are
>>>> outside
>>>> +  // of a function scope.
>>>> +  for (const GlobalVariable &G : M->globals())
>>>> +    WriteModuleLevelReferences(G, VE, NameVals, FSModRefsAbbrev,
>>>> Stream);
>>>> +  for (const GlobalAlias &A : M->aliases()) {
>>>> +    const auto *GV = dyn_cast<GlobalVariable>(A.getBaseObject());
>>>> +    if (GV)
>>>>
>>>
>>> If you like, you could roll this ^ declaration into the if. (& then drop
>>> the {} around the loop too, if that suits)
>>>
>>
>> Yes, will fix.
>>
>>
>>>
>>>
>>>> +      WriteModuleLevelReferences(*GV, VE, NameVals, FSModRefsAbbrev,
>>>> Stream);
>>>>    }
>>>>
>>>>    Stream.ExitBlock();
>>>> @@ -2841,35 +3013,132 @@ static void WritePerModuleFunctionSummar
>>>>
>>>>  /// Emit the combined function summary section into the combined index
>>>>  /// file.
>>>> -static void WriteCombinedFunctionSummary(const FunctionInfoIndex &I,
>>>> -                                         BitstreamWriter &Stream) {
>>>> -  Stream.EnterSubblock(bitc::FUNCTION_SUMMARY_BLOCK_ID, 3);
>>>> +static void WriteCombinedGlobalValueSummary(
>>>> +    const FunctionInfoIndex &I, BitstreamWriter &Stream,
>>>> +    std::map<uint64_t, unsigned> &GUIDToValueIdMap, unsigned
>>>> GlobalValueId) {
>>>> +  Stream.EnterSubblock(bitc::GLOBALVAL_SUMMARY_BLOCK_ID, 3);
>>>>
>>>> -  // Abbrev for FS_CODE_COMBINED_ENTRY.
>>>> +  // Abbrev for FS_COMBINED.
>>>>    BitCodeAbbrev *Abbv = new BitCodeAbbrev();
>>>> -  Abbv->Add(BitCodeAbbrevOp(bitc::FS_CODE_COMBINED_ENTRY));
>>>> -  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // modid
>>>> +  Abbv->Add(BitCodeAbbrevOp(bitc::FS_COMBINED));
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // modid
>>>>    Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage
>>>> -  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // instcount
>>>> -  unsigned FSAbbrev = Stream.EmitAbbrev(Abbv);
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // instcount
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 4));   // numrefs
>>>> +  // numrefs x valueid, n x (valueid, callsitecount)
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));
>>>> +  unsigned FSCallsAbbrev = Stream.EmitAbbrev(Abbv);
>>>>
>>>> -  SmallVector<unsigned, 64> NameVals;
>>>> +  // Abbrev for FS_COMBINED_PROFILE.
>>>> +  Abbv = new BitCodeAbbrev();
>>>> +  Abbv->Add(BitCodeAbbrevOp(bitc::FS_COMBINED_PROFILE));
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // modid
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // instcount
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 4));   // numrefs
>>>> +  // numrefs x valueid, n x (valueid, callsitecount, profilecount)
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));
>>>> +  unsigned FSCallsProfileAbbrev = Stream.EmitAbbrev(Abbv);
>>>> +
>>>> +  // Abbrev for FS_COMBINED_GLOBALVAR_INIT_REFS.
>>>> +  Abbv = new BitCodeAbbrev();
>>>> +  Abbv->Add(BitCodeAbbrevOp(bitc::FS_COMBINED_GLOBALVAR_INIT_REFS));
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // modid
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));    // valueids
>>>> +  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));
>>>> +  unsigned FSModRefsAbbrev = Stream.EmitAbbrev(Abbv);
>>>> +
>>>> +  SmallVector<uint64_t, 64> NameVals;
>>>>    for (const auto &FII : I) {
>>>>      for (auto &FI : FII.second) {
>>>> -      FunctionSummary *FS = FI->functionSummary();
>>>> -      assert(FS);
>>>> +      GlobalValueSummary *S = FI->summary();
>>>> +      assert(S);
>>>> +
>>>> +      if (auto *VS = dyn_cast<GlobalVarSummary>(S)) {
>>>> +        assert(!VS->refs().empty() && "Expected at least one ref
>>>> edge");
>>>> +        NameVals.push_back(I.getModuleId(VS->modulePath()));
>>>> +        NameVals.push_back(getEncodedLinkage(VS->linkage()));
>>>> +        for (auto &RI : VS->refs()) {
>>>> +          const auto &VMI = GUIDToValueIdMap.find(RI);
>>>> +          unsigned RefId;
>>>> +          // If this GUID doesn't have an entry, assign one.
>>>> +          if (VMI == GUIDToValueIdMap.end()) {
>>>> +            GUIDToValueIdMap[RI] = ++GlobalValueId;
>>>> +            RefId = GlobalValueId;
>>>> +          } else {
>>>> +            RefId = VMI->second;
>>>> +          }
>>>> +          NameVals.push_back(RefId);
>>>> +        }
>>>> +
>>>> +        // Record the starting offset of this summary entry for use
>>>> +        // in the VST entry. Add the current code size since the
>>>> +        // reader will invoke readRecord after the abbrev id read.
>>>> +        FI->setBitcodeIndex(Stream.GetCurrentBitNo() +
>>>> +                            Stream.GetAbbrevIDWidth());
>>>> +
>>>> +        // Emit the finished record.
>>>> +        Stream.EmitRecord(bitc::FS_COMBINED_GLOBALVAR_INIT_REFS,
>>>> NameVals,
>>>> +                          FSModRefsAbbrev);
>>>> +        NameVals.clear();
>>>> +        continue;
>>>> +      }
>>>>
>>>> +      auto *FS = dyn_cast<FunctionSummary>(S);
>>>> +      assert(FS);
>>>>
>>>
>>> Replace dyn_cast + assert with cast.
>>>
>>
>> ok
>>
>>
>>>
>>>
>>>>        NameVals.push_back(I.getModuleId(FS->modulePath()));
>>>> -      NameVals.push_back(getEncodedLinkage(FS->getFunctionLinkage()));
>>>> +      NameVals.push_back(getEncodedLinkage(FS->linkage()));
>>>>        NameVals.push_back(FS->instCount());
>>>> +      NameVals.push_back(FS->refs().size());
>>>> +
>>>>
>>>
>> <snipped>
>>
>> --
>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>> 408-460-2413
>>
>
>


-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160314/9ad707ec/attachment.html>


More information about the llvm-commits mailing list