[llvm] r263275 - [ThinLTO] Support for reference graph in per-module and combined summary.

Mehdi Amini via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 21 15:13:44 PDT 2016


> On Mar 17, 2016, at 5:55 PM, Duncan P. N. Exon Smith via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> 
>> 
>> On 2016-Mar-11, at 10:52, Teresa Johnson via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>> 
>> Author: tejohnson
>> Date: Fri Mar 11 12:52:24 2016
>> New Revision: 263275
>> 
>> URL: http://llvm.org/viewvc/llvm-project?rev=263275&view=rev
>> Log:
>> [ThinLTO] Support for reference graph in per-module and combined summary.
>> 
>> Summary:
>> This patch adds support for including a full reference graph including
>> call graph edges and other GV references in the summary.
>> 
>> The reference graph edges can be used to make importing decisions
>> without materializing any source modules, can be used in the plugin
>> to make file staging decisions for distributed build systems, and is
>> expected to have other uses.
>> 
>> The call graph edges are recorded in each function summary in the
>> bitcode via a list of <CalleeValueIds, StaticCount> tuples when no PGO
>> data exists, or <CalleeValueId, StaticCount, ProfileCount> pairs when
>> there is PGO, where the ValueId can be mapped to the function GUID via
>> the ValueSymbolTable. In the function index in memory, the call graph
>> edges reference the target via the CalleeGUID instead of the
>> CalleeValueId.
>> 
>> The reference graph edges are recorded in each summary record with a
>> list of referenced value IDs, which can be mapped to value GUID via the
>> ValueSymbolTable.
>> 
>> Addtionally, a new summary record type is added to record references
>> from global variable initializers. A number of bitcode records and data
>> structures have been renamed to reflect the newly expanded scope of the
>> summary beyond functions. More cleanup will follow.
>> 
>> Reviewers: joker.eph, davidxl
>> 
>> Subscribers: joker.eph, llvm-commits
>> 
>> Differential Revision: http://reviews.llvm.org/D17212
> 
> [snip]
> 
>> Modified: llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp?rev=263275&r1=263274&r2=263275&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp (original)
>> +++ llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp Fri Mar 11 12:52:24 2016
>> @@ -2440,34 +2472,61 @@ static void WriteUseListBlock(const Func
>>  Stream.ExitBlock();
>> }
>> 
>> -/// \brief Save information for the given function into the function index.
>> -///
>> -/// At a minimum this saves the bitcode index of the function record that
>> -/// was just written. However, if we are emitting function summary information,
>> -/// for example for ThinLTO, then a \a FunctionSummary object is created
>> -/// to hold the provided summary information.
>> -static void SaveFunctionInfo(
>> -    const Function &F,
>> -    DenseMap<const Function *, std::unique_ptr<FunctionInfo>> &FunctionIndex,
>> -    unsigned NumInsts, uint64_t BitcodeIndex, bool EmitFunctionSummary) {
>> -  std::unique_ptr<FunctionSummary> FuncSummary;
>> -  if (EmitFunctionSummary) {
>> -    FuncSummary = llvm::make_unique<FunctionSummary>(NumInsts);
>> -    FuncSummary->setFunctionLinkage(F.getLinkage());
>> +// Walk through the operands of a given User via worklist iteration and populate
>> +// the set of GlobalValue references encountered. Invoked either on an
>> +// Instruction or a GlobalVariable (which walks its initializer).
>> +static void findRefEdges(const User *CurUser, const ValueEnumerator &VE,
>> +                         DenseSet<unsigned> &RefEdges,
>> +                         SmallPtrSet<const User *, 8> &Visited) {
>> +  SmallVector<const User *, 32> Worklist;
>> +  Worklist.push_back(CurUser);
>> +
>> +  while (!Worklist.empty()) {
>> +    const User *U = Worklist.pop_back_val();
>> +
>> +    if (!Visited.insert(U).second)
>> +      continue;
>> +
>> +    ImmutableCallSite CS(U);
>> +
>> +    for (const auto &OI : U->operands()) {
>> +      const User *Operand = dyn_cast<User>(OI);
>> +      if (!Operand)
>> +        continue;
>> +      if (isa<BlockAddress>(Operand))
>> +        continue;
>> +      if (isa<GlobalValue>(Operand)) {
>> +        // We have a reference to a global value. This should be added to
>> +        // the reference set unless it is a callee. Callees are handled
>> +        // specially by WriteFunction and are added to a separate list.
>> +        if (!(CS && CS.isCallee(&OI)))
>> +          RefEdges.insert(VE.getValueID(Operand));
>> +        continue;
>> +      }
>> +      Worklist.push_back(Operand);
>> +    }
>>  }
>> -  FunctionIndex[&F] =
>> -      llvm::make_unique<FunctionInfo>(BitcodeIndex, std::move(FuncSummary));
>> }
>> 
>> /// Emit a function body to the module stream.
>> static void WriteFunction(
>> -    const Function &F, ValueEnumerator &VE, BitstreamWriter &Stream,
>> -    DenseMap<const Function *, std::unique_ptr<FunctionInfo>> &FunctionIndex,
>> -    bool EmitFunctionSummary) {
>> +    const Function &F, const Module *M, ValueEnumerator &VE,
>> +    BitstreamWriter &Stream,
>> +    DenseMap<const Function *, std::unique_ptr<GlobalValueInfo>> &FunctionIndex,
>> +    bool EmitSummaryIndex) {
>>  // Save the bitcode index of the start of this function block for recording
>>  // in the VST.
>>  uint64_t BitcodeIndex = Stream.GetCurrentBitNo();
>> 
>> +  bool HasProfileData = F.getEntryCount().hasValue();
>> +  std::unique_ptr<BlockFrequencyInfo> BFI;
>> +  if (EmitSummaryIndex && HasProfileData) {
>> +    Function &Func = const_cast<Function &>(F);
>> +    LoopInfo LI{DominatorTree(Func)};
>> +    BranchProbabilityInfo BPI{Func, LI};
>> +    BFI = llvm::make_unique<BlockFrequencyInfo>(Func, BPI, LI);
> 
> This means that writing out bitcode calculates DominatorTree, LoopInfo,
> and BlockFrequencyInfo.  These are not trivial.
> 
> I'm pretty surprised to see these here.
> 
>> +  }
>> +
>>  Stream.EnterSubblock(bitc::FUNCTION_BLOCK_ID, 4);
>>  VE.incorporateFunction(F);
> 
> 
>> Modified: llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt?rev=263275&r1=263274&r2=263275&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt (original)
>> +++ llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt Fri Mar 11 12:52:24 2016
>> @@ -19,4 +19,4 @@
>> type = Library
>> name = BitWriter
>> parent = Bitcode
>> -required_libraries = Core Support
>> +required_libraries = Analysis Core Support
> 
> This means that `llvm-as` now has a dependency on lib/Analysis... which
> is surprising.
> 
> I feel like we've done something fundamentally wrong if we have to
> run expensive analyses in order to write out IR.


I agree: we should have a pass that produces the "FunctionIndex" itself, and the bitcode writer should be supplied with the result of running this pass.

-- 
Mehdi




More information about the llvm-commits mailing list