[llvm] r263275 - [ThinLTO] Support for reference graph in per-module and combined summary.

Duncan P. N. Exon Smith via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 21 16:21:19 PDT 2016


> On 2016-Mar-21, at 16:13, Teresa Johnson <tejohnson at google.com> wrote:
> 
> 
> 
> On Mon, Mar 21, 2016 at 4:02 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
> > On 2016-Mar-21, at 15:13, Mehdi Amini <mehdi.amini at apple.com> wrote:
> >
> >>
> >> On Mar 17, 2016, at 5:55 PM, Duncan P. N. Exon Smith via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> >>
> >>>
> >>> On 2016-Mar-11, at 10:52, Teresa Johnson via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> >>>
> >>> Author: tejohnson
> >>> Date: Fri Mar 11 12:52:24 2016
> >>> New Revision: 263275
> >>>
> >>> URL: http://llvm.org/viewvc/llvm-project?rev=263275&view=rev
> >>> Log:
> >>> [ThinLTO] Support for reference graph in per-module and combined summary.
> >>>
> >>> Summary:
> >>> This patch adds support for including a full reference graph including
> >>> call graph edges and other GV references in the summary.
> >>>
> >>> The reference graph edges can be used to make importing decisions
> >>> without materializing any source modules, can be used in the plugin
> >>> to make file staging decisions for distributed build systems, and is
> >>> expected to have other uses.
> >>>
> >>> The call graph edges are recorded in each function summary in the
> >>> bitcode via a list of <CalleeValueIds, StaticCount> tuples when no PGO
> >>> data exists, or <CalleeValueId, StaticCount, ProfileCount> pairs when
> >>> there is PGO, where the ValueId can be mapped to the function GUID via
> >>> the ValueSymbolTable. In the function index in memory, the call graph
> >>> edges reference the target via the CalleeGUID instead of the
> >>> CalleeValueId.
> >>>
> >>> The reference graph edges are recorded in each summary record with a
> >>> list of referenced value IDs, which can be mapped to value GUID via the
> >>> ValueSymbolTable.
> >>>
> >>> Addtionally, a new summary record type is added to record references
> >>> from global variable initializers. A number of bitcode records and data
> >>> structures have been renamed to reflect the newly expanded scope of the
> >>> summary beyond functions. More cleanup will follow.
> >>>
> >>> Reviewers: joker.eph, davidxl
> >>>
> >>> Subscribers: joker.eph, llvm-commits
> >>>
> >>> Differential Revision: http://reviews.llvm.org/D17212
> >>
> >> [snip]
> >>
> >>> Modified: llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp
> >>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp?rev=263275&r1=263274&r2=263275&view=diff
> >>> ==============================================================================
> >>> --- llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp (original)
> >>> +++ llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp Fri Mar 11 12:52:24 2016
> >>> @@ -2440,34 +2472,61 @@ static void WriteUseListBlock(const Func
> >>> Stream.ExitBlock();
> >>> }
> >>>
> >>> -/// \brief Save information for the given function into the function index.
> >>> -///
> >>> -/// At a minimum this saves the bitcode index of the function record that
> >>> -/// was just written. However, if we are emitting function summary information,
> >>> -/// for example for ThinLTO, then a \a FunctionSummary object is created
> >>> -/// to hold the provided summary information.
> >>> -static void SaveFunctionInfo(
> >>> -    const Function &F,
> >>> -    DenseMap<const Function *, std::unique_ptr<FunctionInfo>> &FunctionIndex,
> >>> -    unsigned NumInsts, uint64_t BitcodeIndex, bool EmitFunctionSummary) {
> >>> -  std::unique_ptr<FunctionSummary> FuncSummary;
> >>> -  if (EmitFunctionSummary) {
> >>> -    FuncSummary = llvm::make_unique<FunctionSummary>(NumInsts);
> >>> -    FuncSummary->setFunctionLinkage(F.getLinkage());
> >>> +// Walk through the operands of a given User via worklist iteration and populate
> >>> +// the set of GlobalValue references encountered. Invoked either on an
> >>> +// Instruction or a GlobalVariable (which walks its initializer).
> >>> +static void findRefEdges(const User *CurUser, const ValueEnumerator &VE,
> >>> +                         DenseSet<unsigned> &RefEdges,
> >>> +                         SmallPtrSet<const User *, 8> &Visited) {
> >>> +  SmallVector<const User *, 32> Worklist;
> >>> +  Worklist.push_back(CurUser);
> >>> +
> >>> +  while (!Worklist.empty()) {
> >>> +    const User *U = Worklist.pop_back_val();
> >>> +
> >>> +    if (!Visited.insert(U).second)
> >>> +      continue;
> >>> +
> >>> +    ImmutableCallSite CS(U);
> >>> +
> >>> +    for (const auto &OI : U->operands()) {
> >>> +      const User *Operand = dyn_cast<User>(OI);
> >>> +      if (!Operand)
> >>> +        continue;
> >>> +      if (isa<BlockAddress>(Operand))
> >>> +        continue;
> >>> +      if (isa<GlobalValue>(Operand)) {
> >>> +        // We have a reference to a global value. This should be added to
> >>> +        // the reference set unless it is a callee. Callees are handled
> >>> +        // specially by WriteFunction and are added to a separate list.
> >>> +        if (!(CS && CS.isCallee(&OI)))
> >>> +          RefEdges.insert(VE.getValueID(Operand));
> >>> +        continue;
> >>> +      }
> >>> +      Worklist.push_back(Operand);
> >>> +    }
> >>> }
> >>> -  FunctionIndex[&F] =
> >>> -      llvm::make_unique<FunctionInfo>(BitcodeIndex, std::move(FuncSummary));
> >>> }
> >>>
> >>> /// Emit a function body to the module stream.
> >>> static void WriteFunction(
> >>> -    const Function &F, ValueEnumerator &VE, BitstreamWriter &Stream,
> >>> -    DenseMap<const Function *, std::unique_ptr<FunctionInfo>> &FunctionIndex,
> >>> -    bool EmitFunctionSummary) {
> >>> +    const Function &F, const Module *M, ValueEnumerator &VE,
> >>> +    BitstreamWriter &Stream,
> >>> +    DenseMap<const Function *, std::unique_ptr<GlobalValueInfo>> &FunctionIndex,
> >>> +    bool EmitSummaryIndex) {
> >>> // Save the bitcode index of the start of this function block for recording
> >>> // in the VST.
> >>> uint64_t BitcodeIndex = Stream.GetCurrentBitNo();
> >>>
> >>> +  bool HasProfileData = F.getEntryCount().hasValue();
> >>> +  std::unique_ptr<BlockFrequencyInfo> BFI;
> >>> +  if (EmitSummaryIndex && HasProfileData) {
> >>> +    Function &Func = const_cast<Function &>(F);
> >>> +    LoopInfo LI{DominatorTree(Func)};
> >>> +    BranchProbabilityInfo BPI{Func, LI};
> >>> +    BFI = llvm::make_unique<BlockFrequencyInfo>(Func, BPI, LI);
> >>
> >> This means that writing out bitcode calculates DominatorTree, LoopInfo,
> >> and BlockFrequencyInfo.  These are not trivial.
> >>
> >> I'm pretty surprised to see these here.
> >>
> >>> +  }
> >>> +
> >>> Stream.EnterSubblock(bitc::FUNCTION_BLOCK_ID, 4);
> >>> VE.incorporateFunction(F);
> >>
> >>
> >>> Modified: llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt
> >>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt?rev=263275&r1=263274&r2=263275&view=diff
> >>> ==============================================================================
> >>> --- llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt (original)
> >>> +++ llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt Fri Mar 11 12:52:24 2016
> >>> @@ -19,4 +19,4 @@
> >>> type = Library
> >>> name = BitWriter
> >>> parent = Bitcode
> >>> -required_libraries = Core Support
> >>> +required_libraries = Analysis Core Support
> >>
> >> This means that `llvm-as` now has a dependency on lib/Analysis... which
> >> is surprising.
> >>
> >> I feel like we've done something fundamentally wrong if we have to
> >> run expensive analyses in order to write out IR.
> >
> >
> > I agree: we should have a pass that produces the "FunctionIndex" itself, and the bitcode writer should be supplied with the result of running this pass.
> >
> 
> This reminds me of something Mehdi noticed earlier and mentioned
> to me in passing, but I'm not sure anyone brought up on the list
> yet.  The current implementation of the function summary cannot
> round-trip to textual IR.  I feel like this breaks a fundamental
> IR contract (bitcode == textual IR == IR in memory).
> 
> Mehdi, have you already filed a PR for this and I missed it?
> 
> Mehdi and I discussed this on IRC at one point. You are right, it is a missing piece that will be needed. I am not sure what the best way to represent this textually is. Metadata doesn't seem right, since it isn't metadata internally. Maybe as a special format specific to the function summary?

I agree that Metadata doesn't make sense for this, for the reason
you gave.  A special format makes sense to me, but this probably
needs some broader discussion.  It's not obvious how to fix it.

> 
> 
> -- 
> Teresa Johnson |	 Software Engineer |	 tejohnson at google.com |	 408-460-2413



More information about the llvm-commits mailing list