[llvm] r263275 - [ThinLTO] Support for reference graph in per-module and combined summary.

Xinliang David Li via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 21 16:09:11 PDT 2016


On Mon, Mar 21, 2016 at 4:02 PM, Duncan P. N. Exon Smith via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

>
> > On 2016-Mar-21, at 15:13, Mehdi Amini <mehdi.amini at apple.com> wrote:
> >
> >>
> >> On Mar 17, 2016, at 5:55 PM, Duncan P. N. Exon Smith via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
> >>
> >>>
> >>> On 2016-Mar-11, at 10:52, Teresa Johnson via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
> >>>
> >>> Author: tejohnson
> >>> Date: Fri Mar 11 12:52:24 2016
> >>> New Revision: 263275
> >>>
> >>> URL: http://llvm.org/viewvc/llvm-project?rev=263275&view=rev
> >>> Log:
> >>> [ThinLTO] Support for reference graph in per-module and combined
> summary.
> >>>
> >>> Summary:
> >>> This patch adds support for including a full reference graph including
> >>> call graph edges and other GV references in the summary.
> >>>
> >>> The reference graph edges can be used to make importing decisions
> >>> without materializing any source modules, can be used in the plugin
> >>> to make file staging decisions for distributed build systems, and is
> >>> expected to have other uses.
> >>>
> >>> The call graph edges are recorded in each function summary in the
> >>> bitcode via a list of <CalleeValueIds, StaticCount> tuples when no PGO
> >>> data exists, or <CalleeValueId, StaticCount, ProfileCount> pairs when
> >>> there is PGO, where the ValueId can be mapped to the function GUID via
> >>> the ValueSymbolTable. In the function index in memory, the call graph
> >>> edges reference the target via the CalleeGUID instead of the
> >>> CalleeValueId.
> >>>
> >>> The reference graph edges are recorded in each summary record with a
> >>> list of referenced value IDs, which can be mapped to value GUID via the
> >>> ValueSymbolTable.
> >>>
> >>> Addtionally, a new summary record type is added to record references
> >>> from global variable initializers. A number of bitcode records and data
> >>> structures have been renamed to reflect the newly expanded scope of the
> >>> summary beyond functions. More cleanup will follow.
> >>>
> >>> Reviewers: joker.eph, davidxl
> >>>
> >>> Subscribers: joker.eph, llvm-commits
> >>>
> >>> Differential Revision: http://reviews.llvm.org/D17212
> >>
> >> [snip]
> >>
> >>> Modified: llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp
> >>> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp?rev=263275&r1=263274&r2=263275&view=diff
> >>>
> ==============================================================================
> >>> --- llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp (original)
> >>> +++ llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp Fri Mar 11
> 12:52:24 2016
> >>> @@ -2440,34 +2472,61 @@ static void WriteUseListBlock(const Func
> >>> Stream.ExitBlock();
> >>> }
> >>>
> >>> -/// \brief Save information for the given function into the function
> index.
> >>> -///
> >>> -/// At a minimum this saves the bitcode index of the function record
> that
> >>> -/// was just written. However, if we are emitting function summary
> information,
> >>> -/// for example for ThinLTO, then a \a FunctionSummary object is
> created
> >>> -/// to hold the provided summary information.
> >>> -static void SaveFunctionInfo(
> >>> -    const Function &F,
> >>> -    DenseMap<const Function *, std::unique_ptr<FunctionInfo>>
> &FunctionIndex,
> >>> -    unsigned NumInsts, uint64_t BitcodeIndex, bool
> EmitFunctionSummary) {
> >>> -  std::unique_ptr<FunctionSummary> FuncSummary;
> >>> -  if (EmitFunctionSummary) {
> >>> -    FuncSummary = llvm::make_unique<FunctionSummary>(NumInsts);
> >>> -    FuncSummary->setFunctionLinkage(F.getLinkage());
> >>> +// Walk through the operands of a given User via worklist iteration
> and populate
> >>> +// the set of GlobalValue references encountered. Invoked either on an
> >>> +// Instruction or a GlobalVariable (which walks its initializer).
> >>> +static void findRefEdges(const User *CurUser, const ValueEnumerator
> &VE,
> >>> +                         DenseSet<unsigned> &RefEdges,
> >>> +                         SmallPtrSet<const User *, 8> &Visited) {
> >>> +  SmallVector<const User *, 32> Worklist;
> >>> +  Worklist.push_back(CurUser);
> >>> +
> >>> +  while (!Worklist.empty()) {
> >>> +    const User *U = Worklist.pop_back_val();
> >>> +
> >>> +    if (!Visited.insert(U).second)
> >>> +      continue;
> >>> +
> >>> +    ImmutableCallSite CS(U);
> >>> +
> >>> +    for (const auto &OI : U->operands()) {
> >>> +      const User *Operand = dyn_cast<User>(OI);
> >>> +      if (!Operand)
> >>> +        continue;
> >>> +      if (isa<BlockAddress>(Operand))
> >>> +        continue;
> >>> +      if (isa<GlobalValue>(Operand)) {
> >>> +        // We have a reference to a global value. This should be
> added to
> >>> +        // the reference set unless it is a callee. Callees are
> handled
> >>> +        // specially by WriteFunction and are added to a separate
> list.
> >>> +        if (!(CS && CS.isCallee(&OI)))
> >>> +          RefEdges.insert(VE.getValueID(Operand));
> >>> +        continue;
> >>> +      }
> >>> +      Worklist.push_back(Operand);
> >>> +    }
> >>> }
> >>> -  FunctionIndex[&F] =
> >>> -      llvm::make_unique<FunctionInfo>(BitcodeIndex,
> std::move(FuncSummary));
> >>> }
> >>>
> >>> /// Emit a function body to the module stream.
> >>> static void WriteFunction(
> >>> -    const Function &F, ValueEnumerator &VE, BitstreamWriter &Stream,
> >>> -    DenseMap<const Function *, std::unique_ptr<FunctionInfo>>
> &FunctionIndex,
> >>> -    bool EmitFunctionSummary) {
> >>> +    const Function &F, const Module *M, ValueEnumerator &VE,
> >>> +    BitstreamWriter &Stream,
> >>> +    DenseMap<const Function *, std::unique_ptr<GlobalValueInfo>>
> &FunctionIndex,
> >>> +    bool EmitSummaryIndex) {
> >>> // Save the bitcode index of the start of this function block for
> recording
> >>> // in the VST.
> >>> uint64_t BitcodeIndex = Stream.GetCurrentBitNo();
> >>>
> >>> +  bool HasProfileData = F.getEntryCount().hasValue();
> >>> +  std::unique_ptr<BlockFrequencyInfo> BFI;
> >>> +  if (EmitSummaryIndex && HasProfileData) {
> >>> +    Function &Func = const_cast<Function &>(F);
> >>> +    LoopInfo LI{DominatorTree(Func)};
> >>> +    BranchProbabilityInfo BPI{Func, LI};
> >>> +    BFI = llvm::make_unique<BlockFrequencyInfo>(Func, BPI, LI);
> >>
> >> This means that writing out bitcode calculates DominatorTree, LoopInfo,
> >> and BlockFrequencyInfo.  These are not trivial.
> >>
> >> I'm pretty surprised to see these here.
> >>
> >>> +  }
> >>> +
> >>> Stream.EnterSubblock(bitc::FUNCTION_BLOCK_ID, 4);
> >>> VE.incorporateFunction(F);
> >>
> >>
> >>> Modified: llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt
> >>> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt?rev=263275&r1=263274&r2=263275&view=diff
> >>>
> ==============================================================================
> >>> --- llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt (original)
> >>> +++ llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt Fri Mar 11 12:52:24
> 2016
> >>> @@ -19,4 +19,4 @@
> >>> type = Library
> >>> name = BitWriter
> >>> parent = Bitcode
> >>> -required_libraries = Core Support
> >>> +required_libraries = Analysis Core Support
> >>
> >> This means that `llvm-as` now has a dependency on lib/Analysis... which
> >> is surprising.
> >>
> >> I feel like we've done something fundamentally wrong if we have to
> >> run expensive analyses in order to write out IR.
> >
> >
> > I agree: we should have a pass that produces the "FunctionIndex" itself,
> and the bitcode writer should be supplied with the result of running this
> pass.
> >
>
> This reminds me of something Mehdi noticed earlier and mentioned
> to me in passing, but I'm not sure anyone brought up on the list
> yet.  The current implementation of the function summary cannot
> round-trip to textual IR.  I feel like this breaks a fundamental
> IR contract (bitcode == textual IR == IR in memory).
>

Can you elaborate on this? Is it that text IR dump will drop the summary
information Or something else? The information written out is derived from
information from IR nothing more or less.

David


>
> Mehdi, have you already filed a PR for this and I missed it?
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160321/ee21b5b1/attachment.html>


More information about the llvm-commits mailing list