<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Mar 21, 2016 at 4:02 PM, Duncan P. N. Exon Smith <span dir="ltr"><<a href="mailto:dexonsmith@apple.com" target="_blank">dexonsmith@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5"><br>
> On 2016-Mar-21, at 15:13, Mehdi Amini <<a href="mailto:mehdi.amini@apple.com">mehdi.amini@apple.com</a>> wrote:<br>
><br>
>><br>
>> On Mar 17, 2016, at 5:55 PM, Duncan P. N. Exon Smith via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>> wrote:<br>
>><br>
>>><br>
>>> On 2016-Mar-11, at 10:52, Teresa Johnson via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>> wrote:<br>
>>><br>
>>> Author: tejohnson<br>
>>> Date: Fri Mar 11 12:52:24 2016<br>
>>> New Revision: 263275<br>
>>><br>
>>> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=263275&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=263275&view=rev</a><br>
>>> Log:<br>
>>> [ThinLTO] Support for reference graph in per-module and combined summary.<br>
>>><br>
>>> Summary:<br>
>>> This patch adds support for including a full reference graph including<br>
>>> call graph edges and other GV references in the summary.<br>
>>><br>
>>> The reference graph edges can be used to make importing decisions<br>
>>> without materializing any source modules, can be used in the plugin<br>
>>> to make file staging decisions for distributed build systems, and is<br>
>>> expected to have other uses.<br>
>>><br>
>>> The call graph edges are recorded in each function summary in the<br>
>>> bitcode via a list of <CalleeValueIds, StaticCount> tuples when no PGO<br>
>>> data exists, or <CalleeValueId, StaticCount, ProfileCount> pairs when<br>
>>> there is PGO, where the ValueId can be mapped to the function GUID via<br>
>>> the ValueSymbolTable. In the function index in memory, the call graph<br>
>>> edges reference the target via the CalleeGUID instead of the<br>
>>> CalleeValueId.<br>
>>><br>
>>> The reference graph edges are recorded in each summary record with a<br>
>>> list of referenced value IDs, which can be mapped to value GUID via the<br>
>>> ValueSymbolTable.<br>
>>><br>
>>> Addtionally, a new summary record type is added to record references<br>
>>> from global variable initializers. A number of bitcode records and data<br>
>>> structures have been renamed to reflect the newly expanded scope of the<br>
>>> summary beyond functions. More cleanup will follow.<br>
>>><br>
>>> Reviewers: joker.eph, davidxl<br>
>>><br>
>>> Subscribers: joker.eph, llvm-commits<br>
>>><br>
>>> Differential Revision: <a href="http://reviews.llvm.org/D17212" rel="noreferrer" target="_blank">http://reviews.llvm.org/D17212</a><br>
>><br>
>> [snip]<br>
>><br>
>>> Modified: llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp<br>
>>> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp?rev=263275&r1=263274&r2=263275&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp?rev=263275&r1=263274&r2=263275&view=diff</a><br>
>>> ==============================================================================<br>
>>> --- llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp (original)<br>
>>> +++ llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp Fri Mar 11 12:52:24 2016<br>
>>> @@ -2440,34 +2472,61 @@ static void WriteUseListBlock(const Func<br>
>>> Stream.ExitBlock();<br>
>>> }<br>
>>><br>
>>> -/// \brief Save information for the given function into the function index.<br>
>>> -///<br>
>>> -/// At a minimum this saves the bitcode index of the function record that<br>
>>> -/// was just written. However, if we are emitting function summary information,<br>
>>> -/// for example for ThinLTO, then a \a FunctionSummary object is created<br>
>>> -/// to hold the provided summary information.<br>
>>> -static void SaveFunctionInfo(<br>
>>> - const Function &F,<br>
>>> - DenseMap<const Function *, std::unique_ptr<FunctionInfo>> &FunctionIndex,<br>
>>> - unsigned NumInsts, uint64_t BitcodeIndex, bool EmitFunctionSummary) {<br>
>>> - std::unique_ptr<FunctionSummary> FuncSummary;<br>
>>> - if (EmitFunctionSummary) {<br>
>>> - FuncSummary = llvm::make_unique<FunctionSummary>(NumInsts);<br>
>>> - FuncSummary->setFunctionLinkage(F.getLinkage());<br>
>>> +// Walk through the operands of a given User via worklist iteration and populate<br>
>>> +// the set of GlobalValue references encountered. Invoked either on an<br>
>>> +// Instruction or a GlobalVariable (which walks its initializer).<br>
>>> +static void findRefEdges(const User *CurUser, const ValueEnumerator &VE,<br>
>>> + DenseSet<unsigned> &RefEdges,<br>
>>> + SmallPtrSet<const User *, 8> &Visited) {<br>
>>> + SmallVector<const User *, 32> Worklist;<br>
>>> + Worklist.push_back(CurUser);<br>
>>> +<br>
>>> + while (!Worklist.empty()) {<br>
>>> + const User *U = Worklist.pop_back_val();<br>
>>> +<br>
>>> + if (!Visited.insert(U).second)<br>
>>> + continue;<br>
>>> +<br>
>>> + ImmutableCallSite CS(U);<br>
>>> +<br>
>>> + for (const auto &OI : U->operands()) {<br>
>>> + const User *Operand = dyn_cast<User>(OI);<br>
>>> + if (!Operand)<br>
>>> + continue;<br>
>>> + if (isa<BlockAddress>(Operand))<br>
>>> + continue;<br>
>>> + if (isa<GlobalValue>(Operand)) {<br>
>>> + // We have a reference to a global value. This should be added to<br>
>>> + // the reference set unless it is a callee. Callees are handled<br>
>>> + // specially by WriteFunction and are added to a separate list.<br>
>>> + if (!(CS && CS.isCallee(&OI)))<br>
>>> + RefEdges.insert(VE.getValueID(Operand));<br>
>>> + continue;<br>
>>> + }<br>
>>> + Worklist.push_back(Operand);<br>
>>> + }<br>
>>> }<br>
>>> - FunctionIndex[&F] =<br>
>>> - llvm::make_unique<FunctionInfo>(BitcodeIndex, std::move(FuncSummary));<br>
>>> }<br>
>>><br>
>>> /// Emit a function body to the module stream.<br>
>>> static void WriteFunction(<br>
>>> - const Function &F, ValueEnumerator &VE, BitstreamWriter &Stream,<br>
>>> - DenseMap<const Function *, std::unique_ptr<FunctionInfo>> &FunctionIndex,<br>
>>> - bool EmitFunctionSummary) {<br>
>>> + const Function &F, const Module *M, ValueEnumerator &VE,<br>
>>> + BitstreamWriter &Stream,<br>
>>> + DenseMap<const Function *, std::unique_ptr<GlobalValueInfo>> &FunctionIndex,<br>
>>> + bool EmitSummaryIndex) {<br>
>>> // Save the bitcode index of the start of this function block for recording<br>
>>> // in the VST.<br>
>>> uint64_t BitcodeIndex = Stream.GetCurrentBitNo();<br>
>>><br>
>>> + bool HasProfileData = F.getEntryCount().hasValue();<br>
>>> + std::unique_ptr<BlockFrequencyInfo> BFI;<br>
>>> + if (EmitSummaryIndex && HasProfileData) {<br>
>>> + Function &Func = const_cast<Function &>(F);<br>
>>> + LoopInfo LI{DominatorTree(Func)};<br>
>>> + BranchProbabilityInfo BPI{Func, LI};<br>
>>> + BFI = llvm::make_unique<BlockFrequencyInfo>(Func, BPI, LI);<br>
>><br>
>> This means that writing out bitcode calculates DominatorTree, LoopInfo,<br>
>> and BlockFrequencyInfo. These are not trivial.<br>
>><br>
>> I'm pretty surprised to see these here.<br>
>><br>
>>> + }<br>
>>> +<br>
>>> Stream.EnterSubblock(bitc::FUNCTION_BLOCK_ID, 4);<br>
>>> VE.incorporateFunction(F);<br>
>><br>
>><br>
>>> Modified: llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt<br>
>>> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt?rev=263275&r1=263274&r2=263275&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt?rev=263275&r1=263274&r2=263275&view=diff</a><br>
>>> ==============================================================================<br>
>>> --- llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt (original)<br>
>>> +++ llvm/trunk/lib/Bitcode/Writer/LLVMBuild.txt Fri Mar 11 12:52:24 2016<br>
>>> @@ -19,4 +19,4 @@<br>
>>> type = Library<br>
>>> name = BitWriter<br>
>>> parent = Bitcode<br>
>>> -required_libraries = Core Support<br>
>>> +required_libraries = Analysis Core Support<br>
>><br>
>> This means that `llvm-as` now has a dependency on lib/Analysis... which<br>
>> is surprising.<br>
>><br>
>> I feel like we've done something fundamentally wrong if we have to<br>
>> run expensive analyses in order to write out IR.<br>
><br>
><br>
> I agree: we should have a pass that produces the "FunctionIndex" itself, and the bitcode writer should be supplied with the result of running this pass.<br>
><br>
<br>
</div></div>This reminds me of something Mehdi noticed earlier and mentioned<br>
to me in passing, but I'm not sure anyone brought up on the list<br>
yet. The current implementation of the function summary cannot<br>
round-trip to textual IR. I feel like this breaks a fundamental<br>
IR contract (bitcode == textual IR == IR in memory).<br>
<br>
Mehdi, have you already filed a PR for this and I missed it?<br></blockquote><div><br></div><div>Mehdi and I discussed this on IRC at one point. You are right, it is a missing piece that will be needed. I am not sure what the best way to represent this textually is. Metadata doesn't seem right, since it isn't metadata internally. Maybe as a special format specific to the function summary?</div></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><span style="font-family:Times;font-size:medium"><table cellspacing="0" cellpadding="0"><tbody><tr style="color:rgb(85,85,85);font-family:sans-serif;font-size:small"><td nowrap style="border-top-style:solid;border-top-color:rgb(213,15,37);border-top-width:2px">Teresa Johnson |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(51,105,232);border-top-width:2px"> Software Engineer |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(0,153,57);border-top-width:2px"> <a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a> |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(238,178,17);border-top-width:2px"> 408-460-2413</td></tr></tbody></table></span></div>
</div></div>