[PATCH] D12781: PGO IR-level instrumentation infrastructure

Sean Silva via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 23 21:54:22 PDT 2015


On Wed, Sep 23, 2015 at 6:49 PM, Xinliang David Li <davidxl at google.com>
wrote:

>
>
> On Wed, Sep 23, 2015 at 6:02 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>>
>>
>> On Wed, Sep 23, 2015 at 4:39 PM, Sean Silva <chisophugis at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Wed, Sep 23, 2015 at 3:28 PM, Rong Xu <xur at google.com> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Sep 22, 2015 at 6:09 PM, Sean Silva <chisophugis at gmail.com>
>>>> wrote:
>>>>
>>>>> silvas added a reviewer: kcc.
>>>>> silvas added a comment.
>>>>>
>>>>> This should not be called "late instrumentation". That is
>>>>> clang-specific terminology. Think from the perspective of a random frontend
>>>>> (say, one that doesn't have any other PGO mechanism). There should be no
>>>>> mention of "FE instrumentation", "late instrumentation" "IR-level
>>>>> instrumentation", etc. This is just "edge profiling" or something like that.
>>>>>
>>>>
>>>> these terms are mainly for the reviewers to distinguish the two
>>>> instrumentation. I'll clean up the code and comments to use "edge
>>>> profiling".
>>>>
>>>
>>> Thanks.
>>>
>>>
>>>>
>>>>> A couple other high-level questions/comments:
>>>>>
>>>>> - is there a reason that this is breaking critical edges itself
>>>>> instead of using break-crit-edges pass? Is break-crit-edges dead?
>>>>> SanitizerCoverage doesn't seem to use it either.
>>>>>
>>>>
>>>> As I mentioned in the email to Kostya: this is to avoid unnecessary
>>>> split. We only need to split a small set of the critical edges.
>>>>
>>>>
>>>>> - why is this a module pass? It operates entirely on a
>>>>> function-by-function basis.
>>>>>
>>>> Good question. When I started the implementation, I thought I would
>>>> create global functions and global vars that needed to be a module pass.
>>>> But now it seems function pass should work. I'll change this in a later
>>>> patch.
>>>>
>>>> - can we use GraphTraits for the MST algorithm, so that it is reusable?
>>>>> (could probably be a separate patch if so).
>>>>>
>>>> This can be done in the future in a later patch. Not sure what other
>>>> pass would use MST.
>>>>
>>>
>>> At the very least, it may help to be able to unittest the MST algorithm.
>>> Your current patch has no testing for this; how are you sure that your MST
>>> algorithm is correct? (actually, this patch has no tests at all; please add
>>> testing to the patch once we have converged on the final design).
>>>
>>>
>>>>
>>>>
>>>>> - could we break the PGO data reading into a separate analysis? Also
>>>>> could the pass be split into more fine-grained passes? (I think the answer
>>>>> is yes to both questions)
>>>>>
>>>>
>>>> I would prefer to put instrumentation and reading profile (read the
>>>> count out and populate to each BB) in the same pass. This two passes need
>>>> to see the matched IR. Separating them will create more changes of
>>>> mismatches.
>>>>
>>>
>>> The problem is that now we have a pass that really is two separate
>>> passes. If we look in PGOLateInstrumentationFunc::
>>> PGOLateInstrumentationFunc, we can see that there is a shared block of code:
>>>
>>>     setFuncName(F);
>>>     buildEdges();
>>>     sortEdges();
>>>     computeMinimumSpanningTree();
>>>     computeCFGHash();
>>>     dumpEdges();
>>>
>>> Then the pass does two radically different things:
>>>
>>>     if (!IsUse) {
>>>       instrumentCFG();
>>>       return;
>>>     }
>>>
>>>     if (readCounters(Filename)) {
>>>       populateCounters();
>>>       setBranchWeights();
>>>       dumpEdges(true);
>>>     }
>>>
>>> The shared block of code seems like it could be pulled out into a helper
>>> class that can be shared by the two passes.
>>>
>>> Precisely because the two different operations are so tightly coupled,
>>> we have to clearly separate the single point of truth that they both rely
>>> on. Otherwise it is difficult to reason about this close coupling. This
>>> piece of functionality should be clearly separated (separate
>>> class/function; a method is insufficient isolation because then there can
>>> be accidental interdependence due to member variables).
>>>
>>> Also, in order to fully benefit from the MST, we will need to be able to
>>> read the counters earlier (but if I understand correctly, don't need to set
>>> the branch weight metadata). This indicates that, roughly speaking,
>>> `populateCounters()` should be clearly separated.
>>>
>>> Even if we don't expose two different passes (for the reason you
>>> described), internally in the implementation there has to be a clear
>>> separation. I'm still not convinced that we shouldn't expose two different
>>> passes, but it should be easy to make that choice later once the internal
>>> implementation is cleanly factored.
>>>
>>>
>>>> But I could move some of the code, like setting function attributes
>>>> into a separated pass.
>>>>
>>>>
>>>>> - We should directly accept a buffer containing the profile feedback
>>>>> data instead of a file name. That lets the frontend do the error handling,
>>>>> and is more flexible for usage as a library.
>>>>>
>>>>
>>>> Good suggestion. I'll work on this.
>>>>
>>>
>>> Now that I look at the code again, instead of a raw buffer, it probably
>>> makes sense for the frontend to pass in an IndexedInstrProfReader directly.
>>> The idea is that as many "error conditions" as possible should be pushed up
>>> into the frontend.
>>>
>>
>> Also, the current patch is reading the profile file O(#functions) times
>> (and reading the file is O(#functions) work, leading to quadratic run
>> time). Clearly this has to be avoided, so merely passing in the contents of
>> the profile file is an insufficient solution. Something like passing in an
>> IndexedInstrProfReader (or possibly your own class, which does further
>> preprocessing on the data from an IndexedInstrProfReader) will be necessary.
>>
>
>
> Actually it is not quadratic as you described.  Profile use reads only
> indexed format. The readProfile only reads the header of the profile, so
> the cost is still linear.
>
> Of course, there is still some redundant work in the current
> implementation. The reader instance should be maintained at module level
> and shared across all functions.  This is perhaps a reason for it to be a
> module pass.
>

Ah, you're right; at least it's not quadratic. But yes, it's highly
redundant and may be a good reason for it to be a module pass (if we cannot
somehow share across functions using an analysis, but I'm never sure with
the old PassManager).

-- Sean Silva


>
> David
>
>
>
>>
>> -- Sean Silva
>>
>>
>>>
>>> -- Sean Silva
>>>
>>>
>>>>
>>>>
>>>>>
>>>>> Adding Kostya, as this has a lot of similarities to sanitizer
>>>>> coverage; also just for general knowledge of instrumentation passes.
>>>>>
>>>>>
>>>>> ================
>>>>> Comment at: lib/Transforms/Instrumentation/PGOLateInstr.cpp:122-126
>>>>> @@ +121,7 @@
>>>>> +
>>>>> +  // Threshold of the hot functions.
>>>>> +  const float HotFunctionThreshold = 0.01;
>>>>> +
>>>>> +  // Threshold of the cold functions.
>>>>> +  const float ColdFunctionThreshold = 0.0002;
>>>>> +
>>>>> ----------------
>>>>> Use integers.
>>>>>
>>>>> ================
>>>>> Comment at: lib/Transforms/Instrumentation/PGOLateInstr.cpp:301-302
>>>>> @@ +300,4 @@
>>>>> +
>>>>> +// Compute the CRC32 of unsigned integers.
>>>>> +unsigned PGOLateInstrumentationFunc::crc32UnsignedBits(unsigned
>>>>> Chksum,
>>>>> +                                                       unsigned Value,
>>>>> ----------------
>>>>> Why crc32?
>>>>>
>>>>> ================
>>>>> Comment at: lib/Transforms/Instrumentation/PGOLateInstr.cpp:313-314
>>>>> @@ +312,4 @@
>>>>> +
>>>>> +// Compute Hash value for the CFG: the lower 32 bits are CRC32 of the
>>>>> index
>>>>> +// value of each BB in the CFG. The higher 32 bits record the number
>>>>> of edges.
>>>>> +void PGOLateInstrumentationFunc::computeCFGHash() {
>>>>> ----------------
>>>>> This seems like a very poor use of the bits of the hash. Why not use
>>>>> md5 like clang's `PGOHash`?
>>>>>
>>>>>
>>>>> http://reviews.llvm.org/D12781
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150923/5acbcd12/attachment.html>


More information about the llvm-commits mailing list