[PATCH] D12781: PGO IR-level instrumentation infrastructure

Xinliang David Li via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 23 16:51:57 PDT 2015


On Wed, Sep 23, 2015 at 4:39 PM, Sean Silva <chisophugis at gmail.com> wrote:

>
>
> On Wed, Sep 23, 2015 at 3:28 PM, Rong Xu <xur at google.com> wrote:
>
>>
>>
>> On Tue, Sep 22, 2015 at 6:09 PM, Sean Silva <chisophugis at gmail.com>
>> wrote:
>>
>>> silvas added a reviewer: kcc.
>>> silvas added a comment.
>>>
>>> This should not be called "late instrumentation". That is clang-specific
>>> terminology. Think from the perspective of a random frontend (say, one that
>>> doesn't have any other PGO mechanism). There should be no mention of "FE
>>> instrumentation", "late instrumentation" "IR-level instrumentation", etc.
>>> This is just "edge profiling" or something like that.
>>>
>>
>> these terms are mainly for the reviewers to distinguish the two
>> instrumentation. I'll clean up the code and comments to use "edge
>> profiling".
>>
>
> Thanks.
>
>
>>
>>> A couple other high-level questions/comments:
>>>
>>> - is there a reason that this is breaking critical edges itself instead
>>> of using break-crit-edges pass? Is break-crit-edges dead? SanitizerCoverage
>>> doesn't seem to use it either.
>>>
>>
>> As I mentioned in the email to Kostya: this is to avoid unnecessary
>> split. We only need to split a small set of the critical edges.
>>
>>
>>> - why is this a module pass? It operates entirely on a
>>> function-by-function basis.
>>>
>> Good question. When I started the implementation, I thought I would
>> create global functions and global vars that needed to be a module pass.
>> But now it seems function pass should work. I'll change this in a later
>> patch.
>>
>> - can we use GraphTraits for the MST algorithm, so that it is reusable?
>>> (could probably be a separate patch if so).
>>>
>> This can be done in the future in a later patch. Not sure what other pass
>> would use MST.
>>
>
> At the very least, it may help to be able to unittest the MST algorithm.
> Your current patch has no testing for this; how are you sure that your MST
> algorithm is correct? (actually, this patch has no tests at all; please add
> testing to the patch once we have converged on the final design).
>
>
>>
>>
>>> - could we break the PGO data reading into a separate analysis? Also
>>> could the pass be split into more fine-grained passes? (I think the answer
>>> is yes to both questions)
>>>
>>
>> I would prefer to put instrumentation and reading profile (read the count
>> out and populate to each BB) in the same pass. This two passes need to see
>> the matched IR. Separating them will create more changes of mismatches.
>>
>
> The problem is that now we have a pass that really is two separate passes.
> If we look in PGOLateInstrumentationFunc:: PGOLateInstrumentationFunc, we
> can see that there is a shared block of code:
>
>     setFuncName(F);
>     buildEdges();
>     sortEdges();
>     computeMinimumSpanningTree();
>     computeCFGHash();
>     dumpEdges();
>
> Then the pass does two radically different things:
>
>     if (!IsUse) {
>       instrumentCFG();
>       return;
>     }
>
>     if (readCounters(Filename)) {
>       populateCounters();
>       setBranchWeights();
>       dumpEdges(true);
>     }
>
> The shared block of code seems like it could be pulled out into a helper
> class that can be shared by the two passes.
>
> Precisely because the two different operations are so tightly coupled, we
> have to clearly separate the single point of truth that they both rely on.
> Otherwise it is difficult to reason about this close coupling. This piece
> of functionality should be clearly separated (separate class/function; a
> method is insufficient isolation because then there can be accidental
> interdependence due to member variables).
>
>
Moving CfgMst analysis into its own class as utility/help class, and make
Instrumentation and Profile Annotation two distinct passes sounds
reasonable.



> Also, in order to fully benefit from the MST, we will need to be able to
> read the counters earlier (but if I understand correctly, don't need to set
> the branch weight metadata). This indicates that, roughly speaking,
> `populateCounters()` should be clearly separated.
>

Not sure what you mean here.  populateCounters depend on readCounters ..

David


>
> Even if we don't expose two different passes (for the reason you
> described), internally in the implementation there has to be a clear
> separation. I'm still not convinced that we shouldn't expose two different
> passes, but it should be easy to make that choice later once the internal
> implementation is cleanly factored.
>
>
>> But I could move some of the code, like setting function attributes into
>> a separated pass.
>>
>>
>>> - We should directly accept a buffer containing the profile feedback
>>> data instead of a file name. That lets the frontend do the error handling,
>>> and is more flexible for usage as a library.
>>>
>>
>> Good suggestion. I'll work on this.
>>
>
> Now that I look at the code again, instead of a raw buffer, it probably
> makes sense for the frontend to pass in an IndexedInstrProfReader directly.
> The idea is that as many "error conditions" as possible should be pushed up
> into the frontend.
>
> -- Sean Silva
>
>
>>
>>
>>>
>>> Adding Kostya, as this has a lot of similarities to sanitizer coverage;
>>> also just for general knowledge of instrumentation passes.
>>>
>>>
>>> ================
>>> Comment at: lib/Transforms/Instrumentation/PGOLateInstr.cpp:122-126
>>> @@ +121,7 @@
>>> +
>>> +  // Threshold of the hot functions.
>>> +  const float HotFunctionThreshold = 0.01;
>>> +
>>> +  // Threshold of the cold functions.
>>> +  const float ColdFunctionThreshold = 0.0002;
>>> +
>>> ----------------
>>> Use integers.
>>>
>>> ================
>>> Comment at: lib/Transforms/Instrumentation/PGOLateInstr.cpp:301-302
>>> @@ +300,4 @@
>>> +
>>> +// Compute the CRC32 of unsigned integers.
>>> +unsigned PGOLateInstrumentationFunc::crc32UnsignedBits(unsigned Chksum,
>>> +                                                       unsigned Value,
>>> ----------------
>>> Why crc32?
>>>
>>> ================
>>> Comment at: lib/Transforms/Instrumentation/PGOLateInstr.cpp:313-314
>>> @@ +312,4 @@
>>> +
>>> +// Compute Hash value for the CFG: the lower 32 bits are CRC32 of the
>>> index
>>> +// value of each BB in the CFG. The higher 32 bits record the number of
>>> edges.
>>> +void PGOLateInstrumentationFunc::computeCFGHash() {
>>> ----------------
>>> This seems like a very poor use of the bits of the hash. Why not use md5
>>> like clang's `PGOHash`?
>>>
>>>
>>> http://reviews.llvm.org/D12781
>>>
>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150923/cdbeb802/attachment.html>


More information about the llvm-commits mailing list