[llvm-dev] (RFC) Encoding code duplication factor in discriminator

Dehao Chen via llvm-dev llvm-dev at lists.llvm.org
Tue Nov 1 14:07:47 PDT 2016

Oops... pressed the wrong button and sent out early...

On Tue, Nov 1, 2016 at 2:01 PM, Dehao Chen <dehao at google.com> wrote:

> If Hal's proposal is for SamplePGO purpose, let me clarify some design
> principles of SamplePGO.
> The profile for sample pgo uses source location as the key to map the
> execution count back to IR. This design is based on the principle that we
> do not want the profile to be tightly couple with compiler IR. Instead,
> profile is simple an attribute of the source code. We have been benefited
>  a lot from this design that the profile can easily be reused across
> different source versions and compiler versions, or even compilers.
> That being said, the design to encode more info into discriminator does
> not mean that we will change the profile. The encoded info in discriminator
> will be handled by the create_llvm_prof tool, which combines counts from
> different clones of the same source code and generate the combined profile
> data. The output profile will not have any cloning/dupliaction bits at all.
> So for the initial example profile I provided, the output profile will be:

#1: 10
#3: 80


#1: 10
#3.0x400: 70
#3.0x10400: 5
#3.0x20400: 3
#3.0x30400: 2

The goal of the proposed change, is to make profile more accurately
represent the attribute of the source.
The non-goal of the proposed change, is to provide more context in the
profile to present the behavior of program in the context of different

If we are pursuing more context-sensitive profile, that would be a very
different design. In this design, we will need to read profile multiple
times, and have the profile tightly coupled with the compiler IR/pass
manager. That is doable, but I don't think that probably better suits
instrumentation based PGO's domain. Comments?


> On Tue, Nov 1, 2016 at 1:04 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>> ------------------------------
>> *From: *"Paul Robinson" <paul.robinson at sony.com>
>> *To: *"Dehao Chen" <dehao at google.com>, "Hal Finkel" <hfinkel at anl.gov>
>> *Cc: *"Xinliang David Li" <davidxl at google.com>, llvm-dev at lists.llvm.org
>> *Sent: *Tuesday, November 1, 2016 2:15:38 PM
>> *Subject: *RE: [llvm-dev] (RFC) Encoding code duplication factor
>> in        discriminator
>> As illustrated in the above example, it is not like "vectorization has a
>> distinct bit". All different optimizations make clones of code which will
>> be labeled by UIDs represented by N (e.g. 8) bits. In this way, the space
>> will be capped by the number of clones all optimizations have made, instead
>> of # of optimizations that has applied. And it will be capped at 2^N-1. The
>> cons of using uid is that you will not know if a clone is coming from
>> vectorization or unroll or loop distribution.
>> Okay, but that kind of semantic mapping is important. How should we
>> encode/recover that information? To be clear, I'm not saying that we need
>> to implement that up front, but there needs to be a clear path to an
>> implementation, because I don't want to have two disjoint schemes.
>> You mean that you want to know which optimization created the clone? How
>> would you use that info? Looks to me this will expose compiler
>> implementation detail in debug info.
>> This is still doable, assume we have 15 interesting optimizations to
>> track, we can use 4 bits to encode the optimization type that created the
>> clone. But this becomes nasty if the a clone is created by more than one
>> optimizations. In that way, discriminator may not be fit for this purpose.
>> My understanding was that the encoding scheme would allow the profiling
>> analysis to correctly map execution data back to the original source
>> construct, while preserving the property that each distinct basic block
>> would have its own discriminator value.  That is, the execution data would
>> be attributed back to the original source construct, not whatever each
>> individual optimization had done to it, and the data for the original
>> source construct would correctly reflect the execution (e.g. profiling says
>> you got 82 hits on the original loop, rather than reporting 20 hits on the
>> unrolled-by-4 loop plus 1 each on 2 of the trailing copies).
>> It sounds like Hal is thinking that the per-discriminator execution info
>> would be preserved down to the point where an individual optimization could
>> look at the profile for each piece, and make decisions on that basis.
>> I'm not clear how that would be possible, as the optimization would have
>> to first do the transform (or predict how it would do the transform) in
>> order to see which individual-discriminator counts mapped to which actual
>> blocks, and then make some kind of decision about whether to do the
>> transform differently based on that information.  Then, if the optimization
>> did choose to do the transform differently, then that leaves the IR in a
>> state where the individual discriminators *cannot* map back to it.  (Say
>> you unroll by 2 instead of 4; then you have only 1 trailing copy, not 3,
>> and a discriminator that maps to the second trailing copy now maps to
>> nothing.  The individual-discriminator data becomes useless.)
>> Am I expressing this well enough to show that what Hal is looking for is
>> not feasible?
>> Yes, it will need to predict how the transformation would affect the
>> blocks produced. That does not seem problematic (at least at a coarse
>> level). Yes, if transformations made earlier in the pipeline make different
>> decisions, then that will invalidate later fine-grained data (at least
>> potentially). I don't see how any of this makes this infeasible. We just
>> need a way for the profiling counts, per descriminator, to remain
>> available, and for the transformations themselves to know which
>> discriminators (loop ids, or whatever) to consider.
>>  -Hal
>> --paulr
>> --
>> Hal Finkel
>> Lead, Compiler Technology and Programming Languages
>> Leadership Computing Facility
>> Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161101/d43663aa/attachment-0001.html>

More information about the llvm-dev mailing list