[PATCH] D25742: Remove debug location from common tail when tail-merging

Tue Oct 18 19:03:44 PDT 2016

rob.lougher added a comment.

In https://reviews.llvm.org/D25742#573702, @probinson wrote:

> In https://reviews.llvm.org/D25742#573687, @aprantl wrote:
>
> > In https://reviews.llvm.org/D25742#573401, @probinson wrote:
> >
> > > In https://reviews.llvm.org/D25742#573352, @aprantl wrote:
> > >
> > > > This approach seems generally fine, but I have one question:
> > > >
> > > > If the code were on a single line, and both locations share a common ancestor scope, it seems make sense to create a new location using the common ancestor scope and line and only remove the column information.
> > >
> > >
> > > That would collapse the if-then-else into (effectively) a single statement.  That probably works okay for a debugger but not profiling, which still wants to treat the then/else as distinct.  And, after the tail merging, the tails are no longer distinct.
> >
> >
> > I'm not sure I understand your point. How is having an orphaned add instruction preferable over having it associated with the collapsed if-then-else statement? Wouldn't I want that instruction to be counted towards the line?
>
>
> No, the profiler wants to assign sample counts to each block individually.  By giving each block the same source attribution, you assert that they have the same profiles.  That's unlikely to be true in practice.  Really what happens is that the sample counts would be assigned to the parent block, which doesn't help the profiler sort out what to do with the nested blocks.
>
> Admittedly, zapping the source attribution on the merged (parts of the) blocks doesn't let you attribute those counts to *any* block, but at least you aren't attributing counts incorrectly.

Yes, as far as PGO is concerned zapping the source attribution is much much preferable.   The sample loader uses the maximum instruction weight within the basic-block as the block weight so not all of the instructions needs to have been hit.  This means zapping the source attribution will have no affect on the weight of the correct block but will stop the other block being incorrectly counted.  In the testcase above imagine the function was called 100 times and both sides of the if-then-else was executed 50/50.  The if-block would get a weight of "50" from the call to //foo// but the else-block would get a weight of "100" from the //add// and so it would look like the else-part was executed twice as many times as the if-part - this may lead to incorrect decisions re-order blocks.  Zapping the debug info on the common-tail will lead to the correct weights.

> Robert may want to correct some of my assumptions here, but this is my understanding.

https://reviews.llvm.org/D25742