[PATCH] D57718: [PPC] Adjust the computed branch offset for the possible shorter distance

Fri Feb 15 15:50:03 PST 2019

jsji added a reviewer: hfinkel.
jsji added a comment.

Thanks for updating!

================
Comment at: lib/Target/PowerPC/PPCBranchSelector.cpp:214
+          //    ...
+          //   bne Far      100            10c
+          //   .p2align 4
----------------
Carrot wrote:
> jsji wrote:
> > jsji wrote:
> > > Carrot wrote:
> > > > jsji wrote:
> > > > > Looks like to me that the root cause here is that 
> > > > > 1. we **may have conservative (larger) MBBStartOffset** due to inline asm or alignment,  (eg: `100` -> `10c`)
> > > > > 2.  we are **subtracting MBBStartOffset** here  (eg: `0x8108 - 0x10c`)
> > > > > 
> > > > > So instead of getting larger `BranchSize`, subtracting may lead us to getting smaller `BranchSize`.
> > > > > 
> > > > > For other blocks, even if we may still have conservative (larger) size, it should be OK, as we are adding them towards `BranchSize`.
> > > > > 
> > > > > 
> > > > > So, shouldn't we just need to adjust the value of `MBBStartOffset` , checking inline asm & alignment when calculating it in previous loop?
> > > > The root cause in your understanding is correct!
> > > > 
> > > > The inline asm size is handled by 
> > > >   MBBStartOffset += TII->getInstSizeInBytes(*I);
> > > > The BB alignment is already handled when computing BlockSizes.
> > > > 
> > > > Here the problem is the computed MBBStartOffset can't be precise, and we can't tell what's the difference between its value and the actual address. For the BranchSize, if it is larger than actual offset, it is safe for us. If it is smaller than actual branch offset, the max delta is 
> > > >       (1 << MaxAlign) - 4
> > > > 
> > > Thanks. 
> > > Can we tell whether `MBBStartOffset` is precise or not by checking whether we meet inline asm & alignment?
> > > And only apply the delta when we are sure they are not precise?
> > > 
> > > Also is it possible that `MBBStartOffset` is smaller than ` (1 << MaxAlign) -4 `?
> > > 
> > > And can we add some comments about why the max delta is 
> > > ` (1 << MaxAlign) -4 ` ?
> > > 
> > > 
> > Actually, after seeing your new comments and the testcase. I think my initial understanding was wrong.
> > The root cause of the estimate gap is not due to that we are `subtracting MBBStartOffset `.
> > 
> > The root cause looks like to me now:
> > 
> > 1.  We may have conservative (larger) `BlockSizes` due to inline asm .
> >       eg: In your example, we add 4 extra bytes due to label in inline asm
> > 2.  We may get **smaller than actual alignment adjustment** due to that inaccurate `Offset`. 
> >       eg: In your example,  we get offset 0, as we though the offset will be `32`, however, it is actually `28`, 
> >             so , we should get an adjustment of `12` here! So that is why we are actually `-12` to real value here!
> > 
> > So looks like to me,  after an inaccurate inline asm BB, all the alignment adjustment of a BB **may get a smaller than actual value (negative!).** , and since we are adding all these toward `BranchSize`, I think this problem will  **not only happen in forward branch, but also in backwards branch.**:
> > 
> > So maybe we should fix this in `GetAlignmentAdjustment` to let is be conservative when a BB contains inline asm?
> > 
> > So looks like to me, after an inaccurate inline asm BB, all the alignment adjustment of a BB may get a smaller than actual value (negative!). , and since we are adding all these toward BranchSize, I think this problem will not only happen in forward branch, but also in backwards branch.:
> 
> Not all estimated alignment adjustment is smaller than actual alignment adjustment. Suppose the estimated bne address is 100, the actual address 9c, the accumulated function size to the end of this block is 104 for estimated case and 100 for actual value. So the following 16 bytes alignment will cause 12 padding bytes for estimated case and no padding for actual code. But this situation is not interesting for us here.
> The shorter estimated branch offset can indeed occur in backward branch, so I modified the patch.
> 
> > So maybe we should fix this in GetAlignmentAdjustment to let is be conservative when a BB contains inline asm?
> I don't know how to fix it in GetAlignmentAdjustment. For whatever extra alignment adjustment, you will add the value to branch instruction address and target address in following blocks, but you can't change the alignment padding, and you can't change the estimated branch offset.
> Not all estimated alignment adjustment is smaller than actual alignment adjustment. 

Yes, agree, not all adjustment will be smaller, that is why I use  *may* in my last comment. :)
>**may** get a smaller than actual value(negative!). 

>I don't know how to fix it in GetAlignmentAdjustment. For whatever extra alignment adjustment, you will add the value to branch instruction address and target address in following blocks, but you can't change the alignment padding, and you can't change the estimated branch offset.

My first thought was that we return adjustment of `(1 << Align) -4` when we know that the `Offset` is imprecise, alwasy add largest possible padding instead calculating values from `Offset` ? 

But yes looks like this is too conservative.

================
Comment at: test/CodeGen/PowerPC/branch_selector.ll:1
+; RUN: llc -mtriple=powerpc64-unknown-linux-gnu -mcpu=pwr8 -verify-machineinstrs < %s | FileCheck %s
+
----------------
Carrot wrote:
> jsji wrote:
> > Why we require -mcpu=pwr8 here? This should apply to all sub-target?
> Without -mcpu=pwr8 llvm doesn't generate aligned loop body. I learned it from test/CodeGen/PowerPC/code-align.ll.
OK. Thanks, so it applys to almost all targets, except the generic ones like ppc64.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57718/new/

https://reviews.llvm.org/D57718