[PATCH] D50591: [PGO] Control Height Reduction

Hiroshi Yamauchi via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Aug 17 16:17:48 PDT 2018


yamauchi added inline comments.


================
Comment at: test/Transforms/PGOProfile/chr.ll:14
+; CHECK-NEXT:    [[TMP2:%.*]] = icmp eq i32 [[TMP1]], 3
+; CHECK-NEXT:    br i1 [[TMP2]], label [[BB0:%.*]], label [[ENTRY_SPLIT_NONCHR:%.*]], !prof !15
+; CHECK:       bb0:
----------------
davidxl wrote:
> yamauchi wrote:
> > davidxl wrote:
> > > Can you add check of the meta data itself (for testing of profile meta data merging)?
> > There's already a check for !15 (at the end of this line). I added a few more.
> that is not what I meant.  I think what is missing is the prof meta data's content. For instance
> 
> CHECK-NEXT: br i1 ....... !prof ![[COMBINED_PROF:%.*]]
> 
> 
> CHECK:![[COMBINED_PROF] = ...
Had understood that. I should have said "at the end of the *file*". There are checks on the prof metadata content there.


================
Comment at: test/Transforms/PGOProfile/chr.ll:268
+; }
+; t2 = *i
+; if ((t2 & 12) != 0) { // Likely true
----------------
davidxl wrote:
> If this can be hoisted above, will the optimization combine (t1 & 3) | (t2 &12) ?
Currently no if t1 and t2 are separate loads., and yes if t1/t2 are merged into one load.


================
Comment at: test/Transforms/PGOProfile/chr.ll:1084
+; two paths and may cause a bad hoisting that hoists it twice and puts a use
+; ahead of its definition.
+; Roughly,
----------------
davidxl wrote:
> Not sure if I understand the comments about bad hoisting here. You want to test the bad hoisting does not happen? What does the bad hoisting look like?
This is a regression test against a specific bug that existed regarding moving %conv. I think the bad hoisting looked like:

entry:
  ...
  %div = fdiv double 1.000000e+00, %conv
  %conv = sitofp i32 %0 to double
  %mul16 = fmul double %div, %conv

It moved %conv to the end of %entry once for %div (from %bb1) and once more for %mul16 (within %entry).

Updated the comment.


================
Comment at: test/Transforms/PGOProfile/chr.ll:1242
+; i0 = *i
+; v2 = (z != 1) ? pred : true  // Likely false
+; if (z == 0 && pred)  // Likely false
----------------
davidxl wrote:
> likely true?
This should be false because the branch weight is 0:1.


================
Comment at: test/Transforms/PGOProfile/chr.ll:1243
+; v2 = (z != 1) ? pred : true  // Likely false
+; if (z == 0 && pred)  // Likely false
+;   foo()
----------------
davidxl wrote:
> likely true ?
Same.


================
Comment at: test/Transforms/PGOProfile/chr.ll:1254
+; i0 = *i
+; if (z != 1 && (z == 0 && pred)) // First CHR
+;   foo()
----------------
davidxl wrote:
> is z != 1 redundant?
yes, I think instcombine doesn't seem to fold due to limited reassociation. This doesn't matter for the point of the test, I think.


================
Comment at: test/Transforms/PGOProfile/chr.ll:1256
+;   foo()
+; // A trivial phi for i0 is inserted here by the first CHR (which gets removed
+; // later) and the subsequent branch condition (for the second CHR) uses it.
----------------
davidxl wrote:
> why was it inserted?
As part of the CHR transformation, a trivial phi is inserted at the exit of a region when CHR is applied so that we can easily know what variables/values are live at the exit and need to have the incoming values corresponding to the cloned cold path added to those phis. IOW, when we split/clone a path/region into two, we need a new phi for anything that's alive at the exit. To make this easier, we first insert trivial phis at the exit for anything alive there and later add operands corresponding to the second path to those phis. This is also a regression test for an old bug.


================
Comment at: test/Transforms/PGOProfile/chr.ll:1439
+; return v42 + v40
+; ->
+; t0 = *i
----------------
davidxl wrote:
> should it be:
> 
> ; t0 = *i
> ; v40 = t0 + 44
> ; if ((t0 & 3) == 3) // Likely true
> ;   foo()
> ;   v41 = t0 + 99
> ;   foo()
> ; } else {
> ;   if ((t0 & 1) != 0) // Likely true
> ;     foo()
> ;   if ((t0 & 2) != 0) // Likely true
> ;     v41_nc = t0 + 99
> ;     foo()
> ;   }
> ; }
> ; v42 = phi (v41, v41_nc, v40)
> ; return v42 + v40
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
That's valid optimization but, I don't think CHR/instcombine don't specifically try hoisting v40.



Repository:
  rL LLVM

https://reviews.llvm.org/D50591





More information about the llvm-commits mailing list