[PATCH] D62818: [InstCombine] Introduce fold for icmp pred (and X, (sh signbit, Y)), 0.

Fri Jun 28 10:58:52 PDT 2019

spatel added inline comments.

================
Comment at: llvm/test/Transforms/InstCombine/onehot_merge.ll:18-23
+;CHECK:       [[TMP:%.*]] = shl i32 1, %c1
+;CHECK-NEXT:  [[TMP1:%.*]] = and i32 [[TMP]], %k
+;CHECK-NEXT:  icmp eq i32 [[TMP1]], 0
+;CHECK-NEXT:  [[TMP2:%.*]] = shl i32 %k, %c2
+;CHECK-NEXT:  icmp sgt i32 [[TMP2]], -1
+;CHECK-NEXT:  or i1
----------------
lebedev.ri wrote:
> huihuiz wrote:
> > spatel wrote:
> > > lebedev.ri wrote:
> > > > lebedev.ri wrote:
> > > > > lebedev.ri wrote:
> > > > > > Can you please regenerate the original test?
> > > > > I'm not sure what's on the LHS of the diff, but ignoring the instruction count this looks like improvement to me.
> > > > AH, you also want to str-replace `%tmp` with `%t`, it confuses the update script likely.
> > > rL364546
> > Actually we missed the fold for 
> > 
> > ```
> > ((k & ( 1 l<< C1 )) == 0) || ((k & ( signbit l>> C2 )) == 0)
> > -->
> > ((k & (( 1 l<< C1 ) || ( signbit l>> C2 ))) != 0)
> > ```
> Thanks for the analysis!
> @spatel does this fall into the nowadays reasoning that we shouldn't be doing too much folds into bitmath
> here in instcombine? I'm almost tempted to say that this isn't a regression,
> but the original fold that now no longer happens should be removed instead.
> 
If we say that the longer IR sequence is more canonical, then we'd want to add a transform to create that longer sequence starting from the shorter sequence. Are we willing to do that to improve analysis in IR?

As a practical matter, we probably also want to look at asm output for the alternatives on a few targets to see how much backend logic is required to do/undo this.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62818/new/

https://reviews.llvm.org/D62818