[PATCH] D123453: [InstCombine] Fold mul nuw+lshr exact to a single multiplication when the latter is a factor

Sat Apr 9 12:20:28 PDT 2022

lebedev.ri added a comment.

In D123453#3441197 <https://reviews.llvm.org/D123453#3441197>, @bcl5980 wrote:

> In D123453#3441148 <https://reviews.llvm.org/D123453#3441148>, @lebedev.ri wrote:
>
>> I see. Then the proof is wrong.
>
> I'm sorry I really don't know how to proof by alive2. Can you teach me how to proof it?

You need to either adjust the fold to do what the proof says, or write another proof for the change in question.

Abstracting a bit, is there a generalization of this pattern where the shift amounts don't match?

================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp:1177
+      // lshr exact (mul nuw x, (c << ShAmtC)), ShAmtC -> mul nuw x, c
+      if (Op0->hasOneUse() && I.isExact()) {
+        APInt NewMulC = MulC->lshr(ShAmtC);
----------------
bcl5980 wrote:
> bcl5980 wrote:
> > lebedev.ri wrote:
> > > bcl5980 wrote:
> > > > bcl5980 wrote:
> > > > > lebedev.ri wrote:
> > > > > > 1. No one-use check needed, only a single instruction is produced.
> > > > > > 2. I believe, while that `lshr` is obviously `exact`, your own proof shows that there is no need to check that it is marked as such?
> > > > > If there is no one-use, we may generate 2 mul instead of mul+shr
> > > > > 
> > > > > ```
> > > > > define i64 @lshr_mul_negative_oneuse(i64 %0) {
> > > > > ; CHECK-LABEL: @lshr_mul_negative_oneuse(
> > > > > ; CHECK-NEXT:    [[TMP2:%.*]] = mul nuw i64 [[TMP0:%.*]], 52
> > > > > ; CHECK-NEXT:    call void @use(i64 [[TMP2]])
> > > > > ; CHECK-NEXT:    [[TMP3:%.*]] = mul nuw i64 [[TMP0]], 13
> > > > > ; CHECK-NEXT:    ret i64 [[TMP3]]
> > > > > ;
> > > > >   %2 = mul nuw i64 %0, 52
> > > > >   call void @use(i64 %2)
> > > > >   %3 = lshr i64 %2, 2
> > > > >   ret i64 %3
> > > > > }
> > > > > ```
> > > > > 
> > > > > We need a condition to make sure ShAmtC is divisible by NewMulC. The proof use shl that is always sure. But in the code we still need to check eact flag.
> > > > > We need a condition to make sure ShAmtC is divisible by NewMulC. The proof use shl that is always sure. But in the code we still need to check eact flag.
> > > > Should be "make sure MulC is divisible by NewMulC"
> > > > If there is no one-use, we may generate 2 mul instead of mul+shr
> > > 
> > > We only create only a single `mul` here, do we not?
> > > 
> > > > We need a condition to make sure ShAmtC is divisible by NewMulC. The proof use shl that is always sure. But in the code we still need to check eact flag.
> > > 
> > > Can you show the counter-proof that shows that not checking for `exact` is incorrect?
> > > > If there is no one-use, we may generate 2 mul instead of mul+shr
> > > 
> > > We only create only a single `mul` here, do we not?
> > > 
> > Yeah, we create only a single mul, but most of time mul should be heavier than lshr, am I right?
> > > > We need a condition to make sure ShAmtC is divisible by NewMulC. The proof use shl that is always sure. But in the code we still need to check eact flag.
> > > 
> > > Can you show the counter-proof that shows that not checking for `exact` is incorrect?
> > I don't know how to counter-proof with alive2 but this is the negative case on my machine after remove exact:
> > 
> > ```
> > define i64 @lshr_mul_negative_noexact(i64 %0) {
> > ; CHECK-LABEL: @lshr_mul_negative_noexact(
> > ; CHECK-NEXT:    [[TMP2:%.*]] = mul nuw i64 [[TMP0:%.*]], 13
> > ; CHECK-NEXT:    ret i64 [[TMP2]]
> > ;
> >   %2 = mul nuw i64 %0, 53
> >   %3 = lshr i64 %2, 2
> >   ret i64 %3
> > }
> > 
> > ```
> > > If there is no one-use, we may generate 2 mul instead of mul+shr
> > 
> > We only create only a single `mul` here, do we not?
> > 
> 
> ```
> declare i64 @use(i64, i64)
> define i64 @lshr_mul_negative_oneuse(i64 %0) {
> ; CHECK-LABEL: @lshr_mul_negative_oneuse(
> ; CHECK-NEXT:    [[TMP2:%.*]] = mul nuw i64 [[TMP0:%.*]], 52
> ; CHECK-NEXT:    [[TMP3:%.*]] = mul nuw i64 [[TMP0]], 13
> ; CHECK-NEXT:    [[TMP4:%.*]] = call i64 @use(i64 [[TMP2]], i64 [[TMP3]])
> ; CHECK-NEXT:    ret i64 [[TMP4]]
> ;
>   %2 = mul nuw i64 %0, 52
>   %3 = lshr i64 %2, 2
>   %4 = call i64 @use(i64 %2, i64 %3)
>   ret i64 %4
> }
> ```
> 
> This case should more clear why we need one use I think.
> 
The number of instructions didn't increase, so we're fine.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123453/new/

https://reviews.llvm.org/D123453