[llvm] [RISCV] Move strength reduction of mul X, 3/5/9*2^N to combine (PR #89966)

Yingwei Zheng via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 25 14:30:55 PDT 2024


dtcxzyw wrote:

> > This patch breaks the bext pattern:
> > ```
> > ; bin/llc -O3 -mtriple=riscv64 -mattr=+m,+zbs,+zba test.ll -o -
> > define ptr @test(ptr %0, i32 %1, i32 %2) {
> > entry:
> >   %3 = lshr i32 %1, %2
> >   %4 = and i32 %3, 1
> >   %5 = zext nneg i32 %4 to i64
> >   %6 = getelementptr inbounds [3 x float], ptr %0, i64 %5
> >   ret ptr %6
> > }
> > ```
> >
> > Before:
> > ```
> > test:
> >         bext    a1, a1, a2
> >         sh1add  a1, a1, a1
> >         sh2add  a0, a1, a0
> >         ret
> > ```
> >
> > After:
> > ```
> > test:
> >         srl     a1, a1, a2
> >         andi    a1, a1, 1
> >         sh1add  a1, a1, a1
> >         sh2add  a0, a1, a0
> >         ret
> > ```
> >
> > See [dtcxzyw/llvm-codegen-benchmark#25 (comment)](https://github.com/dtcxzyw/llvm-codegen-benchmark/pull/25#discussion_r1578372935).
> 
> ```
> Optimized legalized selection DAG: %bb.0 'test:entry'
> SelectionDAG has 20 nodes:
>   t0: ch,glue = EntryToken
>         t4: i64,ch = CopyFromReg t0, Register:i64 %1
>           t6: i64,ch = CopyFromReg t0, Register:i64 %2
>         t31: i64 = and t6, Constant:i64<4294967295>
>       t21: i64 = srl t4, t31
>     t36: i64 = freeze t21
>   t29: i64 = and t36, Constant:i64<1>
>       t2: i64,ch = CopyFromReg t0, Register:i64 %0
>         t33: i64 = RISCVISD::SHL_ADD t29, Constant:i64<1>, t29
>       t35: i64 = shl t33, Constant:i64<2>
>     t16: i64 = add t2, t35
>   t18: ch,glue = CopyToReg t0, Register:i64 $x10, t16
>   t19: ch = RISCVISD::RET_GLUE t18, Register:i64 $x10, t18:1
> ```
> 
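> DAGCombiner pushes `freeze` through the `and` node, which leaves it between `srl` and `and` (t21 -> t36 -> t29 above) and breaks the `srl + and` pair that `bext` is matched from :(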

It has been fixed :)
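
For reference, here is a small sketch that emulates the two sequences from the bext report above and checks that they compute the same address, so the regression is only the extra instruction. The helper names (`bext`, `srl`, `andi`, `shadd`, `before`, `after`) are made up for illustration; the instruction semantics are the Zbs/Zba ones on RV64 as far as I understand them.

```python
MASK64 = (1 << 64) - 1  # RV64 registers are 64 bits wide


def bext(rs1, rs2):
    """Zbs bext: extract the single bit of rs1 selected by rs2[5:0]."""
    return ((rs1 & MASK64) >> (rs2 & 63)) & 1


def srl(rs1, rs2):
    """Logical right shift by rs2[5:0]."""
    return (rs1 & MASK64) >> (rs2 & 63)


def andi(rs1, imm):
    """Bitwise and with an immediate."""
    return rs1 & imm & MASK64


def shadd(n, rs1, rs2):
    """Zba shNadd: rd = rs2 + (rs1 << n)."""
    return (rs2 + (rs1 << n)) & MASK64


def before(a0, a1, a2):
    a1 = bext(a1, a2)        # bit a2 of a1
    a1 = shadd(1, a1, a1)    # a1 * 3
    return shadd(2, a1, a0)  # a0 + a1 * 4, i.e. a0 + bit * 12


def after(a0, a1, a2):
    a1 = srl(a1, a2)         # srl + andi together do what bext did
    a1 = andi(a1, 1)
    a1 = shadd(1, a1, a1)
    return shadd(2, a1, a0)
```

The `sh1add` + `sh2add` pair is the `mul X, 3*2^N` decomposition for the 12-byte `[3 x float]` element: `bit * 3`, then `* 4` folded into the pointer add.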

Another regression (extracted from qemu):
```
; bin/llc -mtriple=riscv64 test.ll -mattr=+m,+zba -o -
define ptr @test(ptr %0, i64 %1) {
entry:
  %2 = lshr exact i64 %1, 2
  %3 = and i64 %2, 4294967295
  %4 = getelementptr inbounds i8, ptr %0, i64 600
  %5 = getelementptr [80 x i8], ptr %4, i64 %3
  ret ptr %5
}
```
Before:
```
test:
        srli    a1, a1, 2
        slli.uw a1, a1, 4
        sh2add  a1, a1, a1
        add     a0, a0, a1
        addi    a0, a0, 600
        ret
```
After:
```
test:
        slli    a1, a1, 2
        srli    a1, a1, 4
        slli.uw a1, a1, 4
        sh2add  a1, a1, a1
        add     a0, a0, a1
        addi    a0, a0, 600
        ret
```

We may be able to fix this by handling `ISD::SHL_ADD` in `SimplifyDemandedBits` after https://github.com/llvm/llvm-project/pull/88791 lands :)
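
To illustrate why a demanded-bits argument applies here, the sketch below emulates both sequences from the qemu example. `slli.uw` only reads the low 32 bits of its source, and on those bits `srli 2` and the `slli 2` / `srli 4` pair agree, so the extra shift is redundant. The helper names are hypothetical, and this only checks the arithmetic; it says nothing about how the combine itself would be implemented.

```python
MASK64 = (1 << 64) - 1  # RV64 registers are 64 bits wide


def slli(rs1, sh):
    return (rs1 << sh) & MASK64


def srli(rs1, sh):
    return (rs1 & MASK64) >> sh


def slli_uw(rs1, sh):
    """Zba slli.uw: zero-extend the low 32 bits of rs1, then shift left."""
    return ((rs1 & 0xFFFFFFFF) << sh) & MASK64


def sh2add(rs1, rs2):
    """Zba sh2add: rd = rs2 + (rs1 << 2)."""
    return (rs2 + (rs1 << 2)) & MASK64


def tail(a0, a1):
    # Shared by both sequences: index * 16 * 5 = index * 80, plus offset 600.
    a1 = slli_uw(a1, 4)
    a1 = sh2add(a1, a1)
    a0 = (a0 + a1) & MASK64
    return (a0 + 600) & MASK64


def before(a0, a1):
    return tail(a0, srli(a1, 2))


def after(a0, a1):
    return tail(a0, srli(slli(a1, 2), 4))


def reference(a0, a1):
    # Direct reading of the IR: %0 + 600 + 80 * ((%1 >> 2) & 0xffffffff)
    return (a0 + 600 + 80 * ((a1 >> 2) & 0xFFFFFFFF)) & MASK64
```

The two intermediate values differ only in bits 60 and 61, which `slli.uw` discards anyway.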



https://github.com/llvm/llvm-project/pull/89966