[PATCH] D100769: [RISCV] Optimize addition with immediate

Mon Apr 19 23:21:44 PDT 2021

benshi001 added inline comments.

================
Comment at: llvm/lib/Target/RISCV/RISCVInstrInfo.td:1298
+let Predicates = [IsRV64] in
+def : Pat<(sext_inreg (add GPR:$rs1, (AddiPair GPR:$rs2)), i32),
+          (ADDIW (ADDIW GPR:$rs1, (AddiPairImmB GPR:$rs2)),
----------------
craig.topper wrote:
> benshi001 wrote:
> > benshi001 wrote:
> > > craig.topper wrote:
> > > > craig.topper wrote:
> > > > > Do we need to make sure sext_inreg is the only user of the add? Otherwise we’ll emit two ADDIWs and two ADDIs.
> > > > I think if we emitted ADDI followed by ADDIW for sign_ext case, the first ADDI would CSE with the first ADDI from a non-sext case if there were multiple uses. Then we wouldn't need to check for multiple uses.
> > > I do not quite understand your concern. Why do you think "Otherwise we’ll emit two ADDIWs and two ADDIs"? That is impossible.
> > > 
> > > 1. for the IR "%a = add i32 %b, 3001", two ADDIs are emitted on both rv32 and rv64.
> > > 
> > > 2. for the IR "%a = add i64 %b, 3001", two ADDIs are emitted on rv64.
> > > 
> > > 3. for the IR "%a = add i64 %b, 3001", two ADDIs are emitted on rv32 for the lower 32-bit, (along with other instrs for the upper 32-bit).
> > > 
> > > 4. for the IR pattern "%a = add i32 %b, 3001; %c = sext_inreg %a, i32", two ADDIWs are emitted without any ADDI.
> > > 
> > > I did not find any other cases/IR patterns for this optimization.
> > In the current patch, there is no possibility that ADDI and ADDIW are emitted in a mixed sequence. I can remove the one-use check, but I am concerned that the immediate composed by a lui/addi pair may have further uses, in which case my transform leads to less efficient code.
> Example test
> 
> ```
> define signext i32 @add32_sext_accept(i32 signext %a, i32* %b) nounwind {
>   %1 = add i32 %a, 2999
>   store i32 %1, i32* %b
>   ret i32 %1
> }
> ```
> 
> Produces 
> ```
>         addi    a2, a0, 1500
>         addi    a2, a2, 1499
>         addiw   a0, a0, 1500
>         addiw   a0, a0, 1499
>         sw      a2, 0(a1)
>         ret
> ```
> 
> For that example we could use the addiw result for the sw, but that's a bit hard to fix at the moment.
> 
> Here's another example
> 
> ```
> define i64 @add32_sext_accept(i64 %a, i64* %b) nounwind {
>   %1 = add i64 %a, 2999
>   store i64 %1, i64* %b
>   %2 = shl i64 %1, 32
>   %3 = ashr i64 %2, 32
>   ret i64 %3
> }
> ```
> 
> produces
> ```
>         addi    a2, a0, 1500
>         addi    a2, a2, 1499
>         addiw   a0, a0, 1500
>         addiw   a0, a0, 1499
>         sd      a2, 0(a1)
>         ret
> ```
I see. Your concern is valid. I cannot currently figure out an easy way to cover all the special cases, so I will remove the ADDIW rule.
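
For reference, here is a minimal Python sketch of the trade-off discussed above. It is a model of the RV64 ADDI/ADDIW semantics, not code from the patch. The helper names (split_imm, two_chains, shared_first_addi) are hypothetical, and the splitting rule is only an assumption chosen to reproduce the 1500/1499 pair from the example output. It compares the four-instruction sequence from the tests with the ADDI-then-ADDIW form suggested earlier in the thread, where the first ADDI is shared.

```python
MASK64 = (1 << 64) - 1


def sext(value, bits):
    """Sign-extend the low `bits` bits of `value` to a Python int."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def addi(rs1, imm):
    """Model of RV64 ADDI: 64-bit add of a signed 12-bit immediate."""
    assert -2048 <= imm <= 2047, "immediate must fit in 12 signed bits"
    return (rs1 + imm) & MASK64


def addiw(rs1, imm):
    """Model of RV64 ADDIW: add, keep the low 32 bits, sign-extend to 64."""
    assert -2048 <= imm <= 2047, "immediate must fit in 12 signed bits"
    return sext(rs1 + imm, 32) & MASK64


def split_imm(imm):
    """Hypothetical split of an immediate into two 12-bit addends.

    Assumption: chosen so that 2999 splits into 1500 and 1499, matching
    the example output; the patch's actual rule may differ.
    """
    lo = imm // 2
    return imm - lo, lo


def two_chains(a, imm):
    """Separate ADDI and ADDIW chains, as in the example output."""
    hi, lo = split_imm(imm)
    stored = addi(addi(a, hi), lo)       # value used by the store
    returned = addiw(addiw(a, hi), lo)   # sign-extended return value
    return stored, returned, 4           # four add instructions


def shared_first_addi(a, imm):
    """ADDI followed by ADDIW, so the first ADDI serves both users."""
    hi, lo = split_imm(imm)
    t = addi(a, hi)                      # shared between both users
    stored = addi(t, lo)
    returned = addiw(t, lo)
    return stored, returned, 3           # three add instructions
```

Under this model both forms compute the same stored and returned values, because ADDIW only depends on the low 32 bits of its input; the shared form just needs one fewer instruction.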

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100769/new/

https://reviews.llvm.org/D100769