[PATCH] D147368: [LoongArch] Optimize bitwise and with immediates

Sun Apr 2 19:09:31 PDT 2023

xen0n accepted this revision.
xen0n added a comment.

LGTM from a cursory look, thanks! Although putting your informative replies into the code as comments would be even better, as others would see the rationale along with the code and not have to do archaeology themselves.

================
Comment at: llvm/test/CodeGen/LoongArch/ir-instruction/and.ll:348-352
 ; LA64-NEXT:    lu12i.w $a2, 15
 ; LA64-NEXT:    ori $a2, $a2, 4080
 ; LA64-NEXT:    and $a1, $a1, $a2
 ; LA64-NEXT:    and $a0, $a0, $a2
 ; LA64-NEXT:    mul.d $a0, $a0, $a1
----------------
benshi001 wrote:
> SixWeining wrote:
> > benshi001 wrote:
> > > SixWeining wrote:
> > > > It's also a win if optimized to `bstrpick + slli` since we can save a register `$a2`. Right?
> > > > ```
> > > > ; LA64-NEXT:    bstrpick.d $a0, $a0, 15, 4
> > > > ; LA64-NEXT:    slli.d $a0, $a0, 4
> > > > ; LA64-NEXT:    bstrpick.d $a1, $a1, 15, 4
> > > > ; LA64-NEXT:    slli.d $a1, $a1, 4
> > > > ; LA64-NEXT:    mul.d $a0, $a0, $a1
> > > > ```
> > > > 
> > > > But if the immediate `0xfff0` is used 3 times, we have 2 choices:
> > > > 1. Use fewer registers but one more instruction.
> > > > 2. Use fewer instructions but one more register.
> > > > 
> > > > I'm not sure how to balance this.
> > > > 
> > > > 
> > > I am also not sure about the case of more than 2 uses. Maybe we make a conservative choice and just leave it unchanged? And we only handle 1 and 2 uses?
> > > 
> > >  
> > I agree with you.
> I have updated my code
> 
> 1. 1 and 2 uses are handled, more uses are rejected;
> 2. corresponding tests are added for the above logic.
Fine with me. If we were to aim for the maximum possible performance, that would have to be backed by actual micro-architecture details, so the optimizer can do the right thing for the right models. Executing fewer instructions usually can't hurt, after all. (In my Go benchmarking experience, //loop alignment// can have a far more profound influence on the numbers than micro-optimizations like this, so it's probably fine not to care as much here.)
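
For readers following the thread, here is a rough Python model of the trade-off discussed above. It is only a sketch, not the code in the patch: the helper names (`bstrpick_d`, `slli_d`, `shifted_mask_bounds`, `and_via_bstrpick`, `instruction_counts`) are made up for illustration, and the instruction counts assume the mask needs `lu12i.w` + `ori` to materialize, as `0xfff0` does in the test above.

```python
MASK64 = (1 << 64) - 1


def bstrpick_d(value, msb, lsb):
    """Model of bstrpick.d: extract bits [msb:lsb], zero-extended."""
    width = msb - lsb + 1
    return (value >> lsb) & ((1 << width) - 1)


def slli_d(value, shamt):
    """Model of slli.d: logical left shift, truncated to 64 bits."""
    return (value << shamt) & MASK64


def shifted_mask_bounds(imm):
    """Return (msb, lsb) if imm is a single contiguous run of ones, else None."""
    if imm <= 0:
        return None
    lsb = (imm & -imm).bit_length() - 1  # index of the lowest set bit
    run = imm >> lsb                     # the run of ones, moved down to bit 0
    if run & (run + 1):                  # non-zero if the ones are not contiguous
        return None
    return lsb + run.bit_length() - 1, lsb


def and_via_bstrpick(value, imm):
    """Compute value & imm as bstrpick.d followed by slli.d (hypothetical helper)."""
    bounds = shifted_mask_bounds(imm)
    if bounds is None:
        raise ValueError("imm is not a contiguous run of ones")
    msb, lsb = bounds
    return slli_d(bstrpick_d(value, msb, lsb), lsb)


def instruction_counts(uses):
    """Return (materialize + and, bstrpick + slli) instruction counts.

    Assumes the mask costs two instructions (lu12i.w + ori) to materialize.
    """
    return 2 + uses, 2 * uses
```

Under these assumptions `shifted_mask_bounds(0xfff0)` gives `(15, 4)`, matching the `bstrpick.d ..., 15, 4` operands suggested above, and `instruction_counts` shows the two sequences tie at 2 uses while the `bstrpick` form costs one more instruction at 3 uses, which is where the patch stops applying the transform.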

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147368/new/

https://reviews.llvm.org/D147368