[PATCH] D122829: [AArch64] Optimize SDIV with pow2 constant divisor

Fri Apr 1 10:42:48 PDT 2022

bcl5980 added a comment.

In D122829#3422761 <https://reviews.llvm.org/D122829#3422761>, @efriedma wrote:

> In D122829#3422022 <https://reviews.llvm.org/D122829#3422022>, @bcl5980 wrote:
>
>> And one other point is:
>> Case @dont_fold_srem_i16_smax saves 6 instructions with 3 extra add+shift
>> Case @dont_fold_srem_power_of_two saves 9 instructions with 1 extra add+shift
>> So maybe we can use the general path for vector case at least?
>
> Where is the savings actually coming from here?  I don't think it's related to it being a vector; we're just unrolling it into scalar ops.

It looks like both cases use some extra fmov instructions with the v1 register in the original version. They also share some SRA with the non-power-of-2 srem case.

> True; are there specific opportunities that matter?

It's hard to say.  There is a lot of combine code in DAGCombiner::visitADD and DAGCombiner::visitSRA. Also, the SRA can be shared in the vector case.
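
For context, here is a rough sketch of the usual add+shift expansion for signed division and remainder by a power of two, which is the kind of ADD/SRA sequence those combines would see. It is only an illustration, not the code from this patch: the helper names sdiv_pow2 and srem_pow2 are hypothetical, and it assumes 32-bit values, 1 <= k <= 31, and an arithmetic right shift on signed integers (guaranteed since C++20, common in practice before that).

```cpp
#include <cstdint>

// Hypothetical helper: x / 2^k, rounding toward zero, for 1 <= k <= 31.
int32_t sdiv_pow2(int32_t x, unsigned k) {
    // Arithmetic shift: all ones if x is negative, zero otherwise.
    uint32_t sign = static_cast<uint32_t>(x >> 31);
    // Logical shift: bias is 2^k - 1 for negative x, 0 otherwise.
    uint32_t bias = sign >> (32 - k);
    // Add in unsigned arithmetic to avoid signed-overflow UB.
    int32_t biased = static_cast<int32_t>(static_cast<uint32_t>(x) + bias);
    // Final arithmetic shift (the SRA) performs the division.
    return biased >> k;
}

// Hypothetical helper: x % 2^k, computed as x - (x / 2^k) * 2^k.
int32_t srem_pow2(int32_t x, unsigned k) {
    uint32_t q = static_cast<uint32_t>(sdiv_pow2(x, k));
    return static_cast<int32_t>(static_cast<uint32_t>(x) - (q << k));
}
```

The bias and the final shift are the "extra add+shift" pair per element; in this sketch the remainder reuses the quotient's shift.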

> That looks like an improvement, sure.

I will fix the issue this way later.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122829/new/

https://reviews.llvm.org/D122829