[PATCH] D122829: [AArch64] Optimize SDIV with pow2 constant divisor
chenglin.bi via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 1 10:42:48 PDT 2022
bcl5980 added a comment.
In D122829#3422761 <https://reviews.llvm.org/D122829#3422761>, @efriedma wrote:
> In D122829#3422022 <https://reviews.llvm.org/D122829#3422022>, @bcl5980 wrote:
>
>> And one other point is:
>> Case @dont_fold_srem_i16_smax saves 6 instructions with 3 extra add+shift.
>> Case @dont_fold_srem_power_of_two saves 9 instructions with 1 extra add+shift.
>> So maybe we can use the general path for vector case at least?
>
> Where is the savings actually coming from here? I don't think it's related to it being a vector; we're just unrolling it into scalar ops.
It looks like, in the original version, these two cases use some extra fmov instructions with the v1 register. They also share some SRA nodes with the srem in the non-power-of-2 case.
> True; are there specific opportunities that matter?
It's hard to say. There is a lot of combine code in DAGCombiner::visitADD and DAGCombiner::visitSRA. Also, the SRA can be shared in the vector case.
> That looks like an improvement, sure.
I will fix the issue this way later.
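For readers following the thread, the add+shift sequence for signed division by a power-of-two constant can be sketched in C++ roughly as below. This is only an illustration of the general technique, not the code in this patch. The helper names sdiv_pow2 and srem_pow2 are hypothetical, and the sketch assumes 32-bit integers, 1 <= k <= 30, and an arithmetic right shift for negative values (guaranteed since C++20, implementation-defined before that).

```cpp
#include <cstdint>

// Hypothetical helper: computes x / (1 << k), rounding toward zero.
// Assumes 1 <= k <= 30 and arithmetic right shift on signed values.
int32_t sdiv_pow2(int32_t x, unsigned k) {
    // All ones if x is negative, zero otherwise (the first SRA).
    uint32_t sign = static_cast<uint32_t>(x >> 31);
    // Bias is (1 << k) - 1 for negative x, 0 otherwise (a logical shift).
    int32_t bias = static_cast<int32_t>(sign >> (32 - k));
    // The add followed by the final arithmetic shift.
    return (x + bias) >> k;
}

// Hypothetical helper: computes x % (1 << k) from the quotient above.
int32_t srem_pow2(int32_t x, unsigned k) {
    int32_t q = sdiv_pow2(x, k);
    return x - q * (int32_t{1} << k);
}
```

The bias makes the final shift round toward zero instead of toward negative infinity, which is the extra add+shift work counted in the comparison above.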
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D122829/new/
https://reviews.llvm.org/D122829