[all-commits] [llvm/llvm-project] a5d8b7: [AArch64] Improve urem by constant costs (#122236)

David Green via All-commits all-commits at lists.llvm.org
Wed Feb 26 05:50:10 PST 2025


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: a5d8b7aeb6b360f20eec88715081ecfdb286b83d
      https://github.com/llvm/llvm-project/commit/a5d8b7aeb6b360f20eec88715081ecfdb286b83d
  Author: David Green <david.green at arm.com>
  Date:   2025-02-26 (Wed, 26 Feb 2025)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
    M llvm/test/Analysis/CostModel/AArch64/div.ll
    M llvm/test/Analysis/CostModel/AArch64/div_cte.ll
    M llvm/test/Analysis/CostModel/AArch64/fshl.ll
    M llvm/test/Analysis/CostModel/AArch64/fshr.ll
    M llvm/test/Analysis/CostModel/AArch64/rem.ll
    M llvm/test/Analysis/CostModel/AArch64/sve-div.ll
    M llvm/test/Analysis/CostModel/AArch64/sve-rem.ll

  Log Message:
  -----------
  [AArch64] Improve urem by constant costs (#122236)

A urem by a constant, much like a udiv by a constant, can be expanded
into a series of mul/add/shift instructions. The exact sequence of
instructions depends on the constants and the types.

If the constant is a power of 2 then a single shift (for udiv) or and
(for urem) will be used, so the cost will be 1. This canonicalization
happens relatively early, so it likely has very little effect in
practice (it does help the cost of funnel shifts).
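
As a rough illustration of the power-of-2 case, the sketch below shows the
single-instruction forms in plain C++. The helper names (`udivPow2`,
`uremPow2`, `checkPow2`) are made up for this example and are not part of
the patch; `log2d` is assumed to be less than 32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, for illustration only. log2d must be < 32.
// udiv by 2^log2d is one logical shift right.
constexpr uint32_t udivPow2(uint32_t n, unsigned log2d) { return n >> log2d; }

// urem by 2^log2d is one bitwise and with (2^log2d - 1).
constexpr uint32_t uremPow2(uint32_t n, unsigned log2d) {
  return n & ((uint32_t{1} << log2d) - 1);
}

// Compile-time spot checks against the ordinary operators.
static_assert(udivPow2(1000, 4) == 1000u / 16u, "shift matches udiv");
static_assert(uremPow2(1000, 4) == 1000u % 16u, "and matches urem");

// Runtime check over every power-of-2 divisor for a few sample inputs.
inline void checkPow2() {
  const uint32_t samples[] = {0u, 1u, 1000u, 0x80000000u, 0xFFFFFFFFu};
  for (uint32_t n : samples) {
    for (unsigned k = 0; k < 32; ++k) {
      const uint32_t d = uint32_t{1} << k;
      assert(udivPow2(n, k) == n / d);
      assert(uremPow2(n, k) == n % d);
    }
  }
}
```

Each helper is a single ALU operation, which is where the cost of 1 comes
from.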

For a non-power-of-2 constant the code for div will expand to a series
of UMULH + Add + Shift + Add, depending on the constant. urem is
generally udiv + mul + sub, so involves a few extra instructions. UMULH
is not always available: i32 will use umull+shift, and vector types will
use umull+shift or umull+umull2+uzp depending on the vector size. v2i64
will be scalarized because there is no mul available. SVE does have a
UMULH instruction.
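
The expansion described above can be sketched in C++ for i32. This is only
an illustration of the general multiply-by-magic-constant technique, not
the code the backend emits: the function names are hypothetical, and the
two divisors (10 and 7) are picked because one needs no fixup while the
other needs the extra add/shift steps.

```cpp
#include <cassert>
#include <cstdint>

// High 32 bits of a 32x32 -> 64-bit product. This stands in for the
// umull+shift form used for i32, where UMULH is not available.
constexpr uint32_t mulhi32(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 32);
}

// n / 10: the magic constant ceil(2^35 / 10) fits in 32 bits, so a
// multiply-high and one shift are enough.
constexpr uint32_t udiv10(uint32_t n) { return mulhi32(n, 0xCCCCCCCDu) >> 3; }

// n / 7: the ideal magic constant needs 33 bits, so the truncated constant
// is used and the result is corrected with a sub/shift/add before the
// final shift.
constexpr uint32_t udiv7(uint32_t n) {
  const uint32_t q = mulhi32(n, 0x24924925u);
  const uint32_t t = ((n - q) >> 1) + q;
  return t >> 2;
}

// urem = n - (n / d) * d, i.e. the udiv sequence plus a mul and a sub.
constexpr uint32_t urem10(uint32_t n) { return n - udiv10(n) * 10u; }
constexpr uint32_t urem7(uint32_t n) { return n - udiv7(n) * 7u; }

// Compare against the plain operators on a strided sweep of the i32 range.
inline void checkAgainstPlainOperators() {
  for (uint64_t i = 0; i <= 0xFFFFFFFFull; i += 65521) {
    const uint32_t n = static_cast<uint32_t>(i);
    assert(udiv10(n) == n / 10u && urem10(n) == n % 10u);
    assert(udiv7(n) == n / 7u && urem7(n) == n % 7u);
  }
  assert(udiv7(0xFFFFFFFFu) == 0xFFFFFFFFu / 7u);
  assert(udiv10(0xFFFFFFFFu) == 0xFFFFFFFFu / 10u);
}
```

The divide-by-7 path is visibly longer than the divide-by-10 path, which is
why the instruction count, and so the cost, depends on the constant.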

The end result is that the costs should be closer to reality, with
scalable types costing a little less than the fixed-width versions. (In
the future we might be able to use umulh for fixed-width vectors when
the SVE instruction is available, but for the moment this should favour
scalable vectorization a little.)

I've tried to make this patch apply only to constant UREM/UDIV
instructions. SDIV and SREM are left until a later patch to prevent this
one becoming too complex. The funnel shift costs change because the cost
model assumes a urem is needed to clamp the shift amount; for most
common types the divisor is a power of 2.
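
To show where the urem in a funnel shift comes from, here is a minimal C++
model of a 32-bit funnel shift left, following the LLVM LangRef description
of `llvm.fshl` (the shift amount is taken modulo the bit width). The
function names are hypothetical and the code is a reference model, not the
lowering used by the backend.

```cpp
#include <cassert>
#include <cstdint>

// Reference model of a 32-bit funnel shift left: concatenate hi:lo, shift
// left by (amt mod 32), and keep the top 32 bits of the 64-bit value.
constexpr uint32_t fshl32(uint32_t hi, uint32_t lo, uint32_t amt) {
  const uint32_t s = amt % 32u; // urem by a power of 2, same as amt & 31
  const uint64_t concat = (static_cast<uint64_t>(hi) << 32) | lo;
  return static_cast<uint32_t>((concat << s) >> 32);
}

static_assert(fshl32(0x12345678u, 0x9ABCDEF0u, 8) == 0x3456789Au,
              "shift by 8 pulls the top byte of lo into hi");
static_assert(fshl32(0x12345678u, 0x9ABCDEF0u, 32) == 0x12345678u,
              "amount wraps modulo the bit width");

// The clamp is identical whether written as urem or as a mask.
inline void checkClamp() {
  const uint32_t amounts[] = {0u, 1u, 31u, 32u, 33u, 40u, 0xFFFFFFFFu};
  for (uint32_t amt : amounts)
    assert((amt % 32u) == (amt & 31u));
}
```

Because the bit width is a power of 2 for the common integer types, the
clamp is a single and, which is the power-of-2 case described earlier in the
message.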


