[all-commits] [llvm/llvm-project] dde2a7: [RISCV] Exploit fact that vscale is always power o...

Wed Jul 13 10:55:25 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: dde2a7fb6da46da2b2f765fa569d8fddb4270eb6
      https://github.com/llvm/llvm-project/commit/dde2a7fb6da46da2b2f765fa569d8fddb4270eb6
  Author: Philip Reames <preames at rivosinc.com>
  Date:   2022-07-13 (Wed, 13 Jul 2022)

  Changed paths:
    M llvm/include/llvm/CodeGen/TargetLowering.h
    M llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
    M llvm/lib/Target/RISCV/RISCVISelLowering.cpp
    M llvm/lib/Target/RISCV/RISCVISelLowering.h
    M llvm/test/CodeGen/RISCV/rvv/sink-splat-operands.ll
    M llvm/test/CodeGen/RISCV/rvv/vscale-power-of-two.ll

  Log Message:
  -----------
  [RISCV] Exploit fact that vscale is always power of two to replace urem sequence

When doing scalable vectorization, the loop vectorizer uses a urem in the computation of the vector trip count. The RHS of that urem is a (possibly shifted) call to @llvm.vscale.

vscale is effectively the number of "blocks" in the vector register. (That is, types such as <vscale x 8 x i8> and <vscale x 1 x i8> both fill one 64 bit block, and vscale is essentially how many of those blocks there are in a single vector register at runtime.)

We know from the RISCV V extension specification that VLEN must be a power of two between ELEN and 2^16. Since our block size is 64 bits, the must be a power of two numbers of blocks. (For everything other than VLEN<=32, but that's already broken.)

It is worth noting that AArch64 SVE specification explicitly allows non-power-of-two sizes for the vector registers and thus can't claim that vscale is a power of two by this logic.

Differential Revision: https://reviews.llvm.org/D129609