[all-commits] [llvm/llvm-project] 1c096b: [SVE][LSR] Teach LSR to enable simple scaled-index...

Mon Jun 14 16:42:57 PDT 2021

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 1c096bf09ffd3d51665b60942d6bde19e7dbbd5a
      https://github.com/llvm/llvm-project/commit/1c096bf09ffd3d51665b60942d6bde19e7dbbd5a
  Author: Huihui Zhang <huihuiz at quicinc.com>
  Date:   2021-06-14 (Mon, 14 Jun 2021)

  Changed paths:
    M llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
    M llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
    M llvm/test/CodeGen/AArch64/sve-fold-vscale.ll
    A llvm/test/CodeGen/AArch64/sve-lsr-scaled-index-addressing-mode.ll

  Log Message:
  -----------
  [SVE][LSR] Teach LSR to enable simple scaled-index addressing mode generation for SVE.

Currently, Loop Strength Reduce (LSR) does not handle loops with a scalable stride very well.

Take, for instance, a loop vectorized with the scalable vector type <vscale x 8 x i16>
(see test/CodeGen/AArch64/sve-lsr-scaled-index-addressing-mode.ll, added by this patch).

Memory accesses are incremented by "16*vscale", while the induction variable is incremented
by "8*vscale". The scaling factor "2" needs to be extracted to build the candidate formula,
i.e., "reg(%in) + 2*reg({0,+,(8 * %vscale)})", so that the addrec register reg({0,+,(8*vscale)})
can be reused by both the Address and ICmpZero LSRUses to enable optimal solution selection.
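
As a rough illustration of the arithmetic above (not code from the patch), the
sketch below checks that bumping a pointer by 16*vscale bytes per iteration gives
the same byte offset as scaling the element index {0,+,(8*vscale)} by 2. The
function names are hypothetical and chosen only for this example.

```cpp
#include <cstdint>

// Byte offset on iteration Iter when the address itself is advanced by
// 16*vscale bytes per iteration (the unreduced form of the Address use).
uint64_t byteOffsetByPointerBump(uint64_t Iter, uint64_t VScale) {
  return Iter * 16 * VScale;
}

// Value of the induction variable {0,+,(8*vscale)} on iteration Iter,
// i.e. the index of the first i16 element handled by that iteration.
uint64_t elementIndex(uint64_t Iter, uint64_t VScale) {
  return Iter * 8 * VScale;
}

// Same byte offset expressed as 2 * index, where 2 is the size of an i16
// in bytes. This is the shape of the scaled-index formula
// reg(%in) + 2*reg({0,+,(8*vscale)}), minus the base register.
uint64_t byteOffsetByScaledIndex(uint64_t Iter, uint64_t VScale) {
  return 2 * elementIndex(Iter, VScale);
}
```

Because the two offsets agree for every iteration and every vscale, a single
register holding the element index can serve both the address computation and
the loop exit comparison.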

This patch allows LSR's getExactSDiv to recognize special cases like "C1*X*Y /s C2*X*Y"
and pull out "C1 /s C2" as the scaling factor whenever possible. Without this change, LSR
misses the candidate formula with the proper scaling factor needed to leverage the target's
scaled-index addressing mode.
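
The special case can be modelled with a small standalone sketch. This is a toy
model under stated assumptions, not the LLVM implementation: the Product type and
the exactSDiv function are hypothetical stand-ins for SCEV multiply expressions
and getExactSDiv, and symbolic factors are represented as plain strings.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for a multiply expression: Coeff * Factors[0] * Factors[1] * ...
// (hypothetical type for illustration; LLVM uses SCEV nodes instead).
struct Product {
  int64_t Coeff;
  std::vector<std::string> Factors;
};

// Models "C1*X*Y /s C2*X*Y": succeeds only if both sides share the same
// symbolic factors and C1 is exactly divisible by C2. On success, stores
// C1 /s C2 in Quotient and returns true.
bool exactSDiv(Product LHS, Product RHS, int64_t &Quotient) {
  // Division by zero has no exact quotient.
  if (RHS.Coeff == 0)
    return false;
  // INT64_MIN / -1 overflows, so reject it.
  if (LHS.Coeff == INT64_MIN && RHS.Coeff == -1)
    return false;
  // Compare the symbolic parts independent of operand order.
  std::sort(LHS.Factors.begin(), LHS.Factors.end());
  std::sort(RHS.Factors.begin(), RHS.Factors.end());
  if (LHS.Factors != RHS.Factors)
    return false;
  // Only an exact division (zero remainder) yields a usable scale.
  if (LHS.Coeff % RHS.Coeff != 0)
    return false;
  Quotient = LHS.Coeff / RHS.Coeff;
  return true;
}
```

With this model, dividing 16*vscale by 8*vscale succeeds with a quotient of 2,
while 12*vscale by 8*vscale, or 16*vscale by 8*n, is rejected.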

Note: This patch doesn't fully fix AArch64 isLegalAddressingMode for scalable
vectors, but it allows simple valid scales to pass through.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D103939