[PATCH] D99272: [AArch64] Adds a pre-indexed Load/Store optimization for LDRQ-STRQ.

Stelios Ioannou via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 24 08:50:05 PDT 2021


stelios-arm created this revision.
stelios-arm added reviewers: SjoerdMeijer, dmgreen, sanwou01, samparker, fhahn, NickGuy.
Herald added subscribers: danielkiss, arphaman, hiraditya, kristof.beyls.
stelios-arm requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

This patch merges `STRQpre`-`STRQui` and `LDRQpre`-`LDRQui` instruction pairs into a single `STPQpre` and `LDPQpre` instruction, respectively.

For each pair, there is a MIR test that verifies this optimization.

---

This was a missed opportunity in the `AArch64` load/store optimiser for cases like this:

  #define float32_t float
  #define uint32_t unsigned
  
  void test(float32_t * S, float32_t * D, uint32_t N) {
    for (uint32_t i = 0; i < N; i++) {
      D[i] = D[i] + S[i];
    }
  }

When compiled with:

  -Ofast -target aarch64-arm-none-eabi -mcpu=cortex-a55 -mllvm -lsr-preferred-addressing-mode=preindexed

It results in: <https://godbolt.org/z/YGKxrr6Gs>

  .LBB0_9:                                // =>This Inner Loop Header: Depth=1
          ldr     q0, [x11, #32]!
          ldr     q1, [x11, #16]
          subs    x12, x12, #8                    // =8
          ldr     q2, [x10, #32]!
          ldr     q3, [x10, #16]
          fadd    v0.4s, v2.4s, v0.4s
          fadd    v1.4s, v3.4s, v1.4s
          stp     q0, q1, [x11]
          b.ne    .LBB0_9

where:

  ldr     q0, [x11, #32]!
  ldr     q1, [x11, #16]

should be:

  ldp     q0, q1, [x11, #32]!
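
The condition under which such a pair can be folded can be sketched as below. This is only an illustrative model, not the code from this patch: `QAccess`, `fitsPairedQOffset` and `canMergeToPreIndexedPair` are hypothetical names, and the sketch assumes the two instructions are adjacent, so it ignores the intervening-instruction and aliasing checks a real pass would need. The one architectural fact it relies on is that `LDP`/`STP` of Q registers take a signed 7-bit immediate scaled by 16, i.e. a multiple of 16 in the range -1024 to 1008.

```cpp
#include <cstdint>

// Hypothetical model of a Q-register (16-byte) load or store. These names are
// illustrative only and are not LLVM's MachineInstr API.
struct QAccess {
  unsigned BaseReg;   // e.g. 11 for x11
  unsigned DataReg;   // e.g. 0 for q0
  int64_t ByteOffset; // immediate offset in bytes
  bool IsLoad;        // true for ldr, false for str
  bool IsPreIndexed;  // true for the "[xN, #imm]!" writeback form
};

constexpr int64_t QSize = 16; // bytes accessed per Q register

// LDP/STP of Q registers encode a signed 7-bit immediate scaled by 16,
// so the byte offset must be a multiple of 16 in [-1024, 1008].
constexpr bool fitsPairedQOffset(int64_t Off) {
  return Off % QSize == 0 && Off >= -64 * QSize && Off <= 63 * QSize;
}

// Assumed simplification: First and Second are adjacent instructions.
constexpr bool canMergeToPreIndexedPair(const QAccess &First,
                                        const QAccess &Second) {
  return First.IsPreIndexed && !Second.IsPreIndexed && // pre + unsigned-offset
         First.IsLoad == Second.IsLoad &&              // both loads or both stores
         First.BaseReg == Second.BaseReg &&            // same (updated) base
         Second.ByteOffset == QSize &&                 // second follows the first
         (!First.IsLoad || First.DataReg != Second.DataReg) && // distinct load dests
         fitsPairedQOffset(First.ByteOffset);          // encodable in the pair
}
```

For the loop above, `canMergeToPreIndexedPair({11, 0, 32, true, true}, {11, 1, 16, true, false})` evaluates to true, which corresponds to folding the two loads into `ldp q0, q1, [x11, #32]!`.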

---

Additionally, for cases like:

  define <4 x i32>* @strqpre-strqui-merge(<4 x i32>* %p, <4 x i32> %a, <4 x i32> %b) {
  entry:
    %p0 = getelementptr <4 x i32>, <4 x i32>* %p, i32 2
    store <4 x i32> %a, <4 x i32>* %p0
    %p1 = getelementptr <4 x i32>, <4 x i32>* %p, i32 3
    store <4 x i32> %b, <4 x i32>* %p1
    ret <4 x i32>* %p0
  }

It results in <https://godbolt.org/z/9YbYrh8E5>:

  "strqpre-strqui-merge":                 // @strqpre-strqui-merge
          str     q0, [x0, #32]!
          str     q1, [x0, #16]
          ret

where the two store instructions should be merged into:

  stp     q0, q1, [x0, #32]!
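
Why the merged form is equivalent can be checked with a little address arithmetic. The sketch below is a hypothetical model (the `StoreEffect`, `separateStores` and `pairedStore` names are made up for illustration and are not part of the patch): it records the final base register value and the two addresses written, first for the pre-indexed `str` followed by the unsigned-offset `str`, then for a single pre-indexed `stp`.

```cpp
#include <cstdint>

// Hypothetical summary of what a two-Q-register store sequence does:
// where the base register ends up and which two addresses are written.
struct StoreEffect {
  uint64_t FinalBase; // value of the base register after the sequence
  uint64_t AddrFirst; // address receiving the first Q register
  uint64_t AddrSecond; // address receiving the second Q register
};

// str qA, [xN, #PreImm]!  followed by  str qB, [xN, #UiImm]
// The pre-indexed store updates xN first, then stores at the new xN;
// the second store addresses relative to the already-updated xN.
constexpr StoreEffect separateStores(uint64_t Base, int64_t PreImm,
                                     int64_t UiImm) {
  return StoreEffect{Base + static_cast<uint64_t>(PreImm),
                     Base + static_cast<uint64_t>(PreImm),
                     Base + static_cast<uint64_t>(PreImm) +
                         static_cast<uint64_t>(UiImm)};
}

// stp qA, qB, [xN, #PreImm]!
// Updates xN, then stores qA at xN and qB 16 bytes above it.
constexpr StoreEffect pairedStore(uint64_t Base, int64_t PreImm) {
  return StoreEffect{Base + static_cast<uint64_t>(PreImm),
                     Base + static_cast<uint64_t>(PreImm),
                     Base + static_cast<uint64_t>(PreImm) + 16};
}

constexpr bool sameEffect(const StoreEffect &A, const StoreEffect &B) {
  return A.FinalBase == B.FinalBase && A.AddrFirst == B.AddrFirst &&
         A.AddrSecond == B.AddrSecond;
}
```

With `PreImm = 32` and `UiImm = 16`, both sequences write to `%p + 32` and `%p + 48` and leave `x0` at `%p + 32`, matching `%p0`, `%p1` and the returned pointer in the IR above. With any other `UiImm` the two effects differ, so the pair is not a candidate for this merge.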

---

This patch covers both cases.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D99272

Files:
  llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
  llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
  llvm/test/CodeGen/AArch64/ldrqpre-ldrqui-merge.mir
  llvm/test/CodeGen/AArch64/strqpre-strqui-merge.mir

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D99272.333004.patch
Type: text/x-patch
Size: 22085 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210324/783ecc09/attachment.bin>

