[PATCH] D103597: [AArch64LoadStoreOptimizer] Generate more STPs by renaming registers earlier

Martin Storsjö via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 10 00:29:40 PDT 2021


mstorsjo added a comment.

The problem I'm seeing can be reproduced with this source file: https://martin.st/temp/cdf-preproc.c

To reproduce:

  $ clang -target aarch64-w64-mingw32 -S -O3 cdf-preproc.c

One of the differences in the generated asm, where I think the error is visible, looks like this:

          ldrh    w10, [x2, #6032]
          strh    wzr, [x1, #6034]
          strh    w10, [x1, #6032]
  -       ldr     q0, [x2, #16]
  -       str     q0, [x1, #16]
  -       ldr     q0, [x2]
  +       ldp     q0, q1, [x2]
          strh    wzr, [x1, #24]
  -       str     q0, [x1]
  -       ldr     q0, [x2, #48]
  -       str     q0, [x1, #48]
  -       ldr     q0, [x2, #32]
  +       stp     q0, q1, [x1]
  +       ldp     q0, q1, [x2, #32]

Previously, the `strh wzr, [x1, #24]` was done after the `str q0, [x1, #16]`, so it overwrote part of the just-written qword with zeros. Now the strh stays where it was and is done before the new `stp q0, q1, [x1]`. That stp writes bytes 0 through 31 at x1, so it overwrites the zeroed halfword at offset 24 with data from q1.
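To make the ordering issue concrete, here is a minimal C sketch that models the two store sequences from the diff with `memcpy`/`memset` on plain byte buffers. It is only an illustration, not code from the patch or from cdf-preproc.c; the function names (`store_old_order`, `store_new_order`, `halfword_is_zero`, `check_orders_differ`) are made up for this example, and `dst`/`src` stand in for the memory at x1/x2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { BLOCK = 32 }; /* bytes covered by one ldp/stp of two q registers */

/* Store order before the patch (hypothetical model of the old asm):
 *   str q0, [x1, #16] ; strh wzr, [x1, #24] ; str q0, [x1]          */
void store_old_order(uint8_t *dst, const uint8_t *src) {
    memcpy(dst + 16, src + 16, 16); /* ldr/str q0 at offset 16 */
    memset(dst + 24, 0, 2);         /* strh wzr: zeros land on top */
    memcpy(dst, src, 16);           /* ldr/str q0 at offset 0 */
}

/* Store order after the patch (hypothetical model of the new asm):
 *   ldp q0, q1, [x2] ; strh wzr, [x1, #24] ; stp q0, q1, [x1]       */
void store_new_order(uint8_t *dst, const uint8_t *src) {
    uint8_t q[BLOCK];
    memcpy(q, src, BLOCK);          /* ldp q0, q1, [x2] */
    memset(dst + 24, 0, 2);         /* strh wzr now runs first... */
    memcpy(dst, q, BLOCK);          /* ...and stp overwrites bytes 24-25 */
}

/* Returns 1 if the halfword at offset 24 is zero, 0 otherwise. */
int halfword_is_zero(const uint8_t *dst) {
    return dst[24] == 0 && dst[25] == 0;
}

/* Runs both orders on the same non-zero input and compares the result. */
void check_orders_differ(void) {
    uint8_t src[BLOCK], old_dst[BLOCK], new_dst[BLOCK];
    memset(src, 0xAB, sizeof src);
    memset(old_dst, 0xFF, sizeof old_dst);
    memset(new_dst, 0xFF, sizeof new_dst);

    store_old_order(old_dst, src);
    store_new_order(new_dst, src);

    assert(halfword_is_zero(old_dst));  /* zeros survive in the old order */
    assert(!halfword_is_zero(new_dst)); /* zeros are lost in the new order */
}
```

Calling `check_orders_differ()` exercises both orders on the same input: only the old order leaves the zeroed halfword at offset 24 in place, which is the behaviour I would expect the optimized code to preserve.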


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103597/new/

https://reviews.llvm.org/D103597


