[PATCH] D103597: [AArch64LoadStoreOptimizer] Generate more STPs by renaming registers earlier
Martin Storsjö via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 10 00:29:40 PDT 2021
mstorsjo added a comment.
The problem I'm seeing can be reproduced with this source file: https://martin.st/temp/cdf-preproc.c
To reproduce:
$ clang -target aarch64-w64-mingw32 -S -O3 cdf-preproc.c
One difference in the generated asm looks like this, where I think the error is visible:
  ldrh w10, [x2, #6032]
  strh wzr, [x1, #6034]
  strh w10, [x1, #6032]
- ldr q0, [x2, #16]
- str q0, [x1, #16]
- ldr q0, [x2]
+ ldp q0, q1, [x2]
  strh wzr, [x1, #24]
- str q0, [x1]
- ldr q0, [x2, #48]
- str q0, [x1, #48]
- ldr q0, [x2, #32]
+ stp q0, q1, [x1]
+ ldp q0, q1, [x2, #32]
Previously, the `strh wzr, [x1, #24]` was executed after the `str q0, [x1, #16]`, so it overwrote two bytes of the just-written qword with zeros. Now the strh stays where it was and is executed before the new `stp q0, q1, [x1]`. Since that stp writes bytes 0 to 31 from x1, it clobbers the zeros stored at offset 24.
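To make the ordering problem concrete, here is a minimal sketch that models the 32 bytes at `[x1]` as a byte array and replays the two store sequences from the diff above. The helper `store`, the variable names and the source byte values are made up for illustration; only the store order and offsets are taken from the asm.

```python
def store(mem, offset, data):
    """Write `data` into `mem` starting at `offset` (models a store instruction)."""
    mem[offset:offset + len(data)] = data

# Hypothetical source data at [x2]: 32 nonzero bytes, so zeros are easy to spot.
src = bytes(range(1, 33))
lo, hi = src[:16], src[16:32]   # the qwords at [x2] and [x2, #16]
zero_half = bytes(2)            # wzr stored as a halfword

# Old sequence: the strh runs after the qword store that covers offset 24.
old = bytearray(32)
store(old, 16, hi)          # str q0, [x1, #16]
store(old, 24, zero_half)   # strh wzr, [x1, #24]
store(old, 0, lo)           # str q0, [x1]

# New sequence: the strh runs before the paired store, which then overwrites it.
new = bytearray(32)
store(new, 24, zero_half)   # strh wzr, [x1, #24]
store(new, 0, lo + hi)      # stp q0, q1, [x1]

print("old bytes 24..25:", bytes(old[24:26]))
print("new bytes 24..25:", bytes(new[24:26]))
```

In the old sequence bytes 24 and 25 end up zero, while in the new one they hold the copied source data, which matches the difference described above.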
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D103597/new/
https://reviews.llvm.org/D103597