[PATCH] D103597: [AArch64LoadStoreOptimizer] Generate more STPs by renaming registers earlier
Martin Storsjö via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 23 00:01:37 PDT 2021
mstorsjo added a comment.
In D103597#2809810 <https://reviews.llvm.org/D103597#2809810>, @mstorsjo wrote:
> The problem I'm seeing can be reproduced with this source file: https://martin.st/temp/cdf-preproc.c
>
> To reproduce:
>
> $ clang -target aarch64-w64-mingw32 -S -O3 cdf-preproc.c
Issues are still reproducible with this particular sample. (I'm also still seeing miscompilations in some other samples that I haven't narrowed down yet; I'm hoping they are just more cases of this same issue.) The diff of the assembly output for this sample looks like this:
--- good.s 2021-06-23 09:42:24.381499762 +0300
+++ bad.s 2021-06-23 09:42:40.521154128 +0300
@@ -1955,17 +1955,14 @@
  ldrh w10, [x2, #5756]
  strh wzr, [x1, #5758]
  strh w10, [x1, #5756]
- ldr q0, [x2, #976]
+ ldr q1, [x2, #976]
  add x10, x1, #960 // =960
- str q0, [x1, #976]
  ldr q0, [x2, #960]
  strh wzr, [x10, #30]
- str q0, [x1, #960]
- ldr q0, [x2, #1008]
- str q0, [x1, #1008]
- ldr q0, [x2, #992]
+ stp q0, q1, [x1, #960]
+ ldp q0, q2, [x2, #992]
  strh wzr, [x10, #62]
- str q0, [x1, #992]
+ stp q0, q2, [x1, #992]
  ldr q0, [x2, #1040]
  str q0, [x1, #1040]
  ldr q0, [x2, #1024]
Note that `x10 = x1 + 960`. The `strh wzr, [x10, #30]` clears a halfword at `[x1, #990]` (i.e. bytes `x1[990-991]`). This used to be done after `str q0, [x1, #976]` (which writes `x1[976-991]`), but now it is done before `stp q0, q1, [x1, #960]` (which writes `x1[960-991]`), so the `stp` overwrites the zeroed halfword with the last two bytes of `q1`.
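To make the overlap concrete, here is a small Python model of the two store orderings. It is only a sketch: the `run` helper, the buffer size, and the `0xaa`/`0xbb`/`0xee` fill patterns are placeholders I made up for illustration, not values from the actual sample. Offsets are relative to `x1`.

```python
# Toy model of the store ordering in good.s vs bad.s (offsets relative to x1).
# The byte patterns below are hypothetical placeholders, not real data.

def run(stores, size=1024):
    """Apply (offset, data) stores in program order to a scratch buffer."""
    mem = bytearray(b"\xee" * size)  # 0xee marks bytes never written
    for offset, data in stores:
        mem[offset:offset + len(data)] = data
    return mem

q_lo = b"\xaa" * 16    # 16 bytes loaded from [x2, #960]
q_hi = b"\xbb" * 16    # 16 bytes loaded from [x2, #976]
zero_h = b"\x00\x00"   # halfword written by strh wzr

# good.s: the halfword store comes after the q-register store that covers it.
good = run([
    (976, q_hi),         # str q0, [x1, #976]    -> x1[976-991]
    (960 + 30, zero_h),  # strh wzr, [x10, #30]  -> x1[990-991]
    (960, q_lo),         # str q0, [x1, #960]    -> x1[960-975]
])

# bad.s: the halfword store comes first, then the paired store covers it.
bad = run([
    (960 + 30, zero_h),  # strh wzr, [x10, #30]  -> x1[990-991]
    (960, q_lo + q_hi),  # stp q0, q1, [x1, #960] -> x1[960-991]
])

print(good[990:992].hex())  # 0000
print(bad[990:992].hex())   # bbbb
```

In this model the two orderings agree on every byte in `x1[960-989]` and differ only in `x1[990-991]`, which is the halfword that should have been zero.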
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D103597/new/
https://reviews.llvm.org/D103597