[llvm] [AArch64] Disable consecutive store merging when Neon is unavailable (PR #111519)
Benjamin Maxwell via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 9 02:23:43 PDT 2024
================
@@ -27924,6 +27924,24 @@ bool AArch64TargetLowering::isIntDivCheap(EVT VT, AttributeList Attr) const {
   return OptSize && !VT.isVector();
 }
+bool AArch64TargetLowering::canMergeStoresTo(unsigned AddressSpace, EVT MemVT,
+                                             const MachineFunction &MF) const {
+  // Avoid merging stores into fixed-length vectors when Neon is unavailable.
+  // In future, we could allow this when SVE is available, but currently,
+  // the SVE lowerings for BUILD_VECTOR are limited to a few specific cases (and
+  // the general lowering may introduce stack spills/reloads).
----------------
MacDue wrote:
I think there are two (slightly) independent cases here. First, there's unwanted store merging in non-streaming functions, where we could just use an `stp` instead. E.g.
```
mov v0.s[1], v1.s[0]
str d0, [x0]
```
->
```
stp s0, s1, [x0]
```
That's not fixed in this PR.
Then there's streaming-mode store merging, which results in stack spills due to the BUILD_VECTOR lowering. Disabling store merging means that, in some cases, we use the preferable `stp` in streaming mode, but that's a secondary goal here; the main aim is to avoid the stack spills.
As for a streaming-mode/SVE BUILD_VECTOR lowering, I think there are a few options, but likely not as efficient as NEON (though maybe others have better ideas :smile:).
E.g. for <4 x float>:
You could make a chain of `INSR`:
```
insr z3.s, s2
insr z3.s, s1
insr z3.s, s0
str q3, [x0]
```
But `INSR` has a higher latency than a `MOV`. Also, there is a dependency chain here, as each `INSR` depends on the previous one.
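To make the data flow of the `INSR` chain concrete, here is a minimal C++ model of it. It is only a sketch: it assumes a 128-bit vector length (four 32-bit lanes) and that `s3` already sits in lane 0 of `z3`. The names `Vec4`, `insr`, and `buildWithInsr` are made up for illustration and are not LLVM APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Illustrative model of a 128-bit vector with four 32-bit lanes (lane 0 first).
using Vec4 = std::array<float, 4>;

// Models `insr zd.s, sN`: shift every lane up by one position and write the
// scalar into lane 0. The old top lane is discarded.
Vec4 insr(Vec4 z, float s) {
  for (std::size_t i = z.size() - 1; i > 0; --i)
    z[i] = z[i - 1];
  z[0] = s;
  return z;
}

// Builds <s0, s1, s2, s3> with three INSRs. Each step consumes the result of
// the previous one, which is the serial dependency chain mentioned above.
Vec4 buildWithInsr(float s0, float s1, float s2, float s3) {
  Vec4 z3 = {s3, 0.0f, 0.0f, 0.0f}; // Assumption: s3 occupies lane 0 of z3.
  z3 = insr(z3, s2);                // z3 = <s2, s3, _, _>
  z3 = insr(z3, s1);                // z3 = <s1, s2, s3, _>
  z3 = insr(z3, s0);                // z3 = <s0, s1, s2, s3>
  assert(z3[3] == s3);              // The initial lane 0 value ends up in lane 3.
  return z3;
}
```

Note that the scalars are inserted in reverse order (`s2`, `s1`, `s0`), since each `INSR` pushes the earlier elements towards the higher lanes.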
Another option is a chain of `ZIP1`:
```
zip1 z2.s, z2.s, z3.s
zip1 z0.s, z0.s, z1.s
zip1 z0.d, z0.d, z2.d
str q0, [x0]
```
This seems like it may be more efficient than `INSR`, and also allows for a shorter dependency chain (log n), but it is still likely not as efficient as just `MOV`s.
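As a rough illustration of the `ZIP1` approach, the following C++ sketch models the three-instruction sequence and compares dependency-chain depths. It assumes a 128-bit vector length and that each scalar `sN` sits in lane 0 of `zN`. All names here (`Vec4`, `zip1`, `buildWithZip1`, `insrDepth`, `zip1Depth`) are hypothetical helpers, not LLVM APIs, and the depth functions only count instructions on the critical path, not real latencies.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Illustrative model of a 128-bit vector with four 32-bit lanes (lane 0 first).
using Vec4 = std::array<float, 4>;

// Models ZIP1: interleave the elements from the low halves of `a` and `b`.
// `group` is the element size in 32-bit lanes: 1 for `.s`, 2 for `.d`.
Vec4 zip1(const Vec4 &a, const Vec4 &b, std::size_t group) {
  assert(group == 1 || group == 2);
  Vec4 out{};
  const std::size_t elems = out.size() / group; // Elements per vector.
  for (std::size_t e = 0; e < elems / 2; ++e) {  // Low half of the sources.
    for (std::size_t k = 0; k < group; ++k) {
      out[(2 * e) * group + k] = a[e * group + k];
      out[(2 * e + 1) * group + k] = b[e * group + k];
    }
  }
  return out;
}

// Builds <s0, s1, s2, s3>; the first two ZIP1s are independent of each other.
Vec4 buildWithZip1(float s0, float s1, float s2, float s3) {
  Vec4 z0 = {s0}, z1 = {s1}, z2 = {s2}, z3 = {s3}; // Scalars in lane 0.
  z2 = zip1(z2, z3, 1); // zip1 z2.s, z2.s, z3.s -> <s2, s3, _, _>
  z0 = zip1(z0, z1, 1); // zip1 z0.s, z0.s, z1.s -> <s0, s1, _, _>
  z0 = zip1(z0, z2, 2); // zip1 z0.d, z0.d, z2.d -> <s0, s1, s2, s3>
  return z0;
}

// Critical-path length for n elements: a serial INSR chain needs n - 1 steps.
unsigned insrDepth(unsigned n) { return n == 0 ? 0 : n - 1; }

// A ZIP1 tree doubles the number of packed elements per level: ceil(log2(n)).
unsigned zip1Depth(unsigned n) {
  unsigned depth = 0;
  for (unsigned width = 1; width < n; width *= 2)
    ++depth;
  return depth;
}
```

For `<4 x float>` this gives a depth of 3 for the `INSR` chain versus 2 for the `ZIP1` tree, even though both sequences use three instructions.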
https://github.com/llvm/llvm-project/pull/111519