[llvm] [AArch64] Disable consecutive store merging when Neon is unavailable (PR #111519)
Sander de Smalen via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 9 01:09:33 PDT 2024
================
@@ -27924,6 +27924,24 @@ bool AArch64TargetLowering::isIntDivCheap(EVT VT, AttributeList Attr) const {
return OptSize && !VT.isVector();
}
+bool AArch64TargetLowering::canMergeStoresTo(unsigned AddressSpace, EVT MemVT,
+ const MachineFunction &MF) const {
+ // Avoid merging stores into fixed-length vectors when Neon is unavailable.
+ // In future, we could allow this when SVE is available, but currently,
+ // the SVE lowerings for BUILD_VECTOR are limited to a few specific cases (and
+ // the general lowering may introduce stack spills/reloads).
----------------
sdesmalen-arm wrote:
[Just thinking out loud here] My understanding is that, for the example in the test, we don't want this optimisation because we can use `stp` instructions instead: merging the stores has no upside, and it has a possible downside in that the insert operation is expensive. At the moment it is expensive because we use a spill/reload, but for streaming[-compatible] SVE we could implement the operation using the SVE `INSR` instruction, which may not be any less efficient than the NEON operation if the value being inserted is also in an FPR/SIMD register. Given the lack of upside, disabling the merging of stores avoids this complexity altogether, which understandably is the route chosen here.
I guess the question is: for which cases is merging stores beneficial when NEON is available? And for those cases, can we implement them efficiently using Streaming SVE?
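
For illustration only, the trade-off described above can be sketched as a toy decision function. This is not the code from the patch (the hunk quoted above is truncated) and it uses no LLVM types: `ToySubtarget`, `ToyMemVT`, and `toyCanMergeStoresTo` are hypothetical names. The only assumption is the one stated in the quoted code comment, namely that merging into a fixed-length vector is refused when Neon is unavailable.

```cpp
#include <cassert>

// Hypothetical stand-in for the subtarget features relevant here.
// Not an LLVM type.
struct ToySubtarget {
  bool HasNeon; // Neon usable in the current function
  bool HasSVE;  // SVE (possibly streaming) usable
};

// Hypothetical stand-in for the merged memory type. Not an LLVM type.
struct ToyMemVT {
  bool IsVector;   // merged store would use a vector type
  bool IsScalable; // vector is scalable rather than fixed-length
};

// Toy model of the policy in the quoted comment: refuse to merge
// consecutive stores into a fixed-length vector when Neon is
// unavailable, even if SVE is present. Everything else is allowed.
bool toyCanMergeStoresTo(const ToySubtarget &ST, const ToyMemVT &VT) {
  const bool IsFixedLengthVector = VT.IsVector && !VT.IsScalable;
  if (IsFixedLengthVector && !ST.HasNeon)
    return false;
  return true;
}

// Small self-check covering the cases discussed in the review.
void toyChecks() {
  const ToyMemVT FixedVec{true, false};
  const ToyMemVT Scalar{false, false};
  // Streaming-style configuration: SVE but no Neon, so no merge.
  assert(!toyCanMergeStoresTo(ToySubtarget{false, true}, FixedVec));
  // Neon available: merging into a fixed-length vector is allowed.
  assert(toyCanMergeStoresTo(ToySubtarget{true, false}, FixedVec));
  // Merging into a wider scalar is unaffected by the Neon check.
  assert(toyCanMergeStoresTo(ToySubtarget{false, true}, Scalar));
}
```

If an efficient SVE lowering (for example one based on `INSR`) were added later, the `!ST.HasNeon` condition in this sketch is the place that would be relaxed for the cases where merging proves beneficial.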
https://github.com/llvm/llvm-project/pull/111519