[llvm] [AArch64] Fixes for BigEndian 128bit volatile, atomic and non-temporal loads/stores (PR #67413)
David Green via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 26 04:12:33 PDT 2023
================
@@ -5705,11 +5705,11 @@ SDValue AArch64TargetLowering::LowerSTORE(SDValue Op,
// legalization will break up 256 bit inputs.
ElementCount EC = MemVT.getVectorElementCount();
if (StoreNode->isNonTemporal() && MemVT.getSizeInBits() == 256u &&
- EC.isKnownEven() &&
- ((MemVT.getScalarSizeInBits() == 8u ||
- MemVT.getScalarSizeInBits() == 16u ||
- MemVT.getScalarSizeInBits() == 32u ||
- MemVT.getScalarSizeInBits() == 64u))) {
+ EC.isKnownEven() && DAG.getDataLayout().isLittleEndian() &&
----------------
davemgreen wrote:
Non-temporal stores are a performance optimization, so we can always drop the "non-temporal" part and store them normally. The LDNP nodes are already handled the same way; this brings STNP in line with them.
If you look at the tests, there is a lot less shuffling, which should be better for performance.
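
For readers without the surrounding code, here is a rough standalone sketch of the decision described above. It is not the code from the patch: `StoreInfo`, `StoreLowering` and `chooseLowering` are hypothetical names that only model the properties the real check reads from `StoreNode`, `MemVT` and the data layout, and any further conditions not visible in the quoted hunk are left out.

```cpp
#include <cstdint>

// Hypothetical stand-in for the properties the real check queries;
// this is not an LLVM type.
struct StoreInfo {
  bool NonTemporal;      // models StoreNode->isNonTemporal()
  uint64_t SizeInBits;   // models MemVT.getSizeInBits()
  unsigned NumElements;  // models the element count checked by isKnownEven()
  bool LittleEndian;     // models DAG.getDataLayout().isLittleEndian()
};

// Hypothetical result type: either take the STNP path or store normally.
enum class StoreLowering { SplitToSTNP, NormalStore };

// Sketch of the guard: the non-temporal path is only taken on
// little-endian targets. Everything else falls back to a normal store,
// which is still correct because non-temporal is only a hint.
StoreLowering chooseLowering(const StoreInfo &S) {
  if (S.NonTemporal && S.SizeInBits == 256u && S.NumElements % 2 == 0 &&
      S.LittleEndian)
    return StoreLowering::SplitToSTNP;
  return StoreLowering::NormalStore;
}
```

In this model, a big-endian 256-bit non-temporal store simply takes the `NormalStore` path, which is the trade-off argued for above: the hint is lost, but no extra shuffling is needed.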
https://github.com/llvm/llvm-project/pull/67413