[llvm] [AArch64] Fixes for BigEndian 128bit volatile, atomic and non-temporal loads/stores (PR #67413)

David Green via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 26 04:12:33 PDT 2023


================
@@ -5705,11 +5705,11 @@ SDValue AArch64TargetLowering::LowerSTORE(SDValue Op,
     // legalization will break up 256 bit inputs.
     ElementCount EC = MemVT.getVectorElementCount();
     if (StoreNode->isNonTemporal() && MemVT.getSizeInBits() == 256u &&
-        EC.isKnownEven() &&
-        ((MemVT.getScalarSizeInBits() == 8u ||
-          MemVT.getScalarSizeInBits() == 16u ||
-          MemVT.getScalarSizeInBits() == 32u ||
-          MemVT.getScalarSizeInBits() == 64u))) {
+        EC.isKnownEven() && DAG.getDataLayout().isLittleEndian() &&
----------------
davemgreen wrote:

Non-temporal stores are a performance optimization, so we can always drop the "non-temporal" part and store them normally. The LDNP nodes are already handled the same way; this brings STNP in line.
The tests show a lot less shuffling, which should be better for performance.
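
As a rough illustration of the reasoning above, here is a standalone sketch of the decision, modeling only the conditions visible in the quoted hunk. `StoreInfo`, `StoreLowering` and `chooseLowering` are hypothetical names invented for this example; they are not LLVM APIs, and the real `LowerSTORE` may check more than is shown here.

```cpp
#include <cstdint>

// Hypothetical stand-in for the properties queried in the hunk above.
// This is not an LLVM type.
struct StoreInfo {
  bool IsNonTemporal;         // StoreNode->isNonTemporal()
  std::uint64_t SizeInBits;   // MemVT.getSizeInBits()
  bool ElementCountKnownEven; // EC.isKnownEven()
  bool IsLittleEndian;        // DAG.getDataLayout().isLittleEndian()
};

enum class StoreLowering { SplitIntoSTNP, NormalStore };

// The non-temporal hint is only an optimization, so falling back to a
// normal store is always correct. With the endianness check added,
// big-endian targets simply take that fallback.
constexpr StoreLowering chooseLowering(const StoreInfo &S) {
  return (S.IsNonTemporal && S.SizeInBits == 256u &&
          S.ElementCountKnownEven && S.IsLittleEndian)
             ? StoreLowering::SplitIntoSTNP
             : StoreLowering::NormalStore;
}
```

In this model, a 256-bit non-temporal store on a big-endian target yields `StoreLowering::NormalStore`, which is the "drop the hint" behaviour described above.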

https://github.com/llvm/llvm-project/pull/67413

