[PATCH] D133421: [AArch64] break non-temporal loads over 256 into 256-loads and a smaller load

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 7 07:08:10 PDT 2022


fhahn requested changes to this revision.
fhahn added a comment.
This revision now requires changes to proceed.

Thanks for sharing the patch. As is,  it looks like there are multiple test failures that are caused by this patch and need addressing (some comments inline about correctness issues)



================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:53
 #include "llvm/IR/Attributes.h"
+#include "llvm/IR/Constant.h"
 #include "llvm/IR/Constants.h"
----------------
Is this needed?


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:73
 #include "llvm/IR/Value.h"
+#include "llvm/IR/VectorBuilder.h"
 #include "llvm/MC/MCRegisterInfo.h"
----------------
Is this needed?


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17820
+  EVT MemVT = LD->getMemoryVT();
+  SDValue Chain = LD->getChain();
+  SDValue Ptr = LD->getBasePtr();
----------------
move all variables that are used after the early exit down after the early exit.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17825
+  SmallVector<SDValue, 4> LoadOps;
+  if (MemVT.getSizeInBits() <= 256 || MemVT.getSizeInBits() % 256 == 0 ||
+      256 % MemVT.getScalarSizeInBits() != 0)
----------------
We should also only do this for non-temporal loads.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17829
+
+  unsigned PtrOffset = 256;
+  MVT NewVT =
----------------
Could you add a comment here illustrating what kind of DAG nodes we create to replace the original load?


It would also be good to document the motivation for special handling for non-temporal loads here.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17836
+    SDValue NewPtr =
+        DAG.getMemBasePlusOffset(Ptr, TypeSize::Fixed(PtrOffset), DL, Flags);
+    Align NewAlign = commonAlignment(LD->getAlign(), PtrOffset);
----------------
I don't think that's correct, you are passing the offset in bits, but I think it should be in bytes.

Looking at some of the test changes, the offsets of the loads are wrong, e.g. ` ldnp q0, q2, [x0, #256]`.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17867
+  SDValue ExtendedReminingLoad = DAG.getNode(
+      ISD::INSERT_SUBVECTOR, DL, NewVT, {NullVector, RemainingLoad, InsertIdx});
+  LoadOps.push_back(ExtendedReminingLoad);
----------------
Do you need to create `NullVector` explicitly here? Could you just use `DAG.degUNDEF(NewVT)` here?


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:19860
   case ISD::LOAD:
-    if (performTBISimplification(N->getOperand(1), DCI, DAG))
-      return SDValue(N, 0);
----------------
This should be kept I think.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133421/new/

https://reviews.llvm.org/D133421



More information about the llvm-commits mailing list