[PATCH] [AArch64][ARM] Match interleaved memory accesses into ldN/stN/vldN/vstN intrinsics.

Michael Zolotukhin mzolotukhin at apple.com
Wed Jun 24 11:59:20 PDT 2015


Hi Hao,

The code generally looks fine, but I have a question regarding `lowerInterleavedStore` (please see inline).

Thanks,
Michael


================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:32-33
@@ +31,4 @@
+// E.g. An interleaved store (Factor = 2):
+//        %i.vec = shuffle %v0, %v1, <0, 4, 1, 5, 2, 6, 3, 7>  ; Interleaved vec
+//        store <8 x i32> %i.vec, <8 x i32>* %ptr
+//
----------------
How would the IR look for 4 vectors? Would we have a shuffle of shuffles?
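
For reference, the Factor = 2 mask in the quoted comment follows a simple pattern: result lane `Lane * Factor + Vec` reads element `Vec * NumSubElts + Lane` of the concatenated inputs. Below is a minimal standalone sketch of that pattern for an arbitrary factor. `makeInterleaveMask` and its parameter names are hypothetical and not taken from the patch. Since `shufflevector` takes only two vector operands, the sketch assumes the sub-vectors have already been concatenated; whether the patch does that for 4 vectors is exactly the open question.
```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of the patch: builds the shuffle mask that
// interleaves Factor sub-vectors of NumSubElts lanes each. It assumes the
// sub-vectors are laid out back to back in the (concatenated) shuffle input.
std::vector<unsigned> makeInterleaveMask(unsigned NumSubElts, unsigned Factor) {
  assert(Factor >= 2 && "interleaving needs at least two sub-vectors");
  std::vector<unsigned> Mask;
  Mask.reserve(NumSubElts * Factor);
  // Emit lane 0 of every sub-vector, then lane 1 of every sub-vector, etc.
  for (unsigned Lane = 0; Lane < NumSubElts; ++Lane)
    for (unsigned Vec = 0; Vec < Factor; ++Vec)
      Mask.push_back(Vec * NumSubElts + Lane);
  return Mask;
}
```
With these assumptions, `makeInterleaveMask(4, 2)` reproduces the `<0, 4, 1, 5, 2, 6, 3, 7>` mask from the comment above.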

================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:56-57
@@ +55,4 @@
+
+static const unsigned MIN_FACTOR = 2;
+static const unsigned MAX_FACTOR = 4;
+
----------------
Do these names comply with the coding standards?

================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:240-242
@@ +239,5 @@
+
+  ShuffleVectorInst *SVI = dyn_cast<ShuffleVectorInst>(SI->getValueOperand());
+  if (!SVI || !SVI->hasOneUse())
+    return false;
+
----------------
Will this work for `Factor != 2`? If not, and other factors aren't supported for now, please add an explicit assert and a TODO for it. If yes, should we also check the other shuffles?

================
Comment at: lib/Target/AArch64/AArch64TargetTransformInfo.cpp:415
@@ +414,3 @@
+
+  if (Factor > 1 && Factor < 5) {
+    unsigned NumElts = VecTy->getVectorNumElements();
----------------
Nitpick: I'd prefer comparing with 2 and 4, instead of 1 and 5. I.e.
```
if (Factor >= 2 && Factor <= 4)
```
Also, could we somehow reuse `MIN_FACTOR` and `MAX_FACTOR` from `InterleavedAccessPass.cpp` here? Duplicating the same constants in different places will lead to bugs in the future.
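
One possible shape for that, as a rough sketch only: put the bounds and the range check in a single shared header so the pass and the cost model cannot drift apart. The header location, the `MinFactor`/`MaxFactor` spellings, and `isSupportedInterleaveFactor` are all assumptions made for illustration; none of them exist in the patch.
```cpp
// Hypothetical shared header contents (names and location are assumptions).
// Both InterleavedAccessPass.cpp and the target cost model would include it.
constexpr unsigned MinFactor = 2;
constexpr unsigned MaxFactor = 4;

// Single definition of the supported range, written with inclusive bounds.
constexpr bool isSupportedInterleaveFactor(unsigned Factor) {
  return Factor >= MinFactor && Factor <= MaxFactor;
}

// Compile-time checks that the inclusive bounds match `Factor > 1 && Factor < 5`.
static_assert(isSupportedInterleaveFactor(2) && isSupportedInterleaveFactor(4),
              "both ends of the range are supported");
static_assert(!isSupportedInterleaveFactor(1) && !isSupportedInterleaveFactor(5),
              "values just outside the range are rejected");
```
The check in `AArch64TargetTransformInfo.cpp` could then read `if (isSupportedInterleaveFactor(Factor))`, assuming such a helper were added.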

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:142
@@ -141,3 +141,3 @@
 static cl::opt<bool> EnableInterleavedMemAccesses(
-    "enable-interleaved-mem-accesses", cl::init(false), cl::Hidden,
+    "enable-interleaved-mem-accesses", cl::init(true), cl::Hidden,
     cl::desc("Enable vectorization on interleaved memory accesses in a loop"));
----------------
This change doesn't belong in this patch and needs a separate discussion anyway.

http://reviews.llvm.org/D10533

More information about the llvm-commits mailing list