[PATCH] [AArch64][ARM] Match interleaved memory accesses into ldN/stN/vldN/vstN intrinsics.

Hao Liu Hao.Liu at arm.com
Thu Jun 25 00:19:58 PDT 2015


Hi,

I refactored the patch according to Michael's comments. As well as inline comments to answer questions from Renato and Michael.

Review please.

Thanks,
-Hao


================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:134
@@ +133,3 @@
+  // Check potential Factors.
+  for (Factor = MIN_FACTOR; Factor <= MAX_FACTOR; Factor++)
+    if (isDeInterleaveMaskOfFactor(Mask, Factor, Index))
----------------
rengolin wrote:
> HaoLiu wrote:
> > rengolin wrote:
> > > Checking for all factors "up to" in isDeInterleaveMaskOfFactor() is redundant with this line.
> > > 
> > > Though, I see that you're using it in other functions that may need that functionality.
> > > 
> > > Not sure how to split this, but it looks inefficient...
> > I merged isDeInterleaveMask() and isDeInterleaveMaskOfFactor() into one function isDeInterleaveMask().
> Hum, I'm still seeing isDeInterleaveMaskOfFactor in the latest patch...
Sorry. I misunderstood and misleaded. I merged isReInterleaveMask() and isReInterleaveMaskOfFactor().

I cannot merge isDeInterleaveMask() and isDeInterleaveMaskOfFactor(), which are both used in lowerInterleavedLoad(). The former is used to check and find an interleave factor. The later is only used to check whether the given mask is the DE-interleaved of the given factor.

================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:32-33
@@ +31,4 @@
+// E.g. An interleaved store (Factor = 2):
+//        %i.vec = shuffle %v0, %v1, <0, 4, 1, 5, 2, 6, 3, 7>  ; Interleaved vec
+//        store <8 x i32> %i.vec, <8 x i32>* %ptr
+//
----------------
mzolotukhin wrote:
> How would IR look for 4 vectors? Will we have a shuffle of shuffles?
Will have a shuffle.
E.g. An interleaved store of factor 4.
        %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v2, <0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15>
        store <16 x i32> %i.vec, <16 x i32>* %ptr

%v0 and %v2 could be concatenated from other small vectors, like:
        %v0 = shuffle <4 x i32> %A, <4 x i32> %B, <0, 1, 2, 3, 4, 5, 6, 7>
        %v1 = shuffle <4 x i32> %C, <4 x i32> %D, <0, 1, 2, 3, 4, 5, 6, 7>

 but we only need to check the last shuffle with the RE-interleaved mask.


================
Comment at: lib/CodeGen/InterleavedAccessPass.cpp:240-242
@@ +239,5 @@
+
+  ShuffleVectorInst *SVI = dyn_cast<ShuffleVectorInst>(SI->getValueOperand());
+  if (!SVI || !SVI->hasOneUse())
+    return false;
+
----------------
mzolotukhin wrote:
> Will it work for `Factor != 2`? If not, and other factors aren't supported for now, please add an explicit assert and TODO for it. If yes, should we also check the other shuffles?
Yes, it will work for other factor. 
As the previous example of factor 4, we only need to check the last shuffle with RE-interleaved mask.

================
Comment at: lib/Target/AArch64/AArch64TargetTransformInfo.cpp:415
@@ +414,3 @@
+
+  if (Factor > 1 && Factor < 5) {
+    unsigned NumElts = VecTy->getVectorNumElements();
----------------
mzolotukhin wrote:
> Nitpick: I'd prefer comparing with 2 and 4, instead of 1 and 5. I.e.
> ```
> if (Factor >= 2 && Factor <= 4)
> ```
> Also, could we somehow reuse `MIN_FACTOR` and `MAX_FACTOR` from `InterleavedAccessPass.cpp` here? Having the same constants in different places will lead to bugs in future.
I refactored to add a hook called getMaxSupportedInterleaveFactor(), which is used to share the maximum factor supported by a target. 
No need to get the minimum factor, which is always 2.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:142
@@ -141,3 +141,3 @@
 static cl::opt<bool> EnableInterleavedMemAccesses(
-    "enable-interleaved-mem-accesses", cl::init(false), cl::Hidden,
+    "enable-interleaved-mem-accesses", cl::init(true), cl::Hidden,
     cl::desc("Enable vectorization on interleaved memory accesses in a loop"));
----------------
mzolotukhin wrote:
> This change doesn't belong here and anyway needs a separate discussion.
Yes. Forgot to clean.

http://reviews.llvm.org/D10533

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the llvm-commits mailing list