[PATCH] D146218: [AArch64][CodeGen] Lower (de)interleave2 intrinsics to ld2/st2

Paul Walker via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 1 09:35:11 PDT 2023


paulwalker-arm added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:14428
 
+static Function *GetStructuredLoadFunction(Module *M, unsigned Factor,
+                                           bool Scalable, Type *LDVTy,
----------------
Function names should start with a lower case letter?


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:14443
+
+static Function *GetStructuredStoreFunction(Module *M, unsigned Factor,
+                                            bool Scalable, Type *STVTy,
----------------
As above.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:14794-14795
+  VectorType *VTy = cast<VectorType>(DI->getType()->getContainedType(0));
+  if (!Subtarget->hasNEON() || (VTy->isScalableTy() && !Subtarget->hasSVE()))
+    return false;
+
----------------
Perhaps this is better done in `isLegalInterleavedAccessType`? as it cannot really be considered legal if the necessary target features are not available. You likely also want to use `hasSVEorSME` because SME supports these instructions as well.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:14810-14812
+  VectorType *LdTy =
+      VectorType::get(VTy->getElementType(),
+                      VTy->getElementCount().divideCoefficientBy(NumLoads));
----------------
I suspect this patch will ICE the compiler when faced with SVE fixed length vectors and there's no tests to prove otherwise.  The nuance here is the result vectors are fixed length but the target intrinsics will generate scalable vectors.

If you look at `lowerInterleavedLoad` you'll see the extra complexity relating to working with a container type.

With that said, I'm happy for this patch to not support SVE fixed length vectors, it just cannot trigger an ICE.  This might be as simple as adding a bail out for `UseScalable != isa<ScalableVector>(VTy)` and adding a few tests.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:14828
+        LdTy->getElementCount(),
+        ConstantInt::getTrue(IntegerType::getInt1Ty(VTy->getContext())));
+
----------------
I think just `ConstantInt::getTrue(LdTy->getContext())` should work here?


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:14877
+  VectorType *VTy = cast<VectorType>(II->getOperand(0)->getType());
+  if (!Subtarget->hasNEON() || (VTy->isScalableTy() && !Subtarget->hasSVE()))
+    return false;
----------------
Same comment as with lowerDeinterleaveIntrinsicToLoad.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:14890-14892
+  StoreInst *SI = dyn_cast<StoreInst>(*(II->users().begin()));
+  if (!SI)
+    return false;
----------------
I don't really see what you're gaining by passing in `Address` rather than just passing in `SI`?  I guess similar is true for `lowerDeinterleaveIntrinsicToLoad`.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:14916
+        StTy->getElementCount(),
+        ConstantInt::getTrue(IntegerType::getInt1Ty(StTy->getContext())));
+
----------------
`ConstantInt::getTrue(StTy->getContext())`?


================
Comment at: llvm/test/Transforms/InterleavedAccess/AArch64/sve-deinterleave-intrinsics.ll:18-19
+  %deinterleave = tail call { <vscale x 16 x i8>, <vscale x 16 x i8> } @llvm.experimental.vector.deinterleave2.nxv32i8(<vscale x 32 x i8> %load)
+  %l = extractvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } %deinterleave, 0
+  %r = extractvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } %deinterleave, 1
+  ret void
----------------
How about returning `%deinterleave` instead of the floating `extractvalue`'s, or do they exist to test something specific?


================
Comment at: llvm/test/Transforms/InterleavedAccess/AArch64/sve-deinterleave-intrinsics.ll:190-192
+;;; Fixed types
+
+define void @deinterleave_i8_factor2(ptr %ptr) #0 {
----------------
I think it's better for NEON to have its own test file.  Likewise for SVE fixed length support, which I suspect doesn't work.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146218/new/

https://reviews.llvm.org/D146218



More information about the llvm-commits mailing list