[llvm] [IA]: Construct (de)interleave4 out of (de)interleave2 (PR #89276)

Paul Walker via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 30 09:56:16 PDT 2024


================
@@ -17020,25 +17195,25 @@ bool AArch64TargetLowering::lowerInterleaveIntrinsicToStore(
     Pred =
         Builder.CreateVectorSplat(StTy->getElementCount(), Builder.getTrue());
 
-  Value *L = II->getOperand(0);
-  Value *R = II->getOperand(1);
-
+  auto WideValues = ValuesToInterleave;
+  if (UseScalable)
+    ValuesToInterleave.push_back(Pred);
+  ValuesToInterleave.push_back(BaseAddr);
   for (unsigned I = 0; I < NumStores; ++I) {
     Value *Address = BaseAddr;
     if (NumStores > 1) {
       Value *Offset = Builder.getInt64(I * Factor);
       Address = Builder.CreateGEP(StTy, BaseAddr, {Offset});
-
       Value *Idx =
           Builder.getInt64(I * StTy->getElementCount().getKnownMinValue());
-      L = Builder.CreateExtractVector(StTy, II->getOperand(0), Idx);
-      R = Builder.CreateExtractVector(StTy, II->getOperand(1), Idx);
+      for (unsigned J = 0; J < Factor; J++) {
+        ValuesToInterleave[J] =
+            Builder.CreateExtractVector(StTy, WideValues[J], Idx);
+      }
+      // update the address
+      ValuesToInterleave[ValuesToInterleave.size() - 1] = Address;
     }
-
-    if (UseScalable)
-      Builder.CreateCall(StNFunc, {L, R, Pred, Address});
-    else
-      Builder.CreateCall(StNFunc, {L, R, Address});
+    Builder.CreateCall(StNFunc, ValuesToInterleave);
----------------
paulwalker-arm wrote:

After this point it looks like there will be some dead instructions that remain after the transformation because the caller of `lowerInterleaveIntrinsicToStore` will only erase the store and first call to vector.interleave2.

You'll have to maintain a dead instruction list like you have for the deinterleaving case, but because instructions are erased "uses first" I think it best for the SmallVector to be passed into `lowerInterleaveIntrinsicToStore`, which then populates it, and the caller should then iterate across it calling erase.

If you agree with this change then I think it makes sense to do the same for `lowerDeinterleaveIntrinsicToLoad` so that all instruction deletion is the responsibility of the caller. The effect on other targets will be minimal because as it stands they don't have any dead instructions and so will just ignore the new parameter.
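
To illustrate the ownership split suggested above, here is a minimal, self-contained sketch. It does not use LLVM: `Inst`, `lowerInterleaveToStore` and `eraseAfterLowering` are hypothetical stand-ins invented for this example, and the toy `erase()` only models the rule that an instruction cannot be removed while it still has users. The hook fills a caller-owned list, root first, and the caller then erases the store followed by each listed instruction in order.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an IR instruction (hypothetical, not llvm::Instruction).
struct Inst {
  std::string Name;
  std::vector<Inst *> Operands;
  int NumUses = 0;
  bool Erased = false;

  explicit Inst(std::string N, std::vector<Inst *> Ops = {})
      : Name(std::move(N)), Operands(std::move(Ops)) {
    for (Inst *Op : Operands)
      ++Op->NumUses;
  }

  // Refuses to erase while users remain, which is why the order matters.
  bool erase() {
    if (Erased || NumUses != 0)
      return false;
    for (Inst *Op : Operands)
      --Op->NumUses;
    Erased = true;
    return true;
  }
};

// Hypothetical lowering hook: pretends the store intrinsic has been emitted,
// then records the interleave tree it made redundant, root first, so every
// entry appears before the instructions it uses.
bool lowerInterleaveToStore(Inst &Root, std::vector<Inst *> &DeadInsts) {
  DeadInsts.push_back(&Root);
  for (Inst *Op : Root.Operands)
    if (Op->Name == "interleave2")
      DeadInsts.push_back(Op);
  return true;
}

// Caller side: erase the store, then walk the list in order.
// Returns how many instructions were erased.
int eraseAfterLowering() {
  Inst V0("v0"), V1("v1"), V2("v2"), V3("v3");
  Inst A("interleave2", {&V0, &V2});
  Inst B("interleave2", {&V1, &V3});
  Inst Root("interleave2", {&A, &B}); // interleave4 built from interleave2
  Inst Store("store", {&Root});

  std::vector<Inst *> DeadInsts;
  if (!lowerInterleaveToStore(Root, DeadInsts))
    return 0;

  int NumErased = Store.erase() ? 1 : 0;
  for (Inst *I : DeadInsts)
    NumErased += I->erase() ? 1 : 0;
  return NumErased;
}
```

In this model, erasing in the opposite order would fail at the first step, because the inner `interleave2` nodes still have the root as a user. A target with nothing extra to delete would simply leave the list empty.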



https://github.com/llvm/llvm-project/pull/89276

