[llvm] [IA]: Construct (de)interleave4 out of (de)interleave2 (PR #89276)

Paul Walker via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 5 10:40:11 PDT 2024


================
@@ -16493,39 +16529,71 @@ bool AArch64TargetLowering::lowerDeinterleaveIntrinsicToLoad(
         LdN = Builder.CreateCall(LdNFunc, {Pred, Address}, "ldN");
       else
         LdN = Builder.CreateCall(LdNFunc, Address, "ldN");
-
       Value *Idx =
           Builder.getInt64(I * LdTy->getElementCount().getKnownMinValue());
-      Left = Builder.CreateInsertVector(
-          VTy, Left, Builder.CreateExtractValue(LdN, 0), Idx);
-      Right = Builder.CreateInsertVector(
-          VTy, Right, Builder.CreateExtractValue(LdN, 1), Idx);
+      for (int J = 0; J < Factor; ++J) {
+        WideValues[J] = Builder.CreateInsertVector(
+            VTy, WideValues[J], Builder.CreateExtractValue(LdN, J), Idx);
+      }
+    }
+    // FIXME: the types should NOT be added manually.
+    if (2 == Factor)
+      Result = PoisonValue::get(StructType::get(VTy, VTy));
+    else
+      Result = PoisonValue::get(StructType::get(VTy, VTy, VTy, VTy));
+    // Construct the wide result out of the small results.
+    for (int J = 0; J < Factor; ++J) {
+      Result = Builder.CreateInsertValue(Result, WideValues[J], J);
     }
-
-    Result = PoisonValue::get(DI->getType());
-    Result = Builder.CreateInsertValue(Result, Left, 0);
-    Result = Builder.CreateInsertValue(Result, Right, 1);
   } else {
     if (UseScalable)
       Result = Builder.CreateCall(LdNFunc, {Pred, BaseAddr}, "ldN");
     else
       Result = Builder.CreateCall(LdNFunc, BaseAddr, "ldN");
   }
+  if (Factor > 2) {
+    for (unsigned I = 0; I < ValuesToDeinterleave.size(); I++) {
+      llvm::Value *CurrentExtract = ValuesToDeinterleave[I];
+      Value *NewExtract = Builder.CreateExtractValue(Result, I);
+      CurrentExtract->replaceAllUsesWith(NewExtract);
+      cast<Instruction>(CurrentExtract)->eraseFromParent();
+    }
 
+    for (auto &dead : DeadInsts)
+      dead->eraseFromParent();
+    return true;
+  }
   DI->replaceAllUsesWith(Result);
   return true;
 }
 
+bool GetInterleaveLeaves(Value *II, SmallVectorImpl<Value *> &InterleaveOps) {
+  Value *Op0, *Op1;
+  if (!match(II, m_Interleave2(m_Value(Op0), m_Value(Op1))))
+    return false;
+
+  if (!GetInterleaveLeaves(Op0, InterleaveOps)) {
+    InterleaveOps.push_back(Op0);
+  }
+
+  if (!GetInterleaveLeaves(Op1, InterleaveOps)) {
+    InterleaveOps.push_back(Op1);
+  }
+  return true;
----------------
paulwalker-arm wrote:

This looks too general to me given we require one of two specific patterns.  I see a couple of problems:

1. The returned operands don't look to be ordered correctly for how `st4` works.
`interleave2(interleave2(Value(A), Value(B)), interleave2(Value(C), Value(D)))` returns `InterleaveOps = {A, B, C, D}`, but from the previous conversation I believe `interleave4(A, B, C, D)` is the equivalent of `interleave2(interleave2(Value(A), Value(C)), interleave2(Value(B), Value(D)))`.

2. We'll miss places where `st2` or `st4` can be used purely because their operands are the result of a call to `interleave2`.  For example, `interleave2(interleave2(Value(D), Value(C)), interleave2(Value(B), Value(A)))` can still use `st2`; it's just that the two child `interleave2` calls will remain.
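
To make the ordering concern in point 1 concrete, here is a small standalone model of the element orders. It is only a sketch: `Vec`, `interleave2` and `interleave4` are hypothetical helpers over `std::vector<int>`, not LLVM APIs, and `interleave4` is assumed to produce the `a0, b0, c0, d0, a1, ...` order that `st4` writes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a vector value; not an LLVM type.
using Vec = std::vector<int>;

// Models interleave2: the result is a0, b0, a1, b1, ...
Vec interleave2(const Vec &A, const Vec &B) {
  assert(A.size() == B.size() && "operands must have equal length");
  Vec Out;
  Out.reserve(A.size() * 2);
  for (std::size_t I = 0; I < A.size(); ++I) {
    Out.push_back(A[I]);
    Out.push_back(B[I]);
  }
  return Out;
}

// Models the element order assumed for an st4-style store:
// a0, b0, c0, d0, a1, b1, c1, d1, ...
Vec interleave4(const Vec &A, const Vec &B, const Vec &C, const Vec &D) {
  assert(A.size() == B.size() && B.size() == C.size() &&
         C.size() == D.size() && "operands must have equal length");
  Vec Out;
  Out.reserve(A.size() * 4);
  for (std::size_t I = 0; I < A.size(); ++I) {
    Out.push_back(A[I]);
    Out.push_back(B[I]);
    Out.push_back(C[I]);
    Out.push_back(D[I]);
  }
  return Out;
}
```

Under this model, `interleave2(interleave2(A, C), interleave2(B, D))` matches `interleave4(A, B, C, D)`, whereas `interleave2(interleave2(A, B), interleave2(C, D))` matches `interleave4(A, C, B, D)`. So collecting the leaves in plain traversal order would swap the two middle operands.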

Perhaps we can just look for the specific `st4` pattern and, if that fails, look for the `st2` pattern (which is a given because we already know `II` is a call to `interleave2`).
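
Roughly what I have in mind, as a toy sketch rather than a patch. `Node` and `getStoreOperands` are hypothetical names on a hand-rolled expression tree, not the real `Value`/`PatternMatch` code, and it assumes any root whose two operands are both `interleave2` calls qualifies for `st4` (type and legality checks are ignored).

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy IR node: either a named leaf or an interleave2 call.
struct Node {
  std::string Name;          // leaf label; empty for interleave2 nodes
  const Node *Op0 = nullptr; // first interleave2 operand, if any
  const Node *Op1 = nullptr; // second interleave2 operand, if any

  explicit Node(std::string N, const Node *L = nullptr,
                const Node *R = nullptr)
      : Name(std::move(N)), Op0(L), Op1(R) {
    assert((L == nullptr) == (R == nullptr) &&
           "interleave2 needs exactly two operands");
  }

  bool isInterleave2() const { return Op0 && Op1; }
};

// Returns the store operands in stN order, or an empty vector when Root is
// not an interleave2 call at all.
std::vector<const Node *> getStoreOperands(const Node &Root) {
  if (!Root.isInterleave2())
    return {};

  // st4 pattern: interleave2(interleave2(A, C), interleave2(B, D))
  // yields the operands in the order A, B, C, D.
  if (Root.Op0->isInterleave2() && Root.Op1->isInterleave2())
    return {Root.Op0->Op0, Root.Op1->Op0, Root.Op0->Op1, Root.Op1->Op1};

  // st2 fallback: always available because Root is an interleave2 call.
  // Any child interleave2 call simply stays in place as an operand.
  return {Root.Op0, Root.Op1};
}
```

With this shape, a root where only one operand is itself an `interleave2` call still gets `st2` instead of being rejected, and the `st4` case returns its operands already in store order.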

https://github.com/llvm/llvm-project/pull/89276
