[llvm] [LV]: Teach LV to recursively (de)interleave. (PR #89018)

Fri Sep 13 10:21:17 PDT 2024

================
@@ -2126,10 +2127,39 @@ static Value *interleaveVectors(IRBuilderBase &Builder, ArrayRef<Value *> Vals,
   // Scalable vectors cannot use arbitrary shufflevectors (only splats), so
   // must use intrinsics to interleave.
   if (VecTy->isScalableTy()) {
-    VectorType *WideVecTy = VectorType::getDoubleElementsVectorType(VecTy);
-    return Builder.CreateIntrinsic(WideVecTy, Intrinsic::vector_interleave2,
-                                   Vals,
-                                   /*FMFSource=*/nullptr, Name);
+    unsigned InterleaveFactor = Vals.size();
+    SmallVector<Value *> InterleavingValues;
+    unsigned InterleavingValuesCount =
+        InterleaveFactor + (InterleaveFactor - 2);
+    InterleavingValues.resize(InterleaveFactor);
+    // Place the values to be interleaved in the correct order for the
+    // interleaving
+    for (unsigned I = 0, J = InterleaveFactor / 2, K = 0; K < InterleaveFactor;
+         K++) {
+      if (K % 2 == 0) {
+        InterleavingValues[K] = Vals[I];
+        I++;
+      } else {
+        InterleavingValues[K] = Vals[J];
+        J++;
+      }
+    }
----------------
paulwalker-arm wrote:

Would the following simplification work?
```
for (unsigned I = 0; I < InterleaveFactor / 2; ++I) {
  InterleavingValues[2 * I] = Vals[I];
  InterleavingValues[2 * I + 1] = Vals[I + InterleaveFactor / 2];
}
```
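
As a quick check of the suggestion, here is a minimal standalone sketch comparing the ordering produced by the loop in the patch with the ordering produced by the simplified loop. It works on plain indices instead of `Value *`, reads the indexed array as the patch's `Vals`, and assumes an even factor. The function names `patchOrder` and `simplifiedOrder` are hypothetical and not part of the PR.

```cpp
#include <cassert>
#include <vector>

// Ordering produced by the loop in the patch: even slots take successive
// indices from the lower half (I), odd slots from the upper half (J).
std::vector<unsigned> patchOrder(unsigned Factor) {
  std::vector<unsigned> Order(Factor);
  for (unsigned I = 0, J = Factor / 2, K = 0; K < Factor; K++)
    Order[K] = (K % 2 == 0) ? I++ : J++;
  return Order;
}

// Ordering produced by the suggested simplification.
// Assumption: Factor is even, otherwise the last slot is never written.
std::vector<unsigned> simplifiedOrder(unsigned Factor) {
  assert(Factor % 2 == 0 && "sketch assumes an even factor");
  std::vector<unsigned> Order(Factor);
  for (unsigned I = 0; I < Factor / 2; ++I) {
    Order[2 * I] = I;
    Order[2 * I + 1] = I + Factor / 2;
  }
  return Order;
}
```

For even factors the two functions return the same sequence. For a factor of 8 both give the input order 0, 4, 1, 5, 2, 6, 3, 7.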

Simplification aside, does this two-stage algorithm work?  Or rather, I'm pretty sure it doesn't, but I'm unsure whether there are intentional restrictions that mean it is only supposed to work for specific factors.

I could be wrong, but I think the algorithm works for InterleaveFactor==2 and InterleaveFactor==4 but fails for InterleaveFactor==8.  This would be kind of OK given the original code only worked for InterleaveFactor==2, but the other changes in this PR (and the new code's complexity) imply you expect the algorithm to support all powers of two?
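
The observation about factors 2, 4 and 8 can be explored with a small standalone model. This is only a sketch: the second stage is not visible in the quoted hunk, so the code below ASSUMES it repeatedly interleaves adjacent pairs of the reordered values until one vector remains. Vectors are modeled as `std::vector<int>`, and every name here (`interleave2`, `twoStageInterleave`, `matchesReference`, and so on) is hypothetical rather than taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Models a two-way interleave on fixed-size data: a0, b0, a1, b1, ...
Vec interleave2(const Vec &A, const Vec &B) {
  assert(A.size() == B.size() && "inputs must have the same length");
  Vec R;
  for (std::size_t I = 0; I < A.size(); ++I) {
    R.push_back(A[I]);
    R.push_back(B[I]);
  }
  return R;
}

// Reference result: for each lane, one element from every input in order.
Vec referenceInterleave(const std::vector<Vec> &Vals) {
  Vec R;
  for (std::size_t Lane = 0; Lane < Vals[0].size(); ++Lane)
    for (const Vec &V : Vals)
      R.push_back(V[Lane]);
  return R;
}

// Stage 1 follows the reordering in the patch. Stage 2 is an ASSUMPTION:
// interleave adjacent pairs, level by level, until one vector is left.
Vec twoStageInterleave(const std::vector<Vec> &Vals) {
  std::size_t Factor = Vals.size();
  assert(Factor >= 2 && (Factor & (Factor - 1)) == 0 &&
         "sketch assumes a power-of-two factor");
  std::vector<Vec> Work(Factor);
  for (std::size_t I = 0; I < Factor / 2; ++I) {
    Work[2 * I] = Vals[I];
    Work[2 * I + 1] = Vals[I + Factor / 2];
  }
  while (Work.size() > 1) {
    std::vector<Vec> Next;
    for (std::size_t I = 0; I < Work.size(); I += 2)
      Next.push_back(interleave2(Work[I], Work[I + 1]));
    Work.swap(Next);
  }
  return Work[0];
}

// Builds Factor inputs of NumLanes lanes; input V, lane L holds V * 100 + L.
std::vector<Vec> makeInputs(std::size_t Factor, std::size_t NumLanes) {
  std::vector<Vec> Vals(Factor, Vec(NumLanes));
  for (std::size_t V = 0; V < Factor; ++V)
    for (std::size_t L = 0; L < NumLanes; ++L)
      Vals[V][L] = static_cast<int>(V * 100 + L);
  return Vals;
}

// True if the modeled two-stage scheme agrees with the reference.
bool matchesReference(std::size_t Factor) {
  std::vector<Vec> Vals = makeInputs(Factor, 2);
  return twoStageInterleave(Vals) == referenceInterleave(Vals);
}
```

Under this assumed second stage, `matchesReference(2)` and `matchesReference(4)` hold while `matchesReference(8)` does not: the first lanes come out in the order 0, 2, 1, 3, 4, 6, 5, 7. In this model, adjacent pairwise interleaving yields the expected result for a factor of 8 only when the inputs start in bit-reversed order (0, 4, 2, 6, 1, 5, 3, 7), which a single half-split reordering does not produce.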

It would be good to know your intent here because then I can either suggest simplifying the code or help fix the algorithm if my observation is valid.


https://github.com/llvm/llvm-project/pull/89018