[llvm] [RISCV] Handle fixed length vectors with exact VLEN in lowering EXTRACT_SUBVECTOR (PR #79949)

Wed Feb 7 20:17:09 PST 2024

================
@@ -9653,19 +9656,46 @@ SDValue RISCVTargetLowering::lowerEXTRACT_SUBVECTOR(SDValue Op,
     return DAG.getBitcast(Op.getValueType(), Slidedown);
   }
 
+  if (VecVT.isFixedLengthVector()) {
+    VecVT = getContainerForFixedLengthVector(VecVT);
+    Vec = convertToScalableVector(VecVT, Vec, DAG, Subtarget);
+  }
+
+  MVT ContainerSubVecVT = SubVecVT;
   unsigned SubRegIdx, RemIdx;
-  std::tie(SubRegIdx, RemIdx) =
-      RISCVTargetLowering::decomposeSubvectorInsertExtractToSubRegs(
-          VecVT, SubVecVT, OrigIdx, TRI);
+
+  // extract_subvector scales the index by vscale if the subvector is scalable,
+  // and decomposeSubvectorInsertExtractToSubRegs takes this into account. So if
+  // we have a fixed length subvector, we need to adjust the index by 1/vscale.
+  if (SubVecVT.isFixedLengthVector()) {
+    assert(MinVLen == MaxVLen);
+    ContainerSubVecVT = getContainerForFixedLengthVector(SubVecVT);
+    unsigned Vscale = MinVLen / RISCV::RVVBitsPerBlock;
+    std::tie(SubRegIdx, RemIdx) =
+        RISCVTargetLowering::decomposeSubvectorInsertExtractToSubRegs(
+            VecVT, ContainerSubVecVT, OrigIdx / Vscale, TRI);
+    RemIdx = (RemIdx * Vscale) + (OrigIdx % Vscale);
+  } else {
+    std::tie(SubRegIdx, RemIdx) =
+        RISCVTargetLowering::decomposeSubvectorInsertExtractToSubRegs(
+            VecVT, ContainerSubVecVT, OrigIdx, TRI);
+  }
 
   // If the Idx has been completely eliminated then this is a subvector extract
   // which naturally aligns to a vector register. These can easily be handled
   // using subregister manipulation.
-  if (RemIdx == 0)
+  if (RemIdx == 0) {
+    if (SubVecVT.isFixedLengthVector()) {
+      Vec = DAG.getTargetExtractSubreg(SubRegIdx, DL, ContainerSubVecVT, Vec);
+      return convertFromScalableVector(SubVecVT, Vec, DAG, Subtarget);
+    }
     return Op;
+  }
 
-  // Else SubVecVT is a fractional LMUL and may need to be slid down.
-  assert(RISCVVType::decodeVLMUL(getLMUL(SubVecVT)).second);
+  // Else SubVecVT is a fractional LMUL and may need to be slid down: if
+  // SubVecVT was > M1 then the index would need to be a multiple of VLMAX, and
+  // so would divide exactly.
+  assert(RISCVVType::decodeVLMUL(getLMUL(ContainerSubVecVT)).second);
 
   // If the vector type is an LMUL-group type, extract a subvector equal to the
   // nearest full vector register type.
----------------
lukel97 wrote:

OrigIdx isn't a valid index, since it needs to be a multiple of SubVecVT's known minimum vector length:

```
  /// EXTRACT_SUBVECTOR(VECTOR, IDX) - Returns a subvector from VECTOR.
  /// Let the result type be T, then IDX represents the starting element number
  /// from which a subvector of type T is extracted. IDX must be a constant
  /// multiple of T's known minimum vector length. If T is a scalable vector,
  /// IDX is first scaled by the runtime scaling factor of T. Elements IDX
  /// through (IDX + num_elements(T) - 1) must be valid VECTOR indices. If this
  /// condition cannot be determined statically but is false at runtime, then
  /// the result vector is undefined. The IDX parameter must be a vector index
  /// constant type, which for most targets will be an integer pointer type.
```
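
As a rough illustration of the rule quoted above, the index check for a fixed-length result type can be sketched in plain C++. This is only a model: `isValidFixedExtractIndex` and its parameters are hypothetical names, not LLVM API, and scalable types (where IDX is scaled at runtime) are not covered.

```cpp
#include <cassert>

// Hypothetical helper (not part of LLVM): models the EXTRACT_SUBVECTOR index
// rule for fixed-length types only.
inline bool isValidFixedExtractIndex(unsigned Idx, unsigned SubNumElts,
                                     unsigned VecNumElts) {
  assert(SubNumElts > 0 && "result type must have at least one element");
  // IDX must be a multiple of the result type's element count ...
  if (Idx % SubNumElts != 0)
    return false;
  // ... and elements IDX through IDX + SubNumElts - 1 must exist in VECTOR.
  return Idx + SubNumElts <= VecNumElts;
}
```

Under this model, extracting <1 x i64> from <2 x i64> at index 1 is accepted, while extracting <2 x i64> from <4 x i64> at index 1 is rejected because 1 is not a multiple of 2.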

So everything past line 9659 is handling an extract on an LMUL boundary.

The only time we need to do a slide after this point is if SubVecVT is a fractional LMUL with a non-zero index, e.g.

VLEN=128, extract <1 x i64> (LMUL=MF2) from <2 x i64> (LMUL=1) at Index = 1.

So after the assert above, RemIdx will be != 0 and ContainerSubVecVT will be < M1, which is why we can shrink the slidedown to M1 by capping InterSubVT to M1 too.
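
The index arithmetic for that example can be sketched in isolation. The snippet below is a simplified model, not the LLVM code: `splitFixedIndex` and `SplitIdx` are hypothetical names, the constant 64 stands in for RISCV::RVVBitsPerBlock, and the call to decomposeSubvectorInsertExtractToSubRegs is left out, so it shows only the quotient/remainder split of OrigIdx from the patch.

```cpp
#include <cassert>

// Stand-in for RISCV::RVVBitsPerBlock (assumed to be 64 here).
constexpr unsigned RVVBitsPerBlock = 64;

struct SplitIdx {
  unsigned Scaled;   // OrigIdx / Vscale: the index in vscale-relative units
  unsigned Leftover; // OrigIdx % Vscale: elements left over within one step
};

// Hypothetical helper: mirrors the OrigIdx / Vscale and OrigIdx % Vscale
// split the patch performs when VLEN is known exactly.
inline SplitIdx splitFixedIndex(unsigned OrigIdx, unsigned ExactVLen) {
  unsigned Vscale = ExactVLen / RVVBitsPerBlock;
  assert(Vscale > 0 && "VLEN must be at least RVVBitsPerBlock");
  return {OrigIdx / Vscale, OrigIdx % Vscale};
}
```

For VLEN=128 and OrigIdx=1, Vscale is 2, so `splitFixedIndex(1, 128)` yields Scaled = 0 and Leftover = 1. Assuming the decomposition returns a zero remainder for a scaled index of 0, the patch's `RemIdx * Vscale + OrigIdx % Vscale` comes out as 1, which is the non-zero RemIdx that requires the slidedown.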

https://github.com/llvm/llvm-project/pull/79949