[PATCH] D141924: [IR] Add new intrinsics interleave and deinterleave vectors

Thu Feb 9 08:55:47 PST 2023

paulwalker-arm added a comment.

Hi @CarolineConcatto, it's fair to say most of my comments here are likely petty and relate more to the documentation side of things where I'd rather be more explicit in describing the operations being added.  That said, they're all just suggestions for you to decide what to do with.   That just leaves the way `OutVT` is calculated and the use of `getVTList()` as changes I more strongly encourage.  Otherwise I think the patch looks great.

================
Comment at: llvm/docs/LangRef.rst:17689
+
+The '``llvm.experimental.vector.deinterleave2``' intrinsic construct two
+vectors by deinterleaving the even and odd lanes of the input vector.
----------------
"constructs"

I've offered an alternate description of the ISD nodes that might be worth adapting for the intrinsic text if you think there's value.

================
Comment at: llvm/docs/LangRef.rst:17692-17693
+
+This intrinsic work for both fixed and scalable vectors. While this intrinsic
+is marked as experimental, the recommended way to express this operation for
+fixed-width vectors is still to use a shufflevector, as that may allow for more
----------------
craig.topper wrote:
> work -> works
Perhaps just "While this intrinsic supports all vector types the recommended...."?

================
Comment at: llvm/docs/LangRef.rst:17706
+
+The argument to this intrinsic must be a vector and that is twice the size of
+one output vector.
----------------
"The argument is a vector whose type corresponds to the logical concatenation of the two result types."?

================
Comment at: llvm/docs/LangRef.rst:17727
+
+This intrinsic work for both fixed and scalable vectors. While this intrinsic
+is marked as experimental, the recommended way to express this operation for
----------------
craig.topper wrote:
> work -> works
As above.

================
Comment at: llvm/docs/LangRef.rst:17740-17741
+""""""""""
+The argument to this intrinsic must be two vectors where each vector must be
+half the size of the output.
+
----------------
"Both arguments must be vectors of the same type whereby their logical concatenation matches the result type."?

================
Comment at: llvm/include/llvm/CodeGen/ISDOpcodes.h:574-577
+  /// VECTOR_DEINTERLEAVE(VEC1, VEC2) - Returns two deinterleaved vectors;
+  /// the even indexes from VEC1 and VEC2 construct the first result vector.
+  /// The odd indexes from VEC1 and VEC2 contruct the second result vector.
+  /// The type of the two result vectors must match the type of VEC1 and VEC2.
----------------
It's worth being explicit as to what "from VEC1 and VEC2" means.  Perhaps:
```
Returns two vectors with all input and output vectors having the same type. The first output contains the even indices from CONCAT_VECTORS(VEC1, VEC2), with the second output containing the odd indices. The relative order of elements within an output match that of the concatenated input.
```

================
Comment at: llvm/include/llvm/CodeGen/ISDOpcodes.h:580-584
+  /// VECTOR_INTERLEAVE(VEC1, VEC2) - Returns two interleaved vectors;
+  /// the first result vector is constructed from interleaving the low halves
+  /// of VEC1 and VEC2. The second result vector is constructed from
+  /// interleaving the high halves of VEC1 and VEC2.
+  /// The type of the two result vectors must match the type of VEC1 and VEC2.
----------------
Similar to the above, perhaps:
```
Returns two vectors with all input and output vectors having the same type. The first output contains the result of interleaving the low half of CONCAT_VECTORS(VEC1, VEC2), with the second output containing the result of interleaving the high half.
```

================
Comment at: llvm/include/llvm/IR/Intrinsics.td:2120-2128
+def int_experimental_vector_interleave2   : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
+                                                                  [LLVMHalfElementsVectorType<0>,
+                                                                   LLVMHalfElementsVectorType<0>],
+                                                                  [IntrNoMem]>;
+
+def int_experimental_vector_deinterleave2 : DefaultAttrsIntrinsic<[LLVMHalfElementsVectorType<0>,
+                                                                   LLVMHalfElementsVectorType<0>],
----------------
CarolineConcatto wrote:
> paulwalker-arm wrote:
> > These intrinsic definitions don't look to match the langref description or tests.
> > 
> > The text says:
> > ```
> > declare <4 x double> @llvm.experimental.vector.interleave2.v2f64(<2 x double> %vec1, <2 x double> %vec2)
> > ```
> > making `v2f64` the overloaded type from which the others (i.e. the `<4 x double>` return type) are derived.  However, `int_experimental_vector_interleave2` is defined such that the return type is the overloaded type.  You can see this if you run your tests through `opt`. For example, interleave2_nxv2f16 becomes:
> > ```
> > define <vscale x 4 x half> @interleave2_nxv2f16(<vscale x 2 x half> %vec0, <vscale x 2 x half> %vec1) #0 {
> >   %retval = call <vscale x 4 x half> @llvm.experimental.vector.interleave2.nxv4f16(<vscale x 2 x half> %vec0, <vscale x 2 x half> %vec1)
> >   ret <vscale x 4 x half> %retval
> > }
> > ```
> > For what it's worth I think the text is correct because overloading the smaller type will look more consistent when adding other shapes. Which is to say I have a slight preference for:
> > ```
> > <vscale x 4 x half> @llvm.experimental.vector.interleave2.nxv2f16(<vscale x 2 x half>...
> > <vscale x 6 x half> @llvm.experimental.vector.interleave3.nxv2f16(<vscale x 2 x half>...
> > ```
> > That will mean adding `LLVMDoubleElementsVectorType` though, unless there's already a class that'll do what we need.
>  I believe we want to keep like this and not create a LLVMDoubleElementsVectorType. AFAIU we should use what we already have available in LLVM. More over if we want more strides in the future we may not even have to use LLVMHalfElementsVectorType.
> But again, if you have a good argument about the use of LLVMDoubleElementsVectorType  I can change. ATM I only update the test and the examples to use the biggest size type as overloaded.
You'll need to extend the overloaded type support anyway when adding support for other strides so I don't really by the argument.  That said, the type to overload on is subjective so if you prefer to use the bigger type then sure I can live with that.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:11563-11564
+  SmallVector<EVT, 4> OutVTs;
+  ComputeValueVTs(DAG.getTargetLoweringInfo(), DAG.getDataLayout(), I.getType(),
+                  OutVTs);
+
----------------
Given how the intrinsic is defined, what do you think of using `EVT OutVT = InVec.getValueType().getHalfNumVectorElementsVT();` rather than the more expensive looking call to ComputeValueVTs.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:11574
+
+  // Use VECTOR_SHUFFLE for the fixed-length vector to to benefit from
+  // existing combines for recognizing specific deinterleave patterns using
----------------
"Use VECTOR_SHUFFLE for fixed-length vectors to benefit from existing legalisation and combines."

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:11577
+  // VECTOR_SHUFFLE.
+  if (!OutVTs[0].isScalableVector()) {
+    SDValue Even = DAG.getVectorShuffle(OutVTs[0], DL, Lo, Hi,
----------------
It's better to be specific, so `OutVTs[0].isFixedLengthVector()`.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:11596
+  EVT InVT = getValue(I.getOperand(0)).getValueType();
+  SDValue InVec[] = {getValue(I.getOperand(0)), getValue(I.getOperand(1))};
+  const TargetLowering &TLI = DAG.getTargetLoweringInfo();
----------------
Not that bothered but does the array gives us anything useful over the more typical:
```
SDValue InVec1 = getValue(I.getOperand(0));
SDValue InVec2 = getValue(I.getOperand(1));
```

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:11600-11602
+  // Use VECTOR_SHUFFLE for the fixed-length vector to to benefit from
+  // existing combines for recognizing specific interleave patterns using
+  // VECTOR_SHUFFLE.
----------------
Similar typos as mentioned in visitVectorDeinterleave.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:23575
+  assert(OpVT.isScalableVector() &&
+         "Unexpected fixed length vector in LowerVECTOR_DEINTERLEAVE.");
+  SDValue Even = DAG.getNode(AArch64ISD::UZP1, DL, OpVT, Op.getOperand(0),
----------------
I think `Expected scalable vector` better reflects the assert.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:23580
+                            Op.getOperand(1));
+  return DAG.getNode(ISD::MERGE_VALUES, DL, DAG.getVTList(OpVT, OpVT), Even,
+                     Odd);
----------------
I think `Op->getVTList()` should work here?

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:23589
+  assert(OpVT.isScalableVector() &&
+         "Unexpected fixed length vector in LowerVECTOR_INTERLEAVE.");
+
----------------
Same comment as with LowerVECTOR_DEINTERLEAVE.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:23595
+                           Op.getOperand(1));
+  return DAG.getNode(ISD::MERGE_VALUES, DL, DAG.getVTList(OpVT, OpVT), Lo, Hi);
+}
----------------
Same comment as with LowerVECTOR_DEINTERLEAVE.

================
Comment at: llvm/test/CodeGen/AArch64/fixed-vector-deinterleave.ll:123-125
+define {<16 x i1>, <16 x i1>} @vector_deinterleave_v16i1_v32i1(<32 x i1> %vec) {
+; CHECK-LABEL: vector_deinterleave_v16i1_v32i1:
+; CHECK:       // %bb.0:
----------------
I don't think this test offers any value.  It's really showing how `VECTOR_SHUFFLE` is legalised, which this patch doesn't care about.

================
Comment at: llvm/test/CodeGen/AArch64/fixed-vector-interleave.ll:121
+
+define <32 x i1> @interleave2_v32i1(<16 x i1> %vec0, <16 x i1> %vec1) {
+; CHECK-LABEL: interleave2_v32i1:
----------------
As with the deinterleave case I don't think this is testing anything the patch really cares about and my worry it'll just cause unnecessary pain for unrelated patches.  If we plan to significantly improve the code generation then fine but if not then perhaps they're best removed?

================
Comment at: llvm/test/CodeGen/AArch64/sve-vector-deinterleave.ll:12
+; CHECK-NEXT:    ret
+%retval = call {<vscale x 2 x half>, <vscale x 2 x half>} @llvm.experimental.vector.deinterleave2.nxv4f16(<vscale x 4 x half> %vec)
+ret {<vscale x 2 x half>, <vscale x 2 x half>} %retval
----------------
Please can you indent the IR here and for the other functions in this file.style

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141924/new/

https://reviews.llvm.org/D141924