[PATCH] D152205: WIP: [Aarch64][SVE]SVE2] Enable tbl, tbl2 for shuffle lowring for fixed vector types.

Tue Jun 6 00:51:29 PDT 2023

sdesmalen added a comment.

Hi @dtemirbulatov, I only had a glance so haven't reviewed this properly yet, but already left you a few comments and questions.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:25149

-  return SDValue();
+  if (!MaxSVESize || MinSVESize != MaxSVESize)
+    return SDValue();
----------------
This could do with a code comment describing why we don't do this when we don't know the exact SVE size.

I think there are several cases to distinguish here:
* Elements are selected from either the first source vector, or from the second, but not both. This can be implemented with SVE and when the SVE size isn't known.
* Elements are selected from both source vectors.
    * If we only have SVE, then we can mimic this with multiple (single-source) TBL instructions and selects, but I suspect there's little value in handling this, given that SVE2 is the prevalent feature going forward.
    * If we don't know the size, then we need to do more work to update the indices for the TBL. For example:
```<2 x i64> %x, <2 x i64> %y, <4 x i32> <0, 1, 2, 3>

if SVE VL = 256, then we'd basically insert the <2 x i64> vector value into a conceptual <4 x i64> register where only the first two lanes are relevant, so the offsets must be: <0, 1, 4, 5>
if SVE VL = 512, then we'd basically insert the <2 x i64> vector value into a conceptual <8 x i64> register where only the first two lanes are relevant, so the offsets must be: <0, 1, 8, 9>

So we'd somehow need to do something like this:

<0, 1, 2, 3> + <0, 0, VL, VL>```

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:25150
+  if (!MaxSVESize || MinSVESize != MaxSVESize)
+    return SDValue();
+
----------------
nit: tab?

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:25152
+
+  bool Swap = false;
+  if (Op1.isUndef() || isZerosVector(Op1.getNode())) {
----------------
It's not really clear to me what `Swap` and `IsUndefOrZero` are supposed to do. I would have expected it to look at the indices rather than the source vector operands. Can you add a comment explaining this logic?

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:25195-25206
+  } else {
+    if (Subtarget->hasSVE2())
+      Shuffle = convertFromScalableVector(
+          DAG, VT,
+          DAG.getNode(
+              ISD::INTRINSIC_WO_CHAIN, DL, ContainerVT,
+              DAG.getConstant(Intrinsic::aarch64_sve_tbl2, DL, MVT::i32), Op1,
----------------
nit: what about this?

================
Comment at: llvm/test/CodeGen/AArch64/sve-fixed-length-permute-rev.ll:411
 ; CHECK-NEXT:    st1 { v1.h }[5], [x10]
-; CHECK-NEXT:    mov w10, #26
+; CHECK-NEXT:    mov w10, #26 // =0x1a
 ; CHECK-NEXT:    orr x10, x8, x10
----------------
Seems like these changes can be committed separately.

================
Comment at: llvm/test/CodeGen/AArch64/sve-fixed-length-vector-shuffle.ll:945
+; CHECK-NEXT:    adrp x8, .LCPI42_0
+; CHECK-NEXT:    add x8, x8, :lo12:.LCPI42_0
+; CHECK-NEXT:    ptrue p0.d
----------------
This needs some check lines for the constant.

================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-vector-shuffle.ll:392
+; CHECK-NEXT:	.hword	255
+define <4 x i16> @shuffle_v4i16_tbl2(ptr %a, ptr %b) #1 {
+; CHECK-LABEL: shuffle_v4i16_tbl2:
----------------
Can you make it somehow clear form the name of the function, or by inlining the attribute here (e.g. `@shuffle_v4i16_tbl2(ptr %a, ptr %b) "target-features"="+sve2" {`) that this relates to SVE2?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152205/new/

https://reviews.llvm.org/D152205