[PATCH] D152205: WIP: [Aarch64][SVE]SVE2] Enable tbl, tbl2 for shuffle lowring for fixed vector types.
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 6 00:51:29 PDT 2023
sdesmalen added a comment.
Hi @dtemirbulatov, I only had a glance so haven't reviewed this properly yet, but already left you a few comments and questions.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:25149
- return SDValue();
+ if (!MaxSVESize || MinSVESize != MaxSVESize)
+ return SDValue();
----------------
This could do with a code comment describing why we don't do this when we don't know the exact SVE size.
I think there are several cases to distinguish here:
* Elements are selected from either the first source vector, or from the second, but not both. This can be implemented with SVE and when the SVE size isn't known.
* Elements are selected from both source vectors.
* If we only have SVE, then we can mimic this with multiple (single-source) TBL instructions and selects, but I suspect there's little value in handling this, given that SVE2 is the prevalent feature going forward.
* If we don't know the size, then we need to do more work to update the indices for the TBL. For example:
```<2 x i64> %x, <2 x i64> %y, <4 x i32> <0, 1, 2, 3>
if SVE VL = 256, then we'd basically insert the <2 x i64> vector value into a conceptual <4 x i64> register where only the first two lanes are relevant, so the offsets must be: <0, 1, 4, 5>
if SVE VL = 512, then we'd basically insert the <2 x i64> vector value into a conceptual <8 x i64> register where only the first two lanes are relevant, so the offsets must be: <0, 1, 8, 9>
So we'd somehow need to do something like this:
<0, 1, 2, 3> + <0, 0, VL, VL>```
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:25150
+ if (!MaxSVESize || MinSVESize != MaxSVESize)
+ return SDValue();
+
----------------
nit: tab?
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:25152
+
+ bool Swap = false;
+ if (Op1.isUndef() || isZerosVector(Op1.getNode())) {
----------------
It's not really clear to me what `Swap` and `IsUndefOrZero` are supposed to do. I would have expected it to look at the indices rather than the source vector operands. Can you add a comment explaining this logic?
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:25195-25206
+ } else {
+ if (Subtarget->hasSVE2())
+ Shuffle = convertFromScalableVector(
+ DAG, VT,
+ DAG.getNode(
+ ISD::INTRINSIC_WO_CHAIN, DL, ContainerVT,
+ DAG.getConstant(Intrinsic::aarch64_sve_tbl2, DL, MVT::i32), Op1,
----------------
nit: what about this?
================
Comment at: llvm/test/CodeGen/AArch64/sve-fixed-length-permute-rev.ll:411
; CHECK-NEXT: st1 { v1.h }[5], [x10]
-; CHECK-NEXT: mov w10, #26
+; CHECK-NEXT: mov w10, #26 // =0x1a
; CHECK-NEXT: orr x10, x8, x10
----------------
Seems like these changes can be committed separately.
================
Comment at: llvm/test/CodeGen/AArch64/sve-fixed-length-vector-shuffle.ll:945
+; CHECK-NEXT: adrp x8, .LCPI42_0
+; CHECK-NEXT: add x8, x8, :lo12:.LCPI42_0
+; CHECK-NEXT: ptrue p0.d
----------------
This needs some check lines for the constant.
================
Comment at: llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-vector-shuffle.ll:392
+; CHECK-NEXT: .hword 255
+define <4 x i16> @shuffle_v4i16_tbl2(ptr %a, ptr %b) #1 {
+; CHECK-LABEL: shuffle_v4i16_tbl2:
----------------
Can you make it somehow clear form the name of the function, or by inlining the attribute here (e.g. `@shuffle_v4i16_tbl2(ptr %a, ptr %b) "target-features"="+sve2" {`) that this relates to SVE2?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D152205/new/
https://reviews.llvm.org/D152205
More information about the llvm-commits
mailing list