[PATCH] D114960: [AArch64][SVE] Lower shuffles to permute instructions: rev/revb/revh/revw

Thu Dec 9 08:38:56 PST 2021

wwei added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:19332
+  if (MinSVESize == MaxSVESize && MaxSVESize == VT.getSizeInBits() &&
+      ShuffleVectorInst::isReverseMask(ShuffleMask)) {
+    Op = DAG.getNode(ISD::VECTOR_REVERSE, DL, ContainerVT, Op1, Op2);
----------------
paulwalker-arm wrote:
> Is it valid to use `class Instruction` methods this far into code generation?  That said, `ShuffleVectorInst::isReverseMask` looks potentially unsafe in this context anyway because given a 4 element mask it'll return true for both `3, 2, 1, 0` and `7, 6, 5, 4` but as mentioned below `ISD::VECTOR_REVERSE` only supports a single operand and so you need to know specifically which of these scenarios apply.
> 
> Do you care about the `7, 6, 5, 4` case? I can see `test_revv8i32v8i32` is a test for this but I think that is being simplified to a `3, 2, 1, 0` based shuffle before you get here.
> 
> I'll note that `ARMISelLowering.cpp` has the `isReverseMask` helper function that could be useful.
Maybe we use `ShuffleVectorInst::isReverseMask` here will be better, because it will call `isSingleSourceMask` to ensure the shuffle mask chooses elements from exactly one source vector(`isReverseMask` in `ARMISelLowering.cpp` will not check this). 
I tried various test case forms, but I couldn’t construct this kind(`7, 6, 5, 4`) of case. It would always be converted to `3, 2, 1, 0` form. To be on the safe side, I still added a check-- `Op2.isUndef()`, and `Op1` will be the only valid operand for `VECTOR_REVERSE`

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:19333
+      ShuffleVectorInst::isReverseMask(ShuffleMask)) {
+    Op = DAG.getNode(ISD::VECTOR_REVERSE, DL, ContainerVT, Op1, Op2);
+    return convertFromScalableVector(DAG, VT, Op);
----------------
paulwalker-arm wrote:
> This looks wrong because ISD::VECTOR_REVERSE only takes a single operand? I presume `Op2` it just been ignored and there's nothing in getNode to ensure only a single operand.
yeah, you're right. I removed `Op2` since it will always be `Undef`

================
Comment at: llvm/test/CodeGen/AArch64/sve-fixed-length-permute-rev.ll:202-203
+
+define void @test_revv8i32(<8 x i32>* %a) #0 {
+; CHECK-LABEL: test_revv8i32:
+; CHECK:       // %bb.0:
----------------
paulwalker-arm wrote:
> I originally wrote a comment asking why the i32 case emitted worse code than the f32 one but then spotted the attribute difference, so this seems more like a negative test.  It's worth adding a function comment to make this clear to the user, which also goes for the other "negative" tests.  Perhaps it's also worth adding `_vl256` to functions where `vscale_range` is set just to drive the point home?
I rearranged the order of the test cases, sorted according to attributes, and added some comments for the `rev` instructions. 

================
Comment at: llvm/test/CodeGen/AArch64/sve-fixed-length-permute-rev.ll:270
+  %tmp3 = shufflevector <8 x float> %tmp1, <8 x float> undef, <8 x i32> <i32 1, i32 0, i32 3, i32 2, i32 5, i32 4, i32 7, i32 6>
+  %tmp4 = fadd <8 x float> %tmp2, %tmp3
+  store <8 x float> %tmp4, <8 x float>* %a
----------------
paulwalker-arm wrote:
> Is there a need for `fadd` here? I'm kind of assuming that without it the float shuffles are being converted to integer ones and thus you lost test coverage? but then I can see `test_revv8f32` doesn't need it so figured I'd ask.
`fadd` is not needed, I removed it

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114960/new/

https://reviews.llvm.org/D114960