[PATCH] D158195: [RISCV] Combine (vrot{l,r} vxi16, 8) -> vrev8

Fri Aug 18 06:24:53 PDT 2023

luke added a comment.

In D158195#4596052 <https://reviews.llvm.org/D158195#4596052>, @craig.topper wrote:

> If the rotate came in as a fshl/fshr intrinsic or as shl+shr+or would we already get vrev8 for fixed vectors? Is it only the shuffle case that is being optimized?

Yeah, we already get vrev8 for these. DAGCombiner canonicalises them to bswap before they would be legalised to VL nodes:

  define <4 x i16> @rot_via_fshr(<4 x i16> %a) {
    %res = call <4 x i16> @llvm.fshr.v4i16(<4 x i16> %a, <4 x i16> %a, <4 x i16> <i16 8, i16 8, i16 8, i16 8>)
    ret <4 x i16> %res
  }

  declare <4 x i16> @llvm.fshr.v4i16(<4 x i16> %a, <4 x i16> %b, <4 x i16> %c)

  define <4 x i16> @rot_via_shift(<4 x i16> %a, <4 x i16> %amt) {
    %1 = shl <4 x i16> %a, <i16 8, i16 8, i16 8, i16 8>
    %2 = lshr <4 x i16> %a, <i16 8, i16 8, i16 8, i16 8>
    %3 = or <4 x i16> %1, %2
    ret <4 x i16> %3
  }

  === rot_via_fshr
  Initial selection DAG: %bb.0 'rot_via_fshr:'
  SelectionDAG has 13 nodes:
    t0: ch,glue = EntryToken
            t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
          t4: v4i16 = extract_subvector t2, Constant:i64<0>
          t6: v4i16 = BUILD_VECTOR Constant:i16<8>, Constant:i16<8>, Constant:i16<8>, Constant:i16<8>
        t7: v4i16 = rotr t4, t6
      t9: nxv2i16 = insert_subvector undef:nxv2i16, t7, Constant:i64<0>
    t11: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t9
    t12: ch = RISCVISD::RET_GLUE t11, Register:nxv2i16 $v8, t11:1

  Optimized lowered selection DAG: %bb.0 'rot_via_fshr:'
  SelectionDAG has 11 nodes:
    t0: ch,glue = EntryToken
            t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
          t4: v4i16 = extract_subvector t2, Constant:i64<0>
        t13: v4i16 = bswap t4
      t9: nxv2i16 = insert_subvector undef:nxv2i16, t13, Constant:i64<0>
    t11: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t9
    t12: ch = RISCVISD::RET_GLUE t11, Register:nxv2i16 $v8, t11:1

  === rot_via_shift
  Initial selection DAG: %bb.0 'rot_via_shift:'
  SelectionDAG has 18 nodes:
    t0: ch,glue = EntryToken
      t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
    t4: v4i16 = extract_subvector t2, Constant:i64<0>
      t6: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %1
    t7: v4i16 = extract_subvector t6, Constant:i64<0>
    t9: v4i16 = BUILD_VECTOR Constant:i16<8>, Constant:i16<8>, Constant:i16<8>, Constant:i16<8>
          t10: v4i16 = shl t4, t9
          t11: v4i16 = srl t4, t9
        t12: v4i16 = or t10, t11
      t14: nxv2i16 = insert_subvector undef:nxv2i16, t12, Constant:i64<0>
    t16: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t14
    t17: ch = RISCVISD::RET_GLUE t16, Register:nxv2i16 $v8, t16:1

  Optimized lowered selection DAG: %bb.0 'rot_via_shift:'
  SelectionDAG has 11 nodes:
    t0: ch,glue = EntryToken
            t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
          t4: v4i16 = extract_subvector t2, Constant:i64<0>
        t19: v4i16 = bswap t4
      t14: nxv2i16 = insert_subvector undef:nxv2i16, t19, Constant:i64<0>
    t16: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t14
    t17: ch = RISCVISD::RET_GLUE t16, Register:nxv2i16 $v8, t16:1
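
For reference, the fold seen in both DAGs relies on the fact that rotating a 16-bit lane by 8 in either direction swaps its two bytes. Here is a minimal Python sketch that checks this exhaustively. The helper names (rotl16, rotr16, bswap16) are made up for illustration and are not LLVM APIs:

```python
MASK16 = 0xFFFF

def rotl16(x, amt):
    """Rotate a 16-bit value left by amt bits (hypothetical helper)."""
    amt %= 16
    return ((x << amt) | (x >> (16 - amt))) & MASK16

def rotr16(x, amt):
    """Rotate a 16-bit value right by amt bits (hypothetical helper)."""
    amt %= 16
    return ((x >> amt) | (x << (16 - amt))) & MASK16

def bswap16(x):
    """Swap the two bytes of a 16-bit value (hypothetical helper)."""
    return ((x & 0xFF) << 8) | (x >> 8)

# Exhaustive check over every 16-bit value: rotating by 8 in either
# direction gives the same result as a byte swap.
assert all(rotl16(x, 8) == rotr16(x, 8) == bswap16(x) for x in range(0x10000))
```

This only illustrates the arithmetic identity on a single lane; it says nothing about where in the pipeline the combine happens.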

================
Comment at: llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-reverse.ll:278
+; ZVBB-NEXT:    vsetivli zero, 1, e32, mf2, ta, ma
+; ZVBB-NEXT:    vror.vi v8, v8, 16
+; ZVBB-NEXT:    ret
----------------
craig.topper wrote:
> How does this patch create new rotates?
Not sure how I didn't notice these. It looks like rotates were always emitted with Zvbb; there's just an issue with the FileCheck prefixes.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158195/new/

https://reviews.llvm.org/D158195