[PATCH] D142456: [AArch64][CostModel]: Add costs for zero/sign extend.

David Sherwood via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 25 01:24:15 PST 2023


david-arm added a comment.

Thanks for this patch @hassnaa-arm! It's definitely an improvement on the existing cost model. However, I just have a few comments on the costs ...



================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2066
+    // each mov operation has a cost of 2.
+    { ISD::ZERO_EXTEND, MVT::nxv16i16, MVT::nxv16i8, 4},
+    { ISD::ZERO_EXTEND, MVT::nxv16i32, MVT::nxv16i8, 8},
----------------
For this example, something like

  define <vscale x 16 x i16> @sve_ext_i8_i16(<vscale x 16 x i8> %a) {
    %r = zext <vscale x 16 x i8> %a to <vscale x 16 x i16>
    ret <vscale x 16 x i16> %r
  }

would end up as

        uunpklo z2.h, z0.b
        uunpkhi z1.h, z0.b
        mov     z0.d, z2.d
        ret

So I think perhaps the comment above might be better written as something like

    // zero/sign extend are implemented by multiple unpack operations,
    // where each unpack operation has a cost of 2.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2067
+    { ISD::ZERO_EXTEND, MVT::nxv16i16, MVT::nxv16i8, 4},
+    { ISD::ZERO_EXTEND, MVT::nxv16i32, MVT::nxv16i8, 8},
+    { ISD::ZERO_EXTEND, MVT::nxv16i64, MVT::nxv16i8, 16},
----------------
I wonder if this cost should be even higher? Extending from nxv16i8 to nxv16i16 leads to 2 unpk instructions, where you have a cost of 4. However, going from nxv16i8  to nxv16i32 requires 6 unpk instructions, which I would expect to be 2*6=12. For example, I get:

  sve_ext_i8_i32:
        uunpklo z1.h, z0.b
        uunpkhi z3.h, z0.b
        uunpklo z0.s, z1.h
        uunpkhi z1.s, z1.h
        uunpklo z2.s, z3.h
        uunpkhi z3.s, z3.h
        ret

The costs are probably even worse extending from nxv16i8 to nxv16i64!


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2072
+    { ISD::SIGN_EXTEND, MVT::nxv16i32, MVT::nxv16i8, 8},
+    { ISD::SIGN_EXTEND, MVT::nxv16i64, MVT::nxv16i8, 16},
   };
----------------
It would be good to complete the other extends from legal types here too, such as 

  { ISD::ZERO_EXTEND, MVT::nxv8i32, MVT::nxv8i16, ?},
  { ISD::ZERO_EXTEND, MVT::nxv8i64, MVT::nxv8i16, ?},
  { ISD::ZERO_EXTEND, MVT::nxv4i64, MVT::nxv4i32, ?},

  { ISD::SIGN_EXTEND, MVT::nxv8i32, MVT::nxv8i16, ?},
  { ISD::SIGN_EXTEND, MVT::nxv8i64, MVT::nxv8i16, ?},
  { ISD::SIGN_EXTEND, MVT::nxv4i64, MVT::nxv4i32, ?},


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142456/new/

https://reviews.llvm.org/D142456



More information about the llvm-commits mailing list