[PATCH] ARM/NEON: Improve codegen for long sext/zext operations.
Arnold Schwaighofer
aschwaighofer at apple.com
Sat Mar 16 08:43:43 PDT 2013
Yes that sound right. I see for example:
vmovl.s8
vmovl.s8
vmovl.s16
vmovl.s16
vmovl.s16
vmovl.s16
for
%r = sext <16 x i8> %v0 to <16 x i32>
That would entail an entry of:
{ ISD::SIGN_EXTEND, MVT::v16i32, MVT::v16i8, 6 },
On Mar 16, 2013, at 9:28 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 16 March 2013 02:51, Pete Couperus <pjcoup at gmail.com> wrote:
> Thanks for the review. I'd be happy to modify the cost values.
> Is there a guideline for what this should be? I surmise looking at
> the comments and the other entries that this isn't quite as simple as
> something like the number of vmovl's. Or should I just remove the
> entries corresponding to the cases covered in the patch?
>
> Hi Pete,
>
> The costs for the operations you're changing are:
>
> + setOperationAction(ISD::SIGN_EXTEND, MVT::v8i32, Custom);
> + setOperationAction(ISD::ZERO_EXTEND, MVT::v8i32, Custom);
> + setOperationAction(ISD::SIGN_EXTEND, MVT::v16i32, Custom);
> + setOperationAction(ISD::ZERO_EXTEND, MVT::v16i32, Custom);
>
> They don't have specific entries (AFAICS) in the cost tables, so it ends up just exploding the cost. Fair enough, they were a bunch of strh instructions.
>
> What you have to do is to add entries in ARMTargetTransformInfo.cpp -> ARMTTI::getCastInstrCost() like:
>
> { ISD::SIGN_EXTEND, MVT::v16i64, MVT::v16i32, 4 },
> { ISD::ZERO_EXTEND, MVT::v16i64, MVT::v16i32, 4 },
> { ISD::SIGN_EXTEND, MVT::v8i64, MVT::v8i32, 3 },
> { ISD::ZERO_EXTEND, MVT::v8i64, MVT::v8i32, 3 },
>
> ---> Arnold, please correct me if I'm wrong <---
>
> And the ones you're adding new:
>
> + setOperationAction(ISD::SIGN_EXTEND, MVT::v4i64, Custom);
> + setOperationAction(ISD::ZERO_EXTEND, MVT::v4i64, Custom);
> + setOperationAction(ISD::SIGN_EXTEND, MVT::v8i64, Custom);
> + setOperationAction(ISD::ZERO_EXTEND, MVT::v8i64, Custom);
>
> That need similar entries on the table in ARMTTI taking each instruction as an ideal cost of 1.
>
> Format is { ISD, DestType, SourceType, Cost };
>
> That should improve some vectorized benchmarks, but I don't expect it to show above the noise level in LNT, so let's look for correctness rather than performance for now.
>
> cheers,
> --renato
More information about the llvm-commits
mailing list