[PATCH] ARM/NEON: Improve codegen for long sext/zext operations.
Arnold Schwaighofer
aschwaighofer at apple.com
Mon Mar 18 08:34:38 PDT 2013
LGTM, too.
Thanks,
Arnold
On Mar 18, 2013, at 9:04 AM, Pete Couperus <pjcoup at gmail.com> wrote:
> Hello,
>
> OK, thanks, that helps a lot. Here is the patch with the cost-model
> changes included.
> Let me know if it doesn't look right, or if there is anything else
> that I should fix.
> Thanks!
>
> Pete
>
>
> On Sat, Mar 16, 2013 at 8:43 AM, Arnold Schwaighofer
> <aschwaighofer at apple.com> wrote:
>> Yes that sound right. I see for example:
>>
>> vmovl.s8
>> vmovl.s8
>> vmovl.s16
>> vmovl.s16
>> vmovl.s16
>> vmovl.s16
>>
>> for
>>
>> %r = sext <16 x i8> %v0 to <16 x i32>
>>
>>
>> That would entail an entry of:
>>
>> { ISD::SIGN_EXTEND, MVT::v16i32, MVT::v16i8, 6 },
>>
>>
>> On Mar 16, 2013, at 9:28 AM, Renato Golin <renato.golin at linaro.org> wrote:
>>
>>> On 16 March 2013 02:51, Pete Couperus <pjcoup at gmail.com> wrote:
>>> Thanks for the review. I'd be happy to modify the cost values.
>>> Is there a guideline for what this should be? I surmise looking at
>>> the comments and the other entries that this isn't quite as simple as
>>> something like the number of vmovl's. Or should I just remove the
>>> entries corresponding to the cases covered in the patch?
>>>
>>> Hi Pete,
>>>
>>> The costs for the operations you're changing are:
>>>
>>> + setOperationAction(ISD::SIGN_EXTEND, MVT::v8i32, Custom);
>>> + setOperationAction(ISD::ZERO_EXTEND, MVT::v8i32, Custom);
>>> + setOperationAction(ISD::SIGN_EXTEND, MVT::v16i32, Custom);
>>> + setOperationAction(ISD::ZERO_EXTEND, MVT::v16i32, Custom);
>>>
>>> They don't have specific entries (AFAICS) in the cost tables, so it ends up just exploding the cost. Fair enough, they were a bunch of strh instructions.
>>>
>>> What you have to do is to add entries in ARMTargetTransformInfo.cpp -> ARMTTI::getCastInstrCost() like:
>>>
>>> { ISD::SIGN_EXTEND, MVT::v16i64, MVT::v16i32, 4 },
>>> { ISD::ZERO_EXTEND, MVT::v16i64, MVT::v16i32, 4 },
>>> { ISD::SIGN_EXTEND, MVT::v8i64, MVT::v8i32, 3 },
>>> { ISD::ZERO_EXTEND, MVT::v8i64, MVT::v8i32, 3 },
>>>
>>> ---> Arnold, please correct me if I'm wrong <---
>>>
>>> And the ones you're adding new:
>>>
>>> + setOperationAction(ISD::SIGN_EXTEND, MVT::v4i64, Custom);
>>> + setOperationAction(ISD::ZERO_EXTEND, MVT::v4i64, Custom);
>>> + setOperationAction(ISD::SIGN_EXTEND, MVT::v8i64, Custom);
>>> + setOperationAction(ISD::ZERO_EXTEND, MVT::v8i64, Custom);
>>>
>>> That need similar entries on the table in ARMTTI taking each instruction as an ideal cost of 1.
>>>
>>> Format is { ISD, DestType, SourceType, Cost };
>>>
>>> That should improve some vectorized benchmarks, but I don't expect it to show above the noise level in LNT, so let's look for correctness rather than performance for now.
>>>
>>> cheers,
>>> --renato
>>
> <szext_cost.diff>
More information about the llvm-commits
mailing list