[llvm-commits] [PATCH] ARM/NEON: Custom lower ctpop for appropriate vector types

Pete Couperus pjcoup at gmail.com
Tue Dec 4 23:00:41 PST 2012


Hi Evan, Shuxin,

Thanks for working with me on this.  I should have this done later this week.

Pete


On Tue, Dec 4, 2012 at 2:56 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
> Hi, Pete:
>
>   Do you mind to implement ScalarTargetTransformInfo::getPopcntHwSupport()
> for ARM?
> This function will enable scalar opt to recognize popcount idioms.
>
> Thanks
> Shuxin
>
>
> On 12/4/12 2:43 PM, Evan Cheng wrote:
>
>
> On Dec 4, 2012, at 6:44 AM, Pete Couperus <pjcoup at gmail.com> wrote:
>
> Hi Evan,
>
> Thanks for the review.  Currently, this only enables HW support for vector
> types, not scalars.  It would be easy enough to add scalar support in.
> Should I roll that into this patch?  Or put it in a following patch?
> If you'd like a following patch, would you mind committing this one for me?
>
>
> I've committed it as r169325 after some minor changes (including naming the
> test case popcnt.ll). Please add the scalar support. Thanks!
>
> Evan
>
> Thanks!
>
> Pete
>
>
> On Mon, Dec 3, 2012 at 6:29 PM, Evan Cheng <evan.cheng at apple.com> wrote:
>>
>> I don't think use Pat<> buys anything here. The C++ lowering code is
>> actually much more readable.
>>
>> The patch looks fine to me. Please add
>> ScalarTargetTransformInfo::PopcntHwSupport() to take advantage of the
>> popcount loop idiom recognition support (See r168931). ARM crafty score
>> should improve!
>>
>> Evan
>>
>> On Nov 30, 2012, at 11:37 PM, Pete Couperus <pjcoup at gmail.com> wrote:
>>
>> Hi Anton,
>>
>> I don't think this will work cleanly using Pat<>.  The custom lowering
>> uses a bitcast, which gives me an unrecoverable error for using bitconvert
>> in an output pattern, or a non-fatal (?) Type inference contradiction error.
>>
>> Assuming there was some way around this, the unoptimized pattern for the
>> v4i16 case looks something like:
>> def : Pat<(v4i16 (ctpop DPR:$Dm)),
>>           (v4i16 (EXTRACT_SUBREG
>>             (v8i16 (VMOVLuv8i16
>>               (v8i8  (VUZPd8
>>                 (v8i8 (IMPLICIT_DEF)),
>>                 (v8i8  (VADDv8i8
>>                   (v8i8  (VREV16d8
>>                     (v8i8 (VCNTd (v8i8 DPR:$Dm)))
>>                   )),
>>                   (v8i8 (VCNTd (v8i8 DPR:$Dm)))
>>                 )),
>>                 (v8i8  (VADDv8i8
>>                   (v8i8  (VREV16d8
>>                     (v8i8 (VCNTd (v8i8 DPR:$Dm)))
>>                   )),
>>                   (v8i8 (VCNTd (v8i8 DPR:$Dm)))
>>                 ))
>>               ))
>>             )), dsub_0
>>           ))>;
>>
>> This actually builds with the Type inference errors, and gives the right
>> assembly.
>> As is, I think this is more clearly expressed in the code.
>> So, unless there are ways to deal with these issues in tablegen that I
>> haven't thought of, it needs to be lowered in the code.
>> Was fun to try though :).  Any other comments?
>>
>> Pete
>>
>>
>> On Fri, Nov 30, 2012 at 6:05 AM, Pete Couperus <pjcoup at gmail.com> wrote:
>>>
>>> Hi Anton,
>>>
>>> Thanks for taking a look.  I had trouble with complex patterns for other
>>> similar items, but I'll take another look at this case if that actually
>>> sounds feasible.
>>> Looking at LowerCTTZ, using Pat<> would be the same for cttz, although
>>> its lowering is much simpler, correct?  Is there guidelines for when to use
>>> Pat<> and when to lay it out in the code?
>>> Thanks again for taking a look!
>>>
>>> Pete
>>>
>>>
>>>
>>>
>>> On Thu, Nov 29, 2012 at 9:53 PM, Anton Korobeynikov
>>> <anton at korobeynikov.info> wrote:
>>>>
>>>> Pete,
>>>>
>>>> > The population count intrinsic (ctpop) is supported on ARM/NEON for
>>>> > v8i8 and
>>>> > v16i8 types via NEON's vcnt instruction.
>>>> > This patch leverages vcnt and other NEON instructions to custom lower
>>>> > ctpop
>>>> > for v2i32/v4i32 and v4i16/v8i16 types.
>>>> > As you'll see, a fair chunk of this patch is comments describing the
>>>> > lowering, which I am happy to adjust to people's liking.
>>>> > Please review!
>>>> > Thanks!
>>>> Can't you use Pat<> magic in .td file to expand ctpop into series of
>>>> nodes? It seems exactly what's your lowering does...
>>>>
>>>> --
>>>> With best regards, Anton Korobeynikov
>>>> Faculty of Mathematics and Mechanics, Saint Petersburg State University
>>>
>>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>



More information about the llvm-commits mailing list