[llvm-commits] [PATCH] ARM/NEON: Custom lower ctpop for appropriate vector types

Evan Cheng evan.cheng at apple.com
Tue Dec 4 14:43:58 PST 2012


On Dec 4, 2012, at 6:44 AM, Pete Couperus <pjcoup at gmail.com> wrote:

> Hi Evan,
> 
> Thanks for the review.  Currently, this only enables HW support for vector types, not scalars.  It would be easy enough to add scalar support in.
> Should I roll that into this patch?  Or put it in a following patch?
> If you'd like a following patch, would you mind committing this one for me?

I've committed it as r169325 after some minor changes (including naming the test case popcnt.ll). Please add the scalar support. Thanks!

Evan

> Thanks!
> 
> Pete
> 
> 
> On Mon, Dec 3, 2012 at 6:29 PM, Evan Cheng <evan.cheng at apple.com> wrote:
> I don't think use Pat<> buys anything here. The C++ lowering code is actually much more readable.
> 
> The patch looks fine to me. Please add ScalarTargetTransformInfo::PopcntHwSupport() to take advantage of the popcount loop idiom recognition support (See r168931). ARM crafty score should improve!
> 
> Evan
> 
> On Nov 30, 2012, at 11:37 PM, Pete Couperus <pjcoup at gmail.com> wrote:
> 
>> Hi Anton,
>> 
>> I don't think this will work cleanly using Pat<>.  The custom lowering uses a bitcast, which gives me an unrecoverable error for using bitconvert in an output pattern, or a non-fatal (?) Type inference contradiction error.
>> 
>> Assuming there was some way around this, the unoptimized pattern for the v4i16 case looks something like:
>> def : Pat<(v4i16 (ctpop DPR:$Dm)),
>>           (v4i16 (EXTRACT_SUBREG
>>             (v8i16 (VMOVLuv8i16
>>               (v8i8  (VUZPd8
>>                 (v8i8 (IMPLICIT_DEF)),
>>                 (v8i8  (VADDv8i8
>>                   (v8i8  (VREV16d8
>>                     (v8i8 (VCNTd (v8i8 DPR:$Dm)))
>>                   )),
>>                   (v8i8 (VCNTd (v8i8 DPR:$Dm)))
>>                 )),
>>                 (v8i8  (VADDv8i8
>>                   (v8i8  (VREV16d8
>>                     (v8i8 (VCNTd (v8i8 DPR:$Dm)))
>>                   )),
>>                   (v8i8 (VCNTd (v8i8 DPR:$Dm)))
>>                 ))
>>               ))
>>             )), dsub_0
>>           ))>;
>> 
>> This actually builds with the Type inference errors, and gives the right assembly.
>> As is, I think this is more clearly expressed in the code.
>> So, unless there are ways to deal with these issues in tablegen that I haven't thought of, it needs to be lowered in the code.
>> Was fun to try though :).  Any other comments?
>> 
>> Pete
>> 
>> 
>> On Fri, Nov 30, 2012 at 6:05 AM, Pete Couperus <pjcoup at gmail.com> wrote:
>> Hi Anton,
>> 
>> Thanks for taking a look.  I had trouble with complex patterns for other similar items, but I'll take another look at this case if that actually sounds feasible.
>> Looking at LowerCTTZ, using Pat<> would be the same for cttz, although its lowering is much simpler, correct?  Is there guidelines for when to use Pat<> and when to lay it out in the code?
>> Thanks again for taking a look!
>> 
>> Pete
>> 
>> 
>> 
>> 
>> On Thu, Nov 29, 2012 at 9:53 PM, Anton Korobeynikov <anton at korobeynikov.info> wrote:
>> Pete,
>> 
>> > The population count intrinsic (ctpop) is supported on ARM/NEON for v8i8 and
>> > v16i8 types via NEON's vcnt instruction.
>> > This patch leverages vcnt and other NEON instructions to custom lower ctpop
>> > for v2i32/v4i32 and v4i16/v8i16 types.
>> > As you'll see, a fair chunk of this patch is comments describing the
>> > lowering, which I am happy to adjust to people's liking.
>> > Please review!
>> > Thanks!
>> Can't you use Pat<> magic in .td file to expand ctpop into series of
>> nodes? It seems exactly what's your lowering does...
>> 
>> --
>> With best regards, Anton Korobeynikov
>> Faculty of Mathematics and Mechanics, Saint Petersburg State University
>> 
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20121204/2c02541f/attachment.html>


More information about the llvm-commits mailing list