[llvm-commits] [PATCH] ARM: Custom lower scalar ctpop

Owen Anderson resistor at mac.com
Sat Dec 8 20:47:09 PST 2012


You probably want to make the expansion subtarget-specific.  Pre-A9 microarchitectures had very slow moves between the GPRs and the vector registers.
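
To make the trade-off concrete, here is a rough, portable C++ model of the two expansions under discussion. It is only a sketch: the per-byte count stands in for vcnt, the two widening adds stand in for the adds mentioned further down the thread, and the slowGprVectorMoves flag is a hypothetical stand-in for whatever subtarget predicate the patch would actually use. None of these names come from the patch itself.

```cpp
#include <array>
#include <cstdint>

// Models a per-byte population count (the role vcnt plays in the
// vector sequence): each byte lane is counted independently.
static std::array<uint8_t, 4> countPerByte(uint32_t x) {
  std::array<uint8_t, 4> lanes{};
  for (int i = 0; i < 4; ++i) {
    uint8_t b = static_cast<uint8_t>(x >> (8 * i));
    uint8_t n = 0;
    while (b) {
      b &= static_cast<uint8_t>(b - 1);  // clear the lowest set bit
      ++n;
    }
    lanes[i] = n;
  }
  return lanes;
}

// Vector-style path: per-byte count followed by two widening adds.
// On real hardware this also costs a move into and out of the
// vector registers, which is the part that is slow on some cores.
uint32_t popcountViaByteLanes(uint32_t x) {
  std::array<uint8_t, 4> c = countPerByte(x);
  uint16_t lo = static_cast<uint16_t>(c[0] + c[1]);  // u8 pairs -> u16
  uint16_t hi = static_cast<uint16_t>(c[2] + c[3]);
  return static_cast<uint32_t>(lo) + hi;             // u16 pair -> u32
}

// Scalar path: the usual bit-parallel popcount, which never leaves
// the general-purpose registers.
uint32_t popcountScalar(uint32_t x) {
  x = x - ((x >> 1) & 0x55555555u);
  x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
  x = (x + (x >> 4)) & 0x0F0F0F0Fu;
  return (x * 0x01010101u) >> 24;
}

// Hypothetical subtarget gate: pick the scalar expansion when moves
// between GPRs and vector registers are expensive.
uint32_t popcount32(uint32_t x, bool slowGprVectorMoves) {
  return slowGprVectorMoves ? popcountScalar(x) : popcountViaByteLanes(x);
}
```

Both paths return the same value for every input; the flag only models which expansion would be cheaper on a given core.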

-Owen

On Dec 8, 2012, at 3:35 PM, Pete Couperus <pjcoup at gmail.com> wrote:

> Hello,
> 
> I think you are right, probably vcnt and a couple of vaddl's.  I will verify, and resubmit shortly.
> 
> Pete
> 
> 
> Evan Cheng <evan.cheng at apple.com> wrote:
> 
>> This seems wrong to me. I think a scalar popcount should be 3-4 instructions. Owen, I believe you know the right code sequence for popcount with vcnt. What do you suggest?
>> 
>> Evan
>> 
>> On Dec 8, 2012, at 9:06 AM, Pete Couperus <pjcoup at gmail.com> wrote:
>> 
>>> Hello,
>>> 
>>> Forgot the test case.  Reattached with test case.
>>> 
>>> Pete
>>> 
>>> 
>>> On Sat, Dec 8, 2012 at 8:39 AM, Pete Couperus <pjcoup at gmail.com> wrote:
>>>> Hello,
>>>> 
>>>> This patch builds on the vector support for ARM/NEON ctpop lowering
>>>> (r169325) to give i32/i64 custom lowering.
>>>> It does not tie into the ctpop idiom recognition, as that patch seems
>>>> to be under discussion right now.
>>>> Please review!
>>>> Thanks!
>>>> 
>>>> Pete
>>> <0001-scalar-ctpop.diff>


