[llvm-commits] [PATCH] ARM: Custom lower scalar ctpop
Owen Anderson
resistor at mac.com
Sat Dec 8 20:47:09 PST 2012
You probably want to make the expansion subtarget specific. Pre-A9 microarchitectures had very slow moves between the GPRs and the vector registers.
-Owen
On Dec 8, 2012, at 3:35 PM, Pete Couperus <pjcoup at gmail.com> wrote:
> Hello,
>
> I think you are right, probably vcnt and a couple of vaddl's. I will verify, and resubmit shortly.
>
> Pete
>
>
> Evan Cheng <evan.cheng at apple.com> wrote:
>
>> This seems wrong to me. I think a scalar popcount should be 3-4 instructions. Owen, I believe you know the right code sequence for popcount with vcnt. What do you suggest?
>>
>> Evan
>>
>> On Dec 8, 2012, at 9:06 AM, Pete Couperus <pjcoup at gmail.com> wrote:
>>
>>> Hello,
>>>
>>> Forgot the test case. Reattached with test case.
>>>
>>> Pete
>>>
>>>
>>> On Sat, Dec 8, 2012 at 8:39 AM, Pete Couperus <pjcoup at gmail.com> wrote:
>>>> Hello,
>>>>
>>>> This patch builds on the vector support for ARM/NEON ctpop lowering
>>>> (r169325) to give i32/i64 custom lowering.
>>>> It does not tie into the ctpop idiom recognition, as that patch seems
>>>> to be under discussion right now.
>>>> Please review!
>>>> Thanks!
>>>>
>>>> Pete
>>> <0001-scalar-ctpop.diff>
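
For reference, here is a rough, portable C model of the two strategies this thread touches on: a vcnt-style expansion (per-byte counts followed by pairwise widening adds) and a plain scalar fallback for subtargets where GPR/vector moves are slow. This is only a sketch and is not taken from the attached patch. The function names and the fast_gpr_vector_moves flag are hypothetical, and the assumption that the adds correspond to vpaddl.u8 and vpaddl.u16 is mine; the thread only mentions "vcnt and a couple of vaddl's".

```c
#include <assert.h>
#include <stdint.h>

/* Model of a per-byte population count over eight byte lanes
 * (what vcnt.8 does on a 64-bit vector register). */
static void vcnt8_model(const uint8_t in[8], uint8_t out[8]) {
    for (int i = 0; i < 8; ++i) {
        uint8_t v = in[i], c = 0;
        while (v) {
            c = (uint8_t)(c + (v & 1u));
            v >>= 1;
        }
        out[i] = c;
    }
}

/* Vector-style expansion: move the scalar into the low lanes, count bits
 * per byte, then reduce with two pairwise widening adds
 * (assumed here to correspond to vpaddl.u8 and vpaddl.u16). */
static uint32_t ctpop32_vector_model(uint32_t x) {
    uint8_t bytes[8] = {0};
    uint8_t counts[8];
    uint16_t h[4];
    uint32_t w[2];

    for (int i = 0; i < 4; ++i)
        bytes[i] = (uint8_t)(x >> (8 * i));   /* GPR -> vector move */

    vcnt8_model(bytes, counts);               /* per-byte counts */

    for (int i = 0; i < 4; ++i)               /* 8 x u8 -> 4 x u16 */
        h[i] = (uint16_t)(counts[2 * i] + counts[2 * i + 1]);

    for (int i = 0; i < 2; ++i)               /* 4 x u16 -> 2 x u32 */
        w[i] = (uint32_t)h[2 * i] + h[2 * i + 1];

    assert(w[1] == 0);                        /* upper lanes were zero-filled */
    return w[0];                              /* vector -> GPR move */
}

/* Scalar fallback: the usual bit-parallel popcount, no vector registers. */
static uint32_t ctpop32_scalar(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    return (x * 0x01010101u) >> 24;
}

/* Hypothetical subtarget switch: take the vector path only when moves
 * between GPRs and vector registers are cheap. */
static uint32_t ctpop32(uint32_t x, int fast_gpr_vector_moves) {
    return fast_gpr_vector_moves ? ctpop32_vector_model(x)
                                 : ctpop32_scalar(x);
}
```

Both paths return the same count for any input; the flag only models which expansion a given subtarget would prefer.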