[llvm-commits] [Patch] Optimize integer ABS on X86
Manman Ren
mren at apple.com
Thu Jun 7 14:45:32 PDT 2012
The updated patch is attached. Please review.
Thanks,
Manman
-------------- next part --------------
A non-text attachment was scrubbed...
Name: int_abs_x86.patch
Type: application/octet-stream
Size: 3417 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120607/9e9805a6/attachment.obj>
-------------- next part --------------
On Jun 7, 2012, at 11:07 AM, Evan Cheng wrote:
>
> On Jun 7, 2012, at 9:46 AM, Manman Ren wrote:
>
>>
>> Hi Evan,
>>
>> PerformXorCombine was previously enabled only when the Subtarget has BMI.
>> Since I now have to check Xor for the abs pattern, I enabled it for all X86 Subtargets.
>>
>> First, we check whether the node matches the abs pattern; if it does, RV will not be NULL.
>> If RV is NULL and the Subtarget has BMI, we continue with the BMI optimizations.
>
> Ok, I would suggest something like this to make it more obvious.
>
>   SDValue RV = performIntegerAbsCombine(N, DAG);
>   if (RV.getNode())
>     return RV;
>
>   // Try forming BMI if it's available.
>   if (!Subtarget->hasBMI())
>     return SDValue();
>
> Evan
>
>>
>> I will add more comments in the code.
>>
>> Thanks for reviewing.
>> Manman
>>
>> On Jun 6, 2012, at 6:15 PM, Evan Cheng wrote:
>>
>>> This part looks weird to me:
>>>
>>> +  SDValue RV = performIntegerAbsCombine(N, DAG);
>>> +  if (RV.getNode() || !Subtarget->hasBMI())
>>> +    return RV;
>>> +
>>>
>>> Why is the code returning RV if its Node is null just because Subtarget->hasBMI() is false? Should the subtarget feature check happen earlier? The rest of the patch looks fine to me. Please commit after you have fixed the hasBMI issue.
>>>
>>> Thanks,
>>>
>>> Evan
>>>
>>> On Jun 6, 2012, at 4:01 PM, Manman Ren <mren at apple.com> wrote:
>>>
>>>>
>>>> Optimize generated code for integer ABS on X86:
>>>>     movl %edi, %ecx
>>>>     sarl $31, %ecx
>>>>     leal (%rdi,%rcx), %eax
>>>>     xorl %ecx, %eax
>>>>     ret
>>>> to
>>>>     movl %edi, %eax
>>>>     negl %eax
>>>>     cmovll %edi, %eax
>>>>     ret
>>>>
>>>> This reduces code size and performs better on Sandy Bridge.
>>>> There is a target-independent DAG combine for integer ABS that converts abs to sar+add+xor.
>>>> For X86, I match this pattern back to neg+cmov. This is implemented in PerformXorCombine.
>>>>
>>>> Thanks,
>>>> Manman
>>>>
>>>> <int_abs_x86.patch>
>>>
>>
>
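
For readers following the thread, the two instruction sequences discussed above can be modelled in plain C++ as a rough sketch. This is not code from the patch: the function names absShiftAddXor and absNegSelect are made up for illustration, and the sketch assumes a 32-bit two's complement target where >> on a negative signed value is an arithmetic shift (guaranteed from C++20, and the usual behaviour on X86 compilers before that).

```cpp
#include <cstdint>

// Models the sar+add+xor form: sarl $31 / leal / xorl.
// Hypothetical helper, not part of the patch.
int32_t absShiftAddXor(int32_t x) {
  // sign is 0 when x >= 0 and -1 (all ones) when x < 0.
  int32_t sign = x >> 31;
  // Do the add and xor in unsigned arithmetic so wraparound is well defined.
  uint32_t sum = static_cast<uint32_t>(x) + static_cast<uint32_t>(sign);
  return static_cast<int32_t>(sum ^ static_cast<uint32_t>(sign));
}

// Models the neg+cmov form: negl computes 0 - x, and cmovl then
// selects the original x when 0 < x (signed comparison).
// Hypothetical helper, not part of the patch.
int32_t absNegSelect(int32_t x) {
  int32_t neg = static_cast<int32_t>(0u - static_cast<uint32_t>(x));
  return x > 0 ? x : neg;
}
```

Both helpers return the same value for every 32-bit input, including INT32_MIN, which maps to itself in either form. That agreement is what makes it safe to match the sar+add+xor pattern back to neg+cmov.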