[llvm-commits] [Patch] Optimize integer ABS on X86
Evan Cheng
evan.cheng at apple.com
Thu Jun 7 14:56:08 PDT 2012
LGTM
Evan
On Jun 7, 2012, at 2:45 PM, Manman Ren wrote:
>
>
> Updated patch is attached. Please review,
>
> Thanks,
> Manman
> <int_abs_x86.patch>
> On Jun 7, 2012, at 11:07 AM, Evan Cheng wrote:
>
>>
>> On Jun 7, 2012, at 9:46 AM, Manman Ren wrote:
>>
>>>
>>> Hi Evan,
>>>
>>> PerformXorCombine was enabled only when Subtarget has BMI.
>>> Since now I have to check Xor for abs pattern, I enabled it for all X86 Subtargets.
>>>
>>> First, we check whether it is abs, if it is, RV will not be NULL.
>>> We will continue optimizations for BMI if RV is NULL and Subtarget has BMI.
>>
>> Ok, I would suggest something like this to make it more obvious.
>>
>> SDValue RV = performIntegerAbsCombine(N, DAG);
>> if (RV.getNode())
>> return RV;
>>
>> // Try forming BMI if it's available.
>> if (!Subtarget->hasBMI())
>> return SDValue();
>>
>> Evan
>>
>>>
>>> I will add more comments in the code.
>>>
>>> Thanks for reviewing.
>>> Manman
>>>
>>> On Jun 6, 2012, at 6:15 PM, Evan Cheng wrote:
>>>
>>>> This part looks weird to me:
>>>>
>>>> + SDValue RV = performIntegerAbsCombine(N, DAG);
>>>> + if (RV.getNode() || !Subtarget->hasBMI())
>>>> + return RV;
>>>> +
>>>>
>>>> Why is the code returning RV if its Node is null just because Subtarget->hasBMI() is false? Should the subtarget feature check happen earlier? The rest of the patch looks fine to me. Please commit after you have fixed the hasBMI issue.
>>>>
>>>> Thanks,
>>>>
>>>> Evan
>>>>
>>>> On Jun 6, 2012, at 4:01 PM, Manman Ren <mren at apple.com> wrote:
>>>>
>>>>>
>>>>> Optimize generated code for integer ABS on X86:
>>>>> movl %edi, %ecx
>>>>> sarl $31, %ecx
>>>>> leal (%rdi,%rcx), %eax
>>>>> xorl %ecx, %eax
>>>>> ret
>>>>> TO
>>>>> movl %edi, %eax
>>>>> negl %eax
>>>>> cmovll %edi, %eax
>>>>> ret
>>>>>
>>>>> This reduces code size and has better performance on sandy bridge.
>>>>> There exists a target-independent DAG combine for integer ABS, which converts abs to sar+add+xor.
>>>>> For X86, I tried to match this pattern back to neg+cmov. This is implemented in PerformXorCombine.
>>>>>
>>>>> Thanks,
>>>>> Manman
>>>>>
>>>>> <int_abs_x86.patch>_______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>
>>>
>>
>
More information about the llvm-commits
mailing list