[llvm-commits] [Patch] Optimize integer ABS on X86

Thu Jun 7 14:56:08 PDT 2012

LGTM

Evan
On Jun 7, 2012, at 2:45 PM, Manman Ren wrote:

> 
> 
> The updated patch is attached. Please review.
> 
> Thanks,
> Manman
> <int_abs_x86.patch>
> On Jun 7, 2012, at 11:07 AM, Evan Cheng wrote:
> 
>> 
>> On Jun 7, 2012, at 9:46 AM, Manman Ren wrote:
>> 
>>> 
>>> Hi Evan,
>>> 
>>> PerformXorCombine was previously enabled only when the Subtarget has BMI.
>>> Since I now have to check Xor for the abs pattern, I enabled it for all X86 Subtargets.
>>> 
>>> First, we check whether the node is an abs; if it is, RV will not be NULL.
>>> We continue with the BMI optimizations only if RV is NULL and the Subtarget has BMI.
>> 
>> Ok, I would suggest something like this to make it more obvious.
>> 
>> SDValue RV = performIntegerAbsCombine(N, DAG);
>> if (RV.getNode())
>>   return RV;
>> 
>> // Try forming BMI if it's available.
>> if (!Subtarget->hasBMI())
>>   return SDValue();
>> 
>> Evan
>> 
>>> 
>>> I will add more comments in the code.
>>> 
>>> Thanks for reviewing.
>>> Manman 
>>> 
>>> On Jun 6, 2012, at 6:15 PM, Evan Cheng wrote:
>>> 
>>>> This part looks weird to me:
>>>> 
>>>> +  SDValue RV = performIntegerAbsCombine(N, DAG);
>>>> +  if (RV.getNode() || !Subtarget->hasBMI())
>>>> +    return RV;
>>>> +
>>>> 
>>>> Why is the code returning RV if its Node is null just because Subtarget->hasBMI() is false? Should the subtarget feature check happen earlier? The rest of the patch looks fine to me. Please commit after you have fixed the hasBMI issue.
>>>> 
>>>> Thanks,
>>>> 
>>>> Evan
>>>> 
>>>> On Jun 6, 2012, at 4:01 PM, Manman Ren <mren at apple.com> wrote:
>>>> 
>>>>> 
>>>>> Optimize generated code for integer ABS on X86:
>>>>>    movl    %edi, %ecx
>>>>>    sarl    $31, %ecx
>>>>>    leal    (%rdi,%rcx), %eax
>>>>>    xorl    %ecx, %eax
>>>>>    ret
>>>>> TO
>>>>>    movl    %edi, %eax
>>>>>    negl    %eax
>>>>>    cmovll  %edi, %eax
>>>>>    ret
>>>>> 
>>>>> This reduces code size and has better performance on Sandy Bridge.
>>>>> There exists a target-independent DAG combine for integer ABS, which converts abs to sar+add+xor.
>>>>> For X86, I tried to match this pattern back to neg+cmov. This is implemented in PerformXorCombine.
>>>>> 
>>>>> Thanks,
>>>>> Manman
>>>>> 
>>>>> <int_abs_x86.patch>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>> 
>>> 
>> 
>
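
For reference, the two instruction sequences in the original patch description compute the same value. Below is a minimal C++ sketch of both forms, not code from the patch: the function names abs_sar_add_xor and abs_neg_select are hypothetical, and the sketch assumes 32-bit two's-complement integers with an arithmetic right shift (guaranteed since C++20, and what x86 compilers do in practice).

```cpp
#include <cstdint>

// Form produced by the target-independent combine: sar + add + xor.
// The add and xor are done in unsigned arithmetic to avoid signed-overflow
// undefined behavior when x is INT32_MIN.
std::int32_t abs_sar_add_xor(std::int32_t x) {
    const std::int32_t sign = x >> 31;  // 0 if x >= 0, -1 if x < 0 (sarl $31)
    const std::uint32_t sum =
        static_cast<std::uint32_t>(x) + static_cast<std::uint32_t>(sign);  // leal
    return static_cast<std::int32_t>(sum ^ static_cast<std::uint32_t>(sign));  // xorl
}

// Form the X86 combine matches back to: neg + cmov.
// negl computes 0 - x; cmovl then restores x when x was positive.
std::int32_t abs_neg_select(std::int32_t x) {
    const std::int32_t negated =
        static_cast<std::int32_t>(0u - static_cast<std::uint32_t>(x));  // negl
    return x > 0 ? x : negated;  // cmovl
}
```

Both functions return INT32_MIN unchanged for an input of INT32_MIN, which matches the wrap-around behavior of the assembly sequences.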