[llvm-commits] [llvm] r48356 - /llvm/trunk/lib/Target/X86/README.txt

Mon Mar 17 11:33:13 PDT 2008

For the sake of argument:

ARM:
_test:
         mov r3, r0, lsr #16
         sxtb r0, r3
         bx lr

_test2:
         mov r3, r0, lsl #8
         mov r0, r3, asr #24
         bx lr

PPC32:
_test:
         srwi r2, r3, 16
         extsb r3, r2
         blr

_test2:
         slwi r2, r3, 8
         srawi r3, r2, 24
         blr

Don't diss x86! :-)

Evan
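
For readers who want to convince themselves that the two IR functions quoted
below really compute the same value, here is a minimal C sketch. The names
test1, test2, count_mismatches and self_check are made up for illustration
(test1 and test2 mirror @test and @test2). It assumes a compiler such as GCC or
Clang, where converting an out-of-range value to a signed type wraps and where
right-shifting a negative signed value is an arithmetic shift; ISO C leaves
both implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors @test: lshr by 16, trunc to i8, sext back to i32.
   Assumes the narrowing conversion to int8_t wraps (two's complement). */
static int32_t test1(uint32_t x) {
    return (int32_t)(int8_t)(x >> 16);
}

/* Mirrors @test2: shl by 8, then ashr by 24.
   Assumes >> on a negative int32_t is an arithmetic shift. */
static int32_t test2(uint32_t x) {
    return (int32_t)(x << 8) >> 24;
}

/* Compare both forms over every value of the byte in bits 16..23,
   with a few different patterns in the bits that should be ignored. */
static unsigned count_mismatches(void) {
    static const uint32_t hi[] = {0x00u, 0xFFu, 0xA5u};
    static const uint32_t lo[] = {0x0000u, 0xFFFFu, 0x1234u};
    unsigned bad = 0;
    for (uint32_t b = 0; b < 256; ++b) {
        for (int i = 0; i < 3; ++i) {
            for (int j = 0; j < 3; ++j) {
                uint32_t x = (hi[i] << 24) | (b << 16) | lo[j];
                if (test1(x) != test2(x))
                    ++bad;
            }
        }
    }
    return bad;
}

/* Aborts if the two forms ever disagree on the sampled inputs. */
static void self_check(void) {
    assert(count_mismatches() == 0);
}
```

Calling self_check() from a main() is enough to run the comparison. Both forms
sign-extend the byte held in bits 16..23 of the input, which is why a single
movsbl from 6(%esp) is sufficient on x86-32.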

On Mar 14, 2008, at 10:34 AM, Chris Lattner wrote:

> On Mar 14, 2008, at 10:15 AM, Evan Cheng wrote:
>> This seems like a target independent dag combiner xform?
>
> Turning the first into the second is target independent, and should
> actually be done in instcombine.  The problem is that the second one
> generates worse code from the x86 backend than the first one.
>
> On PPC and ARM, they generate equally good code (on ARM they both
> compile to identical instructions even) so consider this to be "X86
> generates inefficient code for:
>
> define i32 @test2(i32 %f12) {
> 	%f11 = shl i32 %f12, 8
> 	%tmp7.25 = ashr i32 %f11, 24
> 	ret i32 %tmp7.25
> }"
>
> :)
>
> -Chris
>
>
>> On Mar 13, 2008, at 11:00 PM, Chris Lattner wrote:
>>
>>>
>>>
>>> +These two functions perform identical operations:
>>> +
>>> +define i32 @test(i32 %f12) {
>>> +	%tmp7.25 = lshr i32 %f12, 16		
>>> +	%tmp7.26 = trunc i32 %tmp7.25 to i8
>>> +	%tmp78.2 = sext i8 %tmp7.26 to i32
>>> +	ret i32 %tmp78.2
>>> +}
>>> +
>>> +define i32 @test2(i32 %f12) {
>>> +	%f11 = shl i32 %f12, 8
>>> +	%tmp7.25 = ashr i32 %f11, 24
>>> +	ret i32 %tmp7.25
>>> +}
>>> +
>>> +but the first compiles into significantly better code on x86-32:
>>> +
>>> +_test:
>>> +	movsbl	6(%esp), %eax
>>> +	ret
>>> +_test2:
>>> +	movl	4(%esp), %eax
>>> +	shll	$8, %eax
>>> +	sarl	$24, %eax
>>> +	ret
>>> +
>>> +and on x86-64:
>>> +
>>> +_test:
>>> +	shrl	$16, %edi
>>> +	movsbl	%dil, %eax
>>> +	ret
>>> +_test2:
>>> +	shll	$8, %edi
>>> +	movl	%edi, %eax
>>> +	sarl	$24, %eax
>>> +	ret
>>> +
>>> +I would like instcombine to canonicalize the first into the second (since it is
>>> +shorter and doesn't involve type width changes) but the x86 backend needs to do
>>> +the right thing with the latter sequence first.
>>> +
>>> +//===---------------------------------------------------------------------===//
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits