[llvm-commits] [llvm] r107613 - /llvm/trunk/lib/Target/X86/README-SSE.txt
Eli Friedman
eli.friedman at gmail.com
Mon Jul 5 19:15:37 PDT 2010
On Mon, Jul 5, 2010 at 6:27 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Sun, Jul 4, 2010 at 10:48 PM, Chris Lattner <sabre at nondot.org> wrote:
>> Author: lattner
>> Date: Mon Jul 5 00:48:41 2010
>> New Revision: 107613
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=107613&view=rev
>> Log:
>> some notes about suboptimal insertps's
>>
>> Modified:
>> llvm/trunk/lib/Target/X86/README-SSE.txt
>>
>> Modified: llvm/trunk/lib/Target/X86/README-SSE.txt
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/README-SSE.txt?rev=107613&r1=107612&r2=107613&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/X86/README-SSE.txt (original)
>> +++ llvm/trunk/lib/Target/X86/README-SSE.txt Mon Jul 5 00:48:41 2010
>> @@ -846,3 +846,34 @@
>> doing a shuffle from v[1] to v[0] then a float store.
>>
>> //===---------------------------------------------------------------------===//
>> +
>> +On SSE4 machines, we compile this code:
>> +
>> +define <2 x float> @test2(<2 x float> %Q, <2 x float> %R,
>> + <2 x float> *%P) nounwind {
>> + %Z = fadd <2 x float> %Q, %R
>> +
>> + store <2 x float> %Z, <2 x float> *%P
>> + ret <2 x float> %Z
>> +}
>> +
>> +into:
>> +
>> +_test2: ## @test2
>> +## BB#0:
>> + insertps $0, %xmm2, %xmm2
>> + insertps $16, %xmm3, %xmm2
>> + insertps $0, %xmm0, %xmm3
>> + insertps $16, %xmm1, %xmm3
>> + addps %xmm2, %xmm3
>> + movq %xmm3, (%rdi)
>> + movaps %xmm3, %xmm0
>> + pshufd $1, %xmm3, %xmm1
>> + ## kill: XMM1<def> XMM1<kill>
>> + ret
>> +
>> +The insertps's of $0 are pointless complex copies.
>
> I'm also concerned about the pshufd at the end; is the ABI stuff
> really working the way you want it to?
Never mind; I read this before your email to llvmdev.
-Eli
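
For readers unfamiliar with the insertps immediate, here is a minimal sketch of why the "$0" forms in the quoted assembly do no useful work. It models only the register-to-register form of the instruction. The function name insertps and the list-of-four-floats representation are illustrative assumptions, not anything from LLVM. The immediate layout is the documented one: bits 7:6 select the source lane, bits 5:4 select the destination lane, and bits 3:0 are a zero mask.

```python
def insertps(dst, src, imm8):
    """Model register-form INSERTPS on 4-lane vectors (lists of 4 floats).

    Returns the new destination value; dst and src are not modified.
    """
    count_s = (imm8 >> 6) & 0x3   # source lane to read
    count_d = (imm8 >> 4) & 0x3   # destination lane to overwrite
    zmask = imm8 & 0xF            # lanes to force to zero afterwards
    out = list(dst)
    out[count_d] = src[count_s]
    return [0.0 if (zmask >> i) & 1 else out[i] for i in range(4)]


# Placeholder register contents, chosen only for illustration.
xmm2 = [1.0, 2.0, 3.0, 4.0]
xmm3 = [9.0, 8.0, 7.0, 6.0]

# "insertps $0, %xmm2, %xmm2": lane 0 of xmm2 into lane 0 of xmm2.
same = insertps(xmm2, xmm2, 0)

# "insertps $16, %xmm3, %xmm2": lane 0 of xmm3 into lane 1 of xmm2.
packed = insertps(xmm2, xmm3, 16)
```

With source and destination both xmm2 and an immediate of 0, the result equals the input, so that instruction is a no-op. The "$0" form with distinct registers only moves lane 0, which a plain register copy would also achieve in this sequence, since lane 1 is overwritten by the following "$16" insert and the upper lanes are unused by the <2 x float> result.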