[llvm-commits] [llvm] r107613 - /llvm/trunk/lib/Target/X86/README-SSE.txt
Eli Friedman
eli.friedman at gmail.com
Mon Jul 5 18:27:34 PDT 2010
On Sun, Jul 4, 2010 at 10:48 PM, Chris Lattner <sabre at nondot.org> wrote:
> Author: lattner
> Date: Mon Jul 5 00:48:41 2010
> New Revision: 107613
>
> URL: http://llvm.org/viewvc/llvm-project?rev=107613&view=rev
> Log:
> some notes about suboptimal insertps's
>
> Modified:
> llvm/trunk/lib/Target/X86/README-SSE.txt
>
> Modified: llvm/trunk/lib/Target/X86/README-SSE.txt
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/README-SSE.txt?rev=107613&r1=107612&r2=107613&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/README-SSE.txt (original)
> +++ llvm/trunk/lib/Target/X86/README-SSE.txt Mon Jul 5 00:48:41 2010
> @@ -846,3 +846,34 @@
> doing a shuffle from v[1] to v[0] then a float store.
>
> //===---------------------------------------------------------------------===//
> +
> +On SSE4 machines, we compile this code:
> +
> +define <2 x float> @test2(<2 x float> %Q, <2 x float> %R,
> + <2 x float> *%P) nounwind {
> + %Z = fadd <2 x float> %Q, %R
> +
> + store <2 x float> %Z, <2 x float> *%P
> + ret <2 x float> %Z
> +}
> +
> +into:
> +
> +_test2: ## @test2
> +## BB#0:
> + insertps $0, %xmm2, %xmm2
> + insertps $16, %xmm3, %xmm2
> + insertps $0, %xmm0, %xmm3
> + insertps $16, %xmm1, %xmm3
> + addps %xmm2, %xmm3
> + movq %xmm3, (%rdi)
> + movaps %xmm3, %xmm0
> + pshufd $1, %xmm3, %xmm1
> + ## kill: XMM1<def> XMM1<kill>
> + ret
> +
> +The insertps's of $0 are pointless complex copies.
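To make the "pointless" part concrete, here is a small Python sketch of the register-to-register insertps immediate. It assumes the usual encoding: bits 7:6 pick the source lane, bits 5:4 the destination lane, and bits 3:0 are a zero mask. The function name and the list-of-four-floats register model are illustrative only and are not taken from LLVM.

```python
def insertps(dst, src, imm8):
    """Model register-to-register INSERTPS on 4-lane lists (illustrative)."""
    src_lane = (imm8 >> 6) & 0x3   # bits 7:6: which source lane to read
    dst_lane = (imm8 >> 4) & 0x3   # bits 5:4: which destination lane to write
    zero_mask = imm8 & 0xF         # bits 3:0: lanes forced to 0.0 afterwards
    out = list(dst)
    out[dst_lane] = src[src_lane]
    return [0.0 if (zero_mask >> i) & 1 else v for i, v in enumerate(out)]


# insertps $0, %xmm2, %xmm2: lane 0 copied onto itself, so nothing changes.
xmm2 = [1.0, 7.0, 8.0, 9.0]
same = insertps(xmm2, xmm2, 0)

# insertps $16, %xmm3, %xmm2: lane 0 of the source lands in lane 1.
xmm3 = [2.0, 0.0, 0.0, 0.0]
packed = insertps(xmm2, xmm3, 16)
```

Under this model, the `$0` form with identical source and destination is a no-op, and with different registers it only moves lane 0, which a plain scalar move could also do.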
I'm also concerned about the pshufd at the end; is the ABI stuff
really working the way you want it to?
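
A similar sketch for the pshufd immediate, assuming the usual two-bits-per-destination-lane encoding (the helper name is hypothetical), shows what the trailing `pshufd $1` does: it puts lane 1 of %xmm3 into lane 0 of %xmm1, so the two result elements appear to come back in separate registers.

```python
def pshufd(src, imm8):
    """Model PSHUFD: destination lane i takes src[(imm8 >> 2*i) & 3]."""
    return [src[(imm8 >> (2 * i)) & 0x3] for i in range(4)]


xmm3 = [10.0, 20.0, 30.0, 40.0]   # lanes 0 and 1 hold the <2 x float> sum
xmm1 = pshufd(xmm3, 1)            # pshufd $1, %xmm3, %xmm1
```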
-Eli