[LLVMdev] Codegen for vector float->double cast fails on x86 above SSE3

Jonathan Ragan-Kelley jrk at csail.mit.edu
Wed Dec 28 10:42:22 PST 2011


I've isolated a bug in SSE codegen to the attached example.

	define void @f(<2 x float>* %in, <2 x double>* %out) {
	entry:
	  %0 = load <2 x float>* %in, align 8
	  %1 = fpext <2 x float> %0 to <2 x double>
	  store <2 x double> %1, <2 x double>* %out, align 1
	  ret void
	}

The code should load a <2 x float> vector from %in, extend it to
<2 x double> with fpext, and store the result to %out with an
unaligned store (movupd). This works as expected on earlier SSE
targets; llc -mcpu=core2 generates:


	movss	(%rdi), %xmm1
	movss	4(%rdi), %xmm0
	cvtss2sd	%xmm0, %xmm0
	cvtss2sd	%xmm1, %xmm1
	unpcklpd	%xmm0, %xmm1    ## xmm1 = xmm1[0],xmm0[0]
	movupd	%xmm1, (%rsi)
	ret

That is: load both floats, convert each to double (cvtss2sd), pack the
two results into one vector (unpcklpd), and store.
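
For comparison, here is a minimal C sketch of the semantics I expect
from @f. It is only a reference model, not anything LLVM emits; the
name f_reference is made up for illustration, and memcpy stands in for
the loads and the unaligned store.

```c
#include <string.h>

/* Hypothetical reference model of @f: load two floats, widen each one
   to double (what cvtss2sd does per element), and store the pair
   without assuming any alignment of the destination. */
void f_reference(const void *in, void *out) {
    float src[2];
    double dst[2];

    memcpy(src, in, sizeof src);    /* load <2 x float> */
    dst[0] = (double)src[0];        /* fpext, lane 0 */
    dst[1] = (double)src[1];        /* fpext, lane 1 */
    memcpy(out, dst, sizeof dst);   /* unaligned store of <2 x double> */
}
```

Given the inputs 1.5f and -2.25f, this writes the doubles 1.5 and
-2.25, since widening float to double is exact.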

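But with llc -mcpu=penryn or newer (SSE4.1 and up), it yields nonsense.
The float bits are shuffled into place and stored as-is, with no
float-to-double conversion anywhere: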

	movq	(%rdi), %xmm0
	pshufd	$16, %xmm0, %xmm0       ## xmm0 = xmm0[0,0,1,0]
	movdqu	%xmm0, (%rsi)
	ret
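
To make the failure concrete, here is a rough C emulation of what that
three-instruction sequence does to memory. This is my reading of the
asm above, sketched under the assumption of a little-endian target with
IEEE-754 floats; the name f_penryn_emulated is made up for
illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical emulation of the -mcpu=penryn output, treating the
   XMM register as four 32-bit lanes. */
void f_penryn_emulated(const void *in, void *out) {
    uint32_t lane[4] = {0, 0, 0, 0};
    uint32_t shuf[4];

    /* movq (%rdi), %xmm0: load 64 bits into the low half; the upper
       half of the register is zeroed. */
    memcpy(lane, in, 8);

    /* pshufd $16: immediate 0b00010000 selects source lanes 0,0,1,0. */
    shuf[0] = lane[0];
    shuf[1] = lane[0];
    shuf[2] = lane[1];
    shuf[3] = lane[0];

    /* movdqu %xmm0, (%rsi): store all 16 bytes, unaligned. */
    memcpy(out, shuf, 16);
}
```

With inputs 1.0f (0x3F800000) and 2.0f (0x40000000), the two stored
64-bit values are 0x3F8000003F800000 and 0x3F80000040000000. Read as
doubles, neither is 1.0 or 2.0: the single-precision bit patterns were
copied around, never converted.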

Attachments:
	vec_cast.ll (406 bytes)
	<http://lists.llvm.org/pipermail/llvm-dev/attachments/20111228/3a33b948/attachment.obj>
	vec_cast.sse3.s (368 bytes)
	<http://lists.llvm.org/pipermail/llvm-dev/attachments/20111228/3a33b948/attachment-0001.obj>
	vec_cast.sse4.s (303 bytes)
	<http://lists.llvm.org/pipermail/llvm-dev/attachments/20111228/3a33b948/attachment-0002.obj>

