[PATCH] [X86][SSE] Vectorize v2i32 to v2f64 conversions

Simon Pilgrim llvm-dev at redking.me.uk
Mon Jun 15 14:49:43 PDT 2015

Thanks guys for the reviews.

Andrea - I did investigate TableGen / ISel pattern approaches but couldn't get anything to work: it doesn't like the fact that v2i32 isn't a legal type. I believe that's why the X86ISD::VFPEXT and X86ISD::VFPROUND node types were added in a similar manner. I considered giving the node a more general name such as 'SINT_TO_FPEXT', but given that cvtdq2pd appears to be the only user of this pattern it didn't seem necessary.
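
For anyone following along, here is a rough C sketch of what the conversion does at the instruction level: cvtdq2pd takes the two low i32 lanes of an xmm register and produces two doubles. The helper name convert_low2_i32_to_f64 is made up for illustration and is not part of the patch. The scalar fallback is only there so the sketch builds on targets without SSE2.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical helper (not from the patch): converts the two lowest
 * signed 32-bit lanes of a 4-lane input to two doubles. The upper
 * two input lanes are ignored, as with cvtdq2pd. */
void convert_low2_i32_to_f64(const int32_t in[4], double out[2]) {
#if defined(__SSE2__)
    /* Unaligned load of all four i32 lanes into an xmm register. */
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    /* _mm_cvtepi32_pd maps to cvtdq2pd: low two i32 -> two f64. */
    __m128d d = _mm_cvtepi32_pd(v);
    _mm_storeu_pd(out, d);
#else
    /* Scalar equivalent of the same conversion. */
    out[0] = (double)in[0];
    out[1] = (double)in[1];
#endif
}
```

Every i32 value is exactly representable as a double, so the vector and scalar paths should give identical results.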

Quentin - yes, the MMX instructions (CVTPI2PD and CVTPI2PS) could work in a similar fashion. Are we actively adding MMX/3DNow! lowering? I always thought they were just hidden behind their builtin intrinsics.

I'll add the vectorization costs (plus a test) as part of the submission, or possibly as a followup.


Comment at: test/CodeGen/X86/vec_int_to_fp.ll:11
@@ -10,3 +10,3 @@
 ; SSE2:       # BB#0:
 ; SSE2-NEXT:    movd %xmm0, %rax
 ; SSE2-NEXT:    cvtsi2sdq %rax, %xmm1
andreadb wrote:
> I know that this is unrelated to your patch, but I noticed that on SSE2, this i64 extract element has been expanded to 'movd'. Shouldn't this be a 'movq' instead?
This has come up before - I was sure we filed a Bugzilla entry for this but I can't find it (Sanjay, can you remember?).

Comment at: test/CodeGen/X86/vec_int_to_fp.ll:26
@@ -25,3 +25,3 @@
 ; AVX-NEXT:    vmovq %xmm0, %rax
 ; AVX-NEXT:    vxorps %xmm0, %xmm0, %xmm0
 ; AVX-NEXT:    vcvtsi2sdq %rax, %xmm0, %xmm0
andreadb wrote:
> Again, this is unrelated to your patch, but this vxorps seems redundant. I haven't looked at the code, but I suspect that this may be caused by a sub-optimal build_vector lowering.
I'll see if I can find out what's causing it - it's odd that xmm0 had just been used in exactly the same type of instruction without clearing the upper bits.


