[PATCH] D23808: [X86][SSE] Add lowering to cvttpd2dq/cvttps2dq for sitofp v2f64/2f32 to 2i32

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 17 02:34:12 PDT 2016

RKSimon added inline comments.

Comment at: test/CodeGen/X86/vec_fp_to_int.ll:102
+; SSE-NEXT:    cvttpd2dq %xmm0, %xmm1
+; SSE-NEXT:    cvttpd2dq %xmm0, %xmm0
 ; SSE-NEXT:    punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
craig.topper wrote:
> Why does this test case end up with 2 cvttpd2dq instructions that need to be unpacked?
It's demonstrating that we're not propagating the undef nature of the upper <2 x double> from the shuffle in the test. On SSE this results in an unnecessary extra cvttpd2dq, and on AVX it results in a vcvttpd2dqy instead of just vcvttpd2dq. This makes a difference on Jaguar, which has 128-bit ALUs, but it is a separate fix from this patch.
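
For reference, a minimal IR sketch of the kind of pattern being discussed. It assumes the test widens a <2 x double> argument with a shuffle whose upper two lanes are undef and then converts the result with fptosi. The function name and the exact shuffle mask are illustrative assumptions, not copied from vec_fp_to_int.ll:

```llvm
; Hypothetical reduction of the pattern; names and mask are assumptions.
define <4 x i32> @fptosi_upper_undef_sketch(<2 x double> %a) {
  ; Widen to <4 x double>; lanes 2 and 3 are undef, so only lanes 0 and 1
  ; carry defined values.
  %ext = shufflevector <2 x double> %a, <2 x double> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
  ; Convert all four lanes. If the undef upper half were propagated, the
  ; backend would only need to convert the lower <2 x double>.
  %cvt = fptosi <4 x double> %ext to <4 x i32>
  ret <4 x i32> %cvt
}
```

Because lanes 2 and 3 of the result are undef, a single 128-bit cvttpd2dq on the lower half should be enough here. The second conversion and the punpcklqdq in the CHECK lines above only rebuild lanes that nothing depends on.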


