[PATCH] D41062: [X86] Legalize v2i32 via widening rather than promoting
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 23 05:52:49 PST 2018
RKSimon added a comment.
Some minor comments below, but I'm not well placed to judge the calling-convention side or what issues we might encounter there with this change.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:25059
- if (ExperimentalVectorWideningLegalization) {
+ if (ExperimentalVectorWideningLegalization || DstVT == MVT::v2i32) {
// If we are legalizing vectors by widening, we already have the desired
----------------
Is there a helper function we should be using instead of this?
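For reference, a minimal sketch of the kind of helper I mean (the name useWideningLegalizationFor is made up, not an existing LLVM function; this assumes it sits in X86ISelLowering.cpp next to the existing ExperimentalVectorWideningLegalization cl::opt):

  static bool useWideningLegalizationFor(MVT VT) {
    // Widen instead of promoting either globally (under the
    // experimental flag) or for the v2i32 special case this
    // patch introduces.
    return ExperimentalVectorWideningLegalization || VT == MVT::v2i32;
  }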
================
Comment at: test/Analysis/CostModel/X86/sitofp.ll:73
- ; SSE2: cost of 20 {{.*}} sitofp <2 x i32>
+ ; SSE2: cost of 40 {{.*}} sitofp <2 x i32>
; AVX1: cost of 4 {{.*}} sitofp <2 x i32>
----------------
What happened here? The SSE2 cost jumped from 20 to 40 - this is way out!
================
Comment at: test/Analysis/CostModel/X86/uitofp.ll:73
- ; SSE2: cost of 20 {{.*}} uitofp <2 x i32>
+ ; SSE2: cost of 40 {{.*}} uitofp <2 x i32>
; AVX1: cost of 6 {{.*}} uitofp <2 x i32>
----------------
Again, way too high - the uitofp cost shouldn't double either.
================
Comment at: test/CodeGen/X86/2012-01-18-vbitcast.ll:7
+; CHECK: # %bb.0:
+; CHECK-NEXT: movdqa (%rcx), %xmm0
+; CHECK-NEXT: psubd (%rdx), %xmm0
----------------
I think this is OK, but it still makes me nervous. We go from accessing 64 bits to 128 bits per argument, so each load now reads 8 bytes the IR never touched.
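To make the worry concrete, here is a POSIX-only sketch (everything below is illustrative, not from the patch): with an unaligned widened load, the extra 8 bytes can land on an unmapped page. An aligned movdqa can't cross a page boundary, but sanitizers would still flag the over-read.

  #include <sys/mman.h>
  #include <cstddef>
  #include <cstdint>
  #include <cstdio>

  int main() {
    const size_t Page = 4096;
    // Reserve two pages, then make the second inaccessible so any
    // access into it faults.
    char *Mem = static_cast<char *>(
        mmap(nullptr, 2 * Page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
    mprotect(Mem + Page, Page, PROT_NONE);
    // Place the 8-byte v2i32 payload flush against the page boundary.
    uint32_t *V = reinterpret_cast<uint32_t *>(Mem + Page - 8);
    V[0] = 1;
    V[1] = 2; // the original 64-bit access stays on the first page
    printf("%u %u\n", V[0], V[1]);
    // An unaligned 128-bit load starting at V would span the boundary
    // and touch the PROT_NONE page -> SIGSEGV.
    return 0;
  }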
================
Comment at: test/CodeGen/X86/avx2-masked-gather.ll:772
+; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
+; X86-NEXT: vmovd {{.*#+}} xmm2 = mem[0],zero,zero,zero
+; X86-NEXT: vpinsrd $1, 4(%eax), %xmm2, %xmm2
----------------
zvi wrote:
> Any way to easily fix vmovd+vpinsrd -> vmovq?
Yes - why didn't EltsFromConsecutiveLoads convert this to an i64 VZEXT_LOAD (VMOVQ)?
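For illustration, the equivalence being asked for, written with SSE intrinsics (sketch only; the function names are mine, and _mm_insert_epi32 needs -msse4.1):

  #include <immintrin.h>
  #include <cstdint>

  // What the test currently shows: two 32-bit loads.
  __m128i load_v2i32_slow(const uint32_t *P) {
    __m128i V = _mm_cvtsi32_si128((int)P[0]);   // vmovd
    return _mm_insert_epi32(V, (int)P[1], 1);   // vpinsrd $1
  }

  // What EltsFromConsecutiveLoads ought to produce: one 64-bit
  // zero-extending load of the same two elements.
  __m128i load_v2i32_fast(const uint32_t *P) {
    return _mm_loadl_epi64((const __m128i *)P); // vmovq
  }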
================
Comment at: test/CodeGen/X86/avx2-masked-gather.ll:725
+; X86-NEXT: vzeroupper
; X86-NEXT: retl
;
----------------
Ouch - we pick up a vzeroupper here too.
================
Comment at: test/CodeGen/X86/known-signbits-vector.ll:19
+; X64-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
; X64-NEXT: vcvtdq2pd %xmm0, %xmm0
; X64-NEXT: retq
----------------
Regression - we now need an extra vpshufd to pack the elements before the vcvtdq2pd.
https://reviews.llvm.org/D41062