[PATCH] D15741: [X86] Avoid folding scalar loads into unary sse intrinsics

Tue Dec 29 10:10:09 PST 2015

spatel added a comment.

> This is consistent with the patterns we already have for the fp/int converts...

Don't we still need to fix the converts, then? For example:

  #include <xmmintrin.h>
  __m128 foo(__m128 x, int *y) { return _mm_cvtsi32_ss(x, *y); }

$ ./clang -O1 ss2si.c -S -o -

  cvtsi2ssl  (%rdi), %xmm1  <--- false dependency on xmm1?
  movss      %xmm1, %xmm0         
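
For reference, here is a minimal sketch of what the intrinsic has to produce. The helper names convert_low and lanes are made up for illustration and are not part of the patch. Lane 0 is the converted integer and lanes 1..3 pass through from x:

```c
#include <xmmintrin.h>

/* Same shape as foo() above: lane 0 comes from the converted integer,
   lanes 1..3 are copied unchanged from x. */
static __m128 convert_low(__m128 x, const int *y) {
    return _mm_cvtsi32_ss(x, *y);
}

/* Hypothetical helper: store the four lanes so they can be inspected. */
static void lanes(__m128 v, float out[4]) {
    _mm_storeu_ps(out, v);
}
```

Because the upper lanes must be preserved, cvtsi2ss merges into its destination register. Converting into a scratch register such as xmm1 therefore depends on whatever xmm1 held before, even though that value is never used.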

================
Comment at: lib/Target/X86/X86InstrSSE.td:3392
@@ +3391,3 @@
+  // We don't want to fold scalar loads into these instructions unless optimizing
+  // for size. This is because the folded instruction will have a partial register
+  // update, while the unfolded sequence will not, e.g.
----------------
These comment lines exceed 80 columns; please rewrap.

================
Comment at: lib/Target/X86/X86InstrSSE.td:3433
@@ +3432,3 @@
+  // We don't want to fold scalar loads into these instructions unless optimizing
+  // for size. This is because the folded instruction will have a partial register
+  // update, while the unfolded sequence will not, e.g.
----------------
These comment lines exceed 80 columns; please rewrap.
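
As a source-level illustration of the case the quoted comment describes, here is a hedged sketch. It assumes _mm_sqrt_ss is one of the unary intrinsics in question, and the function names are hypothetical. Whether the load is folded into the sqrtss depends on the compiler version and flags, so no assembly output is claimed here:

```c
#include <xmmintrin.h>

/* Scalar load feeding a unary SSE intrinsic. _mm_load_ss zeroes lanes 1..3,
   and _mm_sqrt_ss takes the square root of lane 0 only. */
static __m128 sqrt_of_loaded(const float *p) {
    return _mm_sqrt_ss(_mm_load_ss(p));
}

/* Hypothetical helper: store the four lanes so they can be inspected. */
static void store_lanes(__m128 v, float out[4]) {
    _mm_storeu_ps(out, v);
}
```

Compiling this with and without -Os and comparing the -S output is one way to check whether the load stays separate from the sqrtss.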

http://reviews.llvm.org/D15741