I wonder whether folding of the scalar intrinsics is entirely correct. Those intrinsics use VR128 as their register class, but the memory form only loads 32 or 64 bits. I think the folding code uses the register class size to determine the load size. Adding Jakob for comment.<br>
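<br>
To make the concern concrete, here is a minimal sketch. Everything in it is hypothetical: FoldCandidate, assumedLoadBytes, overestimatesLoad and countOverestimates are names made up for illustration, and this is not LLVM's actual folding code. It only models the assumption that a folder derives the load size from the register class, and flags the cases where that size is larger than what the memory form of the instruction really reads.<br>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model for illustration only; not LLVM's real data structures.
struct FoldCandidate {
  const char *Name;        // register-form opcode name
  unsigned RegClassBytes;  // size implied by the operand's register class (VR128 = 16)
  unsigned MemAccessBytes; // bytes the memory form of the instruction actually reads
};

// The size a folder would assume if it consulted only the register class.
inline unsigned assumedLoadBytes(const FoldCandidate &C) {
  return C.RegClassBytes;
}

// True when the register-class-derived size is larger than the real access,
// i.e. the folder would treat a 32/64-bit load as if it were a 128-bit load.
inline bool overestimatesLoad(const FoldCandidate &C) {
  return assumedLoadBytes(C) > C.MemAccessBytes;
}

// Counts the candidates whose assumed size exceeds the real access size.
inline std::size_t countOverestimates(const FoldCandidate *Begin, std::size_t N) {
  std::size_t Count = 0;
  for (std::size_t I = 0; I != N; ++I)
    if (overestimatesLoad(Begin[I]))
      ++Count;
  return Count;
}

static const FoldCandidate Examples[] = {
    {"Int_CVTSS2SDrr", 16, 4}, // scalar float source: 32-bit load
    {"Int_CVTSD2SSrr", 16, 8}, // scalar double source: 64-bit load
    {"MAXPDrr", 16, 16},       // packed op: full 128-bit load
};
```

In this toy model the two scalar conversions are flagged and the packed MAXPD is not, which is the mismatch I am worried about.<br>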
<br>~Craig<br><br><div class="gmail_quote">On Mon, Aug 13, 2012 at 11:29 AM, Manman Ren <span dir="ltr"><<a href="mailto:mren@apple.com" target="_blank">mren@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Author: mren<br>
Date: Mon Aug 13 13:29:41 2012<br>
New Revision: 161769<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=161769&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=161769&view=rev</a><br>
Log:<br>
X86: move Int_CVTSD2SSrr, Int_CVTSI2SSrr, Int_CVTSI2SDrr, Int_CVTSS2SDrr from<br>
OpTbl1 to OpTbl2 since they have 3 operands and the last operand can be changed<br>
to a memory operand.<br>
<br>
PR13576<br>
<br>
Modified:<br>
    llvm/trunk/lib/Target/X86/X86InstrInfo.cpp<br>
    llvm/trunk/test/CodeGen/X86/vec_ss_load_fold.ll<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=161769&r1=161768&r2=161769&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=161769&r1=161768&r2=161769&view=diff</a><br>

==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86InstrInfo.cpp (original)<br>
+++ llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Mon Aug 13 13:29:41 2012<br>
@@ -414,12 +414,6 @@<br>
     { X86::CVTSD2SIrr,      X86::CVTSD2SIrm,          0 },<br>
     { X86::CVTSS2SI64rr,    X86::CVTSS2SI64rm,        0 },<br>
     { X86::CVTSS2SIrr,      X86::CVTSS2SIrm,          0 },<br>
-    { X86::Int_CVTSD2SSrr,  X86::Int_CVTSD2SSrm,      0 },<br>
-    { X86::Int_CVTSI2SD64rr,X86::Int_CVTSI2SD64rm,    0 },<br>
-    { X86::Int_CVTSI2SDrr,  X86::Int_CVTSI2SDrm,      0 },<br>
-    { X86::Int_CVTSI2SS64rr,X86::Int_CVTSI2SS64rm,    0 },<br>
-    { X86::Int_CVTSI2SSrr,  X86::Int_CVTSI2SSrm,      0 },<br>
-    { X86::Int_CVTSS2SDrr,  X86::Int_CVTSS2SDrm,      0 },<br>
     { X86::CVTTPD2DQrr,     X86::CVTTPD2DQrm,         TB_ALIGN_16 },<br>
     { X86::CVTTPS2DQrr,     X86::CVTTPS2DQrm,         TB_ALIGN_16 },<br>
     { X86::Int_CVTTSD2SI64rr,X86::Int_CVTTSD2SI64rm,  0 },<br>
@@ -680,6 +674,12 @@<br>
     { X86::IMUL64rr,        X86::IMUL64rm,      0 },<br>
     { X86::Int_CMPSDrr,     X86::Int_CMPSDrm,   0 },<br>
     { X86::Int_CMPSSrr,     X86::Int_CMPSSrm,   0 },<br>
+    { X86::Int_CVTSD2SSrr,  X86::Int_CVTSD2SSrm,      0 },<br>
+    { X86::Int_CVTSI2SD64rr,X86::Int_CVTSI2SD64rm,    0 },<br>
+    { X86::Int_CVTSI2SDrr,  X86::Int_CVTSI2SDrm,      0 },<br>
+    { X86::Int_CVTSI2SS64rr,X86::Int_CVTSI2SS64rm,    0 },<br>
+    { X86::Int_CVTSI2SSrr,  X86::Int_CVTSI2SSrm,      0 },<br>
+    { X86::Int_CVTSS2SDrr,  X86::Int_CVTSS2SDrm,      0 },<br>
     { X86::MAXPDrr,         X86::MAXPDrm,       TB_ALIGN_16 },<br>
     { X86::MAXPDrr_Int,     X86::MAXPDrm_Int,   TB_ALIGN_16 },<br>
     { X86::MAXPSrr,         X86::MAXPSrm,       TB_ALIGN_16 },<br>
<br>
Modified: llvm/trunk/test/CodeGen/X86/vec_ss_load_fold.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vec_ss_load_fold.ll?rev=161769&r1=161768&r2=161769&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vec_ss_load_fold.ll?rev=161769&r1=161768&r2=161769&view=diff</a><br>

==============================================================================<br>
--- llvm/trunk/test/CodeGen/X86/vec_ss_load_fold.ll (original)<br>
+++ llvm/trunk/test/CodeGen/X86/vec_ss_load_fold.ll Mon Aug 13 13:29:41 2012<br>
@@ -70,3 +70,17 @@<br>
 ; CHECK: call<br>
 ; CHECK: roundss $4, %xmm{{.*}}, %xmm0<br>
 }<br>
+<br>
+; PR13576<br>
+define  <2 x double> @test5() nounwind uwtable readnone noinline {<br>
+entry:<br>
+  %0 = tail call <2 x double> @llvm.x86.sse2.cvtsi2sd(<2 x double> <double 4.569870e+02, double 1.233210e+02>, i32 128) nounwind readnone<br>
+  ret <2 x double> %0<br>
+; CHECK: test5:<br>
+; CHECK: movl<br>
+; CHECK: mov<br>
+; CHECK: cvtsi2sd<br>
+}<br>
+<br>
+declare <2 x double> @llvm.x86.sse2.cvtsi2sd(<2 x double>, i32) nounwind readnone<br>
<br>
<br>
</blockquote></div><br>