[PATCH] [X86][SSE] Vector integer/float conversion memory folding (cvttps2dq / cvttpd2dq)
Quentin Colombet
qcolombet at apple.com
Mon Oct 27 12:20:51 PDT 2014
Hi Simon,
Thanks for having split the patches.
See my comments inlined.
Cheers,
-Quentin
================
Comment at: lib/Target/X86/X86InstrInfo.cpp:936
@@ -931,5 +935,3 @@
{ X86::VCVTSS2SDrr, X86::VCVTSS2SDrm, 0 },
{ X86::Int_VCVTSS2SDrr, X86::Int_VCVTSS2SDrm, 0 },
{ X86::VRSQRTSSr, X86::VRSQRTSSm, 0 },
----------------
While you are fixing this kind of issue, could you double-check the opcodes in there?
All the CVTs look suspicious to me.
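To make the concern concrete, here is a minimal sketch of how I read rows like these: each one pairs a register-form opcode with the memory-form opcode it may be rewritten to, plus a flags field. The enum values and lookupFold below are made up for illustration; this is not the actual X86InstrInfo.cpp code, just the shape of the mapping that needs to be right for every row.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical opcodes for illustration; these are NOT the real X86:: enumerators.
enum Opcode : uint16_t { CVT_rr, CVT_rm, RSQRT_r, RSQRT_m, ADD_rr };

// One fold-table row: register form, memory form, flags (0 = no extra constraint).
struct FoldEntry {
  Opcode RegOp;
  Opcode MemOp;
  uint16_t Flags;
};

static const FoldEntry FoldTable[] = {
    {CVT_rr, CVT_rm, 0},
    {RSQRT_r, RSQRT_m, 0},
};

// Return the row whose register form matches RegOp, or nullptr if the
// opcode has no entry (meaning the load must not be folded).
const FoldEntry *lookupFold(Opcode RegOp) {
  for (size_t I = 0; I != sizeof(FoldTable) / sizeof(FoldTable[0]); ++I)
    if (FoldTable[I].RegOp == RegOp)
      return &FoldTable[I];
  return nullptr;
}
```

A wrong pairing in such a table does not fail to build; it only shows up as miscompiled code once a fold actually happens, which is why each row is worth checking by hand.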
================
Comment at: test/CodeGen/X86/avx1-stack-reload-folding.ll:22
@@ +21,3 @@
+ %3 = fptosi <64 x double> %1 to <64 x i32>
+ %4 = fptosi <64 x double> %2 to <64 x i32>
+ %5 = or <64 x i32> %3, %4
----------------
Could you trigger the transformation with something simpler (like load, cvt, store, with both addresses as arguments)?
Maybe by using fast-isel?
http://reviews.llvm.org/D6001