[llvm-commits] PATCH: Teach constant folding about SSE[2] conversion intrinsics

Mon Jan 10 01:18:41 PST 2011

This patch handles most of the cases from my last README entry that I've
actually seen occur in practice. I can think of at least one more really
trivial fold that we could do: (cvtsd2si (sitofp x)) -> x. However, I
don't yet have any benchmark that shows this is useful.
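
As a rough illustration of why the (cvtsd2si (sitofp x)) -> x fold should be safe: every 32-bit integer is exactly representable as a double, so converting back cannot change the value whatever the rounding mode. The sketch below is only a sanity check, not part of the patch. The helper name `roundtrip` is made up here, and the non-SSE2 branch is an assumed stand-in that relies on the value being integral.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: int -> double -> int, mirroring
 * (cvtsd2si (sitofp x)). */
static int32_t roundtrip(int32_t x) {
    double d = (double)x;               /* sitofp: exact for any int32 */
#if defined(__SSE2__)
    return _mm_cvtsd_si32(_mm_set_sd(d)); /* cvtsd2si */
#else
    /* Assumed fallback for non-x86 targets: d is integral, so a
     * truncating cast gives the same result as a rounding one. */
    return (int32_t)d;
#endif
}
```

Calling `roundtrip` on values such as 0, -1, INT32_MAX and INT32_MIN should hand back the input unchanged, which is the property the fold would rely on.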

At the very least, my examples:

#include <emmintrin.h>
int f(double x) { return _mm_cvtsd_si32(_mm_set_sd(x)); }
int g(double x) { return _mm_cvttsd_si32(_mm_set_sd(x)); }
int h() { return f(1.1) + g(2.2); }

now compile to:
define i32 @_Z1fd(double %x) nounwind readnone {
entry:
  %vecinit.i = insertelement <2 x double> undef, double %x, i32 0
  %0 = tail call i32 @llvm.x86.sse2.cvtsd2si(<2 x double> %vecinit.i) nounwind
  ret i32 %0
}

define i32 @_Z1gd(double %x) nounwind readnone {
entry:
  %conv.i = fptosi double %x to i32
  ret i32 %conv.i
}

define i32 @_Z1hv() nounwind readnone {
entry:
  ret i32 3
}

which looks pretty good to me. =]
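
For anyone wondering why g becomes a plain fptosi while f keeps the intrinsic call: cvttsd2si always truncates toward zero, which matches fptosi, whereas cvtsd2si rounds according to the current MXCSR rounding mode (round-to-nearest-even by default), so it can only be folded when the operand is a constant. The sketch below shows the difference under the default rounding mode. The names `cvt_round` and `cvt_trunc` are made up for illustration, and the non-SSE2 branch is an assumed hand-written model that is only valid for values within int range.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
/* cvtsd2si: rounds using the current rounding mode. */
static int cvt_round(double x) { return _mm_cvtsd_si32(_mm_set_sd(x)); }
/* cvttsd2si: always truncates toward zero, like fptosi. */
static int cvt_trunc(double x) { return _mm_cvttsd_si32(_mm_set_sd(x)); }
#else
/* Assumed fallback: truncating cast, as fptosi would do. */
static int cvt_trunc(double x) { return (int)x; }
/* Assumed fallback: round to nearest, ties to even. */
static int cvt_round(double x) {
    int t = (int)x;                 /* truncate toward zero */
    double frac = x - (double)t;    /* leftover fraction, sign follows x */
    if (frac > 0.5 || (frac == 0.5 && (t & 1))) return t + 1;
    if (frac < -0.5 || (frac == -0.5 && (t & 1))) return t - 1;
    return t;
}
#endif
```

With these, `cvt_round(1.1) + cvt_trunc(2.2)` is 1 + 2, matching the `ret i32 3` above, while an input like 2.7 separates the two: `cvt_round` gives 3 and `cvt_trunc` gives 2.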

One question: where is the best place to test this? I couldn't find a direct
test for ConstantFolding.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fold_cvt_intrins.patch
Type: application/octet-stream
Size: 2792 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20110110/e6802e45/attachment.obj>