[llvm-dev] Should llvm optimize 1.0 / x ?
Alexandre Bique via llvm-dev
llvm-dev at lists.llvm.org
Mon Aug 31 14:21:12 PDT 2020
Hi,
Here is a small C++ program:
vec.cc:
#include <cmath>
using v4f32 = float __attribute__((__vector_size__(16)));
v4f32 fct1(v4f32 x)
{
    return 1.0 / x;
}

v4f32 fct2(v4f32 x)
{
    return __builtin_ia32_rcpps(x);
}
Which is compiled to:
vec.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_Z4fct1Dv4_f>:
   0: c4 e2 79 18 0d 00 00    vbroadcastss 0x0(%rip),%xmm1    # 9 <_Z4fct1Dv4_f+0x9>
   7: 00 00
   9: c5 f0 5e c0             vdivps %xmm0,%xmm1,%xmm0
   d: c3                      retq
   e: 66 90                   xchg   %ax,%ax

0000000000000010 <_Z4fct2Dv4_f>:
  10: c5 f8 53 c0             vrcpps %xmm0,%xmm0
  14: c3                      retq
As you can see, 1.0 / x is not turned into vrcpps. Is it because of
precision or a missing optimization?
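
On the precision side, here is a rough sketch for measuring how far the two forms can differ. Intel documents rcpps as an approximation with relative error of at most 1.5 * 2^-12, whereas divps is correctly rounded, so the two are not interchangeable under strict IEEE semantics. The helper names below (recip_exact, recip_estimate, recip_refined, max_rel_error) are made up for this illustration and are not part of clang or LLVM. The single Newton-Raphson step is only the textbook refinement, and I have not verified here what clang actually emits under flags such as -ffast-math or -mrecip.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: _mm_rcp_ps, _mm_div_ps, ...
#include <cmath>

// All names below are hypothetical helpers for this sketch.

// Correctly rounded reciprocal; this is the divps form.
inline __m128 recip_exact(__m128 x) {
    return _mm_div_ps(_mm_set1_ps(1.0f), x);
}

// Hardware estimate; this is the rcpps form.
inline __m128 recip_estimate(__m128 x) {
    return _mm_rcp_ps(x);
}

// Estimate followed by one Newton-Raphson step: r1 = r0 * (2 - x * r0).
inline __m128 recip_refined(__m128 x) {
    __m128 r0 = _mm_rcp_ps(x);
    __m128 t  = _mm_sub_ps(_mm_set1_ps(2.0f), _mm_mul_ps(x, r0));
    return _mm_mul_ps(r0, t);
}

// Largest relative error of f against recip_exact, sweeping x over [1, 2).
// Only lane 0 is inspected; all four lanes hold the same input.
inline double max_rel_error(__m128 (*f)(__m128)) {
    double worst = 0.0;
    for (float x = 1.0f; x < 2.0f; x += 0.000123f) {
        __m128 v = _mm_set1_ps(x);
        double exact  = _mm_cvtss_f32(recip_exact(v));
        double approx = _mm_cvtss_f32(f(v));
        double err = std::fabs(approx - exact) / exact;
        if (err > worst) worst = err;
    }
    return worst;
}
```

Calling max_rel_error(recip_estimate) and max_rel_error(recip_refined) gives the worst-case figure for each variant. I would expect the first to stay within the documented 1.5 * 2^-12 bound and the second to land much closer to the vdivps result, though still not guaranteed to be bit-identical.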
Regards,
--
Alexandre Bique