[llvm-dev] Should llvm optimize 1.0 / x ?
Quentin Colombet via llvm-dev
llvm-dev at lists.llvm.org
Mon Aug 31 15:59:19 PDT 2020
Hi Alexandre,
Have you tried to compile this with fast-math enabled (`-ffast-math` https://clang.llvm.org/docs/UsersManual.html#controlling-floating-point-behavior)?
I would expect LLVM to require the `arcp` (allow reciprocal) fast-math flag before performing this optimization, since `vrcpps` only approximates the division (https://www.llvm.org/docs/LangRef.html#fast-math-flags).
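To make the precision point concrete, here is a minimal sketch (not taken from the thread) that compares the hardware reciprocal estimate with an exactly rounded divide, and then refines the estimate with one Newton-Raphson step. The helper names `recip_estimate`, `recip_refined`, and `max_rel_error` are hypothetical, and the non-SSE fallback is only there so the sketch builds on other targets. Intel documents the relative error of `rcpps` as at most 1.5 * 2^-12, which is far from the correctly rounded result that IEEE division requires.

```cpp
#include <cmath>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Hypothetical helper: approximate 1/x for four floats.
// With SSE available this uses the rcpps estimate (_mm_rcp_ps);
// otherwise it falls back to an exact divide (assumption: portability only).
inline void recip_estimate(const float in[4], float out[4]) {
#if defined(__SSE__)
    _mm_storeu_ps(out, _mm_rcp_ps(_mm_loadu_ps(in)));
#else
    for (int i = 0; i < 4; ++i) out[i] = 1.0f / in[i];
#endif
}

// Hypothetical helper: one Newton-Raphson step on top of the estimate,
// r' = r * (2 - x * r), which roughly squares the relative error.
inline void recip_refined(const float in[4], float out[4]) {
    float est[4];
    recip_estimate(in, est);
    for (int i = 0; i < 4; ++i) out[i] = est[i] * (2.0f - in[i] * est[i]);
}

// Largest per-lane relative error of approx against 1.0f / x.
inline float max_rel_error(const float in[4], const float approx[4]) {
    float worst = 0.0f;
    for (int i = 0; i < 4; ++i) {
        const float exact = 1.0f / in[i];
        worst = std::fmax(worst, std::fabs(approx[i] - exact) / std::fabs(exact));
    }
    return worst;
}
```

Calling `max_rel_error` on the output of `recip_estimate` should show an error well above float rounding error on x86, while the output of `recip_refined` should land within a few ulps. That gap is why a compiler may only substitute the estimate for `1.0 / x` when the fast-math flags permit it.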
Cheers,
-Quentin
> On Aug 31, 2020, at 2:21 PM, Alexandre Bique via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi,
>
> Here is a small C++ program:
>
> vec.cc:
>
> #include <cmath>
>
> using v4f32 = float __attribute__((__vector_size__(16)));
>
> v4f32 fct1(v4f32 x)
> {
>     return 1.0 / x;
> }
>
> v4f32 fct2(v4f32 x)
> {
>     return __builtin_ia32_rcpps(x);
> }
>
> Which is compiled to:
>
> vec.o: file format elf64-x86-64
>
>
> Disassembly of section .text:
>
> 0000000000000000 <_Z4fct1Dv4_f>:
> 0: c4 e2 79 18 0d 00 00 vbroadcastss 0x0(%rip),%xmm1 # 9 <_Z4fct1Dv4_f+0x9>
> 7: 00 00
> 9: c5 f0 5e c0 vdivps %xmm0,%xmm1,%xmm0
> d: c3 retq
> e: 66 90 xchg %ax,%ax
>
> 0000000000000010 <_Z4fct2Dv4_f>:
> 10: c5 f8 53 c0 vrcpps %xmm0,%xmm0
> 14: c3 retq
>
>
> As you can see, 1.0 / x is not turned into vrcpps. Is it because of
> precision or a missing optimization?
>
> Regards,
> --
> Alexandre Bique
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev