[llvm-bugs] [Bug 31872] New: Complex division is not optimised with -ffast-math
via llvm-bugs
llvm-bugs at lists.llvm.org
Sun Feb 5 13:52:51 PST 2017
https://llvm.org/bugs/show_bug.cgi?id=31872
Bug ID: 31872
Summary: Complex division is not optimised with -ffast-math
Product: new-bugs
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: new bugs
Assignee: unassignedbugs at nondot.org
Reporter: drraph at gmail.com
CC: llvm-bugs at lists.llvm.org
Classification: Unclassified
Consider:
#include <complex.h>
complex float f(complex float x, complex float y) {
return x/y;
}
clang trunk with -O3 -march=core-avx2 gives the following, whether or not -ffast-math is passed:
f: # @f
vmovaps xmm2, xmm1
vmovshdup xmm1, xmm0 # xmm1 = xmm0[1,1,3,3]
vmovshdup xmm3, xmm2 # xmm3 = xmm2[1,1,3,3]
jmp __divsc3 # TAILCALL
However, both gcc and ICC attempt to optimise this code when -ffast-math (or
an equivalent option) is enabled.
ICC appears to give the fastest code:
f:
vcvtps2pd xmm2, xmm1 #3.12
vcvtps2pd xmm4, xmm0 #3.12
vmulpd xmm8, xmm2, xmm2 #3.12
vunpckhpd xmm3, xmm2, xmm2 #3.12
vmulpd xmm6, xmm3, xmm4 #3.12
vmovddup xmm7, xmm2 #3.12
vshufpd xmm5, xmm4, xmm4, 1 #3.12
vshufpd xmm9, xmm8, xmm8, 1 #3.12
vfmaddsub213pd xmm7, xmm5, xmm6 #3.12
vaddpd xmm11, xmm8, xmm9 #3.12
vshufpd xmm10, xmm7, xmm7, 1 #3.12
vdivpd xmm12, xmm10, xmm11 #3.12
vcvtpd2ps xmm0, xmm12 #3.12
ret
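For reference, the ICC listing above appears to widen both operands to double, evaluate the textbook quotient (a+bi)/(c+di) = ((ac+bd) + (bc-ad)i)/(c^2+d^2), and narrow the result back to float. A minimal C sketch of that arithmetic follows, assuming no overflow, infinity or NaN handling is wanted (which is what -ffast-math permits). The name fast_cdivf is hypothetical and is not part of any library or of the compiler output.

```c
#include <complex.h>

/* Hypothetical helper: textbook complex division with double
   intermediates. It skips the range scaling and inf/NaN recovery
   that the __divsc3 runtime routine performs. */
static float complex fast_cdivf(float complex x, float complex y)
{
    double a = crealf(x), b = cimagf(x);   /* x = a + bi */
    double c = crealf(y), d = cimagf(y);   /* y = c + di */
    double denom = c * c + d * d;          /* |y|^2 */
    double re = (a * c + b * d) / denom;   /* real part of x/y */
    double im = (b * c - a * d) / denom;   /* imaginary part of x/y */
    return (float)re + (float)im * I;      /* valid for finite results */
}
```

For example, fast_cdivf(1.0f + 2.0f * I, 3.0f + 4.0f * I) evaluates (11 + 2i)/25. Using double intermediates is one way to reduce the risk of overflow in c*c + d*d for float inputs without the scaling step.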