[LLVMbugs] [Bug 20043] New: Only one version of FMA3 instruction is being generated
bugzilla-daemon at llvm.org
Sat Jun 14 16:56:57 PDT 2014
http://llvm.org/bugs/show_bug.cgi?id=20043
Bug ID: 20043
Summary: Only one version of FMA3 instruction is being generated
Product: clang
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: -New Bugs
Assignee: unassignedclangbugs at nondot.org
Reporter: chris.a.ferguson at gmail.com
CC: llvmbugs at cs.uiuc.edu
Classification: Unclassified
Given the following code:
#include <immintrin.h>

__m128 fmatest(__m128 x)
{
    return _mm_fmadd_ps(x, _mm_set1_ps(2.0f), _mm_set1_ps(-1.0f));
}
I get the following output from Clang 3.4 (using -O3 -march=core-avx2):
.LCPI0_0:
        .long   3212836864              # float -1
.LCPI0_1:
        .long   1073741824              # float 2
fmatest(float __vector(4)):             # @fmatest(float __vector(4))
        vbroadcastss    xmm2, dword ptr [rip + .LCPI0_0]
        vbroadcastss    xmm1, dword ptr [rip + .LCPI0_1]
        vfmadd213ps     xmm1, xmm0, xmm2
        vmovaps         xmm0, xmm1
        ret
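The vmovaps is needed only because vfmadd213ps leaves its result in xmm1. It
would be unnecessary if an alternate fmadd instruction were used, so that the
result landed directly in xmm0. For instance, this is what GCC 4.9 produces: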
fmatest(float __vector):
        vmovaps         xmm1, XMMWORD PTR .LC1[rip]
        vfmadd132ps     xmm0, xmm1, XMMWORD PTR .LC0[rip]
        ret
.LC0:
        .long   1073741824
        .long   1073741824
        .long   1073741824
        .long   1073741824
.LC1:
        .long   3212836864
        .long   3212836864
        .long   3212836864
        .long   3212836864
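
To make the difference between the two listings easier to follow, here is a rough scalar sketch of the three FMA3 operand orders. It is only a model of which operand is multiplied, which is added, and which register is overwritten. The function names (fmadd132, fmadd213, fmadd231, fmatest_132, fmatest_213) are made up for illustration, and plain multiply-add is used, so the single rounding of a real fused instruction is not reproduced.

```c
/* Scalar model of the FMA3 operand orders, Intel syntax: op dst, src2, src3.
 * In every form the result overwrites dst (the first operand).
 * Illustrative only: not fused, and the names are hypothetical. */

/* vfmadd132: dst = dst * src3 + src2 */
static inline float fmadd132(float dst, float src2, float src3)
{
    return dst * src3 + src2;
}

/* vfmadd213: dst = src2 * dst + src3 */
static inline float fmadd213(float dst, float src2, float src3)
{
    return src2 * dst + src3;
}

/* vfmadd231: dst = src2 * src3 + dst */
static inline float fmadd231(float dst, float src2, float src3)
{
    return src2 * src3 + dst;
}

/* GCC-style choice: x (xmm0) is dst, so the result lands in xmm0. */
static inline float fmatest_132(float x)
{
    return fmadd132(x, -1.0f, 2.0f);
}

/* Clang 3.4-style choice: the constant 2 (xmm1) is dst, so the result
 * lands in xmm1 and an extra move back to xmm0 is required. */
static inline float fmatest_213(float x)
{
    return fmadd213(2.0f, x, -1.0f);
}
```

Both wrappers compute x * 2 - 1. They differ only in which operand plays the role of dst, which is what decides whether the trailing vmovaps is needed.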