[LLVMbugs] [Bug 22428] New: Floating-point "and" not optimized on x86-64
bugzilla-daemon at llvm.org
Sun Feb 1 11:48:43 PST 2015
http://llvm.org/bugs/show_bug.cgi?id=22428
Bug ID: 22428
Summary: Floating-point "and" not optimized on x86-64
Product: new-bugs
Version: 3.5
Hardware: Macintosh
OS: MacOS X
Status: NEW
Severity: enhancement
Priority: P
Component: new bugs
Assignee: unassignedbugs at nondot.org
Reporter: schnetter at gmail.com
CC: llvmbugs at cs.uiuc.edu
Classification: Unclassified
I notice that clang does not generate "vandpd" for floating-point "and"
operations. Here is some example code that demonstrates this:
{{{
#include <math.h>
#include <string.h>

double fand1(double x)
{
    unsigned long ix;
    memcpy(&ix, &x, 8);
    ix &= 0x7fffffffffffffffUL;
    memcpy(&x, &ix, 8);
    return x;
}

double fand2(double x)
{
    return fabs(x);
}
}}}
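Note that fand1 relies on "unsigned long" being 8 bytes wide, which holds on OS X x86-64 but not on every platform. A minimal sketch of a fixed-width variant follows; the name fand1_fixed_width is hypothetical and not part of the original report, and it assumes "double" is a 64-bit IEEE 754 value:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical variant of fand1 (not in the original report).
   Assumes double is a 64-bit IEEE 754 value. */
double fand1_fixed_width(double x)
{
    uint64_t ix;

    /* Copy the bit pattern of x into a 64-bit integer. */
    memcpy(&ix, &x, sizeof ix);

    /* Clear the sign bit (bit 63). */
    ix &= UINT64_C(0x7fffffffffffffff);

    /* Copy the masked bit pattern back into the double. */
    memcpy(&x, &ix, sizeof x);
    return x;
}
```

The compile command and assembly output that follow refer to the original fand1 and fand2, not to this variant.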
When I compile this via:
{{{
clang-mp-3.5 -O3 -march=native -S fand.c -o fand-clang-3.5.s
}}}
(OS X, x86-64 CPU, Intel Core i7), this results in:
{{{
_fand1:                                 ## @fand1
        pushq   %rbp
        movq    %rsp, %rbp
        vmovq   %xmm0, %rax
        movabsq $9223372036854775807, %rcx ## imm = 0x7FFFFFFFFFFFFFFF
        andq    %rax, %rcx
        vmovq   %rcx, %xmm0
        popq    %rbp
        retq
_fand2:                                 ## @fand2
        pushq   %rbp
        movq    %rsp, %rbp
        vandpd  LCPI1_0(%rip), %xmm0, %xmm0
        popq    %rbp
        retq
}}}
This shows that (a) clang performs the bitwise "and" operation in an integer
register, moving the value out of %xmm0 and back again, which is probably
slower, while (b) the implementors of "fabs" assume that using the "vandpd"
instruction is faster.
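For comparison, the floating-point "and" can be requested explicitly with SSE2 intrinsics. This is only a sketch: the name fand3 is hypothetical and not part of the original report, the intrinsic path assumes an SSE2-capable x86 target, and whether a given compiler version actually emits "vandpd" for it has not been verified here.

```c
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical helper (not in the original report): clears the sign bit
   of x, using a packed-double AND when SSE2 is available. */
double fand3(double x)
{
#if defined(__SSE2__)
    /* Mask with every bit set except the sign bit, viewed as doubles. */
    const __m128d mask =
        _mm_castsi128_pd(_mm_set1_epi64x(0x7fffffffffffffffLL));

    /* Place x in the low lane, AND it with the mask, and extract it. */
    __m128d v = _mm_set_sd(x);
    v = _mm_and_pd(v, mask);
    return _mm_cvtsd_f64(v);
#else
    /* Fallback for targets without SSE2: integer masking as in fand1. */
    uint64_t ix;
    memcpy(&ix, &x, sizeof ix);
    ix &= UINT64_C(0x7fffffffffffffff);
    memcpy(&x, &ix, sizeof x);
    return x;
#endif
}
```

Comparing the assembly generated for fand3 with that of fand1 and fand2 would show whether the value stays in the XMM register.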