[LLVMbugs] [Bug 22428] New: Floating-point "and" not optimized on x86-64

Sun Feb 1 11:48:43 PST 2015

http://llvm.org/bugs/show_bug.cgi?id=22428

            Bug ID: 22428
           Summary: Floating-point "and" not optimized on x86-64
           Product: new-bugs
           Version: 3.5
          Hardware: Macintosh
                OS: MacOS X
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: schnetter at gmail.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

I notice that clang does not generate "vandpd" for a floating-point "and"
operation, i.e. a bitwise AND applied to the bits of a double. The following
example demonstrates this:
{{{
#include <math.h>
#include <string.h>
double fand1(double x)
{
  unsigned long ix;
  memcpy(&ix, &x, 8);
  ix &= 0x7fffffffffffffffUL;
  memcpy(&x, &ix, 8);
  return x;
}
double fand2(double x)
{
  return fabs(x);
}
}}}

When I compile this via:
{{{
clang-mp-3.5 -O3 -march=native -S fand.c -o fand-clang-3.5.s
}}}
(OS X, x86-64 CPU, Intel Core i7), this results in:
{{{
_fand1:                                 ## @fand1
    pushq    %rbp
    movq    %rsp, %rbp
    vmovq    %xmm0, %rax
    movabsq    $9223372036854775807, %rcx ## imm = 0x7FFFFFFFFFFFFFFF
    andq    %rax, %rcx
    vmovq    %rcx, %xmm0
    popq    %rbp
    retq

_fand2:                                 ## @fand2
    pushq    %rbp
    movq    %rsp, %rbp
    vandpd    LCPI1_0(%rip), %xmm0, %xmm0
    popq    %rbp
    retq
}}}

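This shows that (a) for fand1, clang moves the value to an integer register,
performs the bitwise AND there, and moves the result back to %xmm0, which is
probably slower, while (b) for the equivalent "fabs" call in fand2, it emits a
single "vandpd" on the vector register, which suggests that the implementors
of "fabs" assume "vandpd" is faster.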
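
As a sanity check that is not part of the original report, the following
minimal sketch verifies that the integer-mask approach and fabs() produce
bit-identical results, so that emitting the same instruction for both would
be valid. The names fabs_via_mask, bits_of and check_samples are hypothetical
helpers introduced here for illustration, and NaN inputs are deliberately left
out of the sample set.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a double. */
static uint64_t bits_of(double x)
{
  uint64_t bits;
  memcpy(&bits, &x, sizeof bits);
  return bits;
}

/* Same idea as fand1 in the report, written with a fixed-width integer:
   clear the sign bit (bit 63) and reinterpret the result as a double. */
static double fabs_via_mask(double x)
{
  uint64_t bits = bits_of(x) & UINT64_C(0x7fffffffffffffff);
  memcpy(&x, &bits, sizeof x);
  return x;
}

/* Compare against the library fabs() bit for bit on a few samples. */
static void check_samples(void)
{
  const double samples[] = { 0.0, -0.0, 1.5, -1.5, INFINITY, -INFINITY };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i)
    assert(bits_of(fabs_via_mask(samples[i])) == bits_of(fabs(samples[i])));
}
```

Calling check_samples() from a main function should complete without
triggering an assertion. This only checks that the results agree; it says
nothing about which instruction sequence is faster.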
