<html>
    <head>
      <base href="http://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - Floating-point &quot;and&quot; not optimized on x86-64"
   href="http://llvm.org/bugs/show_bug.cgi?id=22428">22428</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Floating-point "and" not optimized on x86-64
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>new-bugs
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>3.5
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>Macintosh
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>MacOS X
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>new bugs
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>schnetter@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvmbugs@cs.uiuc.edu
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>I notice that clang does not generate "vandpd" for a floating-point "and"
operation that is written as a bit mask. Here is example code that demonstrates this:
{{{
#include <math.h>
#include <string.h>
double fand1(double x)
{
  unsigned long ix;
  memcpy(&ix, &x, 8);
  ix &= 0x7fffffffffffffffUL;
  memcpy(&x, &ix, 8);
  return x;
}
double fand2(double x)
{
  return fabs(x);
}
}}}

When I compile this via:
{{{
clang-mp-3.5 -O3 -march=native -S fand.c -o fand-clang-3.5.s
}}}
(OS X, x86-64 CPU, Intel Core i7), this results in:
{{{
_fand1:                                 ## @fand1
    pushq    %rbp
    movq    %rsp, %rbp
    vmovq    %xmm0, %rax
    movabsq    $9223372036854775807, %rcx ## imm = 0x7FFFFFFFFFFFFFFF
    andq    %rax, %rcx
    vmovq    %rcx, %xmm0
    popq    %rbp
    retq

_fand2:                                 ## @fand2
    pushq    %rbp
    movq    %rsp, %rbp
    vandpd    LCPI1_0(%rip), %xmm0, %xmm0
    popq    %rbp
    retq
}}}

This shows that (a) clang performs the bitwise "and" in an integer register,
which needs two "vmovq" transfers plus a "movabsq" to load the mask and is
probably slower, while (b) the implementors of "fabs" assume that a single
"vandpd" with a constant-pool operand is faster.</pre>
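        <pre>The request rests on the mask in fand1 computing the same value as fabs, so
that both could lower to the same "vandpd". A minimal sketch of that check
follows. It assumes a 64-bit IEEE 754 double; the names fand_mask, same_bits
and check_sign_mask are hypothetical and not part of the report.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: clear the sign bit of a double through an
   integer round trip, the same way fand1 does. */
static double fand_mask(double x)
{
  uint64_t ix;
  memcpy(&ix, &x, sizeof ix);
  ix &= UINT64_C(0x7fffffffffffffff);
  memcpy(&x, &ix, sizeof x);
  return x;
}

/* Hypothetical helper: compare two doubles bit for bit, so that
   -0.0 and 0.0 are told apart. */
static int same_bits(double a, double b)
{
  return memcmp(&a, &b, sizeof a) == 0;
}

/* The mask yields the absolute value, including for negative zero. */
static void check_sign_mask(void)
{
  assert(fand_mask(-1.5) == 1.5);
  assert(fand_mask(2.25) == 2.25);
  assert(same_bits(fand_mask(-0.0), 0.0));
}
```

Using uint64_t and sizeof instead of "unsigned long" and the literal 8 keeps
the sketch independent of the platform's long width; the masking itself is
unchanged.</pre>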
        </div>
      </p>
    </body>
</html>