<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - [x86] codegen for fcmp oeq is inconsistent"

   href="https://bugs.llvm.org/show_bug.cgi?id=34563">34563</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>[x86] codegen for fcmp oeq is inconsistent

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>libraries

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>All

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>enhancement

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>Backend: X86

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>spatel+llvm@rotateright.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvm-bugs@lists.llvm.org

          </td>

        </tr></table>

      <p>

        <div>

        <pre>bool fcmp_oeq(double f1, double f2) {

  return f1 == f2;

}

bool fcmp_oeq_twice(double f1, double f2, double f3, double f4) {

  return f1 == f2 && f3 == f4;

}

Or as IR:

define i1 @fcmp_oeq(double %f1, double %f2) {

  %cmp = fcmp oeq double %f1, %f2

  ret i1 %cmp

}

define i1 @fcmp_oeq_twice(double %f1, double %f2, double %f3, double %f4) {

  %cmp1 = fcmp oeq double %f1, %f2

  %cmp2 = fcmp oeq double %f3, %f4

  %and = and i1 %cmp1, %cmp2

  ret i1 %and

}

----------------------------------------------------------------------------

$ ./llc -o - -mtriple=x86_64-unknown-unknown fcmps.ll

fcmp_oeq(double, double):                          # @fcmp_oeq(double, double)

        cmpeqsd %xmm1, %xmm0

        movq    %xmm0, %rax

        andl    $1, %eax

        retq

fcmp_oeq_twice(double, double, double, double):    # @fcmp_oeq_twice(double, double, double, double)

        ucomisd %xmm1, %xmm0

        setnp   %al

        sete    %cl

        andb    %al, %cl

        ucomisd %xmm3, %xmm2

        setnp   %dl

        sete    %al

        andb    %dl, %al

        andb    %cl, %al

        retq

-----------------------------------------------------------------------------

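x86 doesn't have a single 'setcc' condition for oeq (?!): after 'ucomisd',

equality (ZF=1) also has to exclude the unordered case (PF=0), so we have to

do an and-of-setcc ('sete' and 'setnp') to generate that predicate. If we use

'cmpeqsd' as in the first example, we instead incur a vector-to-scalar

register move ('movq %xmm0, %rax'). It is not clear that this is as fast.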
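
To make the predicate concrete, here is a minimal sketch in C that models the
flags 'ucomisd' writes (unordered: ZF=PF=CF=1; equal: ZF=1 only) and shows why
'sete' alone is not enough. The struct and function names are hypothetical and
exist only for this illustration; this is not code from LLVM.

```c
/* Hypothetical model of the EFLAGS bits that 'ucomisd' writes. */
struct flags { int zf, pf, cf; };

/* ucomisd: unordered sets ZF=PF=CF=1; equal sets only ZF;
   less-than sets only CF; greater-than clears all three. */
static struct flags ucomisd_model(double a, double b) {
    struct flags f = {0, 0, 0};
    if (a != a || b != b) {          /* either operand is NaN: unordered */
        f.zf = f.pf = f.cf = 1;
    } else if (a == b) {
        f.zf = 1;
    } else if (b > a) {              /* a is less than b */
        f.cf = 1;
    }
    return f;
}

/* 'sete' alone reads ZF, which is also set for unordered inputs. */
static int sete_only(double a, double b) {
    return ucomisd_model(a, b).zf;
}

/* oeq needs ZF=1 and PF=0: the sete/setnp/andb pattern. */
static int fcmp_oeq_model(double a, double b) {
    struct flags f = ucomisd_model(a, b);
    return f.zf & !f.pf;
}
```

With a NaN operand, sete_only() returns 1 while fcmp_oeq_model() returns 0,
which is the case the extra 'setnp' guards against.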

The inconsistency here should be investigated. But it's also possible that

we're doing the wrong thing for both cases. In the 2nd example if we use

'cmpeqsd', then we could reduce the instruction count with something like:

        cmpeqsd %xmm1, %xmm0

        cmpeqsd %xmm3, %xmm2

        andps   %xmm2, %xmm0

        movd    %xmm0, %eax

        andl    $1, %eax</pre>
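        <pre>The mask-based idea can be sketched in portable C as a rough model, assuming
'cmpeqsd' writes an all-ones 64-bit lane for ordered-equal operands and zero
otherwise. The function names are hypothetical; this only illustrates the
data flow (two compares, one AND, one move to a GPR, one mask of bit 0) and
says nothing about which sequence is faster.

```c
/* Model of the low 64-bit lane that 'cmpeqsd' writes: all ones when the
   operands are ordered and equal, all zeros otherwise (including NaN). */
static unsigned long long cmpeqsd_model(double a, double b) {
    return (a == b) ? ~0ULL : 0ULL;
}

/* Two mask-producing compares, one bitwise AND, then keep bit 0. */
static int fcmp_oeq_twice_model(double f1, double f2, double f3, double f4) {
    unsigned long long m1 = cmpeqsd_model(f1, f2);
    unsigned long long m2 = cmpeqsd_model(f3, f4);
    unsigned long long both = m1 & m2;   /* models andps */
    return (int)(both & 1ULL);           /* models movd + andl $1 */
}
```

Because the compare result is already a mask, the NaN case needs no separate
parity check: a NaN operand yields a zero mask, and the AND stays zero.</pre>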

        </div>

      </p>


    </body>

</html>