<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - [powerpc] missed CSE opportunity in DAG?"
   href="https://llvm.org/bugs/show_bug.cgi?id=24363">24363</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>[powerpc] missed CSE opportunity in DAG?
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: PowerPC
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>spatel+llvm@rotateright.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>While investigating fast-math-flags propagation in the DAG (see r244053), I
noticed that PPC ends up with different code than AArch64 or x86 for this
example:

  define float @fmf(float %a, float %b) {
    %mul1 = fmul fast float %a, %b
    %nega = fsub fast float 0.0, %a
    %mul2 = fmul fast float %nega, %b
    %abx2 = fsub fast float %mul1, %mul2
    ret float %abx2
  }

$ ./llc -o - badflags.ll -march=ppc64 -enable-unsafe-fp-math
...
    fmuls f0, f1, f2
    fmadds f1, f1, f2, f0
    blr

--------------------------------------------------------------------------------

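The 'fmadds' recomputes the f1 * f2 product that the 'fmuls' already left in f0.
Since %mul2 is just the negation of %mul1, the final fsub reduces to
%mul1 + %mul1, so I was expecting to see an 'fadds f1, f0, f0' instead of the
'fmadds' (fewer registers used; an 'add' is potentially faster to execute than
an 'fma').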
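
For reference, here is a hand-simplified sketch of the IR that the expected code
corresponds to. It is written by hand for illustration, not produced by any
pass, and the function name @fmf_expected is hypothetical. It assumes the 'fast'
flags allow treating (0.0 - %a) * %b as -(%a * %b), so that
%mul1 - %mul2 becomes %mul1 + %mul1:

```llvm
; Hand-written illustration only (hypothetical name, not compiler output).
define float @fmf_expected(float %a, float %b) {
  ; the single multiply shared by both uses
  %mul = fmul fast float %a, %b
  ; mul1 - (-mul1) folded to mul + mul
  %abx2 = fadd fast float %mul, %mul
  ret float %abx2
}
```

If the DAG reached this form on PPC, the natural lowering would presumably be an
fmuls followed by an fadds; that has not been checked against llc output here.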

AArch64 does this:
$ ./llc -o - badflags.ll -march=aarch64 -enable-unsafe-fp-math
...
    fmul    s0, s0, s1
    fadd    s0, s0, s0
    ret</pre>
        </div>
      </p>
    </body>
</html>