<html>

    <head>

      <base href="http://llvm.org/bugs/" />

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW --- - Missed optimization in CMPXCHG-loop"

   href="http://llvm.org/bugs/show_bug.cgi?id=16429">16429</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>Missed optimization in CMPXCHG-loop

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>new-bugs

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>All

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>enhancement

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>new bugs

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>hammacher@cs.uni-saarland.de

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvmbugs@cs.uiuc.edu

          </td>

        </tr>

        <tr>

          <th>Classification</th>

          <td>Unclassified

          </td>

        </tr></table>

      <p>

        <div>

        <pre>Created <span class=""><a href="attachment.cgi?id=10749" name="attach_10749" title="bitcode using atomicrmw and cmpxchg-loop">attachment 10749</a> <a href="attachment.cgi?id=10749&action=edit" title="bitcode using atomicrmw and cmpxchg-loop">[details]</a></span>

bitcode using atomicrmw and cmpxchg-loop

An atomicrmw whose operation is neither "add" nor "sub" gets translated to a

cmpxchg-loop on x86, since there is no single hardware instruction for doing

that.

If you write that loop by hand, however, the generated assembly is longer and

uses one more register.

The backend seems to miss the fact that cmpxchg already sets ZF on success,

and hence emits an unnecessary cmp.
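
As a rough illustration (a simplified, non-atomic model written in Rust, not LLVM code; the function names are made up for this sketch), the flag that cmpxchg produces carries the same information as comparing the saved expected value against the accumulator afterwards:

```rust
// Simplified, non-atomic model of x86 `cmpxchg src, mem` with the
// accumulator eax. Illustration only; the names are hypothetical.
// Returns (memory after, eax after, zero flag).
pub fn cmpxchg_model(mem: u32, eax: u32, src: u32) -> (u32, u32, bool) {
    if mem == eax {
        // Success: src is stored to memory and ZF is set.
        (src, eax, true)
    } else {
        // Failure: eax receives the current memory value and ZF is cleared.
        (mem, mem, false)
    }
}

// The extra check: keep a copy of the expected value, run cmpxchg, then
// compare the copy against eax.
pub fn explicit_compare(mem: u32, eax: u32, src: u32) -> bool {
    let saved = eax;
    let (_, eax_after, _) = cmpxchg_model(mem, eax, src);
    saved == eax_after
}
```

In this model explicit_compare never disagrees with the flag returned by cmpxchg_model, which is why a separate cmp after cmpxchg should be redundant.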

I attached a bitcode file containing two functions. "bar" uses an atomicrmw

instruction, "baz" a cmpxchg-loop. "bar" results in optimal assembly, while

"baz" contains an additional cmp plus several movs, and uses one more register.
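
The attachment itself is not reproduced in this mail. As a source-level sketch of what the two functions presumably do (assumptions: the operation is a 32-bit "and" on a global, the old value is returned, and the ordering is sequentially consistent; FOO and the Rust rendering are hypothetical stand-ins for the attached bitcode), they would look roughly like this:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Hypothetical stand-in for the global that the assembly calls L_foo.
static FOO: AtomicU32 = AtomicU32::new(0);

// "bar": one atomic read-modify-write, the source-level counterpart of
// `atomicrmw and`. Returns the value FOO held before the update.
pub fn bar(mask: u32) -> u32 {
    FOO.fetch_and(mask, Ordering::SeqCst)
}

// "baz": the same update written as an explicit compare-exchange loop.
pub fn baz(mask: u32) -> u32 {
    let mut old = FOO.load(Ordering::SeqCst);
    loop {
        match FOO.compare_exchange(
            old,
            old & mask,
            Ordering::SeqCst,
            Ordering::SeqCst,
        ) {
            // Success: FOO held `old` and now holds `old & mask`.
            Ok(previous) => return previous,
            // Failure: another thread changed FOO; retry with what it holds now.
            Err(current) => old = current,
        }
    }
}
```

Both functions have the same observable behaviour, so ideally they would also compile to the same loop.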

I generate assembly using "llc -o - test.ll".

This is the non-optimal part of the assembly generated for "baz":

        [...]

        movl    L_foo(%rip), %eax

LBB1_1:                                 ## %loop

 (1)    movl    %eax, %ecx

 (2)    movl    %ecx, %edx

        andl    %edi, %edx

        lock

        cmpxchgl        %edx, L_foo(%rip)

 (3)    cmpl    %eax, %ecx

        jne     LBB1_1

        [...]

Lines (2) and (3) can be dropped entirely if (1) copies to %edx directly:

cmpxchg already leaves the result of its comparison in ZF, so the jne can

branch on it without the cmp. This also frees register %ecx and matches the

code generated for the atomicrmw instruction.</pre>

        </div>

      </p>


    </body>

</html>