<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - atomic cmpxchg release acquire doesn't reject or honor failure memory ordering"
   href="https://bugs.llvm.org/show_bug.cgi?id=33332">33332</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>atomic cmpxchg release acquire doesn't reject or honor failure memory ordering
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Common Code Generator Code
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>gberry@codeaurora.org
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Currently the AtomicExpandPass will lower the following IR:

define i1 @foo(i32* %obj, i32 %old, i32 %new) {
entry:
  %v0 = cmpxchg weak volatile i32* %obj, i32 %old, i32 %new release acquire
  %v1 = extractvalue { i32, i1 } %v0, 1
  ret i1 %v1
}

to the equivalent of the following on AArch64:

    ldxr    w8, [x0]
    cmp     w8, w1
    b.ne    .LBB0_3
// BB#1:                                // %cmpxchg.trystore
    stlxr   w8, w2, [x0]
    cbz     w8, .LBB0_4
// BB#2:                                // %cmpxchg.failure
    mov     w0, wzr
    ret
.LBB0_3:                                // %cmpxchg.nostore
    clrex
    mov     w0, wzr
    ret
.LBB0_4:
    orr     w0, wzr, #0x1
    ret
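
A source-level reproducer along these lines should produce the IR above (a
hypothetical sketch, assuming clang maps the __atomic builtin success and
failure orderings straight onto the cmpxchg orderings):

// Weak, volatile compare-and-swap: release on success, acquire on failure.
bool foo(volatile int *obj, int old_val, int new_val) {
  return __atomic_compare_exchange_n(obj, &old_val, new_val,
                                     /*weak=*/true,
                                     __ATOMIC_RELEASE,   // success ordering
                                     __ATOMIC_ACQUIRE);  // failure ordering
}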

GCC instead generates an ldaxr for the initial load, which seems more correct
to me since it honors the requested failure-case acquire ordering. I'd like to
get other opinions on this before filing a bug.
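
To make the concern concrete, here is a hypothetical message-passing pattern
(mine, not from the report) where the failure-case acquire is observable: if
the second thread's CAS loses the race, it is supposed to act as an acquire
load of the value installed by the first thread, which a plain ldxr does not
provide.

int payload;
int slot;  // 0 = free, otherwise the id of the thread that claimed it

// Thread A: publish the payload, then claim the slot with a release CAS.
void publisher() {
  payload = 42;
  int expected = 0;
  __atomic_compare_exchange_n(&slot, &expected, /*desired=*/1,
                              /*weak=*/false,
                              __ATOMIC_RELEASE, __ATOMIC_RELAXED);
}

// Thread B: try to claim the slot itself; if the CAS fails, the acquire
// failure ordering is what makes reading the payload published by A safe.
int consumer() {
  int expected = 0;
  if (!__atomic_compare_exchange_n(&slot, &expected, /*desired=*/2,
                                   /*weak=*/false,
                                   __ATOMIC_RELEASE, __ATOMIC_ACQUIRE))
    return payload;  // relies on the acquire on the failure path
  return 0;
}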

I believe the code in AtomicExpand::expandAtomicCmpXchg() is responsible for
this discrepancy, since it only uses the failure-case memory ordering for
targets that use fences (i.e., when TLI->shouldInsertFencesForAtomic(CI) is
true).
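
If the right fix is to honor the failure ordering rather than reject the
combination, I would expect the load-linked ordering to be derived from both
orderings. A rough sketch of the idea (mine, not the in-tree code; it only
distinguishes whether an acquire component is needed):

#include "llvm/Support/AtomicOrdering.h"

using llvm::AtomicOrdering;

// Simplified: the store-conditional carries any release semantics, so the
// load-linked needs acquire semantics whenever either the success or the
// failure ordering asks for them.
static AtomicOrdering loadLinkedOrdering(AtomicOrdering Success,
                                         AtomicOrdering Failure) {
  if (llvm::isAcquireOrStronger(Success) || llvm::isAcquireOrStronger(Failure))
    return AtomicOrdering::Acquire;
  return AtomicOrdering::Monotonic;
}
</pre>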
        </div>
      </p>


      <hr>
      <span>You are receiving this mail because:</span>

      <ul>
          <li>You are on the CC list for the bug.</li>
      </ul>
    </body>
</html>