[llvm-dev] [atomics][AArch64] Possible bug in cmpxchg lowering

Geoff Berry via llvm-dev llvm-dev at lists.llvm.org
Tue May 30 15:29:05 PDT 2017


Currently, AtomicExpandPass lowers the following IR:

define i1 @foo(i32* %obj, i32 %old, i32 %new) {
entry:
   %v0 = cmpxchg weak volatile i32* %obj, i32 %old, i32 %new release acquire
   %v1 = extractvalue { i32, i1 } %v0, 1
   ret i1 %v1
}

to the equivalent of the following on AArch64:

     ldxr    w8, [x0]
     cmp     w8, w1
     b.ne    .LBB0_3
// BB#1:                                // %cmpxchg.trystore
     stlxr   w8, w2, [x0]
     cbz     w8, .LBB0_4
// BB#2:                                // %cmpxchg.failure
     mov     w0, wzr
     ret
.LBB0_3:                                // %cmpxchg.nostore
     clrex
     mov     w0, wzr
     ret
.LBB0_4:
     orr     w0, wzr, #0x1
     ret
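
For reference, the IR above is roughly what you would get from the following C++ (my paraphrase; I'm assuming the volatile marker on the cmpxchg corresponds to a volatile std::atomic):

#include <atomic>

bool foo(volatile std::atomic<int> *obj, int old_val, int new_val) {
    // Success ordering: release; failure ordering: acquire.
    return obj->compare_exchange_weak(old_val, new_val,
                                      std::memory_order_release,
                                      std::memory_order_acquire);
}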

GCC instead generates an ldaxr for the initial exclusive load, which seems more 
correct to me since it honors the requested failure-case acquire 
ordering.  I'd like to get other opinions on this before filing a bug.
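
Concretely, the fix would only need to change the initial load (a sketch of 
what I'd expect, mirroring GCC's output rather than anything LLVM emits today):

     ldaxr   w8, [x0]                   // load-acquire exclusive
     cmp     w8, w1
     b.ne    .LBB0_3
     ...                                // rest of the sequence unchanged

The acquire semantics of ldaxr then apply on both the success and failure 
paths, which satisfies the requested failure-case acquire ordering.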

I believe the code in AtomicExpand::expandAtomicCmpXchg() is responsible 
for this discrepancy, since it only uses the failure-case memory ordering 
for targets that insert fences (i.e. when 
TLI->shouldInsertFencesForAtomic(CI) is true).
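
From memory, the ordering selection looks roughly like this (paraphrased 
from lib/CodeGen/AtomicExpandPass.cpp, not verbatim):

  AtomicOrdering SuccessOrder = CI->getSuccessOrdering();
  AtomicOrdering FailureOrder = CI->getFailureOrdering();
  bool ShouldInsertFencesForAtomic = TLI->shouldInsertFencesForAtomic(CI);

  // Ordering used for the LL/SC instructions themselves:
  AtomicOrdering MemOpOrder =
      ShouldInsertFencesForAtomic ? AtomicOrdering::Monotonic : SuccessOrder;

  // FailureOrder is only consulted when placing fences, and fences are
  // only emitted when ShouldInsertFencesForAtomic is true, so on AArch64
  // (LL/SC expansion without fences) the failure ordering never
  // influences the load-linked emitted via TLI->emitLoadLinked().

so the exclusive load ends up being emitted with only the success ordering 
in mind.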

-- 
Geoff Berry
Employee of Qualcomm Datacenter Technologies, Inc.
  Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
