[LLVMdev] [RFC] Add second "failure" AtomicOrdering to cmpxchg instruction

Fri Mar 7 15:46:42 PST 2014

Hi Tim,

The approach also sounds good to me. I had a look at the patch and the
huge size of it seems seems to be driven mostly by the replace of
order with success order and the addition of failure order (plus the
tests). Apart from a few whitespace changes in the beginning, it looks
good to me.

cheers,
--renato

On 8 March 2014 02:20, JF Bastien <jfb at google.com> wrote:
> This sounds exactly like the right thing to do. FWIW PNaCl provisions for
> this at the moment.
>
> I haven't reviewed the patch, but your proposal sounds good.
>
>
> On Fri, Mar 7, 2014 at 5:47 AM, Tim Northover <t.p.northover at gmail.com>
> wrote:
>>
>> Hi all,
>>
>> The C++11 (& C11) compare_exchange functions with explicit memory
>> order allow you to specify two sets of semantics, one for when the
>> exchange actually happens and one for when it fails. Unfortunately, at
>> the moment the LLVM IR "cmpxchg" instruction only has one ordering,
>> which means we get sub-optimal codegen.
>>
>> This probably affects all architectures which use
>> load-linked/store-conditional instructions for atomic operations and
>> don't have versions with built-in acquire/release semantics (and so
>> need barriers). For example on ARMv7, an "acquire, relaxed" cmpxchg
>> currently generates:
>>
>>         mov r9, #52
>>         mov r3, #42
>> LBB0_1:                                 @ =>This Inner Loop Header:
>> Depth=1
>>         ldrexb r1, [r0]
>>         cmp r1, r3
>>         bne LBB0_3
>> @ BB#2:                                 @   in Loop: Header=BB0_1 Depth=1
>>         strexb r2, r9, [r0]
>>         cmp r2, #0
>>         bne LBB0_1
>> LBB0_3:
>>         dmb ish
>>         [...]
>>
>> That second barrier could be skipped by the first "bne" if we knew
>> that the user didn't need any ordering in case of failure.
>>
>> The other atomic operations are not affected, so I'd like to add a
>> second ordering operand to "cmpxchg" to support the full range of
>> possible requests. The suggested syntax would be
>>
>>     cmpxchg [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new>
>> [singlethread] <success ordering> <failure ordering>
>>
>> That is, syntactically I propose:
>>   + No comma between the two
>>   + Both operands are required, even if "failure" is the natural pair
>> of "success"
>>
>> Semantically, I would like to enforce the constraints in the standards:
>>   + failure <= success
>>   + failure != Release
>>   + failure != AcquireRelease.
>>
>> Before the DAG, I suggest completely removing getOrdering from
>> CmpXchgInst, in favour of getSuccessOrdering and getFailureOrdering to
>> avoid errors. In the DAG, getOrdering would still exist and return the
>> success ordering, which I believe would make existing targets
>> conservatively correct. AtomicSDNode would add getFailureOrdering,
>> used only by ATOMIC_CMP_SWAP nodes.
>>
>> Existing bitcode files would obviously be supported, and retain
>> current semantics by BitcodeReader dropping the "release" part of any
>> single ordering, and using the result as the "failure ordering" (as
>> currently documented in the LanguageRef).
>>
>> I have attached initial patches for reference, but I'm not asking for
>> thorough review at this stage.
>>
>> What do people think?
>>
>> Cheers.
>>
>> Tim.
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>