[llvm-commits] [llvm] r137145 - /llvm/trunk/docs/Atomics.html
Eli Friedman
eli.friedman at gmail.com
Wed Aug 10 13:17:14 PDT 2011
On Wed, Aug 10, 2011 at 10:05 AM, Jeffrey Yasskin <jyasskin at google.com> wrote:
> On Tue, Aug 9, 2011 at 2:07 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>> +<!-- *********************************************************************** -->
>> +<h2>
>> + <a name="ordering">Atomic orderings</a>
>> +</h2>
>> +<!-- *********************************************************************** -->
>> +
>> +<div>
>> +
>> +<p>In order to achieve a balance between performance and necessary guarantees,
>> + there are six levels of atomicity. They are listed in order of strength;
>
> Do you want to mention that we've intentionally skipped
> memory_order_consume for now?
I'm adding a note to the discussion of Acquire.
>> + each level includes all the guarantees of the previous level except for
>> + Acquire/Release.</p>
>> +
>> +<p>Unordered is the lowest level of atomicity. It essentially guarantees that
>> + races produce somewhat sane results instead of having undefined behavior.
>> + This is intended to match the Java memory model for shared variables. It
>> + cannot be used for synchronization, but is useful for Java and other
>> + "safe" languages which need to guarantee that the generated code never
>> + exhibits undefined behavior. Note that this guarantee is cheap on common
>> + platforms for loads of a native width, but can be expensive or unavailable
>> + for wider loads, like a 64-bit load on ARM. (A frontend for a "safe"
>> + language would normally split a 64-bit load on ARM into two 32-bit
>> + unordered loads.) In terms of the optimizer, this prohibits any
>> + transformation that transforms a single load into multiple loads,
>> + transforms a store into multiple stores, narrows a store, or stores a
>> + value which would not be stored otherwise. Some examples of unsafe
>> + optimizations are narrowing an assignment into a bitfield, rematerializing
>> + a load, and turning loads and stores into a memcpy call. Reordering
>> + unordered operations is safe, though, and optimizers should take
>> + advantage of that because unordered operations are common in
>> + languages that need them.</p>
>> +
>> +<p>Monotonic is the weakest level of atomicity that can be used in
>> + synchronization primitives, although it does not provide any general
>> + synchronization. It essentially guarantees that if you take all the
>> + operations affecting a specific address, a consistent ordering exists.
>> + This corresponds to the C++0x/C1x <code>memory_order_relaxed</code>; see
>> + those standards for the exact definition. If you are writing a frontend, do
>> + not use the low-level synchronization primitives unless you are compiling
>> + a language which requires it or are sure a given pattern is correct. In
>> + terms of the optimizer, this can be treated as a read+write on the relevant
>> + memory location (and alias analysis will take advantage of that). In
>> + addition, it is legal to reorder non-atomic and Unordered loads around
>> + Monotonic loads. CSE/DSE and a few other optimizations are allowed, but
>
> It's also legal to reorder monotonic operations around each other as
> long as you can prove they don't alias. (Think 'load a; load b; load
> a" -> Normally it'd be fine to collapse the two 'load a's with no
> aliasing check, but with monotonic atomics, you can only do that if
> a!=b.)
I thought that was implied by "can be treated as a read+write on the relevant
memory location".
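
For anyone following the thread, here is a minimal C++11 sketch of the
'load a; load b; load a' pattern. It is an illustration only and not part of
the patch; the names a, b, record_hit, Snapshot and read_twice are made up,
and it relies on the mapping the patch states (Monotonic corresponds to
memory_order_relaxed).

```cpp
#include <atomic>

// Hypothetical counters, used only for this illustration.
std::atomic<int> a{0};
std::atomic<int> b{0};

// Relaxed (Monotonic) increment: the update is atomic, but it imposes no
// ordering on accesses to other addresses.
void record_hit(std::atomic<int> &counter) {
    counter.fetch_add(1, std::memory_order_relaxed);
}

struct Snapshot {
    int a_first;
    int b_mid;
    int a_second;
};

// The 'load a; load b; load a' pattern. Because a consistent ordering exists
// per address, the second load of a cannot observe a value older than the
// one the first load saw.
Snapshot read_twice() {
    Snapshot s;
    s.a_first = a.load(std::memory_order_relaxed);
    s.b_mid = b.load(std::memory_order_relaxed);
    s.a_second = a.load(std::memory_order_relaxed);
    return s;
}
```

Nothing here synchronizes other memory; the relaxed operations only keep each
individual location coherent.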
>> + Monotonic operations are unlikely to be used in ways which would make
>> + those optimizations useful.</p>
>> +
>> +<p>Acquire provides a barrier of the sort necessary to acquire a lock to access
>> + other memory with normal loads and stores. This corresponds to the
>> + C++0x/C1x <code>memory_order_acquire</code>. This is a low-level
>> + synchronization primitive. In general, optimizers should treat this like
>> + a nothrow call.</p>
>> +
>> +<p>Release is similar to Acquire, but with a barrier of the sort necessary to
>> + release a lock. This corresponds to the C++0x/C1x
>> + <code>memory_order_release</code>.</p>
>
> Did you want to say, "In general, optimizers should treat this like a
> nothrow call." for Release too? Of course, optimizers might be able to
> do better by knowing it's a release, but doing that would probably
> take a lot more infrastructure work.
Done.
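
As a rough illustration of how Acquire and Release pair up (again not part of
the patch; payload, ready, publish and try_consume are hypothetical names),
here is a minimal C++11 sketch that uses the corresponding
memory_order_release and memory_order_acquire. publish is meant to run on one
thread and try_consume on another.

```cpp
#include <atomic>

int payload = 0;                 // ordinary, non-atomic data
std::atomic<bool> ready{false};  // flag that publishes the payload

// Writer side: the release store keeps the write to payload from being
// reordered after the flag becomes visible.
void publish(int value) {
    payload = value;
    ready.store(true, std::memory_order_release);
}

// Reader side: if the acquire load sees true, the earlier write to payload
// is visible as well. Returns false if nothing has been published yet.
bool try_consume(int &out) {
    if (!ready.load(std::memory_order_acquire)) {
        return false;
    }
    out = payload;
    return true;
}
```

The normal load and store of payload need no atomic annotation of their own;
the ordering comes entirely from the flag.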
>> +
>> +<p>AcquireRelease (<code>acq_rel</code> in IR) provides both an Acquire and a Release barrier.
>> + This corresponds to the C++0x/C1x <code>memory_order_acq_rel</code>. In general,
>> + optimizers should treat this like a nothrow call.</p>
>> +
>> +<p>SequentiallyConsistent (<code>seq_cst</code> in IR) provides Acquire and/or
>> + Release semantics, and in addition guarantees a total ordering exists with
>> + all other SequentiallyConsistent operations. This corresponds to the
>> + C++0x/C1x <code>memory_order_seq_cst</code>, and Java volatile. The intent
>> + of this ordering level is to provide a programming model which is relatively
>> + easy to understand. In general, optimizers should treat this like a
>> + nothrow call.</p>
>> +
>> +</div>
>> +
>> +<!-- *********************************************************************** -->
>> +<h2>
>> + <a name="otherinst">Other atomic instructions</a>
>> +</h2>
>> +<!-- *********************************************************************** -->
>> +
>> +<div>
>> +
>> +<p><code>cmpxchg</code> and <code>atomicrmw</code> are essentially like an
>> + atomic load followed by an atomic store (where the store is conditional for
>> + <code>cmpxchg</code>), but no other memory operation operation can happen
>
> duplicate "operation"
>
>> + between the load and store.</p>
>
> Do you want to mention that we've intentionally skipped "weak"
> cmpxchgs and cmpxchgs with different success and failure ordering
> constraints?
Yes; added a note.
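
For illustration, a minimal C++11 sketch of the usual retry loop built on a
strong compare-exchange with a single ordering, which is the shape described
above. It is not taken from the patch; atomic_max is a hypothetical helper
name.

```cpp
#include <atomic>

// Hypothetical helper: atomically raise target to at least candidate and
// return the value that was observed before any update.
int atomic_max(std::atomic<int> &target, int candidate) {
    int observed = target.load(std::memory_order_seq_cst);
    // compare_exchange_strong stores candidate only if target still holds
    // observed; on failure it reloads observed with the current value, so
    // the loop re-checks the condition against fresh data.
    while (observed < candidate &&
           !target.compare_exchange_strong(observed, candidate,
                                           std::memory_order_seq_cst)) {
    }
    return observed;
}
```

The conditional store is what distinguishes this from a plain atomic load
followed by an atomic store: no other write can slip in between the
comparison and the update.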
>> +<!-- *********************************************************************** -->
>> +<h2>
>> + <a name="iropt">Atomics and IR optimization</a>
>> +</h2>
>> +<!-- *********************************************************************** -->
>> +
>> +<div>
>> +
>> +<p>Predicates for optimizer writers to query:
>> +<ul>
>> + <li>isSimple(): A load or store which is not volatile or atomic. This is
>> + what, for example, memcpyopt would check for operations it might
>> + transform.
>> + <li>isUnordered(): A load or store which is not volatile and at most
>> + Unordered. This would be checked, for example, by LICM before hoisting
>> + an operation.
>> + <li>mayReadFromMemory()/mayWriteToMemory(): Existing predicate, but note
>> + that they returns true for any operation which is volatile or at least
>
> s/returns/return/
Done.
-Eli