[llvm-commits] [llvm] r137145 - /llvm/trunk/docs/Atomics.html
Jeffrey Yasskin
jyasskin at google.com
Wed Aug 10 10:05:47 PDT 2011
On Tue, Aug 9, 2011 at 2:07 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> +<!-- *********************************************************************** -->
> +<h2>
> + <a name="ordering">Atomic orderings</a>
> +</h2>
> +<!-- *********************************************************************** -->
> +
> +<div>
> +
> +<p>In order to achieve a balance between performance and necessary guarantees,
> + there are six levels of atomicity. They are listed in order of strength;
Do you want to mention that we've intentionally skipped
memory_order_consume for now?
> + each level includes all the guarantees of the previous level except for
> + Acquire/Release.</p>
> +
> +<p>Unordered is the lowest level of atomicity. It essentially guarantees that
> + races produce somewhat sane results instead of having undefined behavior.
> + This is intended to match the Java memory model for shared variables. It
> + cannot be used for synchronization, but is useful for Java and other
> + "safe" languages which need to guarantee that the generated code never
> + exhibits undefined behavior. Note that this guarantee is cheap on common
> + platforms for loads of a native width, but can be expensive or unavailable
> + for wider loads, like a 64-bit load on ARM. (A frontend for a "safe"
> + language would normally split a 64-bit load on ARM into two 32-bit
> + unordered loads.) In terms of the optimizer, this prohibits any
> + transformation that transforms a single load into multiple loads,
> + transforms a store into multiple stores, narrows a store, or stores a
> + value which would not be stored otherwise. Some examples of unsafe
> + optimizations are narrowing an assignment into a bitfield, rematerializing
> + a load, and turning loads and stores into a memcpy call. Reordering
> + unordered operations is safe, though, and optimizers should take
> + advantage of that because unordered operations are common in
> + languages that need them.</p>
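It might also help readers to see why, e.g., rematerializing a load is
forbidden here. Rough C++0x sketch of the idea (C++0x has nothing exactly
as weak as Unordered, so I'm spelling it with relaxed as the nearest,
slightly stronger, equivalent; the names are made up):

  #include <atomic>

  // Racy index written by another thread. Java would make this a plain
  // field; relaxed is used here only to stand in for Unordered.
  std::atomic<int> shared_index(0);
  int table[16] = {0};

  int lookup() {
    // The frontend emits exactly one load of shared_index.
    int i = shared_index.load(std::memory_order_relaxed);
    if (i >= 0 && i < 16)
      return table[i];  // Reloading shared_index here (rematerializing
                        // the load) could observe a different value and
                        // index out of bounds -- exactly the undefined
                        // behavior a "safe" language must exclude.
    return 0;
  }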
> +
> +<p>Monotonic is the weakest level of atomicity that can be used in
> + synchronization primitives, although it does not provide any general
> + synchronization. It essentially guarantees that if you take all the
> + operations affecting a specific address, a consistent ordering exists.
> + This corresponds to the C++0x/C1x <code>memory_order_relaxed</code>; see
> + those standards for the exact definition. If you are writing a frontend, do
> + not use the low-level synchronization primitives unless you are compiling
> + a language which requires it or are sure a given pattern is correct. In
> + terms of the optimizer, this can be treated as a read+write on the relevant
> + memory location (and alias analysis will take advantage of that). In
> + addition, it is legal to reorder non-atomic and Unordered loads around
> + Monotonic loads. CSE/DSE and a few other optimizations are allowed, but
It's also legal to reorder monotonic operations around each other as
long as you can prove they don't alias. (Think 'load a; load b; load
a' -> normally it'd be fine to collapse the two 'load a's with no
aliasing check, but with monotonic atomics, you can only do that if
a != b.)
> + Monotonic operations are unlikely to be used in ways which would make
> + those optimizations useful.</p>
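A concrete use case might make Monotonic clearer; the usual one is a
statistics counter, roughly (made-up names):

  #include <atomic>

  // Only the per-address ordering of the increments matters, so
  // Monotonic/relaxed is enough and no barriers are required.
  std::atomic<long> num_requests(0);

  void count_request() {
    num_requests.fetch_add(1, std::memory_order_relaxed);
  }

  long sample_requests() {
    return num_requests.load(std::memory_order_relaxed);
  }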
> +
> +<p>Acquire provides a barrier of the sort necessary to acquire a lock to access
> + other memory with normal loads and stores. This corresponds to the
> + C++0x/C1x <code>memory_order_acquire</code>. This is a low-level
> + synchronization primitive. In general, optimizers should treat this like
> + a nothrow call.</p>
> +
> +<p>Release is similar to Acquire, but with a barrier of the sort necessary to
> +   release a lock. This corresponds to the C++0x/C1x
> + <code>memory_order_release</code>.</p>
Did you want to say, "In general, optimizers should treat this like a
nothrow call." for Release too? Of course, optimizers might be able to
do better by knowing it's a release, but doing that would probably
take a lot more infrastructure work.
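Also, for Acquire and Release together, the standard flag-publication
pattern might be worth showing; a rough C++0x sketch (made-up names):

  #include <atomic>

  int payload = 0;                      // ordinary, non-atomic data
  std::atomic<bool> ready(false);

  void producer() {
    payload = 42;                       // plain store
    // Release: the plain store above may not sink below this store.
    ready.store(true, std::memory_order_release);
  }

  void consumer() {
    // Acquire: the plain load below may not hoist above this load.
    while (!ready.load(std::memory_order_acquire)) { /* spin */ }
    int v = payload;                    // guaranteed to observe 42
    (void)v;
  }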
> +
> +<p>AcquireRelease (<code>acq_rel</code> in IR) provides both an Acquire and a Release barrier.
> + This corresponds to the C++0x/C1x <code>memory_order_acq_rel</code>. In general,
> + optimizers should treat this like a nothrow call.</p>
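Maybe worth a short example here too; a typical one is a combined
publish-and-consume RMW such as a reference-count decrement (rough
sketch, made-up names):

  #include <atomic>

  struct Node {
    int data;
    std::atomic<int> refcount;
  };

  void unref(Node *n) {
    // acq_rel on the RMW: the Release half publishes this thread's
    // writes to n->data before the decrement, and the Acquire half
    // ensures whoever drops the count to zero sees every other
    // thread's writes before deleting the node.
    if (n->refcount.fetch_sub(1, std::memory_order_acq_rel) == 1)
      delete n;
  }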
> +
> +<p>SequentiallyConsistent (<code>seq_cst</code> in IR) provides Acquire and/or
> + Release semantics, and in addition guarantees a total ordering exists with
> + all other SequentiallyConsistent operations. This corresponds to the
> + C++0x/C1x <code>memory_order_seq_cst</code>, and Java volatile. The intent
> + of this ordering level is to provide a programming model which is relatively
> + easy to understand. In general, optimizers should treat this like a
> + nothrow call.</p>
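A small example of what the total order buys you over acquire/release
might help; the classic store-buffering case, roughly:

  #include <atomic>

  std::atomic<int> x(0), y(0);
  int r1, r2;

  void thread1() {
    x.store(1, std::memory_order_seq_cst);
    r1 = y.load(std::memory_order_seq_cst);
  }

  void thread2() {
    y.store(1, std::memory_order_seq_cst);
    r2 = x.load(std::memory_order_seq_cst);
  }
  // Because all four operations fall in one total order, r1 == 0 &&
  // r2 == 0 is impossible; with only acquire/release (or weaker),
  // both loads could observe 0.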
> +
> +</div>
> +
> +<!-- *********************************************************************** -->
> +<h2>
> + <a name="otherinst">Other atomic instructions</a>
> +</h2>
> +<!-- *********************************************************************** -->
> +
> +<div>
> +
> +<p><code>cmpxchg</code> and <code>atomicrmw</code> are essentially like an
> + atomic load followed by an atomic store (where the store is conditional for
> + <code>cmpxchg</code>), but no other memory operation operation can happen
duplicate "operation"
> + between the load and store.</p>
Do you want to mention that we've intentionally skipped "weak"
cmpxchgs and cmpxchgs with different success and failure ordering
constraints?
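A short example of the usual retry loop might also help readers; in
C++0x terms it'd look roughly like (made-up names):

  #include <atomic>

  std::atomic<int> value(0);

  // Atomically double 'value': the load-compute-store sequence behaves
  // as if no other memory operation touches 'value' between the load
  // and the (conditional) store.
  void atomic_double() {
    int old = value.load(std::memory_order_seq_cst);
    while (!value.compare_exchange_strong(old, old * 2,
                                          std::memory_order_seq_cst))
      ;  // on failure 'old' is refreshed with the current value; retry
  }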
> +<!-- *********************************************************************** -->
> +<h2>
> + <a name="iropt">Atomics and IR optimization</a>
> +</h2>
> +<!-- *********************************************************************** -->
> +
> +<div>
> +
> +<p>Predicates for optimizer writers to query:
> +<ul>
> + <li>isSimple(): A load or store which is not volatile or atomic. This is
> + what, for example, memcpyopt would check for operations it might
> + transform.
> + <li>isUnordered(): A load or store which is not volatile and at most
> + Unordered. This would be checked, for example, by LICM before hoisting
> + an operation.
> + <li>mayReadFromMemory()/mayWriteToMemory(): Existing predicate, but note
> + that they returns true for any operation which is volatile or at least
s/returns/return/
> + Monotonic.
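One more thought: showing these predicates in use might help pass
authors. Rough, untested sketch (I'm assuming the usual
LoadInst/StoreInst interfaces and header path):

  #include "llvm/Instructions.h"
  using namespace llvm;

  // Only plain, non-atomic, non-volatile accesses may be split,
  // widened, or otherwise restructured (e.g. by memcpyopt).
  static bool canRestructure(Instruction *I) {
    if (LoadInst *LI = dyn_cast<LoadInst>(I))
      return LI->isSimple();
    if (StoreInst *SI = dyn_cast<StoreInst>(I))
      return SI->isSimple();
    return false;
  }

  // Unordered, non-volatile loads may still be hoisted or sunk
  // (e.g. by LICM); anything stronger needs much more care.
  static bool safeToHoist(LoadInst *LI) {
    return LI->isUnordered();
  }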