[LLVMdev] LLVM Concurrency and Undef

Jeffrey Yasskin jyasskin at google.com
Mon Aug 22 16:27:58 PDT 2011


On Mon, Aug 22, 2011 at 3:49 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Mon, Aug 22, 2011 at 3:40 PM, Jianzhou Zhao <jianzhou at seas.upenn.edu> wrote:
>> On Mon, Aug 22, 2011 at 6:08 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>> On Mon, Aug 22, 2011 at 2:49 PM, Santosh Nagarakatte
>>> <santosh.nagarakatte at gmail.com> wrote:
>>>> Hi all,
>>>>
>>>> I have been trying to understand the use of undef in both sequential
>>>> and concurrent programs.
>>>>
>>>> From the LLVM Language Reference Manual, I see the following
>>>> definition of undef.
>>>> "Undef can be used anywhere a constant is expected, and indicates that
>>>> the user of the value may receive an unspecified bit-pattern".
>>>> The LLVM Language Reference Manual also demonstrates how optimizers
>>>> can use these undef values to optimize the program.
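
For example, here's a minimal IR sketch of that kind of folding (the
function is just illustrative, not from LangRef):

  define i32 @f(i32 %n) {
    ; The optimizer may assume any bit pattern for undef.  Choosing the
    ; undef operand to be all zeros lets it fold %x to 0 and then %y to
    ; 1, regardless of %n.
    %x = and i32 %n, undef
    %y = add i32 %x, 1
    ret i32 %y
  }
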
>>>>
>>>> However, on the other hand, the LLVM Atomics and Concurrency
>>>> Guide states that:
>>>> If code accesses a memory location from multiple threads at the same
>>>> time, the resulting loads return 'undef'.
>>>> This is different from the C++ memory model, which provides undefined
>>>> behavior. What is the rationale for returning an undef on racing
>>>> reads?
>>>>
>>>> The LLVM Atomics and Concurrency Guide also states the following:
>>>> "Note that speculative loads are allowed; a load which is part of a
>>>> race returns undef, but does not have undefined behavior"
>>>>
>>>> If a speculative load returns undef and the returned value is
>>>> used, then it results in undefined behavior. Am I correct?
>>>
>>> It behaves like any other undef value... which does often lead to
>>> undefined behavior.
>>>
>>>> If so, what is the purpose of returning an undef with a speculative load?
>>>> Is it to ensure that subsequent uses of the value of the
>>>> speculatively introduced load are caught/detected by the optimizer?
>>>
>>> The point is primarily to allow optimizations like LICM to introduce
>>> loads whose value is never used.  It also keeps consistent semantics
>>> through CodeGen, where some targets widen loads.
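
To illustrate, here's a rough IR sketch of the kind of LICM speculation
meant here (the function and names are just illustrative):

  @g = global i32 0

  define i32 @sum(i1 %use_g, i32 %n) {
  entry:
    br label %loop

  loop:
    %i = phi i32 [ 0, %entry ], [ %i.next, %latch ]
    %acc = phi i32 [ 0, %entry ], [ %acc.next, %latch ]
    br i1 %use_g, label %do.load, label %latch

  do.load:
    ; LICM may hoist this load out of the loop because @g is known
    ; dereferenceable; the hoisted load then also executes on runs
    ; where %use_g is false and its value is never used.  If another
    ; thread stores to @g concurrently, the hoisted load races and
    ; returns undef, which is harmless precisely because it is not
    ; undefined behavior.
    %g.val = load i32, ptr @g, align 4
    br label %latch

  latch:
    %add = phi i32 [ %g.val, %do.load ], [ 0, %loop ]
    %acc.next = add i32 %acc, %add
    %i.next = add i32 %i, 1
    %cont = icmp slt i32 %i.next, %n
    br i1 %cont, label %loop, label %exit

  exit:
    ret i32 %acc.next
  }
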
>>>
>>>> Is it possible to separate the "undef" in a sequential setting and
>>>> "undef" with speculative loads in a concurrent setting with separate
>>>> undefs?
>>>
>>> The intention is that they should have the same semantics.
>>
>> Suppose there are three threads T1, T2 and T3:
>> T1 stores (S1) to a location l non-atomically,
>> T2 then stores (S2) to l as an SC atomic, and
>> later T3 loads (L3) from l as an SC atomic.
>>
>> I think the load @ T3 should return undef, since it can see both
>> writes, from T1 and T2. Then the question is whether the SC store @
>> T2 --- S2 --- and the SC load @ T3 --- L3 --- introduce an acq/rel
>> (synchronizes-with) edge.
>>
>> This affects whether later conflicting accesses are ordered or not,
>> and whether memory accesses are ordered determines whether a load
>> returns undef or not.
>>
>> If S2 and L3 still introduce an SW edge, the next question is:
>> suppose there is a later SC load L3' @ T3, does it also return undef?
>> But this potentially makes the SC atomic sequence S2, L3, L3'
>> inconsistent --- later SC loads could read different writes than
>> earlier loads even though there are no SC stores in between.
>>
>> So I think SC/acq/rel atomics involved in data races cannot introduce
>> SW edges. Intuitively, if locations intended for locks are also
>> touched by non-atomic memory accesses, the locks cannot behave
>> correctly. Is that correct?
>
> Yes, that is correct.  I am now convinced that it is worthwhile to
> include the definition of how SW edges form in LangRef.

It's already there:

"We define a happens-before partial order as the least partial order that

* Is a superset of single-thread program order, and
* When a synchronizes-with b, includes an edge from a to b.
Synchronizes-with pairs are introduced by platform-specific
techniques, like pthread locks, thread creation, thread joining, etc.,
and by atomic instructions. (See also Atomic Memory Ordering
Constraints)."

...

"acquire - In addition to the guarantees of monotonic, if this
operation reads a value written by a release atomic operation, it
synchronizes-with that operation."

(Strictly, the release operation synchronizes-with the acquire, not
the other way around.)
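
As a concrete sketch (illustrative IR, not from LangRef): if the
acquire load below reads the 1 written by the release store, the
release store synchronizes-with that load, and the plain store of 42
happens-before the load of @data.

  @data = global i32 0
  @flag = global i32 0

  define void @writer() {
    store i32 42, ptr @data, align 4                 ; plain store
    store atomic i32 1, ptr @flag release, align 4   ; release store
    ret void
  }

  define i32 @reader() {
    %f = load atomic i32, ptr @flag acquire, align 4 ; acquire load
    %seen = icmp eq i32 %f, 1
    br i1 %seen, label %ready, label %not.ready

  ready:
    ; The release store synchronizes-with the acquire load above, so
    ; the store of 42 happens-before this load; it is not a race.
    %d = load i32, ptr @data, align 4
    ret i32 %d

  not.ready:
    ret i32 -1
  }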

Since atomic/non-atomic races are defined to return undef from the
load, even if the load has seq_cst ordering, the load never reads a
value written, so none of the stores synchronize with the load.
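
In IR terms, Jianzhou's example looks roughly like this (illustrative
functions, one per thread):

  @l = global i32 0

  define void @t1() {             ; S1: non-atomic store
    store i32 1, ptr @l, align 4
    ret void
  }

  define void @t2() {             ; S2: seq_cst store
    store atomic i32 2, ptr @l seq_cst, align 4
    ret void
  }

  define i32 @t3() {              ; L3: seq_cst load
    ; If this load races with the non-atomic store in @t1, it returns
    ; undef, so it doesn't "read a value written by" the seq_cst store
    ; in @t2, and that store does not synchronize-with this load.
    %v = load atomic i32, ptr @l seq_cst, align 4
    ret i32 %v
  }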

The text does say that all seq_cst loads and stores participate in the
global seq_cst ordering that's compatible with the happens-before
ordering, but that doesn't imply that happens-before is a superset of
the seq_cst ordering. I'm not sure whether any paradoxes arise from
allowing racy seq_cst loads to return undef, but I haven't seen any
examples so far.

I don't object to clarifying the text, of course.

Jeffrey
