[LLVMdev] LLVM Concurrency and Undef

Mon Aug 22 15:49:52 PDT 2011

On Mon, Aug 22, 2011 at 3:40 PM, Jianzhou Zhao <jianzhou at seas.upenn.edu> wrote:
> On Mon, Aug 22, 2011 at 6:08 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>> On Mon, Aug 22, 2011 at 2:49 PM, Santosh Nagarakatte
>> <santosh.nagarakatte at gmail.com> wrote:
>>> Hi all,
>>>
>>> I have been trying to understand the use of undef in both sequential
>>> and concurrent programs.
>>>
>>> >From the LLVM Language Reference Manual, I see the following
>>> definition of undef.
>>> "Undef can be used anywhere a constant is expected, and indicates that
>>> the user of the value may receive an unspecified bit-pattern".
>>>  LLVM Language Reference manual also demonstrates how optimizers can
>>> use these undef values to  optimize the program.
>>>
>>> However, on the other hand, with the LLVM Atomics and Concurrency
>>> Guide states that
>>> If code accesses a memory location from multiple threads at the same
>>> time, the resulting loads return 'undef'.
>>> This is different from the C++ memory model, which provides undefined
>>> behavior. What is the rationale for returning an undef on racing
>>> reads?
>>>
>>> LLVM Atomics and Concurrency guide also states the following
>>> "Note that speculative loads are allowed; a load which is part of a
>>> race returns undef, but does not have undefined behavior"
>>>
>>> If the speculative loads returns an undef and the returned value is
>>> used, then it results in an undefined behavior. Am I correct?
>>
>> It behaves like any other undef value... which do often lead to
>> undefined behavior.
>>
>>> If so, what is the purpose of returning an undef with a speculative load?
>>> Is it to ensure that the subsequent uses of the value of the
>>> speculatively introduced load is caught/detected by the optimization?
>>
>> The point is primarily to allow optimizations like LICM to introduce
>> loads whose value is never used.  It also keeps consistent semantics
>> through CodeGen, where some targets widen loads.
>>
>>> Is it possible to separate the "undef" in a sequential setting and
>>> "undef" with speculative loads in a concurrent setting with separate
>>> undefs?
>>
>> The intention is that they should have the same semantics.
>
> Suppose there are three threads T1, T2 and T3,
> T1 (S1 )stores to a location l as non-atomic,
> T2 then (S2)stores to l as SC-atomic,
> later T3 (L3)loads from l as SC-atomic.
>
> I think the load @ T3 should return undef, since it can see both
> writes from T1 T2. Then the question is if the SC store @ T2 --- S2
> and the SC load @ T3 --- L3 introduces an acq/rel (synchronized-with)
> edge.
>
> This will affect if later conflicting accesses are ordered or not, and
> whether memory accesses are ordered makes load return undef or not.
>
> If the S2 and L3 still introduces an SW edge, the next question is
> suppose there is a later SC-load L3' @ T3, does it also return undef?
> But this potentially makes the SC atomic sequence S2 L3 L3'
> inconsistent --- later SC loads can read different writes from earlier
> loads if there are no SC-stores in between.
>
> So I think data-races SC/acq/rel atomics cannot introduce SW edges.
> Intuitively if the locations designed for locks are used by non-atomic
> memory accesses, the locks cannot behave correctly. Is it correct?

Yes, that is correct.  I am now convinced that it is worthwhile to
include the definition of how SW edges form in LangRef.

-Eli