[LLVMdev] LLVM Concurrency and Undef

Mon Aug 22 16:35:49 PDT 2011

On Mon, Aug 22, 2011 at 4:20 PM, Jianzhou Zhao <jianzhou at seas.upenn.edu> wrote:
> On Mon, Aug 22, 2011 at 6:49 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>> On Mon, Aug 22, 2011 at 3:40 PM, Jianzhou Zhao <jianzhou at seas.upenn.edu> wrote:
>>> On Mon, Aug 22, 2011 at 6:08 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>>> On Mon, Aug 22, 2011 at 2:49 PM, Santosh Nagarakatte
>>>> <santosh.nagarakatte at gmail.com> wrote:
>>>>> Hi all,
>>>>>
>>>>> I have been trying to understand the use of undef in both sequential
>>>>> and concurrent programs.
>>>>>
>>>>> >From the LLVM Language Reference Manual, I see the following
>>>>> definition of undef.
>>>>> "Undef can be used anywhere a constant is expected, and indicates that
>>>>> the user of the value may receive an unspecified bit-pattern".
>>>>>  LLVM Language Reference manual also demonstrates how optimizers can
>>>>> use these undef values to  optimize the program.
>>>>>
>>>>> However, on the other hand, with the LLVM Atomics and Concurrency
>>>>> Guide states that
>>>>> If code accesses a memory location from multiple threads at the same
>>>>> time, the resulting loads return 'undef'.
>>>>> This is different from the C++ memory model, which provides undefined
>>>>> behavior. What is the rationale for returning an undef on racing
>>>>> reads?
>>>>>
>>>>> LLVM Atomics and Concurrency guide also states the following
>>>>> "Note that speculative loads are allowed; a load which is part of a
>>>>> race returns undef, but does not have undefined behavior"
>>>>>
>>>>> If the speculative loads returns an undef and the returned value is
>>>>> used, then it results in an undefined behavior. Am I correct?
>>>>
>>>> It behaves like any other undef value... which do often lead to
>>>> undefined behavior.
>>>>
>>>>> If so, what is the purpose of returning an undef with a speculative load?
>>>>> Is it to ensure that the subsequent uses of the value of the
>>>>> speculatively introduced load is caught/detected by the optimization?
>>>>
>>>> The point is primarily to allow optimizations like LICM to introduce
>>>> loads whose value is never used.  It also keeps consistent semantics
>>>> through CodeGen, where some targets widen loads.
>>>>
>>>>> Is it possible to separate the "undef" in a sequential setting and
>>>>> "undef" with speculative loads in a concurrent setting with separate
>>>>> undefs?
>>>>
>>>> The intention is that they should have the same semantics.
>>>
>>> Suppose there are three threads T1, T2 and T3,
>>> T1 (S1 )stores to a location l as non-atomic,
>>> T2 then (S2)stores to l as SC-atomic,
>>> later T3 (L3)loads from l as SC-atomic.
>>>
>>> I think the load @ T3 should return undef, since it can see both
>>> writes from T1 T2. Then the question is if the SC store @ T2 --- S2
>>> and the SC load @ T3 --- L3 introduces an acq/rel (synchronized-with)
>>> edge.
>>>
>>> This will affect if later conflicting accesses are ordered or not, and
>>> whether memory accesses are ordered makes load return undef or not.
>>>
>>> If the S2 and L3 still introduces an SW edge, the next question is
>>> suppose there is a later SC-load L3' @ T3, does it also return undef?
>>> But this potentially makes the SC atomic sequence S2 L3 L3'
>>> inconsistent --- later SC loads can read different writes from earlier
>>> loads if there are no SC-stores in between.
>>>
>>> So I think data-races SC/acq/rel atomics cannot introduce SW edges.
>>> Intuitively if the locations designed for locks are used by non-atomic
>>> memory accesses, the locks cannot behave correctly. Is it correct?
>>
>> Yes, that is correct.  I am now convinced that it is worthwhile to
>> include the definition of how SW edges form in LangRef.
>
> Some of the descriptions in LangRef and
> http://llvm.org/docs/Atomics.html are like
> 1) monotonic atomic to the same location are totally ordered
> 2) all SC atomic to to any location are totally ordered
> 3) monotonic loads should see monotonic stores monotonically
> ...
>
> However, these conditions can be broken if any of them are
> 'interferenced' by non-atomic memory accesses, since that can turn
> some of the atomic loads to be undef (with any values). So it is
> unlikely forming a total order any more.
>
> So, do these properties hold only for data race free atomics?  For example,
>
> T1:                                         T2:
> a:  sc store 1 to location x
>                                              b: non-atomic store 2 to x
>                                              c: sc store 3 to y
> d:  r1 = sc load from y
> e:  r2 = sc load from x
>
> Here, r2 still returns undef, since e can see a non-atomic store
> although from happens-before. Then a c d e may not be totally ordered.

Say r1==3, so we're guaranteed that 'e' happens after 'b'. Since
neither a nor b happens before the other, and at least one of them is
not atomic, r2==undef. The seq_cst ordering a, c, d, e is consistent
with the observable results. That's true whether printing r2 prints 1,
2, or 47 since undef can be observed as any value.

Jeffrey