[LLVMdev] LLVM Concurrency and Undef

Mon Aug 22 17:22:32 PDT 2011

On Mon, Aug 22, 2011 at 7:35 PM, Jeffrey Yasskin <jyasskin at google.com> wrote:
> On Mon, Aug 22, 2011 at 4:20 PM, Jianzhou Zhao <jianzhou at seas.upenn.edu> wrote:
>> On Mon, Aug 22, 2011 at 6:49 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>> On Mon, Aug 22, 2011 at 3:40 PM, Jianzhou Zhao <jianzhou at seas.upenn.edu> wrote:
>>>> On Mon, Aug 22, 2011 at 6:08 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>>>> On Mon, Aug 22, 2011 at 2:49 PM, Santosh Nagarakatte
>>>>> <santosh.nagarakatte at gmail.com> wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I have been trying to understand the use of undef in both sequential
>>>>>> and concurrent programs.
>>>>>>
>>>>>> >From the LLVM Language Reference Manual, I see the following
>>>>>> definition of undef.
>>>>>> "Undef can be used anywhere a constant is expected, and indicates that
>>>>>> the user of the value may receive an unspecified bit-pattern".
>>>>>>  LLVM Language Reference manual also demonstrates how optimizers can
>>>>>> use these undef values to  optimize the program.
>>>>>>
>>>>>> However, on the other hand, with the LLVM Atomics and Concurrency
>>>>>> Guide states that
>>>>>> If code accesses a memory location from multiple threads at the same
>>>>>> time, the resulting loads return 'undef'.
>>>>>> This is different from the C++ memory model, which provides undefined
>>>>>> behavior. What is the rationale for returning an undef on racing
>>>>>> reads?
>>>>>>
>>>>>> LLVM Atomics and Concurrency guide also states the following
>>>>>> "Note that speculative loads are allowed; a load which is part of a
>>>>>> race returns undef, but does not have undefined behavior"
>>>>>>
>>>>>> If the speculative loads returns an undef and the returned value is
>>>>>> used, then it results in an undefined behavior. Am I correct?
>>>>>
>>>>> It behaves like any other undef value... which do often lead to
>>>>> undefined behavior.
>>>>>
>>>>>> If so, what is the purpose of returning an undef with a speculative load?
>>>>>> Is it to ensure that the subsequent uses of the value of the
>>>>>> speculatively introduced load is caught/detected by the optimization?
>>>>>
>>>>> The point is primarily to allow optimizations like LICM to introduce
>>>>> loads whose value is never used.  It also keeps consistent semantics
>>>>> through CodeGen, where some targets widen loads.
>>>>>
>>>>>> Is it possible to separate the "undef" in a sequential setting and
>>>>>> "undef" with speculative loads in a concurrent setting with separate
>>>>>> undefs?
>>>>>
>>>>> The intention is that they should have the same semantics.
>>>>
>>>> Suppose there are three threads T1, T2 and T3,
>>>> T1 (S1 )stores to a location l as non-atomic,
>>>> T2 then (S2)stores to l as SC-atomic,
>>>> later T3 (L3)loads from l as SC-atomic.
>>>>
>>>> I think the load @ T3 should return undef, since it can see both
>>>> writes from T1 T2. Then the question is if the SC store @ T2 --- S2
>>>> and the SC load @ T3 --- L3 introduces an acq/rel (synchronized-with)
>>>> edge.
>>>>
>>>> This will affect if later conflicting accesses are ordered or not, and
>>>> whether memory accesses are ordered makes load return undef or not.
>>>>
>>>> If the S2 and L3 still introduces an SW edge, the next question is
>>>> suppose there is a later SC-load L3' @ T3, does it also return undef?
>>>> But this potentially makes the SC atomic sequence S2 L3 L3'
>>>> inconsistent --- later SC loads can read different writes from earlier
>>>> loads if there are no SC-stores in between.
>>>>
>>>> So I think data-races SC/acq/rel atomics cannot introduce SW edges.
>>>> Intuitively if the locations designed for locks are used by non-atomic
>>>> memory accesses, the locks cannot behave correctly. Is it correct?
>>>
>>> Yes, that is correct.  I am now convinced that it is worthwhile to
>>> include the definition of how SW edges form in LangRef.
>>
>> Some of the descriptions in LangRef and
>> http://llvm.org/docs/Atomics.html are like
>> 1) monotonic atomic to the same location are totally ordered
>> 2) all SC atomic to to any location are totally ordered
>> 3) monotonic loads should see monotonic stores monotonically
>> ...
>>
>> However, these conditions can be broken if any of them are
>> 'interferenced' by non-atomic memory accesses, since that can turn
>> some of the atomic loads to be undef (with any values). So it is
>> unlikely forming a total order any more.
>>
>> So, do these properties hold only for data race free atomics?  For example,
>>
>> T1:                                         T2:
>> a:  sc store 1 to location x
>>                                              b: non-atomic store 2 to x
>>                                              c: sc store 3 to y
>> d:  r1 = sc load from y
>> e:  r2 = sc load from x
>>
>> Here, r2 still returns undef, since e can see a non-atomic store
>> although from happens-before. Then a c d e may not be totally ordered.
>
> Say r1==3, so we're guaranteed that 'e' happens after 'b'. Since
> neither a nor b happens before the other, and at least one of them is
> not atomic, r2==undef. The seq_cst ordering a, c, d, e is consistent
> with the observable results. That's true whether printing r2 prints 1,
> 2, or 47 since undef can be observed as any value.

My confusion is the difference between SC and AcquireRelease. The text says

...
SequentiallyConsistent

SequentiallyConsistent (seq_cst in IR) provides Acquire semantics for
loads and Release semantics for stores. Additionally, it guarantees
that a total ordering exists between all SequentiallyConsistent
operations.
...

Here, what does the total ordering mean? Does it mean given a sequence
of SC atomic in the total order, an SC load should return the recent
SC store from the sequence? This is like the meaning of Sequential
consistent memory model---any SC stores should be visible to all SC
loads immediately in the same order.

In the above program, we have the SC atomics

sc store 1 to x
sc store 3 to y
sc load from y = 3
sc load from x = 47 or undef

Here, no matter how we order them, there is no way to place a store 47
to x before load x with 47. Did I make a wrong assumption?

>
> Jeffrey
>

-- 
Jianzhou