[LLVMdev] LLVM Concurrency and Undef

Sat Sep 17 06:12:12 PDT 2011

On Tue, Aug 23, 2011 at 12:08 AM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Mon, Aug 22, 2011 at 8:46 PM, Jianzhou Zhao <jianzhou at seas.upenn.edu> wrote:
>> On Mon, Aug 22, 2011 at 6:08 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>> On Mon, Aug 22, 2011 at 2:49 PM, Santosh Nagarakatte
>>> <santosh.nagarakatte at gmail.com> wrote:
>>>> Hi all,
>>>>
>>>> I have been trying to understand the use of undef in both sequential
>>>> and concurrent programs.
>>>>
>>>> From the LLVM Language Reference Manual, I see the following
>>>> definition of undef.
>>>> "Undef can be used anywhere a constant is expected, and indicates that
>>>> the user of the value may receive an unspecified bit-pattern".
>>>> The LLVM Language Reference Manual also demonstrates how optimizers can
>>>> use these undef values to optimize the program.
>>>>
>>>> On the other hand, the LLVM Atomics and Concurrency Guide states
>>>> that if code accesses a memory location from multiple threads at the
>>>> same time, the resulting loads return 'undef'.
>>>> This is different from the C++ memory model, which provides undefined
>>>> behavior. What is the rationale for returning an undef on racing
>>>> reads?
>>>>
>>>> LLVM Atomics and Concurrency guide also states the following
>>>> "Note that speculative loads are allowed; a load which is part of a
>>>> race returns undef, but does not have undefined behavior"
>>>>
>>>> If the speculative load returns an undef and the returned value is
>>>> used, then it results in undefined behavior. Am I correct?
>>>
>>> It behaves like any other undef value, and those often do lead to
>>> undefined behavior.
>>>
>>>> If so, what is the purpose of returning an undef with a speculative load?
>>>> Is it to ensure that subsequent uses of the value of the
>>>> speculatively introduced load are caught/detected by the optimization?
>>>
>>> The point is primarily to allow optimizations like LICM to introduce
>>> loads whose value is never used.  It also keeps consistent semantics
>>> through CodeGen, where some targets widen loads.
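
A minimal sketch of the LICM point above, written as hand-made C++ rather
than real optimizer output. The names sum_before, sum_after and flag are
hypothetical, and the sketch assumes flag is known to be dereferenceable,
so that executing the load early cannot trap. It shows a loop-invariant
load being hoisted out of a loop, which means the load now executes even
when the loop body never runs and its value is never used:

```cpp
#include <cstddef>

// Before hoisting: *flag is only loaded if the loop body executes (n > 0).
int sum_before(const int* data, std::size_t n, const int* flag) {
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (*flag) {          // loop-invariant load, repeated every iteration
            sum += data[i];
        }
    }
    return sum;
}

// After hoisting (what LICM conceptually produces): the load is executed
// once, unconditionally. When n == 0 the loaded value is never used.
int sum_after(const int* data, std::size_t n, const int* flag) {
    const int f = *flag;      // speculatively introduced load
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (f) {
            sum += data[i];
        }
    }
    return sum;
}
```

If another thread writes *flag while n == 0, sum_before performs no racy
load but sum_after does. Under the rule quoted above, that introduced load
merely returns undef at the IR level, and since the value is unused the
result is unchanged. At the C++ source level the same race would be
undefined behavior, so this is only an illustration of the IR-level
reasoning, not something to write by hand.
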
>>>
>>>> Is it possible to distinguish the "undef" in a sequential setting
>>>> from the "undef" produced by speculative loads in a concurrent
>>>> setting by using separate undefs?
>>>
>>> The intention is that they should have the same semantics.
>>
>> As for whether to separate the sequential ``undef'' from the concurrent
>> ``undef'', there is an analogous problem: C distinguishes different
>> undefined behaviors in terms of the contexts that give rise to them.
>> When designing analysis tools for C to prevent undefined behaviors,
>> we can reason about which kinds of undefined behavior can be
>> eliminated. So classifying undefined behaviors helps in that case.
>>
>> In the LLVM setting, have we already defined different kinds of
>> ``undef'' in a sequential setting? The question is whether or not LLVM
>> can define them without considering the high-level language that is
>> compiled to the IR, since usually the semantics of the high-level
>> language indicate how to classify them. However, the undef values
>> introduced by racy loads seem to be new; for example, C++ and Java do
>> not have such a concept. Is it then worth separating them from the
>> existing undefs? Is there any benefit, from the point of view of
>> compiler design, in not distinguishing the undefs?
>
> The conceptual undef's here will never show up in the IR because it's
> extremely difficult to prove that a race exists.
>
> Most of the effort towards tracking undefined behavior more generally
> hasn't really centered around the presence of undef's in the IR.  And
> I'm not sure they could be usefully classified because they tend to
> disappear from the IR quickly.

Did you mean the undefined values tend to disappear from the IR of LLVM 3.0?

>
> (Putting llvmdev back on the cc list.)
>
> -Eli
>

-- 
Jianzhou