[llvm-dev] Undef and Poison round table follow-up & a plan

Juneyoung Lee via llvm-dev llvm-dev at lists.llvm.org
Thu Oct 8 16:13:38 PDT 2020


> It is important to note that this applies to trap representations and not
to unspecified values. A structure or union never has a trap representation.
Yes, nondeterministic bits would work for padding of struct/union, as
described in (3) The third case is the value of struct/union padding.
For the members of struct/union, it is allowed to have trap representation,
so poison can be used.

Juneyoung

On Fri, Oct 9, 2020 at 5:37 AM Hubert Tong <hubert.reinterpretcast at gmail.com>
wrote:

> On Thu, Oct 8, 2020 at 12:12 PM Juneyoung Lee via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hello all,
>>
>> Thank everyone who participated in the (impromptu) round table discussion
>> on Tuesday.
>> For those who are interested, I share the summary of the discussion.
>> Also, I share a short-term plan regarding this issue and relevant patches.
>>
>>
>> *Fixing Miscompilations using Freeze*
>> -----------------------------------
>>
>> To reduce the cost of fixing miscompilations using freeze instruction, we
>> need to
>> optimize freeze away whenever possible.
>> Using the no-undef/poison assumption from the source language (C/C++ in
>> this context) can play a significant role.
>> To make use the assumptions, here are short-term goals:
>>
>> *1. Preserve no-undef/poison assumption of function arguments from C/C++
>> when valid.*
>>
>> There is an ongoing relevant patch (that is written by others):
>> https://reviews.llvm.org/D81678
>>
>> *2. Preserve no-undef/poison assumption of lvalue reads in C/C++ when
>> valid.*
>>
>> Reading an indeterminate value from an lvalue that does not have char or
>> std::byte type is UB [1].
>> Since reading an lvalue is lowered to `load` in IR, we suggest attaching
>> a new
>> !noundef metadata to such `load`s.
>> The IR-side change is here: https://reviews.llvm.org/D89050
>> The clang-side change is going to be made after D81678 is reviewed,
>> because it is likely
>> that this patch will have a lot of changes in clang tests.
>>
>>
>> *Replacing Undef with Poison*
>> ---------------------------
>>
>> Since undef is known to be the source of many optimizations due to its
>> complexity,
>> we'd like to suggest gradually moving towards using poison only.
>> To make it, (1) `poison` constant should be introduced into LLVM IR
>> first, and (2)
>> transformations that introduce `undef` should be updated to introduce
>> `poison` instead.
>>
>> For the step (2), we need an experimental result showing that it does not
>> cause
>> performance degradation. This relies on better support for freeze (the
>> no-undef/poison analysis patches).
>>
>> *1. Introduce a new `poison` constant into IR*:
>> https://reviews.llvm.org/D71126
>>
>> Note that `poison` constant can be used as a true placeholder value as
>> well.
>> Undef cannot be used in general because it is less undefined than poison.
>>
>> *2. Update transformations that introduce `undef` to introduce `poison`
>> instead*
>>
>> (1) There are transformations that introduce `undef` as a placeholder
>> (e.g. phi operand
>> from an unreachable block).
>> For these, `poison` can be used instead.
>>
>> (2) The value of an uninitialized object (automatic or dynamic).
>> They are indeterminate values in C/C++, so okay to use poison instead.
>> A tricky case is a bitfield access, and we have two possible solutions:
>>
>> - i. Introduce a very-packed struct type
>> ```
>> <C>
>> struct {
>>   int a:2, b:6;
>> } s;
>>
>> v = s.a;
>>
>> =>
>>
>> <IR>
>>
>> s = alloca
>>
>> tmp = load *{{i2, i6}}** s ; load as a very packed struct type
>> v = extractvalue tmp, 0
>> ```
>>   * Pros: Can be used to precisely lower C/C++'s struct typed function
>> argument into IR
>> (currently clang coerces a struct into int if small enough; I'll explain
>> about this detail if anyone requests)
>>   * Cons: Since optimizations aren’t aware of the new type, they should
>> be updated
>>
>> - ii. Use load-freeze
>> ```
>> <C>
>> struct {
>>   int a:2, b:6;
>> } s;
>>
>> v = s.a;
>>
>> =>
>>
>> <IR>
>> s = alloca
>>
>> // Poison bits are frozen and returned
>> tmp = *load freeze* i8* s
>> v = tmp & 3
>> ```
>>   * Pros: The change is simpler
>>   * Cons: Store forwarding isn’t free; needs insertion of freeze
>>     (store x, p; v = load freeze p => store x, p; v = freeze x)
>>
>>
>> (3) The third case is the value of struct/union padding.
>> Padding is filled with unspecified value in C, so it is too undefined to
>> use poison.
>> We can fill it with defined bits nondeterministically chosen at
>> allocation time (freeze poison).
>>
>> ```
>> <C>
>> struct {
>>   char a; // 3 bytes padding
>>   int b;
>> } s;
>>
>> v = s.b;
>>
>> =>
>>
>> <IR>
>> s = alloca {i8, i32} // alloca initializes bytes in a type-dependent
>> manner
>> // s[0], s[4~7]: poison
>> // s[1~3]: let's fill these bytes with nondet. bits
>>
>> s2 = gep (bitcast s to i8*), 4
>> v = load i32 s2
>> ```
>>
>>
>> Thanks,
>> Juneyoung
>>
>>
>>
>> [1]
>> C11 6.2.6.1.5: If the stored value of an object has such a representation
>> and is read by an lvalue expression that does not have character type, the
>> behavior is undefined.
>> (Similarly, C17 6.2.6.1.5)
>>
> It is important to note that this applies to trap representations and not
> to unspecified values. A structure or union never has a trap representation.
>
>
>> C++14 8.5.12: If an indeterminate value is produced by an evaluation, the
>> behavior is undefined except in the following cases: If an indeterminate
>> value of unsigned narrow character type ...
>> (Similarly, C++17 11.6.12 , C++11 4.1.1)
>>
> While loading undef for the unsigned character type case merely produces
> undef, for C++, operations such as sign-extend or zero-extend on an undef
> i8 is also undefined behaviour.
>
>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>

-- 

Juneyoung Lee
Software Foundation Lab, Seoul National University
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201009/dc379e78/attachment.html>


More information about the llvm-dev mailing list