[llvm-dev] Killing undef and spreading poison

Tue Dec 27 10:22:20 PST 2016

Related work:

http://plv.mpi-sws.org/llvmcs/paper.pdf

Hope everyone had a great Christmas!

John

On 12/7/16 4:27 PM, Nuno Lopes wrote:
>>>  1) a bit in the output is poison if flipping any set of poison bits
>>>  in the
>>> input may yield different values for the output bit.
>>
>> This is the definition I'd expect.
>>
>>> For example (bitwise notation):
>>> ppp * 000 == 000 (since flipping any of the poison bits cannot yield
>>> a
>>> result other than zero)
>>> 00p + 00p == 0pp
>>> ppp << 2 == p00
>>> icmp ugt ppp, uint_max  == p  (special case)
>>>
>>> 2) (thanks to Sanjoy):  for an operation R = A op B, bit R[i] is
>>> poison if
>>> bit i of 'A op undef' may yield 0 and 1  (same for 'undef op B').
>>> This one unfortunately breaks some algebraic rewrites like: x * -1
>>>   ->   0
>>> - x
>>>
>>>
>>> I've implemented both semantics in Alive (branches: newsema2 and
>>> bitwise-poison2, respectively) and then we run Alive over a large set
>>> of
>>> optimizations to see which were ok and which became wrong.
>>>
>>> Many optimizations became wrong with either of the bitwise semantics.
>>
>> To be clear, "wrong" here means that the value resulting from the
>> transformation might have more poison bits for some inputs than the
>> original value, correct?
>
> The vast majority of the examples were of that case, yes.  But we also
> have a few cases where the transformed expression computes a different
> value (I don't have a record of these handy, sorry).
>
>
>>> So, my suggestion is to go for value-wise poison, but maybe with a
>>> few
>>> tweaks from what I've presented (based on the feedback we received):
>>>  - Have 2 load instructions: one that does a value-wise access (if
>>>  any bit
>>> is poison in memory, the whole value becomes poison), and another
>>> version
>>> that freezes each bit individually (which is just syntactic sugar for
>>> a
>>> freeze(load <n x i1>) followed by a bitcast). The latter load is very
>>> close
>>> to the load instruction we have today.
>>>
>>> Adding this extra instruction has some benefits: it simplifies IR
>>> auto-upgrade, and frontends can move to the new instruction
>>> incrementally.
>>> Also, it seems that C++ codegen could use the freezing load in a few
>>> places.
>>
>> Can we be specific on what "few places" we're talking about? I assume
>> this means at least:
>>
>>  1. Accesses to bit fields (or other places where Clang generates
>> loads larger than the underlying data type)
>>  2. memcpy-like operations
>>  3. Accesses though character pointers? (because character pointers
>> can be used to access data of any type)
>
> 1. Bit fields definitely. Sometimes clang needs to widen loads because
> of the ABI, but these can probably be changed to vector loads instead.
> 2. Clang introduces memcpys to copy objects (which may have
> uninitialized padding). My suggestion (contrary to my previous opinion)
> is to make memcpy a bit-wise copy to simplify things.
> 3. For char ptrs, not sure that's needed. You would be loading an i8,
> which is the type you are asking for, so it doesn't seem necessary.
> 4. I'm not a clang expert by any means, but I briefly spoke with Richard
> Smith during the dev meeting and he told me that there are some cases
> involving initialization of unions in C++ that may also need this
> freezing load. I don't know the exact details.
>
>
>>> On the negative side is that the freezing load is harder to move
>>> around and
>>
>> Why exactly are these harder to move around?
>
> Because they have an implicit freeze, which is also non-trivial to move.
> For example, these freezing loads cannot easily be sank into loops, like:
> %v = load %p
> while (...) { use(%v) }
>  =>
> while (...) { %v = load %p; use(%v) }
>
> The load inside the loop may return a different value for each
> uninitialized/poison bit in each loop iteration, while in the original
> code all iterations would see the same value for %v.
>
> So this load should be reserved for the cases when it's actually needed
> only.  For example, in clang, all loads from the program itself should
> use the value-wise-poison-load, not this one.
> This load is similar as if we were to make undef yield the same value
> for all uses. Since our load today returns undef for uninitialized
> memory, the current load would become pretty much equivalent to the load
> we are proposing.
> (we already established that having undef is not a good idea; this is
> just an analogy that will hopefully give some intuition and help
> understanding the concept of this new load)
>
> Nuno