[llvm-dev] Killing undef and spreading poison

Nuno Lopes via llvm-dev llvm-dev at lists.llvm.org
Wed Dec 7 15:27:31 PST 2016


>>  1) a bit in the output is poison if flipping any set of poison bits
>>  in the
>> input may yield different values for the output bit.
>
> This is the definition I'd expect.
>
>> For example (bitwise notation):
>> ppp * 000 == 000 (since flipping any of the poison bits cannot yield
>> a
>> result other than zero)
>> 00p + 00p == 0pp
>> ppp << 2 == p00
>> icmp ugt ppp, uint_max  == p  (special case)
>>
>> 2) (thanks to Sanjoy):  for an operation R = A op B, bit R[i] is
>> poison if
>> bit i of 'A op undef' may yield 0 and 1  (same for 'undef op B').
>> This one unfortunately breaks some algebraic rewrites like: x * -1
>>   ->   0
>> - x
>>
>>
>> I've implemented both semantics in Alive (branches: newsema2 and
>> bitwise-poison2, respectively) and then we run Alive over a large set
>> of
>> optimizations to see which were ok and which became wrong.
>>
>> Many optimizations became wrong with either of the bitwise semantics.
>
> To be clear, "wrong" here means that the value resulting from the 
> transformation might have more poison bits for some inputs than the 
> original value, correct?

The vast majority of the examples were of that case, yes.  But we also have 
a few cases where the transformed expression computes a different value (I 
don't have a record of these handy, sorry).


>> So, my suggestion is to go for value-wise poison, but maybe with a
>> few
>> tweaks from what I've presented (based on the feedback we received):
>>  - Have 2 load instructions: one that does a value-wise access (if
>>  any bit
>> is poison in memory, the whole value becomes poison), and another
>> version
>> that freezes each bit individually (which is just syntactic sugar for
>> a
>> freeze(load <n x i1>) followed by a bitcast). The latter load is very
>> close
>> to the load instruction we have today.
>>
>> Adding this extra instruction has some benefits: it simplifies IR
>> auto-upgrade, and frontends can move to the new instruction
>> incrementally.
>> Also, it seems that C++ codegen could use the freezing load in a few
>> places.
>
> Can we be specific on what "few places" we're talking about? I assume this 
> means at least:
>
>  1. Accesses to bit fields (or other places where Clang generates loads 
> larger than the underlying data type)
>  2. memcpy-like operations
>  3. Accesses though character pointers? (because character pointers can be 
> used to access data of any type)

1. Bit fields definitely. Sometimes clang needs to widen loads because of 
the ABI, but these can probably be changed to vector loads instead.
2. Clang introduces memcpys to copy objects (which may have uninitialized 
padding). My suggestion (contrary to my previous opinion) is to make memcpy 
a bit-wise copy to simplify things.
3. For char ptrs, not sure that's needed. You would be loading an i8, which 
is the type you are asking for, so it doesn't seem necessary.
4. I'm not a clang expert by any means, but I briefly spoke with Richard 
Smith during the dev meeting and he told me that there are some cases 
involving initialization of unions in C++ that may also need this freezing 
load. I don't know the exact details.


>> On the negative side is that the freezing load is harder to move
>> around and
>
> Why exactly are these harder to move around?

Because they have an implicit freeze, which is also non-trivial to move. 
For example, these freezing loads cannot easily be sank into loops, like:
%v = load %p
while (...) { use(%v) }
  =>
while (...) { %v = load %p; use(%v) }

The load inside the loop may return a different value for each 
uninitialized/poison bit in each loop iteration, while in the original code 
all iterations would see the same value for %v.

So this load should be reserved for the cases when it's actually needed 
only.  For example, in clang, all loads from the program itself should use 
the value-wise-poison-load, not this one.
This load is similar as if we were to make undef yield the same value for 
all uses. Since our load today returns undef for uninitialized memory, the 
current load would become pretty much equivalent to the load we are 
proposing.
(we already established that having undef is not a good idea; this is just 
an analogy that will hopefully give some intuition and help understanding 
the concept of this new load)

Nuno 



More information about the llvm-dev mailing list