[llvm-dev] RFC: Killing undef and spreading poison

Nuno Lopes via llvm-dev llvm-dev at lists.llvm.org
Wed Nov 9 15:13:12 PST 2016


>> > On 11/8/2016 3:32 PM, Sanjoy Das wrote:
>> >> Hi Nuno, Chandler,
>> >>
>> >> Nuno Lopes via llvm-dev wrote:
>> >> > This program stores 8 bits, and leaves the remaining 24 bits
>> >> > uninitialized. It then loads 16 bits, half initialized to %v,
>> >> > half
>> >> > uninitialized. SROA transforms the above function to:
>> >> >
>> >> > define i16 @g(i8 %in) {
>> >> >    %v = add nsw i8 127, %in
>> >> >    %1 = zext i8 %v to i16
>> >> >    %2 = shl i16 %1, 8
>> >> >    %3 = and i16 undef, 255
>> >> >    %4 = or i16 %3, %2
>> >> >    ret i16 %4
>> >> > }
>> >>
>> >> This program above returns i16 poison only if "shl i16 poison, 8"
>> >> is a
>> >> full value poison.  Whether that is the case today or not is
>> >> anybody's
>> >> guess (as Eli says, we should not rely too much on semantics that
>> >> are
>> >> known to be broken), so the important question is whether "shl i16
>> >> poison, 8" _can_ be defined to be poison only on the higher 8 bits
>> >> in
>> >> a world with bitwise poison.  If we could make that happen, we'd
>> >> also
>> >> get some other fringe benefits, like ComputeKnownBits would be
>> >> able to
>> >> fearlessly return "the low 8 bits of `shl i16 %val, 8` are 0" as
>> >> opposed to "the low 8 bits of `shl i16 %val, 8` are 0 or poison".
>> >>
>> >> One problem with shl always generating non-poison zeroes in the
>> >> lower
>> >> bits is that it then is not equivalent to multiplication.  This
>> >> breaks
>> >> optimizations like:
>> >>
>> >>   %t = ...
>> >>   %q = shl i32 %t, 6
>> >>
>> >> =/=>
>> >>
>> >>   %t = ...
>> >>   %q = mul i32 %t, 64
>> >>
>> >> and (probably more meaningful)
>> >>
>> >>   %t = ...
>> >>   %q0 = shl i32 %t, 6
>> >>   %q1 = mul i32 %q0, 3
>> >>
>> >> =/=>
>> >>
>> >>   %t = ...
>> >>   ;; %q0 = shl i32 %t, 6
>> >>   %q1 = mul i32 %q0, (3 * 64)
>> >>
>> >>
>> >> and (probably even worse) it also breaks reassociation:
>> >>
>> >>   %t = ...
>> >>   %q0 = shl i32 %t, %P
>> >>   %q1 = mul i32 %q0, %Q
>> >>
>> >> (No other uses of %q0, but there are uses of %q1)
>> >>
>> >> =/=>
>> >>
>> >>   %t = ...
>> >>   %q0 = mul i32 %t, %Q
>> >>   %q1 = shl i32 %q1, %P
>> >>
>> >>
>> >> I'm especially worried about this last re-association example
>> >> since
>> >> SCEV does things like this internally when canonicalizing
>> >> expressions,
>> >> and I'm not convinced that we can change it to tiptoe around these
>> >> problematic cases.
>> >
>> > Well, we could say non-nsw add and mul are actually "bitwise"
>> > operations, so "mul i32 poison, 2" is guaranteed to have its
>> > bottom bit zero (but "mul nsw i32 poison, 2" is poison).  Of
>> > course, there's a limit to how far we can go in that direction, or
>> > we just end up with the old notion of undef.  Off the top of my
>> > head, I'm not exactly sure where that line is.
>>
>> Right, we could try to make it work.  I don't have a good model for
>> bit-wise poison yet, but it might be possible.  One of the things it
>> will break is rewrites into comparisons, since at least comparisons
>> will need to return poison if any bit of the inputs is poison. So
>> the direction of "arithmetic -> comparison" gets broken (InstCombine
>> has a few of these).
>>
>> But before we try to make bit-wise poison work, I would like to
>> understand what are the deficiencies of value-wise poison. Is there
>> any major issue?  I'm asking because value-wise poison seems to be a
>> much simpler concept, so if it works ok it seems to be preferable.
>>
>> During the meeting last week, Richard Smith proposed that instead we
>> add a "load not_poison %ptr" (like load atomic) instruction that
>> would be equivalent to load+freeze+bitcast, because he is concerned
>> that some C++ stuff needs to be lowered to such a load. This load
>> then becomes tricky to move around (e.g., cannot be sank to inside
>> loops), but there are options to attenuate that problem if
>> necessary.
>
>Wouldn't we need to do this for all loads?

I don't think so. Most loads can be plain value-wise poison tainting 
(lacking better naming). Only loads that may touch uninitialized memory in a 
*valid* way need to be frozen. Richard's example was calling the constructor 
with a union or something similar (sorry, don't remember the exact details). 
For those cases it's probably easier for the front-end to legally read past 
boundaries.


> Does this solve the load/store <-> memcpy equivalence problem?

If we define memcpy to be a bit-wise copy, then it would be equivalent to a 
bit-wise load/store (e.g., <n x i1>).  I know this is not well supported by 
LLVM at the moment, but we can fix that.

Nuno 



More information about the llvm-dev mailing list