[llvm-dev] RFC: Killing undef and spreading poison
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Wed Nov 9 11:09:11 PST 2016
----- Original Message -----
> From: "Nuno Lopes via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "Eli' 'Friedman" <efriedma at codeaurora.org>, "Sanjoy Das" <sanjoy at playingwithpointers.com>
> Cc: llvm-dev at lists.llvm.org, "John Regehr" <regehr at cs.utah.edu>
> Sent: Wednesday, November 9, 2016 10:44:19 AM
> Subject: Re: [llvm-dev] RFC: Killing undef and spreading poison
>
> > On 11/8/2016 3:32 PM, Sanjoy Das wrote:
> >> Hi Nuno, Chandler,
> >>
> >> Nuno Lopes via llvm-dev wrote:
> >> > This program stores 8 bits, and leaves the remaining 24 bits
> >> > uninitialized. It then loads 16 bits, half initialized to %v,
> >> > half
> >> > uninitialized. SROA transforms the above function to:
> >> >
> >> > define i16 @g(i8 %in) {
> >> > %v = add nsw i8 127, %in
> >> > %1 = zext i8 %v to i16
> >> > %2 = shl i16 %1, 8
> >> > %3 = and i16 undef, 255
> >> > %4 = or i16 %3, %2
> >> > ret i16 %4
> >> > }
> >>
> >> This program above returns i16 poison only if "shl i16 poison, 8"
> >> is a
> >> full value poison. Whether that is the case today or not is
> >> anybody's
> >> guess (as Eli says, we should not rely too much on semantics that
> >> are
> >> known to be broken), so the important question is whether "shl i16
> >> poison, 8" _can_ be defined to be poison only on the higher 8 bits
> >> in
> >> a world with bitwise poison. If we could make that happen, we'd
> >> also
> >> get some other fringe benefits, like ComputeKnownBits would be
> >> able to
> >> fearlessly return "the low 8 bits of `shl i16 %val, 8` are 0" as
> >> opposed to "the low 8 bits of `shl i16 %val, 8` are 0 or poison".
> >>
> >> One problem with shl always generating non-poison zeroes in the
> >> lower
> >> bits is that it then is not equivalent to multiplication. This
> >> breaks
> >> optimizations like:
> >>
> >> %t = ...
> >> %q = shl i32 %t, 6
> >>
> >> =/=>
> >>
> >> %t = ...
> >> %q = mul i32 %t, 64
> >>
> >> and (probably more meaningful)
> >>
> >> %t = ...
> >> %q0 = shl i32 %t, 6
> >> %q1 = mul i32 %q0, 3
> >>
> >> =/=>
> >>
> >> %t = ...
> >> ;; %q0 = shl i32 %t, 6
> >> %q1 = mul i32 %q0, (3 * 64)
> >>
> >>
> >> and (probably even worse) it also breaks reassociation:
> >>
> >> %t = ...
> >> %q0 = shl i32 %t, %P
> >> %q1 = mul i32 %q0, %Q
> >>
> >> (No other uses of %q0, but there are uses of %q1)
> >>
> >> =/=>
> >>
> >> %t = ...
> >> %q0 = mul i32 %t, %Q
> >> %q1 = shl i32 %q1, %P
> >>
> >>
> >> I'm especially worried about this last re-association example
> >> since
> >> SCEV does things like this internally when canonicalizing
> >> expressions,
> >> and I'm not convinced that we can change it to tiptoe around these
> >> problematic cases.
> >
> > Well, we could say non-nsw add and mul are actually "bitwise"
> > operations, so "mul i32 poison, 2" is guaranteed to have its
> > bottom bit zero (but "mul nsw i32 poison, 2" is poison). Of
> > course, there's a limit to how far we can go in that direction, or
> > we just end up with the old notion of undef. Off the top of my
> > head, I'm not exactly sure where that line is.
>
> Right, we could try to make it work. I don't have a good model for
> bit-wise poison yet, but it might be possible. One of the things it
> will break is rewrites into comparisons, since at least comparisons
> will need to return poison if any bit of the inputs is poison. So
> the direction of "arithmetic -> comparison" gets broken (InstCombine
> has a few of these).
>
> But before we try to make bit-wise poison work, I would like to
> understand what are the deficiencies of value-wise poison. Is there
> any major issue? I'm asking because value-wise poison seems to be a
> much simpler concept, so if it works ok it seems to be preferable.
>
> During the meeting last week, Richard Smith proposed that instead we
> add a "load not_poison %ptr" (like load atomic) instruction that
> would be equivalent to load+freeze+bitcast, because he is concerned
> that some C++ stuff needs to be lowered to such a load. This load
> then becomes tricky to move around (e.g., cannot be sank to inside
> loops), but there are options to attenuate that problem if
> necessary.
Wouldn't we need to do this for all loads? Does this solve the load/store <-> memcpy equivalence problem?
Thanks again,
Hal
> Perhaps this construct would make John McCall happy as well (re. his
> question during the talk last week)?
>
> Thanks,
> Nuno
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
More information about the llvm-dev
mailing list