[llvm-dev] Killing undef and spreading poison

Nuno Lopes via llvm-dev llvm-dev at lists.llvm.org
Thu Dec 8 14:16:04 PST 2016


>> >> I've implemented both semantics in Alive (branches: newsema2 and
>> >> bitwise-poison2, respectively) and then we ran Alive over a large
>> >> set of optimizations to see which were ok and which became wrong.
>> >>
>> >> Many optimizations became wrong with either of the bitwise
>> >> semantics.
>> >
>> > To be clear, "wrong" here means that the value resulting from the
>> > transformation might have more poison bits for some inputs than the
>> > original value, correct?
>>
>> The vast majority of the examples were of that case, yes.  But we
>> also have a few cases where the transformed expression computes a
>> different value (I don't have a record of these handy, sorry).
>
> Those are just bugs, right?

Well, increasing the number of poison bits would also be considered a bug.
I guess the main issue is that some InstCombine transformations that we
consider correct today are not correct under the bitwise poison semantics.
Some of these transformations start producing wrong values because poison
does not propagate far enough for us to discard the inputs that trigger
different output values as "uninteresting".  Bitwise poison semantics has
too many opportunities to lose poison (e.g., shifting out poison bits,
arithmetic instructions where no carry crosses the poison bits, 'and' with
zero bits, 'or' with one bits, etc).
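
To make the "losing poison" cases concrete, here is a small Python sketch of
one possible bitwise model. It is an assumption made for illustration, not
the semantics implemented in the Alive branches: a value is a pair
(value, poison_mask), each poison bit is treated as an independent unknown,
and a result bit counts as poison only if it changes across the possible
choices for the poison input bits. The names concretizations and lift are
hypothetical helpers.

```python
from itertools import product

WIDTH = 8
MASK = (1 << WIDTH) - 1


def concretizations(value, poison):
    """Yield every concrete integer obtained by picking 0/1 for each poison bit."""
    positions = [i for i in range(WIDTH) if (poison >> i) & 1]
    base = value & ~poison & MASK
    for choice in product((0, 1), repeat=len(positions)):
        v = base
        for pos, bit in zip(positions, choice):
            v |= bit << pos
        yield v


def lift(op):
    """Lift a concrete binary op to (value, poison_mask) pairs (illustrative model)."""
    def lifted(a, b):
        results = [op(x, y) & MASK
                   for x in concretizations(*a)
                   for y in concretizations(*b)]
        first = results[0]
        poison = 0
        for r in results:
            poison |= r ^ first  # a bit that varies is treated as poison
        return (first & ~poison & MASK, poison)
    return lifted


shl = lift(lambda x, n: x << n)
and_ = lift(lambda x, y: x & y)
or_ = lift(lambda x, y: x | y)
add = lift(lambda x, y: x + y)

top_poison = (0x00, 0x80)  # only the most significant bit is poison
low_poison = (0x00, 0x01)  # only the least significant bit is poison

print(shl(top_poison, (1, 0)))      # (0, 0): the poison bit is shifted out
print(and_(top_poison, (0x0F, 0)))  # (0, 0): masked by a zero bit
print(or_(top_poison, (0xF0, 0)))   # (240, 0): covered by a one bit
print(add(low_poison, (0x02, 0)))   # (2, 1): no carry, upper bits stay defined
print(add(low_poison, (0x01, 0)))   # (0, 3): a possible carry spreads poison
```

In the first four cases the result is fully or mostly defined even though an
input had a poison bit. Under value-wise poison all five results would be
poison as a whole.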


>> >> Adding this extra instruction has some benefits: it simplifies IR
>> >> auto-upgrade, and frontends can move to the new instruction
>> >> incrementally.
>> >> Also, it seems that C++ codegen could use the freezing load in a
>> >> few places.
>> >
>> > Can we be specific on what "few places" we're talking about? I
>> > assume this means at least:
>> >
>> >  1. Accesses to bit fields (or other places where Clang generates
>> >  loads larger than the underlying data type)
>> >  2. memcpy-like operations
>> >  3. Accesses through character pointers? (because character pointers
>> >  can be used to access data of any type)
>>
>> 1. Bit fields definitely. Sometimes clang needs to widen loads
>> because of the ABI, but these can probably be changed to vector
>> loads instead.
>> 2. Clang introduces memcpys to copy objects (which may have
>> uninitialized padding). My suggestion (contrary to my previous
>> opinion) is to make memcpy a bit-wise copy to simplify things.
>
> And then how exactly do we turn small memcpys into "regular" loads/stores?

By using bit vector loads/stores (i.e., <n x i1>).  I was really happy to
see that the codegen of i1 vectors has gotten some attention recently btw!
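
A rough Python sketch of why the element type matters when a small memcpy is
turned into a load/store pair. The model is an assumption for illustration
only: memory is a list of bits, None marks an uninitialized (poison) bit, an
integer load becomes poison as a whole if any of its bits is poison, and an
<n x i1> load keeps the poison state per lane. The helper names are
hypothetical.

```python
POISON = None  # marker for an uninitialized / poison bit in this toy memory


def load_valuewise(mem, offset, nbits):
    """Model an iN load: any poison bit makes the whole loaded value poison."""
    bits = mem[offset:offset + nbits]
    return POISON if any(b is POISON for b in bits) else list(bits)


def load_bitvector(mem, offset, nbits):
    """Model an <n x i1> load: every lane keeps its own poison state."""
    return list(mem[offset:offset + nbits])


def store(mem, offset, nbits, value):
    """Store a loaded value; a poison value writes poison to every bit."""
    mem[offset:offset + nbits] = [POISON] * nbits if value is POISON else value


# An object with 8 initialized bits followed by 8 bits of uninitialized padding.
src = [1, 0, 1, 1, 0, 0, 1, 0] + [POISON] * 8

dst_int = [0] * 16
store(dst_int, 0, 16, load_valuewise(src, 0, 16))  # copy as one i16

dst_vec = [0] * 16
store(dst_vec, 0, 16, load_bitvector(src, 0, 16))  # copy as <16 x i1>

print(dst_int == [POISON] * 16)  # True: the initialized field was lost
print(dst_vec == src)            # True: a faithful bit-wise copy
```

In this model only the bit vector copy behaves like a bit-wise memcpy for
objects with uninitialized padding.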

Nuno 


