[llvm-dev] RFC: Killing undef and spreading poison

Wed Oct 19 09:49:27 PDT 2016

>>>> Memcpy does a byte-by-byte copy. So if one of the bits is poison then only the byte containing that bit becomes poison.
>>>> Therefore, memcpy(x, y, 1) is equivalent to load i8.  But memcpy(x,y,4) is not equivalent to "load i32" since load makes the whole value poison if any of the bits is poison.
>>>> The alternative as given by Eli is to use "load <4 x i8>".  Since now we are loading 4 separate values, poison does not extend past the byte boundary.  When load is lowered, you should get exactly the same code as with "load i32", though.
>>>> So the hope is that there's no diff at assembly level.
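To make the byte-wise semantics described above concrete, here is a rough Python model. It is only a sketch: a poison byte is represented as None, memory is a list of bytes, and the helper names (memcpy, load_i32, load_v4i8) and the little-endian byte order are assumptions for illustration, not LLVM APIs.

```python
from typing import List, Optional

# None stands for a poison byte; an int in 0..255 is a defined byte.
Byte = Optional[int]


def memcpy(dst: List[Byte], src: List[Byte], n: int) -> None:
    """Byte-wise copy: each byte keeps its own poison status."""
    for i in range(n):
        dst[i] = src[i]


def load_i32(mem: List[Byte]) -> Optional[int]:
    """Scalar load: a single poison byte makes the whole i32 poison."""
    if any(b is None for b in mem[:4]):
        return None
    # Little-endian is assumed here purely for illustration.
    return int.from_bytes(bytes(mem[:4]), "little")


def load_v4i8(mem: List[Byte]) -> List[Byte]:
    """Vector load of four i8 lanes: poison stays confined to its lane."""
    return list(mem[:4])


src = [0x11, None, 0x33, 0x44]  # the second byte is poison
dst = [0, 0, 0, 0]
memcpy(dst, src, 4)
```

In this model the memcpy result and the <4 x i8> load both keep three defined bytes, whereas the i32 load of the same memory is poison as a whole.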
>>>
>>> I'm curious. Where is it defined that memcpy copies byte by byte and not, for example, bit by bit? Why is the destination not identical to the source, with exactly the same bits poison?
>> 
>> I don't think it's written explicitly anywhere, but the C++ standard says the following:
>> "The fundamental storage unit in the C++ memory model is the byte."
>> [intro.memory - 1.7]
>> 
>> And then:
>> "most derived object shall have a non-zero size and shall occupy one or more bytes of storage."
>> "An object of trivially copyable or standard-layout type (3.9) shall occupy contiguous bytes of storage."
>> [intro.object - 1.8]
>> 
>> I'm not a language lawyer, but it seems that for C/C++, defining memcpy in terms of byte copying is sufficient. This holds even for bit-fields, since these are lowered into words whose size is a multiple of a byte.
>> 
>> However, even though this is fine for C, it's true that we may choose to do something else at the IR level.  It's easy to make memcpy a bit-wise copy; the question is whether there's a client that cares or not.  Please let us know if you are aware of such a client.
> 
> Isn't a struct containing a bitfield whose width is not a multiple of 8 bits already an example? A previous post indicated that the last byte, with the padding bits, needed special treatment.

I believe clang always uses a whole number of bytes to lower bitfields (even if some bits aren't used).  Therefore, AFAICT, clang will only introduce whole-byte undefs for padding at the end, never undefs whose width is not a multiple of 8 bits.

Nuno