[llvm-dev] llvm.memcpy for struct copy

Fri Feb 2 02:59:35 PST 2018

On 1 Feb 2018, at 18:39, Friedman, Eli <efriedma at codeaurora.org> wrote:
> 
> On 2/1/2018 2:03 AM, David Chisnall via llvm-dev wrote:
>> In contrast, the padding between fields in non-packed structs disappears as soon as SROA runs. This can lead to violations of C semantics, where padding fields should not change (because C defines bitwise comparisons on structs using memcmp). This can lead to subtly different behaviour in C code depending on the target ABI (we’ve seen cases where trailing padding is copied in one ABI but not in another, depending solely on pointer size). 
> 
> The IR type of an alloca isn't supposed to affect the semantics; it's just a sizeof(type) block of bytes.  We haven't always gotten this right in the past, but it should work correctly on trunk, as far as I know.  If you have an IR testcase where this still doesn't work correctly, please file a bug.

It’s not an IR test case.  We have a C struct that is {void*, int}.  On a system with 8-byte pointers, this becomes an LLVM struct { i8*, i8 }.  On a system with 16-byte pointers, clang lowers it to { i8*, i8, [12 x i8] }.  From the perspective of SROA, the [12 x i8] is a real field.  When a function is called with the struct, it is lowered to taking an explicit [12 x i8] argument, whereas the other version takes only i8* and i8 in registers.  This means that if the callee writes the data out to memory and then performs a memcmp, the 8-byte-pointer version may not have the same padding, whereas the 16-byte-pointer version will.

In the code that we were using (the DukTape JavaScript interpreter), the callee didn’t actually look at the padding bytes in either case, so we just ended up with less efficient code in the 16-byte-pointer case, but the same could equally have generated incorrect code for the 8-byte-pointer case.

David