[cfe-commits] r51673 - in /cfe/trunk: lib/CodeGen/CodeGenTypes.cpp test/CodeGen/struct-x86-darwin.c test/CodeGen/struct.c
Devang Patel
dpatel at apple.com
Fri May 30 08:38:20 PDT 2008
On May 30, 2008, at 1:24 AM, Eli Friedman wrote:
> On Thu, May 29, 2008 at 6:02 PM, Devang Patel <dpatel at apple.com>
> wrote:
>> Consider
>>
>> struct { char c; int i; }
>>
>> At llvm level you use either
>>
>> { i8, i32 }
>>
>> or
>>
>> <{ i8, i8, i8, i8, i32 }>
>>
>> The packed version involves 5 copies during struct element by
>> element copy.
>> And it also raises the issue of dealing with undefined bytes.
>>
>> Plus, such layout change causes changes in the arithmetic involved
>> to access
>> bit fields. I have myself once added two line patch in llvm-gcc to
>> aggressively (but not all the time) use packed LLVM struct. This
>> simplified
>> code in llvm-gcc but I had to back out that patch after few weeks.
>> Because
>> our customer reported 25% performance regression and we tracked it
>> down to
>> this change in struct layout at llvm level, which introduced
>> significant
>> penalty while accessing union bitfields.
>>
>> You can argue that 1) the there is no guaranteed solution if the code
>> performance relies on such behavior and 2) the code that emits
>> bitfield
>> access in llvm-gcc can be made smarter.
>> But the point is, such changes have performance impact, so do not
>> take it
>> lightly.
>
> Okay then, I'll take it into consideration. But I'm more worried
> about getting correct code than about the performance of said code at
> the moment.
I understand. I'm fine with a FIXME for now.
>>> When LLVM packs a struct, it appears to use the tightest packing
>>> possible, which is in fact 10 bytes for an x86 long double. The
>>> issue
>>> is that gcc-compatible C struct layout doesn't allow packing them
>>> that
>>> tightly (might be considered a bug, but not really something that
>>> can
>>> be fixed at this point). Not sure how to deal with it (it doesn't
>>> show up with types with power-of-two width).
>>
>> I am at lost. When you say "LLVM", you mean llvm-gcc ?
>
> No, by LLVM I mean the LLVM assembly-level packed struct. <{ x86_fp80
> }> is 10 bytes. struct __attribute((packed)) {long double x;} is 12
> bytes on x86 Linux, and 16 bytes on x86-64 and x86 Darwin.
In that case FE has to do the right thing.
On darwin-x86, llvm-gcc lowers this into
<{ x86_fp80, [6 x i8] }>
-
Devang
More information about the cfe-commits
mailing list