[LLVMdev] Language lawyer question

Shantonu Sen ssen at apple.com
Tue Mar 11 22:22:00 PDT 2008

Does the test case indicate why it was added?

This is more of an implementation observation than a language
interpretation: if you add a "long l[6];" field, llvm-gcc still does
field-by-field copies, but with "long l[7];" it switches to machine-word
copies, at some larger size to a rep/movsl (on Intel), and at yet
another threshold to a memcpy(3) callout.
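
A minimal sketch of how one might watch those thresholds, assuming you compile the file to assembly (for example with -Os -S) and compare the two functions. The names x6, x7, copy_x6 and copy_x7 are hypothetical, and where the switch happens will depend on the compiler version and target:

```c
/* Hypothetical test structs: same leading fields, different array lengths. */
struct x6 { char c; short s; long l[6]; };
struct x7 { char c; short s; long l[7]; };

/* Each function is a single struct assignment, so the generated assembly
   shows which copy strategy the compiler picked for that size. */
void copy_x6(struct x6 *dst, const struct x6 *src) { *dst = *src; }
void copy_x7(struct x7 *dst, const struct x7 *src) { *dst = *src; }
```

Growing the array length further should eventually show the rep/movsl and memcpy forms, if the compiler in use has those thresholds.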

What part of LLVM's codegen for copying "struct x { char c; short s;
long l[6]; };" considers a movb + movw + 6 movl's to be efficient in
either time or space (I was using -Os)? What changes when the overall
structure gets to 64 bytes such that it decides it's more efficient to
copy a word at a time?

I think the test case is bogus in terms of language correctness, but  
it might be indicative of a missed optimization for doing structure  
copies. Is that what GCC's test case is actually trying to validate?  
If so, it probably falls under a "gcc test case" and not a "C test  
case", if one can differentiate them.
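
For reference, here is a self-contained variant of the quoted test that reports what happened instead of aborting. This is only a sketch: the function name bytes_not_copied is hypothetical, and the result depends on the compiler and optimization level, so no particular value should be expected:

```c
#include <stddef.h>
#include <string.h>

struct x { char c; short s; };

/* Returns how many bytes of the destination differ from 22 after a struct
   assignment. Zero means the padding was copied along with the fields;
   a nonzero result means the copy skipped the hole. */
size_t bytes_not_copied(void)
{
    struct x X, Y;
    const unsigned char *p = (const unsigned char *)&X;
    size_t i, mismatches = 0;

    memset(&X, 0, sizeof X);
    memset(&Y, 22, sizeof Y);
    X = Y;
    for (i = 0; i < sizeof X; i++)
        if (p[i] != 22)
            mismatches++;
    return mismatches;
}
```

The field bytes always compare equal, so any mismatch it reports can only come from padding.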

Maybe it would be reasonable for llvm-gcc to NOT copy the padding at
-O0 and do explicit field copies, but to copy the padding as a side
effect of an inlined memcpy() implementation for copying sizeof(struct
x) when optimization is used. Copying using the largest appropriate
registers/instructions given the structure size and alignment seems
like it would always be faster than field copies, even for small
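
The two strategies contrasted above can be sketched in source form as follows. This is an illustration only, not what llvm-gcc emits; copy_fields and copy_bytes are hypothetical names:

```c
#include <stddef.h>
#include <string.h>

struct x { char c; short s; long l[6]; };

/* Field-by-field copy: only the named members are assigned, so the
   padding in *dst is not guaranteed to match the padding in *src. */
void copy_fields(struct x *dst, const struct x *src)
{
    size_t i;

    dst->c = src->c;
    dst->s = src->s;
    for (i = 0; i < sizeof src->l / sizeof src->l[0]; i++)
        dst->l[i] = src->l[i];
}

/* Whole-object copy: every byte of the struct, padding included,
   is transferred. */
void copy_bytes(struct x *dst, const struct x *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

After copy_bytes, a memcmp() of the two objects over sizeof(struct x) compares equal; after copy_fields only the members are sure to match.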


Sent from my MacBook

On Mar 11, 2008, at 9:47 PM, Patrick Meredith wrote:

> I thought pointer referencing like this was only valid for arrays.
> I could be wrong, but it might be that looping over the struct like
> that is invalid, making it undefined behavior (and then the hole
> doesn't matter because there is no valid way to access it).  That
> said, I've definitely seen a lot of code that uses pointers to
> reference struct contents.
> On Mar 11, 2008, at 10:42 PM, Dale Johannesen wrote:
>> Looking through the gcc testsuite turned up an interesting edge  
>> case.  Let's assume our target leaves a hole for alignment in  
>> struct x, as do x86 and powerpc.  Do you think the following code  
>> can validly abort?
>>   struct x { char c; short s; };
>>   struct x X, Y;
>>   int i;    char *p;
>>   memset(&X, 0, sizeof(struct x));
>>   memset(&Y, 22, sizeof(struct x));
>>   X = Y;
>>   for (i=0, p=(char *)&X; i<sizeof(struct x); i++, p++)
>>     if (*p != 22)
>>       abort();
>> The memset's and char-by-char comparison are clearly valid  
>> references; the questionable bit is the struct copy, which llvm-gcc  
>> currently does field-by-field, skipping the hole.  C99 says:
>>   "In simple assignment (=), the value of the right operand is
>>   converted to the type of the assignment expression and replaces
>>   the value stored in the object designated by the left operand."
>> I would take "replaces the value" to mean "replaces the entire  
>> value", but it could be read otherwise, I suppose.
>> The current code seems to me to assume holes in structs can't ever  
>> be validly accessed, which isn't right, as we see here.  They are  
>> often nondeterministic (this is explicit for initialization in C99)  
>> but not always.
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev