[llvm-commits] [llvm] r66366 - in /llvm/trunk: lib/Transforms/Scalar/ScalarReplAggregates.cpp test/Transforms/ScalarRepl/2008-06-22-LargeArray.ll test/Transforms/ScalarRepl/vector_memcpy.ll

Mon Mar 9 13:31:50 PDT 2009

On Mar 8, 2009, at 11:35 PM PDT, Chris Lattner wrote:
> On Mar 8, 2009, at 1:09 PM, Eli Friedman wrote:
>> On Sun, Mar 8, 2009 at 11:54 AM, Chris Lattner <clattner at apple.com>
>> wrote:
>>> Do you have an example that would illustrate this problem?
>>
>> Compile the following with "clang -S -O2 -mattr=-sse".
>>
>> #include <string.h>
>> #include <stdio.h>
>> __attribute__((noinline)) void a(float* x, float* y) {
>>   float z;
>>   memcpy(&z, y, 4);
>>   memcpy(x, &z, 4);
>> }
>>
>> int main() {
>>   unsigned x = 2139095041, y;
>>   a((float*)&y,(float*)&x);
>>   if (y != 2139095041) {
>>     printf("y corrupted!\n");
>>     return 1;
>>   }
>>   return 0;
>> }
>
> Ok, this is pretty insane, and I'm sure that there are cases where
> SROA would get this "wrong" before too.  I think that the only
> reasonable course of action is to make the IR well defined w.r.t. load
> and store and treat this as an x86 backend bug.  Note that this has
> already been recognized to be a real performance issue (PR3560) as  
> well.

By the time the x86 backend sees this, it looks exactly like a
floating-point load and store written in the source.  The x86 backend
could detect that pattern (a load whose only use is a store) and turn
it back into integer loads and stores, but that seems like a bad
approach.  The x87 behavior is IEEE 754 conformant AFAICT, so the same
problem exists on other targets, at least theoretically, and you would
lose in the rare case where somebody wrote a float load and store and
actually wanted the exception to go off.

IMO, memcpy simply does not have the same semantics as a
floating-point load and store, and it's wrong for SROA to be making
that substitution.
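
To make the difference concrete, here is a minimal sketch in C of why the
test value matters.  2139095041 is 0x7F800001, which as an IEEE 754 single
is a signaling NaN.  A byte copy has to preserve that pattern; an x87
single-precision load with exceptions masked would, as far as I know, hand
back the quieted form instead.  The helper names below are hypothetical
(they are not from the thread or from LLVM), and quieted() only models the
bit change rather than executing an x87 load.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; not part of LLVM or the thread. */

/* IEEE 754 binary32: exponent all ones plus a nonzero mantissa is a NaN. */
static int is_nan_bits(uint32_t b) {
    return (b & 0x7F800000u) == 0x7F800000u && (b & 0x007FFFFFu) != 0;
}

/* On x86 the top mantissa bit is the quiet bit: 0 means signaling. */
static int is_signaling_nan_bits(uint32_t b) {
    return is_nan_bits(b) && (b & 0x00400000u) == 0;
}

/* Model of quieting: set the quiet bit, keep the rest of the payload.
   This is an assumption about what a value-level float load may do. */
static uint32_t quieted(uint32_t b) {
    return is_signaling_nan_bits(b) ? (b | 0x00400000u) : b;
}

/* Same shape as a() in the test case: bytes go through a float temporary.
   memcpy semantics require the output bits to equal the input bits. */
static uint32_t copy_through_float_bytes(uint32_t in) {
    float z;
    uint32_t out;
    memcpy(&z, &in, sizeof z);
    memcpy(&out, &z, sizeof out);
    return out;
}

/* Checks the claims made above for the value used in the test case. */
static void self_check(void) {
    const uint32_t x = 2139095041u;
    assert(x == 0x7F800001u);                 /* the test constant */
    assert(is_signaling_nan_bits(x));         /* it is a signaling NaN */
    assert(copy_through_float_bytes(x) == x); /* byte copy preserves it */
    assert(quieted(x) == 0x7FC00001u);        /* a quieting load would not */
}
```

Calling self_check() from a main() exercises all four claims.  The point of
the sketch is that copy_through_float_bytes() and quieted() can disagree on
the same input, which is the gap between memcpy and a float load/store.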