[llvm-commits] [test-suite] r160413 - in /test-suite/trunk/SingleSource/Benchmarks/Misc: matmul_f64_4x4.c matmul_f64_4x4.reference_output

Wed Jul 18 13:20:52 PDT 2012

On Jul 18, 2012, at 1:12 PM, Andrew Trick <atrick at apple.com> wrote:

> On Jul 17, 2012, at 5:23 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>> +/* Allow mul4 to be inlined into wrap_mul4. This actually enables further
>> + * optimizations. */
>> +__attribute__((__noinline__))
>> +void wrap_mul4(double *Out, const double A[4][4], const double B[4][4])
>> +{
>> +  mul4(Out, A, B);
>> +}
> 
> This is not obvious to me. Can you explain?

First, mul4() is optimized on its own. Then it is inlined into wrap_mul4, and the combined function is optimized again.

The second pass somehow tickles SROA in a way that causes it to turn the whole double[16] array into an i1024.

That doesn't happen without the extra wrapper function. See also http://llvm.org/pr13392
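
For readers without the full patch at hand, the wrapper pattern can be sketched roughly as below. Only wrap_mul4 is taken from the quoted diff. The body of mul4 is not shown in this thread, so the version here (including the local Res temporary and the static linkage) is a hypothetical stand-in, and it is not claimed to reproduce the SROA behavior described above.

```c
/* Hypothetical stand-in for mul4: the quoted patch does not show its body.
 * It computes the 4x4 product A*B into a local array (an assumption made
 * for illustration) and then copies the result to Out in row-major order. */
static void mul4(double *Out, const double A[4][4], const double B[4][4])
{
  double Res[16];
  for (int i = 0; i < 4; ++i) {
    for (int j = 0; j < 4; ++j) {
      double Sum = 0.0;
      for (int k = 0; k < 4; ++k)
        Sum += A[i][k] * B[k][j];
      Res[4 * i + j] = Sum;
    }
  }
  for (int i = 0; i < 16; ++i)
    Out[i] = Res[i];
}

/* From the quoted diff: the wrapper itself is never inlined into its
 * callers, but mul4 may be inlined into it, so the mul4 code can be
 * optimized a second time in this context. */
__attribute__((__noinline__))
void wrap_mul4(double *Out, const double A[4][4], const double B[4][4])
{
  mul4(Out, A, B);
}
```

Callers go through wrap_mul4 rather than calling mul4 directly, so the noinline attribute only fixes the outer boundary and leaves the inliner free to merge mul4 into the wrapper.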

/jakob
