<div dir="ltr"><div><div>Oh, I see. Yes, this works:<br><br><span style="font-family:monospace,monospace">__declspec(noalias) <br>void f1(       double c[restrict DIM][DIM], <br>         const double a[restrict DIM][DIM], <br>         const double b[restrict DIM][DIM] )<br>{<br><br>#pragma clang loop unroll_count(UNROLL_DIM)<br>    for( int i=0;i<DIM;i++)<br><br>#pragma clang loop unroll_count(UNROLL_DIM)<br>        for( int j=0;j<DIM;j++)<br><br>#pragma clang loop  unroll_count(UNROLL_DIM)<br>            for( int k=0;k<DIM;k++) {<br>                c[i][k] = c[i][k] + a[i][j]*b[j][k];<br>            }<br>}<br><br></span></div><span style="font-family:monospace,monospace"><font face="arial,helvetica,sans-serif">By "works" I mean that the loop-invariant load of a[i][j] is now optimized out of the inner loop.  Thanks.<br><br></font></span></div><span style="font-family:monospace,monospace"><font face="arial,helvetica,sans-serif">Phil<br></font></span><div><div><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Nov 18, 2016 at 3:29 PM, Hal Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Phil,<br>

<br>

I'm not sure whether we do anything with __declspec(noalias), but if I had to guess, when you used restrict, you did not do it correctly. You can see <a href="http://en.cppreference.com/w/c/language/restrict" rel="noreferrer" target="_blank">http://en.cppreference.com/w/<wbr>c/language/restrict</a> for some additional usage examples.<br>

<br>

 -Hal<br>

<span class=""><br>

----- Original Message -----<br>

> From: "Phil Tomson via llvm-dev" <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

> To: "Ashutosh Nema" <<a href="mailto:Ashutosh.Nema@amd.com">Ashutosh.Nema@amd.com</a>><br>

> Cc: "llvm-dev" <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

> Sent: Friday, November 18, 2016 12:00:58 PM<br>

> Subject: Re: [llvm-dev] Loop invariant not being optimized<br>

><br>


> I tried changing 'noalias' to 'restrict' in the code and I get:<br>

><br>

> fma.c:17:12: warning: 'restrict' attribute only applies to return<br>

> values that are pointers<br>

><br>

> It seems like 'noalias' would be the correct attribute here, from the<br>

> article you linked:<br>

><br>

</span>> "if a function is annotated as noalias , the optimizer can assume<br>

<div><div class="h5">> that, in addition to the parameters themselves, only first-level<br>

> indirections of pointer parameters are referenced or modified inside<br>

> the function. The visible global state is the set of all data that<br>

> is not defined or referenced outside of the compilation scope, and<br>

> their address is not taken."<br>

><br>

> Phil<br>

><br>

><br>

><br>

><br>

><br>

> On Thu, Nov 17, 2016 at 9:50 PM, Nema, Ashutosh <<br>

> <a href="mailto:Ashutosh.Nema@amd.com">Ashutosh.Nema@amd.com</a> > wrote:<br>

><br>

><br>

><br>

><br>

><br>

><br>

> If I understood it correctly, __declspec(noalias) is not the same as<br>

> specifying restrict on each parameter.<br>

><br>

><br>

><br>

> It means that, in the example, f1 does not modify or reference<br>

> any global state, but a, b and c are still free to alias one another.<br>

><br>

><br>

><br>

> You could specify restrict on each one to indicate that they do not<br>

> alias each other.<br>

><br>

><br>

><br>

> For more details, see:<br>

> <a href="https://msdn.microsoft.com/en-us/library/k649tyc7.aspx" rel="noreferrer" target="_blank">https://msdn.microsoft.com/en-<wbr>us/library/k649tyc7.aspx</a><br>

><br>

><br>

><br>

> Regards,<br>

><br>

> Ashutosh<br>

><br>

><br>

><br>

><br>

><br>

><br>

> From: llvm-dev [mailto: <a href="mailto:llvm-dev-bounces@lists.llvm.org">llvm-dev-bounces@lists.llvm.<wbr>org</a> ] On Behalf<br>

> Of Phil Tomson via llvm-dev<br>

> Sent: Friday, November 18, 2016 12:23 AM<br>

> To: LLVM Developers Mailing List < <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a> ><br>

> Subject: [llvm-dev] Loop invariant not being optimized<br>

><br>


> I've got an example where I think that there should be some<br>

> loop-invariant optimization happening, but it's not. Here's the C<br>

> code:<br>

><br>

> #define DIM 8<br>

> #define UNROLL_DIM DIM<br>

> typedef double InArray[DIM][DIM];<br>

><br>

> __declspec(noalias) void f1( InArray c, const InArray a, const<br>

> InArray b )<br>

> {<br>

><br>

> #pragma clang loop unroll_count(UNROLL_DIM)<br>

> for( int i=0;i<DIM;i++)<br>

> #pragma clang loop unroll_count(UNROLL_DIM)<br>

> for( int j=0;j<DIM;j++)<br>

> #pragma clang loop unroll_count(UNROLL_DIM)<br>

> for( int k=0;k<DIM;k++) {<br>

> c[i][k] = c[i][k] + a[i][j]*b[j][k];<br>

> }<br>

> }<br>

><br>

> The "a[i][j]" there is invariant in that inner loop. I've unrolled<br>

> the loops with the unroll pragma to make the assembly easier to<br>

> read; here's what I see (LLVM 3.9, compiling with: clang<br>

> -fms-compatibility -funroll-loops -O3 -c fma.c -o fma.o )<br>

><br>

><br>

> 0000000000000000 <f1>:<br>

> 0: 29580c0000000000 load r3,r0,0x0,64<br>

> 8: 2958100200000000 load r4,r1,0x0,64 #r4 <- a[0][0]<br>

> 10: 2958140400000000 load r5,r2,0x0,64<br>

> 18: c0580c0805018000 fmaf r3,r4,r5,r3,64<br>

> 20: 79b80c0000000000 store r3,r0,0x0,64<br>

> 28: 2958100000000008 load r4,r0,0x8,64<br>

> 30: 2958140200000000 load r5,r1,0x0,64 #r5 <- a[0][0]<br>

> 38: 2958180400000008 load r6,r2,0x8,64<br>

> 40: c058100a06020000 fmaf r4,r5,r6,r4,64<br>

> 48: 79b8100000000008 store r4,r0,0x8,64<br>

> 50: 2958140000000010 load r5,r0,0x10,64<br>

> 58: 2958180200000000 load r6,r1,0x0,64 #r6 <- a[0][0]<br>

> 60: 29581c0400000010 load r7,r2,0x10,64<br>

> 68: c058140c07028000 fmaf r5,r6,r7,r5,64<br>

> 70: 79b8140000000010 store r5,r0,0x10,64<br>

> 78: 2958180000000018 load r6,r0,0x18,64<br>

> 80: 29581c0200000000 load r7,r1,0x0,64 #r7 <- a[0][0]<br>

> 88: 2958200400000018 load r8,r2,0x18,64<br>

> 90: c058180e08030000 fmaf r6,r7,r8,r6,64<br>

> ...<br>

><br>

> (fmaf semantics are: fmaf r1,r2,r3,r4, SIZE r1 <- r2*r3+r4 )<br>

><br>

><br>

> (load semantics are: load r1,r2,imm, SIZE r1<- mem[r2+imm] )<br>

><br>

><br>

><br>

> All three values are loaded on every iteration of the inner loop, but<br>

> only two of them need to be reloaded there. I added the 'noalias' declspec in the<br>

> C code above thinking that it would indicate that the pointers going<br>

> into the function are not aliased and that that would allow the<br>

> optimization, but it didn't make any difference.<br>

><br>

> Of course it's easy to rewrite the example code to avoid this extra<br>

> load in the inner loop, but I would have thought this would be a fairly<br>

> straightforward optimization for the optimizer. Am I missing<br>

> something?<br>

><br>

> Phil<br>

><br>


</div></div>> ______________________________<wbr>_________________<br>

> LLVM Developers mailing list<br>

> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

><br>

<span class="HOEnZb"><font color="#888888"><br>

--<br>

Hal Finkel<br>

Lead, Compiler Technology and Programming Languages<br>

Leadership Computing Facility<br>

Argonne National Laboratory<br>

</font></span></blockquote></div><br></div>
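<div>For reference, here is a minimal sketch of the two approaches discussed in this thread: marking each array parameter with restrict, and rewriting the loop by hand so that a[i][j] is read once per (i, j) pair. The function names f1_restrict, f1_hoisted and variants_agree are hypothetical and not from the original code; the unroll pragmas and __declspec(noalias) are left out so the sketch builds with any C99 compiler. Whether a given compiler actually removes the redundant load still has to be checked in the generated assembly.</div>

```c
#define DIM 8

/* Variant 1: restrict on each parameter tells the compiler that c, a and b
 * do not overlap, so a store to c[i][k] cannot change a[i][j]. */
void f1_restrict(double c[restrict DIM][DIM],
                 const double a[restrict DIM][DIM],
                 const double b[restrict DIM][DIM])
{
    for (int i = 0; i < DIM; i++)
        for (int j = 0; j < DIM; j++)
            for (int k = 0; k < DIM; k++)
                c[i][k] = c[i][k] + a[i][j] * b[j][k];
}

/* Variant 2: hoist the invariant load by hand. This is only equivalent to
 * the original loop when c does not overlap a; here the programmer makes
 * that assumption instead of the compiler. */
void f1_hoisted(double c[DIM][DIM],
                const double a[DIM][DIM],
                const double b[DIM][DIM])
{
    for (int i = 0; i < DIM; i++)
        for (int j = 0; j < DIM; j++) {
            const double aij = a[i][j];   /* read once per (i, j) */
            for (int k = 0; k < DIM; k++)
                c[i][k] = c[i][k] + aij * b[j][k];
        }
}

/* Hypothetical check helper: runs both variants on distinct, non-overlapping
 * arrays and returns 1 if every element of the two results is equal. */
int variants_agree(void)
{
    double a[DIM][DIM], b[DIM][DIM], c1[DIM][DIM], c2[DIM][DIM];

    for (int i = 0; i < DIM; i++)
        for (int j = 0; j < DIM; j++) {
            a[i][j] = (double)(i + j);    /* small integers: exact in double */
            b[i][j] = (double)(i - j);
            c1[i][j] = 0.0;
            c2[i][j] = 0.0;
        }

    /* The casts add const explicitly; C (before C23) does not convert
     * double (*)[DIM] to const double (*)[DIM] implicitly. */
    f1_restrict(c1, (const double (*)[DIM])a, (const double (*)[DIM])b);
    f1_hoisted(c2, (const double (*)[DIM])a, (const double (*)[DIM])b);

    for (int i = 0; i < DIM; i++)
        for (int j = 0; j < DIM; j++)
            if (c1[i][j] != c2[i][j])
                return 0;
    return 1;
}
```

<div>Calling variants_agree() from a small main is one way to confirm that the hand-hoisted version computes the same result as the restrict version for non-overlapping inputs.</div>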