<div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial"><span style="white-space: pre-wrap; font-size: 14px; line-height: 1.7;">At 2013-09-02 17:05:52,"Tobias Grosser" &lt;tobias@grosser.es&gt; wrote:</span><br><pre>>On 09/01/2013 08:02 PM, Star Tan wrote:

>> Hi all,
>>
>> It seems that Polly's code generation can lead to high compile-time overhead, especially for PolyBench applications such as 2mm, 3mm, gemm, syrk, etc. Some basic evaluation and analysis of Polly's code generation can be found at http://llvm.org/bugs/show_bug.cgi?id=16898.
>>
>> Currently, we can choose to run -polly-code-generator=cloog or -polly-code-generator=isl for code generation, but both of them lead to almost double the compile-time overhead for the 2mm benchmark. Unfortunately, neither Cloog nor ISL improves the execution time compared with -polly-code-generator=none.  I think if we could identify in advance that code generation will not improve execution time, then we could skip the expensive Cloog and ISL code generators.
>>
>> Can anyone provide some suggestions or hints on this problem?

>
>OK. I think in this case the problem is actually to figure out why Polly 
>does not give a speedup in terms of execution time, because we have seen 
>large speedups for 2mm and 3mm.
>
>Here is what I see:
>
>2mm$ polly-clang 2mm.c -O3 -I ../../../utilities/ -DPOLYBENCH_TIME 
>-DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-ignore-aliasing
>2mm$ time ./a.out
>18.217128
>
>real        0m18.256s
>user        0m18.128s
>sys 0m0.064s
>2mm$ polly-clang 2mm.c -O3 -I ../../../utilities/ -DPOLYBENCH_TIME 
>-DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-ignore-aliasing -mllvm -polly
>2mm$ time ./a.out
>4.986877
>
>real        0m5.036s
>user        0m4.940s
>sys 0m0.068s
>
>So the reason this does not work is that the polybench kernels in the 
>test suite do not annotate the functions called with the 'restrict' 
>keyword (that's why we need the ignore-aliasing) as well as that the 
>size of the arrays is given as scalars but the corresponding loop bounds 
>are not. It would be great to fix up those issues.
>
>The first issue can be fixed by adding run-time alias analysis checks.
>Adding those checks now became very easy with the new isl code 
>generation. The basic idea is that we ask isl to generate the necessary 
>run-time check and add it into the condition created by 
>executeScopConditionally(). In case you are interested in looking into 
>this, this would be a great help!

></pre><pre>Thanks for your helpful reply. Yes, if we add <span style="font-size: 14px; line-height: 1.7;"> -polly-ignore-aliasing, which skips the alias checking in ScopDetection, then we can detect the kernel loop as a valid scop and gain a significant performance improvement.  I tried to follow your hints and look into </span><span style="font-size: 14px; line-height: 1.7;">executeScopConditionally() in </span>CodeGen/Utils.cpp, but I do not fully understand how to affect the ScopDetection pass by modifying executeScopConditionally(). Do you mean I can add ISL checking information into the Context in <span style="font-size: 14px; line-height: 1.7;">executeScopConditionally()? Could you give some more concrete ideas? Are there any code examples of ISL alias analysis?</span></pre><pre>Thanks,</pre><pre>Star Tan</pre><pre><span style="font-size: 14px; line-height: 1.7;"><br></span></pre></div>
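<pre>For readers following the thread: the general idea behind a run-time alias check can be sketched in plain C, independent of Polly. This is only an illustration of loop versioning guarded by an overlap test. It is not Polly or isl code, and every name in it (ranges_disjoint, scale_noalias, scale_may_alias, scale) is hypothetical. It also assumes that a pointer fits in an unsigned long, which holds on common LP64 platforms but not everywhere.

```c
/* Hypothetical helper: returns 1 when the byte ranges [a, a+n) and
 * [b, b+n) of doubles do not overlap, 0 otherwise.
 * Assumption: a pointer value fits in an unsigned long (LP64). */
static int ranges_disjoint(const double *a, const double *b, unsigned long n)
{
    unsigned long lo_a = (unsigned long)a;
    unsigned long lo_b = (unsigned long)b;
    unsigned long bytes = n * sizeof(double);
    return lo_b >= lo_a + bytes || lo_a >= lo_b + bytes;
}

/* Optimized version: 'restrict' promises that dst and src never alias,
 * so the compiler is free to reorder or vectorize the accesses. */
static void scale_noalias(double *restrict dst, const double *restrict src,
                          unsigned long n)
{
    for (unsigned long i = 0; i != n; i++)
        dst[i] = 2.0 * src[i];
}

/* Fallback version: the original loop, correct even when dst and src
 * overlap, because it makes no aliasing promise. */
static void scale_may_alias(double *dst, const double *src, unsigned long n)
{
    for (unsigned long i = 0; i != n; i++)
        dst[i] = 2.0 * src[i];
}

/* Versioned entry point: the run-time check selects which copy runs.
 * Returns 1 if the optimized copy was used, 0 if the fallback was. */
int scale(double *dst, const double *src, unsigned long n)
{
    if (ranges_disjoint(dst, src, n)) {
        scale_noalias(dst, src, n);
        return 1;
    }
    scale_may_alias(dst, src, n);
    return 0;
}
```

In this sketch the hand-written overlap test plays the role of the condition that guards the optimized code, with the original loop kept as the fallback. How such a condition would actually be generated by isl and wired into executeScopConditionally() is not shown here.</pre>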