[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

Star Tan tanmx_star at yeah.net
Fri Jul 26 08:30:20 PDT 2013


At 2013-07-26 14:14:51,"Tobias Grosser" <tobias at grosser.es> wrote:

>On 07/25/2013 09:01 PM, Star Tan wrote:
>> Hi Sebastian,
>>
>>
>> Recently, I found the "Polly - Calculate dependences" pass would lead to significant compile-time overhead when compiling some loop-intensive source code. Tobias told me you found similar problem as follows:
>> http://llvm.org/bugs/show_bug.cgi?id=14240
>>
>>
>> My evaluation shows that "Polly - Calculate dependences" pass consumes 96.4% of total compile-time overhead when compiling the program you proposed. It seems that the expensive compile-time overhead comes from those loop operations in ISL library. Some preliminary results can be seen on http://llvm.org/bugs/show_bug.cgi?id=14240.
>>
>>
>> Sebastian, I wonder whether you have further results or suggestions besides those words posted on the Bugzilla page? Any information or suggestion would be appreciated.
>
>Hi Star Tan,
>
>thanks for looking into this. Just to sum up, the test case that
>shows this slowness is the following:
>
>void bar();
>void foo(short *input) {
>   short x0, x1;
>   char i, ctr;
>
>   for(i = 0; i < 8; i++) {
>
>     // SCoP begin
>     for (ctr = 0; ctr < 8; ctr++) {
>       x0 = input[i*64 + ctr*8 + 0] ;
>       x1 = input[i*64 + ctr*8 + 1] ;
>
>       input[i*64 + ctr*8 + 0] = x0 - x1;
>       input[i*64 + ctr*8 + 1] = x0 + x1;
>       input[i*64 + ctr*8 + 2] = x0 * x1;
>     }
>     // SCoP end
>
>     bar(); // Unknown function call stops further expansion of SCoP
>   }
>}
>
>Which is translated to the following scop:
>
>     Context:
>     [p_0, p_1, p_2] -> {  : p_0 >= -2147483648 and p_0 <= 2147483647 
>and p_1 >= -2147483648 and p_1 <= 2147483647 and p_2 >= -2147483648 and 
>p_2 <= 2147483647 }
>     p_0: {0,+,128}<%for.cond2.preheader>
>     p_1: {2,+,128}<%for.cond2.preheader>
>     p_2: {4,+,128}<%for.cond2.preheader>
>[...]
>
>The interesting observation is, that Polly introduces three parameters 
>(p_0, p_1, p_2) for this SCoP, even though in the C source code only the 
>variable 'i' is SCoP invariant. However, due to the way 
>SCEVExpr(essions) in LLVM are nested, Polly sees three scop-invariant 
>SCEVExpr(essions) which are all translated into independent parameters.
>However, as we can see, the only difference between the three
>parameters is a different constant in the base of the AddRecExpr. If we 
>would just introduce p_0 (the parameter where the scev-base is zero) and 
>express any use of p_1 as p_0 + 2 and p_2 as p_0 + 4, isl could solve 
>this problem very quickly.
>There are other ways to improve performance which include further tuning 
>isl or reducing the precision of the analysis, but in this case
>I don't think we need to look into them. The above fix should give us 
>good performance and additionally also increases the precision of the 
>result (as isl now understands the relation between the different 
>parameters).
>
>To fix this, you probably would like to look into the SCEVAffinator 
>class and the way parameters are handled.
>
Thank you so much. Maybe I was in the wrong direction at first -:)

Yes, you are right. I should first investigate why Polly introduces three parameters rather than just one parameter.
Best wishes,
Star Tan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130726/ec4c2b2d/attachment.html>


More information about the llvm-dev mailing list