[polly] scop detection: detect regions bottom up

Thu Jun 13 01:36:57 PDT 2013

On 06/12/2013 02:40 PM, Sebastian Pop wrote:
> Tobias Grosser wrote:
>> Also, in terms of speeding up scop detection. My guess is that the
>> largest speedup would come from detecting scops bottom up, starting
>> from small regions to bigger regions. We could then skip analyzing
>> regions as soon as we found a construct in a smaller region that
>> would
>> also invalidate all regions containing the smaller region.
>
> Here is a patch that makes scop detection to work bottom up.  On my huge C++
> benchmark, the attached patch gets the overall sequential time spent in Polly
> from 2151 seconds to 1573 seconds.

Nice. How much of the compile time is that? Could you just run the scop 
detection,
but disable scheduler and code generation. I would be interested what 
percentage of compile time scop detection use in a release -O3 run.

> Also note that there still are redundant computations in the case of nested
> valid regions: we would start calling isValidRegion on the innermost region
> containing a loop, then supposing that the parent region is composed of only
> valid regions, we would call again isValidRegion and that would walk again over
> all the subregions, validating the CFG, loops, and every instruction again.

Yes, I believe we can eliminate some of this, but I am unsure if we can 
eliminate all. The main question is: Can there be a scop where an inner 
region is not a scop, but a larger region containing this inner region 
is again a scop? Meaning: do all checks that return invalid in the inner 
region also return invalid if they are called with a larger region?

For function calls with unknown side effects or non-structured control 
this is probably true. I am a little unsure about our affine loop 
bounds/conditions checks, but I could not yet find a counter example. 
One of the checks where this may fail, is the recent check that we 
committed, where we check that if the header of a loop is contained in 
the scop, then the entire loop needs to be part of the scop.

As this patch may invalidate assumptions we took previously, we should 
throughly test it. Meaning, we should get the following info:

1) Number of new compile errors on the llvm test-suite
2) Performance impact on the llvm test-suite
3) Test cases that check some of the corner case?

> With the attached patch there are 4 fails:
>
> Failing Tests (4):
>      Polly :: polybench/linear-algebra/solvers/lu/lu_with_param.ll
>      Polly :: polybench/linear-algebra/solvers/lu/lu_without_param.ll
>      Polly :: polybench/linear-algebra/solvers/ludcmp/ludcmp_without_param.ll
>      Polly :: polybench/stencils/jacobi-2d-imper/jacobi-2d-imper_with_param.ll
>
> These fails are due to the fact that we do not detect a maximal region anymore:
> first, scop detection splits region exit blocks when the exit block has multiple
> exit edges, and then these new basic blocks that we inserted do not belong to
> any region, and so we end up not discovering a maximal region anymore.

Could you point me to the line, where we split region exit blocks?

> I think we can add the newly created blocks to the region info.

If we do not correctly update the RegionInfo, we should definitely fix it.

> Another solution, and probably more difficult, is to avoid invalidating the
> region info by trying to not split the blocks ending a region.  I'm not sure
> about the impact of such a change on the codegen region versioning.

I would like to have the scop detection not modify the LLVM-IR at all. I 
know we modify the RegionInfo analysis and insert additional Regions, 
but I am a little surprised we actually modify the LLVM-IR itself.

The patch looks reasonable at the moment. Though some of the comments
above still need some discussion.

Cheers,
Tobi