[LLVMdev] [Polly] Analysis of the expensive compile-time overhead of Polly Dependence pass

Sun Jul 28 16:42:25 PDT 2013

On 07/28/2013 06:52 AM, Star Tan wrote:
> Hi Tobias,
>
> I tried to investigated the problem related to ScopInfo, but I need your
> help on handling some problems about ISL and SCEV.

I copied the list as the discussion may be helpful for others.

@Sven, no need to read all. Just search for your name.

[..]

>>The interesting observation is, that Polly introduces three parameters
>>(p_0, p_1, p_2) for this SCoP, even though in the C source code only the
>>variable 'i' is SCoP invariant. However, due to the way
>>SCEVExpr(essions) in LLVM are nested, Polly sees three scop-invariant
>>SCEVExpr(essions) which are all translated into independent parameters.
>>However, as we can see, the only difference between the three
>>parameters is a different constant in the base of the AddRecExpr. If we
>>would just introduce p_0 (the parameter where the scev-base is zero) and
>>express any use of p_1 as p_0 + 2 and p_2 as p_0 + 4, isl could solve
>>this problem very quickly.
>
> Currently, Polly introduces three parameters because it treats (base, step) as a whole for each parameter. You are right, it seems we could usea single parameter to represent all these accesses.
>
> I have attached a patch file for this purpose. This patch file is just a hack file to see whether three parameters can be merged into one parameter and whether the compile-time could be reduce after that. The key of this patch is the added source code inScop::addParams, which only add parameters if the new parameter has different "step" value. In other ways, we skip the "base" value when compare parameters.

> With this patch file, the Scop becomes:
>
> Context:
>      p0: {0,+,128}<%for.cond2.preheader>
>      Statements {
>      	Stmt_for_body6
>              Domain :=
>                  { Stmt_for_body6[i0] : i0 >= 0 and i0 <= 7 };
>              Scattering :=
>                  { Stmt_for_body6[i0] -> scattering[0, i0, 0] };
>              ReadAccess :=
>                  [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 = p_0 + 16i0 };
>              ReadAccess :=
>                  [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 = 2 + p_0 + 16i0 };
>              MustWriteAccess :=
>                  [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 = p_0 + 16i0 };
>              MustWriteAccess :=
>                  [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 = 2 + p_0 + 16i0 };
>              MustWriteAccess :=
>                  [p_0] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : 2o0 = 4 + p_0 + 16i0 };
>      }
>
> Unfortunately, the compile-time is still very high!

This _looks_ already a lot better. Very nice!

> I have investigated a little about it. Maybe this is becauseisl_union_map_add_mapin Polly-dependencewould still introduces new parameters based on the resulted produced by ScopInfo.
>
> Without this patch file, the Read, Write, MayWrite (input to ISL library)calculated by collectInfo are:
>
> Read:[p_0, p_1, p_2] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : (2o0 = p_0 + 16i0 and i0 >= 0 and i0 <= 7) or (2o0 = p_1 + 16i0 and i0 >= 0 and i0 <= 7) }
> Write:[p_0, p_1, p_2] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : (2o0 = p_0 + 16i0 and i0 >= 0 and i0 <= 7) or (2o0 = p_1 + 16i0 and i0 >= 0 and i0 <= 7) or (2o0 = p_2 + 16i0 and i0 >= 0 and i0 <= 7) }
> Maywrite:[p_0, p_1, p_2] -> {  }
> Schedule:[p_0, p_1, p_2] -> { Stmt_for_body6[i0] -> scattering[0, i0, 0] }
>
> But with our patch file, they become:
>
> Read: [p_0, p_0'] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : (2o0 = p_0' + 16i0 and i0 >= 0 and i0 <= 7) or (2o0 = 2 + p_0 + 16i0 and i0 >= 0 and i0 <= 7) }
> Write: [p_0, p_0', p_0''] -> { Stmt_for_body6[i0] -> MemRef_input[o0] : (2o0 = p_0'' + 16i0 and i0 >= 0 and i0 <= 7) or (2o0 = 2 + p_0' + 16i0 and i0 >= 0 and i0 <= 7) or (2o0 = 4 + p_0 + 16i0 and i0 >= 0 and i0 <= 7) }
> MayWrite: {  }
> Schedule: { Stmt_for_body6[i0] -> scattering[0, i0, 0] }
> It seems that theisl_union_map_add_map function called from  Dependences::collectInfoautomatically introduces extra parameters, but I have no idea why this happens.
>
> Could you give me some further suggestion on this problem?

Oh, you are on the right way. Even though we only see parameters 'p_0', 
isl seems to believe those values 'p_0' are not identical. We see this 
as isl renames them such that the difference becomes obvious. 
Unfortunately, as long as isl does not believe those parameters are 
identical, it will still be stuck in long calculations.

To understand the above behaviour we need to understand how isl 
determines if two parameter dimensions refer to the same parameter. What 
you would probably expect is that the union of the two sets [p] -> {[i] 
= p} and [p] -> {[i] = p} is a single set [p] -> {[i] = p} as the two 
parameters have the very same name. However, this is not always true. 
Even though parameters share the same name, they may in fact be 
different isl_id objects. isl_id_alloc() takes a parameter name and a 
pointer and only if both of them are identical, the very same isl_id 
object is returned.

Sven: In terms of making the behaviour of isl easier to understand,
it may make sense to fail/assert in case operands have parameters that
are named identical, but that refer to different pointer values.

So the question is why do you obtain such different isl_ids and what
can be done to fix this. I don't have a solution ready, but I have
some questions in the patch that may help you when looking into this.

> diff --git a/include/polly/ScopInfo.h b/include/polly/ScopInfo.h
> index 8c56582..bab7763 100755
> --- a/include/polly/ScopInfo.h
> +++ b/include/polly/ScopInfo.h
> @@ -445,6 +445,7 @@ class Scop {
>     /// The isl_ids that are used to represent the parameters
>     typedef std::map<const SCEV *, int> ParamIdType;
>     ParamIdType ParameterIds;
> +  ParamIdType ParameterIdsSpace;

Why are you introducing another member variable here? Why don't you keep
using ParameterIds? What is the this variable supposed to track?

> diff --git a/lib/Analysis/Dependences.cpp b/lib/Analysis/Dependences.cpp
> index 5a185d0..9f918f3 100644
> --- a/lib/Analysis/Dependences.cpp
> +++ b/lib/Analysis/Dependences.cpp
> @@ -30,6 +30,9 @@
>   #include <isl/map.h>
>   #include <isl/set.h>
>
> +#define DEBUG_TYPE "polly-dependence"
> +#include "llvm/Support/Debug.h"
> +
>   using namespace polly;
>   using namespace llvm;
>
> @@ -88,8 +91,15 @@ void Dependences::collectInfo(Scop &S, isl_union_map **Read,
>   void Dependences::calculateDependences(Scop &S) {
>     isl_union_map *Read, *Write, *MayWrite, *Schedule;
>
> +  DEBUG(dbgs() << "Scop: " << S << "\n");
> +
>     collectInfo(S, &Read, &Write, &MayWrite, &Schedule);
>
> +  DEBUG(dbgs() << "Read: " << Read << "\n";
> +        dbgs() << "Write: " << Write << "\n";
> +        dbgs() << "MayWrite: " << MayWrite << "\n";
> +        dbgs() << "Schedule: " << Schedule << "\n");
> +
>     if (OptAnalysisType == VALUE_BASED_ANALYSIS) {
>       isl_union_map_compute_flow(
>           isl_union_map_copy(Read), isl_union_map_copy(Write),
> @@ -131,6 +141,8 @@ void Dependences::calculateDependences(Scop &S) {
>     RAW = isl_union_map_coalesce(RAW);
>     WAW = isl_union_map_coalesce(WAW);
>     WAR = isl_union_map_coalesce(WAR);
> +
> +  DEBUG(printScop(dbgs()));
>   }

This patch looks good/helpful by itself. I propose to submit it 
separately for commit.

>   bool Dependences::runOnScop(Scop &S) {
> diff --git a/lib/Analysis/ScopInfo.cpp b/lib/Analysis/ScopInfo.cpp
> index aa72f3e..6fc838e 100644
> --- a/lib/Analysis/ScopInfo.cpp
> +++ b/lib/Analysis/ScopInfo.cpp
> @@ -86,6 +86,14 @@ public:
>         isl_aff *Affine =
>             isl_aff_zero_on_domain(isl_local_space_from_space(Space));
>         Affine = isl_aff_add_coefficient_si(Affine, isl_dim_param, 0, 1);
> +      if (Scev->getSCEVType() == scAddRecExpr) {
> +        const SCEVAddRecExpr *AR = cast<SCEVAddRecExpr>(Scev);

The canonical pattern for this is:

if (SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Scev)) {

> +        const SCEVConstant *c = cast<SCEVConstant>(AR->getOperand(0));

This is obviously a hack. The base is not always a constant.

You can probably just call use something like,

isl_pw_aff *BaseValue = visit(AR->getOperand(0))
Affine = isl_pw_aff_sum(Affine, BaseValue);

> +        isl_int t;
> +        isl_int_init(t);
> +        isl_int_set_si(t, c->getValue()->getSExtValue());

We now use isl_val instead of isl_int.

>         return isl_pw_aff_alloc(Domain, Affine);
>       }
> @@ -717,6 +725,34 @@ void Scop::addParams(std::vector<const SCEV *> NewParameters) {
>       if (ParameterIds.find(Parameter) != ParameterIds.end())
>         continue;
>
> +    if (Parameter->getSCEVType() == scAddRecExpr) {
> +      int index, maxIndex = Parameters.size();
> +      const SCEVAddRecExpr *AR = cast<SCEVAddRecExpr>(Parameter);
> +
> +      for (index = 0; index < Parameters.size(); ++index) {
> +        const SCEV *pOld = Parameters[index];
> +
> +        if (pOld->getSCEVType() != scAddRecExpr)
> +          continue;
> +
> +        const SCEVAddRecExpr *OAR = cast<SCEVAddRecExpr>(pOld);
> +        if (AR->getNumOperands() != OAR->getNumOperands())
> +          continue;
> +
> +        unsigned i, e;
> +        for (i = 1, e = AR->getNumOperands(); i != e; ++i){
> +          if (AR->getOperand(i) != OAR->getOperand(i))
> +            break;
> +        }
> +
> +        if (i == e) {
> +          // Parameters.push_back(Parameter);
> +          ParameterIds[Parameter] = index;
> +          return;
> +        }
> +      }
> +    }

I think this is the right idea, but probably the wrong place to put it.

I would put this into SCEVValidator::visitAddRecExpr. This function 
always adds the AddRecExpr itself as a parameter, whenever it is found 
to be parametric. However, what we should do is to create a new ScevExpr
that starts at zero and is otherwise identical. We then add this as a 
parameter. When doing this, we now also need to keep all the parameters
that have been found previously in the base expression.

>       int dimension = Parameters.size();
>
>       Parameters.push_back(Parameter);
> @@ -777,9 +813,9 @@ void Scop::addParameterBounds() {
>
>   void Scop::realignParams() {
>     // Add all parameters into a common model.
> -  isl_space *Space = isl_space_params_alloc(IslCtx, ParameterIds.size());
> +  isl_space *Space = isl_space_params_alloc(IslCtx, ParameterIdsSpace.size());
>
> -  for (ParamIdType::iterator PI = ParameterIds.begin(), PE = ParameterIds.end();
> +  for (ParamIdType::iterator PI = ParameterIdsSpace.begin(), PE = ParameterIdsSpace.end();

I do not see why those changes are necessary.

>          PI != PE; ++PI) {
>       const SCEV *Parameter = PI->first;
>       isl_id *id = getIdForParam(Parameter);

> @@ -787,7 +823,7 @@ void Scop::realignParams() {
>     }
>
>     // Align the parameters of all data structures to the model.
> -  Context = isl_set_align_params(Context, Space);
> +//  Context = isl_set_align_params(Context, Space);

Also no idea why this one is necessary.

Hope this helps,
Tobias