split delinearization pass in 3 steps

Tue May 6 15:48:09 PDT 2014

Tobias Grosser wrote:
> Hi Sebastian,
>
> it is good to see the result of our discussions on the mailing list.
> I already tested it with Polly and it works great. After spending a
> couple of hours integrating Polly into Julia (julialang.org), we can
> now nicely delinearize their linear algebra kernels. I will test it
> a little more with ublas and polybench, but I am rather confident
> about this change.
>
> Regarding the patch itself:
>
> >diff --git a/include/llvm/Analysis/ScalarEvolutionExpressions.h b/include/llvm/Analysis/ScalarEvolutionExpressions.h
> >index ed8c133..2c429e8 100644
> >--- a/include/llvm/Analysis/ScalarEvolutionExpressions.h
> >+++ b/include/llvm/Analysis/ScalarEvolutionExpressions.h
> >@@ -352,8 +352,27 @@ namespace llvm {
> >        return S->getSCEVType() == scAddRecExpr;
> >      }
> >
> >-    /// Splits the SCEV into two vectors of SCEVs representing the subscripts
> >-    /// and sizes of an array access. Returns the remainder of the
> >+    /// First step of delinearization: find parametric terms in this
> >+    /// SCEVAddRecExpr.
>
> 1) Is it really necessary for SCEV* to be a SCEVAddRecExpr?

Yes, I think I don't want to support generic delinearizations for now: how would
you delinearize a generic SCEV, like SCEVMulExpr, SCEVAddExpr, etc.?

>
> 2) What about using the doxygen @param syntax?

Not a single function in that file uses @param, so I won't start.  I also find
doxygen markers hard to read.

>
> @param Terms A vector in which the parametric terms will be returned.
>
> Also, I am a little unsure about the 'First step of delinearization'
> comment. This is a little bit like a "come-from" comment. Instead of
> describing where this function is used, I would prefer to read what
> this function actually does.
>
> Something like
>
> @brief Collect parametric terms occurring in step expressions
>
> We scan this SCEV for parametric terms that occur in the step
> expressions of SCEVAddRecExpr[s]. A parametric term is are
> termsconsisting of SCEVUnknowns/zext/[...]? In case the step
> expression consists of
>
> >+    void collectParametricTerms(ScalarEvolution &SE,
> >+                                SmallVectorImpl<const SCEV *> &Terms) const;
> >+
> >+    /// Second step of delinearization: compute the array dimensions Sizes from
> >+    /// the set of Terms extracted from the memory access function of this
> >+    /// SCEVAddRecExpr.
> >+    void findArrayDimensions(ScalarEvolution &SE,
> >+                             SmallVectorImpl<const SCEV *> &Terms,
> >+                             SmallVectorImpl<const SCEV *> &Sizes) const;
> >+
> >+    /// Third step of delinearization: return in Subscripts the access functions
> >+    /// for each dimension in Sizes.
> >+    const SCEV *
> >+    computeAccessFunctions(ScalarEvolution &SE,
> >+                           SmallVectorImpl<const SCEV *> &Subscripts,
> >+                           SmallVectorImpl<const SCEV *> &Sizes) const;
> >+
> >+    /// Split this SCEVAddRecExpr into two vectors of SCEVs representing the
> >+    /// subscripts and sizes of an array access. Returns the remainder of the
> >      /// delinearization that is the offset start of the array.
> >      const SCEV *delinearize(ScalarEvolution &SE,
> >                              SmallVectorImpl<const SCEV *> &Subscripts,
>
> This comment seems a little short. Everything a user of this
> function needs to know about this function should be written here.
> It seems larger parts of the comment of the delinearize()
> implementation could be moved here. Also, we should clearly state
> that this delinearization is just a guess based on certain
> assumptions that need to be run-time checked. To make this usable,
> we need to clearly state the assumptions that need to be checked.
>
> >diff --git a/lib/Analysis/ScalarEvolution.cpp b/lib/Analysis/ScalarEvolution.cpp
> >index d28aa58..584dc58 100644
> >--- a/lib/Analysis/ScalarEvolution.cpp
> >+++ b/lib/Analysis/ScalarEvolution.cpp
> >@@ -6817,6 +6817,138 @@ const SCEV *SCEVAddRecExpr::getNumIterationsInRange(ConstantRange Range,
> >    return SE.getCouldNotCompute();
> >  }
> >
> >+/// The delinearization is a 3 step process: the first two steps compute the
> >+/// sizes of each subscript and the third step computes the access functions for
> >+/// the delinearized array:
> >
> >+///
> >+/// 1. Collect parametric terms
> >+/// 2. Sort terms such that the smaller terms divide the larger terms
> >+/// 3. Separate the SCEV in Quotient + Remainder using the largest term as divisor
>
> 80 columns.
>
> >+///   -> Quotient is the subscript of the outer dimension
> >+///   -> Remainder is the new SCEV
> >+///   -> Separate again with next largest term
> >+///   -> Loop until reaching 1
> >+///
> >+/// A unification process can be done on steps 1 and 2 when there exist several
> >+/// access functions to the same memory object: that guarantees the array shape
> >+/// will be the same across memory accesses. We could derive the result of steps
> >+/// 1 and 2 from the shape given in metadata.
>
> I would mark this as TODO or FIXME.
>
> >+/// The previous SCEV->delinearize implementation performed steps 1,2,3 at the
> >+/// same time. The main deficiency of the previous implementation is that it only
> >+/// allows to look at the term in the step expression of the outermost AddRec
> >+/// expression. This term may not have the information necessary to derive the
> >+/// right delinearization:http://llvm.org/PR19336  is such a case.
>
> I believe, the problems of the old implementation should only be
> mentioned in the commit message.
>
> >+/// Example:
> >+///
> >+/// A[][n][m]
> >+///
> >+/// for i
> >+///   for j
> >+///     for k
> >+///       A[j+k][2i][5i] =
> >+///
> >+/// The initial SCEV:
> >+///
> >+/// A[{{{0,+,2*m+5}_i, +, n*m}_j, +, n*m}_k]
> >+///
> >+/// 1. Find the different terms found in the step functions:
>                                    ^^^^^  This is redundant
>
> >+/// -> [2*m, 5 * 1, n*m, n*m]
> >+///
> >+/// 2. Sort them
> >+/// -> [n*m, m, 1]
> >+/// At this point we already guess the array size: A[][n][m]
> >+///
> >+/// 3.
>
>            ^^^ Title missing here.
>
> This would be even more understandable if the if the names of the
> individual points would correspond to the one you used at the
> beginning.
>
> >+/// a. Divide SCEV: {{{0,+,2*m+5}_i, +, n*m}_j, +, n*m}_k  by largest term 'n*m'
> >+/// Quotient: {{{0,+,0}_i, +, 1}_j, +, 1}_k
> >+/// Remainder: {{{0,+,2*m+5}_i, +, 0}_j, +, 0}_k
> >+/// The quotient is the subscript of the outermost array dimension.
> >+///
> >+/// b. Divide Remainder: {{{0,+,2*m+5}_i, +, 0}_j, +, 0}_k by next largest term 'm'
>
> 80 columns
>
> >+/// Quotient: {{{0,+,2}_i, +, 0}_j, +, 0}_k
> >+/// Remainder: {{{0,+,5}_i, +, 0}_j, +, 0}_k
> >+/// The quotient is the subscript of the next array dimension.
> >+///
> >+/// As the only term remaining is '1', the remainder is the subscript of the
> >+/// innermost array expression.
> >+///
> >+/// Overall, we have:
> >+///
> >+/// n * m: {{{0,+,0}_i, +, 1}_j, +, 1}_k
> >+/// +
> >+/// m: {{{0,+,2}_i, +, 0}_j, +, 0}_k
> >+/// +
> >+/// 1 * {{{0,+,5}_i, +, 0}_j, +, 0}_k
> >+///
> >+/// This leads to the access function: A[j+k][2i][5i].
>
> Also, this whole comment is hanging a little in the middle of nothing.
> To me, it would fit best as an implementation not on top of the
> delinearize() function after the original user interface description
> has been pulled in the header.
>
> >+namespace {
> >+// Collect all steps of SCEV expressions.
> >+struct SCEVCollectStrides {
> >+  ScalarEvolution &SE;
> >+  SmallVectorImpl<const SCEV *> &Strides;
> >+
> >+  SCEVCollectStrides(ScalarEvolution &SE, SmallVectorImpl<const SCEV *> &S)
> >+      : SE(SE), Strides(S) {}
> >+
> >+  bool follow(const SCEV *S) {
> >+    if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(S))
> >+      Strides.push_back(AR->getStepRecurrence(SE));
> >+    return true;
>
> Does this mean we would also recurse into the step expression itself,
> instead of only the base expression? This seems incorrect for the
> algorithm we have in mind.

We have to recurse on the AddRecExpr otherwise we won't get any multivariate
SCEV's like {{{0,+,5}_i, +, 0}_j, +, 0}_k

As we only collect AddRec->getStepRecurrence() it does not matter whether we
recurse or not on the step of the AddRec.

>
>
> >+  }
> >+  bool isDone() const { return false; }
> >+};
> >+
> >+// Collect all SCEVUnknown and SCEVMulExpr expressions.
> >+struct SCEVCollectTerms {
> >+  SmallVectorImpl<const SCEV *> &Terms;
> >+
> >+  SCEVCollectTerms(SmallVectorImpl<const SCEV *> &T)
> >+      : Terms(T) {}
> >+
> >+  bool follow(const SCEV *S) {
> >+    if (isa<SCEVUnknown>(S) || isa<SCEVConstant>(S) || isa<SCEVMulExpr>(S)) {
> >+      Terms.push_back(S);
> >+
> >+      // Stop recursion: once we collected a term, do not walk its operands.
> >+      return false;
> >+    }
> >+
> >+    // Keep looking.
> >+    return true;
> >+  }
> >+  bool isDone() const { return false; }
> >+};
> >+}
> >+
> >+/// First step of delinearization: find parametric terms in this SCEVAddRecExpr.
> >+void SCEVAddRecExpr::collectParametricTerms(
> >+    ScalarEvolution &SE, SmallVectorImpl<const SCEV *> &Terms) const {
> >+  SmallVector<const SCEV *, 4> Strides;
> >+  SCEVCollectStrides StrideCollector(SE, Strides);
> >+  visitAll(this, StrideCollector);
> >+
> >+#ifndef NDEBUG
> >+  DEBUG(dbgs() << "Strides:\n");
> >+  for (unsigned I = 0; I < Strides.size(); ++I)
> >+    DEBUG(dbgs() << *Strides[I] << "\n");
>
> What about:
>   DEBUG({
>
>     dbgs() << "Strides:\n";
>     for (auto Stride: Strides)
>       dbgs() << *Stride << "\n";
>   });

Done all over the place.

>
> This also removes the need for the NDEBUG check.
>
> >+#endif
> >+
> >+  for (unsigned int i = 0; i < Strides.size(); i++) {
> >+    SCEVCollectTerms TermCollector(Terms);
> >+    visitAll(Strides[i], TermCollector);
> >+  }
>
> C++11 foreach loop?

done all over.

>
> >+
> >+#ifndef NDEBUG
> >+  DEBUG(dbgs() << "Terms:\n");
> >+  for (unsigned I = 0; I < Terms.size(); ++I)
> >+    DEBUG(dbgs() << *Terms[I] << "\n");
> >+#endif
>
> C++11 foreach loop? DEBUG({})?
>
> This also removes the need for the NDEBUG check.
>
> >  static const APInt srem(const SCEVConstant *C1, const SCEVConstant *C2) {
> >    APInt A = C1->getValue()->getValue();
> >    APInt B = C2->getValue()->getValue();
> >@@ -6846,358 +6978,412 @@ static const APInt sdiv(const SCEVConstant *C1, const SCEVConstant *C2) {
> >  }
> >
> >  namespace {
> >-struct SCEVGCD : public SCEVVisitor<SCEVGCD, const SCEV *> {
> >+
> >+struct SCEVDivision : public SCEVVisitor<SCEVDivision, void> {
> >  public:
> >-  // Pattern match Step into Start. When Step is a multiply expression, find
> >-  // the largest subexpression of Step that appears in Start. When Start is an
> >-  // add expression, try to match Step in the subexpressions of Start, non
> >-  // matching subexpressions are returned under Remainder.
> >-  static const SCEV *findGCD(ScalarEvolution &SE, const SCEV *Start,
> >-                             const SCEV *Step, const SCEV **Remainder) {
> >-    assert(Remainder && "Remainder should not be NULL");
> >-    SCEVGCD R(SE, Step, SE.getConstant(Step->getType(), 0));
> >-    const SCEV *Res = R.visit(Start);
> >-    *Remainder = R.Remainder;
> >-    return Res;
> >-  }
> >-
> >-  SCEVGCD(ScalarEvolution &S, const SCEV *G, const SCEV *R)
> >-      : SE(S), GCD(G), Remainder(R) {
> >-    Zero = SE.getConstant(GCD->getType(), 0);
> >-    One = SE.getConstant(GCD->getType(), 1);
> >-  }
> >-
> >-  const SCEV *visitConstant(const SCEVConstant *Constant) {
> >-    if (GCD == Constant || Constant == Zero)
> >-      return GCD;
> >-
> >-    if (const SCEVConstant *CGCD = dyn_cast<SCEVConstant>(GCD)) {
> >-      const SCEV *Res = SE.getConstant(gcd(Constant, CGCD));
> >-      if (Res != One)
> >-        return Res;
> >-
> >-      Remainder = SE.getConstant(srem(Constant, CGCD));
> >-      Constant = cast<SCEVConstant>(SE.getMinusSCEV(Constant, Remainder));
> >-      Res = SE.getConstant(gcd(Constant, CGCD));
> >-      return Res;
> >+  // Computes the Quotient and Remainder of the division of Numerator by
> >+  // Denominator.
> >+  static void divide(ScalarEvolution &SE, const SCEV *Numerator,
> >+                     const SCEV *Denominator, const SCEV **Quotient,
> >+                     const SCEV **Remainder) {
>
> You could think of returning a std::tuple, which can then be extracted
> by std::tie. This may make it more clear that we return Quotient and
> Remainder. (I am not sure if this is really nicer. I mentioned it to
> at least make sure you are aware of this option).
>
> >+    assert(Numerator && Denominator && *Quotient && *Remainder &&
> >+           "Uninitialized SCEV");
> >+
> >+    SCEVDivision D(SE, Numerator, Denominator);
> >+
> >+    // Check for the trivial case here to avoid having to check for it in the
> >+    // rest of the code.
> >+    if (Numerator == Denominator) {
> >+      *Quotient = D.One;
> >+      *Remainder = D.Zero;
> >+      return;
> >      }
> >
> >-    // When GCD is not a constant, it could be that the GCD is an Add, Mul,
> >-    // AddRec, etc., in which case we want to find out how many times the
> >-    // Constant divides the GCD: we then return that as the new GCD.
> >-    const SCEV *Rem = Zero;
> >-    const SCEV *Res = findGCD(SE, GCD, Constant, &Rem);
> >-
> >-    if (Res == One || Rem != Zero) {
> >-      Remainder = Constant;
> >-      return One;
> >+    if (Numerator == D.Zero) {
> >+      *Quotient = D.Zero;
> >+      *Remainder = D.Zero;
> >+      return;
> >      }
> >
> >-    assert(isa<SCEVConstant>(Res) && "Res should be a constant");
> >-    Remainder = SE.getConstant(srem(Constant, cast<SCEVConstant>(Res)));
> >-    return Res;
> >-  }
> >-
> >-  const SCEV *visitTruncateExpr(const SCEVTruncateExpr *Expr) {
> >-    if (GCD != Expr)
> >-      Remainder = Expr;
> >-    return GCD;
> >-  }
> >-
> >-  const SCEV *visitZeroExtendExpr(const SCEVZeroExtendExpr *Expr) {
> >-    if (GCD != Expr)
> >-      Remainder = Expr;
> >-    return GCD;
> >-  }
> >+    // Split the Denominator when it is a product.
> >+    if (const SCEVMulExpr *T = dyn_cast<const SCEVMulExpr>(Denominator)) {
> >+      const SCEV *Q, *R;
> >+      *Quotient = Numerator;
> >+      for (int I = 0, E = T->getNumOperands(); I < E; ++I) {
> >+        divide(SE, *Quotient, T->getOperand(I), &Q, &R);
>
> C++11 foreach loop?
>
> The corresponding iterator range does not yet exist, but you could
> create it with llvm/ADT/iterator_range.h in a separate patch. As
> this patch adds plenty of operand loops, this may be worthwhile.

Won't do.

>
> >+        *Quotient = Q;
> >+
> >+        // Bail out when the Numerator is not divisible by one of the terms of
> >+        // the Denominator.
> >+        if (R != D.Zero) {
> >+          *Quotient = D.Zero;
> >+          *Remainder = Numerator;
> >+          return;
> >+        }
> >+      }
> >+      *Remainder = D.Zero;
> >+      return;
> >+    }
> >
> >-  const SCEV *visitSignExtendExpr(const SCEVSignExtendExpr *Expr) {
> >-    if (GCD != Expr)
> >-      Remainder = Expr;
> >-    return GCD;
> >+    D.visit(Numerator);
> >+    *Quotient = D.Quotient;
> >+    *Remainder = D.Remainder;
> >+  }
> >+
> >+  SCEVDivision(ScalarEvolution &S, const SCEV *Numerator, const SCEV *Denominator)
>
> 80 columns.
>
> >+      : SE(S), Denominator(Denominator) {
> >+    Zero = SE.getConstant(Denominator->getType(), 0);
> >+    One = SE.getConstant(Denominator->getType(), 1);
> >+
> >+    // By default, we don't know how to divide Expr by Denominator.
> >+    // Providing the default here simplifies the rest of the code.
> >+    Quotient = Zero;
> >+    Remainder = Numerator;
> >+  }
> >+
> >+  // Except in the trivial case described above, we do not know how to divide
> >+  // Expr by Denominator for the following functions with empty implementation.
> >+  void visitTruncateExpr(const SCEVTruncateExpr *Numerator) {}
> >+  void visitZeroExtendExpr(const SCEVZeroExtendExpr *Numerator) {}
> >+  void visitSignExtendExpr(const SCEVSignExtendExpr *Numerator) {}
> >+  void visitUDivExpr(const SCEVUDivExpr *Numerator) {}
> >+  void visitSMaxExpr(const SCEVSMaxExpr *Numerator) {}
> >+  void visitUMaxExpr(const SCEVUMaxExpr *Numerator) {}
> >+  void visitUnknown(const SCEVUnknown *Numerator) {}
> >+  void visitCouldNotCompute(const SCEVCouldNotCompute *Numerator) {}
> >+
> >+  void visitConstant(const SCEVConstant *Numerator) {
> >+    if (const SCEVConstant *D = dyn_cast<SCEVConstant>(Denominator)) {
> >+      Quotient = SE.getConstant(sdiv(Numerator, D));
> >+      Remainder = SE.getConstant(srem(Numerator, D));
> >+      return;
> >+    }
> >    }
> >
> >-  const SCEV *visitAddExpr(const SCEVAddExpr *Expr) {
> >-    if (GCD == Expr)
> >-      return GCD;
> >-
> >-    for (int i = 0, e = Expr->getNumOperands(); i < e; ++i) {
> >-      const SCEV *Rem = Zero;
> >-      const SCEV *Res = findGCD(SE, Expr->getOperand(e - 1 - i), GCD, &Rem);
> >+  void visitAddRecExpr(const SCEVAddRecExpr *Numerator) {
> >+    const SCEV *StartQ, *StartR, *StepQ, *StepR;
> >+    assert(Numerator->isAffine() && "Numerator should be affine");
> >+    divide(SE, Numerator->getStart(), Denominator, &StartQ, &StartR);
> >+    divide(SE, Numerator->getStepRecurrence(SE), Denominator, &StepQ, &StepR);
> >+    Quotient = SE.getAddRecExpr(StartQ, StepQ, Numerator->getLoop(),
> >+                                Numerator->getNoWrapFlags());
> >+    Remainder = SE.getAddRecExpr(StartR, StepR, Numerator->getLoop(),
> >+                                 Numerator->getNoWrapFlags());
> >+  }
> >+
> >+  void visitAddExpr(const SCEVAddExpr *Numerator) {
> >+    SmallVector<const SCEV *, 2> Qs, Rs;
> >+    for (int I = 0, E = Numerator->getNumOperands(); I < E; ++I) {
>
> C++11 range loop?
>
> >+      const SCEV *Q, *R;
> >+      divide(SE, Numerator->getOperand(I), Denominator, &Q, &R);
> >+      Qs.push_back(Q);
> >+      Rs.push_back(R);
> >+    }
> >
> >-      // FIXME: There may be ambiguous situations: for instance,
> >-      // GCD(-4 + (3 * %m), 2 * %m) where 2 divides -4 and %m divides (3 * %m).
> >-      // The order in which the AddExpr is traversed computes a different GCD
> >-      // and Remainder.
> >-      if (Res != One)
> >-        GCD = Res;
> >-      if (Rem != Zero)
> >-        Remainder = SE.getAddExpr(Remainder, Rem);
> >+    if (Qs.size() == 1) {
> >+      Quotient = Qs[0];
> >+      Remainder = Rs[0];
> >+      return;
> >      }
> >
> >-    return GCD;
> >+    Quotient = SE.getAddExpr(Qs);
> >+    Remainder = SE.getAddExpr(Rs);
> >    }
> >
> >-  const SCEV *visitMulExpr(const SCEVMulExpr *Expr) {
> >-    if (GCD == Expr)
> >-      return GCD;
> >+  void visitMulExpr(const SCEVMulExpr *Numerator) {
> >+    SmallVector<const SCEV *, 2> Qs;
> >
> >-    for (int i = 0, e = Expr->getNumOperands(); i < e; ++i) {
> >-      if (Expr->getOperand(i) == GCD)
> >-        return GCD;
> >-    }
> >+    bool FoundDenominatorTerm = false;
> >+    for (int I = 0, E = Numerator->getNumOperands(); I < E; ++I) {
>
> C++11 range loop?
>
> >+      if (FoundDenominatorTerm) {
> >+        Qs.push_back(Numerator->getOperand(I));
> >+        continue;
> >+      }
> >
> >-    // If we have not returned yet, it means that GCD is not part of Expr.
> >-    const SCEV *PartialGCD = One;
> >-    for (int i = 0, e = Expr->getNumOperands(); i < e; ++i) {
>
> C++11 range loop?
>
> >-      const SCEV *Rem = Zero;
> >-      const SCEV *Res = findGCD(SE, Expr->getOperand(i), GCD, &Rem);
> >-      if (Rem != Zero)
> >-        // GCD does not divide Expr->getOperand(i).
> >+      // Check whether Denominator divides one of the product operands.
> >+      const SCEV *Q, *R;
> >+      divide(SE, Numerator->getOperand(I), Denominator, &Q, &R);
> >+      if (R != Zero) {
> >+        Qs.push_back(Numerator->getOperand(I));
> >          continue;
> >+      }
> >+      FoundDenominatorTerm = true;
> >+      Qs.push_back(Q);
> >+    }
> >
> >-      if (Res == GCD)
> >-        return GCD;
> >-      PartialGCD = SE.getMulExpr(PartialGCD, Res);
> >-      if (PartialGCD == GCD)
> >-        return GCD;
> >+    if (FoundDenominatorTerm) {
> >+      Remainder = Zero;
> >+      if (Qs.size() == 1)
> >+        Quotient = Qs[0];
> >+      else
> >+        Quotient = SE.getMulExpr(Qs);
> >+      return;
> >      }
> >
> >-    if (PartialGCD != One)
> >-      return PartialGCD;
> >-
> >-    // Failed to find a PartialGCD: set the Remainder to the full expression,
> >-    // and return the GCD.
> >-    Remainder = Expr;
> >-    const SCEVMulExpr *Mul = dyn_cast<SCEVMulExpr>(GCD);
> >-    if (!Mul)
> >-      return GCD;
> >-
> >-    // When the GCD is a multiply expression, try to decompose it:
> >-    // this occurs when Step does not divide the Start expression
> >-    // as in: {(-4 + (3 * %m)),+,(2 * %m)}
> >-    for (int i = 0, e = Mul->getNumOperands(); i < e; ++i) {
> >-      const SCEV *Rem = Zero;
> >-      const SCEV *Res = findGCD(SE, Expr, Mul->getOperand(i), &Rem);
> >-      if (Rem == Zero) {
> >-        Remainder = Rem;
> >-        return Res;
> >-      }
> >+    if (!isa<SCEVUnknown>(Denominator)) {
> >+      Quotient = Zero;
> >+      Remainder = Numerator;
> >+      return;
> >      }
> >
> >-    return GCD;
> >+    // The Remainder is obtained by replacing Denominator by 0 in Numerator.
> >+    ValueToValueMap RewriteMap;
> >+    RewriteMap[cast<SCEVUnknown>(Denominator)->getValue()] =
> >+        cast<SCEVConstant>(Zero)->getValue();
> >+    Remainder = SCEVParameterRewriter::rewrite(Numerator, SE, RewriteMap, true);
> >+
> >+    // Quotient is (Numerator - Remainder) divided by Denominator.
> >+    const SCEV *Q, *R;
> >+    divide(SE, SE.getMinusSCEV(Numerator, Remainder), Denominator, &Q, &R);
> >+    assert(R == Zero &&
> >+           "(Numerator - Remainder) should evenly divide Denominator");
> >+    Quotient = Q;
> >    }
> >
> >-  const SCEV *visitUDivExpr(const SCEVUDivExpr *Expr) {
> >-    if (GCD != Expr)
> >-      Remainder = Expr;
> >-    return GCD;
> >+private:
> >+  ScalarEvolution &SE;
> >+  const SCEV *Denominator, *Quotient, *Remainder, *Zero, *One;
> >+};
> >+}
> >+
> >+// Find the Greatest Common Divisor of A and B.
> >+static const SCEV *
> >+findGCD(ScalarEvolution &SE, const SCEV *A, const SCEV *B) {
> >+
> >+  if (const SCEVConstant *CA = dyn_cast<SCEVConstant>(A))
> >+    if (const SCEVConstant *CB = dyn_cast<SCEVConstant>(B))
> >+      return SE.getConstant(gcd(CA, CB));
> >+
> >+  const SCEV *One = SE.getConstant(A->getType(), 1);
> >+  if (isa<SCEVConstant>(A) && isa<SCEVUnknown>(B))
> >+    return One;
> >+  if (isa<SCEVUnknown>(A) && isa<SCEVConstant>(B))
> >+    return One;
> >+
> >+  const SCEV *Q, *R;
> >+  if (const SCEVMulExpr *M = dyn_cast<SCEVMulExpr>(A)) {
> >+    SmallVector<const SCEV *, 2> Qs;
> >+    for (int I = 0, E = M->getNumOperands(); I < E; ++I)
>
> C++11 range loop?
>
> >+      Qs.push_back(findGCD(SE, M->getOperand(I), B));
> >+    return SE.getMulExpr(Qs);
> >+  }
> >+  if (const SCEVMulExpr *M = dyn_cast<SCEVMulExpr>(B)) {
> >+    SmallVector<const SCEV *, 2> Qs;
> >+    for (int I = 0, E = M->getNumOperands(); I < E; ++I)
>
> C++11 range loop?
>
> >+      Qs.push_back(findGCD(SE, A, M->getOperand(I)));
> >+    return SE.getMulExpr(Qs);
> >    }
> >
> >-  const SCEV *visitAddRecExpr(const SCEVAddRecExpr *Expr) {
> >-    if (GCD == Expr)
> >-      return GCD;
> >+  const SCEV *Zero = SE.getConstant(A->getType(), 0);
> >+  SCEVDivision::divide(SE, A, B, &Q, &R);
> >+  if (R == Zero)
> >+    return B;
> >
> >-    if (!Expr->isAffine()) {
> >-      Remainder = Expr;
> >-      return GCD;
> >-    }
> >+  SCEVDivision::divide(SE, B, A, &Q, &R);
> >+  if (R == Zero)
> >+    return A;
> >
> >-    const SCEV *Rem = Zero;
> >-    const SCEV *Res = findGCD(SE, Expr->getOperand(0), GCD, &Rem);
> >-    if (Res == One || Res->isAllOnesValue()) {
> >-      Remainder = Expr;
> >-      return GCD;
> >-    }
> >+  return One;
> >+}
> >
> >-    if (Rem != Zero)
> >-      Remainder = SE.getAddExpr(Remainder, Rem);
> >+// Find the Greatest Common Divisor of all the SCEVs in Terms.
> >+static const SCEV *
> >+findGCD(ScalarEvolution &SE, SmallVectorImpl<const SCEV *> &Terms) {
> >+  assert(Terms.size() > 0 && "Terms vector is empty");
> >
> >-    Rem = Zero;
> >-    Res = findGCD(SE, Expr->getOperand(1), Res, &Rem);
> >-    if (Rem != Zero || Res == One || Res->isAllOnesValue()) {
> >-      Remainder = Expr;
> >-      return GCD;
> >-    }
> >+  const SCEV *GCD = Terms[0];
> >+  for (unsigned I = 1; I < Terms.size(); I++)
> >+    GCD = findGCD(SE, GCD, Terms[I]);
> >
> >-    return Res;
> >-  }
> >+  return GCD;
> >+}
> >
> >-  const SCEV *visitSMaxExpr(const SCEVSMaxExpr *Expr) {
> >-    if (GCD != Expr)
> >-      Remainder = Expr;
> >-    return GCD;
> >-  }
> >+static void findArrayDimensionsRec(ScalarEvolution &SE,
> >+                                   SmallVectorImpl<const SCEV *> &Terms,
> >+                                   SmallVectorImpl<const SCEV *> &Sizes,
> >+                                   const SCEV *Zero, const SCEV *One) {
> >+  // The GCD of all Terms is the dimension of the innermost dimension.
> >+  const SCEV *GCD = findGCD(SE, Terms);
> >
> >-  const SCEV *visitUMaxExpr(const SCEVUMaxExpr *Expr) {
> >-    if (GCD != Expr)
> >-      Remainder = Expr;
> >-    return GCD;
> >-  }
> >+  // End of recursion.
> >+  if (Terms.size() == 1) {
> >+    if (const SCEVMulExpr *M = dyn_cast<SCEVMulExpr>(GCD)) {
> >+      SmallVector<const SCEV *, 2> Qs;
> >+      for (int I = 0, E = M->getNumOperands(); I < E; ++I)
>
> C++11 range loop?
>
> >+        if (!isa<SCEVConstant>(M->getOperand(I)))
> >+          Qs.push_back(M->getOperand(I));
> >+
> >+      GCD = SE.getMulExpr(Qs);
> >+    }
> >
> >-  const SCEV *visitUnknown(const SCEVUnknown *Expr) {
> >-    if (GCD != Expr)
> >-      Remainder = Expr;
> >-    return GCD;
> >+    Sizes.push_back(GCD);
> >+    return;
> >    }
> >
> >-  const SCEV *visitCouldNotCompute(const SCEVCouldNotCompute *Expr) {
> >-    return One;
> >+  for (unsigned I = 0; I < Terms.size(); ++I) {
>
> C++11 range loop?
>
> >+    // Normalize the terms before the next call to findArrayDimensionsRec.
> >+    const SCEV *Q, *R;
> >+    SCEVDivision::divide(SE, Terms[I], GCD, &Q, &R);
> >+    assert(R == Zero && "GCD does not evenly divide one of the terms");
> >+    Terms[I] = Q;
> >    }
> >
> >-private:
> >-  ScalarEvolution &SE;
> >-  const SCEV *GCD, *Remainder, *Zero, *One;
> >-};
> >+  // Remove all SCEVConstants.
> >+  for (unsigned I = 0; I < Terms.size();)
> >+    if (isa<SCEVConstant>(Terms[I]))
> >+      Terms.erase(Terms.begin() + I);
> >+    else
> >+      ++I;
>
> This seems to work, but it looks a little confusing. What about:
>
> for (auto I = Terms.begin(), E = Terms.end(); I != E; I++)
>    if (isa<SCEVConstant>(Terms[I]))
>      I = Terms.erase(I);

Not done. I don't know how to make your version work: it has several problems
when I tried it out. I fixed at least two problems before I decided it's not
worth the effort ;-)

>
> >-struct SCEVDivision : public SCEVVisitor<SCEVDivision, const SCEV *> {
> >-public:
> >-  // Remove from Start all multiples of Step.
> >-  static const SCEV *divide(ScalarEvolution &SE, const SCEV *Start,
> >-                            const SCEV *Step) {
> >-    SCEVDivision D(SE, Step);
> >-    const SCEV *Rem = D.Zero;
> >-    (void)Rem;
> >-    // The division is guaranteed to succeed: Step should divide Start with no
> >-    // remainder.
> >-    assert(Step == SCEVGCD::findGCD(SE, Start, Step, &Rem) && Rem == D.Zero &&
> >-           "Step should divide Start with no remainder.");
> >-    return D.visit(Start);
> >-  }
> >+  if (Terms.size() > 0)
> >+    findArrayDimensionsRec(SE, Terms, Sizes, Zero, One);
> >+  Sizes.push_back(GCD);
> >+}
> >+
> >+namespace {
> >+struct FindParameter {
> >+  bool FoundParameter;
> >+  FindParameter() : FoundParameter(false) {}
> >
> >-  SCEVDivision(ScalarEvolution &S, const SCEV *G) : SE(S), GCD(G) {
> >-    Zero = SE.getConstant(GCD->getType(), 0);
> >-    One = SE.getConstant(GCD->getType(), 1);
> >+  bool follow(const SCEV *S) {
> >+    if (isa<SCEVUnknown>(S)) {
> >+      FoundParameter = true;
> >+      // Stop recursion: we found a parameter.
> >+      return false;
> >+    }
> >+    // Keep looking.
> >+    return true;
> >+  }
> >+  bool isDone() const {
> >+    // Stop recursion if we have found a parameter.
> >+    return FoundParameter;
> >    }
> >+};
> >+}
> >
> >-  const SCEV *visitConstant(const SCEVConstant *Constant) {
> >-    if (GCD == Constant)
> >-      return One;
> >+// Returns true when S contains at least a SCEVUnknown parameter.
> >+static inline bool
> >+containsParameters(const SCEV *S) {
> >+  FindParameter F;
> >+  SCEVTraversal<FindParameter> ST(F);
> >+  ST.visitAll(S);
> >
> >-    if (const SCEVConstant *CGCD = dyn_cast<SCEVConstant>(GCD))
> >-      return SE.getConstant(sdiv(Constant, CGCD));
> >-    return Constant;
> >-  }
> >+  return F.FoundParameter;
> >+}
> >
> >-  const SCEV *visitTruncateExpr(const SCEVTruncateExpr *Expr) {
> >-    if (GCD == Expr)
> >-      return One;
> >-    return Expr;
> >-  }
> >+// Returns true when one of the SCEVs of Terms contains a SCEVUnknown parameter.
> >+static inline bool
> >+containsParameters(SmallVectorImpl<const SCEV *> &Terms) {
> >+  for (unsigned I = 0; I < Terms.size(); I++)
> >+    if (containsParameters(Terms[I]))
> >+      return true;
> >+  return false;
> >+}
> >
> >-  const SCEV *visitZeroExtendExpr(const SCEVZeroExtendExpr *Expr) {
> >-    if (GCD == Expr)
> >-      return One;
> >-    return Expr;
> >-  }
> >+// Return the number of product terms in S.
> >+static inline int numberOfTerms(const SCEV *S) {
> >+  if (const SCEVMulExpr *Expr = dyn_cast<SCEVMulExpr>(S))
> >+    return Expr->getNumOperands();
> >+  return 1;
> >+}
> >
> >-  const SCEV *visitSignExtendExpr(const SCEVSignExtendExpr *Expr) {
> >-    if (GCD == Expr)
> >-      return One;
> >-    return Expr;
> >-  }
> >+/// Second step of delinearization: compute the array dimensions Sizes from the
> >+/// set of Terms extracted from the memory access function of this SCEVAddRec.
> >+void SCEVAddRecExpr::findArrayDimensions(
> >+    ScalarEvolution &SE, SmallVectorImpl<const SCEV *> &Terms,
> >+    SmallVectorImpl<const SCEV *> &Sizes) const {
> >
> >-  const SCEV *visitAddExpr(const SCEVAddExpr *Expr) {
> >-    if (GCD == Expr)
> >-      return One;
> >+  if (Terms.size() < 2)
> >+    return;
> >
> >-    SmallVector<const SCEV *, 2> Operands;
> >-    for (int i = 0, e = Expr->getNumOperands(); i < e; ++i)
> >-      Operands.push_back(divide(SE, Expr->getOperand(i), GCD));
> >+  // Early return when Terms do not contain parameters: we do not delinearize
> >+  // non parametric SCEVs.
> >+  if (!containsParameters(Terms))
> >+    return;
> >
> >-    if (Operands.size() == 1)
> >-      return Operands[0];
> >-    return SE.getAddExpr(Operands);
> >-  }
> >+#ifndef NDEBUG
> >+  DEBUG(dbgs() << "Terms:\n");
> >+  for (unsigned I = 0; I < Terms.size(); ++I)
> >+    DEBUG(dbgs() << *Terms[I] << "\n");
> >+#endif
>
> C++11 range loop? DEBUG({})?
>
> This also removes the need for the NDEBUG check.
>
> >-  const SCEV *visitMulExpr(const SCEVMulExpr *Expr) {
> >-    if (GCD == Expr)
> >-      return One;
> >+  // Remove duplicates.
> >+  std::sort(Terms.begin(), Terms.end());
> >+  Terms.erase(std::unique(Terms.begin(), Terms.end()), Terms.end());
> >
> >-    bool FoundGCDTerm = false;
> >-    for (int i = 0, e = Expr->getNumOperands(); i < e; ++i)
> >-      if (Expr->getOperand(i) == GCD)
> >-        FoundGCDTerm = true;
> >+  // Put larger terms first.
> >+  std::sort(Terms.begin(), Terms.end(), [](const SCEV *LHS, const SCEV *RHS) {
> >+    return numberOfTerms(LHS) > numberOfTerms(RHS);
> >+  });
> >
> >-    SmallVector<const SCEV *, 2> Operands;
> >-    if (FoundGCDTerm) {
> >-      FoundGCDTerm = false;
> >-      for (int i = 0, e = Expr->getNumOperands(); i < e; ++i) {
> >-        if (FoundGCDTerm)
> >-          Operands.push_back(Expr->getOperand(i));
> >-        else if (Expr->getOperand(i) == GCD)
> >-          FoundGCDTerm = true;
> >-        else
> >-          Operands.push_back(Expr->getOperand(i));
> >-      }
> >-    } else {
> >-      const SCEV *PartialGCD = One;
> >-      for (int i = 0, e = Expr->getNumOperands(); i < e; ++i) {
> >-        if (PartialGCD == GCD) {
> >-          Operands.push_back(Expr->getOperand(i));
> >-          continue;
> >-        }
> >+#ifndef NDEBUG
> >+  DEBUG(dbgs() << "Terms after sorting:\n");
> >+  for (unsigned I = 0; I < Terms.size(); ++I)
> >+    DEBUG(dbgs() << *Terms[I] << "\n");
> >+#endif
>
> C++11 range loop? DEBUG({})?
>
> This also removes the need for the NDEBUG check.
>
> >
> >-        const SCEV *Rem = Zero;
> >-        const SCEV *Res = SCEVGCD::findGCD(SE, Expr->getOperand(i), GCD, &Rem);
> >-        if (Rem == Zero) {
> >-          PartialGCD = SE.getMulExpr(PartialGCD, Res);
> >-          Operands.push_back(divide(SE, Expr->getOperand(i), Res));
> >-        } else {
> >-          Operands.push_back(Expr->getOperand(i));
> >-        }
> >-      }
> >-    }
> >+  const SCEV *Zero = SE.getConstant(this->getType(), 0);
> >+  const SCEV *One = SE.getConstant(this->getType(), 1);
> >+  findArrayDimensionsRec(SE, Terms, Sizes, Zero, One);
> >
> >-    if (Operands.size() == 1)
> >-      return Operands[0];
> >-    return SE.getMulExpr(Operands);
> >-  }
> >+#ifndef NDEBUG
> >+  DEBUG(dbgs() << "Sizes:\n");
> >+  for (unsigned int I = 0; I < Sizes.size(); ++I)
> >+    DEBUG(dbgs() << *Sizes[I] << "\n");
> >+#endif
>
> C++11 range loop? DEBUG({})?
>
> This also removes the need for the NDEBUG check.
>
> >+}
> >
> >-  const SCEV *visitUDivExpr(const SCEVUDivExpr *Expr) {
> >-    if (GCD == Expr)
> >-      return One;
> >-    return Expr;
> >-  }
> >+/// Third step of delinearization: compute the access functions for the
> >+/// Subscripts based on the dimensions in Sizes.
> >+const SCEV *SCEVAddRecExpr::computeAccessFunctions(
> >+    ScalarEvolution &SE, SmallVectorImpl<const SCEV *> &Subscripts,
> >+    SmallVectorImpl<const SCEV *> &Sizes) const {
> >+  // Early exit in case this SCEV is not an affine multivariate function.
> >+  const SCEV *Zero = SE.getConstant(this->getType(), 0);
> >+  if (!this->isAffine())
> >+    return Zero;
> >
> >-  const SCEV *visitAddRecExpr(const SCEVAddRecExpr *Expr) {
> >-    if (GCD == Expr)
> >-      return One;
> >+  DEBUG(dbgs() << "(delinearize: " << *this << "\n");
> >
> >-    assert(Expr->isAffine() && "Expr should be affine");
> >+  const SCEV *Res = this, *Remainder = Zero;
> >+  int Last = Sizes.size() - 1;
> >+  for (int i = Last; i >= 0; i--) {
> >+    const SCEV *Q, *R;
> >+    SCEVDivision::divide(SE, Res, Sizes[i], &Q, &R);
> >
> >-    const SCEV *Start = divide(SE, Expr->getStart(), GCD);
> >-    const SCEV *Step = divide(SE, Expr->getStepRecurrence(SE), GCD);
> >+#ifndef NDEBUG
> >+    DEBUG(dbgs() << "Res: " << *Res << "\n");
> >+    DEBUG(dbgs() << "Sizes[i]: " << *Sizes[i] << "\n");
> >+    DEBUG(dbgs() << "Res divided by Sizes[i]:\n");
> >+    DEBUG(dbgs() << "Quotient: " << *Q << "\n");
> >+    DEBUG(dbgs() << "Remainder: " << *R << "\n");
> >+#endif
>
> DEBUG({})?
>
> This is no need to check NDEBUG. DEBUG is defined to an empty
> statement as soon as NDEBUG is set.
>
> >-    return SE.getAddRecExpr(Start, Step, Expr->getLoop(),
> >-                            Expr->getNoWrapFlags());
> >-  }
> >+    Res = Q;
> >
> >-  const SCEV *visitSMaxExpr(const SCEVSMaxExpr *Expr) {
> >-    if (GCD == Expr)
> >-      return One;
> >-    return Expr;
> >-  }
> >+    if (i == Last) {
> >+      // Do not record the last subscript corresponding to the size of elements
> >+      // in the array.
> >+      Remainder = R;
> >+      continue;
> >+    }
> >
> >-  const SCEV *visitUMaxExpr(const SCEVUMaxExpr *Expr) {
> >-    if (GCD == Expr)
> >-      return One;
> >-    return Expr;
> >+    // Record the access function for the current subscript.
> >+    Subscripts.push_back(R);
> >    }
> >
> >-  const SCEV *visitUnknown(const SCEVUnknown *Expr) {
> >-    if (GCD == Expr)
> >-      return One;
> >-    return Expr;
> >-  }
> >+  // Also push in last position the remainder of the last division: it will be
> >+  // the access function of the innermost dimension.
> >+  Subscripts.push_back(Res);
> >
> >-  const SCEV *visitCouldNotCompute(const SCEVCouldNotCompute *Expr) {
> >-    return Expr;
> >-  }
> >+  std::reverse(Subscripts.begin(), Subscripts.end());
> >
> >-private:
> >-  ScalarEvolution &SE;
> >-  const SCEV *GCD, *Zero, *One;
> >-};
> >+#ifndef NDEBUG
> >+  DEBUG(dbgs() << "Subscripts:\n");
> >+  for (unsigned int I = 0; I < Subscripts.size(); ++I)
> >+    DEBUG(dbgs() << *Subscripts[I] << "\n");
> >+#endif
>
> C++11 range loop? DEBUG({})?
>
> No need for NDEBUG check.
>
> >+  return Remainder;
> >  }
> >
> >  /// Splits the SCEV into two vectors of SCEVs representing the subscripts and
> >@@ -7253,75 +7439,24 @@ const SCEV *
> >  SCEVAddRecExpr::delinearize(ScalarEvolution &SE,
> >                              SmallVectorImpl<const SCEV *> &Subscripts,
> >                              SmallVectorImpl<const SCEV *> &Sizes) const {
> >-  // Early exit in case this SCEV is not an affine multivariate function.
> >-  if (!this->isAffine())
> >-    return this;
> >+  // First step: collect parametric terms.
> >+  SmallVector<const SCEV *, 4> Terms;
> >+  collectParametricTerms(SE, Terms);
> >
> >-  const SCEV *Start = this->getStart();
> >-  const SCEV *Step = this->getStepRecurrence(SE);
> >+  // Second step: find subscript sizes.
> >+  findArrayDimensions(SE, Terms, Sizes);
> >
> >-  // Build the SCEV representation of the canonical induction variable in the
> >-  // loop of this SCEV.
> >-  const SCEV *Zero = SE.getConstant(this->getType(), 0);
> >-  const SCEV *One = SE.getConstant(this->getType(), 1);
> >-  const SCEV *IV =
> >-      SE.getAddRecExpr(Zero, One, this->getLoop(), this->getNoWrapFlags());
> >-
> >-  DEBUG(dbgs() << "(delinearize: " << *this << "\n");
> >-
> >-  // When the stride of this SCEV is 1, do not compute the GCD: the size of this
> >-  // subscript is 1, and this same SCEV for the access function.
> >-  const SCEV *Remainder = Zero;
> >-  const SCEV *GCD = One;
> >-
> >-  // Find the GCD and Remainder of the Start and Step coefficients of this SCEV.
> >-  if (Step != One && !Step->isAllOnesValue())
> >-    GCD = SCEVGCD::findGCD(SE, Start, Step, &Remainder);
> >-
> >-  DEBUG(dbgs() << "GCD: " << *GCD << "\n");
> >-  DEBUG(dbgs() << "Remainder: " << *Remainder << "\n");
> >-
> >-  const SCEV *Quotient = Start;
> >-  if (GCD != One && !GCD->isAllOnesValue())
> >-    // As findGCD computed Remainder, GCD divides "Start - Remainder." The
> >-    // Quotient is then this SCEV without Remainder, scaled down by the GCD.  The
> >-    // Quotient is what will be used in the next subscript delinearization.
> >-    Quotient = SCEVDivision::divide(SE, SE.getMinusSCEV(Start, Remainder), GCD);
> >-
> >-  DEBUG(dbgs() << "Quotient: " << *Quotient << "\n");
> >-
> >-  const SCEV *Rem = Quotient;
> >-  if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Quotient))
> >-    // Recursively call delinearize on the Quotient until there are no more
> >-    // multiples that can be recognized.
> >-    Rem = AR->delinearize(SE, Subscripts, Sizes);
> >-
> >-  // Scale up the canonical induction variable IV by whatever remains from the
> >-  // Step after division by the GCD: the GCD is the size of all the sub-array.
> >-  if (Step != One && !Step->isAllOnesValue() && GCD != One &&
> >-      !GCD->isAllOnesValue() && Step != GCD) {
> >-    Step = SCEVDivision::divide(SE, Step, GCD);
> >-    IV = SE.getMulExpr(IV, Step);
> >-  }
> >-  // The access function in the current subscript is computed as the canonical
> >-  // induction variable IV (potentially scaled up by the step) and offset by
> >-  // Rem, the offset of delinearization in the sub-array.
> >-  const SCEV *Index = SE.getAddExpr(IV, Rem);
> >-
> >-  // Record the access function and the size of the current subscript.
> >-  Subscripts.push_back(Index);
> >-  Sizes.push_back(GCD);
> >+  // Third step: compute the access functions for each subscript.
> >+  const SCEV *Remainder = computeAccessFunctions(SE, Subscripts, Sizes);
>
> Very nice. I like small functions!
>
> >    DEBUG(dbgs() << "ArrayRef");
> >-  for (int i = 0; i < Size; i++)
> >+  for (unsigned int i = 0; i < Sizes.size(); i++)
> >      DEBUG(dbgs() << "[" << *Subscripts[i] << "]");
> >    DEBUG(dbgs() << "\n)\n");
>
> The large amount of DEBUGs looks confusing. You could write it as follows:
>
>     DEBUG({
>     dbgs() << "ArrayRef";
>         for (unsigned int i = 0; i < Sizes.size(); i++)
>           dbgs() << "[" << *Subscripts[i] << "]";
>         dbgs() << "\n)\n";
>     });
>
>
> That's for now. I need to go again through the algorithmic details,
> but for now I need some sleep.
>
> Cheers,
> Tobias

Let me know if I missed something.

Tobias Grosser wrote:
> On 02/05/2014 02:21, Tobias Grosser wrote:
> >Hi Sebastian,
> >
> >it is good to see the result of our discussions on the mailing list. I
> >already tested it with Polly and it works great. After spending a couple
> >of hours integrating Polly into Julia (julialang.org), we can now nicely
> >delinearize their linear algebra kernels. I will test it a little more
> >with ublas and polybench, but I am rather confident about this change.
>
> I just tested your patch more intensively running it over all the
> polybench test cases. For almost all test cases, we correctly
> delinearize the accesses.
>
> There are three issues I run into:
>
> 1) ASSERT: GCD does not evenly divide one of the terms
> ======================================================
>
> I attached one test case that yields an assert when running:
>
> opt -delinearize -analyze GCD-does-not-evenly-divide-one-of-the-terms.ll
>
> Inst:  %1 = load double* %arrayidx70, align 8, !tbaa !1
> In Loop with Header: for.body60
> AddRec: {{((8 * undef) + %Ey),+,(8 * undef *
> undef)}<%for.cond55.preheader>,+,(8 * undef)}<%for.cond58.preheader>

These are strange access functions: I should probably just bail out when the
SCEV contains undefined values. Do you agree? Or are the undef's a side effect
of bugpoint removing the initialization of variables?

I haven't implemented anything yet to fix this.

> opt: /home/grosser/Projects/polly/git/lib/Analysis/ScalarEvolution.cpp:7228:
> void findArrayDimensionsRec(llvm::ScalarEvolution &,
> SmallVectorImpl<const llvm::SCEV *> &, SmallVectorImpl<const
> llvm::SCEV *> &, const llvm::SCEV *, const llvm::SCEV *): Assertion
> `R == Zero && "GCD does not evenly divide one of the terms"' failed.
>
> 2) SEGFAULT && infinite recursion
> =================================
>
> When running 'opt -delinearize -analyze segfault.ll', opt segfaults.
> The following change fixes the reason for this segfault:
>
> -    assert(Numerator && Denominator && *Quotient && *Remainder &&
> -           "Uninitialized SCEV");
> +    assert(Numerator && Denominator && "Uninitialized SCEV");

I don't see this assert triggering on the testcase you attached.

>
> But unfortunately we run immediately in an infinite recursion (which
> causes another segfault:

I fixed this recursion by checking that the following folding does not diverge:

    // Quotient is (Numerator - Remainder) divided by Denominator.
    const SCEV *Q, *R;
    const SCEV *Diff = SE.getMinusSCEV(Numerator, Remainder);
    if (sizeOfSCEV(Diff) > sizeOfSCEV(Numerator)) {
      // This SCEV does not seem to simplify: fail the division here.
      Quotient = Zero;
      Remainder = Numerator;
      return;
    }

(gdb) p Numerator->dump()
((1 + %arg) * (-1 + %arg))
(gdb) p Denominator->dump()
%arg
(gdb) p Diff->dump()
(1 + ((1 + %arg) * (-1 + %arg)))
(gdb) p sizeOfSCEV(Diff)
$13 = 7
(gdb) p sizeOfSCEV(Numerator)
$14 = 6

I was expecting to see the SCEV folding nicely when substracting the
remainder... and in this particular case it was not folding.

> 3. Array size in the index expression
> =====================================
>
> opt -delinearize -analyze /tmp/negative-offset.ll yields:
>
> Inst:  store double 2.000000e+00, double* %tmp16, align 8
> In Loop with Header: bb12
> AddRec: {(-8 + (8 * %arg1) + %arg2),+,(8 * %arg1)}<%bb12>
> failed to delinearize
>
> This one is interesting, as the original access is:
>
> double A[][N];
> for i:
>     A[i][N-1] = 2;
>
> I assume we could delinearize it to
> A[i+1][-1] (which is incorrect) and then translate this to:
>
> [x][y] -> [x][y]: y >= 0; [x][y] -> [x-1][y+N]: y < 0
>
> Let's keep this discussion out of the review. I have the feeling we
> can address it later.

Let's discuss the runtime conditions tomorrow on the polly phone call.  I guess
it is just a matter of interpreting the result of delinearization in the data
dependence analysis.

Sebastian
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-split-delinearization-pass-in-3-steps.patch
Type: text/x-patch
Size: 59071 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140506/95780e31/attachment.bin>