[llvm-commits] [PATCH] Multidimensional Array Index Delinearization Analysis

Mon Oct 1 10:32:46 PDT 2012

On Sep 28, 2012, at 8:03 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> On Fri, 28 Sep 2012 19:34:01 -0700
> Andrew Trick <atrick at apple.com> wrote:
> 
>> 
>> On Sep 27, 2012, at 3:11 AM, Tobias Grosser <tobias at grosser.es> wrote:
>> 
>>> On 09/27/2012 11:29 AM, Sameer Sahasrabuddhe wrote:
>>>> 
>>>> Hi Hal,
>>>> 
>>>> I tried the version from Delinearization-20120926.patch on a
>>>> Fortran loopnest, with -O3 before invoking delinearization. See
>>>> attached file "m.pre.ll".
>>>> 
>>>> The delinearizer misses out on the negation expression implemented
>>>> as an XOR (the value "%not"):
>>>> 
>>>>    ~n = -n - 1
>>>> 
>>>> As a result, the "n" above is not available as a possible term in
>>>> a GCD. It worked when I manually substituted that negation with
>>>> its expansion. I am not sure if this should be handled as an
>>>> additional method "addPolysForXor()", or the IR itself should be
>>>> modified as a precursor to delinearization. This expansion is
>>>> similar to what happens in ScalarEvolution::getNotSCEV().
>>> 
>>> It seems we would need to mirror a lot of pattern matching from
>>> ScalarEvolution. Working directly on SCEVs could avoid this.
>>> 
>>> Another remark coming from a similar angle. The current analysis
>>> is not on demand, but always iterates over all instructions. I have
>>> the feeling within LLVM, people try to perform analysis on demand
>>> (e.g. Preston's Dependency Analysis, but also ScalarEvolution).
>>> Working directly on SCEVs would make it easy to do an on-demand
>>> analysis that is only called for the scevs used by memory access
>>> instructions.
>>> 
>>> @Andrew: I remember you mentioned that ScalarEvolution has some
>>> design problems. Could you elaborate on them and if they would
>>> cause issues in this context?
>> 
>> If we ignore SCEVExpander and only consider SCEV-the-analysis, then
>> we have a pretty robust system. There are a few issues:
>> - LCSSA form artificially limits analysis.
>> - SCEV inherently cannot preserve nsw/nuw flags. When it attempts to
>> do it, the results can depend on the query order.
>> - Expressions with sext/zext/trunc do not have a canonical form.
>> - SCEV queries can take time exponential to the expression depth
>> (mainly a problem because of sext/zext/trunc).
>> 
>> SCEV should be just fine within a single loop nest with simple
>> induction variable expressions.
>> 
>> It seems like SCEV has already done the hard work of factoring the
>> polynomial according to the loop nesting and finding the interesting
>> coefficients. But I don't understand Hal's algorithm well enough yet
>> to claim that it's trivially adapted to chains of recurrences.
> 
> Does SCEV canonicalize the ordering of the recurrences? What I mean
> is that if I have a value which is a function of multiple loop
> induction variables I'll get a recurrence which will have coefficients
> that are a recurrence which will have coefficients that are a
> recurrence, etc. As far as I can tell, these recurrences are just
> univariate polynomials. Can I depend on these recurrences being nested
> with loop depth?

Yes. Add/Mul expressions that contain recurrences (AddRecs) order the AddRecs by the loop depth. See SCEVComplexityCompare. AddRecs themselves are naturally nested by the loop depth because any loop invariant portion is folded into the recurrence. For example, see the getAddExpr loop that iterates over all terms that are AddRecs.

I assume your recurrences all have loop-invariant steps, so you're only dealing with affine induction variables. The recurrence's step is the coefficient for that loop's induction variable. The recurrence's start will be a recurrence over the outer loop, and the recurrences will be nested following the loop nest.
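
To make the nesting concrete, here is a minimal Python sketch (not LLVM code) that models an affine AddRec as a (start, step, loop) tuple whose start may itself be such a tuple. The names eval_addrec, example_addrec and closed_form are made up for illustration, and the symbolic values %m, %o and %A are assumed to be replaced by plain integers. It uses the recurrence from Armin's example below and checks it against the flattened polynomial.

```python
def eval_addrec(rec, iters):
    """Evaluate a nested affine recurrence at the given iteration counts.

    rec is either a plain integer (a loop-invariant value) or a tuple
    (start, step, loop), mirroring the SCEV notation {start,+,step}<loop>.
    """
    if not isinstance(rec, tuple):
        return rec
    start, step, loop = rec
    # The start is the recurrence over the enclosing loop; the step is the
    # coefficient of this loop's induction variable.
    return eval_addrec(start, iters) + step * iters[loop]


def example_addrec(m, o, A):
    """{{{(56 + 8*(-4 + 3*m)*o + A),+,8*m*o}<i>,+,8*o}<j>,+,8}<k>"""
    base = 56 + 8 * (-4 + 3 * m) * o + A
    return (((base, 8 * m * o, "i"), 8 * o, "j"), 8, "k")


def closed_form(m, o, A, i, j, k):
    """The same value written as an ordinary polynomial."""
    return 8 * o * m * i + 8 * o * j + 8 * k + 24 * m * o - 32 * o + A + 56
```

For any integer inputs, eval_addrec(example_addrec(m, o, A), {"i": i, "j": j, "k": k}) should agree with closed_form(m, o, A, i, j, k).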

Armin provided a nice example in a previous thread:
<snip>
;   {{{(56 + (8 * (-4 + (3 * %m)) * %o) + %A),+,(8 * %m * %o)}<%for.i>,+,
;      (8 * %o)}<%for.j>,+,8}<%for.k>

we have (writing the SCEV in ordinary notation and in a suggestive order)

  8*%o*%m * i  +  8*%o * j  +  8 * k  +  24*%m*%o - 32*%o + %A + 56

and we see that the coefficients of the iterators are 8, 8*%o, 8*%o*%m and
every coefficient in this chain divides its successor. By factoring these
coefficients out from all the terms they divide, we find the subscripts for the
dimensions:

 8*%o*%m * (i+3)  +  8*%o * (j-4)  +  8 * (k+7)  +  %A

So A[i+3][j-4][k+7] is our candidate for a multi-dimensional access. The math
tool we'd need to implement this is (multivariate) polynomial division or
something similar (coefficients could be arbitrary polynomials, e.g., %o+5*%m+7
if somebody declares something like  double A[][o+5*m+7]  or even more complex
expressions).
</snip>
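
The factoring step in the quoted example can be sketched in a few lines of Python. This is only an illustration under a strong assumption: every step is a single monomial in %m and %o (8*%m*%o, 8*%o, 8), so "division" reduces to checking each term for divisibility. It is not a general multivariate polynomial division, and the names divide_by_monomial, delinearize, OFFSET and STEPS are hypothetical.

```python
# A polynomial in the symbols (m, o) is a dict: (exp_m, exp_o) -> int coefficient.
# A monomial divisor is a pair (coefficient, (exp_m, exp_o)).

def divide_by_monomial(poly, divisor):
    """Return (quotient, remainder) with poly == quotient * divisor + remainder."""
    d_coeff, d_exps = divisor
    quotient, remainder = {}, {}
    for exps, coeff in poly.items():
        divisible = coeff % d_coeff == 0 and all(
            e >= d for e, d in zip(exps, d_exps))
        if divisible:
            q_exps = tuple(e - d for e, d in zip(exps, d_exps))
            quotient[q_exps] = coeff // d_coeff
        else:
            remainder[exps] = coeff
    return quotient, remainder


def delinearize(offset, steps):
    """Peel one subscript offset per step; steps are ordered outermost loop first."""
    subscripts = []
    rest = offset
    for step in steps:
        q, rest = divide_by_monomial(rest, step)
        subscripts.append(q)
    return subscripts, rest


# Loop-invariant part of the example without the base pointer %A:
#   24*m*o - 32*o + 56
OFFSET = {(1, 1): 24, (0, 1): -32, (0, 0): 56}
# Steps of the i, j and k recurrences: 8*m*o, 8*o, 8.
STEPS = [(8, (1, 1)), (8, (0, 1)), (8, (0, 0))]

subscripts, leftover = delinearize(OFFSET, STEPS)
```

With these inputs the three quotients are the constants 3, -4 and 7 with nothing left over, which matches the factored form 8*%o*%m * (i+3) + 8*%o * (j-4) + 8 * (k+7) + %A. Note that plugging in concrete numbers for %m and %o and using integer division would in general give a different (wrapped) split, which is why the division has to be done symbolically.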

The SCEV way to handle this would be to call getUDivExpr to divide the recurrence's start by the recurrence's step. Any remainder is added to the current dimension's index. The quotient is itself a recurrence, so the process continues for each dimension.

The problem that I see is that ScalarEvolution::getUDivExpr doesn't implement any normalization except for division by constants.

Still, it would be worth attempting to solve this problem without a separate polynomial package. I could imagine special-casing the solution for the kind of nested recurrences that we expect to see. It might also make sense to do the division "implicitly" without attempting to create a SCEVUDiv expression. I think you're only dealing with linear expressions of a fairly regular form. SCEV's ability to canonicalize and unique expressions certainly helps. I can't say for sure whether this will lead to a simpler implementation, but I hope you consider it.

-Andy