I'm travelling right now, so can't fully participate.<div>But I'd encourage people to read Maslov's paper</div><div><br></div><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><div>Delinearization: an Efficient Way to Break Multiloop Dependence Equations</div>

</div><div>Vadim Maslov</div><div>PLDI '92</div></blockquote><div><br></div>Wolfe's book shows the same approach.<div>My impression was that he can do most, but not all,<div>of the delinearization we want. At one point, I though that</div>

<div>he couldn't handle coupled subscripts that had been linearized,</div><div>but I'll have to re-create my examples.</div><div><br></div><div>To get the best results, I thought we should change the GEP</div><div>

(or have a fancy version in addition to the current one).</div><div><br></div><div>Preston</div><div><br><div><br><div class="gmail_quote">On Wed, Sep 12, 2012 at 9:27 AM, Hal Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Wed, 12 Sep 2012 14:18:43 +0200<br>

Armin Größlinger <<a href="mailto:armin.groesslinger@uni-passau.de">armin.groesslinger@uni-passau.de</a>> wrote:<br>

<br>

> Hi,<br>

><br>

> I had a few thoughts about recovering multi-dimensional accesses in<br>

> Polly and maybe it's time to share them.<br>

><br>

> On 09/12/2012 10:38 AM, Tobias Grosser wrote:<br>

> > [...] Even in the last example which contains parameters both for<br>

> > the sizes and the offsets, the parameters for the sizes are the<br>

> > only ones that appear on the right hand sides ('8 * %m * %o' and '8<br>

> > * %o').<br>

><br>

> I guess this is the key observation. When we have a multi-dimensional<br>

> access, the coefficients of the iterators have to form an increasing<br>

> chain w.r.t. divisibility. For the example<br>

><br>

> > multidim_ivs_and_integer_offsets_3d.ll:<br>

> > ; void foo(long n, long m, long o, double A[n][m][o]) {<br>

> > ;   for (long i = 0; i < n; i++)<br>

> > ;     for (long j = 0; j < m; j++)<br>

> > ;       for (long k = 0; k < o;  k++)<br>

> > ;         A[i+3][j-4][k+7] = 1.0;<br>

> > ; }<br>

><br>

> with access function<br>

><br>

> > ;   {{{(56 + (8 * (-4 + (3 * %m)) * %o) + %A),+,(8 * %m *<br>

> > %o)}<%for.i>,+, ;      (8 * %o)}<%for.j>,+,8}<%for.k><br>

><br>

> we have (writing the SCEV in ordinary notation and in a suggestive<br>

> order)<br>

><br>

>    8*%o*%m * i  +  8*%o * j  +  8 * k  +  24*%m*%o - 32*%o + %A + 56<br>

><br>

> and we see that the coefficients of the iterators are 8, 8*%o,<br>

> 8*%o*%m and every coefficient in this chain divides it successor. By<br>

> factoring these coefficients out from all the term they divide, we<br>

> find the subscripts for the dimensions:<br>

><br>

>   8*%o*%m * (i+3)  +  8*%o * (j-4)  +  8 * (k+7)  +  %A<br>

><br>

> So A[i+4][j-4][k-7] is our candidate for a multi-dimensional access.<br>

> The math tool we'd need to implement this is (multivariate)<br>

> polynomial division or something similar<br>

<br>

</div></div>Agreed. Furthermore, I already have code that does this (I've started<br>

working on a transformation pass which re-factors and optimizes integer<br>

polynomial expressions, so I can repurpose code from there). It sounds<br>

like we just need to extract integer polynomials used by GEPs<br>

containing induction variables, and, order-by-order, factor out the<br>

non-induction variables. That will give us the 'size' of each<br>

dimension. As you've noted below, we need to make sure that no index<br>

exceeds its size, and that all accesses to an array within a given loop<br>

are consistent. Does that sound reasonable?<br>

<br>

 -Hal<br>

<div class="HOEnZb"><div class="h5"><br>

> (coefficients could be<br>

> arbitrary polynomials, e.g., %o+5*%m+7 if somebody declares something<br>

> like  double A[][o+5*m+7]  or even more complex expressions).<br>

><br>

> Note that two iterators can have the same coefficients (e.g., for<br>

> A[i+j][.][.]) or one iterator can have two coefficients (e.g.,<br>

> A[k][k][.]). Or a "coefficient" may actually be a constant (e.g.,<br>

> A[42][.][.]) or missing in one access (e.g., A[0][.][.]) so<br>

> determining the coefficients is non-trivial (but not too difficult I<br>

> guess).<br>

><br>

> > I would guess that we could write a SCEVIterator, that analyzes the<br>

> > SCEVs and guesses the dimension sizes.<br>

><br>

> I'm not sure if this cane be done on the SCEV directly (instead of<br>

> polynomials) for cases that are more complex than products of<br>

> parameters but I haven't thought about it thoroughly. OTOH, add<br>

> recursions should be of the right structure to check divisibility<br>

> directly; I'm not sure without more thinking...<br>

><br>

> > After having written that, it should be possible to write<br>

> > another SCEVIterator that given the dimension sizes can separate<br>

> > the different dimensions of a SCEV.<br>

><br>

> Here we have to be careful because when there are several accesses to<br>

> the same array (base pointer) not only the coefficients (8, 8*%o,<br>

> 8*%o*%m above) have to match but also the range of the subscripts<br>

> must be right (globally over all arrays). For example with two<br>

> accesses<br>

><br>

>   A[i+4][j-4][k-7]<br>

>   A[...][i-k][3*j]<br>

><br>

> we'd need to check that<br>

><br>

>    max{j-4, i-k}  -  min{j-4, i-k}   <=  %m - 1<br>

>    max{k-7, 3*j}  -  min{k-7, 3*j}   <=  %o - 1<br>

><br>

> so there is no (inconsistent) overrun between the dimensions (%m and<br>

> %o are the factors "between" the dimensions). Note that this also<br>

> ensures that the factors between the dimensions are positive, i.e.,<br>

> no dimension is collapsed.<br>

><br>

> This check can, of course, be relaxed to checking<br>

><br>

>    0 <= j-4 <= %m - 1<br>

>    0 <= k-7 <= %o - 1<br>

><br>

>    0 <= i-k <= %m - 1<br>

>    0 <= 3*j <= %o - 1<br>

><br>

> (allowing for independent checks for each access) in the assumption<br>

> that subscripts start from 0 but I'm not sure how often this will<br>

> hold on the IR (after loop index and array base pointer shifts during<br>

> optimization).<br>

><br>

> The problem here is that the factors between the dimensions (%m and<br>

> %o here) could be non-affine.<br>

><br>

> > I doubt we will always be able to prove that our guess is correct,<br>

> > but we could add a run-time check to test the conditions that we<br>

> > can not prove statically. This is also the point, where I think the<br>

> > front-end (or the user with attributes) could help. Accesses could<br>

> > be annotated with meta-data providing the size of the array<br>

> > dimensions, such that our analysis can start from this meta-data.<br>

> > This could allow our analysis to take some short cuts and to avoid<br>

> > some run-time checks.<br>

><br>

> Makes sense. I think that it should not be too difficult to implement<br>

> the above approach for affine factors between the dimensions (which<br>

> should cover the common cases) and then we should evaluate how often<br>

> we need run-time checks and whether we need more information from the<br>

> front-ends because other cases occur in practice.<br>

><br>

> So far my thoughts for now.<br>

><br>

> Regards,<br>

> Armin<br>

<br>

<br>

<br>

</div></div><span class="HOEnZb"><font color="#888888">--<br>

Hal Finkel<br>

Postdoctoral Appointee<br>

Leadership Computing Facility<br>

Argonne National Laboratory<br>

</font></span></blockquote></div><br></div></div></div>