[llvm-commits] [PATCH] Multidimensional Array Index Delinearization Analysis

Thu Oct 10 14:41:57 PDT 2013

On Oct 10, 2013, at 1:43 PM, Sebastian Pop <spop at codeaurora.org> wrote:

> Hi all,
> 
> I was in the mid of updating and sending out Hal's patch for review. When I
> tried to summarize all the review comments from this long thread, I realized
> that Andy has sent a nice description of how to implement the delinearization on
> top of SCEV:
> 
> Andrew Trick wrote:
>> The SCEV way to handle this would be to call getUDiv to divide the
>> recurrence's start by the recurrence's step. Any remainder is added to the
>> current dimension's index. The quotient is itself a recurrence, so the process
>> continues for each dimension.
> 
> This seems to work pretty nicely when I was running it by hand on all the tests
> from Hal's patch.  The implementation is also pretty elegant because of the
> recurrence on the structure of SCEVs.
> 
> Now there is a missing piece:
> 
>> The problem that I see is that ScalarEvolution::getUDivExpr doesn't implement
>> any normalization except for division by constants.
>> 
> 
> It looks to me that getUDivExpr does not provide enough functionality to drive
> the delinearization (even in the case of division by constants.) Here is what
> happens when trying to delinearize this SCEV:
> {{{0,+,(8 * %m * %o)}<%for.cond1.preheader>,+,(8 * %o)}<%for.cond4.preheader>,+,8}<%for.body6>
> 
> Start: {{0,+,(8 * %m * %o)}<%for.cond1.preheader>,+,(8 * %o)}<%for.cond4.preheader>
> Step: 8
> SE.getUDivExpr(Start, Step) returns:
> ({{0,+,(8 * %m * %o)}<%for.cond1.preheader>,+,(8 * %o)}<%for.cond4.preheader> /u 8)

It seems to me that in unsigned wrapping arithmetic you cannot assume:

"{0,+,4}<%for.cond1.preheader> /u 4" == "{0,+,1}<%for.cond1.preheader>"

Let's say we are in i4:

The first one is:
0/4 = 0, 4/4 = 1, 8/4 = 2, 12/4 = 3, i4(12+4)/4 = 0/4 = 0, 4/4 = 1, ...

while the second is:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, ...
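
The two sequences above can be reproduced with a small simulation. This is only a sketch: i4 is modelled by masking an unsigned value to 4 bits, and the helper names (addRecI4, dividedRecurrence, unitRecurrence) are made up for illustration and are not part of ScalarEvolution.

```cpp
#include <cassert>
#include <vector>

// Value of the recurrence {0,+,step} at iteration `iter`, truncated to a
// 4-bit unsigned integer (i4). Hypothetical helper, not an LLVM API.
inline unsigned addRecI4(unsigned step, unsigned iter) {
  return (step * iter) & 0xF;
}

// First `count` values of "{0,+,step} /u step", with the recurrence
// evaluated in wrapping i4 arithmetic before the division.
inline std::vector<unsigned> dividedRecurrence(unsigned step, unsigned count) {
  assert(step != 0 && "cannot divide by a zero step");
  std::vector<unsigned> values;
  for (unsigned i = 0; i < count; ++i)
    values.push_back(addRecI4(step, i) / step);
  return values;
}

// First `count` values of "{0,+,1}" evaluated in i4.
inline std::vector<unsigned> unitRecurrence(unsigned count) {
  std::vector<unsigned> values;
  for (unsigned i = 0; i < count; ++i)
    values.push_back(addRecI4(1, i));
  return values;
}
```

Comparing dividedRecurrence(4, n) with unitRecurrence(n) shows the two agree for the first four iterations and diverge at the fifth, where 12+4 wraps to 0 in i4.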

To do such a simplification, we would have to know something about the range of the recurrence: for example, that "{0,+,4}<%for_loop>" is within [0 … umax(i4)/4].
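
As a rough illustration of such a range condition, the sketch below reads it as "the iteration index never exceeds umax(i4)/step", which is my interpretation rather than anything SCEV computes today. The names kI4Max, recurrenceStaysInRange and foldHoldsForAllIterations are hypothetical. The predicate is sufficient for the fold, not necessary.

```cpp
#include <cassert>

constexpr unsigned kI4Max = 15; // umax(i4)

// Hypothetical predicate: true when {0,+,step}, evaluated in i4, cannot wrap
// during iterations 0 .. tripCount-1.
inline bool recurrenceStaysInRange(unsigned step, unsigned tripCount) {
  assert(step != 0 && "step must be non-zero");
  if (tripCount == 0)
    return true;
  return (tripCount - 1) <= kI4Max / step;
}

// Brute-force check that "{0,+,step} /u step" equals "{0,+,1}" in i4 on
// every iteration below tripCount.
inline bool foldHoldsForAllIterations(unsigned step, unsigned tripCount) {
  assert(step != 0 && "step must be non-zero");
  for (unsigned i = 0; i < tripCount; ++i) {
    unsigned wrapped = (step * i) & kI4Max; // {0,+,step} in i4
    unsigned unit = i & kI4Max;             // {0,+,1} in i4
    if (wrapped / step != unit)
      return false;
  }
  return true;
}
```

For step 4 the predicate accepts a trip count of 4 (largest value 12) and rejects a trip count of 5 (12+4 wraps), which matches the brute-force check.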

In some cases we could obtain this information from dominating context.

assert (n < 100);
for (i = 0; i < n; ++i)

In other cases that we care about, we could have the compiler perform versioning (which we already do plenty of, at least in the loop vectorizer) to make sure that we don’t wrap after performing some simplifications.

if (n < umax(i4)/4) {
  // vectorized loop: perform math in SCEV under the assumption that
  // "{0,+,4}<%for.cond1.preheader> < umax(i4)/4".
} else {
  // scalar loop
}
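
To make the versioning idea concrete, here is a toy model in i4, not loop vectorizer code. It uses the same guard as above; quotientsGeneral and quotientsVersioned are invented names, and the "fast path" simply stands in for whatever code would be generated under the no-wrap assumption.

```cpp
#include <cassert>
#include <vector>

constexpr unsigned kUMaxI4 = 15; // umax(i4)

// Reference semantics: "{0,+,4} /u 4" with the recurrence wrapping in i4.
inline std::vector<unsigned> quotientsGeneral(unsigned n) {
  std::vector<unsigned> out;
  for (unsigned i = 0; i < n; ++i)
    out.push_back(((4 * i) & kUMaxI4) / 4);
  return out;
}

// Versioned form: the guarded path uses the folded recurrence {0,+,1};
// the fallback keeps the original wrapping computation.
inline std::vector<unsigned> quotientsVersioned(unsigned n) {
  if (n < kUMaxI4 / 4) {
    std::vector<unsigned> out;
    for (unsigned i = 0; i < n; ++i) {
      // The guard guarantees that {0,+,4} has not wrapped here.
      assert(4 * i <= kUMaxI4 && "guard should rule out wrapping");
      out.push_back(i); // folded: {0,+,4} /u 4 == {0,+,1}
    }
    return out;
  }
  return quotientsGeneral(n); // unguarded version, no assumption made
}
```

Both versions return the same values for every n; the guard only decides which one runs. The bound n < umax(i4)/4 is conservative, since wrapping in this model does not start until the fifth iteration.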

I think that for all this to work we would need some framework on top of, or within, SCEV that enables us to “work with SCEV under the assumption X”. This would be useful for other things as well (e.g. getting rid of zexts/sexts in SCEV expressions, see http://llvm.org/bugs/show_bug.cgi?id=16358, which in turn simplifies dependence testers based on SCEV).