delinearization

Wed Apr 9 01:21:22 PDT 2014

On 04/08/2014 11:38 PM, Sebastian Pop wrote:
> Tobias Grosser wrote:
>> On 04/04/2014 11:06 PM, Sebastian Pop wrote:
>>> Hi,
>>>
>>> here is the updated version of the patch that passes all the make check-polly
>>> tests: yay! (I don't have cloog around, so I haven't checked those tests.)
>>
>> The cloog tests need to be updated. There are three solutions
>>
>> 1) We write a clang-cmp tool that does this semantically
>> 2) You install it briefly and update the tests
>
> I installed cloog and checked with it before I committed the patch.
> All tests are passing.
>
> However, when I enable delinearization by default, 2 cloog tests are failing:
> Tobi, could you please have a look at these two?
>
> test/Cloog/CodeGen/matmul_vec.ll
> test/Cloog/CodeGen/MemAccess/codegen_simple_md.ll
>
>
> These are failing with a JScop error complaining about inconsistent array access
> functions (or something else...)

Yes, I looked into them. We can fix them by updating the .jscop files.

While looking into these test cases, I saw that you also delinearize
access functions to fixed-size, non-linear arrays. This is very
tempting, but for test/Cloog/CodeGen/matmul_vec.ll it currently does
not work correctly.

MemRef_A[i0 + 1024i2]

is translated to

MemRef_A[i0, 1024i2]

What is the reason to even try to delinearize non-parametric accesses?
Doing so is not necessary to increase the precision of our analysis,
and if we get it wrong it actually has negative effects. Also, I
believe delinearizing such accesses is inherently hard. I personally
feel that limiting ourselves to delinearizing arrays with symbolic
sizes is a safer bet.
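
To make the matmul_vec.ll problem concrete, here is a minimal C sketch.
It assumes a row-major array whose inner dimension is 1024 elements;
that size is an assumption read off the stride in the access, not
something the test case states here. The helpers linearize(),
original_offset() and same_element() are hypothetical names for
illustration, not Polly code.

```c
#include <assert.h>

/* Assumed inner dimension; picked to match the 1024 stride in the access. */
enum { INNER = 1024 };

/* Row-major linearization of a two-dimensional subscript [outer][inner]. */
long linearize(long outer, long inner) {
  return outer * INNER + inner;
}

/* Linear index of the original access MemRef_A[i0 + 1024*i2]. */
long original_offset(long i0, long i2) {
  return i0 + 1024 * i2;
}

/* Returns 1 if the subscript pair addresses the same element as the
   original linear access at iteration (i0, i2), 0 otherwise. */
int same_element(long outer, long inner, long i0, long i2) {
  return linearize(outer, inner) == original_offset(i0, i2);
}

/* Self-check for the iteration point i0 = 3, i2 = 2. */
void check_matmul_vec_access(void) {
  assert(same_element(2, 3, 3, 2));         /* [i2][i0] matches */
  assert(!same_element(3, 1024 * 2, 3, 2)); /* [i0][1024*i2] does not */
}
```

Under this assumption the consistent subscript pair would be [i2, i0].
The pair [i0, 1024i2] reproduces the original offset for all iterations
only if the inner dimension is 1.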

>> 3) You commit and I fix later.
>>
>> I would prefer 1), understand that 2) is easier and if necessary, we
>> can also go for 3).
>>
>>> The only changes wrt. previous patch are:
>>> - add a flag -polly-delinearize (similar to -da-delinearize)
>>> - normalize the last subscript with respect to the size of elements in the array.
>>
>> Nice, that this fixed the remaining issues.
>>
>>> Ok to commit?
>>
>> Yes, despite the CLooG test cases this looks good.
>
> No cloog tests were failing on my side, so our build bots should be happy with
> my last commit.
>
> I have started triaging the bugs in the nightly test-suite when enabling
> -polly-delinearize.  I have fixed a couple of bugs already: that just shows
> that all this code was never really tested...  I will continue a bit on fixing
> everything that fails in SCEV->delinearize and on the -polly-delinearize side.

The delinearization bug I opened is still open.

I looked at the way we do delinearization and especially at examples of 
non-static size arrays. I attached the following example:

; void foo(long n, long m, double A[n][32]) {
;   for (long i = 0; i < n; i++)
;     for (long j = 0; j < m; j++)
;       A[i][j] = A[i][j+i];
; }

In the inner loop, we get the following delinearization:

Inst:  %val = load double* %arrayidx.load
In Loop with Header: for.j
AddRec: {{%A,+,(33 * sizeof(double))}<%for.i>,+,sizeof(double)}<%for.j>
Base offset: %A
ArrayDecl[UnknownSize][33] with elements of sizeof(double) bytes.
ArrayRef[{0,+,1}<nuw><nsw><%for.i>][{0,+,1}<nuw><nsw><%for.j>]

Inst:  store double %val, double* %arrayidx.store
In Loop with Header: for.j
AddRec: {{%A,+,(32 * sizeof(double))}<%for.i>,+,sizeof(double)}<%for.j>
Base offset: %A
ArrayDecl[UnknownSize][32] with elements of sizeof(double) bytes.
ArrayRef[{0,+,1}<nuw><nsw><%for.i>][{0,+,1}<nuw><nsw><%for.j>]

For the load we derive a size of 33 elements in the inner dimension
(A[i][j+i] linearizes to 33*i + j), for the store a size of 32
elements. This is inconsistent, and it is unclear to me how to derive a
common delinearization. For symbolic sizes the problem should be a lot
simpler.
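
The arithmetic behind the two sizes can be sketched in C as follows.
This is only an illustration of the element offsets in the example
above (in elements, not bytes); the function names are hypothetical and
nothing here reflects how SCEV->delinearize is implemented.

```c
#include <assert.h>

/* Inner dimension declared in the example: double A[n][32]. */
enum { DECLARED_INNER = 32 };

/* Element offset of the load A[i][j+i]. */
long load_offset(long i, long j) {
  return i * DECLARED_INNER + (j + i);
}

/* Element offset of the store A[i][j]. */
long store_offset(long i, long j) {
  return i * DECLARED_INNER + j;
}

/* Element offset of subscript [i][j] in an array with the given inner size. */
long offset_with_inner(long inner, long i, long j) {
  return i * inner + j;
}

/* Checks that the load behaves like [i][j] with inner size 33 and the
   store like [i][j] with inner size 32 over an n x m iteration space. */
void check_inner_sizes(long n, long m) {
  for (long i = 0; i < n; i++) {
    for (long j = 0; j < m; j++) {
      assert(load_offset(i, j) == offset_with_inner(33, i, j));
      assert(store_offset(i, j) == offset_with_inner(32, i, j));
    }
  }
}
```

Both accesses end up with the subscripts [i][j], but for i > 0 that
pair names different elements under the two shapes, so the subscripts
alone cannot be compared across the load and the store.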

Cheers,
Tobias

-------------- next part --------------
; RUN: opt < %s -analyze -delinearize | FileCheck %s

; Derived from the following code:
;
; void foo(long n, long m, double A[n][32]) {
;   for (long i = 0; i < n; i++)
;     for (long j = 0; j < m; j++)
;       A[i][i] = A[i][j];
; }

; AddRec: {{%A,+,(8 * %m)}<%for.i>,+,8}<%for.j>
; CHECK: Base offset: %A
; CHECK: ArrayDecl[UnknownSize][%m] with elements of sizeof(double) bytes.
; CHECK: ArrayRef[{0,+,1}<nuw><nsw><%for.i>][{0,+,1}<nuw><nsw><%for.j>]

; AddRec: {(-8 + (8 * %m) + %A),+,(8 * %m)}<%for.i>
; CHECK: Base offset: %A
; CHECK: ArrayDecl[UnknownSize] with elements of sizeof(double) bytes.
; CHECK: ArrayRef[{(-1 + %m),+,%m}<%for.i>]

define void @foo(i64 %n, i64 %m, double* %A) {
entry:
  br label %for.i

for.i:
  %i = phi i64 [ 0, %entry ], [ %i.inc, %for.i.inc ]
  %tmp = mul nsw i64 %i, 32
  br label %for.j

for.j:
  %j = phi i64 [ 0, %for.i ], [ %j.inc, %for.j ]
  %vlaarrayidx.load = add i64 %j, %tmp
  %vlaarrayidx.store = add i64 %i, %tmp
  %arrayidx.load = getelementptr inbounds double* %A, i64 %vlaarrayidx.load
  %arrayidx.store = getelementptr inbounds double* %A, i64 %vlaarrayidx.store
  %val = load double* %arrayidx.load
  store double %val, double* %arrayidx.store
  %j.inc = add nsw i64 %j, 1
  %j.exitcond = icmp eq i64 %j.inc, %m
  br i1 %j.exitcond, label %for.i.inc, label %for.j

for.i.inc:
  %i.inc = add nsw i64 %i, 1
  %i.exitcond = icmp eq i64 %i.inc, %n
  br i1 %i.exitcond, label %end, label %for.i

end:
  ret void
}