split delinearization pass in 3 steps

Sun May 4 06:49:53 PDT 2014

On 02/05/2014 02:21, Tobias Grosser wrote:
> Hi Sebastian,
>
> it is good to see the result of our discussions on the mailing list. I
> already tested it with Polly and it works great. After spending a couple
> of hours integrating Polly into Julia (julialang.org), we can now nicely
> delinearize their linear algebra kernels. I will test it a little more
> with ublas and polybench, but I am rather confident about this change.

I just tested your patch more intensively by running it over all the
polybench test cases. For almost all test cases, we correctly
delinearize the accesses.

I ran into three issues:

1) ASSERT: GCD does not evenly divide one of the terms
======================================================

I attached one test case that triggers an assertion failure when running:

opt -delinearize -analyze GCD-does-not-evenly-divide-one-of-the-terms.ll

Inst:  %1 = load double* %arrayidx70, align 8, !tbaa !1
In Loop with Header: for.body60
AddRec: {{((8 * undef) + %Ey),+,(8 * undef * 
undef)}<%for.cond55.preheader>,+,(8 * undef)}<%for.cond58.preheader>
opt: 
/home/grosser/Projects/polly/git/lib/Analysis/ScalarEvolution.cpp:7228: 
void findArrayDimensionsRec(llvm::ScalarEvolution &, 
SmallVectorImpl<const llvm::SCEV *> &, SmallVectorImpl<const llvm::SCEV 
*> &, const llvm::SCEV *, const llvm::SCEV *): Assertion `R == Zero && 
"GCD does not evenly divide one of the terms"' failed.

2) SEGFAULT && infinite recursion
=================================

When running 'opt -delinearize -analyze segfault.ll', opt segfaults.
The following change fixes the cause of this segfault:

-    assert(Numerator && Denominator && *Quotient && *Remainder &&
-           "Uninitialized SCEV");
+    assert(Numerator && Denominator && "Uninitialized SCEV");

Unfortunately, we then immediately run into an infinite recursion (which
causes another segfault):

llvm::SCEVVisitor<(anonymous namespace)::SCEVDivision, void>::visit 
(this=0x7fffff800ab8, S=0x6e54d0)
    at 
/home/grosser/Projects/polly/git/include/llvm/Analysis/ScalarEvolutionExpressions.h:484
(anonymous namespace)::SCEVDivision::divide (SE=..., Numerator=0x6e54d0, 
Denominator=0x6e4d90, Quotient=0x7fffff800b68, Remainder=0x7fffff800b60)
     at 
/home/grosser/Projects/polly/git/lib/Analysis/ScalarEvolution.cpp:7027
(anonymous namespace)::SCEVDivision::visitMulExpr (this=0x7fffff800c98, 
Numerator=0x6e4f10) at 
/home/grosser/Projects/polly/git/lib/Analysis/ScalarEvolution.cpp:7136
llvm::SCEVVisitor<(anonymous namespace)::SCEVDivision, void>::visit 
(this=0x7fffff800c98, S=0x6e4f10)
     at 
/home/grosser/Projects/polly/git/include/llvm/Analysis/ScalarEvolutionExpressions.h:486
(anonymous namespace)::SCEVDivision::divide (SE=..., Numerator=0x6e4f10, 
Denominator=0x6e4d90, Quotient=0x7fffff800d30, Remainder=0x7fffff800d28)
     at 
/home/grosser/Projects/polly/git/lib/Analysis/ScalarEvolution.cpp:7027
(anonymous namespace)::SCEVDivision::visitAddExpr (this=0x7fffff800e38, 
Numerator=0x6e54d0) at 
/home/grosser/Projects/polly/git/lib/Analysis/ScalarEvolution.cpp:7077

3) Array size in the index expression
=====================================

opt -delinearize -analyze /tmp/negative-offset.ll yields:

Inst:  store double 2.000000e+00, double* %tmp16, align 8
In Loop with Header: bb12
AddRec: {(-8 + (8 * %arg1) + %arg2),+,(8 * %arg1)}<%bb12>
failed to delinearize

This one is interesting, as the original access is:

double A[][N];
for i:
	A[i][N-1] = 2;

I assume we could first delinearize it to A[i+1][-1] (which is incorrect,
since the inner subscript is negative) and then translate this to:

[x][y] -> [x][y]: y >= 0; [x][y] -> [x-1][y+N]: y < 0
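
To make the idea concrete, here is a small sketch in Python. It is not LLVM
code: the function names and the value N = 5 are made up for illustration,
and offsets are counted in elements rather than bytes. It only checks that
A[i+1][-1] and A[i][N-1] address the same element, and that the piecewise
mapping above recovers the original subscripts.

```python
def linear_offset(i, n):
    """Element offset of A[i][N-1] in a row-major array with N columns."""
    return i * n + (n - 1)


def naive_delinearize(i, n):
    """Split the offset as the AddRec suggests: (i + 1) * N + (-1)."""
    return (i + 1, -1)


def normalize(x, y, n):
    """Piecewise map: keep [x][y] if y >= 0, else use [x-1][y+N]."""
    return (x, y) if y >= 0 else (x - 1, y + n)


n = 5  # hypothetical inner dimension, chosen only for this example
for i in range(4):
    x, y = naive_delinearize(i, n)
    # Both subscript pairs address the same element ...
    assert x * n + y == linear_offset(i, n)
    # ... and the mapping recovers the original access A[i][N-1].
    assert normalize(x, y, n) == (i, n - 1)
```

The sketch assumes N > 0 and a single negative wrap (y >= -N), which is all
this test case needs.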

Let's keep this discussion out of the review. I have the feeling we can 
address it later.

Cheers,
Tobias
-------------- next part --------------
A non-text attachment was scrubbed...
Name: GCD-does-not-evenly-divide-one-of-the-terms.ll
Type: application/octet-stream
Size: 1136 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140504/c1edcd6b/attachment.obj>
-------------- next part --------------
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define hidden void @barney(i64 %arg, i64 %arg1, double* %arg2)  {
bb6:
  br label %bb8

bb8:
  %tmp9 = phi i64 [ %tmp25, %bb23 ], [ 0, %bb6 ]
  %tmp10 = icmp sgt i64 %arg1, 0
  br i1 %tmp10, label %bb12, label %bb23

bb12:
  %tmp13 = phi i64 [ %tmp21, %bb12 ], [ 0, %bb8 ]
  %tmp7 = add i64 %arg1, -1
  %tmp14 = mul i64 %arg1, %tmp13
  %tmp15 = add i64 %tmp7, %tmp14
  %tmp16 = getelementptr double* %arg2, i64 %tmp15
  store double 2.000000e+00, double* %tmp16, align 8
  %tmp21 = add nsw i64 %tmp13, 1
  %tmp22 = icmp ne i64 %tmp21, %arg1
  br i1 %tmp22, label %bb12, label %bb23

bb23:
  %tmp25 = add nsw i64 %tmp9, 1
  %tmp26 = icmp ne i64 %tmp25, %arg
  br i1 %tmp26, label %bb8, label %bb27

bb27:
  ret void
}
-------------- next part --------------
; ModuleID = 'bugpoint-reduced-simplified.bc'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: nounwind uwtable
define void @hoge(i64 %arg, double* %arg1) #0 {
bb:
  %tmp = add nsw i64 %arg, 1
  br i1 undef, label %bb13, label %bb2

bb2:                                              ; preds = %bb2, %bb
  %tmp3 = phi i64 [ %tmp12, %bb2 ], [ 0, %bb ]
  %tmp4 = phi i64 [ %tmp11, %bb2 ], [ 0, %bb ]
  %tmp5 = add nsw i64 %arg, -1
  %tmp6 = add i64 %tmp5, %tmp3
  %tmp7 = mul nsw i64 %tmp6, %tmp
  %tmp8 = add i64 %tmp7, undef
  %tmp9 = getelementptr inbounds double* %arg1, i64 %tmp8
  %tmp10 = load double* %tmp9, align 8, !tbaa !1
  %tmp11 = add nsw i64 %tmp4, 1
  %tmp12 = xor i64 %tmp4, -1
  br i1 false, label %bb2, label %bb13

bb13:                                             ; preds = %bb2, %bb
  ret void
}

attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="true" "no-nans-fp-math"="true" "stack-protector-buffer-size"="8" "unsafe-fp-math"="true" "use-soft-float"="false" }

!llvm.ident = !{!0}

!0 = metadata !{metadata !"clang version 3.5.0 "}
!1 = metadata !{metadata !2, metadata !2, i64 0}
!2 = metadata !{metadata !"double", metadata !3, i64 0}
!3 = metadata !{metadata !"omnipotent char", metadata !4, i64 0}
!4 = metadata !{metadata !"Simple C/C++ TBAA"}