[PATCH] [Polly] Runtime alias checks

Thu Sep 11 08:48:15 PDT 2014

Hi Johannes,

the patch looks really good for most parts.

Two remaining issues:

- There seems to be an off by one error (< instead of <=)
- Iteration over a SmallPtrSet may cause random changes.

A comment to think about:

- I am unsure if we should use AST.add(LoadPtr, 8) or AST.add(BasePtr, INT_MAX)

And then a set of smaller comments on the patch.

================
Comment at: include/polly/ScopInfo.h:505
@@ +504,3 @@
+  /// @brief A set of minimal/maximal access vectors one for each alias group.
+  using MinMaxVectorSetTy = SmallPtrSet<MinMaxVectorTy *, 4>;
+
----------------
As you iterate over this set I propose to use a SetVector. Otherwise, the iteration order may randomly change due to adress space randomization, different memory allocators or different operating systems.

================
Comment at: include/polly/ScopInfo.h:549
@@ +548,3 @@
+  /// When building runtime alias checks we look at all memory instructions and
+  /// build so called alias groups. Each group contains a set of a accesses to
+  /// different base arrays which might alias with each other. However, between
----------------
a set of accesses.

(There is an 'a' too much)

================
Comment at: include/polly/ScopInfo.h:558
@@ +557,3 @@
+  /// During code generation we will create a runtime alias check for each alias
+  /// group to ensure the SCoP is executed in a alias free environment.
+  MinMaxVectorSetTy MinMaxAliasGroups;
----------------
typo: _an_ alias free

(Btw, the documentation above is very nice)

================
Comment at: lib/Analysis/ScopDetection.cpp:201
@@ +200,3 @@
+    PollyUseRuntimeAliasChecks = false;
+  }
+}
----------------
We may want to get rid of those two conditions, but it seems OK to insert them for now as this can make the code review more incremental.

================
Comment at: lib/Analysis/ScopInfo.cpp:72
@@ +71,3 @@
+    cl::location(PollyUseRuntimeAliasChecks), cl::Hidden, cl::ZeroOrMore,
+    cl::init(false), cl::cat(PollyCategory));
+
----------------
This is a great feature. We should enable it by default. This also will allow our automatic testers to ensure no regressions slip in throughout time.

================
Comment at: lib/Analysis/ScopInfo.cpp:1141
@@ +1140,3 @@
+  // Adjust the last dimension of the maximal access by one as we want to
+  // enclose the accessed memory region by MinPMA and MaxPMA.
+  assert(isl_pw_multi_aff_dim(MaxPMA, isl_dim_out) &&
----------------
This may cause an out of bound access in the last dimension. Was this intended and is this safe?

In case this is intended, I believe it would be good to add a comment that this case may happen and why it is safe. Nothing complicated, just to ensure people browsing the code see that we though about it.

================
Comment at: lib/Analysis/ScopInfo.cpp:1166
@@ +1165,3 @@
+  //   o) For each group with more then one base pointer we then compute minimal
+  //      and maximal accesses to each array in this group.
+  using AliasGroupTy = SmallVector<MemoryAccess *, 4>;
----------------
Very nice comment.

================
Comment at: lib/Analysis/ScopInfo.cpp:1176
@@ +1175,3 @@
+      PtrToAcc[getPointerOperand(*Acc)] = MA;
+      AST.add(Acc);
+    }
----------------
This is interesting. When I was thinking of this, my thought was to use the base pointer and UINT64_MAX as size, you seem to use the pointer in the load/store instruction and the actual load/store value as size.

I am currently thinking about if there is a difference in terms of expressiveness/correctness. Honestly, I don't know yet. Did you think about the correct size? Is there a reason one approach might be incorrect?

================
Comment at: lib/Analysis/ScopInfo.cpp:1257
@@ +1256,3 @@
+    for (MinMaxAccessTy &MMA : *MinMaxAccesses)
+      isl_pw_multi_aff_free(MMA.first) || isl_pw_multi_aff_free(MMA.second);
+    delete MinMaxAccesses;
----------------
I am personally not a big fan of such somehow surprising line saving tricks.

================
Comment at: lib/Analysis/ScopInfo.cpp:1562
@@ -1434,1 +1561,3 @@
+    scop->buildAliasGroups(AA);
+
   return false;
----------------
As mentioned earlier, it would be nice to only call this on-demand in getAliasGroups(), as users who only need to analysis part of Polly should not need to pay the cost of building alias groups. The same holds in case we want to bail out from optimizing a scop.

We have specific performance testers that ensure that -polly-scops is not getting more expensive over time. Having them not note this change would be ensuring that we did not increase compile time in the early phase of Polly.

I remember you felt very strong about not moving this piece of code over. If you still believe that way, I won't discuss this further.

================
Comment at: lib/CodeGen/IslAst.cpp:302
@@ +301,3 @@
+  RunCondition = isl_ast_expr_eq(RunCondition,
+                                 isl_ast_build_expr_from_pw_aff(Build, PwOne));
+
----------------
This change is the reason why we suddenly get conditions such as:

if (1 == 1)

It is not of cricital importance, but you could probably get rid of them by applying the following patch to createOpBoolean:

```
 -  assert(LHS->getType() == Builder.getInt1Ty() && "Expected i1 type");
 -  assert(RHS->getType() == Builder.getInt1Ty() && "Expected i1 type"); 
 +  if (LHS->getType() != Builder.getInt1Ty()) 
 +    LHS = Builder.CreateIsNotNull(LHS);
 +  if (RHS->getType() != Builder.getInt1Ty()) 
 +    RHS = Builder.CreateIsNotNull(RHS);
```

This would also reduce the number of test cases that get touched by this commit, which again helps people doing post-commit review or who later look at the patch.

================
Comment at: lib/CodeGen/IslAst.cpp:305
@@ +304,3 @@
+  // Create the alias checks from the minimal/maximal accesses in each alias
+  // groups. This operation is by construction quadratic in the number of
+  // elements in each alias group.
----------------
typo: group

================
Comment at: lib/CodeGen/IslCodeGeneration.cpp:592
@@ -591,1 +591,3 @@
+           "Runtime condition should have i1 type");
+    return RTC;
   }
----------------
Not needed if you auto booleanize values in createOpBoolean()

================
Comment at: test/Isl/Ast/alias_simple_1.ll:15
@@ +14,3 @@
+;
+; NOAA: if (1 == 1 && (&MemRef_A[N] < &MemRef_B[0] || &MemRef_B[N] < &MemRef_A[0]))
+; BASI: if (1 == 1 && (&MemRef_A[N] < &MemRef_B[0] || &MemRef_B[N] < &MemRef_A[0]))
----------------
I think this should be '<=' instead of '<'.

Assuming this is:

0 1  2 3   4 5   6 7  
x x  x x   x x   x x 

^A          ^B

// N = 2

short A[N]
short B[N]

&A[N] => Base + (2 * sizeof(short)) = 0 + 2 * 2 = 4
&B[0] => Base + (0 * sizeof(short)) = 4 + 0 * 2 = 4

&A[N] < &B[N] is false, but indeed it should evaluate to true


================
Comment at: test/Isl/Ast/run-time-condition.ll:31
@@ -30,3 +30,3 @@
 
-; CHECK: if (1)
+; CHECK: if (1 == 1)
 ; CHECK:     for (int c1 = 0; c1 <= 1023; c1 += 1)
----------------
Not needed if you auto booleanize values in createOpBoolean()

================
Comment at: test/Isl/Ast/simple-run-time-condition.ll:21
@@ -20,3 +20,3 @@
 
-; CHECK: if ((q == 100 && o <= 0) || (q == 0 && o >= 1) ? 1 : 0)
+; CHECK: if (((q == 100 && o <= 0) || (q == 0 && o >= 1) ? 1 : 0) == 1)
 
----------------
Not needed if you auto booleanize values in createOpBoolean()

================
Comment at: test/Isl/CodeGen/multidim_2d_parametric_array_static_loop_bounds.ll:16
@@ -15,3 +15,3 @@
 ; CHECK: %1 = select i1 %0, i64 1, i64 0
-; CHECK: %2 = icmp ne i64 %1, 0
+; CHECK: %2 = icmp eq i64 %1, 1
 ; CHECK: polly.split_new_and_old:
----------------
Not needed if you auto booleanize values in createOpBoolean()

================
Comment at: test/Isl/CodeGen/run-time-condition-with-scev-parameters.ll:7
@@ -6,3 +6,3 @@
 ; CHECK: %3 = select i1 %2, i64 1, i64 0
-; CHECK: %4 = icmp ne i64 %3, 0
+; CHECK: %4 = icmp eq i64 %3, 1
 
----------------
Not needed if you auto booleanize values in createOpBoolean()

================
Comment at: test/Isl/CodeGen/scop_never_executed_runtime_check_location.ll:14
@@ -13,3 +13,3 @@
 ; CHECK: %[[T2:[._a-zA-Z0-9]]] = select i1 %[[T1]], i64 1, i64 0
-; CHECK: %[[T3:[._a-zA-Z0-9]]] = icmp ne i64 %[[T2]], 0
+; CHECK: %[[T3:[._a-zA-Z0-9]]] = icmp eq i64 %[[T2]], 1
 
----------------
Not needed if you auto booleanize values in createOpBoolean()

================
Comment at: test/ScopDetectionDiagnostics/ReportAlias-01.ll:9
@@ -8,3 +8,3 @@
 ; CHECK: remark: ReportAlias-01.c:2:8: The following errors keep this region from being a Scop.
-; CHECK: remark: ReportAlias-01.c:3:5: Accesses to the arrays "A", "B" may access the same memory.
+; CHECK: remark: ReportAlias-01.c:3:5: Accesses to the arrays {{("A", "B"|"B", "A")}} may access the same memory.
 ; CHECK: remark: ReportAlias-01.c:3:5: Invalid Scop candidate ends here.
----------------
For me, the test case still produces "A", "B" as output. Is there a specific reason this needs to be changed (e.g. your system gives "B", "A")?

I am asking as unpredictable output sometimes points to cases where we loop over unsorted sets, which then cause this kind of behaviour that is hard to debug and consequently better avoided.

http://reviews.llvm.org/D5077