[PATCH] [Polly] Runtime alias checks

Fri Sep 12 05:19:28 PDT 2014

Hi Johannes,

here some feedback to your comments.

Tobias

P.S: Did you push the changed patch? I can not find it in the web interface.

================
Comment at: lib/Analysis/ScopInfo.cpp:1141
@@ +1140,3 @@
+  // Adjust the last dimension of the maximal access by one as we want to
+  // enclose the accessed memory region by MinPMA and MaxPMA.
+  assert(isl_pw_multi_aff_dim(MaxPMA, isl_dim_out) &&
----------------
jdoerfert wrote:
> grosser wrote:
> > This may cause an out of bound access in the last dimension. Was this intended and is this safe?
> > 
> > In case this is intended, I believe it would be good to add a comment that this case may happen and why it is safe. Nothing complicated, just to ensure people browsing the code see that we though about it.
> Added a sentence, just stating the possibility and that it is safe.
I think it would be helpful to mention why we believe it is safe. Otherwise no-one can verify if our assumptions still hold e.g., after changes to LLVM.

================
Comment at: lib/Analysis/ScopInfo.cpp:1176
@@ +1175,3 @@
+      PtrToAcc[getPointerOperand(*Acc)] = MA;
+      AST.add(Acc);
+    }
----------------
jdoerfert wrote:
> grosser wrote:
> > This is interesting. When I was thinking of this, my thought was to use the base pointer and UINT64_MAX as size, you seem to use the pointer in the load/store instruction and the actual load/store value as size.
> > 
> > I am currently thinking about if there is a difference in terms of expressiveness/correctness. Honestly, I don't know yet. Did you think about the correct size? Is there a reason one approach might be incorrect?
> I'm sorry but I don't understand what you are saying. I also do not know why we should use UINT64_MAX except when we can't represent the real offset. However, the runtime alias checks are disabled for now in case "allow non affine" is enabled, thus we always know the exact (parametric) offset.
> 
> Could you explain your idea and the problem this approach might have in a bit more detail?
In the ScopDetection we use the following code to check for aliasing:

```
// Check if the base pointer of the memory access does alias with              
// any other pointer. This cannot be handled at the moment.                    
AliasSet &AS =                                                                 
  Context.AST.getAliasSetForPointer(BaseValue, AliasAnalysis::UnknownSize,   
                                        Inst.getMetadata(LLVMContext::MD_tbaa));
```


Similarly, the AST has a set of add() functions:

  bool add(Value *Ptr, uint64_t Size, const AAMDNodes &AAInfo); // Add a loc.    
  bool add(LoadInst *LI);                                                        
  bool add(StoreInst *SI);  

The LoadInst/StoreInst version of the add reason about a single load of maybe 16 bytes, whereas the first version of the add instruction allows to give a base pointer and an arbitrary offset.

The reason I implemented the alias check in the scop detection using the base pointer + an UnknownSize parameter is that I wanted to include all possible (positive?) offsets from this base pointer in the alias check.

This is similar to how we in the end construct the run-time alias check. Only checking the individual loads did not come to my mind.

Now, as I said, I don't really know what is better/more correct.

Some first thoughts are here:

When checking individual accesses I could imagine that a single base array is split into multiple alias sets:

```
for: int i
  S1:  A[2 i] = A[2i + 1]; 
  S2:  A[2i + 2] = A[2i - 1];
```
Here an alias analysis could possibly form two groups AS_1 = {A[2i], A[2i + 2]}, AS_2 = {A[2i+1], A[2i-1]},
as the even and odd accesses could be proven to not alias. Would this have any drawbacks?

When checking the baseptr + large-offset, we may have troubles to exploit the tbaa and other per-access annotations. Is this right?

In case we decide to go for the per-access checks, I wonder if we then should change the check in the scop detection as well?

================
Comment at: lib/Analysis/ScopInfo.cpp:1562
@@ -1434,1 +1561,3 @@
+    scop->buildAliasGroups(AA);
+
   return false;
----------------
jdoerfert wrote:
> grosser wrote:
> > As mentioned earlier, it would be nice to only call this on-demand in getAliasGroups(), as users who only need to analysis part of Polly should not need to pay the cost of building alias groups. The same holds in case we want to bail out from optimizing a scop.
> > 
> > We have specific performance testers that ensure that -polly-scops is not getting more expensive over time. Having them not note this change would be ensuring that we did not increase compile time in the early phase of Polly.
> > 
> > I remember you felt very strong about not moving this piece of code over. If you still believe that way, I won't discuss this further.
> Your right, I still like the code here, but I also gave you some reasons for this.
> 
> One more (wrt. what you said about polly as analyser) is that we can provide exact information about possibly overlapping accesses. I will use this feature together with the dependency analysis in another project, both with and without generating an optimized SCoP with Polly.
Just to make it clear. I agree that all the code should remain in ScopInfo. The only change I suggested was to not call buildAliasGroups() in runOnScop, but rather on-demand from getAliasGroups() (and to then use getAliasGroups() in the analysis-printer).

================
Comment at: test/Isl/Ast/alias_simple_1.ll:15
@@ +14,3 @@
+;
+; NOAA: if (1 == 1 && (&MemRef_A[N] < &MemRef_B[0] || &MemRef_B[N] < &MemRef_A[0]))
+; BASI: if (1 == 1 && (&MemRef_A[N] < &MemRef_B[0] || &MemRef_B[N] < &MemRef_A[0]))
----------------
jdoerfert wrote:
> grosser wrote:
> > I think this should be '<=' instead of '<'.
> > 
> > Assuming this is:
> > 
> > 0 1  2 3   4 5   6 7  
> > x x  x x   x x   x x 
> > 
> > ^A          ^B
> > 
> > // N = 2
> > 
> > short A[N]
> > short B[N]
> > 
> > &A[N] => Base + (2 * sizeof(short)) = 0 + 2 * 2 = 4
> > &B[0] => Base + (0 * sizeof(short)) = 4 + 0 * 2 = 4
> > 
> > &A[N] < &B[N] is false, but indeed it should evaluate to true
> > 
> If I get your example correctly we have:
> 
> short *A = ...
> short *B = A + 2;
> 
> and accesses (with N=2):
>   A[2] and B[0]
> then the alias check should say false (what it does) becasue both accesses will load the same value.
> 
> What do you think?
My example extends the code in the test case by defining a specific N and a specific memory address for A and B. For this N, the test case only accesses A[0], A[1], B[0], B[1], but not A[2]. For the pointer offsets specified (short *B = A+2), this means there are no conflicting accesses. However, as you seem to agree the alias check will detect the existence of aliasing. To me this seems overly and unnecessarily conservative. Especially as it is not unlikely that two arrays are allocated right after each other.



================
Comment at: test/Isl/CodeGen/multidim_2d_parametric_array_static_loop_bounds.ll:16
@@ -15,3 +15,3 @@
 ; CHECK: %1 = select i1 %0, i64 1, i64 0
-; CHECK: %2 = icmp ne i64 %1, 0
+; CHECK: %2 = icmp eq i64 %1, 1
 ; CHECK: polly.split_new_and_old:
----------------
jdoerfert wrote:
> grosser wrote:
> > Not needed if you auto booleanize values in createOpBoolean()
> As you see on the left, either a "!=" or "==" comparison is needed.
To clarify: The instruction is needed, but most likely the change from "!=" to "==" disappears when using CreateIsNotNull(). This reduces the set of test cases that is changed, which is by itself already a good thing.

http://reviews.llvm.org/D5077