[PATCH] [Polly] Runtime alias checks

Mon Sep 15 01:56:19 PDT 2014

On 12/09/2014 15:51, Johannes Doerfert wrote:
> Comments, new version is comming up.
>
> ================
> Comment at: lib/Analysis/ScopInfo.cpp:1176
> @@ +1175,3 @@
> +      PtrToAcc[getPointerOperand(*Acc)] = MA;
> +      AST.add(Acc);
> +    }
> ----------------
> grosser wrote:
>> jdoerfert wrote:
>>> grosser wrote:
>>>> This is interesting. When I was thinking of this, my thought was to use the base pointer and UINT64_MAX as size, you seem to use the pointer in the load/store instruction and the actual load/store value as size.
>>>>
>>>> I am currently thinking about if there is a difference in terms of expressiveness/correctness. Honestly, I don't know yet. Did you think about the correct size? Is there a reason one approach might be incorrect?
>>> I'm sorry but I don't understand what you are saying. I also do not know why we should use UINT64_MAX except when we can't represent the real offset. However, the runtime alias checks are disabled for now in case "allow non affine" is enabled, thus we always know the exact (parametric) offset.
>>>
>>> Could you explain your idea and the problem this approach might have in a bit more detail?
>> In the ScopDetection we use the following code to check for aliasing:
>>
>> ```
>> // Check if the base pointer of the memory access does alias with
>> // any other pointer. This cannot be handled at the moment.
>> AliasSet &AS =
>>    Context.AST.getAliasSetForPointer(BaseValue, AliasAnalysis::UnknownSize,
>>                                          Inst.getMetadata(LLVMContext::MD_tbaa));
>> ```
>>
>>
>> Similarly, the AST has a set of add() functions:
>>
>>    bool add(Value *Ptr, uint64_t Size, const AAMDNodes &AAInfo); // Add a loc.
>>    bool add(LoadInst *LI);
>>    bool add(StoreInst *SI);
>>
>> The LoadInst/StoreInst version of the add reason about a single load of maybe 16 bytes, whereas the first version of the add instruction allows to give a base pointer and an arbitrary offset.
>>
>> The reason I implemented the alias check in the scop detection using the base pointer + an UnknownSize parameter is that I wanted to include all possible (positive?) offsets from this base pointer in the alias check.
>>
>> This is similar to how we in the end construct the run-time alias check. Only checking the individual loads did not come to my mind.
>>
>> Now, as I said, I don't really know what is better/more correct.
>>
>> Some first thoughts are here:
>>
>> When checking individual accesses I could imagine that a single base array is split into multiple alias sets:
>>
>> ```
>> for: int i
>>    S1:  A[2 i] = A[2i + 1];
>>    S2:  A[2i + 2] = A[2i - 1];
>> ```
>> Here an alias analysis could possibly form two groups AS_1 = {A[2i], A[2i + 2]}, AS_2 = {A[2i+1], A[2i-1]},
>> as the even and odd accesses could be proven to not alias. Would this have any drawbacks?
>>
>> When checking the baseptr + large-offset, we may have troubles to exploit the tbaa and other per-access annotations. Is this right?
>>
>> In case we decide to go for the per-access checks, I wonder if we then should change the check in the scop detection as well?
> The ScopDetection implementation shouldn't matter as long as runtime alias checks are turned on.
> Changing it to the way I implemented it here is a different matter for another time.
>
> I now understood your problem partially, however I still do not know why I shouldn't use the
> ```
> AST.add(Instruction *)
> ```
> interface which then will sue the LoadInst and StoreInst versions. AST offers this, thus it should do the correct thing with the pointer operand. And since I add all memory instructions it also knows about all pointer operands.
>
> The single base address problem is handled here anyway and the check is cheap.
>
> I'm not a hundred procent sure about the TBAA information but I suspect your right. They are attached to the load/store instructions and we need the AST to see them.
>
> I vote __very__ strongly for the load/store version over the baseptr + unknown offset version here, in ScopDetection I do not care to much as the code is alsmost dead after this commit anyway.

I am not sure what voting strongly means? :)

  - You are sure that add(Instruction) is correct and better in
    some way than baseptr + offset
  - You did not understand what is better, but are sure both are
    safe and consequently it does not matter
  - You did not fully understand the difference, but you want to commit
    this code and improve this in a later commit.

I personally don't push in any direction, but would like to have 
understood which of the three options we are heading to. If we are 
heading for one of the first two options, it would be good to document 
the reason in a comment.

> ================
> Comment at: lib/Analysis/ScopInfo.cpp:1562
> @@ -1434,1 +1561,3 @@
> +    scop->buildAliasGroups(AA);
> +
>     return false;
> ----------------
> grosser wrote:
>> jdoerfert wrote:
>>> grosser wrote:
>>>> As mentioned earlier, it would be nice to only call this on-demand in getAliasGroups(), as users who only need to analysis part of Polly should not need to pay the cost of building alias groups. The same holds in case we want to bail out from optimizing a scop.
>>>>
>>>> We have specific performance testers that ensure that -polly-scops is not getting more expensive over time. Having them not note this change would be ensuring that we did not increase compile time in the early phase of Polly.
>>>>
>>>> I remember you felt very strong about not moving this piece of code over. If you still believe that way, I won't discuss this further.
>>> Your right, I still like the code here, but I also gave you some reasons for this.
>>>
>>> One more (wrt. what you said about polly as analyser) is that we can provide exact information about possibly overlapping accesses. I will use this feature together with the dependency analysis in another project, both with and without generating an optimized SCoP with Polly.
>> Just to make it clear. I agree that all the code should remain in ScopInfo. The only change I suggested was to not call buildAliasGroups() in runOnScop, but rather on-demand from getAliasGroups() (and to then use getAliasGroups() in the analysis-printer).
> You can change that afterwards, we would also see how much compile time it safes us that way. I'm at the moment not sure e.g., how to mark the alias checks as computed or not.

Fine with me.

Cheers,
Tobias