[PATCH] D25095: [Polly] Ignore removed memory access. NFC.

Sat Oct 1 02:24:48 PDT 2016

grosser added a reviewer: gareevroman.
grosser added a comment.

Adding Roman who might have an opinion here as well.

In https://reviews.llvm.org/D25095#557969, @Meinersbur wrote:

> In https://reviews.llvm.org/D25095#557608, @grosser wrote:
>
> > I am worried about a situation where we have sth like:
> >
> > %val = load %A
> >  %div = div %val, 0
> >  store %div
> >
> > If we drop the div, we might trigger undefined behavior.
>
>
> Did you mean "drop the load"? This patch does not influence non-access instruction handling.

Right.

>> To my understanding this can never happen, as you will only drop direct copy sequences, right? I am probably too careful here, but if you could quickly check (assert) that the only use of %val is in a store instruction, my worries would disappear.
> 
> The correctness of removing a memory access is a concern of the the code that removes them. At codegen-time it is very difficult or even impossible to retroactively check whether a removal was correct. There is no exhaustive list of reasons why it could be removed. The detection mechanism you mentioned is neither sufficient nor mandatory even for just DeLICM. It is invalid if the store, which could also be an implicit store, is not also removed. Uses can also be in other statements that get their llvm::Value from a scalar load, but have no effect on the removal of such a load.

I understand that in the end passes such as DeLICM need to decide what is save to do and what not.

Let me just write down some related, but unfinished thoughts I have, that may give some context of what I am thinking about and where I believe there are possibilities to improve the current design of Polly.

I have been thinking for a while, if we really want to remove loads and stores from the polyhedral representation and to which extend we want to do this. I generally have the feeling that removing  memory accesses by dropping the full polly::MemoryAccess is not what we actually want, as it results in an inconsistency between the IR and the polyhedral model. We already discussed this earlier, when we concluded that the MemoryAccess is currently holding two kinds of information. The information about the original access and the information about how we transform the memory access. If we drop the full memory access structure we loose the information about the original memory access.  Obviously, there is precedence with Johannes' load hoisting, which introduced this behavior, but I hope we can improve this design in the future by making the separation between new and original memory access behavior more clear

I see different ways to improve this situation. I could e.g. see the load hoisting being expressed by modifying the access functions similar to how we do this today in Roman's and your work. We also have since a long time on the Polly todo list the introduction of ScopStmts that are defined on instruction trees, rather than basic blocks. As part of this, I could imaging that we automatically detect pure 'copy-only' statements and match them to the Copy_Stmt Roman recently introduced.

Assuming this direction is where we want to go (and this probably needs more discussion -- maybe next phone call?), I was wondering where DeLICM fits in here. If DeLICM for example only removes copy statements maybe we could in the future avoid the removal of memory accesses in DeLICM and instead remove full copy statements.

Now going back to this actual patch. I don't think it is necessary to today go full speed on these design changes, but am OK to continue removing full polly::MemoryAccesses for a while. Still, I think to make future changes easier it would be great if we could limit the generality of such changes to what is needed for DeLICM today. As said earlier, I understand that we cannot validate DeLICM retrospectively. But maybe we can check what kind of load/store memory accesses we allow to be removed.

I went back to learn which memory accesses DeLICM removes today and it seems
delicm's cleanup routine does not drop loads at all and only drops stores that store a value that has been loaded before (so no div in between). It seems the load part of this patch can today not be 
reached, right? According to the comments in cleanup, you are planning to also remove the corresponding loads, but such loads would only be limited to the ones , where all uses are direct uses by store statements in the same basic block?Does your comment above refer to this situation, when mentioning that the vectorizer would crash if the load is not dropped? Or does this require more than one compute instruction between load and store?

> Also consider that not generating a store is just as harmful for the generated code as replacing a load by an `Undef`. The missing store would be even harder to find.
> 
>> Also, if you commit this separately, which certainly would not hurt, a JSON test case would be amazing.
> 
> The json importer currently does not support removing `MemoryAccess`es. Implementing that will require much more code and certainly also its own review.

I suggested this as a way to obtain test coverage for this patch without the need for delicm yet. I would hope that JSON is easy to extend (e.g. just mark a memory access as 'deleted').

Another option would be to commit this change together with the delicm cleanup() routine after DELICM. We can then run delicm to test the codegen change and it also becomes more clear what
are the actual uses. I am in fact starting to prefer this approach slightly.

Best,
Tobias

https://reviews.llvm.org/D25095