[PATCH] D13611: [Polly] Create virtual independent blocks

Fri Oct 16 09:50:48 PDT 2015

Meinersbur added a comment.

In http://reviews.llvm.org/D13611#268545, @jdoerfert wrote:

> In http://reviews.llvm.org/D13611#268449, @grosser wrote:
>
> > Hi Johannes,
> >
> > great to see we will finally get rid of the IndependentBlocks pass. I just had a very first skim over this patch, but did not manage to do a full review yet.
> >
> > The two main items I would still like to look into:
> >
> > 1. This seems to conflict with http://reviews.llvm.org/D12975 (Michael's DE-LICM)
> >
> >   I need to get a better understanding in which ways these patches possibly conflict (as well as the overall design).
>
>
> Yeah, at the time this was written it was fine but __as far as I understand it__ the MAPPED accesses (and possibly more) implicitly change the semantics of the SCoP description without it beeing actually changed (except the "MAPPED" flag). One could argue that this should look for MAPPED accesses but I don't, instead I will argue that "MAPPED" should __not__ be an access property that is determined on the CFG and only a flag in the access description that turns on some alternate path in the code generation. An access that is not actually happening (no code is generated for) should not be in the SCoP description and the ones in there should happen where they are and read/write the location they define. If that is the case this patch should work properly.

The semantics of a Scop does not change. What does change is where implicits are stored. Like AccessInstruction and AccessValue and non-affine subregions, it is no information required by the polyhedral model itself, but required to associate polyhedral statments to the IR behind it.

To avoid modifying IR directly, we "virtualize" the relevant metainformation such that the Scop description and IR code do not map directly. Having to redirect loads/stores becomes unavoidable.

> 

> 

> > 2. Post-processing statements vs. instruction-list ScopStmts

> 

> > 

> 

> >   Your current approach of first building all ScopStmts and then post-process them to eliminate scalar dependences is interesting. However, it requires a special post-processing phase in ScopInfo and additional special-purpose code in ScopStmt and BlockGenerator. I wonder if it may make sense to build ScopStmts from the beginning just as a list of instructions (instead of a basic block) which could then be uniformly processed in the BlockGenerator.

> 

> 

> Processing the SCoP allows general solutions which are not (easily) reproduceable on the CFG, e.g., determining if a load is overwritten is easy and precise but hard on the CFG. The instruction list is a different issue and it would remove not that much code/logic anyway. With the pending patches by Michael we will reload scalars at the beginning of a statmeent and as I pointed out on that review, alsmost the same is added here.

I can't find such code in this patch. Where is it?

> While I did not get a valuable response on the idea (and implementation sketch) to merge the recomputation and the scalar reload at the beginning of a statement, I think it is what we ultimately want.

What proposal are you refereing to? What scalar reload?

In http://reviews.llvm.org/D13611#268560, @jdoerfert wrote:

> In http://reviews.llvm.org/D13611#268497, @Meinersbur wrote:
>
> > In http://reviews.llvm.org/D13611#268449, @grosser wrote:
> >
> > > Hi Johannes,
> > >
> > > great to see we will finally get rid of the IndependentBlocks pass. I just had a very first skim over this patch, but did not manage to do a full review yet.
> > >
> > > The two main items I would still like to look into:
> > >
> > > 1. This seems to conflict with http://reviews.llvm.org/D12975 (Michael's DE-LICM)
> > >
> > >   I need to get a better understanding in which ways these patches possibly conflict (as well as the overall design).
> >
> >
> > To add some details:
> >
> > http://reviews.llvm.org/D12975 tracks which values flow between ScopStmts which this patch modifies at a very late phase. There should be fewer such implicit flows, but some of those flows might have already be created with origin MAPPED. The array element occupied by this MAPPED flow might be used for some other implicit flow, meaning De-LICM before virtual blocks is ineffective. It is also more difficult to modify MAPPEDs because lifetime issues have to be considered and there is no distinction between SCALAR and PHI origins (not sure how much of a problem this is though).
>
>
> See the comment on

Which coomment?

> 

> 

> > One of my rationale to do De-LICM before even creating MemoryAccesses is that it is easier to implement (no moving, copying, deleting of MemoryAccesses) and reduced overhead because fewer MemoryAccesses are created in the first place.

> 

> 

> While moving, copying and deleting MemoryAccesses sound like a lot, one has to remember the number of times this can happen and the cost of each operation.

>  I think the fact that we get generally better results with this patch than without (see commit message) shows that the "overhead" is low enough.

I have no general objection to it. However, you occasionally raise such concerns yourself (e.g. http://reviews.llvm.org/D13341) so I try to take them into account.

> The implementation difficulty is another issue. While this patch currently only forwards/copies scalars its general structure allows more, e.g., reload values instead of communicate them (see subsequent patch) or remove conditional PHI nodes and replace them by selects. While e.g., reloading is generally hard and costly on the CFG level (one has to determine if the value has been overwritten __somewhere__ in between) the SCoP abstraction allows to reason about it in a simple and consise way.

Such analysis on the polyhedral model requires expensive ISL operations to compute the dependences. One of our intents of simplifying the scop description is to make its analysis run faster. We'd do still to the complicated analysis on the large scop description to get a simpler on and redo the same analysis. I am not saying this that this cannot make sense, but the simplifications we want to perform (IndependentBlocks, DeLICM) are both on the IR level on which reasoning is much cheaper.

> > So we have the possibilities to either do "simplifyScalarAccesses" before creating MemoryAccesses (e.g. by marking Instructions to where they should be copied) or "De-LICM" after simplifyScalarAccesses.

> 

> 

> I already proposed the latter in a review last week but I don't recall a positive answer to that.

AFAIU we about to discuss this.

http://reviews.llvm.org/D13611