[PATCH] D13611: [Polly] Create virtual independent blocks

Fri Nov 6 19:24:21 PST 2015

jdoerfert added a comment.

In http://reviews.llvm.org/D13611#284391, @Meinersbur wrote:

> I have the impression that this very much relies on that accesses (mostly SCALAR READ, but also PHI writes in non-affine subregions) are computed per instruction, whereas in http://reviews.llvm.org/D13762 I tried to make them per-ScopStmt. The difference is that e.g. if there are two reads in the same BB, two loads would be generated. In http://reviews.llvm.org/D13762 I tried to ensure that there is only one load.


One vs. multiple loads/stores of scalars makes little difference to us, as it will __only__ affect code generation. Everything before that, especially dependence analysis, stays the same. Hence, consolidating scalar accesses only to save duplicate loads/stores in the generated IR (which will be taken care of anyway) is no good reason to make any kind of scalar elimination harder.

> This patch seems to skip PHI accesses completely. In San Jose you mentioned that only loop-carried PHIs would be skipped. Where is that part? Where would DeLICM tell what memory can be reused?


There are 5 patches. This is the first one, and it only deals with trivially recomputable scalars, i.e., the ones without any "interesting/complicated operands".
The following patches, as well as DeLICM, come into play in the canRecomputeInStmt function, which is (almost) the only place that needs to be updated to allow more functionality.
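
To make "trivially recomputable" a bit more concrete, here is a rough toy sketch of the idea. It is not the code in this patch and does not use any LLVM or Polly types: ToyValue, Kind, and canRecomputeToy are hypothetical names made up for illustration. It assumes that constants are harmless leaves and that loads and PHIs count as "interesting" operands that block recomputation for now.

```cpp
#include <vector>

// Toy stand-in for an IR value. Hypothetical, NOT an LLVM or Polly class.
enum class Kind { Constant, Load, Phi, Arithmetic };

struct ToyValue {
  Kind kind;
  std::vector<const ToyValue *> operands; // empty for leaves
};

// Hypothetical check: a value is "trivially recomputable" if its whole
// operand tree consists of arithmetic over constants. Loads and PHIs are
// treated as "interesting" operands and make the check fail (assumption).
bool canRecomputeToy(const ToyValue &v) {
  switch (v.kind) {
  case Kind::Constant:
    return true;
  case Kind::Load:
  case Kind::Phi:
    return false;
  case Kind::Arithmetic:
    // Recomputable only if every operand is recomputable as well.
    for (const ToyValue *op : v.operands)
      if (!canRecomputeToy(*op))
        return false;
    return true;
  }
  return false;
}
```

In such a sketch, supporting more cases later (e.g., loads that are known not to be overwritten) would only mean relaxing the Load/Phi branch, which mirrors why a single check function is the main place that needs to change.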

> I got the impression we could generalize this within the SCEV expander. A SCEV-like object (we already support SDiv/SRem) could store an entire operand tree of movable instructions and CodeGen would synthesize/materialize it on request. This looks more streamlined with current mechanisms instead of introducing a new one. What do you think?


If SCEV allowed floating point values, maybe. Until then, we would have to reimplement everything we have here in our SCoPExpander, which is not really helpful. Additionally, during SCoP generation (e.g., when we check whether we can synthesize scalars instead of communicating them) we cannot check for overwritten loads and free memory locations we can reuse.


================
Comment at: include/polly/CodeGen/BlockGenerators.h:480
@@ -479,3 +492,1 @@
-  /// @return The innermost loop that surrounds the instruction.
-  Loop *getLoopForInst(const Instruction *Inst);
 
----------------
When we virtually move instructions, we are not really interested in the loop surrounding them but in their new parent block/statement, hence the change.


http://reviews.llvm.org/D13611