[PATCH] D13611: [Polly] Create virtual independent blocks

Johannes Doerfert via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 6 20:31:06 PST 2015


jdoerfert added a comment.

I tried to address (in words) all your comments. Once we have agreed on how to proceed, I will make the code changes.


================
Comment at: include/polly/CodeGen/BlockGenerators.h:422
@@ -421,1 +421,3 @@
 
+  /// @brief Recompute all scalars needed in this statement.
+  ///
----------------
All scalars that are not communicated via memory but need to be recomputed.

================
Comment at: include/polly/CodeGen/BlockGenerators.h:426
@@ +425,3 @@
+  /// description as well as the dependences. However, they are only moved if
+  /// we can recompute them in the statements they are used in. This method will
+  /// perform the recomputation before we clone the original statement into the
----------------
I will check that.

================
Comment at: include/polly/CodeGen/BlockGenerators.h:428
@@ +427,3 @@
+  /// perform the recomputation before we clone the original statement into the
+  /// new, optimized region, thus ensure all scalars are available.
+  void recomputeDependentScalars(ScopStmt &Stmt, ValueMapT &BBMap,
----------------
I will check that.

================
Comment at: include/polly/CodeGen/BlockGenerators.h:483
@@ -470,1 +482,3 @@
+  ///           o NULL, if no value is found and TryOnly is set.
+  ///           o Otherwise a trap is triggered.
   Value *getNewValue(ScopStmt &Stmt, Value *Old, ValueMapT &BBMap,
----------------
I can use the original message instead. If you have a better way to describe llvm_unreachable, I can put it here alternatively.
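For illustration, the lookup would end roughly like this (a sketch only; the parameter order and the elided lookup chain are assumptions based on the patch):

    Value *BlockGenerator::getNewValue(ScopStmt &Stmt, Value *Old,
                                       ValueMapT &BBMap, LoopToScevMapT &LTS,
                                       Loop *L, bool TryOnly) {
      // ... try the BBMap, the GlobalMap and SCEV-based synthesis first ...
      if (TryOnly)
        return nullptr; // the caller handles the missing value
      // The scalar dependence is neither available nor SCEVCodegenable.
      llvm_unreachable("Unexpected scalar dependence in region!");
    }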

================
Comment at: include/polly/CodeGen/BlockGenerators.h:488
@@ -473,3 +487,3 @@
   void copyInstScalar(ScopStmt &Stmt, Instruction *Inst, ValueMapT &BBMap,
-                      LoopToScevMapT &LTS);
+                      LoopToScevMapT &LTS, bool Recompute = false);
 
----------------
Yeah, somebody should add documentation to all these undocumented functions at some point.

================
Comment at: include/polly/CodeGen/BlockGenerators.h:546
@@ -534,1 +545,3 @@
   ///                    memory accesses.
+  /// @param Recompute   Flag to indicate the instruction is a scalar that
+  ///                    needs to be recomputed in this statement. It basically
----------------
I will check that.

================
Comment at: include/polly/ScopInfo.h:516
@@ +515,3 @@
+  /// @param Stmt    The statement the copied access should reside in.
+  MemoryAccess *copy(AccFuncSetType &AccList, ScopStmt *Stmt) const;
+
----------------
I would like to avoid accessing members of a different class.

================
Comment at: include/polly/ScopInfo.h:541
@@ -534,3 +540,3 @@
   /// @brief Get the type of a memory access.
-  enum AccessType getType() { return AccType; }
+  enum AccessType getType() const { return AccType; }
 
----------------
It was needed somewhere, but that was quite a while ago. If people need to know where, I can check.

================
Comment at: include/polly/ScopInfo.h:932
@@ +931,3 @@
+  /// @param Front  Flag to indicate where the access should be added.
+  void addAccess(MemoryAccess *Access, bool Front = false);
+
----------------
I'm not sure about your isl assumption, but in any case, I'm not assuming anything here. The different insertion locations are purely for user experience (as the accesses are then ordered in a more natural fashion).
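For reference, the Front flag amounts to nothing more than this (a minimal sketch, not the exact implementation):

    void ScopStmt::addAccess(MemoryAccess *Access, bool Front) {
      // Prepending keeps e.g. newly copied read accesses in front of the
      // accesses that use their values; this only affects the dump order,
      // not correctness.
      if (Front)
        MemAccs.insert(MemAccs.begin(), Access);
      else
        MemAccs.push_back(Access);
    }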

================
Comment at: include/polly/ScopInfo.h:940
@@ -921,1 +939,3 @@
+  ///
+  void removeMemoryAccess(MemoryAccess *MA, bool OnlyMA);
 
----------------
We could do that.

================
Comment at: include/polly/ScopInfo.h:1275
@@ +1274,3 @@
+  /// @returns True, if all instructions can be recomputed in @p Stmt.
+  bool canRecomputeInStmt(ScopStmt &Stmt, SmallPtrSet<Instruction *, 4> &Insts);
+
----------------
Meinersbur wrote:
> It would be more idiomatic if each call would check only a single Instruction instead of passing a set implementation.
We could do that.

================
Comment at: include/polly/ScopInfo.h:1291
@@ +1290,3 @@
+  /// operands properly.
+  using NonTrivialOperandsPairTy =
+      std::pair<InstructionSetTy, OutsideOperandsSetTy>;
----------------
Meinersbur wrote:
> I think you should clarify that you actually mean a flattened operand _tree_. I think LLVM uses the term operands as the direct arguments/parameters of an instruction.
Where do you expect a tree here? It actually contains __transitive__ operands, and I could add that word.
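To spell out in code what transitive means here, the collection is a flat worklist traversal, roughly (a simplified sketch of the loop in collectNonTrivialOperands; the Visited set is added here for clarity):

    SmallPtrSet<Instruction *, 8> Worklist, Visited;
    Worklist.insert(AccessInst);
    while (!Worklist.empty()) {
      Instruction *Inst = *Worklist.begin();
      Worklist.erase(Inst);
      if (!Visited.insert(Inst).second)
        continue;
      // Enqueue the operands of each operand as well, hence collect the
      // *transitive* operands; no tree structure is kept, only flat sets.
      for (auto &InstOp : Inst->operands())
        if (auto *InstOpInst = dyn_cast<Instruction>(InstOp))
          if (R.contains(InstOpInst))
            Worklist.insert(InstOpInst);
      if (Inst->mayHaveSideEffects() || Inst->mayReadFromMemory())
        SideEffectOperands.insert(Inst);
    }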

================
Comment at: include/polly/ScopInfo.h:1295
@@ +1294,3 @@
+  /// @brief Map from in SCoP instructions to their non-trivial operands.
+  DenseMap<Instruction *, NonTrivialOperandsPairTy> NonTrivialOperandsMap;
+
----------------
Meinersbur wrote:
> AFAICS it is only used during the execution of simplifyScalarAccesses. You could make it local and pass it to the functions that need it.
We could do that.

================
Comment at: include/polly/ScopInfo.h:1330
@@ +1329,3 @@
+  /// original code region at all.
+  void simplifyScalarAccesses();
+
----------------
Meinersbur wrote:
> The name "simplifyScalarAccesses" is very general. How about "recomputableDependentScalars"?
We do not recompute here, and later we do not always recompute to simplify scalars, so I think the name actually fits pretty well.

================
Comment at: lib/Analysis/ScopInfo.cpp:667
@@ +666,3 @@
+  isl_map *AR = getAccessRelation();
+  if (StmtDims > Dims)
+    AR = isl_map_add_dims(AR, isl_dim_in, StmtDims - Dims);
----------------
Meinersbur wrote:
> What if it has fewer dimensions? If it is impossible, could you add an assertion to make it clear?
Sure, we can add an assertion.

================
Comment at: lib/Analysis/ScopInfo.cpp:673
@@ +672,3 @@
+  return CopyMA;
+}
+
----------------
Meinersbur wrote:
> Doesn't this update InstructionToAccesses?
Mh, there was no need, but we can do that too, I guess.

================
Comment at: lib/Analysis/ScopInfo.cpp:1420
@@ +1419,3 @@
+  auto MALEnd = MAL.end();
+  auto MemAccsIt = MemAccs.begin();
+  while (true) {
----------------
Meinersbur wrote:
> I'd still ask you to refactor this code. It is hard to understand and looks fragile as it depends on order. Maybe you could use some functions from <algorithm>
This code (already in the current HEAD) is ugly because it tries to be smart. While it heavily depends on the order, it would need a lot more iterations if it did not. I don't argue that we have to keep it, only that it works in Polly HEAD for now and can be tackled afterwards if we find a nice and comparably efficient way to do it.

================
Comment at: lib/Analysis/ScopInfo.cpp:2778
@@ +2777,3 @@
+
+  // TODO: Check if we can actually move the instructions.
+  return false;
----------------
Meinersbur wrote:
> What are your plans here? When can instructions with side effects be recomputed?
That is what I was telling you about in SJ. I even pushed the first patch that implements this TODO for review; it depends on this one.

================
Comment at: lib/Analysis/ScopInfo.cpp:2783
@@ +2782,3 @@
+void Scop::collectNonTrivialOperands() {
+  // See: Scop::simplifyScalarAccesses()
+
----------------
Meinersbur wrote:
> What is there to see?
I'm not sure how to answer that, as I expected the reference to the function that contains the comment for this one to be self-explanatory.

================
Comment at: lib/Analysis/ScopInfo.cpp:2788
@@ +2787,3 @@
+    for (MemoryAccess *MA : Stmt) {
+      if (MA->isExplicit() || MA->isRead() || MA->isPHI())
+        continue;
----------------
Meinersbur wrote:
> Why are exit node PHIs processed, but not standard PHIs?
I'm not sure why you ask this, but as you can probably see in the code, no "PHI access" is processed. However, if something that is a PHI does not introduce a "PHI access", it might be processed here.

================
Comment at: lib/Analysis/ScopInfo.cpp:2800
@@ +2799,3 @@
+
+      auto &NonTrivialOperands = NonTrivialOperandsMap[AccessInst];
+      auto &SideEffectOperands = NonTrivialOperands.first;
----------------
Meinersbur wrote:
> Why for each access separately? Shouldn't this be per-ScopStmt because it only matters for cross-stmt operands?
I'm not sure what you mean. Why do we look at each access separately, or why do we look at the non-trivial operands of each access separately?

================
Comment at: lib/Analysis/ScopInfo.cpp:2815
@@ +2814,3 @@
+
+        for (auto &InstOp : Inst->operands())
+          if (Instruction *InstOpInst = dyn_cast<Instruction>(InstOp)) {
----------------
Meinersbur wrote:
> auto -> Use ?
We could do that.

================
Comment at: lib/Analysis/ScopInfo.cpp:2817
@@ +2816,3 @@
+          if (Instruction *InstOpInst = dyn_cast<Instruction>(InstOp)) {
+            if (R.contains(InstOpInst))
+              Worklist.insert(InstOpInst);
----------------
Meinersbur wrote:
> This will probably have issues with PHIs in the scop's entry, just as in PR25394, i.e., if InstOpInst is a PHI with an incoming block from outside the scop.
Mh, if we have a test case, I'll check that.

================
Comment at: lib/Analysis/ScopInfo.cpp:2824
@@ +2823,3 @@
+        if (Inst->mayHaveSideEffects() || Inst->mayReadFromMemory())
+          SideEffectOperands.insert(Inst);
+
----------------
Meinersbur wrote:
> and continue ?
This looks weird without the follow-up commits that introduce some code here.

================
Comment at: lib/Analysis/ScopInfo.cpp:2876
@@ +2875,3 @@
+      auto &SideEffectOperands = NonTrivialOperands.first;
+      if (!canRecomputeInStmt(Stmt, SideEffectOperands))
+        continue;
----------------
Meinersbur wrote:
> Wouldn't it be possible to recompute some of the dependent values, even if some operands have side effects?
Might be, but it won't help with removing the scalar dependences. Actually, it might make them worse.

================
Comment at: lib/Analysis/ScopInfo.cpp:2880
@@ +2879,3 @@
+      auto &OutsideOperands = NonTrivialOperands.second;
+      for (auto &OutsideOperandPair : OutsideOperands) {
+        Instruction *OutsideOperand = OutsideOperandPair.first;
----------------
Meinersbur wrote:
> Why only instructions that have operands from outside? For movable instructions (DependentScalar) without such operands, but implicit accesses, don't we need to remove its memory accesses as well?
Outside operands are the read-only scalars we track to ease OpenMP code generation. The scalar access we recompute is removed later.

================
Comment at: lib/Analysis/ScopInfo.cpp:2885
@@ +2884,3 @@
+        assert(UserStmt);
+        auto *UseMAs = UserStmt->lookupAccessesFor(OutsideUser);
+        if (!UseMAs)
----------------
Meinersbur wrote:
> This is using the list of accesses this function modifies on the fly ("InstructionToAccess"). As such, it seems to me to depend on the order we iterate over all statements. E.g. the previous iteration already removed the read access of "OutsideUser", hence will not be added to AdditionalAccesses anymore, hence we miss some read accesses.
> 
> Or am I wrong? Why is it that this cannot happen?
No read access is (or should be) removed if it is still connected to anything else; hence, your scenario should never happen. Additionally, we actually iterate over accesses here, so the actual removal happens only at the end (after we have traversed all accesses).

================
Comment at: lib/Analysis/ScopInfo.cpp:2890
@@ +2889,3 @@
+        for (const MemoryAccess *UseMA : *UseMAs)
+          if (UseMA->getBaseAddr() == OutsideOperand)
+            AdditionalAccesses.push_back(UseMA->copy(AccList, &Stmt));
----------------
Meinersbur wrote:
> What does this condition mean?
> 
> I think we are generally misusing "BaseAddr" for scalar accesses, because it is actually not an address.
It depends on how you see it. I think the (virtual) register is a unique address for a scalar, so it fits well here.

================
Comment at: lib/Analysis/ScopInfo.cpp:2894
@@ +2893,3 @@
+
+      Stmt.addDependentScalar(DefInst);
+      ResolvedAccesses.push_back(MA);
----------------
Meinersbur wrote:
> I'd expect this to be computed in collectNonTrivialOperands(), like a partition of instructions in the operand tree into
> 
> - Movable
> - Outside/Sideeffect (operand tree leaves)
> 
> I understand your algorithm works differently, but to understand, what is the rationale to only have direct operands in DependentScalars, while we want to copy/move an entire operand tree?
We copy/move the entire operand tree, but for that I do not need to traverse it up front and put all instructions into a list. I simply store the root of the tree and traverse it during the actual copy/move.

================
Comment at: lib/Analysis/ScopInfo.cpp:2932
@@ +2931,3 @@
+        auto *UserMAs = UserStmt->lookupAccessesFor(UserInst);
+        if (!UserMAs)
+          continue;
----------------
Meinersbur wrote:
> If a user has no more accesses, is it because it is intended to be removed?
Actually, it has already been removed; hence there is still the statement but no read access.

================
Comment at: lib/Analysis/ScopInfo.cpp:2935
@@ +2934,3 @@
+
+        for (auto *UserMA : *UserMAs) {
+          if (UserMA->isExplicit() || UserMA->isWrite())
----------------
Meinersbur wrote:
> auto -> MemoryAccess ?
We could do that.

================
Comment at: lib/Analysis/ScopInfo.cpp:2940
@@ +2939,3 @@
+          AllUsersRemoved = false;
+          break;
+        }
----------------
Meinersbur wrote:
> Is this break meant to leave both for-loops?
It could be, but it does not need to be.

================
Comment at: lib/Analysis/ScopInfo.cpp:2947
@@ +2946,3 @@
+
+      ResolvedAccesses.push_back(MA);
+    }
----------------
Meinersbur wrote:
> This seems to be a different kind of "resolved" than in removeRecomputableScalarReads.
No, it is the same, but for the write accesses instead of the reads. Or, to be more precise, it is the passive version, as writes are (passively) resolved once all their reads have been (actively) resolved.
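Spelled out, the passive check looks roughly as follows (a sketch; getStmtFor is an assumed lookup helper, the rest follows the patch):

    // A write for DefInst is passively resolved iff no statement still
    // holds an implicit read access for its value.
    bool AllUsersRemoved = true;
    for (User *U : DefInst->users()) {
      auto *UserInst = cast<Instruction>(U);
      ScopStmt *UserStmt = getStmtFor(UserInst); // assumed helper
      if (!UserStmt)
        continue;
      auto *UserMAs = UserStmt->lookupAccessesFor(UserInst);
      if (!UserMAs)
        continue; // the reads were already (actively) resolved
      for (MemoryAccess *UserMA : *UserMAs)
        if (!UserMA->isExplicit() && UserMA->isRead()) {
          AllUsersRemoved = false; // an implicit read remains
          break;
        }
      if (!AllUsersRemoved)
        break;
    }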

================
Comment at: lib/Analysis/ScopInfo.cpp:2963
@@ +2962,3 @@
+void Scop::simplifyScalarAccesses() {
+  // First iterate over all implicit write accesses, hence scalar definitions
+  // and collect all operands that might have side effects or read memory as
----------------
Meinersbur wrote:
> collectNonTrivialOperands actually skips PHIs which are also implicit accesses.
It does not skip them but stops at them, as they are non-trivial operands.

================
Comment at: lib/CodeGen/BlockGenerators.cpp:67
@@ +66,3 @@
+  for (auto *Inst : Stmt.getDependentScalars())
+    if (!GlobalMap.count(Inst))
+      copyInstruction(Stmt, Inst, BBMap, LTS, NewAccesses, true);
----------------
Meinersbur wrote:
> When does a dependent scalar become a global?
If it is hoisted.

================
Comment at: lib/CodeGen/BlockGenerators.cpp:86
@@ +85,3 @@
+            if (R.contains(Inst))
+              copyInstruction(Stmt, Inst, BBMap, LTS, nullptr, true);
+        }
----------------
Meinersbur wrote:
> Shouldn't these be in the BBMap after recomputeDependentScalars() executed?
After recomputeDependentScalars has executed, they are in the BBMap. However, while we are in recomputeDependentScalars, our synthesizing code sometimes thinks it can synthesize something even though it actually refers to old values we have to recompute first.

================
Comment at: lib/CodeGen/BlockGenerators.cpp:130
@@ -111,3 +129,3 @@
 
-  if (Value *New = trySynthesizeNewValue(Stmt, Old, BBMap, LTS, L))
-    return New;
+  if (canSynthesize(Old, &LI, &SE, &Stmt.getParent()->getRegion()))
+    if (Value *New = trySynthesizeNewValue(Stmt, Old, BBMap, LTS, L))
----------------
Meinersbur wrote:
> Why does this become necessary? Shouldn't "trySynthesizeNewValue" determine itself whether it is synthesizable?
Good question. I don't remember anymore. I'll check once the general patch is agreed on.

================
Comment at: lib/CodeGen/BlockGenerators.cpp:147
@@ -124,4 +146,3 @@
   // The scalar dependence is neither available nor SCEVCodegenable.
   llvm_unreachable("Unexpected scalar dependence in region!");
 }
----------------
Meinersbur wrote:
> This essentially becomes
> 
>     assert(TryOnly)
>     return nullptr;
Seems fine to me.

================
Comment at: lib/CodeGen/BlockGenerators.cpp:169
@@ +168,3 @@
+      Instruction *NewOperandInst = dyn_cast_or_null<Instruction>(NewOperand);
+      if (!NewOperand || (NewOperandInst && R.contains(NewOperandInst))) {
+        if (Instruction *OldOperandInst = dyn_cast<Instruction>(OldOperand)) {
----------------
Meinersbur wrote:
> What is the rationale for this condition?
If we could not generate a new operand, but we need one because the old operand is an instruction in the region, we have to do something.

================
Comment at: lib/CodeGen/BlockGenerators.cpp:171
@@ +170,3 @@
+        if (Instruction *OldOperandInst = dyn_cast<Instruction>(OldOperand)) {
+          copyInstruction(Stmt, OldOperandInst, BBMap, LTS, nullptr, Recompute);
+          NewOperand = BBMap[OldOperand];
----------------
Meinersbur wrote:
> Shouldn't this have been handled by recomputeDependentScalars() already? How do we know that the operand is even a DependentScalar?
> 
> If this is because DependendScalars is unordered, wouldn't it be cleaner to ensure they are in their order of dependence?
> Shouldn't this have been handled by recomputeDependentScalars() already? How do we know that the operand is even a DependentScalar?

This code is only triggered while we execute recomputeDependentScalars.

> If this is because DependendScalars is unordered, wouldn't it be cleaner to ensure they are in their order of dependence?

Which order of dependence? We have a DAG here and have to generate code for it. One way is to find a topological order on the whole DAG and then generate code; the other (which is implemented) is recursive, demand-driven code generation. As we do the latter everywhere else, I figured it makes sense to do it here too.
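As a sketch, the demand-driven variant amounts to the following (simplified; recomputeOnDemand is a made-up name, and the GlobalMap and hoisted values are ignored):

    // Recompute Old in Stmt, generating in-region operands on demand.
    void recomputeOnDemand(ScopStmt &Stmt, Instruction *Old, ValueMapT &BBMap,
                           LoopToScevMapT &LTS) {
      if (BBMap.count(Old))
        return; // already generated for this statement
      for (auto &Op : Old->operands())
        if (auto *OpInst = dyn_cast<Instruction>(Op))
          if (R.contains(OpInst))
            recomputeOnDemand(Stmt, OpInst, BBMap, LTS); // recurse into the DAG
      copyInstruction(Stmt, Old, BBMap, LTS, /*NewAccesses=*/nullptr,
                      /*Recompute=*/true);
    }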

================
Comment at: lib/CodeGen/BlockGenerators.cpp:226
@@ -191,3 +225,3 @@
 
-Loop *BlockGenerator::getLoopForInst(const llvm::Instruction *Inst) {
-  return LI.getLoopFor(Inst->getParent());
+Loop *BlockGenerator::getLoopForStmt(const ScopStmt &Stmt) const {
+  auto *StmtBB =
----------------
Meinersbur wrote:
> I like this change, but could be committed independently. Maybe also as a member function of ScopStmt?
I can commit it beforehand, but I don't see the need for a ScopStmt member function at the moment.

================
Comment at: lib/CodeGen/BlockGenerators.cpp:228
@@ +227,3 @@
+  auto *StmtBB =
+      Stmt.isBlockStmt() ? Stmt.getBasicBlock() : Stmt.getRegion()->getEntry();
+  return LI.getLoopFor(StmtBB);
----------------
Meinersbur wrote:
> This idiom appears multiple times. Introduce a helper function?
That is something we can do, though it neither conflicts with nor is required by this commit.

================
Comment at: lib/CodeGen/BlockGenerators.cpp:334
@@ +333,3 @@
+
+  recomputeDependentScalars(Stmt, BBMap, LTS, NewAccesses);
+
----------------
Meinersbur wrote:
> Any reason to do this before generateScalarLoads()? Naively, I'd think it belongs after it because some of the dependent instructions might still use scalars (e.g. read-only ones).
First, by construction we can place it here, as recomputable scalars do not depend on anything we cannot recompute. Second, we can probably move it after the scalar load generation, but it should be all the same.

================
Comment at: test/Isl/CodeGen/srem-in-other-bb.ll:1
@@ -1,3 +1,2 @@
-; RUN: opt %loadPolly -polly-codegen -S \
-; RUN:     < %s | FileCheck %s
+; RUN: opt %loadPolly -polly-codegen -S < %s | FileCheck %s
 ;
----------------
Meinersbur wrote:
> Nice, but unrelated
I probably forgot that while I prepared this commit and all the test cases. I'm sorry.


http://reviews.llvm.org/D13611




