[PATCH] D8688: Update MergedLoadStoreMotion to use MemorySSA

Wed Jul 6 20:11:43 PDT 2016

Gerolf added inline comments.

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:830
@@ +829,3 @@
+/// the hash table and can be hoisted.
+///
+bool MergedLoadStoreMotion::mergeLoadsMemorySSA(BasicBlock *BB) {
----------------
Ok, I assumed the while loop is on the hot path usually. Then the code checks the (part of the) while condition twice.

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:846
@@ +845,3 @@
+  auto *Succ1Accesses = MSSA->getBlockAccesses(Succ1);
+  if (!Succ1Accesses)
+    return false;
----------------
Hm, it looks alright today :-). Thanks.

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:889
@@ +888,3 @@
+  bool MergedStores = false;
+  assert(T && "Footer of a diamond cannot be empty");
+
----------------
Footer -> Tail in error message

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:907
@@ +906,3 @@
+
+  // Skip all this if we don't have any memory accesses to look at.
+  auto *Pred0Accesses = MSSA->getBlockAccesses(Pred0);
----------------
There is similar code in mergeLoads. That could be a utility function that returns false when there is no memory access in at least one of the blocks, e.g.. static bool hasMemoryAccess(BasicBlock *B1, BasicBlock *B2).

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:929
@@ +928,3 @@
+      ++NStores;
+      if (NStores * Size1 >= MagicCompileTimeControl)
+        break;
----------------
Should be if (++NStores) to be consistent with your code in mergeLoads. Or change the code for NLoads there.

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:933
@@ +932,3 @@
+  }
+
+  for (auto AI = Pred1Accesses->rbegin(), AE = Pred1Accesses->rend(); AI != AE;
----------------
Could be a separate function:

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:950
@@ +949,3 @@
+
+    auto LookupResult = StoreInfo.equal_range(Store1);
+    bool Res = false;
----------------
Please add a comment about what the equal stores are. 

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:954
@@ +953,3 @@
+
+    while (!Res && LookupIter != LookupResult.second) {
+      StoreInst *Store0 = *LookupIter;
----------------
Since it is sinking stores would it speed up the code reversing the loop traversal (from second to first)?

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:979
@@ +978,3 @@
+  }
+  for (auto V : AccessesToDelete) {
+    MSSA->removeMemoryAccess(V);
----------------
A comment would be nice why the accesses get cleared here. Or would that fit better in sinkStores? 

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:988
@@ +987,3 @@
+///
+/// \brief True when the store is not downwards explosed to block \p End
+
----------------
Typo: explosed

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:1003
@@ +1002,3 @@
+  // Process all the uses and see if they are below end.
+  while (!CurrentUses.empty()) {
+    MemoryAccess *MA = CurrentUses.front();
----------------
What is a good invariant for this loop? Like there shouldn't be more checks than memory references in the code that contains Start? A checked invariant would make me more comfortable with this code.

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:1019
@@ +1018,3 @@
+    // that lead to the sink point.
+    if ((CurrDFS.first >= StartDFS.first && CurrDFS.first <= StartDFS.first) &&
+        CurrDFS.second >= StartDFS.second &&
----------------
That looks very wrong. At the minimum this is a convoluted equality test.

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:1052
@@ +1051,3 @@
+  //
+  // 1. There are no stores above S0 or S1 in either block.
+  // In this case, no phi node is necessary, both MemoryDefs  must have the same
----------------
I continue struggling with this function. For now I just need some help understanding the comment.
When you say "above" S0, S1 you refer to DFS numbering like Sx is above S0 in this example:

0 Sx
1
2 S0 

Correct? This would be consistent with your DFS explanation in a previous explanation.

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:1054
@@ +1053,3 @@
+  // In this case, no phi node is necessary, both MemoryDefs  must have the same
+  // defining access, as the top of the diamond can only have one reaching
+  // MemoryDef/MemoryPhi at the end. After use replacement, we may end up with a
----------------
What is the "top of the diamond"? The head?

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:1057
@@ +1056,3 @@
+  // phi
+  // in the sink location block having all the same operands, which we check
+  // for.
----------------
What is the "sink location block"? The Tail?

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:1060
@@ +1059,3 @@
+  //
+  // 2. There are stores above either S0 or S1, and uses below the block
+  // In this case, a phi node will already exist in the sink location, but may
----------------
What is "the block"? Perhaps "uses in the tail and/or below" describes it better.

================
Comment at: lib/Transforms/Scalar/MergedLoadStoreMotion.cpp:1061
@@ +1060,3 @@
+  // 2. There are stores above either S0 or S1, and uses below the block
+  // In this case, a phi node will already exist in the sink location, but may
+  // need to be updated.
----------------
Would it make sense to clearly distinguish phi and memory phi (mphi) nodes? 

http://reviews.llvm.org/D8688