[PATCH] D74544: [MLIR] Add naive fusion of parallel loops.

Fri Feb 14 04:17:30 PST 2020

herhut requested changes to this revision.
herhut added a comment.
This revision now requires changes to proceed.

Cool, great start!

================
Comment at: mlir/include/mlir/Dialect/LoopOps/Passes.h:1
+#ifndef MLIR_DIALECT_LOOPOPS_PASSES_H_
+#define MLIR_DIALECT_LOOPOPS_PASSES_H_
----------------
This is lacking the LLVM header comment.

================
Comment at: mlir/lib/Dialect/LoopOps/Transforms/ParallelLoopFusion.cpp:39
+                           const OperandRange &rhs) -> bool {
+    // TODO(pifon): Extend this to support aliases and equal constants.
+    return std::equal(lhs.begin(), lhs.end(), rhs.begin());
----------------
Avoid named TODO in llvm code base. I am an offender, as well, but I was told to avoid this in the future.

================
Comment at: mlir/lib/Dialect/LoopOps/Transforms/ParallelLoopFusion.cpp:48
+/// Analyzes dependencies in the most primitive way by checking that ploop1
+/// writes to the same buffer elements that ploop2 reads.
+static LogicalResult verifyDependencies(ParallelOp ploop1, ParallelOp ploop2,
----------------
You also need that `ploop2` does not write to buffers that `ploop1` reads. Both loops are fully independent if WRITES(ploop1) U READS(ploop2) is empty and READS(ploop1) U WRITES(ploop2) is empty. As we have sequential execution order within one iteration, we can ignore reads and writes between the two loops that are on the same index in the context of fusion.

================
Comment at: mlir/lib/Dialect/LoopOps/Transforms/ParallelLoopFusion.cpp:49
+/// writes to the same buffer elements that ploop2 reads.
+static LogicalResult verifyDependencies(ParallelOp ploop1, ParallelOp ploop2,
+                                        const BlockAndValueMapping &map) {
----------------
Can you add a comment what the `map` contains? This is a map from `ploop1` indices to `ploop2` indices I assume?

================
Comment at: mlir/lib/Dialect/LoopOps/Transforms/ParallelLoopFusion.cpp:65
+    // Check that the load indices of ploop2 coincide with store indices of
+    // ploop1 for the same memrefs.
+    auto storeIndices = write->second.front();
----------------
This is sufficient for memrefs that are defined outside of the loops. However, memrefs that are created within the loop, e.g. by a subview, will not be handled by this. As a starter, it would be ok to bail out in these cases.

================
Comment at: mlir/lib/Dialect/LoopOps/Transforms/ParallelLoopFusion.cpp:70
+      return WalkResult::interrupt();
+    for (int i = 0, e = storeIndices.size(); i < e; ++i) {
+      if (map.lookupOrDefault(storeIndices[i]) != loadIndices[i])
----------------
Does something like `llvm::all_of(llvm::zip(storeIndices, loadIndices), std::equal)` work? Just curious, no need to go there.

================
Comment at: mlir/lib/Dialect/LoopOps/Transforms/ParallelLoopFusion.cpp:110
+    // Iteratively fuse adjacent loops.
+    for (int i = 0, e = ploops.size(); i + 1 < e; ++i)
+      fuseIfLegal(ploops[i], ploops[i + 1], b);
----------------
What happens if there is a read or write inbetween the ploops? As a start, you could consider immediately adjacent loops or loops where we have no sideffecting ops inbetween.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74544/new/

https://reviews.llvm.org/D74544