[PATCH] D79184: [MLIR][LoopOps] Adds the loop unroll transformation for loop::ForOp.

Tue May 5 03:12:01 PDT 2020

ftynse accepted this revision.
ftynse added a comment.
This revision is now accepted and ready to land.

Thanks, Andy!

================
Comment at: mlir/lib/Transforms/Utils/LoopUtils.cpp:140
+// where divis is rounding-to-zero division.
+static Value ceilDivPositive(OpBuilder &builder, Location loc, Value dividend,
+                             Value divisor) {
----------------
Would it make sense to share this with affine expression lowering https://github.com/llvm/llvm-project/blob/d3588d0814c4cbc7fca677b4d9634f6e1428a331/mlir/lib/Conversion/AffineToStandard/AffineToStandard.cpp#L149  ?

================
Comment at: mlir/lib/Transforms/Utils/LoopUtils.cpp:211
+  auto *parentBlock = forOp.getOperation()->getBlock();
+  if (!iv.use_empty())
+    iv.replaceAllUsesWith(lbCstOp);
----------------
Nit: it feels like you could just call `iv.replaceAllUsesWith` and let it do nothing if there are no uses

================
Comment at: mlir/lib/Transforms/Utils/LoopUtils.cpp:489
+  // so that we know what to clone (since we are doing this in-place).
+  Block::iterator srcBlockEnd = std::prev(loopBodyBlock->end(), 2);
+
----------------
Since this is only used inside `std::next` below, how about taking `std::prev(..., 1)` and dropping `std::next` ?

================
Comment at: mlir/lib/Transforms/Utils/LoopUtils.cpp:504
+    // Clone the original body of 'forOp'.
+    for (auto it = loopBodyBlock->begin(); it != std::next(srcBlockEnd); it++) {
+      builder.clone(*it, operandMap);
----------------
Nit: drop trivial braces

================
Comment at: mlir/lib/Transforms/Utils/LoopUtils.cpp:606
+    int64_t stepCst = stepCstOp.getValue();
+    assert(lbCst >= 0 && ubCst >= 0 && stepCst >= 0 &&
+           "expected positive loop bounds and step");
----------------
Please document this precondition. I don't think loop::ForOp disallows negative bounds.

================
Comment at: mlir/lib/Transforms/Utils/LoopUtils.cpp:616
+    generateEpilogueLoop = upperBoundUnrolledCst < ubCst;
+    if (generateEpilogueLoop) {
+      upperBoundUnrolled =
----------------
Nit: drop trivial braces

================
Comment at: mlir/lib/Transforms/Utils/LoopUtils.cpp:630
+    // Dynamic loop bounds computation.
+    // TODO(andydavis) Add dynamic asserts for negative lb/ub/step.
+    auto lowerBound = forOp.lowerBound();
----------------
If you take the ceildiv implementations from AffineApplyExpander, you may be able to support negative dividends.

================
Comment at: mlir/lib/Transforms/Utils/LoopUtils.cpp:664
+                       [&](unsigned i, Value iv, OpBuilder b) {
+                         // iv' = iv + 1/2/3...unrollFactor-1;
+                         auto stride = b.create<MulIOp>(
----------------
This comment looks confusing because it doesn't account for step.  (Same issue with the affine version)

================
Comment at: mlir/test/Dialect/Loops/loop-unroll.mlir:20
+//
+//   UNROLL-BY-2-DAG:  %[[V0:.*]] = subi %[[UB]], %[[LB]] : index
+//   UNROLL-BY-2-DAG:  %[[C1:.*]] = constant 1 : index
----------------
Why is it a DAG? Is there some non-determinism in operation order?

================
Comment at: mlir/test/Dialect/Loops/loop-unroll.mlir:47
+
+func @dynamic_loop_unroll_by_3(%arg0 : index, %arg1 : index, %arg2 : index,
+                               %arg3: memref<?xf32>) {
----------------
Hmm, could you just use the same input (i.e. `@dynamic_loop_unroll`) and match it with different prefixes?  All input functions are transformed by all four RUNs and most of them are just ignored in the test.

================
Comment at: mlir/test/lib/Transforms/TestLoopUnrolling.cpp:1
+//===- TestLoopUnrolling.cpp --- loop unrolling test pass -------===//
+//
----------------
Nit: pad until 80 characters

================
Comment at: mlir/test/lib/Transforms/TestLoopUnrolling.cpp:46
+    FuncOp func = getFunction();
+    SmallVector<std::pair<unsigned, loop::ForOp>, 4> loops;
+    func.walk([&](loop::ForOp forOp) {
----------------
Nit: since the required depth is known upfront, how about just storing the loops of this depth in a vector, instead of filtering the vector of all loops later?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79184/new/

https://reviews.llvm.org/D79184