[PATCH] D156889: [mlir][cf] Add ControlFlow to SCF lifting pass

Thu Aug 3 00:53:50 PDT 2023

gysit added a comment.

Amazing progress!

I did a first pass and the logic makes sense to me. I left a couple of nit and naming comments. The latter can be seen as suggestions, take what you think makes sense.

================
Comment at: mlir/include/mlir/Transforms/CFGToSCF.h:37
+  /// left as is.
+  virtual Operation *createBranchRegionOp(OpBuilder &builder,
+                                          Operation *controlFlowCondOp,
----------------
Would it make sense to prefix all methods that create SCF operation with SCF? E.g., for this one it may make sense to rename to `createSCFBranchOp` and below `createSCFLoopOp`. That way it may be a bit easier to follow in the code when new SCF stuff is created.

================
Comment at: mlir/lib/Conversion/ControlFlowToSCF/ControlFlowToSCF.cpp:136-138
+    // TODO: This should create a `ub.unreachable` op once that exists to make
+    //       the pass independent of the func dialect. For now just return
+    //       poison values.
----------------

================
Comment at: mlir/lib/Conversion/ControlFlowToSCF/ControlFlowToSCF.cpp:171-174
+    if (result.wasInterrupted()) {
+      signalPassFailure();
+      return;
+    }
----------------
ultra nit:

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:91
+//
+// Lifting conditional branches is done by analyzing the first conditional
+// branch encountered in the entry region. The algorithm then identifies
----------------
or some other form of emphasizing the first would be nice since it is pretty important.

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:129-130
+/// Returns the mutable operand range used to transfer operands from `block` to
+/// its successor with the given index. The return range being mutable allows
+/// modifying the operands being transferred.
+static MutableOperandRange
----------------

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:190
+
+/// Structure containing all special edges of a cycle. A cycle is a
+/// generalization of a loop that may have multiple entry edges. See also
----------------
I would probably enumerate: Structure that contains the entry, exit, and back edges of a cycle.

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:193
+/// https://llvm.org/docs/CycleTerminology.html.
+struct CyclesEdges {
+  /// All edges from a block outside the cycle to a block inside the cycle.
----------------

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:204
+/// Class used to orchestrate creation of so-called edge multiplexers.
+/// This constructs creates a new basic blocks and redirects all edges given to
+/// be routed through that basic block before branching to their original
----------------

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:210-212
+  Block *multiplexerBlock;
+  function_ref<Value(unsigned)> getSwitchValue;
+  function_ref<Value(Type)> getUndefValue;
----------------
nit: I would probably put the private member variables at the end of the class definition? Also maybe put an explicit `private:` at the beginning of the class definition?

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:234-237
+  /// appropriate block arguments after the first entry block. `extraArgs`
+  /// allows adding additional block arguments after the required ones to the
+  /// multiplexer block, allowing additional successor operands to be passed
+  /// from every edge.
----------------
nit: `extraArgs` contains the types of possible extra block arguments passed to the multiplexer block that are added to the successor operands of  every outgoing edge.

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:257
+    // be assigned the `getUndefValue`. The amount of block arguments and their
+    // offset is saved in the map for `redirectEdges` to transform the edges.
+    llvm::SmallMapVector<Block *, BlockEntry, 4> entries;
----------------
nit: `redirectEdges` == `entries`? Would it make sense to rename to edgeToArgRangeMap, or similar?

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:304
+    // successor block.
+    ValueRange destArgs = multiplexerBlock->getArguments();
+    iter = llvm::transform(
----------------
Could we fill the array using an llvm::enumerate and if blocks that check based on the index what we need todo?

Something similar to this:

```
for (auto &it : llvm::enumerate(multiplexerBlock->getArguments()) {
  if (block.offset <= it.index() < block.offset + block.size) {
    // copy args of this block
   continue; 
  }
  if (it.index() == discriminatorIdx) {
    // copy discriminator 
    continue;
  }
  if (it.index() > discriminatorIdx) {
    // copy extra argument
    continue;
  }
  // create an undef value
}
```

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:315
+
+    unsigned remainingArgsSize = discriminator ? 1 : 0;
+    remainingArgsSize += extraArgs.size();
----------------
nit: discriminatorAndExtraArgsSize? The term remaining seems a bit confusing here since I would interpret this as the arguments of the remaining blocks. 

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:348
+    SmallVector<Block *> caseDestinations;
+
+    for (auto &&[index, pair] : llvm::enumerate(entries)) {
----------------
ultra nit: drop newline?

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:362
+    // create a dummy flag for the switch.
+    Value realDiscriminator = discriminator && caseArguments.size() > 1
+                                  ? discriminator
----------------
isn't caseArgument.size() > 1 iff discriminator != 0? if yes I would probably simplify to:

```
if (caseArguments.size() == 1)
  discriminator = getSwitchValue(0);
```

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:382
+/// to be equivalent.
+struct ReturnLikeOperationEquivalence : public llvm::DenseMapInfo<Operation *> {
+  static unsigned getHashValue(const Operation *opC) {
----------------
ultra nit: ReturnLikeOpEquivalence or ReturnLikeEquivalence?

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:406
+/// every return-like operation.
+class ReturnLikeExitTransformer {
+  // Mapping of return-like operation to block. All return-like operations
----------------
nit: maybe RegionExitCombiner or similar? Transformer sounds a bit generic?

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:407-411
+  // Mapping of return-like operation to block. All return-like operations
+  // of the same kind with the same attributes, properties and types are seen as
+  // equivalent. First occurrence seen is kept in the map.
+  llvm::SmallDenseMap<Operation *, Block *, 4, ReturnLikeOperationEquivalence>
+      mapping;
----------------
gysit wrote:
> Let's move the private members to the end of the class definition? The comment could be transformed to a doc string?

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:407-413
+  // Mapping of return-like operation to block. All return-like operations
+  // of the same kind with the same attributes, properties and types are seen as
+  // equivalent. First occurrence seen is kept in the map.
+  llvm::SmallDenseMap<Operation *, Block *, 4, ReturnLikeOperationEquivalence>
+      mapping;
+  Region &topLevelRegion;
+  CFGToSCFInterface &interface;
----------------
Let's move the private members to the end of the class definition? The comment could be transformed to a doc string?

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:423
+  /// instance of `returnLikeOperation`s kind.
+  void transform(Operation *returnLikeOperation,
+                 llvm::function_ref<Value(unsigned)> getSwitchValue) {
----------------
nit: maybe rename to combineWithPrevExits or similar?

Also, would it make sense to shorten operation to op for some of the variables, e.g. returnLikeOperation -> returnLikeOp or originalOperation -> originalOp?

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:461
+  /// Returns true if any IR transformations have occurred so far.
+  bool changed() const {
+    // IR transformations have occurred if any exit blocks were created.
----------------
nit: hasChanged

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:470
+
+/// Calculates all special edges of the given cycle.
+static CyclesEdges
----------------
nit: -> the entry, exit, and back edges of the given cycle.

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:496-502
+  for (Block *block : cycles)
+    for (auto &&[succIndex, succ] : llvm::enumerate(block->getSuccessors())) {
+      if (!entryBlock.contains(succ))
+        continue;
+
+      result.backEdges.emplace_back(block, succIndex);
+    }
----------------

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:549
+             llvm::map_range(backEdges, std::mem_fn(&Edge::getSuccessor))) &&
+         "All repetition arcs must lead to the single loop header");
+
----------------
nit: should we replace arc by edge here and below?

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:572-573
+  // Since this is a loop, all back edges point to the same loop header.
+  Edge latchRepr = backEdges.front();
+  Block *loopHeader = latchRepr.getSuccessor();
+
----------------

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:751
+
+    CyclesEdges arcs = calculateCycleEdges(cycleBlockSet);
+    Block *loopHeader = arcs.entryEdges.front().getSuccessor();
----------------

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:872
+    DominanceInfo &dominanceInfo) {
+  switch (regionEntry->getNumSuccessors()) {
+  case 0:
----------------
nit: I would favor if blocks over switch case in this case:

```
if (regionEntry->getNumSuccessors() == 0) {
  //
}
if (regionEntry->getNumSuccessors() == 1) {
  //
}
```

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:899
+  // determined and become the region following that edge.
+  // The last region is the continuation that follow the branch regions.
+  SmallPtrSet<Block *, 8> notContinuation;
----------------

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:958
+  // that case the control flow operation must stay as we are unable to create a
+  // single exit-block. We can nevertheless process all its successors as they
+  // are all single-entry regions.
----------------
?

Especially the case three below is quite hard to follow and I am not sure the first sentence matches may understanding of the code. I wonder if examples would be more helpful here than text only?

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:960
+  // are all single-entry regions.
+  // 3) Some branch regions have an edge to one or more continuation.
+  // This occurs if some branches end in the same return-like operation and some
----------------
Reading the code it seems the condition is somewhat more complex:
- Either only a subset has a continuation edge
- Or not all continuation edges point to the same continuation block.
is that correct?

I guess an example may be more intuitive to understand in this case.

================
Comment at: mlir/lib/Transforms/Utils/CFGToSCF.cpp:1171
+
+    size_t sizePrior = workList.size();
+    llvm::append_range(workList, transformCyclesToSCFLoops(
----------------
A comment that explains what the transform functions add to the work lists may make sense here albeit it being a bit repetitive given the function doc strings. 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156889/new/

https://reviews.llvm.org/D156889