[PATCH] D111337: [fir] Add array value copy pass

Mehdi AMINI via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 2 22:32:57 PDT 2021


mehdi_amini added a comment.

Thanks for the update, the size makes it tractable now, and the test helps to figure out the corner cases when the code isn't straightforward.

I mostly looked at `ArrayCopyAnalysis` here.

One thing that can be problematic for future patches: the control flow in general should be handled with the ControlFlowInterface <https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Interfaces/ControlFlowInterfaces.td> which may not be yet implemented on the relevant FIR ops. But that had a significant impact on the code inside the passes, so likely something to consider prioritizing downstream.



================
Comment at: flang/include/flang/Optimizer/Builder/FIRBuilder.h:403
+fir::ExtendedValue readBoxValue(fir::FirOpBuilder &builder, mlir::Location loc,
+                                const fir::BoxValue &box);
+
----------------
Is there a use for this method? (same for the other above)


================
Comment at: flang/include/flang/Optimizer/Builder/FIRBuilder.h:417
+fir::ExtendedValue createStringLiteral(fir::FirOpBuilder &, mlir::Location,
+                                       llvm::StringRef string);
+
----------------
No definition added for this declaration?


================
Comment at: flang/include/flang/Optimizer/Builder/FIRBuilder.h:467
+    fir::FirOpBuilder &builder, mlir::Location loc,
+    const fir::ExtendedValue &array, mlir::Value element, mlir::Value slice);
+
----------------
Is this method used somewhere? (Same for the two above)


================
Comment at: flang/include/flang/Optimizer/Transforms/Factory.h:145
+/// \p shapeVal is empty or is a fir.shift.
+inline std::vector<mlir::Value> getExtents(mlir::Value shapeVal) {
+  if (shapeVal)
----------------
Do you know why this is returning a vector instead of an OperandRange? (this is a question for the `ShapeOp` definition).

It seems strange to force copy on all the accessors when they may just want to iterate on the Values...


================
Comment at: flang/include/flang/Optimizer/Transforms/Passes.td:95
+  let dependentDialects = [
+    "fir::FIROpsDialect", "mlir::StandardOpsDialect"
+  ];
----------------
Can you just let me know what fails if you remove any of these?


================
Comment at: flang/lib/Optimizer/Builder/FIRBuilder.cpp:289
+  //   return createConvert(loc, toTy, rp);
+  // }
+  return createConvert(loc, toTy, val);
----------------
Why the commented out code? Can it just be removed?


================
Comment at: flang/lib/Optimizer/Builder/FIRBuilder.cpp:363
 
-static mlir::Value genNullPointerComparison(fir::FirOpBuilder &builder,
-                                            mlir::Location loc,
-                                            mlir::Value addr,
-                                            arith::CmpIPredicate condition) {
+mlir::Value fir::FirOpBuilder::createSlice(mlir::Location loc,
+                                           const fir::ExtendedValue &exv,
----------------
Is this method used?


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:89
+  /// Get all the array value operations that use the original array value
+  /// as passed to `store`.
+  void arrayAccesses(llvm::SmallVectorImpl<mlir::Operation *> &accesses,
----------------
I don't understand this comment (I read it 4 times though).

Edit: the comment on the implementation is actually much more clear: 

```
/// Find all the array operations that access the array value that is loaded by
/// the array load operation, `load`.
```




================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:108
+  ReachCollector(llvm::SmallVectorImpl<mlir::Operation *> &reach,
+                 mlir::Region *loopRegion)
+      : reach{reach}, loopRegion{loopRegion} {}
----------------
A comment around the constructor (or the member variable definition) would be helpful to say:
```
If provided, the `loopRegion` is the body of a loop that produces the array of interest.
```
(or something like that)



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:116
+    }
+    for (auto v : range)
+      collectArrayAccessFrom(v);
----------------



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:120
+
+  void collectArrayAccessFrom(mlir::Operation *op, mlir::Value val) {
+    // `val` is defined by an Op, process the defining Op.
----------------
I am very concerned about a recursive algorithm on def-use chain: this is a ticking bomb for a stack overflow in the compiler.
Can you make this an iterative algorithm with an explicit stack?


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:146
+    if (auto box = mlir::dyn_cast<EmboxOp>(op)) {
+      for (auto *user : box.memref().getUsers())
+        if (user != op)
----------------
(here and elsewhere)


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:155
+      return;
+    }
+
----------------
Of all the `if` above, only the `DoLoopOp` seems tested. There are some specific cases for the treatment of EmboxOp and ArrayMergeStoreOp that likely should be covered


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:229
+  /// `array_merge_store`.
+  static void reachingValues(llvm::SmallVectorImpl<mlir::Operation *> &reach,
+                             mlir::Value seq) {
----------------
This is the only method actually accessed outside of the class, can you make sure it is the only one in the public section?

Everything else could be private (constructor included)


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:234
+    if (auto doLoop =
+            mlir::dyn_cast_or_null<fir::DoLoopOp>(seq.getDefiningOp()))
+      loopRegion = &doLoop->getRegion(0);
----------------
Should this also support other kind of loops? (if so add a TODO)


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:250
+    }
+    return false;
+  }
----------------
Isn't this whole function ` isAncestor`?

`return loopRegion->isAncestor(op->getParentRegion());`

Edit: actually, `loopRegion` may be null here but it isn't obvious from the class invariant. So instead:

`return loopRegion && loopRegion->isAncestor(op->getParentRegion());`



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:259
+    // Go through potential converts/coordinate_op.
+    for (auto indirectUser : op->getUsers())
+      followUsers(indirectUser);
----------------



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:260
+    for (auto indirectUser : op->getUsers())
+      followUsers(indirectUser);
+  }
----------------
Same recursion issue here.


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:277
+  if (lmIter != loadMapSets.end()) {
+    for (auto *opnd : lmIter->second) {
+      auto *owner = opnd->getOwner();
----------------



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:278
+    for (auto *opnd : lmIter->second) {
+      auto *owner = opnd->getOwner();
+      if (mlir::isa<ArrayFetchOp, ArrayUpdateOp, ArrayModifyOp>(owner))
----------------



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:289
+  auto appendToQueue = [&](mlir::Value val) {
+    for (auto &use : val.getUses())
+      if (!visited.count(&use)) {
----------------



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:302
+  while (!queue.empty()) {
+    auto *operand = queue.pop_back_val();
+    auto *owner = operand->getOwner();
----------------



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:303
+    auto *operand = queue.pop_back_val();
+    auto *owner = operand->getOwner();
+    if (!owner)
----------------



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:304
+    auto *owner = operand->getOwner();
+    if (!owner)
+      continue;
----------------
I don't understand how this can happen: an OpOperand does not have a lifetime outside of its owner operation IIRC


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:308
+      if (auto blockArg = ro.iterArgToBlockArg(operand->get())) {
+        auto arg = blockArg.getArgNumber();
+        auto output = ro.getResult(ro.finalValue() ? arg : arg - 1);
----------------
Make the type explicit (you're doing arithmetic `arg - 1` on the next line, I shouldn't have to think about the impact of signedness here)





================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:309
+        auto arg = blockArg.getArgNumber();
+        auto output = ro.getResult(ro.finalValue() ? arg : arg - 1);
+        appendToQueue(output);
----------------
The `finalValue()` check here is also another case not covered in the test.

`finalValue` isn't documented in ODS and `initArgs` isn't documented either, it'd be great if this could be addressed (separately from this revision of course)



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:309
+        auto arg = blockArg.getArgNumber();
+        auto output = ro.getResult(ro.finalValue() ? arg : arg - 1);
+        appendToQueue(output);
----------------
mehdi_amini wrote:
> The `finalValue()` check here is also another case not covered in the test.
> 
> `finalValue` isn't documented in ODS and `initArgs` isn't documented either, it'd be great if this could be addressed (separately from this revision of course)
> 



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:314
+    };
+    auto branchOp = [&](mlir::Block *dest, auto operands) {
+      for (auto i : llvm::enumerate(operands))
----------------



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:315-319
+      for (auto i : llvm::enumerate(operands))
+        if (operand->get() == i.value()) {
+          auto blockArg = dest->getArgument(i.index());
+          appendToQueue(blockArg);
+        }
----------------
This loop isn't great when you already have everything to compute offsets.


In general all the special casing for the Loop, branches, and the control flow in general should be handled with the ControlFlowInterface: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Interfaces/ControlFlowInterfaces.td

This would make the code entirely uniform, and any new op that has to do with the control flow would not require to modify the pass.

I'm fine if you want to handle this later for this pass (please add a big `// TODO: this need to be updated to use the control-flow interface`), but I'm not sure we should upstream many other passes that would be implemented without going through interfaces.



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:328-329
+      // Thread any uses of fir.if that return the marked array value.
+      auto *parent = rs->getParentRegion()->getParentOp();
+      if (auto ifOp = mlir::dyn_cast<fir::IfOp>(parent))
+        appendToQueue(ifOp.getResult(operand->getOperandNumber()));
----------------



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:360
+  }
+  loadMapSets.insert({load, visited});
+}
----------------
Seems like we're storing much more than what we need actually, I tried locally to change `loadMapSets` to store the SmallVector of "accesses" and it is just enough (and it makes subsequent call even easier).

Here is a patch: https://gist.github.com/joker-eph/8f00e79783688bf6349b6b5d9d44a480



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:371
+  auto stEleTy = fir::dyn_cast_ptrOrBoxEleTy(addr.getType());
+  for (auto *op : reach)
+    if (auto ld = mlir::dyn_cast<ArrayLoadOp>(op)) {
----------------
This is a complex body, worth using braces here


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:372
+  for (auto *op : reach)
+    if (auto ld = mlir::dyn_cast<ArrayLoadOp>(op)) {
+      auto ldTy = ld.memref().getType();
----------------
In general, early continue reduces indentation:



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:373-375
+      auto ldTy = ld.memref().getType();
+      if (auto boxTy = ldTy.dyn_cast<fir::BoxType>())
+        ldTy = boxTy.getEleTy();
----------------
If this is a pattern that repeats, it'll be worth having a helper (potentially a method on `ArrayLoadOp` : `Type getUnboxedLoadedType()` or something like that)


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:389
+
+static bool conflictOnMerge(llvm::ArrayRef<mlir::Operation *> accesses) {
+  if (accesses.size() < 2)
----------------
This would be worth a comment I think


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:416
+    } else {
+      mlir::emitError(op->getLoc(), "unexpected operation in analysis");
+    }
----------------
Seems like this should be an assert



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:418-421
+    if (compareVector.size() != indices.size() ||
+        llvm::any_of(llvm::zip(compareVector, indices), [&](auto pair) {
+          return std::get<0>(pair) != std::get<1>(pair);
+        }))
----------------
SmallVector has an `operator==`, is this different?


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:438-442
+  for (auto &region : regions)
+    for (auto &block : region.getBlocks())
+      for (auto &op : block.getOperations()) {
+        if (op.getNumRegions())
+          construct(op.getRegions());
----------------
Isn't this entire loop nest + recursion just a walk?


Also, I'd change the signature of the `construct` method to be:
`void ArrayCopyAnalysis::construct(Operation *topLevelOp) {`

So that you can just write `topLevelOp->walk([&] (Operation *op) {`

(the only reason for the current signature was the recursion as far as I can tell)



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:439
+  for (auto &region : regions)
+    for (auto &block : region.getBlocks())
+      for (auto &op : block.getOperations()) {
----------------
Non trivial nesting should use braces ( https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements )


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:444
+        if (auto st = mlir::dyn_cast<fir::ArrayMergeStoreOp>(op)) {
+          llvm::SmallVector<Operation *> values;
+          ReachCollector::reachingValues(values, st.sequence());
----------------
It may be worth moving all the temporary storage outside of the walk, and then just calling `.clear()` before using them here (which is already done inside `arrayAccesses()` by the way, even though it is for nothing right now). That way they would "grow" only once and the memory can be reused over and over.


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:448
+          arrayAccesses(accesses,
+                        mlir::cast<ArrayLoadOp>(st.original().getDefiningOp()));
+          if (conflictDetected(values, accesses, st)) {
----------------
Where is the guarantee that this cast will always succeed coming from? 
The doc for `ArrayMergeStoreOp` does not mention anything, and there is no verifier for `ArrayMergeStoreOp`


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:468
+            LLVM_DEBUG(llvm::dbgs() << " access: " << *acc << '\n');
+            if (mlir::isa<ArrayFetchOp, ArrayUpdateOp, ArrayModifyOp>(acc)) {
+              if (useMap.count(acc)) {
----------------
Can this check fail? Right now the contract of `arrayAccesses()` is to only return these three operations, and `conflictOnMerge` would fail if there were any other op.
Can this be replace with an assert here?


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:469
+            if (mlir::isa<ArrayFetchOp, ArrayUpdateOp, ArrayModifyOp>(acc)) {
+              if (useMap.count(acc)) {
+                mlir::emitError(
----------------
Same (I think) but with a single map lookup `if (!useMap.insert({acc, &op}).second) {`


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:474
+                    "array_load are not supported.");
+                continue;
+              }
----------------
Seems like this should signal an error? Is this conservative to just "continue" here?


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:659
+                        ArrayOp update,
+                        const std::function<void(mlir::Value)> &assignElement,
+                        mlir::Type lhsEltRefType) const {
----------------
In general, use `llvm::function_ref` instead of `std::function` when you don't take ownership (this avoid potential heap allocation and copies)


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:662
+    auto *op = update.getOperation();
+    auto *loadOp = useMap.lookup(op);
+    auto load = mlir::cast<ArrayLoadOp>(loadOp);
----------------
Alternatively: `auto loadOp = mlir::cast<ArrayLoadOp>(useMap.lookup(op));` and use `loadOp` everywhere instead of having `load` and `loadOp`


================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:673
+      llvm::SmallVector<mlir::Value> extents;
+      auto shapeOp = getOrReadExtentsAndShapeOp(loc, rewriter, load, extents);
+      auto allocmem = rewriter.create<AllocMemOp>(
----------------



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:680
+      rewriter.setInsertionPoint(op);
+      auto coor = genCoorOp(
+          rewriter, loc, getEleTy(load.getType()), lhsEltRefType, allocmem,
----------------



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:699
+    auto coorTy = getEleTy(load.getType());
+    auto coor = genCoorOp(
+        rewriter, loc, coorTy, lhsEltRefType, load.memref(), load.shape(),
----------------



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:794
+    auto loc = fetch.getLoc();
+    auto coor =
+        genCoorOp(rewriter, loc, getEleTy(load.getType()),
----------------



================
Comment at: flang/lib/Optimizer/Transforms/ArrayValueCopy.cpp:823
+    // Phase 1 is performing a rewrite on the array accesses. Once all the
+    // array accesses are rewritten we can go on pahse 2.
+    // Phase 2 gets rid of the useless copy-in/copyout operations. The copy-in
----------------
(typo)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111337/new/

https://reviews.llvm.org/D111337



More information about the llvm-commits mailing list