[all-commits] [llvm/llvm-project] 4cd736: [mlir][SCF] foreach_thread: Capture shared output ...

Matthias Springer via All-commits all-commits at lists.llvm.org
Fri Sep 2 05:54:28 PDT 2022


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 4cd7362083c8801bbc84d2c43b086d1f8f0de93f
      https://github.com/llvm/llvm-project/commit/4cd7362083c8801bbc84d2c43b086d1f8f0de93f
  Author: Matthias Springer <springerm at google.com>
  Date:   2022-09-02 (Fri, 02 Sep 2022)

  Changed paths:
    M mlir/include/mlir/Dialect/SCF/IR/SCFOps.td
    M mlir/include/mlir/Interfaces/ParallelCombiningOpInterface.td
    M mlir/lib/Dialect/Linalg/Transforms/Tiling.cpp
    M mlir/lib/Dialect/SCF/IR/SCF.cpp
    M mlir/lib/Dialect/SCF/Transforms/BufferizableOpInterfaceImpl.cpp
    M mlir/lib/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.cpp
    M mlir/test/Dialect/Linalg/drop-unit-extent-dims.mlir
    M mlir/test/Dialect/Linalg/tile-to-foreach-thread.mlir
    M mlir/test/Dialect/Linalg/transform-op-fuse-into-containing.mlir
    M mlir/test/Dialect/SCF/invalid.mlir
    M mlir/test/Dialect/SCF/one-shot-bufferize-tensor-copy-insertion.mlir
    M mlir/test/Dialect/SCF/one-shot-bufferize.mlir
    M mlir/test/Dialect/SCF/ops.mlir
    M mlir/test/Dialect/Tensor/canonicalize.mlir
    M mlir/test/Dialect/Tensor/one-shot-bufferize.mlir

  Log Message:
  -----------
  [mlir][SCF] foreach_thread: Capture shared output tensors explicitly

This change refines the semantics of scf.foreach_thread. Tensors that the terminator inserts into must now be passed to the region explicitly via `shared_outs`. Inside the body of the op, those tensors are then accessed via block arguments.
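
For illustration, a minimal sketch of the new form, assuming the assembly format introduced by this commit. The function name, the value names, and the shapes are hypothetical and not taken from the patch or its tests:

```mlir
// Hypothetical example: 32 threads each write a 4-element chunk of ones.
func.func @fill_chunks(%init: tensor<128xf32>) -> tensor<128xf32> {
  %c4 = arith.constant 4 : index
  %c32 = arith.constant 32 : index
  %cst = arith.constant 1.0 : f32
  // %init is captured explicitly; inside the body it is visible as %out.
  %result = scf.foreach_thread (%tid) in (%c32)
      shared_outs(%out = %init) -> (tensor<128xf32>) {
    %offset = arith.muli %tid, %c4 : index
    %chunk = tensor.splat %cst : tensor<4xf32>
    scf.foreach_thread.perform_concurrently {
      // The insertion targets the block argument %out, not %init.
      tensor.parallel_insert_slice %chunk into %out[%offset] [4] [1]
          : tensor<4xf32> into tensor<128xf32>
    }
  }
  return %result : tensor<128xf32>
}
```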

The body of an scf.foreach_thread is now treated as a repetitive region. That is, op dominance can no longer be used in conflict detection for uses of a value that is defined outside the body. Such uses may now be considered conflicts (if there is at least one read and one write in the body), which effectively privatizes the tensor. Shared outputs are not privatized when they are used via their corresponding block arguments.

As part of this change, it was also necessary to update tiling to scf.foreach_thread so that the generated tensor.extract_slice ops use the scf.foreach_thread's block arguments. This is implemented by cloning the TilingInterface op inside the scf.foreach_thread, rewriting all of its outputs with block arguments, and then calling the tiling implementation. Afterwards, the cloned op is deleted.
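
As a rough sketch of what tiled IR might look like after this change (not actual output of the tiling pass), the output slice below is extracted from the block argument rather than from the tensor defined outside the loop. The choice of linalg.fill as the tiled op, the function name, and the tile sizes are assumptions made for illustration:

```mlir
// Hypothetical tiled form of a linalg.fill over a 128-element tensor.
func.func @tiled_fill(%init: tensor<128xf32>, %cst: f32) -> tensor<128xf32> {
  %c4 = arith.constant 4 : index
  %c32 = arith.constant 32 : index
  %result = scf.foreach_thread (%tid) in (%c32)
      shared_outs(%out = %init) -> (tensor<128xf32>) {
    %offset = arith.muli %tid, %c4 : index
    // The slice is taken from the block argument %out, not from %init.
    %out_slice = tensor.extract_slice %out[%offset] [4] [1]
        : tensor<128xf32> to tensor<4xf32>
    %tile = linalg.fill ins(%cst : f32) outs(%out_slice : tensor<4xf32>)
        -> tensor<4xf32>
    scf.foreach_thread.perform_concurrently {
      tensor.parallel_insert_slice %tile into %out[%offset] [4] [1]
          : tensor<4xf32> into tensor<128xf32>
    }
  }
  return %result : tensor<128xf32>
}
```

Because both the read (tensor.extract_slice) and the write (tensor.parallel_insert_slice) go through %out, the shared output should not be privatized under the rules described above.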

Differential Revision: https://reviews.llvm.org/D133114



