[Mlir-commits] [mlir] f287da8 - [mlir][bufferize] Better user control of layout maps

Mon May 16 09:09:18 PDT 2022

Author: Matthias Springer
Date: 2022-05-16T18:06:13+02:00
New Revision: f287da8a158113840aeabd04b641d0d2815212f2

URL: https://github.com/llvm/llvm-project/commit/f287da8a158113840aeabd04b641d0d2815212f2
DIFF: https://github.com/llvm/llvm-project/commit/f287da8a158113840aeabd04b641d0d2815212f2.diff

LOG: [mlir][bufferize] Better user control of layout maps

This change replaces the `fully-dynamic-layout-maps` option (which was badly named) with two new options:

* `unknown-type-conversion` controls the layout maps on buffer types for which no layout map can be inferred.
* `function-boundary-type-conversion` controls the layout maps on buffer types inside of function signatures.
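
The value space of the two options can be sketched in standalone C++ as follows. This is only an illustration under stated assumptions, not the code from the patch: the enum mirrors `LayoutMapOption` from the diff, but `isValidUnknownTypeConversion`, `unknownTypeLayout` and `functionParameterLayout` are hypothetical helper names, and the parser here returns `std::nullopt` for unrecognized strings where the patch uses `llvm_unreachable`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

// Mirrors BufferizationOptions::LayoutMapOption as added by this commit.
enum class LayoutMapOption : std::int8_t {
  InferLayoutMap = 0,
  IdentityLayoutMap = 1,
  FullyDynamicLayoutMap = 2
};

// Maps the pass option strings to enum values. Assumption: unknown strings
// yield std::nullopt here (the patch itself aborts via llvm_unreachable).
std::optional<LayoutMapOption> parseLayoutMapOption(std::string_view s) {
  if (s == "fully-dynamic-layout-map")
    return LayoutMapOption::FullyDynamicLayoutMap;
  if (s == "identity-layout-map")
    return LayoutMapOption::IdentityLayoutMap;
  if (s == "infer-layout-map")
    return LayoutMapOption::InferLayoutMap;
  return std::nullopt;
}

// Hypothetical helper: `unknown-type-conversion` accepts every value except
// `infer-layout-map`, because there is nothing to infer from in that case.
bool isValidUnknownTypeConversion(LayoutMapOption option) {
  return option != LayoutMapOption::InferLayoutMap;
}

// Hypothetical helper: layout assumed when no precise layout can be inferred.
// The assert mirrors the check the patch adds to bufferizeOp.
LayoutMapOption unknownTypeLayout(LayoutMapOption unknownTypeConversion) {
  assert(isValidUnknownTypeConversion(unknownTypeConversion) &&
         "invalid layout map option");
  return unknownTypeConversion;
}

// Hypothetical helper: layout used for function parameters. Parameter layouts
// cannot be inferred, so `infer-layout-map` falls back to fully dynamic.
LayoutMapOption functionParameterLayout(LayoutMapOption functionBoundary) {
  if (functionBoundary == LayoutMapOption::IdentityLayoutMap)
    return LayoutMapOption::IdentityLayoutMap;
  return LayoutMapOption::FullyDynamicLayoutMap;
}
```

With these helpers, the removed spelling `fully-dynamic-layout-maps` no longer parses, and `infer-layout-map` is accepted only for `function-boundary-type-conversion`.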

Differential Revision: https://reviews.llvm.org/D125615

Added: 
    

Modified: 
    mlir/docs/Bufferization.md
    mlir/include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h
    mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td
    mlir/lib/Dialect/Bufferization/IR/BufferizableOpInterface.cpp
    mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp
    mlir/lib/Dialect/Bufferization/Transforms/FuncBufferizableOpInterfaceImpl.cpp
    mlir/lib/Dialect/Bufferization/Transforms/OneShotModuleBufferize.cpp
    mlir/test/Dialect/Arithmetic/one-shot-bufferize.mlir
    mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-partial.mlir
    mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-allow-return-allocs.mlir
    mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize.mlir
    mlir/test/Dialect/Linalg/one-shot-bufferize.mlir
    mlir/test/Dialect/SCF/one-shot-bufferize.mlir
    mlir/test/Dialect/Tensor/one-shot-bufferize.mlir

Removed: 
    


################################################################################
diff --git a/mlir/docs/Bufferization.md b/mlir/docs/Bufferization.md
index 453794904cd04..372947a06c89f 100644
--- a/mlir/docs/Bufferization.md
+++ b/mlir/docs/Bufferization.md
@@ -336,16 +336,29 @@ simpler memref type (e.g., identity layout map), we expect that canonicalization
 patterns would clean up unnecessarily dynamic layout maps. (Some of these
 canonicalization patterns may not be implemented yet.)
 
-Note that One-Shot Bufferize always generates the most specific memref type when
-the entire IR is bufferizable. In that case, we do not have to rely on
-canonicalization patterns to clean up the bufferized IR.
-
-One-Shot Bufferize can be configured to always generate memref types with
-identity layout when the exact target memref type is not known via
-`fully-dynamic-layout-maps=0`. This can be useful for legacy code that cannot
-handle memref types with layout maps. Note that this leads to additional buffer
-copies when folding a `to_tensor`/`to_memref` pair with memref types that are
-not cast-compatible.
+One-Shot Bufferize tries to infer the most precise memref type when bufferizing
+an op. If the entire IR is bufferizable, we do not have to resort to
+conservatively using fully dynamic layout maps. In that case, we also do not
+have to rely on canonicalization patterns to clean up the bufferized IR.
+
+Note: There are some bufferizable ops for which a precise layout map cannot be
+inferred. E.g., a `tensor.cast` from a `tensor<*xf32>` to a `tensor<?x?xf32>`
+must be bufferized to a `memref.cast` with a memref type that has a fully
+dynamic layout map.
+
+One-Shot Bufferize has an option `unknown-type-conversion` to control the
+generation of layout maps when no precise layout can be inferred:
+
+*   `fully-dynamic-layout-map` uses fully dynamic layout maps and is the default
+    behavior. This composes well when IR is partially bufferized.
+*   `identity-layout-map` uses static identity layout maps. This option can be
+    useful for legacy code that cannot handle memref types with layout maps.
+    Note that this setting can lead to additional buffer copies when folding a
+    `to_tensor`/`to_memref` pair with memref types that are not cast-compatible.
+
+Note: The `unknown-type-conversion` option does not affect layout maps of
+function signatures. There is a separate `function-boundary-type-conversion`
+option that controls layout maps of function parameters and function results.
 
 ## Extending One-Shot Bufferize
 

diff --git a/mlir/include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h b/mlir/include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h
index a6233295cb650..ea0a3e6fa85a3 100644
--- a/mlir/include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h
+++ b/mlir/include/mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h
@@ -62,6 +62,12 @@ struct BufferizationOptions {
     FilterType type;
   };
 
+  enum class LayoutMapOption : int8_t {
+    InferLayoutMap = 0,
+    IdentityLayoutMap = 1,
+    FullyDynamicLayoutMap = 2
+  };
+
   BufferizationOptions();
 
   /// Return `true` if the filter has at least one ALLOW rule.
@@ -201,6 +207,45 @@ struct BufferizationOptions {
   /// bufferized or not.
   bool bufferizeFunctionBoundaries = false;
 
+  /// This flag controls buffer types on function signatures.
+  ///
+  /// * InferLayoutMap: All function parameter types have a fully dynamic layout
+  ///   map, but function result types are inferred from the body of the
+  ///   function.
+  /// * FullyDynamicLayoutMap: All function parameter types and result types
+  ///   have a fully dynamic layout map. This option is most efficient because
+  ///   any layout map can be cast to a fully dynamic one.
+  /// * IdentityLayoutMap: All function parameter types and result types have a
+  ///   static identity layout (i.e., no layout map). This option may introduce
+  ///   additional buffer allocs and copies because layout maps cannot be cast
+  ///   away.
+  ///
+  /// If `bufferizeFunctionBoundaries` is not set, this flag has no effect. If
+  /// `promoteBufferResultsToOutParams` is set, `InferLayoutMap` is an invalid
+  /// option.
+  ///
+  /// Note: Inferred layout maps may not be desirable when interacting with
+  /// external functions, because the generated function signatures will be less
+  /// predictable.
+  LayoutMapOption functionBoundaryTypeConversion =
+      LayoutMapOption::InferLayoutMap;
+
+  /// This flag controls buffer types on unknown ops (to_memref wrappers) and in
+  /// other cases where a precise memref type cannot be inferred (e.g., the
+  /// bufferization of "tensor.cast").
+  ///
+  /// * InferLayoutMap: This option is invalid and cannot be used.
+  /// * FullyDynamicLayoutMap: Assume that unknown ops have results with fully
+  ///   dynamic layout maps after bufferization. This option is most efficient
+  ///   because any layout map can be cast to a fully dynamic one.
+  /// * IdentityLayoutMap: Assume that unknown ops have results with static
+  ///   identity layout (i.e., no layout map) after bufferization. This option
+  ///   introduces additional buffer allocs and copies if the unknown op is
+  ///   eventually bufferized to an op that returns a buffer with non-identity
+  ///   layout.
+  LayoutMapOption unknownTypeConversion =
+      LayoutMapOption::FullyDynamicLayoutMap;
+
   /// Specifies whether dealloc ops should be generated along with alloc ops. If
   /// not, new memory allocations will leak.
   bool createDeallocs = true;
@@ -209,10 +254,6 @@ struct BufferizationOptions {
   /// Should be used only with `testAnalysisOnly = true`.
   unsigned analysisFuzzerSeed = 0;
 
-  /// Specifies whether fully dynamic layout maps should be used on ranked
-  /// MemRef types. If false, MemRef types will have no layout maps.
-  bool fullyDynamicLayoutMaps = true;
-
   /// If set to `true`, does not modify the IR apart from adding attributes (for
   /// checking the results of the analysis) and post analysis steps.
   bool testAnalysisOnly = false;
@@ -554,9 +595,9 @@ OpTy replaceOpWithNewBufferizedOp(RewriterBase &rewriter, Operation *op,
 /// If possible, op bufferization implementations should not use this function
 /// and instead infer precise memref types for tensor results by themselves.
 ///
-/// Unless a layout map was specified, `options` flags determine what kind of
-/// layout map will be used. For best composability (without copies), the fully
-/// dynamic layout map is used by default.
+/// Unless a layout map was specified, `options.unknownTypeConversion`
+/// determines what kind of layout map will be used. For best composability
+/// (without copies), the fully dynamic layout map is used by default.
 ///
 /// Note: Canonicalization patterns could clean up layout maps and infer more
 /// precise layout maps after bufferization. However, many possible
@@ -566,6 +607,17 @@ BaseMemRefType getMemRefType(TensorType tensorType,
                              MemRefLayoutAttrInterface layout = {},
                              Attribute memorySpace = {});
 
+/// Return a MemRef type with fully dynamic layout. If the given tensor type
+/// is unranked, return an unranked MemRef type.
+BaseMemRefType getMemRefTypeWithFullyDynamicLayout(TensorType tensorType,
+                                                   Attribute memorySpace = {});
+
+/// Return a MemRef type with a static identity layout (i.e., no layout map). If
+/// the given tensor type is unranked, return an unranked MemRef type.
+BaseMemRefType
+getMemRefTypeWithStaticIdentityLayout(TensorType tensorType,
+                                      Attribute memorySpace = {});
+
 /// Try to hoist all new buffer allocations until the next hoisting barrier.
 LogicalResult hoistBufferAllocations(Operation *op,
                                      const BufferizationOptions &options);

diff --git a/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td b/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td
index a19f92cca6902..f9fc5a4ef1a0c 100644
--- a/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td
+++ b/mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td
@@ -196,6 +196,19 @@ def OneShotBufferize : Pass<"one-shot-bufferize", "ModuleOp"> {
     when encountering an op that is not included in the filter (even if it is
     bufferizable).
 
+    One-Shot Bufferize will by default assume memref types with fully dynamic
+    layout maps when a precise layout cannot be inferred. E.g., this is the case
+    when wrapping a non-bufferizable op in to_memref/to_tensor ops. This
+    behavior can be overridden with `unknown-type-conversion`. Valid values are
+    `fully-dynamic-layout-map` and `identity-layout-map`.
+
+    Layout maps on function signatures can be controlled with a separate
+    `function-boundary-type-conversion` option, which can be set to
+    `infer-layout-map` in addition to the two possible values mentioned above.
+    When layout maps are inferred, function return types may be more precise.
+    Function argument types cannot be inferred and have fully dynamic layout
+    maps in that case.
+
     For testing/debugging purposes, `test-analysis-only=1 print-conflicts=1`
     prints analysis results and explains why an OpOperand was decided to
     bufferize out-of-place. This is useful for understanding why One-Shot
@@ -254,9 +267,10 @@ def OneShotBufferize : Pass<"one-shot-bufferize", "ModuleOp"> {
            "core bufferization passes.">,
     ListOption<"dialectFilter", "dialect-filter", "std::string",
                "Restrict bufferization to ops from these dialects.">,
-    Option<"fullyDynamicLayoutMaps", "fully-dynamic-layout-maps", "bool",
-           /*default=*/"true",
-           "Generate MemRef types with dynamic offset+strides by default.">,
+    Option<"functionBoundaryTypeConversion",
+           "function-boundary-type-conversion", "std::string",
+           /*default=*/"\"infer-layout-map\"",
+           "Controls layout maps when bufferizing function signatures.">,
     Option<"testAnalysisOnly", "test-analysis-only", "bool",
             /*default=*/"false",
            "Test only: Only run inplaceability analysis and annotate IR">,
@@ -267,6 +281,9 @@ def OneShotBufferize : Pass<"one-shot-bufferize", "ModuleOp"> {
     Option<"promoteBufferResultsToOutParams",
            "promote-buffer-results-to-out-params", "bool", /*default=*/"false",
            "Replace returned buffers (that were not dropped) with out params.">,
+    Option<"unknownTypeConversion", "unknown-type-conversion", "std::string",
+           /*default=*/"\"fully-dynamic-layout-map\"",
+           "Controls layout maps for non-inferrable memref types.">,
   ];
   let constructor = "mlir::bufferization::createOneShotBufferizePass()";
 }

diff --git a/mlir/lib/Dialect/Bufferization/IR/BufferizableOpInterface.cpp b/mlir/lib/Dialect/Bufferization/IR/BufferizableOpInterface.cpp
index 75b05168902ef..be7cf209d46a6 100644
--- a/mlir/lib/Dialect/Bufferization/IR/BufferizableOpInterface.cpp
+++ b/mlir/lib/Dialect/Bufferization/IR/BufferizableOpInterface.cpp
@@ -662,19 +662,38 @@ BaseMemRefType bufferization::getMemRefType(TensorType tensorType,
                                    memorySpace);
   }
 
-  // Case 2: Ranked memref type with specified layout. If fully dynamic layout
-  // maps are not requested, generate a type with `layout`, which is empty (no
-  // layout map) by default.
+  // Case 2: Ranked memref type with specified layout.
   auto rankedTensorType = tensorType.cast<RankedTensorType>();
-  if (layout || !options.fullyDynamicLayoutMaps) {
+  if (layout) {
     return MemRefType::get(rankedTensorType.getShape(),
                            rankedTensorType.getElementType(), layout,
                            memorySpace);
   }
 
-  // Case 3: Ranked memref type with unspecified layout. Choose the most dynamic
-  // one.
-  // TODO: address space decisions to connect with the actual alloc.
+  // Case 3: Configured with "fully dynamic layout maps".
+  if (options.unknownTypeConversion ==
+      BufferizationOptions::LayoutMapOption::FullyDynamicLayoutMap)
+    return getMemRefTypeWithFullyDynamicLayout(tensorType, memorySpace);
+
+  // Case 4: Configured with "static identity layout maps".
+  if (options.unknownTypeConversion ==
+      BufferizationOptions::LayoutMapOption::IdentityLayoutMap)
+    return getMemRefTypeWithStaticIdentityLayout(tensorType, memorySpace);
+
+  llvm_unreachable("InferLayoutMap is an invalid option");
+}
+
+BaseMemRefType
+bufferization::getMemRefTypeWithFullyDynamicLayout(TensorType tensorType,
+                                                   Attribute memorySpace) {
+  // Case 1: Unranked memref type.
+  if (auto unrankedTensorType = tensorType.dyn_cast<UnrankedTensorType>()) {
+    return UnrankedMemRefType::get(unrankedTensorType.getElementType(),
+                                   memorySpace);
+  }
+
+  // Case 2: Ranked memref type.
+  auto rankedTensorType = tensorType.cast<RankedTensorType>();
   int64_t dynamicOffset = ShapedType::kDynamicStrideOrOffset;
   SmallVector<int64_t> dynamicStrides(rankedTensorType.getRank(),
                                       ShapedType::kDynamicStrideOrOffset);
@@ -684,3 +703,22 @@ BaseMemRefType bufferization::getMemRefType(TensorType tensorType,
                          rankedTensorType.getElementType(), stridedLayout,
                          memorySpace);
 }
+
+/// Return a MemRef type with a static identity layout (i.e., no layout map). If
+/// the given tensor type is unranked, return an unranked MemRef type.
+BaseMemRefType
+bufferization::getMemRefTypeWithStaticIdentityLayout(TensorType tensorType,
+                                                     Attribute memorySpace) {
+  // Case 1: Unranked memref type.
+  if (auto unrankedTensorType = tensorType.dyn_cast<UnrankedTensorType>()) {
+    return UnrankedMemRefType::get(unrankedTensorType.getElementType(),
+                                   memorySpace);
+  }
+
+  // Case 2: Ranked memref type.
+  auto rankedTensorType = tensorType.cast<RankedTensorType>();
+  MemRefLayoutAttrInterface layout = {};
+  return MemRefType::get(rankedTensorType.getShape(),
+                         rankedTensorType.getElementType(), layout,
+                         memorySpace);
+}

diff --git a/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp b/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp
index eb0aeaba0e65b..0727b1ce9da83 100644
--- a/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp
+++ b/mlir/lib/Dialect/Bufferization/Transforms/Bufferize.cpp
@@ -151,6 +151,17 @@ struct FinalizingBufferizePass
   }
 };
 
+static BufferizationOptions::LayoutMapOption
+parseLayoutMapOption(std::string s) {
+  if (s == "fully-dynamic-layout-map")
+    return BufferizationOptions::LayoutMapOption::FullyDynamicLayoutMap;
+  if (s == "identity-layout-map")
+    return BufferizationOptions::LayoutMapOption::IdentityLayoutMap;
+  if (s == "infer-layout-map")
+    return BufferizationOptions::LayoutMapOption::InferLayoutMap;
+  llvm_unreachable("invalid layout map option");
+}
+
 struct OneShotBufferizePass
     : public OneShotBufferizeBase<OneShotBufferizePass> {
   OneShotBufferizePass() : OneShotBufferizeBase<OneShotBufferizePass>() {}
@@ -175,11 +186,13 @@ struct OneShotBufferizePass
       opt.alwaysAliasingWithDest = alwaysAliasingWithDest;
       opt.analysisFuzzerSeed = analysisFuzzerSeed;
       opt.createDeallocs = createDeallocs;
-      opt.fullyDynamicLayoutMaps = fullyDynamicLayoutMaps;
+      opt.functionBoundaryTypeConversion =
+          parseLayoutMapOption(functionBoundaryTypeConversion);
       opt.printConflicts = printConflicts;
       opt.testAnalysisOnly = testAnalysisOnly;
       opt.bufferizeFunctionBoundaries = bufferizeFunctionBoundaries;
       opt.promoteBufferResultsToOutParams = promoteBufferResultsToOutParams;
+      opt.unknownTypeConversion = parseLayoutMapOption(unknownTypeConversion);
 
       BufferizationOptions::OpFilterEntry::FilterFn filterFn =
           [&](Operation *op) {
@@ -362,6 +375,9 @@ LogicalResult
 bufferization::bufferizeOp(Operation *op,
                            BufferizationState &bufferizationState) {
   const auto &options = bufferizationState.getOptions();
+  assert(options.unknownTypeConversion !=
+             BufferizationOptions::LayoutMapOption::InferLayoutMap &&
+         "invalid layout map option");
 
   // Keep track of to_memref ops.
   DenseSet<Operation *> toMemrefOps;
@@ -371,13 +387,9 @@ bufferization::bufferizeOp(Operation *op,
   //
   // We should ideally know the exact memref type of all operands when
   // bufferizing an op. (This is the case when bufferizing top-to-bottom.)
-  // Otherwise, we have to use a memref type with a fully dynamic layout map,
-  // which has to canonicalize away. This is less efficient.
-  //
-  // If "fullyDynamicLayoutMaps = false", we would have to insert buffer copies
-  // to fold ("finalize") to_memref(to_tensor(x)) ops with non-cast-compatible
-  // layout maps when doing a traversal other than top-to-bottom. These would
-  // not easily fold away.
+  // Otherwise, we have to use a memref type with a fully dynamic layout map to
+  // avoid copies. We are currently missing patterns for layout maps to
+  // canonicalize away (or canonicalize to more precise layouts).
   SmallVector<Operation *> worklist;
   op->walk<WalkOrder::PreOrder>([&](Operation *op) {
     if (hasTensorSemantics(op))
@@ -478,6 +490,7 @@ BufferizationOptions bufferization::getPartialBufferizationOptions() {
   BufferizationOptions options;
   options.allowUnknownOps = true;
   options.createDeallocs = false;
-  options.fullyDynamicLayoutMaps = false;
+  options.unknownTypeConversion =
+      BufferizationOptions::LayoutMapOption::IdentityLayoutMap;
   return options;
 }

diff --git a/mlir/lib/Dialect/Bufferization/Transforms/FuncBufferizableOpInterfaceImpl.cpp b/mlir/lib/Dialect/Bufferization/Transforms/FuncBufferizableOpInterfaceImpl.cpp
index dda46141e5382..a6a6935bc18da 100644
--- a/mlir/lib/Dialect/Bufferization/Transforms/FuncBufferizableOpInterfaceImpl.cpp
+++ b/mlir/lib/Dialect/Bufferization/Transforms/FuncBufferizableOpInterfaceImpl.cpp
@@ -58,15 +58,24 @@ static func::ReturnOp getAssumedUniqueReturnOp(FuncOp funcOp) {
 
 /// Return the index-th bufferized function argument type. This assumes that the
 /// specified argument is a tensor. If the tensor is ranked, a layout map may be
-/// specified by the user. If no layout map is specified, a fully dynamic map is
-/// used.
+/// specified by the user. If no layout map is specified, the default layout map
+/// (as per `options.functionBoundaryTypeConversion`) is used.
 static BaseMemRefType
 getBufferizedFunctionArgType(FuncOp funcOp, int64_t index,
                              const BufferizationOptions &options) {
   auto tensorType =
       funcOp.getFunctionType().getInput(index).dyn_cast<TensorType>();
   assert(tensorType && "expected TensorType");
-  BaseMemRefType memrefType = getMemRefType(tensorType, options);
+
+  BaseMemRefType memrefType;
+  if (options.functionBoundaryTypeConversion ==
+      BufferizationOptions::LayoutMapOption::IdentityLayoutMap) {
+    memrefType = getMemRefTypeWithStaticIdentityLayout(tensorType);
+  } else {
+    // Note: Layout maps on function parameters cannot be inferred. The best we
+    // can do at the moment is "fully dynamic".
+    memrefType = getMemRefTypeWithFullyDynamicLayout(tensorType);
+  }
 
   auto layoutAttr = funcOp.getArgAttrOfType<AffineMapAttr>(
       index, BufferizationDialect::kBufferLayoutAttrName);
@@ -386,11 +395,10 @@ struct ReturnOpInterface
 
 struct FuncOpInterface
     : public BufferizableOpInterface::ExternalModel<FuncOpInterface, FuncOp> {
-  /// Rewrite function bbArgs and return values into buffer form (using the
-  /// canonical memref layout for now). This function bufferizes the function
-  /// signature and the ReturnOp. When the entire function body has been
-  /// bufferized, function return types can be switched to more concise memref
-  /// types as part of `foldMemRefCasts`.
+  /// Rewrite function bbArgs and return values into buffer form. This function
+  /// bufferizes the function signature and the ReturnOp. When the entire
+  /// function body has been bufferized, function return types can be switched
+  /// to more concise memref types as part of `foldMemRefCasts`.
   ///
   /// When a tensor function argument is known to be equivalent to a tensor
   /// result, it is dropped from the return values.
@@ -439,6 +447,7 @@ struct FuncOpInterface
     // TODO: Support functions with multiple returns.
     func::ReturnOp returnOp = getAssumedUniqueReturnOp(funcOp);
     assert(returnOp && "expected func with single return op");
+    Location loc = returnOp.getLoc();
 
     // 1. Rewrite the bbArgs. Turn every tensor bbArg into a memref bbArg.
     Block &frontBlock = funcOp.getBody().front();
@@ -474,9 +483,11 @@ struct FuncOpInterface
     SmallVector<Value> returnValues;
     for (OpOperand &returnOperand : returnOp->getOpOperands()) {
       Value returnVal = returnOperand.get();
+      auto tensorType = returnVal.getType().dyn_cast<TensorType>();
+      rewriter.setInsertionPoint(returnOp);
 
       // If not a tensor type just forward it.
-      if (!returnVal.getType().isa<RankedTensorType>()) {
+      if (!tensorType) {
         returnValues.push_back(returnVal);
         continue;
       }
@@ -485,12 +496,10 @@ struct FuncOpInterface
       if (options.dropEquivalentFuncResults) {
         if (Optional<int64_t> equivBbArgIdx = getEquivalentFuncArgIdx(
                 funcOp, funcState, returnOperand.getOperandNumber())) {
-          rewriter.setInsertionPoint(returnOp);
-          Location loc = returnOp.getLoc();
+          // TODO: Use memref type with fully dynamic layout map and add folder
+          // for memref.cast + memref.copy.
           Value toMemrefOp = rewriter.create<bufferization::ToMemrefOp>(
-              loc,
-              getMemRefType(returnVal.getType().cast<TensorType>(), options),
-              returnVal);
+              loc, getMemRefType(tensorType, options), returnVal);
           BlockArgument equivBbArg = funcOp.getArgument(*equivBbArgIdx);
           // Note: This copy will fold away. It must be inserted here to ensure
           // that `returnVal` still has at least one use and does not fold away.
@@ -501,7 +510,17 @@ struct FuncOpInterface
         }
       }
 
-      returnValues.push_back(*state.getBuffer(rewriter, returnOperand));
+      BaseMemRefType resultType;
+      if (options.functionBoundaryTypeConversion ==
+          BufferizationOptions::LayoutMapOption::IdentityLayoutMap) {
+        resultType = getMemRefTypeWithStaticIdentityLayout(tensorType);
+      } else {
+        // Note: If `InferLayoutMap`, casts are later folded away.
+        resultType = getMemRefTypeWithFullyDynamicLayout(tensorType);
+      }
+      Value toMemrefOp = rewriter.create<bufferization::ToMemrefOp>(
+          loc, resultType, returnVal);
+      returnValues.push_back(toMemrefOp);
     }
 
     // 3. Rewrite the terminator without the in-place bufferizable values.

diff --git a/mlir/lib/Dialect/Bufferization/Transforms/OneShotModuleBufferize.cpp b/mlir/lib/Dialect/Bufferization/Transforms/OneShotModuleBufferize.cpp
index 6dc3432f46a63..805467be2088c 100644
--- a/mlir/lib/Dialect/Bufferization/Transforms/OneShotModuleBufferize.cpp
+++ b/mlir/lib/Dialect/Bufferization/Transforms/OneShotModuleBufferize.cpp
@@ -471,7 +471,10 @@ LogicalResult mlir::bufferization::runOneShotModuleBufferize(
     // would be invalidated.
     if (failed(bufferizeOp(funcOp, bufferizationState)))
       return failure();
-    foldMemRefCasts(funcOp);
+    // Change buffer return types to more precise layout maps.
+    if (options.functionBoundaryTypeConversion ==
+        BufferizationOptions::LayoutMapOption::InferLayoutMap)
+      foldMemRefCasts(funcOp);
   }
 
   // Check result.

diff --git a/mlir/test/Dialect/Arithmetic/one-shot-bufferize.mlir b/mlir/test/Dialect/Arithmetic/one-shot-bufferize.mlir
index 4523981ea3221..9822913b69a06 100644
--- a/mlir/test/Dialect/Arithmetic/one-shot-bufferize.mlir
+++ b/mlir/test/Dialect/Arithmetic/one-shot-bufferize.mlir
@@ -6,7 +6,7 @@
 // RUN: mlir-opt %s -one-shot-bufferize="allow-return-allocs test-analysis-only analysis-fuzzer-seed=91 bufferize-function-boundaries" -split-input-file -o /dev/null
 
 // Test bufferization using memref types that have no layout map.
-// RUN: mlir-opt %s -one-shot-bufferize="allow-return-allocs fully-dynamic-layout-maps=0 bufferize-function-boundaries" -split-input-file -o /dev/null
+// RUN: mlir-opt %s -one-shot-bufferize="allow-return-allocs unknown-type-conversion=identity-layout-map function-boundary-type-conversion=identity-layout-map bufferize-function-boundaries" -split-input-file -o /dev/null
 
 // CHECK-LABEL: func @write_to_select_op_source
 //  CHECK-SAME:     %[[t1:.*]]: memref<?xf32, #{{.*}}>, %[[t2:.*]]: memref<?xf32, #{{.*}}>

diff --git a/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-partial.mlir b/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-partial.mlir
index 514a9a895db04..4717881f522c4 100644
--- a/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-partial.mlir
+++ b/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-partial.mlir
@@ -1,7 +1,7 @@
 // RUN: mlir-opt %s -allow-unregistered-dialect -one-shot-bufferize="allow-return-allocs allow-unknown-ops" -split-input-file | FileCheck %s
 
 // Test bufferization using memref types that have no layout map.
-// RUN: mlir-opt %s -allow-unregistered-dialect -one-shot-bufferize="allow-return-allocs allow-unknown-ops fully-dynamic-layout-maps=0" -split-input-file | FileCheck %s --check-prefix=CHECK-NO-LAYOUT-MAP
+// RUN: mlir-opt %s -allow-unregistered-dialect -one-shot-bufferize="allow-return-allocs allow-unknown-ops unknown-type-conversion=identity-layout-map" -split-input-file | FileCheck %s --check-prefix=CHECK-NO-LAYOUT-MAP
 
 // Run fuzzer with different seeds.
 // RUN: mlir-opt %s -allow-unregistered-dialect -one-shot-bufferize="allow-return-allocs test-analysis-only analysis-fuzzer-seed=23" -split-input-file -o /dev/null

diff --git a/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-allow-return-allocs.mlir b/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-allow-return-allocs.mlir
index 4ef9995c07921..888d8a10e135e 100644
--- a/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-allow-return-allocs.mlir
+++ b/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-allow-return-allocs.mlir
@@ -7,7 +7,7 @@
 // RUN: mlir-opt %s -one-shot-bufferize="bufferize-function-boundaries=1 allow-return-allocs test-analysis-only analysis-fuzzer-seed=91" -split-input-file -o /dev/null
 
 // Test bufferization using memref types that have no layout map.
-// RUN: mlir-opt %s -one-shot-bufferize="bufferize-function-boundaries=1 allow-return-allocs fully-dynamic-layout-maps=0" -split-input-file -o /dev/null
+// RUN: mlir-opt %s -one-shot-bufferize="bufferize-function-boundaries=1 allow-return-allocs unknown-type-conversion=identity-layout-map function-boundary-type-conversion=identity-layout-map" -split-input-file -o /dev/null
 
 // Make sure that the returned buffer is not deallocated.
 // TODO: Such buffers currently leak. We need buffer hoisting / ref counting for

diff --git a/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize.mlir b/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize.mlir
index 41e61ac98b561..d0316464eaae4 100644
--- a/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize.mlir
+++ b/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize.mlir
@@ -1,4 +1,5 @@
-// RUN: mlir-opt %s -one-shot-bufferize="bufferize-function-boundaries=1" -split-input-file | FileCheck %s
+// Note: Default is function-boundary-type-conversion=infer-layout-map
+// RUN: mlir-opt %s -one-shot-bufferize="bufferize-function-boundaries=1 allow-return-allocs" -split-input-file | FileCheck %s
 
 // Run fuzzer with different seeds.
 // RUN: mlir-opt %s -one-shot-bufferize="bufferize-function-boundaries=1 allow-return-allocs test-analysis-only analysis-fuzzer-seed=23" -split-input-file -o /dev/null
@@ -6,14 +7,29 @@
 // RUN: mlir-opt %s -one-shot-bufferize="bufferize-function-boundaries=1 allow-return-allocs test-analysis-only analysis-fuzzer-seed=91" -split-input-file -o /dev/null
 
 // Test bufferization using memref types that have no layout map.
-// RUN: mlir-opt %s -one-shot-bufferize="bufferize-function-boundaries=1 allow-return-allocs fully-dynamic-layout-maps=0" -split-input-file | FileCheck %s --check-prefix=CHECK-NO-LAYOUT-MAP-LABEL
+// RUN: mlir-opt %s -one-shot-bufferize="bufferize-function-boundaries=1 allow-return-allocs unknown-type-conversion=identity-layout-map function-boundary-type-conversion=identity-layout-map" -split-input-file | FileCheck %s --check-prefix=CHECK-NO-LAYOUT-MAP
+
+// Test bufferization using memref types that have fully dynamic layout maps.
+// RUN: mlir-opt %s -one-shot-bufferize="bufferize-function-boundaries=1 allow-return-allocs function-boundary-type-conversion=fully-dynamic-layout-map" -split-input-file | FileCheck %s --check-prefix=CHECK-FULLY-DYNAMIC-LAYOUT-MAP
+
 
 // Bufferization of bodiless function with no tensor return value.
 
-// CHECK-LABEL: func private @private_func
+// CHECK: #[[$map0:.*]] = affine_map<(d0)[s0, s1] -> (d0 * s1 + s0)>
+// CHECK: #[[$map1:.*]] = affine_map<(d0, d1)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2)>
+// CHECK-LABEL: func private @private_func(memref<?xf32,
+//  CHECK-SAME:                                          #[[$map0]]>)
+// CHECK-NO-LAYOUT-MAP-LABEL: func private @private_func(memref<?xf32>)
 func.func private @private_func(tensor<?xf32>) -> ()
 
-// CHECK-LABEL: func @empty_func()
+// CHECK-LABEL: func private @private_func_2d(memref<?x?xf32,
+//  CHECK-SAME:                                               #[[$map1]]>)
+// CHECK-NO-LAYOUT-MAP-LABEL: func private @private_func_2d(memref<?x?xf32>)
+func.func private @private_func_2d(tensor<?x?xf32>) -> ()
+
+// CHECK-LABEL: func @empty_func() {
+// CHECK-NO-LAYOUT-MAP-LABEL: func @empty_func() {
+// CHECK-FULLY-DYNAMIC-LAYOUT-MAP-LABEL: func @empty_func() {
 func.func @empty_func() -> () {
   return
 }
@@ -23,10 +39,45 @@ func.func @empty_func() -> () {
 // A bodiless function that returns something that is not a tensor.
 
 // CHECK: func private @external_func_with_return_val(memref<4xi32, #{{.*}}>) -> f32
+// CHECK-FULLY-DYNAMIC-LAYOUT-MAP: #[[$map1:.*]] = affine_map<(d0)[s0, s1] -> (d0 * s1 + s0)>
+// CHECK-FULLY-DYNAMIC-LAYOUT-MAP-LABEL: func private @external_func_with_return_val(memref<4xi32,
+// CHECK-FULLY-DYNAMIC-LAYOUT-MAP-SAME: #[[$map1]]>
 func.func private @external_func_with_return_val(tensor<4xi32>) -> f32
 
 // -----
 
+// A function that returns a non-equivalent tensor with layout map.
+
+// CHECK: #[[$map2:.*]] = affine_map<(d0, d1)[s0] -> (d0 * 10 + s0 + d1)>
+// CHECK-LABEL: func @return_extract_slice(%{{.*}}) -> memref<2x?xf32,
+//  CHECK-SAME:     #[[$map2]]> {
+//       CHECK:   %[[alloc:.*]] = memref.alloc() {{.*}} : memref<20x10xf32>
+//       CHECK:   %[[subview:.*]] = memref.subview {{.*}} : memref<20x10xf32> to memref<2x?xf32, #[[$map2]]>
+//       CHECK:   return %[[subview]]
+
+// CHECK-NO-LAYOUT-MAP: #[[$map2:.*]] = affine_map<(d0, d1)[s0] -> (d0 * 10 + s0 + d1)>
+// CHECK-NO-LAYOUT-MAP-LABEL: func @return_extract_slice(%{{.*}}) -> memref<2x?xf32>
+//       CHECK-NO-LAYOUT-MAP:   %[[alloc:.*]] = memref.alloc() {{.*}} : memref<20x10xf32>
+//       CHECK-NO-LAYOUT-MAP:   %[[subview:.*]] = memref.subview {{.*}} : memref<20x10xf32> to memref<2x?xf32, #[[$map2]]>
+//       CHECK-NO-LAYOUT-MAP:   %[[alloc_no_layout:.*]] = memref.alloc(%{{.*}}) : memref<2x?xf32>
+//       CHECK-NO-LAYOUT-MAP:   memref.copy %[[subview]], %[[alloc_no_layout]]
+//       CHECK-NO-LAYOUT-MAP:   memref.dealloc %[[alloc]]
+//       CHECK-NO-LAYOUT-MAP:   return %[[alloc_no_layout]]
+
+// CHECK-FULLY-DYNAMIC-LAYOUT-MAP: #[[$map2a:.*]] = affine_map<(d0, d1)[s0, s1, s2] -> (d0 * s1 + s0 + d1 * s2)>
+// CHECK-FULLY-DYNAMIC-LAYOUT-MAP: #[[$map2b:.*]] = affine_map<(d0, d1)[s0] -> (d0 * 10 + s0 + d1)>
+// CHECK-FULLY-DYNAMIC-LAYOUT-MAP-LABEL: func @return_extract_slice(%{{.*}}) -> memref<2x?xf32,
+//  CHECK-FULLY-DYNAMIC-LAYOUT-MAP-SAME: #[[$map2a]]> {
+func.func @return_extract_slice(%idx: index, %sz: index) -> (tensor<2x?xf32>)
+{
+  %t = linalg.init_tensor [20, 10] : tensor<20x10xf32>
+  %0 = tensor.extract_slice %t[%idx, %idx][2, %sz][1, 1]
+      : tensor<20x10xf32> to tensor<2x?xf32>
+  return %0 : tensor<2x?xf32>
+}
+
+// -----
+
 // CHECK-LABEL: func private @private_func
 func.func private @private_func(tensor<?xf32>) -> (f32)
 

diff --git a/mlir/test/Dialect/Linalg/one-shot-bufferize.mlir b/mlir/test/Dialect/Linalg/one-shot-bufferize.mlir
index dc4135560778f..77011e9878d59 100644
--- a/mlir/test/Dialect/Linalg/one-shot-bufferize.mlir
+++ b/mlir/test/Dialect/Linalg/one-shot-bufferize.mlir
@@ -6,7 +6,7 @@
 // RUN: mlir-opt %s -one-shot-bufferize="allow-return-allocs test-analysis-only analysis-fuzzer-seed=91 bufferize-function-boundaries" -split-input-file -o /dev/null
 
 // Test bufferization using memref types that have no layout map.
-// RUN: mlir-opt %s -one-shot-bufferize="allow-return-allocs fully-dynamic-layout-maps=0 bufferize-function-boundaries" -split-input-file | FileCheck %s --check-prefix=CHECK-NO-LAYOUT-MAP
+// RUN: mlir-opt %s -one-shot-bufferize="allow-return-allocs unknown-type-conversion=identity-layout-map function-boundary-type-conversion=identity-layout-map bufferize-function-boundaries" -split-input-file | FileCheck %s --check-prefix=CHECK-NO-LAYOUT-MAP
 
 // TODO: Some test cases from this file should be moved to other dialects.
 

diff --git a/mlir/test/Dialect/SCF/one-shot-bufferize.mlir b/mlir/test/Dialect/SCF/one-shot-bufferize.mlir
index 22b5e41364c03..09fcd8192765a 100644
--- a/mlir/test/Dialect/SCF/one-shot-bufferize.mlir
+++ b/mlir/test/Dialect/SCF/one-shot-bufferize.mlir
@@ -6,7 +6,7 @@
 // RUN: mlir-opt %s -allow-unregistered-dialect -one-shot-bufferize="allow-return-allocs test-analysis-only analysis-fuzzer-seed=91 bufferize-function-boundaries" -split-input-file -o /dev/null
 
 // Test bufferization using memref types that have no layout map.
-// RUN: mlir-opt %s -allow-unregistered-dialect -one-shot-bufferize="allow-return-allocs fully-dynamic-layout-maps=0 bufferize-function-boundaries" -split-input-file -o /dev/null
+// RUN: mlir-opt %s -allow-unregistered-dialect -one-shot-bufferize="allow-return-allocs unknown-type-conversion=identity-layout-map function-boundary-type-conversion=identity-layout-map bufferize-function-boundaries" -split-input-file -o /dev/null
 
 // CHECK-DAG: #[[$map_1d_dyn:.*]] = affine_map<(d0)[s0, s1] -> (d0 * s1 + s0)>
 

diff --git a/mlir/test/Dialect/Tensor/one-shot-bufferize.mlir b/mlir/test/Dialect/Tensor/one-shot-bufferize.mlir
index c9a7afd76fbb9..382e4246b0f58 100644
--- a/mlir/test/Dialect/Tensor/one-shot-bufferize.mlir
+++ b/mlir/test/Dialect/Tensor/one-shot-bufferize.mlir
@@ -6,7 +6,7 @@
 // RUN: mlir-opt %s -one-shot-bufferize="allow-return-allocs test-analysis-only analysis-fuzzer-seed=91 bufferize-function-boundaries" -split-input-file -o /dev/null
 
 // Test bufferization using memref types that have no layout map.
-// RUN: mlir-opt %s -one-shot-bufferize="allow-return-allocs fully-dynamic-layout-maps=0 bufferize-function-boundaries" -split-input-file -o /dev/null
+// RUN: mlir-opt %s -one-shot-bufferize="allow-return-allocs unknown-type-conversion=identity-layout-map bufferize-function-boundaries" -split-input-file -o /dev/null
 
 // CHECK-DAG: #[[$map_1d_dyn:.*]] = affine_map<(d0)[s0, s1] -> (d0 * s1 + s0)>