[Mlir-commits] [mlir] [mlir] Add a contiguous<perm, offset> layout, use as identity layout (PR #131663)

llvmlistbot at llvm.org llvmlistbot at llvm.org
Mon Mar 17 12:38:03 PDT 2025


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-mlir-core

Author: Krzysztof Drewniak (krzysz00)

<details>
<summary>Changes</summary>

This PR introduces a new `ContiguousLayoutAttr`, which holds a permutation of the dimensions of the memref and an optional offset, and replaces the default memref layout (which was previously the N-D identity map) with `contiguous<N>`.

In general, the syntax for this attribute is
`contiguous<[I0, I1, .. IN], offset: O>`, where `I0` through `IN` are integers in 0..=N and `O` is either a static offset or `?` for a dynamic one. If the offset is 0, it isn't printed, and if the permutation is `[0, 1, ... N]`, we print it as N+1. That is, the 2-D identity/row-major layout is `contiguous<2>`, not `(d0, d1) -> (d0, d1)` as it used to be.
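For instance, here are a few spellings under that syntax (illustrative types, not taken from the patch's tests):

```
memref<4x8xf32, contiguous<[1, 0]>>         // column-major, offset 0 (not printed)
memref<?x?xf32, contiguous<2, offset: ?>>   // row-major, dynamic offset
memref<?x?xf32>                             // default layout, i.e. contiguous<2>
```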

# Motivation

In summary, the `contiguous<>` layout fills in the "layout hierarchy" (all contiguous layouts are strided, and all strided layouts are affine maps, but you can't go back down) with a primitive that enables useful optimizations, and it makes it easier to have relocatable/mergeable allocations in MLIR code.

Consider `memref<?0 x ?1 x ?2 x T>` - a memref with three dynamic dimensions. This memref has a row-major identity layout.

Suppose I want to make this memref "relocatable" - declare that it has an unknown offset so that I can, for example, have a pass that merges allocations into larger contiguous buffers. With the current layouts in MLIR, I can use either:
- `strided<[?, ?, 1], offset: ?>`, which loses the fact that this is a row-major memref. We don't know the relationship of those two `?`s to each other.
- `(d0, d1, d2)[s0] -> (d0, d1, d2 + s0)`, which isn't a "strided" layout by existing definitions and runs into the fact that many memref operations don't handle non-strided or arbitrary affine layouts.

Being able to use `contiguous<3, offset: ?>` (or, in its long form, `contiguous<[0, 1, 2], offset: ?>`) resolves this issue. That is now a strided layout that directly encodes the fact that this is a 3-D row-major memref with some dynamic offset.
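To make the comparison concrete, here are the three spellings of that relocatable memref side by side (a sketch based on the description above):

```
memref<?x?x?xf32, strided<[?, ?, 1], offset: ?>>                      // row-major-ness lost
memref<?x?x?xf32, affine_map<(d0, d1, d2)[s0] -> (d0, d1, d2 + s0)>>  // not "strided" today
memref<?x?x?xf32, contiguous<3, offset: ?>>                           // row-major + dynamic offset
```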

As seen in my changes to passes like `gpu-decompose-memrefs` or the vector transfer op flattener, knowing that a layout is contiguous - even if not necessarily row-major - allows us to use operations like `affine.linearize_index` for index computations. These fold well with operations like `affine.delinearize_index`, eliminating unnecessary "divide an ID into parts and multiply them back together" computations that often come up in tiling-based code generation and that the affine map simplifier either struggles with or lowers inefficiently.
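The kind of simplification this enables looks roughly like the following (a hand-written sketch with made-up value names, not IR from the patch):

```
// Index math produced when decomposing an access into a contiguous memref
// with dynamic sizes %d0 x %d1 x %d2:
%ids:3 = affine.delinearize_index %tid into (%d0, %d1, %d2) : index, index, index
%lin = affine.linearize_index disjoint [%ids#0, %ids#1, %ids#2] by (%d0, %d1, %d2) : index
// linearize-of-delinearize over the same basis folds away, so %lin simplifies back
// to %tid; the equivalent affine.apply over dynamic strides is much harder to simplify.
```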

This layout also allows describing permuted layouts, like column-major layouts, without needing code to handle the general complexity of an affine map layout. For example,

```
memref.expand_shape %arg [[0, 1], [2]]
  : memref<?x?xi32, contiguous<[1, 0]>>
  into memref<?x?x?xi32, contiguous<[2, 0, 1]>>
```

accurately describes the effects of expand_shape'ing a column-major memref.

## Why change the default layout?

Since the built-in layout attributes form a hierarchy of specificity (all contiguous layouts are strided ...), there are multiple ways to represent the identity row-major layout. The contiguous layout is the most specific of these, so it makes sense to declare it the canonical form of the identity layout. That is, `strided<[?, ?, 1]>` is a less specific layout for `memref<?x?x?xi32>`. The identity affine_map also has non-canonical forms and is less specific: code that can handle the identity AffineMapAttr may not know what to do with other affine maps because of how general they are, but it will be easier to go from the identity ContiguousLayoutAttr to permuted and/or offset attributes.
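As a concrete illustration (shapes chosen for the example), all of the following spell the same static-shaped type once the layout is canonicalized, with the contiguous form being the canonical one:

```
memref<4x8xi32>                                    // canonical: contiguous<2>
memref<4x8xi32, contiguous<[0, 1]>>                // explicit identity permutation
memref<4x8xi32, strided<[8, 1]>>                   // strided spelling of the same layout
memref<4x8xi32, affine_map<(d0, d1) -> (d0, d1)>>  // affine-map spelling
```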

Therefore, making the contiguous layout the default form of MemRefLayoutAttrInterface makes writing memref-handling code easier going forward.

# Concrete impacts of the change

1. `memref<...xT, affine_map<(d0, d1, ..., dN) -> (d0, d1, ..., dN)>>` no longer prints as `memref<...xT>`.
2. Similarly, the default memref layout is no longer an AffineMapAttr. This didn't break any code in-tree, since almost everything had moved to MemRefLayoutAttrInterface::getAffineMap(), but it's worth calling out.
3. `memref.subview`, `memref.reinterpret_cast`, and so on do not always produce a `strided` layout: if code needed to create `strided<[], offset: O>`, it'll now create `contiguous<0, offset: O>`, and similarly for `strided<[1], offset: O>`, which becomes a 1-D contiguous layout (see the sketch after this list). This is facilitated by the new `StridedLayoutAttr::getCanonical` method, which doesn't always return a strided layout.
4. Some passes have been updated to use `affine.linearize_index disjoint` when they were flattening a contiguous (subset of a) memref, allowing for more efficient code generation compared to an `affine.apply` over the strides.
5. `getStridesAndOffset()` has learned a new trick for affine maps: any "offset permutation" (that is, a permutation where the last result can be dX + E for any E) is now considered strided. This means that you can now `getStridesAndOffset` a
`memref<MxNxf32, affine_map<(i, j) -> (j, i)>>`, which would previously fail.
6. `MemRefType::canonicalizeLayout` has been updated to canonicalize strided layouts to their `contiguous` equivalent for static-shaped memrefs.
7. `bufferization.buffer_layout` can be any `MemRefLayoutAttrInterface`, and any identity maps present in such attributes are transparently migrated to their contiguous<> equivalents.
8. Certain reshape folders will now work with any row-major layout, even if it has an offset.
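For item 3, a sketch of what the new result types look like (illustrative shapes and offsets, not copied from the patch's tests):

```
// A 1-D subview at a dynamic offset: previously typed as strided<[1], offset: ?>,
// now as the equivalent 1-D contiguous layout.
%sv = memref.subview %src[%off] [16] [1]
    : memref<128xf32> to memref<16xf32, contiguous<1, offset: ?>>
```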

While this is a breaking change, we expect that it will allow long-term improvements to how MLIR represents memrefs in common situations.

---

Patch is 193.07 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/131663.diff


56 Files Affected:

- (modified) mlir/include/mlir-c/BuiltinAttributes.h (+39) 
- (modified) mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td (+3-2) 
- (modified) mlir/include/mlir/Dialect/Utils/ReshapeOpsUtils.h (+14-12) 
- (modified) mlir/include/mlir/IR/BuiltinAttributes.h (+22) 
- (modified) mlir/include/mlir/IR/BuiltinAttributes.td (+89-2) 
- (modified) mlir/include/mlir/IR/BuiltinTypes.td (+51-9) 
- (modified) mlir/lib/AsmParser/AttributeParser.cpp (+86) 
- (modified) mlir/lib/AsmParser/Parser.h (+3) 
- (modified) mlir/lib/AsmParser/TokenKinds.def (+1) 
- (modified) mlir/lib/Bindings/Python/IRAttributes.cpp (+55) 
- (modified) mlir/lib/CAPI/IR/BuiltinAttributes.cpp (+46) 
- (modified) mlir/lib/Dialect/AMDGPU/IR/AMDGPUDialect.cpp (+10-3) 
- (modified) mlir/lib/Dialect/Affine/Utils/LoopUtils.cpp (+4-2) 
- (modified) mlir/lib/Dialect/Affine/Utils/Utils.cpp (+1-2) 
- (modified) mlir/lib/Dialect/Bufferization/IR/BufferizationDialect.cpp (+2-2) 
- (modified) mlir/lib/Dialect/Bufferization/Transforms/FuncBufferizableOpInterfaceImpl.cpp (+11-4) 
- (modified) mlir/lib/Dialect/GPU/Transforms/DecomposeMemRefs.cpp (+59-7) 
- (modified) mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp (+175-53) 
- (modified) mlir/lib/Dialect/MemRef/Transforms/EmulateNarrowType.cpp (+14-17) 
- (modified) mlir/lib/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.cpp (+1-1) 
- (modified) mlir/lib/Dialect/Utils/ReshapeOpsUtils.cpp (+2-2) 
- (modified) mlir/lib/Dialect/Vector/Transforms/LowerVectorGather.cpp (+20-20) 
- (modified) mlir/lib/Dialect/Vector/Transforms/VectorTransferOpTransforms.cpp (+9-33) 
- (modified) mlir/lib/IR/AsmPrinter.cpp (+4-1) 
- (modified) mlir/lib/IR/BuiltinAttributes.cpp (+115) 
- (modified) mlir/lib/IR/BuiltinTypes.cpp (+220-53) 
- (modified) mlir/python/mlir/_mlir_libs/_mlir/ir.pyi (+35) 
- (modified) mlir/python/mlir/extras/types.py (+2-1) 
- (modified) mlir/test/CAPI/ir.c (+27) 
- (modified) mlir/test/Conversion/MemRefToLLVM/expand-then-convert-to-llvm.mlir (+6-6) 
- (modified) mlir/test/Conversion/MemRefToLLVM/memref-to-llvm.mlir (+46-7) 
- (modified) mlir/test/Dialect/Affine/dma.mlir (+4-4) 
- (modified) mlir/test/Dialect/Affine/pipeline-data-transfer.mlir (+10-10) 
- (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-empty-tensor-elimination.mlir (+2-2) 
- (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-encodings.mlir (+2-2) 
- (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize.mlir (+3-4) 
- (modified) mlir/test/Dialect/GPU/decompose-memrefs.mlir (+41-16) 
- (modified) mlir/test/Dialect/Linalg/drop-unit-extent-dims.mlir (+2-2) 
- (modified) mlir/test/Dialect/Linalg/transform-patterns.mlir (+16-16) 
- (modified) mlir/test/Dialect/MemRef/canonicalize.mlir (+44-17) 
- (modified) mlir/test/Dialect/MemRef/emulate-narrow-type.mlir (+12-12) 
- (modified) mlir/test/Dialect/MemRef/make-loop-independent.mlir (+6-7) 
- (modified) mlir/test/Dialect/MemRef/ops.mlir (+74-20) 
- (modified) mlir/test/Dialect/MemRef/transform-ops.mlir (+8-8) 
- (modified) mlir/test/Dialect/Tensor/bufferize.mlir (+8-8) 
- (modified) mlir/test/Dialect/Tensor/one-shot-bufferize.mlir (+5-5) 
- (modified) mlir/test/Dialect/Transform/test-pattern-application.mlir (+3-3) 
- (modified) mlir/test/Dialect/Vector/vector-transfer-collapse-inner-most-dims.mlir (+24-24) 
- (modified) mlir/test/Dialect/Vector/vector-transfer-flatten.mlir (+47-36) 
- (modified) mlir/test/IR/affine-map.mlir (+5-5) 
- (modified) mlir/test/IR/parser.mlir (+33-24) 
- (modified) mlir/test/IR/print-attr-type-aliases.mlir (+1-2) 
- (modified) mlir/test/python/dialects/memref.py (+2-13) 
- (modified) mlir/test/python/ir/attributes.py (+32) 
- (modified) mlir/test/python/ir/builtin_types.py (+1-1) 
- (modified) mlir/unittests/Dialect/MemRef/InferShapeTest.cpp (+3-3) 


``````````diff
diff --git a/mlir/include/mlir-c/BuiltinAttributes.h b/mlir/include/mlir-c/BuiltinAttributes.h
index 1d0edf9ea809d..5c62495110df8 100644
--- a/mlir/include/mlir-c/BuiltinAttributes.h
+++ b/mlir/include/mlir-c/BuiltinAttributes.h
@@ -697,6 +697,13 @@ MLIR_CAPI_EXPORTED MlirAttribute
 mlirStridedLayoutAttrGet(MlirContext ctx, int64_t offset, intptr_t numStrides,
                          const int64_t *strides);
 
+// Creates a strided layout attribute from given strides and offset,
+// canonicalizing the 0D and 1D unit stride to contiguous layout attributes. The
+// returned value may not be a StridedLayoutAttr.
+MLIR_CAPI_EXPORTED MlirAttribute
+mlirStridedLayoutAttrGetCanonical(MlirContext ctx, int64_t offset,
+                                  intptr_t numStrides, const int64_t *strides);
+
 // Returns the offset in the given strided layout layout attribute.
 MLIR_CAPI_EXPORTED int64_t mlirStridedLayoutAttrGetOffset(MlirAttribute attr);
 
@@ -711,6 +718,38 @@ MLIR_CAPI_EXPORTED int64_t mlirStridedLayoutAttrGetStride(MlirAttribute attr,
 /// Returns the typeID of a StridedLayout attribute.
 MLIR_CAPI_EXPORTED MlirTypeID mlirStridedLayoutAttrGetTypeID(void);
 
+//===----------------------------------------------------------------------===//
+// Contiguous layout attribute.
+//===----------------------------------------------------------------------===//
+
+// Checks whether the given attribute is a contiguous layout attribute.
+MLIR_CAPI_EXPORTED bool mlirAttributeIsAContiguousLayout(MlirAttribute attr);
+
+// Creates a contiguous layout attribute from given permutation and offset.
+// There must be `rank` values in `permutation`.
+MLIR_CAPI_EXPORTED MlirAttribute mlirContiguousLayoutAttrGet(
+    MlirContext ctx, int64_t offset, intptr_t rank, const int64_t *permutation);
+
+// Creates a row-major contiguous layout attribute from given offset and rank.
+MLIR_CAPI_EXPORTED MlirAttribute mlirContiguousLayoutAttrGetRowMajor(
+    MlirContext ctx, int64_t offset, int64_t rank);
+
+// Returns the offset in the given contiguous layout attribute.
+MLIR_CAPI_EXPORTED int64_t
+mlirContiguousLayoutAttrGetOffset(MlirAttribute attr);
+
+// Returns the number of permutation entries in the given contiguous layout
+// attribute.
+MLIR_CAPI_EXPORTED intptr_t mlirContiguousLayoutAttrGetRank(MlirAttribute attr);
+
+// Returns the pos-th permutation entry stored in the given contiguous layout
+// attribute.
+MLIR_CAPI_EXPORTED int64_t
+mlirContiguousLayoutAttrGetPermutationEntry(MlirAttribute attr, intptr_t pos);
+
+/// Returns the typeID of a ContiguousLayout attribute.
+MLIR_CAPI_EXPORTED MlirTypeID mlirContiguousLayoutAttrGetTypeID(void);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td b/mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td
index 134cca5800918..121099f3c2590 100644
--- a/mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td
+++ b/mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td
@@ -32,8 +32,9 @@ def MemRefTypeAttr
 class MemRef_Op<string mnemonic, list<Trait> traits = []>
     : Op<MemRef_Dialect, mnemonic, traits>;
 
-// Base class for ops with static/dynamic offset, sizes and strides
-// attributes/arguments.
+// Base class for ops with static/dynamic offset, sizes and optional strides
+// attributes/arguments. When the strides are not specified, this implies a
+// contiguous layout.
 class MemRef_OpWithOffsetSizesAndStrides<string mnemonic,
                                          list<Trait> traits = []>
     : MemRef_Op<mnemonic, traits> {
diff --git a/mlir/include/mlir/Dialect/Utils/ReshapeOpsUtils.h b/mlir/include/mlir/Dialect/Utils/ReshapeOpsUtils.h
index 3af89a6ab3799..183bdb005c186 100644
--- a/mlir/include/mlir/Dialect/Utils/ReshapeOpsUtils.h
+++ b/mlir/include/mlir/Dialect/Utils/ReshapeOpsUtils.h
@@ -178,8 +178,10 @@ LogicalResult reshapeLikeShapesAreCompatible(
     ArrayRef<int64_t> collapsedShape, ArrayRef<int64_t> expandedShape,
     ArrayRef<ReassociationIndices> reassociationMaps, bool isExpandingReshape);
 
-/// Returns true iff the type is a MemRefType and has a non-identity layout.
-bool hasNonIdentityLayout(Type type);
+/// Returns true iff the type is a MemRefType and has a layout that is not
+/// row-major contiguous - that is, the identity layout with an optional
+/// offset.
+bool hasNonRowMajorContiguousLayout(Type type);
 
 enum class ReshapeOpKind { kExpand, kCollapse };
 
@@ -197,9 +199,9 @@ struct ComposeReassociativeReshapeOps : public OpRewritePattern<ReshapeOpTy> {
 
     ShapedType resultType = reshapeOp.getResultType();
 
-    if (hasNonIdentityLayout(srcReshapeOp.getSrc().getType()) ||
-        hasNonIdentityLayout(reshapeOp.getSrc().getType()) ||
-        hasNonIdentityLayout(reshapeOp.getResult().getType()))
+    if (hasNonRowMajorContiguousLayout(srcReshapeOp.getSrc().getType()) ||
+        hasNonRowMajorContiguousLayout(reshapeOp.getSrc().getType()) ||
+        hasNonRowMajorContiguousLayout(reshapeOp.getResult().getType()))
       return failure();
 
     std::optional<SmallVector<ReassociationIndices>> reassociationIndices =
@@ -265,9 +267,9 @@ struct ComposeCollapseOfExpandOp : public OpRewritePattern<CollapseOpTy> {
     ShapedType srcType = expandOp.getSrcType();
     ShapedType resultType = collapseOp.getResultType();
 
-    if (hasNonIdentityLayout(collapseOp.getSrc().getType()) ||
-        hasNonIdentityLayout(expandOp.getSrc().getType()) ||
-        hasNonIdentityLayout(expandOp.getResult().getType()))
+    if (hasNonRowMajorContiguousLayout(collapseOp.getSrc().getType()) ||
+        hasNonRowMajorContiguousLayout(expandOp.getSrc().getType()) ||
+        hasNonRowMajorContiguousLayout(expandOp.getResult().getType()))
       return failure();
 
     int64_t srcRank = srcType.getRank();
@@ -331,9 +333,9 @@ struct ComposeExpandOfCollapseOp : public OpRewritePattern<ExpandOpTy> {
     ShapedType srcType = collapseOp.getSrcType();
     ShapedType resultType = expandOp.getResultType();
 
-    if (hasNonIdentityLayout(expandOp.getSrc().getType()) ||
-        hasNonIdentityLayout(collapseOp.getSrc().getType()) ||
-        hasNonIdentityLayout(collapseOp.getResult().getType()))
+    if (hasNonRowMajorContiguousLayout(expandOp.getSrc().getType()) ||
+        hasNonRowMajorContiguousLayout(collapseOp.getSrc().getType()) ||
+        hasNonRowMajorContiguousLayout(collapseOp.getResult().getType()))
       return failure();
 
     int64_t srcRank = srcType.getRank();
@@ -451,7 +453,7 @@ getLinearizedDimensions(ArrayRef<ReassociationIndices> reassociationIndices);
 ///    %4 = tensor.extract_slice %0 [%3#0, %3#1, %3#2, 0] [1, 1, 1, 10] [1, 1, 1, 1] :
 ///          tensor<3x7x11x10xf32> to tensor<1x1x1x10xf32>
 ///
-///    %5 = tensor.collapse_shape %4 [[0, 1, 2], [3]] : 
+///    %5 = tensor.collapse_shape %4 [[0, 1, 2], [3]] :
 ///          tensor<1x1x1x10xf32> into tensor<1x10xf32>
 ///    %6 = tensor.insert_slice %5 into %arg0 [%iv, 0] [1, 10] [1, 1] :
 ///          tensor<1x10xf32> into tensor<10x10xf32>
diff --git a/mlir/include/mlir/IR/BuiltinAttributes.h b/mlir/include/mlir/IR/BuiltinAttributes.h
index 901df3a25a46f..f7b9a78ef4cc9 100644
--- a/mlir/include/mlir/IR/BuiltinAttributes.h
+++ b/mlir/include/mlir/IR/BuiltinAttributes.h
@@ -1081,6 +1081,28 @@ inline bool operator!=(StringRef lhs, StringAttr rhs) { return !(lhs == rhs); }
 
 namespace mlir {
 
+/// Given an N-dimensional permutation and an offset (which can use
+/// ShapedType::kDynamic to represent a dynamic value), return the
+/// N-dimensional map that is permuted according to said permutation and adds
+/// the offset to the final output. If the permutation has no outputs (it's a
+/// 0-D map), add one result to hold the offset.
+///
+/// Examples:
+/// =========
+///
+/// offset = 0, permutation = [0, 1, 2] gives
+/// [](d0, d1, d2) -> (d0, d1, d2)
+/// while offset = 5 gives [](d0, d1, d2) -> (d0, d1, d2 + 5)
+/// and offset = ? gives [s0](d0, d1, d2) -> (d0, d1, d2 + s0).
+///
+/// offset = ?, permutation = [2, 1, 0] gives
+/// [s0](d0, d1, d2) -> (d2, d1, d0 + s0)
+///
+/// Finally, offset = 0, permutation = [], gives []() -> (0), while
+/// offset = ?, permutation = [] gives [s0]() -> (s0).
+AffineMap makePermutedMapWithOffset(ArrayRef<int64_t> permutation,
+                                    int64_t offset, MLIRContext *context);
+
 /// Given a list of strides (in which ShapedType::kDynamic
 /// represents a dynamic value), return the single result AffineMap which
 /// represents the linearized strided layout map. Dimensions correspond to the
diff --git a/mlir/include/mlir/IR/BuiltinAttributes.td b/mlir/include/mlir/IR/BuiltinAttributes.td
index 6826d1a437775..455864d50cd8e 100644
--- a/mlir/include/mlir/IR/BuiltinAttributes.td
+++ b/mlir/include/mlir/IR/BuiltinAttributes.td
@@ -164,7 +164,7 @@ def Builtin_DenseArrayRawDataParameter : ArrayRefParameter<
   }];
 }
 
-def Builtin_DenseArray : Builtin_Attr<"DenseArray", "dense_array", 
+def Builtin_DenseArray : Builtin_Attr<"DenseArray", "dense_array",
     [BlobAttrInterface]> {
   let summary = "A dense array of integer or floating point elements.";
   let description = [{
@@ -494,7 +494,7 @@ def Builtin_DenseResourceElementsAttr : Builtin_Attr<"DenseResourceElements",
     /// when building the attribute. The provided `blobName` is used as a hint
     /// for the key of the new handle for the `blob` resource, but may be
     /// changed if necessary to ensure uniqueness during insertion.
-    /// This base class builder does no element type specific size or alignment 
+    /// This base class builder does no element type specific size or alignment
     /// checking. Use the typed subclasses for more safety unless if performing
     /// generic operations.
     AttrBuilderWithInferredContext<(ins
@@ -1051,9 +1051,96 @@ def StridedLayoutAttr : Builtin_Attr<"StridedLayout", "strided_layout",
     /// Returns true if this layout is static, i.e. the strides and offset all
     /// have a known value > 0.
     bool hasStaticLayout() const;
+
+    /// Get a "canonical" strided layout for the given strides.
+    /// This constructs a strided layout with the given `offset` and `strides`,
+    /// except that if either the strides are empty or equal to [1], it returns
+    /// the corresponding ContiguousLayoutAttr in order to guard against multiple
+    /// representations of the identity layout.
+    static ::mlir::MemRefLayoutAttrInterface getCanonical(MLIRContext *context,
+      int64_t offset, ::llvm::ArrayRef<int64_t> strides);
   }];
 }
 
+//===----------------------------------------------------------------------===//
+// ContiguousLayoutAttr
+//===----------------------------------------------------------------------===//
+
+def ContiguousLayoutAttr : Builtin_Attr<"ContiguousLayout", "contiguous_layout",
+    [DeclareAttrInterfaceMethods<MemRefLayoutAttrInterface,
+                                 ["isIdentity", "verifyLayout"]>]> {
+  let summary = "An Attribute representing a contiguous layout of a shaped type";
+  let description = [{
+    Syntax:
+
+    ```
+    contiguous-layout-attribute ::= `contiguous` `<` maybe-permutation
+                                 (`,` `offset` `:` dimension)? `>`
+    maybe-permutation ::= decimal-literal | `[` permutation `]`
+    permutation ::= decimal-literal (`,` decimal-literal)*
+    dimension ::= decimal-literal | `?`
+    ```
+
+    A contiguous layout is a layout that represents a sequence of dimensions
+    laid out in linear memory in its canonical form. Specifically, it indicates
+    that if one permutes the dimensions of a memref according to `permutation`,
+    they will be in a row-major contiguous form: that is, the stride (in the
+    sense of the strided layout) of dimension `permutation[i]` is equal
+    to the product of the sizes of all dimensions appearing later in the permutation.
+
+    For example, an MxN memref with a `contiguous<[1, 0]>` layout is column-major:
+    advancing in the M dimension requires moving by 1 element in linear memory,
+    while the N dimension requires moving by M elements. Conversely,
+    if the layout is `contiguous<[0, 1]>` (which can be written `contiguous<2>`
+    for brevity and will be omitted from printing without an offset), the stride
+    of the N dimension will be 1 element while the stride of the M dimension will be
+    N elements.
+
+    As a more complex example, `memref<AxBxCxT, contiguous<[2, 0, 1], offset: D>>`,
+    where A, B, C, and D are potentially dynamic values, means that
+    the value at index `[%i, %j, %k]` is located `%k * A * B + %i * B + %j + D`
+    elements from the beginning of the memory underlying that memref.
+
+    The permutation must contain the integers between 0 and the rank of the memref - 1,
+    and must have one distinct entry for each memref dimension. The value
+    `[0, 1, ..., N-1]`, specifying a row-major format, may be printed as `N`
+    for clarity.
+
+    If an offset is specified, it is a number of elements to move within
+    the underlying linear memory after the permutation is applied. This offset
+    may be _dynamic_, meaning that it may not be known at compile time.
+    A dynamic offset is represented as a `?` in the assembly syntax and as
+    `ShapedType::kDynamic` in the code. The offset must be non-negative.
+
+    See the [MemRef type](Dialects/Builtin.md#memreftype) documentation for more information.
+  }];
+
+  let parameters = (ins
+    "int64_t":$offset,
+    ArrayRefParameter<
+      "int64_t",
+      "permutation (64-bit integer)"
+    >:$permutation
+  );
+
+  let builders = [
+    // Builder for row-major contiguous attribute.
+    AttrBuilder<(ins "int64_t":$offset, "int64_t":$rank)>
+  ];
+  let genVerifyDecl = 1;
+
+  let extraClassDeclaration = [{
+    /// Print the attribute to the given output stream.
+    void print(raw_ostream &os) const;
+
+    /// Returns true if this layout is static, i.e. the offset has a static value.
+    bool hasStaticLayout() const;
+
+    /// Return true if this layout has a row-major permutation - that is, the
+    /// dimensions of the shape are not permuted.
+    bool isRowMajor() const;
+  }];
+}
 
 //===----------------------------------------------------------------------===//
 // StringAttr
diff --git a/mlir/include/mlir/IR/BuiltinTypes.td b/mlir/include/mlir/IR/BuiltinTypes.td
index af474b3e3ec47..db21dfd656161 100644
--- a/mlir/include/mlir/IR/BuiltinTypes.td
+++ b/mlir/include/mlir/IR/BuiltinTypes.td
@@ -585,20 +585,27 @@ def Builtin_MemRef : Builtin_Type<"MemRef", "memref", [
     layout must avoid internal aliasing, i.e., two distinct tuples of
     _in-bounds_ indices must be pointing to different elements in memory. The
     layout is an attribute that implements `MemRefLayoutAttrInterface`. The
-    bulitin dialect offers two kinds of layouts: strided and affine map, each
-    of which is available as an attribute. Other attributes may be used to
-    represent the layout as long as they can be converted to a
+    builtin dialect offers three kinds of layouts: contiguous, strided and
+    affine map, each of which is available as an attribute. Other attributes may be
+    used to represent the layout as long as they can be converted to a
     [semi-affine map](Affine.md/#semi-affine-maps) and implement the required
     interface. Users of memref are expected to fallback to the affine
     representation when handling unknown memref layouts. Multi-dimensional
     affine forms are interpreted in _row-major_ fashion.
 
     In absence of an explicit layout, a memref is considered to have a
-    multi-dimensional identity affine map layout.  Identity layout maps do not
-    contribute to the MemRef type identification and are discarded on
-    construction. That is, a type with an explicit identity map is
+    row-major contiguous layout with an offset of 0, which is equivalent
+    to a multi-dimensional identity map.  For backwards compatibility,
+    identity layout maps do not contribute to the MemRef type identification and
+    are discarded on construction. That is, a type with an explicit identity map is
     `memref<?x?xf32, (i,j)->(i,j)>` is strictly the same as the one without a
-    layout, `memref<?x?xf32>`.
+    layout, `memref<?x?xf32>`, which, written explicitly, has the layout
+    `memref<?x?xf32, contiguous<2>>`.
+
+    The built-in layouts form a hierarchy: all contiguous layouts are strided layouts,
+    and all strided layouts are affine map layouts, but the reverse is not true.
+    Using a more specific layout may permit a greater degree of optimization in
+    the generated code.
 
     ##### Affine Map Layout
 
@@ -656,6 +663,37 @@ def Builtin_MemRef : Builtin_Type<"MemRef", "memref", [
     Therefore, it is never subject to the implicit row-major layout
     interpretation.
 
+    ##### Contiguous Layout
+
+    The most restricted of the built-in layouts is the _contiguous_ layout, which
+    expresses the fact that the in-memory layout of the memref would be row-major
+    without padding after the associated permutation is applied. Equivalently,
+    a contiguous layout is a strided layout where the strides are implicitly computed
+    from the (permuted) sizes of the memref.
+
+    This layout is necessary to allow optimizations during lowering passes in the
+    presence of dynamic sizes, since
+    `memref<?x?x?xf32, strided<[?, ?, 1], offset: ?>>` doesn't specify whether its
+    dimensions have padding in between them or not - the two non-1 strides are
+    dynamic. By contrast, `contiguous<3, offset: ?>` indicates a row-major layout
+    with an offset, while `contiguous<[2, 1, 0], offset: ?>` indicates a
+    column-major layout. While this scheme could be expressed with an affine map,
+    some operations expect memrefs to be in a form compatible with the `strided`
+    layout, which can be difficult to detect from analyzing an affine expression.
+
+    In general, the layout `contiguous<[p0, p1, ..., pN], offset: V>`
+    corresponds to the affine map
+
+    ```mlir
+    affine_map<(d0, ..., dN) -> (d[p0], d[p1], ..., d[pN] + V)>
+    ```
+
+    where `V` is either `s0` if it is dynamic or some constant value.
+
+    For convenience, the layout `contiguous<[0, 1, ..., N], offset: V>` is printed
+    as `contiguous<N+1, offset: V>`, and the `, offset: V` segment is omitted if `V`
+    is `0`.
+
     ##### Codegen of Unranked Memref
 
     Using unranked memref in codegen besides the case mentioned above is highly
@@ -815,6 +853,10 @@ def Builtin_MemRef : Builtin_Type<"MemRef", "memref", [
     ///   considering both _all_ and _only_ the trailing 3 dims,
     ///   - memref<5x4x3x2xi8, strided<[48, 6, 2, 1]> is _only_ contiguous when
     ///   considering the trailing 3 dims.
+    ///   - memref<?x?x?xi8, contiguous<3, offset: ?>> is contiguous when
+    ///   considering all dimensions.
+    ///   - memref<?x?x?x?xi32, contiguous<[1, 0, 2, 3], offset: ?>> is
+    ///   _only_ contiguous when considering the trailing 2 dimensions.
     ///
     bool areTrailingDimsContiguous(int64_t n);
 
@@ -830,8 +872,8 @@ def Builtin_MemRef : Builtin_Type<"MemRef", "memref", [
 
     /// Returns the strides of the MemRef if the layout map is in strided form.
     /// MemRefs with a layout map in strided form include:
-    ///   1. empty or identity layout map, in which case the stride information
-    ///      is the canonical form computed from sizes;
+    ///   1. the empty layout, the identity layout affine map, and any ContiguousLayoutAttr,
+    ///      in which case the stride information is the canonical form computed from sizes;
     ///   2. a StridedLayoutAttr layout;
     ///   3. any other layout that be converted into a single affine map layout
     ///      of the form `K + k0 * d0 + ... kn * dn`, where K and ki's are
diff --git a/mlir/lib/AsmParser/AttributeParser.cpp b/mlir/lib/AsmParser/AttributeParser.cpp
index 2013d3623711b..0175c6cc3d093 100644
--- a/mlir/lib/AsmParser/AttributeParser.cpp
+++ b/mlir/lib/AsmParser/AttributeParser.cpp
@@ -156,6 +156,10 @@ Attribute Parser::parseAttribute(Type type) {
   case Token::kw_strided:
     return parseStridedLayoutAttr();
 
+  // Parse a contiguous layout attribute.
+  case Token::kw_contiguous:
+    return parseContiguousLayoutAttr();
+
   // Parse a distinct attribute.
   case Token::kw_distinct:
     return parseDistinctAttr(type);
@@ -1100,6 +1104,88 @@ Attribute Parser::parseSparseElementsAttr(Type attrType) {
   ret...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/131663


More information about the Mlir-commits mailing list