[Mlir-commits] [mlir] [mlir][vector] Make the in_bounds attribute mandatory (PR #97049)

Tue Jul 2 12:05:16 PDT 2024

llvmbot wrote:



@llvm/pr-subscribers-mlir-memref
@llvm/pr-subscribers-mlir-nvgpu
@llvm/pr-subscribers-mlir-sme

@llvm/pr-subscribers-mlir

Author: Andrzej Warzyński (banach-space)

<details>
<summary>Changes</summary>

Makes the `in_bounds` attribute for vector.transfer_read and
vector.transfer_write Ops mandatory. In addition, makes the Asm printer
always print this attribute - tests are updated accordingly.

### Updates in tests
####  Default `in_bounds` value

Originally, most tests would skip the `in_bounds` attribute - this was
equivalent to setting all values to `false` [1]. With this change, this
has to be done explicitly when writing a test. Note, especially when
reviewing this change, that the vast majority of newly inserted
`in_bounds` attributes are set to `false` to preserve the original
semantics of the tests.

There is only one exception - for broadcast dimensions the newly
inserted `in_bounds` attribute is set to `true`. As per [2]:
```
  vector.transfer_read op requires broadcast dimensions to be in-bounds
```

This matches the original semantics:
  * the `in_bounds` attribute in the context of broadcast dimensions
    would only be checked when present,
  * the verifier wasn't aware of the default value set in [1],

This means that effectively, the attribute was set to `false` even for
broadcast dims, but the verifier wasn't aware of that. This change makes
that behaviour more explicit by setting the attribute to `true` for
broadcast dims. In all other cases, the attribute is set to `false` - if
that's not the case, consider that as a typo.

####  0-D vectors

Reading and writing to/from 0D vectors also requires the `in_bounds`
attribute. In this case, the attribute has to be empty:

```mlir
vector.transfer_write %5, %m1[] {in_bounds=[]} : vector<f32>, memref<f32>
```

#### CHECK lines

With this PR, the `in_bounds` attribute is always print. This required
updating the `CHECK` lines that previously assumed that the attribute
would be skipped. To keep this type of changes simple, I've only added
`{{.*}}` to make sure that tests pass.

### Changes in "Vectorization.cpp"

The following patterns are updated to explicitly set the `in_bounds`
attribute to `false`:
  * `LinalgCopyVTRForwardingPattern` and `LinalgCopyVTWForwardingPattern`

### Changes in "SuperVectorize.cpp" and "Vectorization.cpp"

The updates in `vectorizeAffineLoad` (SuperVectorize.cpp) and
`vectorizeAsLinalgGeneric` (Vectorization.cpp) are introduced to make
sure that xfer Ops created by these vectorisers set the dimension
corresponding to broadcast dims as "in bounds". Otherwise, the Op
verifier would fail.

Note that there is no mechanism to verify whether the corresponding
memory access are indeed in bounds. Previously, when `in_bounds` was
optional, the verification would skip checking the attribute if it
wasn't present. However, it would default to `false` in other places.
Put differently, this change does not change the existing behaviour, it
merely makes it more explicit.

[1] https://github.com/llvm/llvm-project/blob/4145ad2bac4bb99d5034d60c74bb2789f6c6e802/mlir/include/mlir/Interfaces/VectorInterfaces.td#L243-L246
[2] https://mlir.llvm.org/docs/Dialects/Vector/#vectortransfer_read-vectortransferreadop

---

Patch is 314.73 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97049.diff


104 Files Affected:

- (modified) mlir/include/mlir/Dialect/Vector/IR/VectorOps.td (+2-2) 
- (modified) mlir/include/mlir/IR/AffineMap.h (+4) 
- (modified) mlir/include/mlir/Interfaces/VectorInterfaces.td (+2-2) 
- (modified) mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp (+13-1) 
- (modified) mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp (+19-7) 
- (modified) mlir/lib/Dialect/Vector/IR/VectorOps.cpp (+20-23) 
- (modified) mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp (+2-2) 
- (modified) mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp (+2-6) 
- (modified) mlir/lib/IR/AffineMap.cpp (+17) 
- (modified) mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir (+1-1) 
- (modified) mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir (+8-8) 
- (modified) mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir (+1-1) 
- (modified) mlir/test/Conversion/VectorToSCF/vector-to-scf-mask-and-permutation-map.mlir (+2-2) 
- (modified) mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir (+33-33) 
- (modified) mlir/test/Dialect/Affine/SuperVectorize/vector_utils.mlir (+1-1) 
- (modified) mlir/test/Dialect/Affine/SuperVectorize/vectorize_1d.mlir (+10-10) 
- (modified) mlir/test/Dialect/Affine/SuperVectorize/vectorize_2d.mlir (+5-5) 
- (modified) mlir/test/Dialect/Affine/SuperVectorize/vectorize_affine_apply.mlir (+6-6) 
- (modified) mlir/test/Dialect/Affine/SuperVectorize/vectorize_outer_loop_2d.mlir (+1-1) 
- (modified) mlir/test/Dialect/Affine/SuperVectorize/vectorize_outer_loop_transpose_2d.mlir (+4-4) 
- (modified) mlir/test/Dialect/Affine/SuperVectorize/vectorize_transpose_2d.mlir (+4-4) 
- (modified) mlir/test/Dialect/ArmSME/vector-legalization.mlir (+5-5) 
- (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-analysis-bottom-up-from-terminators.mlir (+2-2) 
- (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-partial.mlir (+7-7) 
- (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize.mlir (+5-5) 
- (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-analysis.mlir (+11-11) 
- (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-force-copy-before-write.mlir (+2-2) 
- (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize.mlir (+4-4) 
- (modified) mlir/test/Dialect/Bufferization/Transforms/transform-ops.mlir (+5-5) 
- (modified) mlir/test/Dialect/Linalg/forward-vector-transfers.mlir (+8-5) 
- (modified) mlir/test/Dialect/Linalg/hoisting.mlir (+50-50) 
- (modified) mlir/test/Dialect/Linalg/one-shot-bufferize.mlir (+3-3) 
- (modified) mlir/test/Dialect/Linalg/transform-op-bufferize-to-allocation.mlir (+1-1) 
- (modified) mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir (+9-9) 
- (modified) mlir/test/Dialect/Linalg/vectorization.mlir (+1-1) 
- (modified) mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir (+1-1) 
- (modified) mlir/test/Dialect/MemRef/extract-address-computations.mlir (+10-10) 
- (modified) mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir (+1-1) 
- (modified) mlir/test/Dialect/NVGPU/transform-pipeline-shared.mlir (+4-4) 
- (modified) mlir/test/Dialect/SCF/one-shot-bufferize-analysis.mlir (+19-19) 
- (modified) mlir/test/Dialect/SCF/one-shot-bufferize.mlir (+4-3) 
- (modified) mlir/test/Dialect/Tensor/fold-tensor-subset-ops.mlir (+1-1) 
- (modified) mlir/test/Dialect/Tensor/one-shot-bufferize.mlir (+2-2) 
- (modified) mlir/test/Dialect/Vector/bufferize-invalid.mlir (+1-1) 
- (modified) mlir/test/Dialect/Vector/canonicalize.mlir (+22-22) 
- (modified) mlir/test/Dialect/Vector/invalid.mlir (+27-26) 
- (modified) mlir/test/Dialect/Vector/lower-vector-mask.mlir (+6-6) 
- (modified) mlir/test/Dialect/Vector/one-shot-bufferize.mlir (+4-4) 
- (modified) mlir/test/Dialect/Vector/ops.mlir (+49-49) 
- (modified) mlir/test/Dialect/Vector/scalar-vector-transfer-to-memref.mlir (+6-6) 
- (modified) mlir/test/Dialect/Vector/value-bounds-op-interface-impl.mlir (+1-1) 
- (modified) mlir/test/Dialect/Vector/vector-emulate-narrow-type.mlir (+2-2) 
- (modified) mlir/test/Dialect/Vector/vector-transfer-collapse-inner-most-dims.mlir (+7-7) 
- (modified) mlir/test/Dialect/Vector/vector-transfer-drop-unit-dims-patterns.mlir (+7-7) 
- (modified) mlir/test/Dialect/Vector/vector-transfer-flatten.mlir (+12-12) 
- (modified) mlir/test/Dialect/Vector/vector-transfer-full-partial-split-copy-transform.mlir (+4-4) 
- (modified) mlir/test/Dialect/Vector/vector-transfer-full-partial-split.mlir (+11-11) 
- (modified) mlir/test/Dialect/Vector/vector-transfer-permutation-lowering.mlir (+2-2) 
- (modified) mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir (+21-21) 
- (modified) mlir/test/Dialect/Vector/vector-transfer-unroll.mlir (+12-12) 
- (modified) mlir/test/Dialect/Vector/vector-transforms.mlir (+19-19) 
- (modified) mlir/test/Dialect/Vector/vector-warp-distribute.mlir (+33-33) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/dual_sparse_conv_2d.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/padded_sparse_conv_2d.mlir (+2-2) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_block_matmul.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_cast.mlir (+10-10) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_cmp.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_collapse_shape.mlir (+6-6) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_1d_nwc_wcf.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d.mlir (+2-2) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_55.mlir (+6-6) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_nchw_fchw.mlir (+4-4) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_nhwc_hwcf.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_3d.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_3d_ndhwc_dhwcf.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_element.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2sparse.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_coo_test.mlir (+4-4) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_dilated_conv_2d_nhwc_hwcf.mlir (+4-4) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_expand_shape.mlir (+6-6) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_filter_conv2d.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_index_dense.mlir (+8-8) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matvec.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_pack.mlir (+5-5) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_permute.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_pooling_nhwc.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_quantized_matmul.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_rewrite_push_back.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_rewrite_sort_coo.mlir (+15-15) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sampled_matmul.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sampled_mm_fusion.mlir (+2-2) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_spmm.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_strided_conv_2d_nhwc_hwcf.mlir (+4-4) 
- (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_unary.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/Standard/CPU/test-ceil-floor-pos-neg.mlir (+1-1) 
- (modified) mlir/test/Integration/Dialect/Vector/CPU/realloc.mlir (+3-3) 
- (modified) mlir/test/Integration/Dialect/Vector/CPU/transfer-read-1d.mlir (+5-5) 
- (modified) mlir/test/Integration/Dialect/Vector/CPU/transfer-read-2d.mlir (+9-9) 
- (modified) mlir/test/Integration/Dialect/Vector/CPU/transfer-read-3d.mlir (+9-4) 
- (modified) mlir/test/Integration/Dialect/Vector/CPU/transfer-read.mlir (+3-3) 
- (modified) mlir/test/Integration/Dialect/Vector/CPU/transfer-to-loops.mlir (+7-7) 
- (modified) mlir/test/Integration/Dialect/Vector/CPU/transfer-write.mlir (+4-3) 
- (modified) mlir/test/Transforms/loop-invariant-subset-hoisting.mlir (+39-39) 


``````````diff

diff --git a/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td b/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
index 097e5e6fb0d61..4a77291b7fafb 100644
--- a/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
+++ b/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
@@ -1363,7 +1363,7 @@ def Vector_TransferReadOp :
                    AffineMapAttr:$permutation_map,
                    AnyType:$padding,
                    Optional<VectorOf<[I1]>>:$mask,
-                   OptionalAttr<BoolArrayAttr>:$in_bounds)>,
+                   BoolArrayAttr:$in_bounds)>,
     Results<(outs AnyVectorOfAnyRank:$vector)> {
 
   let summary = "Reads a supervector from memory into an SSA vector value.";
@@ -1607,7 +1607,7 @@ def Vector_TransferWriteOp :
                    Variadic<Index>:$indices,
                    AffineMapAttr:$permutation_map,
                    Optional<VectorOf<[I1]>>:$mask,
-                   OptionalAttr<BoolArrayAttr>:$in_bounds)>,
+                   BoolArrayAttr:$in_bounds)>,
     Results<(outs Optional<AnyRankedTensor>:$result)> {
 
   let summary = "The vector.transfer_write op writes a supervector to memory.";
diff --git a/mlir/include/mlir/IR/AffineMap.h b/mlir/include/mlir/IR/AffineMap.h
index cce141253989e..01772f7782ba8 100644
--- a/mlir/include/mlir/IR/AffineMap.h
+++ b/mlir/include/mlir/IR/AffineMap.h
@@ -156,6 +156,10 @@ class AffineMap {
   bool isMinorIdentityWithBroadcasting(
       SmallVectorImpl<unsigned> *broadcastedDims = nullptr) const;
 
+  // TODO: Document
+  void
+  getBroadcastDims(SmallVectorImpl<unsigned> *broadcastedDims = nullptr) const;
+
   /// Return true if this affine map can be converted to a minor identity with
   /// broadcast by doing a permute. Return a permutation (there may be
   /// several) to apply to get to a minor identity with broadcasts.
diff --git a/mlir/include/mlir/Interfaces/VectorInterfaces.td b/mlir/include/mlir/Interfaces/VectorInterfaces.td
index 781d6d3e3f813..f6682f2eabe1e 100644
--- a/mlir/include/mlir/Interfaces/VectorInterfaces.td
+++ b/mlir/include/mlir/Interfaces/VectorInterfaces.td
@@ -98,7 +98,7 @@ def VectorTransferOpInterface : OpInterface<"VectorTransferOpInterface"> {
         dimension whether it is in-bounds or not. (Broadcast dimensions are
         always in-bounds).
       }],
-      /*retTy=*/"::std::optional<::mlir::ArrayAttr>",
+      /*retTy=*/"::mlir::ArrayAttr",
       /*methodName=*/"getInBounds",
       /*args=*/(ins)
     >,
@@ -242,7 +242,7 @@ def VectorTransferOpInterface : OpInterface<"VectorTransferOpInterface"> {
         return true;
       if (!$_op.getInBounds())
         return false;
-      auto inBounds = ::llvm::cast<::mlir::ArrayAttr>(*$_op.getInBounds());
+      auto inBounds = $_op.getInBounds();
       return ::llvm::cast<::mlir::BoolAttr>(inBounds[dim]).getValue();
     }
 
diff --git a/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp b/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
index 71e9648a5e00f..033f35391849e 100644
--- a/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
+++ b/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
@@ -1223,8 +1223,20 @@ static Operation *vectorizeAffineLoad(AffineLoadOp loadOp,
   LLVM_DEBUG(dbgs() << "\n[early-vect]+++++ permutationMap: ");
   LLVM_DEBUG(permutationMap.print(dbgs()));
 
+  // Make sure that the in_bounds attribute corresponding to a broadcast dim
+  // is set to `true` - that's required by the xfer Op.
+  // FIXME: We're not veryfying whether the corresponding access is in bounds.
+  // TODO: Use masking instead.
+  SmallVector<unsigned> broadcastedDims = {};
+  permutationMap.getBroadcastDims(&broadcastedDims);
+  SmallVector<bool> inBounds(vectorType.getRank(), false);
+
+  for (auto idx : broadcastedDims)
+    inBounds[idx] = true;
+
   auto transfer = state.builder.create<vector::TransferReadOp>(
-      loadOp.getLoc(), vectorType, loadOp.getMemRef(), indices, permutationMap);
+      loadOp.getLoc(), vectorType, loadOp.getMemRef(), indices, permutationMap,
+      ArrayRef<bool>(inBounds));
 
   // Register replacement for future uses in the scope.
   state.registerOpVectorReplacement(loadOp, transfer);
diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
index 3a75d2ac08157..c15d101afa94b 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -1338,8 +1338,18 @@ vectorizeAsLinalgGeneric(RewriterBase &rewriter, VectorizationState &state,
 
     SmallVector<Value> indices(linalgOp.getShape(opOperand).size(), zero);
 
+    // Make sure that the in_bounds attribute corresponding to a broadcast dim
+    // is `true`
+    SmallVector<unsigned> broadcastedDims = {};
+    readMap.getBroadcastDims(&broadcastedDims);
+    SmallVector<bool> inBounds(readType.getRank(), false);
+
+    for (auto idx : broadcastedDims)
+      inBounds[idx] = true;
+
     Operation *read = rewriter.create<vector::TransferReadOp>(
-        loc, readType, opOperand->get(), indices, readMap);
+        loc, readType, opOperand->get(), indices, readMap,
+        ArrayRef<bool>(inBounds));
     read = state.maskOperation(rewriter, read, linalgOp, maskingMap);
     Value readValue = read->getResult(0);
 
@@ -2676,11 +2686,12 @@ LogicalResult LinalgCopyVTRForwardingPattern::matchAndRewrite(
   // The `masked` attribute is only valid on this padded buffer.
   // When forwarding to vector.transfer_read, the attribute must be reset
   // conservatively.
+  auto vectorType = xferOp.getVectorType();
   Value res = rewriter.create<vector::TransferReadOp>(
-      xferOp.getLoc(), xferOp.getVectorType(), in, xferOp.getIndices(),
+      xferOp.getLoc(), vectorType, in, xferOp.getIndices(),
       xferOp.getPermutationMapAttr(), xferOp.getPadding(), xferOp.getMask(),
-      // in_bounds is explicitly reset
-      /*inBoundsAttr=*/ArrayAttr());
+      rewriter.getBoolArrayAttr(
+          SmallVector<bool>(vectorType.getRank(), false)));
 
   if (maybeFillOp)
     rewriter.eraseOp(maybeFillOp);
@@ -2734,11 +2745,12 @@ LogicalResult LinalgCopyVTWForwardingPattern::matchAndRewrite(
   // The `masked` attribute is only valid on this padded buffer.
   // When forwarding to vector.transfer_write, the attribute must be reset
   // conservatively.
+  auto vector = xferOp.getVector();
   rewriter.create<vector::TransferWriteOp>(
-      xferOp.getLoc(), xferOp.getVector(), out, xferOp.getIndices(),
+      xferOp.getLoc(), vector, out, xferOp.getIndices(),
       xferOp.getPermutationMapAttr(), xferOp.getMask(),
-      // in_bounds is explicitly reset
-      /*inBoundsAttr=*/ArrayAttr());
+      rewriter.getBoolArrayAttr(
+          SmallVector<bool>(vector.getType().getRank(), false)));
 
   rewriter.eraseOp(copyOp);
   rewriter.eraseOp(xferOp);
diff --git a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
index 149723f51cc12..cb4cb92a66a93 100644
--- a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
+++ b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
@@ -3817,7 +3817,8 @@ void TransferReadOp::build(OpBuilder &builder, OperationState &result,
   auto permutationMapAttr = AffineMapAttr::get(permutationMap);
   auto inBoundsAttr = (inBounds && !inBounds.value().empty())
                           ? builder.getBoolArrayAttr(inBounds.value())
-                          : ArrayAttr();
+                          : builder.getBoolArrayAttr(
+                                SmallVector<bool>(vectorType.getRank(), false));
   build(builder, result, vectorType, source, indices, permutationMapAttr,
         inBoundsAttr);
 }
@@ -3832,7 +3833,8 @@ void TransferReadOp::build(OpBuilder &builder, OperationState &result,
   auto permutationMapAttr = AffineMapAttr::get(permutationMap);
   auto inBoundsAttr = (inBounds && !inBounds.value().empty())
                           ? builder.getBoolArrayAttr(inBounds.value())
-                          : ArrayAttr();
+                          : builder.getBoolArrayAttr(
+                                SmallVector<bool>(vectorType.getRank(), false));
   build(builder, result, vectorType, source, indices, permutationMapAttr,
         padding,
         /*mask=*/Value(), inBoundsAttr);
@@ -3950,17 +3952,15 @@ verifyTransferOp(VectorTransferOpInterface op, ShapedType shapedType,
            << inferredMaskType << ") and mask operand type (" << maskType
            << ") don't match";
 
-  if (inBounds) {
-    if (permutationMap.getNumResults() != static_cast<int64_t>(inBounds.size()))
-      return op->emitOpError("expects the optional in_bounds attr of same rank "
-                             "as permutation_map results: ")
-             << AffineMapAttr::get(permutationMap)
-             << " vs inBounds of size: " << inBounds.size();
-    for (unsigned int i = 0; i < permutationMap.getNumResults(); ++i)
-      if (isa<AffineConstantExpr>(permutationMap.getResult(i)) &&
-          !llvm::cast<BoolAttr>(inBounds.getValue()[i]).getValue())
-        return op->emitOpError("requires broadcast dimensions to be in-bounds");
-  }
+  if (permutationMap.getNumResults() != static_cast<int64_t>(inBounds.size()))
+    return op->emitOpError("expects the in_bounds attr of same rank "
+                           "as permutation_map results: ")
+           << AffineMapAttr::get(permutationMap)
+           << " vs inBounds of size: " << inBounds.size();
+  for (unsigned int i = 0; i < permutationMap.getNumResults(); ++i)
+    if (isa<AffineConstantExpr>(permutationMap.getResult(i)) &&
+        !llvm::cast<BoolAttr>(inBounds.getValue()[i]).getValue())
+      return op->emitOpError("requires broadcast dimensions to be in-bounds");
 
   return success();
 }
@@ -3970,9 +3970,6 @@ static void printTransferAttrs(OpAsmPrinter &p, VectorTransferOpInterface op) {
   elidedAttrs.push_back(TransferReadOp::getOperandSegmentSizeAttr());
   if (op.getPermutationMap().isMinorIdentity())
     elidedAttrs.push_back(op.getPermutationMapAttrName());
-  // Elide in_bounds attribute if all dims are out-of-bounds.
-  if (llvm::none_of(op.getInBoundsValues(), [](bool b) { return b; }))
-    elidedAttrs.push_back(op.getInBoundsAttrName());
   p.printOptionalAttrDict(op->getAttrs(), elidedAttrs);
 }
 
@@ -4080,8 +4077,7 @@ LogicalResult TransferReadOp::verify() {
 
   if (failed(verifyTransferOp(cast<VectorTransferOpInterface>(getOperation()),
                               shapedType, vectorType, maskType,
-                              inferredMaskType, permutationMap,
-                              getInBounds() ? *getInBounds() : ArrayAttr())))
+                              inferredMaskType, permutationMap, getInBounds())))
     return failure();
 
   if (auto sourceVectorElementType =
@@ -4354,9 +4350,11 @@ void TransferWriteOp::build(OpBuilder &builder, OperationState &result,
                             AffineMap permutationMap,
                             std::optional<ArrayRef<bool>> inBounds) {
   auto permutationMapAttr = AffineMapAttr::get(permutationMap);
-  auto inBoundsAttr = (inBounds && !inBounds.value().empty())
-                          ? builder.getBoolArrayAttr(inBounds.value())
-                          : ArrayAttr();
+  auto inBoundsAttr =
+      (inBounds && !inBounds.value().empty())
+          ? builder.getBoolArrayAttr(inBounds.value())
+          : builder.getBoolArrayAttr(SmallVector<bool>(
+                llvm::cast<VectorType>(vector.getType()).getRank(), false));
   build(builder, result, vector, dest, indices, permutationMapAttr,
         /*mask=*/Value(), inBoundsAttr);
 }
@@ -4462,8 +4460,7 @@ LogicalResult TransferWriteOp::verify() {
 
   if (failed(verifyTransferOp(cast<VectorTransferOpInterface>(getOperation()),
                               shapedType, vectorType, maskType,
-                              inferredMaskType, permutationMap,
-                              getInBounds() ? *getInBounds() : ArrayAttr())))
+                              inferredMaskType, permutationMap, getInBounds())))
     return failure();
 
   return verifyPermutationMap(permutationMap,
diff --git a/mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp b/mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp
index f53bb5157eb37..dfeb7bc53adad 100644
--- a/mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp
@@ -224,7 +224,7 @@ struct MaskedTransferReadOpPattern
     rewriter.replaceOpWithNewOp<TransferReadOp>(
         maskingOp.getOperation(), readOp.getVectorType(), readOp.getSource(),
         readOp.getIndices(), readOp.getPermutationMap(), readOp.getPadding(),
-        maskingOp.getMask(), readOp.getInBounds().value_or(ArrayAttr()));
+        maskingOp.getMask(), readOp.getInBounds());
     return success();
   }
 };
@@ -246,7 +246,7 @@ struct MaskedTransferWriteOpPattern
     rewriter.replaceOpWithNewOp<TransferWriteOp>(
         maskingOp.getOperation(), resultType, writeOp.getVector(),
         writeOp.getSource(), writeOp.getIndices(), writeOp.getPermutationMap(),
-        maskingOp.getMask(), writeOp.getInBounds().value_or(ArrayAttr()));
+        maskingOp.getMask(), writeOp.getInBounds());
     return success();
   }
 };
diff --git a/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp b/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp
index c31c51489ecc9..b3c6dec47f6be 100644
--- a/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp
@@ -133,9 +133,7 @@ struct TransferReadPermutationLowering
 
     // Transpose in_bounds attribute.
     ArrayAttr newInBoundsAttr =
-        op.getInBounds() ? inverseTransposeInBoundsAttr(
-                               rewriter, op.getInBounds().value(), permutation)
-                         : ArrayAttr();
+        inverseTransposeInBoundsAttr(rewriter, op.getInBounds(), permutation);
 
     // Generate new transfer_read operation.
     VectorType newReadType = VectorType::get(
@@ -208,9 +206,7 @@ struct TransferWritePermutationLowering
 
     // Transpose in_bounds attribute.
     ArrayAttr newInBoundsAttr =
-        op.getInBounds() ? inverseTransposeInBoundsAttr(
-                               rewriter, op.getInBounds().value(), permutation)
-                         : ArrayAttr();
+        inverseTransposeInBoundsAttr(rewriter, op.getInBounds(), permutation);
 
     // Generate new transfer_write operation.
     Value newVec = rewriter.create<vector::TransposeOp>(
diff --git a/mlir/lib/IR/AffineMap.cpp b/mlir/lib/IR/AffineMap.cpp
index e5993eb08dc8b..3df6d1e81d0f8 100644
--- a/mlir/lib/IR/AffineMap.cpp
+++ b/mlir/lib/IR/AffineMap.cpp
@@ -188,6 +188,23 @@ bool AffineMap::isMinorIdentityWithBroadcasting(
   return true;
 }
 
+void AffineMap::getBroadcastDims(
+    SmallVectorImpl<unsigned> *broadcastedDims) const {
+  if (broadcastedDims)
+    broadcastedDims->clear();
+  for (const auto &idxAndExpr : llvm::enumerate(getResults())) {
+    unsigned resIdx = idxAndExpr.index();
+    AffineExpr expr = idxAndExpr.value();
+    if (auto constExpr = dyn_cast<AffineConstantExpr>(expr)) {
+      // Each result may be either a constant 0 (broadcasted dimension).
+      if (constExpr.getValue() != 0)
+        continue;
+      if (broadcastedDims)
+        broadcastedDims->push_back(resIdx);
+    }
+  }
+}
+
 /// Return true if this affine map can be converted to a minor identity with
 /// broadcast by doing a permute. Return a permutation (there may be
 /// several) to apply to get to a minor identity with broadcasts.
diff --git a/mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir b/mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir
index 1abadcc345cd2..d73efd41cce05 100644
--- a/mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir
+++ b/mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir
@@ -73,6 +73,6 @@ func.func @genbool_var_1d_scalable(%arg0: index) -> vector<[11]xi1> {
 
 func.func @transfer_read_1d(%A : memref<?xf32>, %i: index) -> vector<16xf32> {
   %d = arith.constant -1.0: f32
-  %f = vector.transfer_read %A[%i], %d {permutation_map = affine_map<(d0) -> (d0)>} : memref<?xf32>, vector<16xf32>
+  %f = vector.transfer_read %A[%i], %d {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} : memref<?xf32>, vector<16xf32>
   return %f : vector<16xf32>
 }
diff --git a/mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir b/mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
index 09b79708a9ab2..36a3c6eeb175f 100644
--- a/mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
+++ b/mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
@@ -1689,10 +1689,10 @@ func.func @matrix_ops_index(%A: vector<64xindex>, %B: vector<48xindex>) -> vecto
 func.func @transfer_read_1d(%A : memref<?xf32>, %base: index) -> vector<17xf32> {
   %f7 = arith.constant 7.0: f32
   %f = vector.transfer_read %A[%base], %f7
-      {permutation_map = affine_map<(d0) -> (d0)>} :
+      {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
     memref<?xf32>, vector<17xf32>
   vector.transfer_write %f, %A[%base]
-      {permutation_map = affine_map<(d0) -> (d0)>} :
+      {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
     vector<17xf32>, memref<?xf32>
   return %f: vector<17xf32>
 }
@@ -1763,10 +1763,10 @@ func.func @transfer_read_1d(%A : memref<?xf32>, %base: index) -> vector<17xf32>
 func.func @transfer_read_index_1d(%A : memref<?xindex>, %base: index) -> vector<17xindex> {
   %f7 = arith.constant 7: index
   %f = vector.transfer_read %A[%base], %f7
-      {permutation_map = affine_map<(d0) -> (d0)>} :
+      {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
     memref<?xindex>, vector<17xindex>
   vector.transfer_write %f, %A[%base]
-      {permutation_map = affine_map<(d0) -> (d0)>} :
+      {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
     vector<17xindex>, memref<?xindex>
   return %f: vector<17xindex>
 }
@@ -1786,7 +1786,7 @@ func.func @transfer_read_index_1d(%A : memref<?xindex>, %base: index) -> vector<
 func.func @transfer_read_2d_to_1d(%A : memref<?x?xf32>, %base0: index, %base1: index) -> vector<17xf32> {
   %f7 = arith.constant 7.0: f32
   %f = vector.transfer_read %A[%base0, %base1], %f7
-      {permutation_map = affine_map<(d0, d1) -> (d1)>} :
+      {in_bounds = [false], permutation_map = affine_map<(d0, d1) -> (d1)>} :
     memref<?x?xf32>, vector<17xf32>
   return %f: vector<17xf32>
 }
@@ -1815,10 +1815,10 @@ func.func @transfer_read_2d_to_1d(%A : memref<?x?xf32>, %base0: index, %base1: i
 func.func @transfer_read_1d_non_zero_addrspace(%A : memref<?xf32, 3>, %base: index) -> vector<17xf32> {
   %f7 = arith.constant 7.0: f32
   %f = vector.transfer_read %A[%base], %f7
-      {permutation_map = affine_map<(d0) -> (d0)>} :
+      {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
     memref<?xf32, 3>, vector<17xf32>
   vector.transfer_write %f, %A[%base]
-      {permutation_map = affine_map<(d0) -> (d0)>} :
+      {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
     vector<17xf32>, memref<?xf32, 3>
   return %f: vector<17xf32>
 }
@@ -1866,7 +1866,7 @@ func.func @transfer_read_1d_inbounds(%A : memref<?xf32>, %base: index) -> vector
 func.func @transfer_read_1d_mask(%A : memref<?xf32>, %base : index) -> vector<5xf32> {
   %m = arith.constant dense<[0, 0, 1, 0, 1]> : vector<5xi1>
   %f7 = arith.constant 7.0: f32
-  %f = vector.transfer_read %A[%base], %f7, %m : memref<?xf32>, vector<5xf32>
+  %f = vector.transfer_read %A[%base], %f7, %m {in_bounds=[false]} : memref<?xf32>, vector<5xf32>
   return %f: vector<5xf32>
 }
 
diff --git a/mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir b/mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir
index 7d97829c06599..598530eecec64 100644
--- a/mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir
+++ b/mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir
@@ -51,7 +51,7 @@ func.func @transfer_read_out_of_bounds(%A : memref<?x?x?xf32>) -> (vector<2x3x4x
   // CHECK: vector.transfer_read {{.*}} : memref<?x?x?xf32>, vector<4xf32>
   // CHECK: vector.insert {{.*}} [1, 2] : vector<4xf32> into vector<2x3x4xf32>
   // CHECK-NOT: scf.for
-  %vec = vector.transfer_read %A[%c0, %c0, %c0], %f0 : memref<?x?x?xf32>, vector<2x3x4xf32>
+  %vec = vector.transfer_read %A[%c0, %c0, %c0], %f0 {in_...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/97049