[Mlir-commits] [mlir] [mlir][vector] Make the in_bounds attribute mandatory (PR #97049)
Andrzej Warzyński
llvmlistbot at llvm.org
Tue Jul 2 02:42:01 PDT 2024
https://github.com/banach-space updated https://github.com/llvm/llvm-project/pull/97049
From d3445c947cea43277cd2bd34ff796d54228487fe Mon Sep 17 00:00:00 2001
From: Andrzej Warzynski <andrzej.warzynski at arm.com>
Date: Mon, 1 Jul 2024 10:35:22 +0100
Subject: [PATCH 1/2] [mlir][vector] Make the in_bounds attribute mandatory
Makes the `in_bounds` attribute for vector.transfer_read and
vector.transfer_write Ops mandatory. In addition, makes the Asm printer
always print this attribute - tests are updated accordingly.
1. Updates in tests - default `in_bounds` value
Originally, most tests would skip the `in_bounds` attribute - this was
equivalent to setting all values to `false` [1]. With this change, this
has to be done explicitly when writing a test. Note, especially when
reviewing this change, that the vast majority of newly inserted
`in_bounds` attributes are set to `false` to preserve the original
semantics of the tests.
There is only one exception - for broadcast dimensions the newly
inserted `in_bounds` attribute is set to `true`. As per [2]:
```
vector.transfer_read op requires broadcast dimensions to be in-bounds
```
This matches the original semantics:
* the `in_bounds` attribute in the context of broadcast dimensions
would only be checked when present,
* the verifier wasn't aware of the default value set in [1].
This means that effectively, the attribute was set to `false` even for
broadcast dims, but the verifier wasn't aware of that. This change makes
that behaviour more explicit by setting the attribute to `true` for
broadcast dims. In all other cases, the attribute is set to `false`; any
deviation from that in this patch should be treated as a typo.
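The verifier rule described above can be modelled outside MLIR. The
snippet below is only a standalone sketch, not the MLIR verifier:
`PermResult` and `verifyInBounds` are hypothetical names, and a
permutation-map result is reduced to a single flag saying whether it is a
broadcast (constant) result. It assumes the two checks that
`verifyTransferOp` performs in this patch: matching sizes, and broadcast
dims being in-bounds.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one permutation_map result: either a memory
// dimension (dN) or a constant, which denotes a broadcast dimension.
struct PermResult {
  bool isBroadcast;
};

// Returns an error message if `inBounds` is not valid for `results`, and
// std::nullopt otherwise.
std::optional<std::string>
verifyInBounds(const std::vector<PermResult> &results,
               const std::vector<bool> &inBounds) {
  // The attribute is mandatory, so its size must always match.
  if (results.size() != inBounds.size())
    return "expects the in_bounds attr of same rank as permutation_map results";
  // Every broadcast dimension must be marked as in-bounds.
  for (std::size_t i = 0; i < results.size(); ++i)
    if (results[i].isBroadcast && !inBounds[i])
      return "requires broadcast dimensions to be in-bounds";
  return std::nullopt;
}
```

For a map such as `(d0, d1, d2, d3) -> (d3, 0, d0)`, this model accepts
`[false, true, false]` and rejects `[false, false, false]`.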
2. Updates in tests - 0-D vectors
Reading from and writing to 0-D vectors also requires the `in_bounds`
attribute. In this case, the attribute has to be empty:
```mlir
vector.transfer_write %5, %m1[] {in_bounds=[]} : vector<f32>, memref<f32>
```
3. Updates in tests - CHECK lines
With this PR, the `in_bounds` attribute is always printed. This required
updating the `CHECK` lines that previously assumed that the attribute
would be skipped. To keep these changes simple, I've only added
`{{.*}}` to make sure that the tests pass.
4. Changes in "Vectorization.cpp"
The following patterns are updated to explicitly set the `in_bounds`
attribute to `false`:
* `LinalgCopyVTRForwardingPattern` and `LinalgCopyVTWForwardingPattern`
5. Changes in "SuperVectorize.cpp" and "Vectorization.cpp"
The updates in `vectorizeAffineLoad` (SuperVectorize.cpp) and
`vectorizeAsLinalgGeneric` (Vectorization.cpp) are introduced to make
sure that xfer Ops created by these vectorisers mark the dimensions
corresponding to broadcast dims as "in bounds". Otherwise, the Op
verifier would fail.
Note that there is no mechanism to verify whether the corresponding
memory accesses are indeed in bounds. Previously, when `in_bounds` was
optional, the verification would skip checking the attribute if it
wasn't present. However, it would default to `false` in other places.
Put differently, this change does not alter the existing behaviour; it
merely makes it more explicit.
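The way the vectorisers derive `in_bounds` can be sketched without MLIR.
This is an illustrative model only: `MapResult`, `getBroadcastDims` (as a
free function) and `makeInBounds` are hypothetical stand-ins, with each
permutation-map result modelled as either a dimension (`std::nullopt`) or
a constant value. As in the patch, only a constant 0 counts as a
broadcast dim, and every other entry defaults to `false`.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model of a permutation_map result: std::nullopt stands for a
// dimension expression (dN); a value stands for a constant expression.
using MapResult = std::optional<long>;

// Collects the indices of results that are the constant 0 (broadcast dims).
std::vector<unsigned> getBroadcastDims(const std::vector<MapResult> &results) {
  std::vector<unsigned> dims;
  for (std::size_t i = 0; i < results.size(); ++i)
    if (results[i].has_value() && *results[i] == 0)
      dims.push_back(static_cast<unsigned>(i));
  return dims;
}

// Builds the in_bounds values: `true` for broadcast dims, `false` elsewhere.
std::vector<bool> makeInBounds(const std::vector<MapResult> &results) {
  std::vector<bool> inBounds(results.size(), false);
  for (unsigned idx : getBroadcastDims(results))
    inBounds[idx] = true;
  return inBounds;
}
```

Note that, like the patch, this sketch does not check that the underlying
memory accesses really are in bounds; it only satisfies the verifier.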
[1] https://github.com/llvm/llvm-project/blob/4145ad2bac4bb99d5034d60c74bb2789f6c6e802/mlir/include/mlir/Interfaces/VectorInterfaces.td#L243-L246
[2] https://mlir.llvm.org/docs/Dialects/Vector/#vectortransfer_read-vectortransferreadop
---
.../mlir/Dialect/Vector/IR/VectorOps.td | 4 +-
mlir/include/mlir/IR/AffineMap.h | 4 ++
.../mlir/Interfaces/VectorInterfaces.td | 4 +-
.../Affine/Transforms/SuperVectorize.cpp | 14 +++++-
.../Linalg/Transforms/Vectorization.cpp | 26 ++++++++---
mlir/lib/Dialect/Vector/IR/VectorOps.cpp | 43 +++++++++----------
.../Vector/Transforms/LowerVectorMask.cpp | 4 +-
.../Vector/Transforms/LowerVectorTransfer.cpp | 8 +---
mlir/lib/IR/AffineMap.cpp | 17 ++++++++
9 files changed, 81 insertions(+), 43 deletions(-)
diff --git a/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td b/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
index 097e5e6fb0d61..4a77291b7fafb 100644
--- a/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
+++ b/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
@@ -1363,7 +1363,7 @@ def Vector_TransferReadOp :
AffineMapAttr:$permutation_map,
AnyType:$padding,
Optional<VectorOf<[I1]>>:$mask,
- OptionalAttr<BoolArrayAttr>:$in_bounds)>,
+ BoolArrayAttr:$in_bounds)>,
Results<(outs AnyVectorOfAnyRank:$vector)> {
let summary = "Reads a supervector from memory into an SSA vector value.";
@@ -1607,7 +1607,7 @@ def Vector_TransferWriteOp :
Variadic<Index>:$indices,
AffineMapAttr:$permutation_map,
Optional<VectorOf<[I1]>>:$mask,
- OptionalAttr<BoolArrayAttr>:$in_bounds)>,
+ BoolArrayAttr:$in_bounds)>,
Results<(outs Optional<AnyRankedTensor>:$result)> {
let summary = "The vector.transfer_write op writes a supervector to memory.";
diff --git a/mlir/include/mlir/IR/AffineMap.h b/mlir/include/mlir/IR/AffineMap.h
index cce141253989e..01772f7782ba8 100644
--- a/mlir/include/mlir/IR/AffineMap.h
+++ b/mlir/include/mlir/IR/AffineMap.h
@@ -156,6 +156,10 @@ class AffineMap {
bool isMinorIdentityWithBroadcasting(
SmallVectorImpl<unsigned> *broadcastedDims = nullptr) const;
+ // TODO: Document
+ void
+ getBroadcastDims(SmallVectorImpl<unsigned> *broadcastedDims = nullptr) const;
+
/// Return true if this affine map can be converted to a minor identity with
/// broadcast by doing a permute. Return a permutation (there may be
/// several) to apply to get to a minor identity with broadcasts.
diff --git a/mlir/include/mlir/Interfaces/VectorInterfaces.td b/mlir/include/mlir/Interfaces/VectorInterfaces.td
index 781d6d3e3f813..f6682f2eabe1e 100644
--- a/mlir/include/mlir/Interfaces/VectorInterfaces.td
+++ b/mlir/include/mlir/Interfaces/VectorInterfaces.td
@@ -98,7 +98,7 @@ def VectorTransferOpInterface : OpInterface<"VectorTransferOpInterface"> {
dimension whether it is in-bounds or not. (Broadcast dimensions are
always in-bounds).
}],
- /*retTy=*/"::std::optional<::mlir::ArrayAttr>",
+ /*retTy=*/"::mlir::ArrayAttr",
/*methodName=*/"getInBounds",
/*args=*/(ins)
>,
@@ -242,7 +242,7 @@ def VectorTransferOpInterface : OpInterface<"VectorTransferOpInterface"> {
return true;
if (!$_op.getInBounds())
return false;
- auto inBounds = ::llvm::cast<::mlir::ArrayAttr>(*$_op.getInBounds());
+ auto inBounds = $_op.getInBounds();
return ::llvm::cast<::mlir::BoolAttr>(inBounds[dim]).getValue();
}
diff --git a/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp b/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
index 71e9648a5e00f..033f35391849e 100644
--- a/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
+++ b/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
@@ -1223,8 +1223,20 @@ static Operation *vectorizeAffineLoad(AffineLoadOp loadOp,
LLVM_DEBUG(dbgs() << "\n[early-vect]+++++ permutationMap: ");
LLVM_DEBUG(permutationMap.print(dbgs()));
+ // Make sure that the in_bounds attribute corresponding to a broadcast dim
+ // is set to `true` - that's required by the xfer Op.
+ // FIXME: We're not verifying whether the corresponding access is in bounds.
+ // TODO: Use masking instead.
+ SmallVector<unsigned> broadcastedDims = {};
+ permutationMap.getBroadcastDims(&broadcastedDims);
+ SmallVector<bool> inBounds(vectorType.getRank(), false);
+
+ for (auto idx : broadcastedDims)
+ inBounds[idx] = true;
+
auto transfer = state.builder.create<vector::TransferReadOp>(
- loadOp.getLoc(), vectorType, loadOp.getMemRef(), indices, permutationMap);
+ loadOp.getLoc(), vectorType, loadOp.getMemRef(), indices, permutationMap,
+ ArrayRef<bool>(inBounds));
// Register replacement for future uses in the scope.
state.registerOpVectorReplacement(loadOp, transfer);
diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
index 3a75d2ac08157..c15d101afa94b 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -1338,8 +1338,18 @@ vectorizeAsLinalgGeneric(RewriterBase &rewriter, VectorizationState &state,
SmallVector<Value> indices(linalgOp.getShape(opOperand).size(), zero);
+ // Make sure that the in_bounds attribute corresponding to a broadcast dim
+ // is `true`
+ SmallVector<unsigned> broadcastedDims = {};
+ readMap.getBroadcastDims(&broadcastedDims);
+ SmallVector<bool> inBounds(readType.getRank(), false);
+
+ for (auto idx : broadcastedDims)
+ inBounds[idx] = true;
+
Operation *read = rewriter.create<vector::TransferReadOp>(
- loc, readType, opOperand->get(), indices, readMap);
+ loc, readType, opOperand->get(), indices, readMap,
+ ArrayRef<bool>(inBounds));
read = state.maskOperation(rewriter, read, linalgOp, maskingMap);
Value readValue = read->getResult(0);
@@ -2676,11 +2686,12 @@ LogicalResult LinalgCopyVTRForwardingPattern::matchAndRewrite(
// The `masked` attribute is only valid on this padded buffer.
// When forwarding to vector.transfer_read, the attribute must be reset
// conservatively.
+ auto vectorType = xferOp.getVectorType();
Value res = rewriter.create<vector::TransferReadOp>(
- xferOp.getLoc(), xferOp.getVectorType(), in, xferOp.getIndices(),
+ xferOp.getLoc(), vectorType, in, xferOp.getIndices(),
xferOp.getPermutationMapAttr(), xferOp.getPadding(), xferOp.getMask(),
- // in_bounds is explicitly reset
- /*inBoundsAttr=*/ArrayAttr());
+ rewriter.getBoolArrayAttr(
+ SmallVector<bool>(vectorType.getRank(), false)));
if (maybeFillOp)
rewriter.eraseOp(maybeFillOp);
@@ -2734,11 +2745,12 @@ LogicalResult LinalgCopyVTWForwardingPattern::matchAndRewrite(
// The `masked` attribute is only valid on this padded buffer.
// When forwarding to vector.transfer_write, the attribute must be reset
// conservatively.
+ auto vector = xferOp.getVector();
rewriter.create<vector::TransferWriteOp>(
- xferOp.getLoc(), xferOp.getVector(), out, xferOp.getIndices(),
+ xferOp.getLoc(), vector, out, xferOp.getIndices(),
xferOp.getPermutationMapAttr(), xferOp.getMask(),
- // in_bounds is explicitly reset
- /*inBoundsAttr=*/ArrayAttr());
+ rewriter.getBoolArrayAttr(
+ SmallVector<bool>(vector.getType().getRank(), false)));
rewriter.eraseOp(copyOp);
rewriter.eraseOp(xferOp);
diff --git a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
index 149723f51cc12..cb4cb92a66a93 100644
--- a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
+++ b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
@@ -3817,7 +3817,8 @@ void TransferReadOp::build(OpBuilder &builder, OperationState &result,
auto permutationMapAttr = AffineMapAttr::get(permutationMap);
auto inBoundsAttr = (inBounds && !inBounds.value().empty())
? builder.getBoolArrayAttr(inBounds.value())
- : ArrayAttr();
+ : builder.getBoolArrayAttr(
+ SmallVector<bool>(vectorType.getRank(), false));
build(builder, result, vectorType, source, indices, permutationMapAttr,
inBoundsAttr);
}
@@ -3832,7 +3833,8 @@ void TransferReadOp::build(OpBuilder &builder, OperationState &result,
auto permutationMapAttr = AffineMapAttr::get(permutationMap);
auto inBoundsAttr = (inBounds && !inBounds.value().empty())
? builder.getBoolArrayAttr(inBounds.value())
- : ArrayAttr();
+ : builder.getBoolArrayAttr(
+ SmallVector<bool>(vectorType.getRank(), false));
build(builder, result, vectorType, source, indices, permutationMapAttr,
padding,
/*mask=*/Value(), inBoundsAttr);
@@ -3950,17 +3952,15 @@ verifyTransferOp(VectorTransferOpInterface op, ShapedType shapedType,
<< inferredMaskType << ") and mask operand type (" << maskType
<< ") don't match";
- if (inBounds) {
- if (permutationMap.getNumResults() != static_cast<int64_t>(inBounds.size()))
- return op->emitOpError("expects the optional in_bounds attr of same rank "
- "as permutation_map results: ")
- << AffineMapAttr::get(permutationMap)
- << " vs inBounds of size: " << inBounds.size();
- for (unsigned int i = 0; i < permutationMap.getNumResults(); ++i)
- if (isa<AffineConstantExpr>(permutationMap.getResult(i)) &&
- !llvm::cast<BoolAttr>(inBounds.getValue()[i]).getValue())
- return op->emitOpError("requires broadcast dimensions to be in-bounds");
- }
+ if (permutationMap.getNumResults() != static_cast<int64_t>(inBounds.size()))
+ return op->emitOpError("expects the in_bounds attr of same rank "
+ "as permutation_map results: ")
+ << AffineMapAttr::get(permutationMap)
+ << " vs inBounds of size: " << inBounds.size();
+ for (unsigned int i = 0; i < permutationMap.getNumResults(); ++i)
+ if (isa<AffineConstantExpr>(permutationMap.getResult(i)) &&
+ !llvm::cast<BoolAttr>(inBounds.getValue()[i]).getValue())
+ return op->emitOpError("requires broadcast dimensions to be in-bounds");
return success();
}
@@ -3970,9 +3970,6 @@ static void printTransferAttrs(OpAsmPrinter &p, VectorTransferOpInterface op) {
elidedAttrs.push_back(TransferReadOp::getOperandSegmentSizeAttr());
if (op.getPermutationMap().isMinorIdentity())
elidedAttrs.push_back(op.getPermutationMapAttrName());
- // Elide in_bounds attribute if all dims are out-of-bounds.
- if (llvm::none_of(op.getInBoundsValues(), [](bool b) { return b; }))
- elidedAttrs.push_back(op.getInBoundsAttrName());
p.printOptionalAttrDict(op->getAttrs(), elidedAttrs);
}
@@ -4080,8 +4077,7 @@ LogicalResult TransferReadOp::verify() {
if (failed(verifyTransferOp(cast<VectorTransferOpInterface>(getOperation()),
shapedType, vectorType, maskType,
- inferredMaskType, permutationMap,
- getInBounds() ? *getInBounds() : ArrayAttr())))
+ inferredMaskType, permutationMap, getInBounds())))
return failure();
if (auto sourceVectorElementType =
@@ -4354,9 +4350,11 @@ void TransferWriteOp::build(OpBuilder &builder, OperationState &result,
AffineMap permutationMap,
std::optional<ArrayRef<bool>> inBounds) {
auto permutationMapAttr = AffineMapAttr::get(permutationMap);
- auto inBoundsAttr = (inBounds && !inBounds.value().empty())
- ? builder.getBoolArrayAttr(inBounds.value())
- : ArrayAttr();
+ auto inBoundsAttr =
+ (inBounds && !inBounds.value().empty())
+ ? builder.getBoolArrayAttr(inBounds.value())
+ : builder.getBoolArrayAttr(SmallVector<bool>(
+ llvm::cast<VectorType>(vector.getType()).getRank(), false));
build(builder, result, vector, dest, indices, permutationMapAttr,
/*mask=*/Value(), inBoundsAttr);
}
@@ -4462,8 +4460,7 @@ LogicalResult TransferWriteOp::verify() {
if (failed(verifyTransferOp(cast<VectorTransferOpInterface>(getOperation()),
shapedType, vectorType, maskType,
- inferredMaskType, permutationMap,
- getInBounds() ? *getInBounds() : ArrayAttr())))
+ inferredMaskType, permutationMap, getInBounds())))
return failure();
return verifyPermutationMap(permutationMap,
diff --git a/mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp b/mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp
index f53bb5157eb37..dfeb7bc53adad 100644
--- a/mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp
@@ -224,7 +224,7 @@ struct MaskedTransferReadOpPattern
rewriter.replaceOpWithNewOp<TransferReadOp>(
maskingOp.getOperation(), readOp.getVectorType(), readOp.getSource(),
readOp.getIndices(), readOp.getPermutationMap(), readOp.getPadding(),
- maskingOp.getMask(), readOp.getInBounds().value_or(ArrayAttr()));
+ maskingOp.getMask(), readOp.getInBounds());
return success();
}
};
@@ -246,7 +246,7 @@ struct MaskedTransferWriteOpPattern
rewriter.replaceOpWithNewOp<TransferWriteOp>(
maskingOp.getOperation(), resultType, writeOp.getVector(),
writeOp.getSource(), writeOp.getIndices(), writeOp.getPermutationMap(),
- maskingOp.getMask(), writeOp.getInBounds().value_or(ArrayAttr()));
+ maskingOp.getMask(), writeOp.getInBounds());
return success();
}
};
diff --git a/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp b/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp
index c31c51489ecc9..b3c6dec47f6be 100644
--- a/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp
@@ -133,9 +133,7 @@ struct TransferReadPermutationLowering
// Transpose in_bounds attribute.
ArrayAttr newInBoundsAttr =
- op.getInBounds() ? inverseTransposeInBoundsAttr(
- rewriter, op.getInBounds().value(), permutation)
- : ArrayAttr();
+ inverseTransposeInBoundsAttr(rewriter, op.getInBounds(), permutation);
// Generate new transfer_read operation.
VectorType newReadType = VectorType::get(
@@ -208,9 +206,7 @@ struct TransferWritePermutationLowering
// Transpose in_bounds attribute.
ArrayAttr newInBoundsAttr =
- op.getInBounds() ? inverseTransposeInBoundsAttr(
- rewriter, op.getInBounds().value(), permutation)
- : ArrayAttr();
+ inverseTransposeInBoundsAttr(rewriter, op.getInBounds(), permutation);
// Generate new transfer_write operation.
Value newVec = rewriter.create<vector::TransposeOp>(
diff --git a/mlir/lib/IR/AffineMap.cpp b/mlir/lib/IR/AffineMap.cpp
index e5993eb08dc8b..3df6d1e81d0f8 100644
--- a/mlir/lib/IR/AffineMap.cpp
+++ b/mlir/lib/IR/AffineMap.cpp
@@ -188,6 +188,23 @@ bool AffineMap::isMinorIdentityWithBroadcasting(
return true;
}
+void AffineMap::getBroadcastDims(
+ SmallVectorImpl<unsigned> *broadcastedDims) const {
+ if (broadcastedDims)
+ broadcastedDims->clear();
+ for (const auto &idxAndExpr : llvm::enumerate(getResults())) {
+ unsigned resIdx = idxAndExpr.index();
+ AffineExpr expr = idxAndExpr.value();
+ if (auto constExpr = dyn_cast<AffineConstantExpr>(expr)) {
+ // Each result may be either a constant 0 (broadcasted dimension).
+ if (constExpr.getValue() != 0)
+ continue;
+ if (broadcastedDims)
+ broadcastedDims->push_back(resIdx);
+ }
+ }
+}
+
/// Return true if this affine map can be converted to a minor identity with
/// broadcast by doing a permute. Return a permutation (there may be
/// several) to apply to get to a minor identity with broadcasts.
From 01bf1554d9859c6ac6958f410d2de8ee0e1d7d7f Mon Sep 17 00:00:00 2001
From: Andrzej Warzynski <andrzej.warzynski at arm.com>
Date: Mon, 1 Jul 2024 10:37:07 +0100
Subject: [PATCH 2/2] Update tests
---
.../VectorToLLVM/vector-mask-to-llvm.mlir | 2 +-
.../VectorToLLVM/vector-to-llvm.mlir | 16 +--
.../VectorToSCF/unrolled-vector-to-loops.mlir | 2 +-
...ector-to-scf-mask-and-permutation-map.mlir | 4 +-
.../Conversion/VectorToSCF/vector-to-scf.mlir | 66 ++++++------
.../Affine/SuperVectorize/vector_utils.mlir | 2 +-
.../Affine/SuperVectorize/vectorize_1d.mlir | 20 ++--
.../Affine/SuperVectorize/vectorize_2d.mlir | 10 +-
.../vectorize_affine_apply.mlir | 12 +--
.../vectorize_outer_loop_2d.mlir | 2 +-
.../vectorize_outer_loop_transpose_2d.mlir | 8 +-
.../vectorize_transpose_2d.mlir | 8 +-
.../Dialect/ArmSME/vector-legalization.mlir | 10 +-
...e-analysis-bottom-up-from-terminators.mlir | 4 +-
.../one-shot-bufferize-partial.mlir | 14 +--
.../Transforms/one-shot-bufferize.mlir | 10 +-
.../one-shot-module-bufferize-analysis.mlir | 22 ++--
...ule-bufferize-force-copy-before-write.mlir | 4 +-
.../Transforms/one-shot-module-bufferize.mlir | 8 +-
.../Transforms/transform-ops.mlir | 10 +-
.../Linalg/forward-vector-transfers.mlir | 13 ++-
mlir/test/Dialect/Linalg/hoisting.mlir | 100 +++++++++---------
.../Dialect/Linalg/one-shot-bufferize.mlir | 6 +-
.../transform-op-bufferize-to-allocation.mlir | 2 +-
.../Linalg/vectorization-with-patterns.mlir | 18 ++--
mlir/test/Dialect/Linalg/vectorization.mlir | 2 +-
.../Linalg/vectorize-tensor-extract.mlir | 2 +-
.../MemRef/extract-address-computations.mlir | 20 ++--
.../Dialect/MemRef/fold-memref-alias-ops.mlir | 2 +-
.../NVGPU/transform-pipeline-shared.mlir | 8 +-
.../SCF/one-shot-bufferize-analysis.mlir | 38 +++----
mlir/test/Dialect/SCF/one-shot-bufferize.mlir | 7 +-
.../Tensor/fold-tensor-subset-ops.mlir | 2 +-
.../Dialect/Tensor/one-shot-bufferize.mlir | 4 +-
.../Dialect/Vector/bufferize-invalid.mlir | 2 +-
mlir/test/Dialect/Vector/canonicalize.mlir | 44 ++++----
mlir/test/Dialect/Vector/invalid.mlir | 53 +++++-----
.../Dialect/Vector/lower-vector-mask.mlir | 12 +--
.../Dialect/Vector/one-shot-bufferize.mlir | 8 +-
mlir/test/Dialect/Vector/ops.mlir | 98 ++++++++---------
.../scalar-vector-transfer-to-memref.mlir | 12 +--
.../value-bounds-op-interface-impl.mlir | 2 +-
.../Vector/vector-emulate-narrow-type.mlir | 4 +-
...tor-transfer-collapse-inner-most-dims.mlir | 14 +--
...ctor-transfer-drop-unit-dims-patterns.mlir | 14 +--
.../Vector/vector-transfer-flatten.mlir | 24 ++---
...fer-full-partial-split-copy-transform.mlir | 8 +-
.../vector-transfer-full-partial-split.mlir | 22 ++--
.../vector-transfer-permutation-lowering.mlir | 4 +-
.../vector-transfer-to-vector-load-store.mlir | 42 ++++----
.../Vector/vector-transfer-unroll.mlir | 24 ++---
.../Dialect/Vector/vector-transforms.mlir | 38 +++----
.../Vector/vector-warp-distribute.mlir | 66 ++++++------
.../SparseTensor/CPU/dual_sparse_conv_2d.mlir | 2 +-
.../CPU/padded_sparse_conv_2d.mlir | 4 +-
.../SparseTensor/CPU/sparse_block_matmul.mlir | 2 +-
.../Dialect/SparseTensor/CPU/sparse_cast.mlir | 20 ++--
.../Dialect/SparseTensor/CPU/sparse_cmp.mlir | 2 +-
.../CPU/sparse_collapse_shape.mlir | 12 +--
.../CPU/sparse_conv_1d_nwc_wcf.mlir | 2 +-
.../SparseTensor/CPU/sparse_conv_2d.mlir | 4 +-
.../SparseTensor/CPU/sparse_conv_2d_55.mlir | 12 +--
.../CPU/sparse_conv_2d_nchw_fchw.mlir | 8 +-
.../CPU/sparse_conv_2d_nhwc_hwcf.mlir | 2 +-
.../SparseTensor/CPU/sparse_conv_3d.mlir | 2 +-
.../CPU/sparse_conv_3d_ndhwc_dhwcf.mlir | 2 +-
.../CPU/sparse_conversion_element.mlir | 2 +-
.../CPU/sparse_conversion_sparse2dense.mlir | 2 +-
.../CPU/sparse_conversion_sparse2sparse.mlir | 2 +-
.../SparseTensor/CPU/sparse_coo_test.mlir | 8 +-
.../CPU/sparse_dilated_conv_2d_nhwc_hwcf.mlir | 8 +-
.../SparseTensor/CPU/sparse_expand_shape.mlir | 12 +--
.../CPU/sparse_filter_conv2d.mlir | 2 +-
.../SparseTensor/CPU/sparse_index_dense.mlir | 16 +--
.../SparseTensor/CPU/sparse_matvec.mlir | 2 +-
.../Dialect/SparseTensor/CPU/sparse_pack.mlir | 10 +-
.../SparseTensor/CPU/sparse_permute.mlir | 2 +-
.../SparseTensor/CPU/sparse_pooling_nhwc.mlir | 2 +-
.../CPU/sparse_quantized_matmul.mlir | 2 +-
.../CPU/sparse_rewrite_push_back.mlir | 2 +-
.../CPU/sparse_rewrite_sort_coo.mlir | 30 +++---
.../CPU/sparse_sampled_matmul.mlir | 2 +-
.../CPU/sparse_sampled_mm_fusion.mlir | 4 +-
.../Dialect/SparseTensor/CPU/sparse_spmm.mlir | 2 +-
.../CPU/sparse_strided_conv_2d_nhwc_hwcf.mlir | 8 +-
.../SparseTensor/CPU/sparse_unary.mlir | 2 +-
.../Standard/CPU/test-ceil-floor-pos-neg.mlir | 2 +-
.../Dialect/Vector/CPU/realloc.mlir | 6 +-
.../Dialect/Vector/CPU/transfer-read-1d.mlir | 10 +-
.../Dialect/Vector/CPU/transfer-read-2d.mlir | 18 ++--
.../Dialect/Vector/CPU/transfer-read-3d.mlir | 13 ++-
.../Dialect/Vector/CPU/transfer-read.mlir | 6 +-
.../Dialect/Vector/CPU/transfer-to-loops.mlir | 14 +--
.../Dialect/Vector/CPU/transfer-write.mlir | 7 +-
.../loop-invariant-subset-hoisting.mlir | 78 +++++++-------
95 files changed, 654 insertions(+), 643 deletions(-)
diff --git a/mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir b/mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir
index 1abadcc345cd2..d73efd41cce05 100644
--- a/mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir
+++ b/mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir
@@ -73,6 +73,6 @@ func.func @genbool_var_1d_scalable(%arg0: index) -> vector<[11]xi1> {
func.func @transfer_read_1d(%A : memref<?xf32>, %i: index) -> vector<16xf32> {
%d = arith.constant -1.0: f32
- %f = vector.transfer_read %A[%i], %d {permutation_map = affine_map<(d0) -> (d0)>} : memref<?xf32>, vector<16xf32>
+ %f = vector.transfer_read %A[%i], %d {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} : memref<?xf32>, vector<16xf32>
return %f : vector<16xf32>
}
diff --git a/mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir b/mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
index 09b79708a9ab2..36a3c6eeb175f 100644
--- a/mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
+++ b/mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
@@ -1689,10 +1689,10 @@ func.func @matrix_ops_index(%A: vector<64xindex>, %B: vector<48xindex>) -> vecto
func.func @transfer_read_1d(%A : memref<?xf32>, %base: index) -> vector<17xf32> {
%f7 = arith.constant 7.0: f32
%f = vector.transfer_read %A[%base], %f7
- {permutation_map = affine_map<(d0) -> (d0)>} :
+ {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
memref<?xf32>, vector<17xf32>
vector.transfer_write %f, %A[%base]
- {permutation_map = affine_map<(d0) -> (d0)>} :
+ {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
vector<17xf32>, memref<?xf32>
return %f: vector<17xf32>
}
@@ -1763,10 +1763,10 @@ func.func @transfer_read_1d(%A : memref<?xf32>, %base: index) -> vector<17xf32>
func.func @transfer_read_index_1d(%A : memref<?xindex>, %base: index) -> vector<17xindex> {
%f7 = arith.constant 7: index
%f = vector.transfer_read %A[%base], %f7
- {permutation_map = affine_map<(d0) -> (d0)>} :
+ {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
memref<?xindex>, vector<17xindex>
vector.transfer_write %f, %A[%base]
- {permutation_map = affine_map<(d0) -> (d0)>} :
+ {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
vector<17xindex>, memref<?xindex>
return %f: vector<17xindex>
}
@@ -1786,7 +1786,7 @@ func.func @transfer_read_index_1d(%A : memref<?xindex>, %base: index) -> vector<
func.func @transfer_read_2d_to_1d(%A : memref<?x?xf32>, %base0: index, %base1: index) -> vector<17xf32> {
%f7 = arith.constant 7.0: f32
%f = vector.transfer_read %A[%base0, %base1], %f7
- {permutation_map = affine_map<(d0, d1) -> (d1)>} :
+ {in_bounds = [false], permutation_map = affine_map<(d0, d1) -> (d1)>} :
memref<?x?xf32>, vector<17xf32>
return %f: vector<17xf32>
}
@@ -1815,10 +1815,10 @@ func.func @transfer_read_2d_to_1d(%A : memref<?x?xf32>, %base0: index, %base1: i
func.func @transfer_read_1d_non_zero_addrspace(%A : memref<?xf32, 3>, %base: index) -> vector<17xf32> {
%f7 = arith.constant 7.0: f32
%f = vector.transfer_read %A[%base], %f7
- {permutation_map = affine_map<(d0) -> (d0)>} :
+ {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
memref<?xf32, 3>, vector<17xf32>
vector.transfer_write %f, %A[%base]
- {permutation_map = affine_map<(d0) -> (d0)>} :
+ {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
vector<17xf32>, memref<?xf32, 3>
return %f: vector<17xf32>
}
@@ -1866,7 +1866,7 @@ func.func @transfer_read_1d_inbounds(%A : memref<?xf32>, %base: index) -> vector
func.func @transfer_read_1d_mask(%A : memref<?xf32>, %base : index) -> vector<5xf32> {
%m = arith.constant dense<[0, 0, 1, 0, 1]> : vector<5xi1>
%f7 = arith.constant 7.0: f32
- %f = vector.transfer_read %A[%base], %f7, %m : memref<?xf32>, vector<5xf32>
+ %f = vector.transfer_read %A[%base], %f7, %m {in_bounds=[false]} : memref<?xf32>, vector<5xf32>
return %f: vector<5xf32>
}
diff --git a/mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir b/mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir
index 7d97829c06599..598530eecec64 100644
--- a/mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir
+++ b/mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir
@@ -51,7 +51,7 @@ func.func @transfer_read_out_of_bounds(%A : memref<?x?x?xf32>) -> (vector<2x3x4x
// CHECK: vector.transfer_read {{.*}} : memref<?x?x?xf32>, vector<4xf32>
// CHECK: vector.insert {{.*}} [1, 2] : vector<4xf32> into vector<2x3x4xf32>
// CHECK-NOT: scf.for
- %vec = vector.transfer_read %A[%c0, %c0, %c0], %f0 : memref<?x?x?xf32>, vector<2x3x4xf32>
+ %vec = vector.transfer_read %A[%c0, %c0, %c0], %f0 {in_bounds=[false, false, false]} : memref<?x?x?xf32>, vector<2x3x4xf32>
return %vec : vector<2x3x4xf32>
}
diff --git a/mlir/test/Conversion/VectorToSCF/vector-to-scf-mask-and-permutation-map.mlir b/mlir/test/Conversion/VectorToSCF/vector-to-scf-mask-and-permutation-map.mlir
index 812c8d95f371c..43137c925c9fc 100644
--- a/mlir/test/Conversion/VectorToSCF/vector-to-scf-mask-and-permutation-map.mlir
+++ b/mlir/test/Conversion/VectorToSCF/vector-to-scf-mask-and-permutation-map.mlir
@@ -12,7 +12,7 @@
// CHECK: scf.for {{.*}} {
// CHECK: scf.if {{.*}} {
// CHECK: %[[MASK_LOADED:.*]] = memref.load %[[MASK_CASTED]][%{{.*}}] : memref<4xvector<9xi1>>
-// CHECK: %[[READ:.*]] = vector.transfer_read %{{.*}}, %{{.*}}, %[[MASK_LOADED]] : memref<?x?xf32>, vector<9xf32>
+// CHECK: %[[READ:.*]] = vector.transfer_read %{{.*}}, %{{.*}}, %[[MASK_LOADED]] {in_bounds = [false]} : memref<?x?xf32>, vector<9xf32>
// CHECK: memref.store %[[READ]], %{{.*}} : memref<4xvector<9xf32>>
// CHECK: }
// CHECK: }
@@ -29,7 +29,7 @@ func.func @transfer_read_2d_mask_transposed(
[1, 1, 1, 1, 1, 1, 1, 0, 1],
[0, 0, 1, 0, 1, 1, 1, 0, 1]]> : vector<4x9xi1>
%f = vector.transfer_read %A[%base1, %base2], %fm42, %mask
- {permutation_map = affine_map<(d0, d1) -> (d1, d0)>} :
+ {permutation_map = affine_map<(d0, d1) -> (d1, d0)>, in_bounds = [false, false]} :
memref<?x?xf32>, vector<9x4xf32>
return %f : vector<9x4xf32>
}
diff --git a/mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir b/mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir
index e1babdd2f1f63..4e884869b88f0 100644
--- a/mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir
+++ b/mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir
@@ -9,11 +9,11 @@ func.func @vector_transfer_ops_0d(%M: memref<f32>) {
// 0-d transfers are left untouched by vector-to-scf.
// They are independently lowered to the proper memref.load/store.
// CHECK: vector.transfer_read {{.*}}: memref<f32>, vector<f32>
- %0 = vector.transfer_read %M[], %f0 {permutation_map = affine_map<()->()>} :
+ %0 = vector.transfer_read %M[], %f0 {permutation_map = affine_map<()->()>, in_bounds = []} :
memref<f32>, vector<f32>
// CHECK: vector.transfer_write {{.*}}: vector<f32>, memref<f32>
- vector.transfer_write %0, %M[] {permutation_map = affine_map<()->()>} :
+ vector.transfer_write %0, %M[] {permutation_map = affine_map<()->()>, in_bounds = []} :
vector<f32>, memref<f32>
return
@@ -27,13 +27,13 @@ func.func @materialize_read_1d() {
%A = memref.alloc () : memref<7x42xf32>
affine.for %i0 = 0 to 7 step 4 {
affine.for %i1 = 0 to 42 step 4 {
- %f1 = vector.transfer_read %A[%i0, %i1], %f0 {permutation_map = affine_map<(d0, d1) -> (d0)>} : memref<7x42xf32>, vector<4xf32>
+ %f1 = vector.transfer_read %A[%i0, %i1], %f0 {permutation_map = affine_map<(d0, d1) -> (d0)>, in_bounds = [false]} : memref<7x42xf32>, vector<4xf32>
%ip1 = affine.apply affine_map<(d0) -> (d0 + 1)> (%i1)
- %f2 = vector.transfer_read %A[%i0, %ip1], %f0 {permutation_map = affine_map<(d0, d1) -> (d0)>} : memref<7x42xf32>, vector<4xf32>
+ %f2 = vector.transfer_read %A[%i0, %ip1], %f0 {permutation_map = affine_map<(d0, d1) -> (d0)>, in_bounds = [false]} : memref<7x42xf32>, vector<4xf32>
%ip2 = affine.apply affine_map<(d0) -> (d0 + 2)> (%i1)
- %f3 = vector.transfer_read %A[%i0, %ip2], %f0 {permutation_map = affine_map<(d0, d1) -> (d0)>} : memref<7x42xf32>, vector<4xf32>
+ %f3 = vector.transfer_read %A[%i0, %ip2], %f0 {permutation_map = affine_map<(d0, d1) -> (d0)>, in_bounds = [false]} : memref<7x42xf32>, vector<4xf32>
%ip3 = affine.apply affine_map<(d0) -> (d0 + 3)> (%i1)
- %f4 = vector.transfer_read %A[%i0, %ip3], %f0 {permutation_map = affine_map<(d0, d1) -> (d0)>} : memref<7x42xf32>, vector<4xf32>
+ %f4 = vector.transfer_read %A[%i0, %ip3], %f0 {permutation_map = affine_map<(d0, d1) -> (d0)>, in_bounds = [false]} : memref<7x42xf32>, vector<4xf32>
// Both accesses in the load must be clipped otherwise %i1 + 2 and %i1 + 3 will go out of bounds.
// CHECK: scf.if
// CHECK-NEXT: memref.load
@@ -60,9 +60,9 @@ func.func @materialize_read_1d_partially_specialized(%dyn1 : index, %dyn2 : inde
affine.for %i2 = 0 to %dyn2 {
affine.for %i3 = 0 to 42 step 2 {
affine.for %i4 = 0 to %dyn4 {
- %f1 = vector.transfer_read %A[%i0, %i1, %i2, %i3, %i4], %f0 {permutation_map = affine_map<(d0, d1, d2, d3, d4) -> (d3)>} : memref<7x?x?x42x?xf32>, vector<4xf32>
+ %f1 = vector.transfer_read %A[%i0, %i1, %i2, %i3, %i4], %f0 {permutation_map = affine_map<(d0, d1, d2, d3, d4) -> (d3)>, in_bounds = [false]} : memref<7x?x?x42x?xf32>, vector<4xf32>
%i3p1 = affine.apply affine_map<(d0) -> (d0 + 1)> (%i3)
- %f2 = vector.transfer_read %A[%i0, %i1, %i2, %i3p1, %i4], %f0 {permutation_map = affine_map<(d0, d1, d2, d3, d4) -> (d3)>} : memref<7x?x?x42x?xf32>, vector<4xf32>
+ %f2 = vector.transfer_read %A[%i0, %i1, %i2, %i3p1, %i4], %f0 {permutation_map = affine_map<(d0, d1, d2, d3, d4) -> (d3)>, in_bounds = [false]} : memref<7x?x?x42x?xf32>, vector<4xf32>
// Add a dummy use to prevent dead code elimination from removing
// transfer read ops.
"dummy_use"(%f1, %f2) : (vector<4xf32>, vector<4xf32>) -> ()
@@ -133,7 +133,7 @@ func.func @materialize_read(%M: index, %N: index, %O: index, %P: index) {
affine.for %i1 = 0 to %N {
affine.for %i2 = 0 to %O {
affine.for %i3 = 0 to %P step 5 {
- %f = vector.transfer_read %A[%i0, %i1, %i2, %i3], %f0 {permutation_map = affine_map<(d0, d1, d2, d3) -> (d3, 0, d0)>} : memref<?x?x?x?xf32>, vector<5x4x3xf32>
+ %f = vector.transfer_read %A[%i0, %i1, %i2, %i3], %f0 {permutation_map = affine_map<(d0, d1, d2, d3) -> (d3, 0, d0)>, in_bounds = [false, true, false]} : memref<?x?x?x?xf32>, vector<5x4x3xf32>
// Add a dummy use to prevent dead code elimination from removing
// transfer read ops.
"dummy_use"(%f) : (vector<5x4x3xf32>) -> ()
@@ -174,7 +174,7 @@ func.func @materialize_write(%M: index, %N: index, %O: index, %P: index) {
// CHECK: scf.for %[[I6:.*]] = %[[C0]] to %[[C1]] step %[[C1]] {
// CHECK: %[[S0:.*]] = affine.apply #[[$ADD]](%[[I2]], %[[I6]])
// CHECK: %[[VEC:.*]] = memref.load %[[VECTOR_VIEW3]][%[[I4]], %[[I5]], %[[I6]]] : memref<3x4x1xvector<5xf32>>
- // CHECK: vector.transfer_write %[[VEC]], %{{.*}}[%[[S3]], %[[S1]], %[[S0]], %[[I3]]] : vector<5xf32>, memref<?x?x?x?xf32>
+ // CHECK: vector.transfer_write %[[VEC]], %{{.*}}[%[[S3]], %[[S1]], %[[S0]], %[[I3]]] {in_bounds = [false]} : vector<5xf32>, memref<?x?x?x?xf32>
// CHECK: }
// CHECK: }
// CHECK: }
@@ -196,7 +196,7 @@ func.func @materialize_write(%M: index, %N: index, %O: index, %P: index) {
affine.for %i1 = 0 to %N step 4 {
affine.for %i2 = 0 to %O {
affine.for %i3 = 0 to %P step 5 {
- vector.transfer_write %f1, %A[%i0, %i1, %i2, %i3] {permutation_map = affine_map<(d0, d1, d2, d3) -> (d3, d1, d0)>} : vector<5x4x3xf32>, memref<?x?x?x?xf32>
+ vector.transfer_write %f1, %A[%i0, %i1, %i2, %i3] {permutation_map = affine_map<(d0, d1, d2, d3) -> (d3, d1, d0)>, in_bounds = [false, false, false]} : vector<5x4x3xf32>, memref<?x?x?x?xf32>
}
}
}
@@ -234,7 +234,7 @@ func.func @transfer_read_progressive(%A : memref<?x?xf32>, %base: index) -> vect
// CHECK: %[[add:.*]] = affine.apply #[[$MAP0]](%[[I]])[%[[base]]]
// CHECK: %[[cond1:.*]] = arith.cmpi sgt, %[[dim]], %[[add]] : index
// CHECK: scf.if %[[cond1]] {
- // CHECK: %[[vec_1d:.*]] = vector.transfer_read %[[A]][%{{.*}}, %[[base]]], %[[C7]] : memref<?x?xf32>, vector<15xf32>
+ // CHECK: %[[vec_1d:.*]] = vector.transfer_read %[[A]][%{{.*}}, %[[base]]], %[[C7]] {{.*}} : memref<?x?xf32>, vector<15xf32>
// CHECK: memref.store %[[vec_1d]], %[[alloc_casted]][%[[I]]] : memref<3xvector<15xf32>>
// CHECK: } else {
// CHECK: store %[[splat]], %[[alloc_casted]][%[[I]]] : memref<3xvector<15xf32>>
@@ -248,7 +248,7 @@ func.func @transfer_read_progressive(%A : memref<?x?xf32>, %base: index) -> vect
// FULL-UNROLL: %[[DIM:.*]] = memref.dim %[[A]], %[[C0]] : memref<?x?xf32>
// FULL-UNROLL: cmpi sgt, %[[DIM]], %[[base]] : index
// FULL-UNROLL: %[[VEC1:.*]] = scf.if %{{.*}} -> (vector<3x15xf32>) {
- // FULL-UNROLL: vector.transfer_read %[[A]][%[[base]], %[[base]]], %[[C7]] : memref<?x?xf32>, vector<15xf32>
+ // FULL-UNROLL: vector.transfer_read %[[A]][%[[base]], %[[base]]], %[[C7]] {{.*}} : memref<?x?xf32>, vector<15xf32>
// FULL-UNROLL: vector.insert %{{.*}}, %[[VEC0]] [0] : vector<15xf32> into vector<3x15xf32>
// FULL-UNROLL: scf.yield %{{.*}} : vector<3x15xf32>
// FULL-UNROLL: } else {
@@ -257,7 +257,7 @@ func.func @transfer_read_progressive(%A : memref<?x?xf32>, %base: index) -> vect
// FULL-UNROLL: affine.apply #[[$MAP1]]()[%[[base]]]
// FULL-UNROLL: cmpi sgt, %{{.*}}, %{{.*}} : index
// FULL-UNROLL: %[[VEC2:.*]] = scf.if %{{.*}} -> (vector<3x15xf32>) {
- // FULL-UNROLL: vector.transfer_read %[[A]][%{{.*}}, %[[base]]], %[[C7]] : memref<?x?xf32>, vector<15xf32>
+ // FULL-UNROLL: vector.transfer_read %[[A]][%{{.*}}, %[[base]]], %[[C7]] {{.*}} : memref<?x?xf32>, vector<15xf32>
// FULL-UNROLL: vector.insert %{{.*}}, %[[VEC1]] [1] : vector<15xf32> into vector<3x15xf32>
// FULL-UNROLL: scf.yield %{{.*}} : vector<3x15xf32>
// FULL-UNROLL: } else {
@@ -266,14 +266,14 @@ func.func @transfer_read_progressive(%A : memref<?x?xf32>, %base: index) -> vect
// FULL-UNROLL: affine.apply #[[$MAP2]]()[%[[base]]]
// FULL-UNROLL: cmpi sgt, %{{.*}}, %{{.*}} : index
// FULL-UNROLL: %[[VEC3:.*]] = scf.if %{{.*}} -> (vector<3x15xf32>) {
- // FULL-UNROLL: vector.transfer_read %[[A]][%{{.*}}, %[[base]]], %[[C7]] : memref<?x?xf32>, vector<15xf32>
+ // FULL-UNROLL: vector.transfer_read %[[A]][%{{.*}}, %[[base]]], %[[C7]] {{.*}} : memref<?x?xf32>, vector<15xf32>
// FULL-UNROLL: vector.insert %{{.*}}, %[[VEC2]] [2] : vector<15xf32> into vector<3x15xf32>
// FULL-UNROLL: scf.yield %{{.*}} : vector<3x15xf32>
// FULL-UNROLL: } else {
// FULL-UNROLL: scf.yield %{{.*}} : vector<3x15xf32>
// FULL-UNROLL: }
- %f = vector.transfer_read %A[%base, %base], %f7 :
+ %f = vector.transfer_read %A[%base, %base], %f7 {in_bounds = [false, false]} :
memref<?x?xf32>, vector<3x15xf32>
return %f: vector<3x15xf32>
@@ -307,7 +307,7 @@ func.func @transfer_write_progressive(%A : memref<?x?xf32>, %base: index, %vec:
// CHECK: %[[cmp:.*]] = arith.cmpi sgt, %[[dim]], %[[add]] : index
// CHECK: scf.if %[[cmp]] {
// CHECK: %[[vec_1d:.*]] = memref.load %[[vmemref]][%[[I]]] : memref<3xvector<15xf32>>
- // CHECK: vector.transfer_write %[[vec_1d]], %[[A]][{{.*}}, %[[base]]] : vector<15xf32>, memref<?x?xf32>
+ // CHECK: vector.transfer_write %[[vec_1d]], %[[A]][{{.*}}, %[[base]]] {{.*}} : vector<15xf32>, memref<?x?xf32>
// CHECK: }
// CHECK: }
@@ -316,22 +316,22 @@ func.func @transfer_write_progressive(%A : memref<?x?xf32>, %base: index, %vec:
// FULL-UNROLL: %[[CMP0:.*]] = arith.cmpi sgt, %[[DIM]], %[[base]] : index
// FULL-UNROLL: scf.if %[[CMP0]] {
// FULL-UNROLL: %[[V0:.*]] = vector.extract %[[vec]][0] : vector<15xf32> from vector<3x15xf32>
- // FULL-UNROLL: vector.transfer_write %[[V0]], %[[A]][%[[base]], %[[base]]] : vector<15xf32>, memref<?x?xf32>
+ // FULL-UNROLL: vector.transfer_write %[[V0]], %[[A]][%[[base]], %[[base]]] {{.*}} : vector<15xf32>, memref<?x?xf32>
// FULL-UNROLL: }
// FULL-UNROLL: %[[I1:.*]] = affine.apply #[[$MAP1]]()[%[[base]]]
// FULL-UNROLL: %[[CMP1:.*]] = arith.cmpi sgt, %{{.*}}, %[[I1]] : index
// FULL-UNROLL: scf.if %[[CMP1]] {
// FULL-UNROLL: %[[V1:.*]] = vector.extract %[[vec]][1] : vector<15xf32> from vector<3x15xf32>
- // FULL-UNROLL: vector.transfer_write %[[V1]], %[[A]][%{{.*}}, %[[base]]] : vector<15xf32>, memref<?x?xf32>
+ // FULL-UNROLL: vector.transfer_write %[[V1]], %[[A]][%{{.*}}, %[[base]]] {{.*}} : vector<15xf32>, memref<?x?xf32>
// FULL-UNROLL: }
// FULL-UNROLL: %[[I2:.*]] = affine.apply #[[$MAP2]]()[%[[base]]]
// FULL-UNROLL: %[[CMP2:.*]] = arith.cmpi sgt, %{{.*}}, %[[I2]] : index
// FULL-UNROLL: scf.if %[[CMP2]] {
// FULL-UNROLL: %[[V2:.*]] = vector.extract %[[vec]][2] : vector<15xf32> from vector<3x15xf32>
- // FULL-UNROLL: vector.transfer_write %[[V2]], %[[A]][%{{.*}}, %[[base]]] : vector<15xf32>, memref<?x?xf32>
+ // FULL-UNROLL: vector.transfer_write %[[V2]], %[[A]][%{{.*}}, %[[base]]] {{.*}} : vector<15xf32>, memref<?x?xf32>
// FULL-UNROLL: }
- vector.transfer_write %vec, %A[%base, %base] :
+ vector.transfer_write %vec, %A[%base, %base] {in_bounds = [false, false]} :
vector<3x15xf32>, memref<?x?xf32>
return
}
@@ -389,7 +389,7 @@ func.func @transfer_read_simple(%A : memref<2x2xf32>) -> vector<2x2xf32> {
// FULL-UNROLL: %[[RES0:.*]] = vector.insert %[[V0]], %[[VC0]] [0] : vector<2xf32> into vector<2x2xf32>
// FULL-UNROLL: %[[V1:.*]] = vector.transfer_read %{{.*}}[%[[C1]], %[[C0]]]
// FULL-UNROLL: %[[RES1:.*]] = vector.insert %[[V1]], %[[RES0]] [1] : vector<2xf32> into vector<2x2xf32>
- %0 = vector.transfer_read %A[%c0, %c0], %f0 : memref<2x2xf32>, vector<2x2xf32>
+ %0 = vector.transfer_read %A[%c0, %c0], %f0 {in_bounds = [false, false]} : memref<2x2xf32>, vector<2x2xf32>
return %0 : vector<2x2xf32>
}
@@ -397,7 +397,7 @@ func.func @transfer_read_minor_identity(%A : memref<?x?x?x?xf32>) -> vector<3x3x
%c0 = arith.constant 0 : index
%f0 = arith.constant 0.0 : f32
%0 = vector.transfer_read %A[%c0, %c0, %c0, %c0], %f0
- { permutation_map = affine_map<(d0, d1, d2, d3) -> (d2, d3)> }
+ { permutation_map = affine_map<(d0, d1, d2, d3) -> (d2, d3)>, in_bounds = [false, false]}
: memref<?x?x?x?xf32>, vector<3x3xf32>
return %0 : vector<3x3xf32>
}
@@ -416,7 +416,7 @@ func.func @transfer_read_minor_identity(%A : memref<?x?x?x?xf32>) -> vector<3x3x
// CHECK: %[[d:.*]] = memref.dim %[[A]], %[[c2]] : memref<?x?x?x?xf32>
// CHECK: %[[cmp:.*]] = arith.cmpi sgt, %[[d]], %[[arg1]] : index
// CHECK: scf.if %[[cmp]] {
-// CHECK: %[[tr:.*]] = vector.transfer_read %[[A]][%c0, %c0, %[[arg1]], %c0], %[[f0]] : memref<?x?x?x?xf32>, vector<3xf32>
+// CHECK: %[[tr:.*]] = vector.transfer_read %[[A]][%c0, %c0, %[[arg1]], %c0], %[[f0]] {{.*}} : memref<?x?x?x?xf32>, vector<3xf32>
// CHECK: memref.store %[[tr]], %[[cast]][%[[arg1]]] : memref<3xvector<3xf32>>
// CHECK: } else {
// CHECK: memref.store %[[cst0]], %[[cast]][%[[arg1]]] : memref<3xvector<3xf32>>
@@ -429,7 +429,7 @@ func.func @transfer_write_minor_identity(%A : vector<3x3xf32>, %B : memref<?x?x?
%c0 = arith.constant 0 : index
%f0 = arith.constant 0.0 : f32
vector.transfer_write %A, %B[%c0, %c0, %c0, %c0]
- { permutation_map = affine_map<(d0, d1, d2, d3) -> (d2, d3)> }
+ { permutation_map = affine_map<(d0, d1, d2, d3) -> (d2, d3)>, in_bounds = [false, false]}
: vector<3x3xf32>, memref<?x?x?x?xf32>
return
}
@@ -449,7 +449,7 @@ func.func @transfer_write_minor_identity(%A : vector<3x3xf32>, %B : memref<?x?x?
// CHECK: %[[cmp:.*]] = arith.cmpi sgt, %[[d]], %[[arg2]] : index
// CHECK: scf.if %[[cmp]] {
// CHECK: %[[tmp:.*]] = memref.load %[[cast]][%[[arg2]]] : memref<3xvector<3xf32>>
-// CHECK: vector.transfer_write %[[tmp]], %[[B]][%[[c0]], %[[c0]], %[[arg2]], %[[c0]]] : vector<3xf32>, memref<?x?x?x?xf32>
+// CHECK: vector.transfer_write %[[tmp]], %[[B]][%[[c0]], %[[c0]], %[[arg2]], %[[c0]]] {{.*}} : vector<3xf32>, memref<?x?x?x?xf32>
// CHECK: }
// CHECK: }
// CHECK: return
@@ -460,7 +460,7 @@ func.func @transfer_write_minor_identity(%A : vector<3x3xf32>, %B : memref<?x?x?
func.func @transfer_read_strided(%A : memref<8x4xf32, affine_map<(d0, d1) -> (d0 + d1 * 8)>>) -> vector<4xf32> {
%c0 = arith.constant 0 : index
%f0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %A[%c0, %c0], %f0
+ %0 = vector.transfer_read %A[%c0, %c0], %f0 {in_bounds = [false]}
: memref<8x4xf32, affine_map<(d0, d1) -> (d0 + d1 * 8)>>, vector<4xf32>
return %0 : vector<4xf32>
}
@@ -471,8 +471,8 @@ func.func @transfer_read_strided(%A : memref<8x4xf32, affine_map<(d0, d1) -> (d0
func.func @transfer_write_strided(%A : vector<4xf32>, %B : memref<8x4xf32, affine_map<(d0, d1) -> (d0 + d1 * 8)>>) {
%c0 = arith.constant 0 : index
- vector.transfer_write %A, %B[%c0, %c0] :
- vector<4xf32>, memref<8x4xf32, affine_map<(d0, d1) -> (d0 + d1 * 8)>>
+ vector.transfer_write %A, %B[%c0, %c0] {in_bounds = [false]}
+ : vector<4xf32>, memref<8x4xf32, affine_map<(d0, d1) -> (d0 + d1 * 8)>>
return
}
@@ -492,7 +492,7 @@ func.func @transfer_read_within_async_execute(%A : memref<2x2xf32>) -> !async.to
// CHECK: async.execute
// CHECK: alloca
%token = async.execute {
- %0 = vector.transfer_read %A[%c0, %c0], %f0 : memref<2x2xf32>, vector<2x2xf32>
+ %0 = vector.transfer_read %A[%c0, %c0], %f0 {in_bounds = [false, false]}: memref<2x2xf32>, vector<2x2xf32>
func.call @fake_side_effecting_fun(%0) : (vector<2x2xf32>) -> ()
async.yield
}
@@ -507,7 +507,7 @@ func.func @transfer_read_with_tensor(%arg: tensor<f32>) -> vector<1xf32> {
// CHECK-NEXT: %[[RESULT:.*]] = vector.broadcast %[[EXTRACTED]] : f32 to vector<1xf32>
// CHECK-NEXT: return %[[RESULT]] : vector<1xf32>
%f0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg[], %f0 {permutation_map = affine_map<()->(0)>} :
+ %0 = vector.transfer_read %arg[], %f0 {permutation_map = affine_map<()->(0)>, in_bounds = [true]} :
tensor<f32>, vector<1xf32>
return %0: vector<1xf32>
}
@@ -746,7 +746,7 @@ func.func @cannot_lower_transfer_read_with_leading_scalable(%arg0: memref<?x4xf3
func.func @does_not_crash_on_unpack_one_dim(%subview: memref<1x1x1x1xi32>, %mask: vector<1x1xi1>) -> vector<1x1x1x1xi32> {
%c0 = arith.constant 0 : index
%c0_i32 = arith.constant 0 : i32
- %3 = vector.transfer_read %subview[%c0, %c0, %c0, %c0], %c0_i32, %mask {permutation_map = #map1}
+ %3 = vector.transfer_read %subview[%c0, %c0, %c0, %c0], %c0_i32, %mask {permutation_map = #map1, in_bounds = [false, true, true, false]}
: memref<1x1x1x1xi32>, vector<1x1x1x1xi32>
return %3 : vector<1x1x1x1xi32>
}
@@ -793,7 +793,7 @@ func.func @cannot_fully_unroll_transfer_write_of_nd_scalable_vector(%vec: vector
func.func @unroll_transfer_write_target_rank_zero(%vec : vector<2xi32>) {
%alloc = memref.alloc() : memref<4xi32>
%c0 = arith.constant 0 : index
- vector.transfer_write %vec, %alloc[%c0] : vector<2xi32>, memref<4xi32>
+ vector.transfer_write %vec, %alloc[%c0] {in_bounds = [false]} : vector<2xi32>, memref<4xi32>
return
}
// TARGET-RANK-ZERO: %[[ALLOC:.*]] = memref.alloc() : memref<4xi32>
diff --git a/mlir/test/Dialect/Affine/SuperVectorize/vector_utils.mlir b/mlir/test/Dialect/Affine/SuperVectorize/vector_utils.mlir
index bd71164244c00..53423595da3a6 100644
--- a/mlir/test/Dialect/Affine/SuperVectorize/vector_utils.mlir
+++ b/mlir/test/Dialect/Affine/SuperVectorize/vector_utils.mlir
@@ -56,7 +56,7 @@ func.func @double_loop_nest(%a: memref<20x30xf32>, %b: memref<20xf32>) {
// VECNEST: vector.transfer_read
// VECNEST-NEXT: affine.for %{{.*}} = 0 to 30 {
// VECNEST: vector.transfer_read
-// VECNEST-NEXT: vector.transfer_write %{{.*}}, %{{.*}}[%{{.*}}, %{{.*}}] {permutation_map = #{{.*}}}
+// VECNEST-NEXT: vector.transfer_write %{{.*}}, %{{.*}}[%{{.*}}, %{{.*}}] {{{.*}} permutation_map = #{{.*}}}
// VECNEST-NEXT: }
// VECNEST-NEXT: vector.transfer_write
// VECNEST: }
diff --git a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_1d.mlir b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_1d.mlir
index 9244604128cb7..f3d7185e4b914 100644
--- a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_1d.mlir
+++ b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_1d.mlir
@@ -22,7 +22,7 @@ func.func @vec1d_1(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK-NEXT: %{{.*}} = affine.apply #[[$map_id1]](%[[C0]])
// CHECK-NEXT: %{{.*}} = affine.apply #[[$map_id1]](%[[C0]])
// CHECK-NEXT: %{{.*}} = arith.constant 0.0{{.*}}: f32
-// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
+// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {{{.*}} permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i0 = 0 to %M { // vectorized due to scalar -> vector
%a0 = affine.load %A[%c0, %c0] : memref<?x?xf32>
}
@@ -48,7 +48,7 @@ func.func @vec1d_2(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK:for [[IV3:%[a-zA-Z0-9]+]] = 0 to [[ARG_M]] step 128
// CHECK-NEXT: %[[CST:.*]] = arith.constant 0.0{{.*}}: f32
-// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %[[CST]] : memref<?x?xf32>, vector<128xf32>
+// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %[[CST]] {{.*}} : memref<?x?xf32>, vector<128xf32>
affine.for %i3 = 0 to %M { // vectorized
%a3 = affine.load %A[%c0, %i3] : memref<?x?xf32>
}
@@ -77,7 +77,7 @@ func.func @vec1d_3(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK-NEXT: %[[APP9_0:[0-9a-zA-Z_]+]] = affine.apply {{.*}}([[IV9]], [[IV8]])
// CHECK-NEXT: %[[APP9_1:[0-9a-zA-Z_]+]] = affine.apply {{.*}}([[IV9]], [[IV8]])
// CHECK-NEXT: %[[CST:.*]] = arith.constant 0.0{{.*}}: f32
-// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%[[APP9_0]], %[[APP9_1]]], %[[CST]] : memref<?x?xf32>, vector<128xf32>
+// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%[[APP9_0]], %[[APP9_1]]], %[[CST]] {{.*}} : memref<?x?xf32>, vector<128xf32>
affine.for %i8 = 0 to %M { // vectorized
affine.for %i9 = 0 to %N {
%a9 = affine.load %A[%i9, %i8 + %i9] : memref<?x?xf32>
@@ -115,13 +115,13 @@ func.func @vector_add_2d(%M : index, %N : index) -> f32 {
affine.for %i5 = 0 to %N {
// CHECK: %[[SPLAT2:.*]] = arith.constant dense<2.000000e+00> : vector<128xf32>
// CHECK: %[[SPLAT1:.*]] = arith.constant dense<1.000000e+00> : vector<128xf32>
- // CHECK: %[[A5:.*]] = vector.transfer_read %{{.*}}[{{.*}}], %{{[a-zA-Z0-9_]*}} : memref<?x?xf32>, vector<128xf32>
- // CHECK: %[[B5:.*]] = vector.transfer_read %{{.*}}[{{.*}}], %{{[a-zA-Z0-9_]*}} : memref<?x?xf32>, vector<128xf32>
+ // CHECK: %[[A5:.*]] = vector.transfer_read %{{.*}}[{{.*}}], %{{[a-zA-Z0-9_]*}} {{.*}} : memref<?x?xf32>, vector<128xf32>
+ // CHECK: %[[B5:.*]] = vector.transfer_read %{{.*}}[{{.*}}], %{{[a-zA-Z0-9_]*}} {{.*}} : memref<?x?xf32>, vector<128xf32>
// CHECK: %[[S5:.*]] = arith.addf %[[A5]], %[[B5]] : vector<128xf32>
// CHECK: %[[S6:.*]] = arith.addf %[[S5]], %[[SPLAT1]] : vector<128xf32>
// CHECK: %[[S7:.*]] = arith.addf %[[S5]], %[[SPLAT2]] : vector<128xf32>
// CHECK: %[[S8:.*]] = arith.addf %[[S7]], %[[S6]] : vector<128xf32>
- // CHECK: vector.transfer_write %[[S8]], {{.*}} : vector<128xf32>, memref<?x?xf32>
+ // CHECK: vector.transfer_write %[[S8]], {{.*}} {{.*}} : vector<128xf32>, memref<?x?xf32>
%a5 = affine.load %A[%i4, %i5] : memref<?x?xf32, 0>
%b5 = affine.load %B[%i4, %i5] : memref<?x?xf32, 0>
%s5 = arith.addf %a5, %b5 : f32
@@ -171,7 +171,7 @@ func.func @vec_block_arg(%A : memref<32x512xi32>) {
// CHECK-NEXT: affine.for %[[IV1:[0-9a-zA-Z_]+]] = 0 to 32 {
// CHECK-NEXT: %[[BROADCAST:.*]] = vector.broadcast %[[IV1]] : index to vector<128xindex>
// CHECK-NEXT: %[[CAST:.*]] = arith.index_cast %[[BROADCAST]] : vector<128xindex> to vector<128xi32>
- // CHECK-NEXT: vector.transfer_write %[[CAST]], {{.*}}[%[[IV1]], %[[IV0]]] : vector<128xi32>, memref<32x512xi32>
+ // CHECK-NEXT: vector.transfer_write %[[CAST]], {{.*}}[%[[IV1]], %[[IV0]]] {{.*}} : vector<128xi32>, memref<32x512xi32>
affine.for %i = 0 to 512 { // vectorized
affine.for %j = 0 to 32 {
%idx = arith.index_cast %j : index to i32
@@ -281,7 +281,7 @@ func.func @vec_rejected_3(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK:for [[IV4:%[0-9a-zA-Z_]+]] = 0 to [[ARG_M]] step 128 {
// CHECK-NEXT: for [[IV5:%[0-9a-zA-Z_]*]] = 0 to [[ARG_N]] {
// CHECK-NEXT: %{{.*}} = arith.constant 0.0{{.*}}: f32
-// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{[a-zA-Z0-9_]*}} : memref<?x?xf32>, vector<128xf32>
+// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{[a-zA-Z0-9_]*}} {{.*}} : memref<?x?xf32>, vector<128xf32>
affine.for %i4 = 0 to %M { // vectorized
affine.for %i5 = 0 to %N { // not vectorized, would vectorize with --test-fastest-varying=1
%a5 = affine.load %A[%i5, %i4] : memref<?x?xf32>
@@ -425,7 +425,7 @@ func.func @vec_rejected_8(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK: %{{.*}} = affine.apply #[[$map_id1]](%{{.*}})
// CHECK: %{{.*}} = affine.apply #[[$map_id1]](%{{.*}})
// CHECK: %{{.*}} = arith.constant 0.0{{.*}}: f32
-// CHECK: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
+// CHECK: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {{{.*}} permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i17 = 0 to %M { // not vectorized, the 1-D pattern that matched %{{.*}} in DFS post-order prevents vectorizing %{{.*}}
affine.for %i18 = 0 to %M { // vectorized due to scalar -> vector
%a18 = affine.load %A[%c0, %c0] : memref<?x?xf32>
@@ -459,7 +459,7 @@ func.func @vec_rejected_9(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK: %{{.*}} = affine.apply #[[$map_id1]](%{{.*}})
// CHECK-NEXT: %{{.*}} = affine.apply #[[$map_id1]](%{{.*}})
// CHECK-NEXT: %{{.*}} = arith.constant 0.0{{.*}}: f32
-// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
+// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {{{.*}} permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i17 = 0 to %M { // not vectorized, the 1-D pattern that matched %i18 in DFS post-order prevents vectorizing %{{.*}}
affine.for %i18 = 0 to %M { // vectorized due to scalar -> vector
%a18 = affine.load %A[%c0, %c0] : memref<?x?xf32>
diff --git a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_2d.mlir b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_2d.mlir
index 83916e755363b..7c60d3058dfc8 100644
--- a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_2d.mlir
+++ b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_2d.mlir
@@ -113,7 +113,7 @@ func.func @vectorize_matmul(%arg0: memref<?x?xf32>, %arg1: memref<?x?xf32>, %arg
// VECT: {{.*}} #[[$map_id1]](%[[M]]) step 4 {
// VECT-NEXT: {{.*}} #[[$map_id1]](%[[N]]) step 8 {
// VECT: %[[VC0:.*]] = arith.constant dense<0.000000e+00> : vector<4x8xf32>
- // VECT-NEXT: vector.transfer_write %[[VC0]], %{{.*}}[%{{.*}}, %{{.*}}] : vector<4x8xf32>, memref<?x?xf32>
+ // VECT-NEXT: vector.transfer_write %[[VC0]], %{{.*}}[%{{.*}}, %{{.*}}] {{.*}} : vector<4x8xf32>, memref<?x?xf32>
affine.for %i0 = affine_map<(d0) -> (d0)>(%c0) to affine_map<(d0) -> (d0)>(%M) {
affine.for %i1 = affine_map<(d0) -> (d0)>(%c0) to affine_map<(d0) -> (d0)>(%N) {
%cst = arith.constant 0.000000e+00 : f32
@@ -123,13 +123,13 @@ func.func @vectorize_matmul(%arg0: memref<?x?xf32>, %arg1: memref<?x?xf32>, %arg
// VECT: affine.for %[[I2:.*]] = #[[$map_id1]](%[[C0]]) to #[[$map_id1]](%[[M]]) step 4 {
// VECT-NEXT: affine.for %[[I3:.*]] = #[[$map_id1]](%[[C0]]) to #[[$map_id1]](%[[N]]) step 8 {
// VECT-NEXT: affine.for %[[I4:.*]] = #[[$map_id1]](%[[C0]]) to #[[$map_id1]](%[[K]]) {
- // VECT: %[[A:.*]] = vector.transfer_read %{{.*}}[%[[I4]], %[[I3]]], %{{.*}} {permutation_map = #[[$map_proj_d0d1_zerod1]]} : memref<?x?xf32>, vector<4x8xf32>
- // VECT: %[[B:.*]] = vector.transfer_read %{{.*}}[%[[I2]], %[[I4]]], %{{.*}} {permutation_map = #[[$map_proj_d0d1_d0zero]]} : memref<?x?xf32>, vector<4x8xf32>
+ // VECT: %[[A:.*]] = vector.transfer_read %{{.*}}[%[[I4]], %[[I3]]], %{{.*}} {{{.*}} permutation_map = #[[$map_proj_d0d1_zerod1]]} : memref<?x?xf32>, vector<4x8xf32>
+ // VECT: %[[B:.*]] = vector.transfer_read %{{.*}}[%[[I2]], %[[I4]]], %{{.*}} {{{.*}} permutation_map = #[[$map_proj_d0d1_d0zero]]} : memref<?x?xf32>, vector<4x8xf32>
// VECT-NEXT: %[[C:.*]] = arith.mulf %[[B]], %[[A]] : vector<4x8xf32>
// VECT: %[[D:.*]] = vector.transfer_read %{{.*}}[%[[I2]], %[[I3]]], %{{.*}} : memref<?x?xf32>, vector<4x8xf32>
// VECT-NEXT: %[[E:.*]] = arith.addf %[[D]], %[[C]] : vector<4x8xf32>
- // VECT: vector.transfer_write %[[E]], %{{.*}}[%[[I2]], %[[I3]]] : vector<4x8xf32>, memref<?x?xf32>
- affine.for %i2 = affine_map<(d0) -> (d0)>(%c0) to affine_map<(d0) -> (d0)>(%M) {
+ // VECT: vector.transfer_write %[[E]], %{{.*}}[%[[I2]], %[[I3]]] {{.*}} : vector<4x8xf32>, memref<?x?xf32>
+ affine.for %i2 = affine_map<(d0) -> (d0)>(%c0) to affine_map<(d0) -> (d0)>(%M ) {
affine.for %i3 = affine_map<(d0) -> (d0)>(%c0) to affine_map<(d0) -> (d0)>(%N) {
affine.for %i4 = affine_map<(d0) -> (d0)>(%c0) to affine_map<(d0) -> (d0)>(%K) {
%6 = affine.load %arg1[%i4, %i3] : memref<?x?xf32>
diff --git a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_affine_apply.mlir b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_affine_apply.mlir
index 15a7133cf0f65..494e03c797d13 100644
--- a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_affine_apply.mlir
+++ b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_affine_apply.mlir
@@ -12,8 +12,8 @@ func.func @vec_affine_apply(%arg0: memref<8x12x16xf32>, %arg1: memref<8x24x48xf3
// CHECK-NEXT: %[[S0:.*]] = affine.apply #[[$MAP_ID0]](%[[ARG3]])
// CHECK-NEXT: %[[S1:.*]] = affine.apply #[[$MAP_ID1]](%[[ARG4]])
// CHECK-NEXT: %[[CST:.*]] = arith.constant 0.000000e+00 : f32
-// CHECK-NEXT: %[[S2:.*]] = vector.transfer_read %[[ARG0]][%[[ARG2]], %[[S0]], %[[S1]]], %[[CST]] : memref<8x12x16xf32>, vector<8xf32>
-// CHECK-NEXT: vector.transfer_write %[[S2]], %[[ARG1]][%[[ARG2]], %[[ARG3]], %[[ARG4]]] : vector<8xf32>, memref<8x24x48xf32>
+// CHECK-NEXT: %[[S2:.*]] = vector.transfer_read %[[ARG0]][%[[ARG2]], %[[S0]], %[[S1]]], %[[CST]] {{.*}} : memref<8x12x16xf32>, vector<8xf32>
+// CHECK-NEXT: vector.transfer_write %[[S2]], %[[ARG1]][%[[ARG2]], %[[ARG3]], %[[ARG4]]] {{.*}} : vector<8xf32>, memref<8x24x48xf32>
// CHECK-NEXT: }
// CHECK-NEXT: }
// CHECK-NEXT: }
@@ -43,8 +43,8 @@ func.func @vec_affine_apply_2(%arg0: memref<8x12x16xf32>, %arg1: memref<8x24x48x
// CHECK-NEXT: affine.for %[[ARG4:.*]] = 0 to 48 step 8 {
// CHECK-NEXT: %[[S0:.*]] = affine.apply #[[$MAP_ID2]](%[[ARG4]])
// CHECK-NEXT: %[[CST:.*]] = arith.constant 0.000000e+00 : f32
-// CHECK-NEXT: %[[S1:.*]] = vector.transfer_read %[[ARG0]][%[[ARG2]], %[[ARG3]], %[[S0]]], %[[CST]] : memref<8x12x16xf32>, vector<8xf32>
-// CHECK-NEXT: vector.transfer_write %[[S1]], %[[ARG1]][%[[ARG2]], %[[ARG3]], %[[ARG4]]] : vector<8xf32>, memref<8x24x48xf32>
+// CHECK-NEXT: %[[S1:.*]] = vector.transfer_read %[[ARG0]][%[[ARG2]], %[[ARG3]], %[[S0]]], %[[CST]] {{.*}} : memref<8x12x16xf32>, vector<8xf32>
+// CHECK-NEXT: vector.transfer_write %[[S1]], %[[ARG1]][%[[ARG2]], %[[ARG3]], %[[ARG4]]] {{.*}} : vector<8xf32>, memref<8x24x48xf32>
// CHECK-NEXT: }
// CHECK-NEXT: }
// CHECK-NEXT: }
@@ -141,8 +141,8 @@ func.func @affine_map_with_expr_2(%arg0: memref<8x12x16xf32>, %arg1: memref<8x24
// CHECK-NEXT: %[[S1:.*]] = affine.apply #[[$MAP_ID4]](%[[ARG3]], %[[ARG4]], %[[I0]])
// CHECK-NEXT: %[[S2:.*]] = affine.apply #[[$MAP_ID5]](%[[ARG3]], %[[ARG4]], %[[I0]])
// CHECK-NEXT: %[[CST:.*]] = arith.constant 0.000000e+00 : f32
-// CHECK-NEXT: %[[S3:.*]] = vector.transfer_read %[[ARG0]][%[[S0]], %[[S1]], %[[S2]]], %[[CST]] {permutation_map = #[[$MAP_ID6]]} : memref<8x12x16xf32>, vector<8xf32>
-// CHECK-NEXT: vector.transfer_write %[[S3]], %[[ARG1]][%[[ARG3]], %[[ARG4]], %[[ARG5]]] : vector<8xf32>, memref<8x24x48xf32>
+// CHECK-NEXT: %[[S3:.*]] = vector.transfer_read %[[ARG0]][%[[S0]], %[[S1]], %[[S2]]], %[[CST]] {{{.*}} permutation_map = #[[$MAP_ID6]]} : memref<8x12x16xf32>, vector<8xf32>
+// CHECK-NEXT: vector.transfer_write %[[S3]], %[[ARG1]][%[[ARG3]], %[[ARG4]], %[[ARG5]]] {{.*}} : vector<8xf32>, memref<8x24x48xf32>
// CHECK-NEXT: }
// CHECK-NEXT: }
// CHECK-NEXT: }
diff --git a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_outer_loop_2d.mlir b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_outer_loop_2d.mlir
index 6b8f03ba9c6b5..587c5a0e15525 100644
--- a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_outer_loop_2d.mlir
+++ b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_outer_loop_2d.mlir
@@ -13,7 +13,7 @@ func.func @vec2d(%A : memref<?x?x?xf32>) {
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} step 32
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} {
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} step 256
- // CHECK: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[map_proj_d0d1d2_d0d2]]} : memref<?x?x?xf32>, vector<32x256xf32>
+ // CHECK: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {{{.*}} permutation_map = #[[map_proj_d0d1d2_d0d2]]} : memref<?x?x?xf32>, vector<32x256xf32>
affine.for %i0 = 0 to %M {
affine.for %i1 = 0 to %N {
affine.for %i2 = 0 to %P {
diff --git a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_outer_loop_transpose_2d.mlir b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_outer_loop_transpose_2d.mlir
index 05465d734d0b3..5d4742d3baa45 100644
--- a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_outer_loop_transpose_2d.mlir
+++ b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_outer_loop_transpose_2d.mlir
@@ -25,7 +25,7 @@ func.func @vec2d(%A : memref<?x?x?xf32>) {
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} step 32
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} step 256
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} {
- // CHECK: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[map_proj_d0d1d2_d2d0]]} : memref<?x?x?xf32>, vector<32x256xf32>
+ // CHECK: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {{{.*}} permutation_map = #[[map_proj_d0d1d2_d2d0]]} : memref<?x?x?xf32>, vector<32x256xf32>
affine.for %i3 = 0 to %M {
affine.for %i4 = 0 to %N {
affine.for %i5 = 0 to %P {
@@ -46,12 +46,12 @@ func.func @vec2d_imperfectly_nested(%A : memref<?x?x?xf32>) {
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} step 32 {
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} {
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} step 256 {
- // CHECK: %{{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[map_proj_d0d1d2_d2d0]]} : memref<?x?x?xf32>, vector<32x256xf32>
+ // CHECK: %{{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {{{.*}} permutation_map = #[[map_proj_d0d1d2_d2d0]]} : memref<?x?x?xf32>, vector<32x256xf32>
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} step 256 {
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} {
- // CHECK: %{{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[map_proj_d0d1d2_d2d0]]} : memref<?x?x?xf32>, vector<32x256xf32>
+ // CHECK: %{{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {{{.*}} permutation_map = #[[map_proj_d0d1d2_d2d0]]} : memref<?x?x?xf32>, vector<32x256xf32>
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} {
- // CHECK: %{{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[map_proj_d0d1d2_d2d0]]} : memref<?x?x?xf32>, vector<32x256xf32>
+ // CHECK: %{{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {{{.*}} permutation_map = #[[map_proj_d0d1d2_d2d0]]} : memref<?x?x?xf32>, vector<32x256xf32>
affine.for %i0 = 0 to %0 {
affine.for %i1 = 0 to %1 {
affine.for %i2 = 0 to %2 {
diff --git a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_transpose_2d.mlir b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_transpose_2d.mlir
index f1662b78242ed..23cf4183e0440 100644
--- a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_transpose_2d.mlir
+++ b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_transpose_2d.mlir
@@ -25,7 +25,7 @@ func.func @vec2d(%A : memref<?x?x?xf32>) {
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} step 32
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} {
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} step 256
- // CHECK: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[map_proj_d0d1d2_d2d1]]} : memref<?x?x?xf32>, vector<32x256xf32>
+ // CHECK: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {{{.*}} permutation_map = #[[map_proj_d0d1d2_d2d1]]} : memref<?x?x?xf32>, vector<32x256xf32>
affine.for %i3 = 0 to %M {
affine.for %i4 = 0 to %N {
affine.for %i5 = 0 to %P {
@@ -46,12 +46,12 @@ func.func @vec2d_imperfectly_nested(%A : memref<?x?x?xf32>) {
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} step 32 {
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} step 256 {
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} {
- // CHECK: %{{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[map_proj_d0d1d2_d2d1]]} : memref<?x?x?xf32>, vector<32x256xf32>
+ // CHECK: %{{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {{{.*}} permutation_map = #[[map_proj_d0d1d2_d2d1]]} : memref<?x?x?xf32>, vector<32x256xf32>
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} {
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} step 256 {
- // CHECK: %{{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[map_proj_d0d1d2_d2d1]]} : memref<?x?x?xf32>, vector<32x256xf32>
+ // CHECK: %{{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {{{.*}} permutation_map = #[[map_proj_d0d1d2_d2d1]]} : memref<?x?x?xf32>, vector<32x256xf32>
// CHECK: affine.for %{{.*}} = 0 to %{{.*}} step 256 {
- // CHECK: %{{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[map_proj_d0d1d2_d2d1]]} : memref<?x?x?xf32>, vector<32x256xf32>
+ // CHECK: %{{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}, %{{.*}}], %{{.*}} {{{.*}} permutation_map = #[[map_proj_d0d1d2_d2d1]]} : memref<?x?x?xf32>, vector<32x256xf32>
affine.for %i0 = 0 to %0 {
affine.for %i1 = 0 to %1 {
affine.for %i2 = 0 to %2 {
diff --git a/mlir/test/Dialect/ArmSME/vector-legalization.mlir b/mlir/test/Dialect/ArmSME/vector-legalization.mlir
index 71d80bc16ea12..2eaf8bf1efd16 100644
--- a/mlir/test/Dialect/ArmSME/vector-legalization.mlir
+++ b/mlir/test/Dialect/ArmSME/vector-legalization.mlir
@@ -418,10 +418,10 @@ func.func @lift_illegal_transpose_to_memory(%a: index, %b: index, %memref: memre
// CHECK-NEXT: %[[READ_SUBVIEW:.*]] = memref.subview %[[MEMREF]][%[[INDEXA]], %[[INDEXB]]] [%[[C8_VSCALE]], 4] [1, 1] : memref<?x?xf32> to memref<?x4xf32, strided<[?, 1], offset: ?>>
// CHECK-NEXT: %[[CAST:.*]] = memref.cast %[[READ_SUBVIEW]] : memref<?x4xf32, strided<[?, 1], offset: ?>> to memref<?x?xf32, strided<[?, ?], offset: ?>>
// CHECK-NEXT: %[[TRANSPOSE:.*]] = memref.transpose %[[CAST]] (d0, d1) -> (d1, d0) : memref<?x?xf32, strided<[?, ?], offset: ?>> to memref<?x?xf32, strided<[?, ?], offset: ?>>
- // CHECK-NEXT: %[[LEGAL_READ:.*]] = vector.transfer_read %[[TRANSPOSE]][%c0, %c0], %[[C0_F32]] : memref<?x?xf32, strided<[?, ?], offset: ?>>, vector<4x[8]xf32>
+ // CHECK-NEXT: %[[LEGAL_READ:.*]] = vector.transfer_read %[[TRANSPOSE]][%c0, %c0], %[[C0_F32]] {{.*}} : memref<?x?xf32, strided<[?, ?], offset: ?>>, vector<4x[8]xf32>
// CHECK-NEXT: return %[[LEGAL_READ]]
%pad = arith.constant 0.0 : f32
- %illegalRead = vector.transfer_read %memref[%a, %b], %pad : memref<?x?xf32>, vector<[8]x4xf32>
+ %illegalRead = vector.transfer_read %memref[%a, %b], %pad {in_bounds = [false, false]}: memref<?x?xf32>, vector<[8]x4xf32>
%legalType = vector.transpose %illegalRead, [1, 0] : vector<[8]x4xf32> to vector<4x[8]xf32>
return %legalType : vector<4x[8]xf32>
}
@@ -438,11 +438,11 @@ func.func @lift_illegal_transpose_to_memory_with_mask(%dim0: index, %dim1: index
// CHECK-DAG: %[[TRANSPOSE:.*]] = memref.transpose %[[CAST]]
// CHECK-DAG: %[[MASK:.*]] = vector.create_mask %[[DIM1]], %[[DIM0]] : vector<4x[8]xi1>
// CHECK: %[[LEGAL_READ:.*]] = vector.transfer_read %[[TRANSPOSE]]
- // CHECK-SAME: %[[MASK]] : memref<?x?xf32, strided<[?, ?], offset: ?>>, vector<4x[8]xf32>
+ // CHECK-SAME: %[[MASK]] {{.*}} : memref<?x?xf32, strided<[?, ?], offset: ?>>, vector<4x[8]xf32>
// CHECK-NEXT: return %[[LEGAL_READ]]
%pad = arith.constant 0.0 : f32
%mask = vector.create_mask %dim0, %dim1 : vector<[8]x4xi1>
- %illegalRead = vector.transfer_read %memref[%a, %b], %pad, %mask : memref<?x?xf32>, vector<[8]x4xf32>
+ %illegalRead = vector.transfer_read %memref[%a, %b], %pad, %mask {in_bounds = [false, false]} : memref<?x?xf32>, vector<[8]x4xf32>
%legalType = vector.transpose %illegalRead, [1, 0] : vector<[8]x4xf32> to vector<4x[8]xf32>
return %legalType : vector<4x[8]xf32>
}
@@ -459,7 +459,7 @@ func.func @lift_illegal_transpose_to_memory_with_arith_extop(%a: index, %b: inde
// CHECK-NEXT: %[[EXT_TYPE:.*]] = arith.extsi %[[LEGAL_READ]] : vector<4x[8]xi8> to vector<4x[8]xi32>
// CHECK-NEXT: return %[[EXT_TYPE]]
%pad = arith.constant 0 : i8
- %illegalRead = vector.transfer_read %memref[%a, %b], %pad : memref<?x?xi8>, vector<[8]x4xi8>
+ %illegalRead = vector.transfer_read %memref[%a, %b], %pad {in_bounds = [false, false]} : memref<?x?xi8>, vector<[8]x4xi8>
%extRead = arith.extsi %illegalRead : vector<[8]x4xi8> to vector<[8]x4xi32>
%legalType = vector.transpose %extRead, [1, 0] : vector<[8]x4xi32> to vector<4x[8]xi32>
return %legalType : vector<4x[8]xi32>
diff --git a/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-analysis-bottom-up-from-terminators.mlir b/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-analysis-bottom-up-from-terminators.mlir
index 1b75edc4c157f..d57d812dd3ad1 100644
--- a/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-analysis-bottom-up-from-terminators.mlir
+++ b/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-analysis-bottom-up-from-terminators.mlir
@@ -16,7 +16,7 @@ func.func @simple_test(%lb: index, %ub: index, %step: index, %f1: f32, %f2: f32)
%2 = linalg.fill ins(%f1 : f32) outs(%t : tensor<5xf32>) -> tensor<5xf32>
// CHECK: linalg.fill {__inplace_operands_attr__ = ["none", "true"]}
%3 = linalg.fill ins(%f2 : f32) outs(%t : tensor<5xf32>) -> tensor<5xf32>
- %4 = vector.transfer_read %2[%c0], %p : tensor<5xf32>, vector<5xf32>
+ %4 = vector.transfer_read %2[%c0], %p {in_bounds=[false]} : tensor<5xf32>, vector<5xf32>
vector.print %4 : vector<5xf32>
scf.yield %3 : tensor<5xf32>
}
@@ -27,7 +27,7 @@ func.func @simple_test(%lb: index, %ub: index, %step: index, %f1: f32, %f2: f32)
%7 = linalg.fill ins(%f1 : f32) outs(%t : tensor<5xf32>) -> tensor<5xf32>
// CHECK: linalg.fill {__inplace_operands_attr__ = ["none", "false"]}
%8 = linalg.fill ins(%f2 : f32) outs(%t : tensor<5xf32>) -> tensor<5xf32>
- %9 = vector.transfer_read %8[%c0], %p : tensor<5xf32>, vector<5xf32>
+ %9 = vector.transfer_read %8[%c0], %p {in_bounds=[false]} : tensor<5xf32>, vector<5xf32>
vector.print %9 : vector<5xf32>
scf.yield %7 : tensor<5xf32>
}
diff --git a/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-partial.mlir b/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-partial.mlir
index 9380c81ce235c..9c236df843a0c 100644
--- a/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-partial.mlir
+++ b/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-partial.mlir
@@ -29,7 +29,7 @@ func.func @use_of_unknown_op_1(%t1: tensor<?xf32>)
// CHECK: vector.transfer_read %[[dummy_memref]][%{{.*}}], %{{.*}} : memref<?xf32, strided<[?], offset: ?>>
// CHECK-NO-LAYOUT-MAP: %[[dummy_memref:.*]] = bufferization.to_memref %[[dummy]] : memref<?xf32>
// CHECK-NO-LAYOUT-MAP: vector.transfer_read %[[dummy_memref]][%{{.*}}], %{{.*}} : memref<?xf32>
- %1 = vector.transfer_read %0[%idx], %cst : tensor<?xf32>, vector<5xf32>
+ %1 = vector.transfer_read %0[%idx], %cst {in_bounds=[false]} : tensor<?xf32>, vector<5xf32>
return %1 : vector<5xf32>
}
@@ -57,13 +57,13 @@ func.func @use_of_unknown_op_3(%t1: tensor<?xf32>)
%cst = arith.constant 0.0 : f32
// CHECK: %[[m1:.*]] = bufferization.to_memref %[[t1]]
// CHECK: %[[v1:.*]] = vector.transfer_read %[[m1]]
- %1 = vector.transfer_read %t1[%idx], %cst : tensor<?xf32>, vector<5xf32>
+ %1 = vector.transfer_read %t1[%idx], %cst {in_bounds=[false]} : tensor<?xf32>, vector<5xf32>
// CHECK: %[[dummy:.*]] = "test.dummy_op"(%[[t1]])
%0 = "test.dummy_op"(%t1) : (tensor<?xf32>) -> tensor<?xf32>
// CHECK: %[[dummy_memref:.*]] = bufferization.to_memref %[[dummy]] : memref<?xf32, strided<[?], offset: ?>>
// CHECK: %[[v2:.*]] = vector.transfer_read %[[dummy_memref]]
- %2 = vector.transfer_read %0[%idx], %cst : tensor<?xf32>, vector<5xf32>
+ %2 = vector.transfer_read %0[%idx], %cst {in_bounds=[false]} : tensor<?xf32>, vector<5xf32>
// CHECK: return %[[v1]], %[[v2]]
return %1, %2 : vector<5xf32>, vector<5xf32>
@@ -83,7 +83,7 @@ func.func @use_of_unknown_op_4(%t1: tensor<?xf32>)
// CHECK: %[[dummy_memref:.*]] = bufferization.to_memref %[[dummy]]
// CHECK: %[[v1:.*]] = vector.transfer_read %[[dummy_memref]]
- %1 = vector.transfer_read %0[%idx], %cst : tensor<?xf32>, vector<5xf32>
+ %1 = vector.transfer_read %0[%idx], %cst {in_bounds=[false]} : tensor<?xf32>, vector<5xf32>
// CHECK: %[[another_dummy:.*]] = "test.another_dummy_op"(%[[dummy]])
%2 = "test.another_dummy_op"(%0) : (tensor<?xf32>) -> tensor<?xf32>
@@ -121,7 +121,7 @@ func.func @unused_unknown_op(%t1 : tensor<?xf32>) -> vector<5xf32> {
// CHECK: %[[m1:.*]] = bufferization.to_memref %[[t1]]
// CHECK: vector.transfer_read %[[m1]]
- %1 = vector.transfer_read %t1[%idx], %cst : tensor<?xf32>, vector<5xf32>
+ %1 = vector.transfer_read %t1[%idx], %cst {in_bounds=[false]} : tensor<?xf32>, vector<5xf32>
// CHECK: "test.dummy_op"(%[[t1]])
"test.dummy_op"(%t1) : (tensor<?xf32>) -> ()
@@ -150,7 +150,7 @@ func.func @unknown_op_may_read(%v: vector<5xf32>)
// CHECK: memref.copy %[[m1]], %[[alloc]]
// CHECK: vector.transfer_write %{{.*}}, %[[alloc]]
// CHECK: %[[alloc_tensor:.*]] = bufferization.to_tensor %[[alloc]]
- %1 = vector.transfer_write %v, %filled[%idx] : vector<5xf32>, tensor<10xf32>
+ %1 = vector.transfer_write %v, %filled[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<10xf32>
// CHECK: %[[dummy:.*]] = "test.dummy_op"(%[[filled_tensor]])
%2 = "test.dummy_op"(%filled) : (tensor<10xf32>) -> (tensor<10xf32>)
@@ -174,7 +174,7 @@ func.func @unknown_op_not_writable(
// CHECK: %[[alloc:.*]] = memref.alloc(%[[dim]])
// CHECK: memref.copy %[[dummy_memref]], %[[alloc]]
// CHECK: vector.transfer_write %{{.*}}, %[[alloc]]
- %1 = vector.transfer_write %v, %0[%idx] : vector<5xf32>, tensor<?xf32>
+ %1 = vector.transfer_write %v, %0[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
// CHECK: %[[alloc_tensor:.*]] = bufferization.to_tensor %[[alloc]]
// CHECK: return %[[alloc_tensor]]
diff --git a/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize.mlir b/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize.mlir
index dbf8d6563477b..fdc1268ea6d93 100644
--- a/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize.mlir
+++ b/mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize.mlir
@@ -41,7 +41,7 @@ func.func @use_tensor_func_arg(%A : tensor<?xf32>) -> (vector<4xf32>) {
// CHECK: %[[A_memref:.*]] = bufferization.to_memref %[[A]]
// CHECK: %[[res:.*]] = vector.transfer_read %[[A_memref]]
- %0 = vector.transfer_read %A[%c0], %f0 : tensor<?xf32>, vector<4xf32>
+ %0 = vector.transfer_read %A[%c0], %f0 {in_bounds=[false]} : tensor<?xf32>, vector<4xf32>
// CHECK: return %[[res]]
return %0 : vector<4xf32>
@@ -60,7 +60,7 @@ func.func @return_tensor(%A : tensor<?xf32>, %v : vector<4xf32>) -> (tensor<?xf3
// CHECK: memref.copy %[[A_memref]], %[[alloc]]
// CHECK: vector.transfer_write %{{.*}}, %[[alloc]]
// CHECK: %[[res_tensor:.*]] = bufferization.to_tensor %[[alloc]]
- %0 = vector.transfer_write %v, %A[%c0] : vector<4xf32>, tensor<?xf32>
+ %0 = vector.transfer_write %v, %A[%c0] {in_bounds=[false]} : vector<4xf32>, tensor<?xf32>
// CHECK: return %[[res_tensor]]
return %0 : tensor<?xf32>
@@ -75,11 +75,11 @@ func.func @func_without_tensor_args(%v : vector<10xf32>) -> () {
%c0 = arith.constant 0 : index
// CHECK: vector.transfer_write %{{.*}}, %[[alloc]]
- %1 = vector.transfer_write %v, %0[%c0] : vector<10xf32>, tensor<10xf32>
+ %1 = vector.transfer_write %v, %0[%c0] {in_bounds=[false]} : vector<10xf32>, tensor<10xf32>
%cst = arith.constant 0.0 : f32
// CHECK: vector.transfer_read %[[alloc]]
- %r = vector.transfer_read %1[%c0], %cst : tensor<10xf32>, vector<11xf32>
+ %r = vector.transfer_read %1[%c0], %cst {in_bounds=[false]} : tensor<10xf32>, vector<11xf32>
vector.print %r : vector<11xf32>
return
@@ -268,4 +268,4 @@ func.func @materialize_in_dest_raw(%f: f32, %f2: f32, %idx: index) -> (tensor<5x
%r = tensor.extract %dest_filled[%idx] : tensor<5xf32>
return %0, %r : tensor<5xf32>, f32
-}
\ No newline at end of file
+}
diff --git a/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-analysis.mlir b/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-analysis.mlir
index 42d9cc00d3ff5..b52cc0fc4dee9 100644
--- a/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-analysis.mlir
+++ b/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-analysis.mlir
@@ -271,7 +271,7 @@ func.func @read_of_matching_insert_slice_source(
// CHECK-SAME: {__inplace_operands_attr__ = ["true", "true", "none", "none"]}
%2 = tensor.insert_slice %1 into %A[%idx][%idx][1] : tensor<?xf32> into tensor<?xf32>
- %3 = vector.transfer_read %1[%idx2], %cst2 : tensor<?xf32>, vector<5xf32>
+ %3 = vector.transfer_read %1[%idx2], %cst2 {in_bounds=[false]} : tensor<?xf32>, vector<5xf32>
// CHECK: return
// CHECK-SAME: __equivalent_func_args__ = [0, -1]
@@ -311,7 +311,7 @@ func.func @read_of_matching_insert_slice_source_interleaved(
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "true"]}
%5 = linalg.fill ins(%cst : f32) outs(%4 : tensor<?xf32>) -> tensor<?xf32>
- %3 = vector.transfer_read %1[%idx2], %cst2 : tensor<?xf32>, vector<5xf32>
+ %3 = vector.transfer_read %1[%idx2], %cst2 {in_bounds=[false]} : tensor<?xf32>, vector<5xf32>
// CHECK: tensor.insert_slice
// CHECK-SAME: {__inplace_operands_attr__ = ["true", "true", "none", "none"]}
@@ -670,8 +670,8 @@ func.func @write_into_constant_via_alias(%v : vector<5xi32>,
// CHECK-SAME: {__inplace_operands_attr__ = ["false", "none", "none"]}
%b = tensor.extract_slice %A[%s1][%s2][1] : tensor<4xi32> to tensor<?xi32>
// CHECK: vector.transfer_write
- // CHECK-SAME: {__inplace_operands_attr__ = ["none", "true", "none"]}
- %r = vector.transfer_write %v, %b[%s3] : vector<5xi32>, tensor<?xi32>
+ // CHECK-SAME: {__inplace_operands_attr__ = ["none", "true", "none"], {{.*}}}
+ %r = vector.transfer_write %v, %b[%s3] {in_bounds=[false]} : vector<5xi32>, tensor<?xi32>
return %r : tensor<?xi32>
}
@@ -732,7 +732,7 @@ func.func @matmul_on_tensors(
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "true", "none", "none"]
%8 = linalg.fill ins(%cst_0 : f32) outs(%7 : tensor<256x256xf32>) -> tensor<256x256xf32>
- %9 = vector.transfer_read %arg0[%c0, %c0], %cst_0 {in_bounds = [false, true]} : tensor<518x518xf32>, vector<256x256xf32>
+ %9 = vector.transfer_read %arg0[%c0, %c0], %cst_0 {in_bounds=[false, true]} : tensor<518x518xf32>, vector<256x256xf32>
%10 = vector.transfer_write %9, %8[%c0, %c0] {in_bounds = [true, true]} : vector<256x256xf32>, tensor<256x256xf32>
// CHECK: linalg.fill
@@ -791,7 +791,7 @@ func.func @insert_slice_chain(
%2 = tensor.extract_slice %0[0, 0] [32, 90] [1, 1] : tensor<62x90xf32> to tensor<32x90xf32>
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "true", "none", "none"]
- %7 = vector.transfer_write %v1, %2[%c0, %c0] {in_bounds = [true, true]} : vector<32x90xf32>, tensor<32x90xf32>
+ %7 = vector.transfer_write %v1, %2[%c0, %c0] {in_bounds=[true, true]} : vector<32x90xf32>, tensor<32x90xf32>
// CHECK: tensor.insert_slice
// CHECK-SAME: {__inplace_operands_attr__ = ["true", "true"]
%8 = tensor.insert_slice %7 into %0[0, 0] [32, 90] [1, 1] : tensor<32x90xf32> into tensor<62x90xf32>
@@ -801,7 +801,7 @@ func.func @insert_slice_chain(
%10 = tensor.extract_slice %8[32, 0] [30, 90] [1, 1] : tensor<62x90xf32> to tensor<30x90xf32>
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "true", "none", "none"]
- %14 = vector.transfer_write %v2, %10[%c0, %c0] {in_bounds = [true, true]} : vector<30x90xf32>, tensor<30x90xf32>
+ %14 = vector.transfer_write %v2, %10[%c0, %c0] {in_bounds=[true, true]} : vector<30x90xf32>, tensor<30x90xf32>
// CHECK: tensor.insert_slice
// CHECK-SAME: {__inplace_operands_attr__ = ["true", "true"]
%15 = tensor.insert_slice %14 into %8[32, 0] [30, 90] [1, 1] : tensor<30x90xf32> into tensor<62x90xf32>
@@ -829,7 +829,7 @@ func.func @ip(%t: tensor<10x20xf32> {bufferization.writable = true},
%r = scf.for %arg0 = %c0 to %c257 step %c256 iter_args(%arg1 = %t) -> (tensor<10x20xf32>) {
%t1 = tensor.extract_slice %arg1[%x, 0] [5, %y] [1, 1] : tensor<10x20xf32> to tensor<5x?xf32>
%t11 = tensor.extract_slice %t1[0, 0] [5, %y] [1, 1] : tensor<5x?xf32> to tensor<5x?xf32>
- %t2 = vector.transfer_write %v, %t11[%c0, %c0] : vector<5x6xf32>, tensor<5x?xf32>
+ %t2 = vector.transfer_write %v, %t11[%c0, %c0] {in_bounds=[false, false]} : vector<5x6xf32>, tensor<5x?xf32>
%t3 = tensor.insert_slice %t2 into %arg1[%x, 0] [5, %y] [1, 1] : tensor<5x?xf32> into tensor<10x20xf32>
scf.yield %t3 : tensor<10x20xf32>
}
@@ -1044,7 +1044,7 @@ func.func @some_use(%A : tensor<?xf32> {bufferization.writable = true},
%idx = arith.constant 0 : index
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "true", "none"]
- %0 = vector.transfer_write %v, %A[%idx] : vector<5xf32>, tensor<?xf32>
+ %0 = vector.transfer_write %v, %A[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
return %0 : tensor<?xf32>
}
@@ -1069,11 +1069,11 @@ func.func @to_tensor_op_not_writable(%m: memref<?xf32>, %v: vector<5xf32>,
// Write to the tensor. Cannot be inplace due to tensor_load.
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "false", "none"]
- %w = vector.transfer_write %v, %0[%idx1] : vector<5xf32>, tensor<?xf32>
+ %w = vector.transfer_write %v, %0[%idx1] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
// Read from the tensor and return result.
%cst = arith.constant 0.0 : f32
- %r = vector.transfer_read %w[%idx2], %cst : tensor<?xf32>, vector<10xf32>
+ %r = vector.transfer_read %w[%idx2], %cst {in_bounds=[false]} : tensor<?xf32>, vector<10xf32>
return %r : vector<10xf32>
}
diff --git a/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-force-copy-before-write.mlir b/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-force-copy-before-write.mlir
index 7685f2ef3aafe..22f7010d26e1d 100644
--- a/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-force-copy-before-write.mlir
+++ b/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-force-copy-before-write.mlir
@@ -30,7 +30,7 @@ module {
func.func @contains_to_memref_op(%arg0: tensor<?xf32> {bufferization.writable = true}, %arg1: index) -> vector<5xf32> {
%0 = bufferization.to_memref %arg0 : memref<?xf32>
%cst = arith.constant 0.000000e+00 : f32
- %1 = vector.transfer_read %0[%arg1], %cst : memref<?xf32>, vector<5xf32>
+ %1 = vector.transfer_read %0[%arg1], %cst {in_bounds=[false]} : memref<?xf32>, vector<5xf32>
return %1 : vector<5xf32>
}
-}
\ No newline at end of file
+}
diff --git a/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize.mlir b/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize.mlir
index 0248afb11f167..f9b273476840d 100644
--- a/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize.mlir
+++ b/mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize.mlir
@@ -117,11 +117,11 @@ func.func @func_without_tensor_args(%v : vector<10xf32>) -> () {
%c0 = arith.constant 0 : index
// CHECK: vector.transfer_write %{{.*}}, %[[alloc]]
- %1 = vector.transfer_write %v, %0[%c0] : vector<10xf32>, tensor<10xf32>
+ %1 = vector.transfer_write %v, %0[%c0] {in_bounds=[false]} : vector<10xf32>, tensor<10xf32>
%cst = arith.constant 0.0 : f32
// CHECK: vector.transfer_read %[[alloc]]
- %r = vector.transfer_read %1[%c0], %cst : tensor<10xf32>, vector<11xf32>
+ %r = vector.transfer_read %1[%c0], %cst {in_bounds=[false]} : tensor<10xf32>, vector<11xf32>
vector.print %r : vector<11xf32>
return
@@ -593,7 +593,7 @@ func.func @transfer_read(
%f0 = arith.constant 0.0 : f32
// CHECK: %[[RES:.*]] = vector.transfer_read {{.*}} : memref<?xf32, strided{{.*}}>, vector<4xf32>
- %0 = vector.transfer_read %A[%c0], %f0 : tensor<?xf32>, vector<4xf32>
+ %0 = vector.transfer_read %A[%c0], %f0 {in_bounds=[false]} : tensor<?xf32>, vector<4xf32>
// CHECK: return %[[RES]] : vector<4xf32>
return %0 : vector<4xf32>
@@ -646,7 +646,7 @@ func.func @to_memref_op_unsupported(
// CHECK: vector.transfer_read %[[arg0]]
%cst = arith.constant 0.0 : f32
- %r1 = vector.transfer_read %t1[%idx3], %cst : tensor<?xf32>, vector<5xf32>
+ %r1 = vector.transfer_read %t1[%idx3], %cst {in_bounds=[false]} : tensor<?xf32>, vector<5xf32>
return %r1 : vector<5xf32>
}
diff --git a/mlir/test/Dialect/Bufferization/Transforms/transform-ops.mlir b/mlir/test/Dialect/Bufferization/Transforms/transform-ops.mlir
index 3c50a9e72d9d9..e7711dace6617 100644
--- a/mlir/test/Dialect/Bufferization/Transforms/transform-ops.mlir
+++ b/mlir/test/Dialect/Bufferization/Transforms/transform-ops.mlir
@@ -21,7 +21,7 @@ func.func @test_function(%A : tensor<?xf32>, %v : vector<4xf32>) -> (tensor<?xf3
// CHECK: memref.copy %[[A_memref]], %[[alloc]]
// CHECK: vector.transfer_write %{{.*}}, %[[alloc]]
// CHECK: %[[res_tensor:.*]] = bufferization.to_tensor %[[alloc]]
- %0 = vector.transfer_write %v, %A[%c0] : vector<4xf32>, tensor<?xf32>
+ %0 = vector.transfer_write %v, %A[%c0] {in_bounds=[false]} : vector<4xf32>, tensor<?xf32>
// CHECK: return %[[res_tensor]]
return %0 : tensor<?xf32>
@@ -51,7 +51,7 @@ func.func @test_function(%A : tensor<?xf32>, %v : vector<4xf32>) -> (tensor<?xf3
// CHECK: linalg.copy ins(%[[A_memref]] : memref<{{.*}}>) outs(%[[alloc]]
// CHECK: vector.transfer_write %{{.*}}, %[[alloc]]
// CHECK: %[[res_tensor:.*]] = bufferization.to_tensor %[[alloc]]
- %0 = vector.transfer_write %v, %A[%c0] : vector<4xf32>, tensor<?xf32>
+ %0 = vector.transfer_write %v, %A[%c0] {in_bounds=[false]} : vector<4xf32>, tensor<?xf32>
// CHECK: return %[[res_tensor]]
return %0 : tensor<?xf32>
@@ -75,9 +75,9 @@ module attributes {transform.with_named_sequence} {
func.func @test_function_analysis(%A : tensor<?xf32>, %v : vector<4xf32>) -> (tensor<?xf32>) {
%c0 = arith.constant 0 : index
// CHECK: vector.transfer_write
- // CHECK-SAME: {__inplace_operands_attr__ = ["none", "false", "none"]}
+ // CHECK-SAME: {__inplace_operands_attr__ = ["none", "false", "none"], {{.*}}}
// CHECK-SAME: tensor<?xf32>
- %0 = vector.transfer_write %v, %A[%c0] : vector<4xf32>, tensor<?xf32>
+ %0 = vector.transfer_write %v, %A[%c0] {in_bounds=[false]} : vector<4xf32>, tensor<?xf32>
return %0 : tensor<?xf32>
}
@@ -123,7 +123,7 @@ module {
// CHECK: memref.copy %[[A_memref]], %[[alloc]]
// CHECK: vector.transfer_write %{{.*}}, %[[alloc]]
// CHECK: %[[res_tensor:.*]] = bufferization.to_tensor %[[alloc]]
- %0 = vector.transfer_write %v, %A[%c0] : vector<4xf32>, tensor<?xf32>
+ %0 = vector.transfer_write %v, %A[%c0] {in_bounds=[false]} : vector<4xf32>, tensor<?xf32>
// CHECK: return %[[res_tensor]]
return %0 : tensor<?xf32>
diff --git a/mlir/test/Dialect/Linalg/forward-vector-transfers.mlir b/mlir/test/Dialect/Linalg/forward-vector-transfers.mlir
index 3530770580782..418d15140a457 100644
--- a/mlir/test/Dialect/Linalg/forward-vector-transfers.mlir
+++ b/mlir/test/Dialect/Linalg/forward-vector-transfers.mlir
@@ -6,7 +6,7 @@
// CHECK-NOT: memref.copy
// CHECK: %[[ALLOC:.*]] = memref.alloc
// CHECK: vector.transfer_read %[[ARG0]]
-// CHECK-NOT: in_bounds
+// CHECK: in_bounds = [false]
func.func @testAllocRead(%in: memref<? x f32>) -> vector<32 x f32> {
%c0 = arith.constant 0: index
%f0 = arith.constant 0.0: f32
@@ -24,7 +24,7 @@ func.func @testAllocRead(%in: memref<? x f32>) -> vector<32 x f32> {
// CHECK-NOT: memref.copy
// CHECK: %[[ALLOC:.*]] = memref.alloc
// CHECK: vector.transfer_read %[[ARG0]]
-// CHECK-NOT: in_bounds
+// CHECK: in_bounds = [false]
func.func @testAllocFillRead(%in: memref<? x f32>) -> vector<32 x f32> {
%c0 = arith.constant 0: index
%f0 = arith.constant 0.0: f32
@@ -44,6 +44,7 @@ func.func @testAllocFillRead(%in: memref<? x f32>) -> vector<32 x f32> {
// CHECK: %[[ALLOC:.*]] = memref.alloc
// CHECK: vector.transfer_read %[[ARG0]]
// CHECK-NOT: in_bounds
+// CHECK: in_bounds = [false]
func.func @testViewRead(%in: memref<? x f32>) -> vector<32 x f32> {
%c0 = arith.constant 0: index
%f0 = arith.constant 0.0: f32
@@ -63,6 +64,7 @@ func.func @testViewRead(%in: memref<? x f32>) -> vector<32 x f32> {
// CHECK: %[[ALLOC:.*]] = memref.alloc
// CHECK: vector.transfer_read %[[ARG0]]
// CHECK-NOT: in_bounds
+// CHECK: in_bounds = [false]
func.func @testViewFillRead(%in: memref<? x f32>) -> vector<32 x f32> {
%c0 = arith.constant 0: index
%f0 = arith.constant 0.0: f32
@@ -83,6 +85,7 @@ func.func @testViewFillRead(%in: memref<? x f32>) -> vector<32 x f32> {
// CHECK: %[[ALLOC:.*]] = memref.alloc
// CHECK: vector.transfer_write %[[ARG0]], %[[ARG1]]
// CHECK-NOT: in_bounds
+// CHECK: in_bounds = [false]
func.func @testAllocWrite(%vec: vector<32 x f32>, %out: memref<? x f32>) {
%c0 = arith.constant 0: index
%f0 = arith.constant 0.0: f32
@@ -100,7 +103,7 @@ func.func @testAllocWrite(%vec: vector<32 x f32>, %out: memref<? x f32>) {
// CHECK-NOT: memref.copy
// CHECK: %[[ALLOC:.*]] = memref.alloc
// CHECK: vector.transfer_write %[[ARG0]], %[[ARG1]]
-// CHECK-NOT: in_bounds
+// CHECK: in_bounds = [false]
func.func @testViewWrite(%vec: vector<32 x f32>, %out: memref<? x f32>) {
%c0 = arith.constant 0: index
%f0 = arith.constant 0.0: f32
@@ -133,7 +136,7 @@ func.func @failAllocFillRead(%in: memref<? x f32>) -> vector<32 x f32> {
%subview = memref.subview %alloc[0][16][1] : memref<32 x f32> to memref<16 x f32>
memref.copy %in, %subview : memref<? x f32> to memref<16 x f32>
"some_interleaved_use"(%subview) : (memref<16 x f32>) -> ()
- %0 = vector.transfer_read %alloc[%c0], %f1: memref<32 x f32>, vector<32 x f32>
+ %0 = vector.transfer_read %alloc[%c0], %f1 {in_bounds = [false]} : memref<32 x f32>, vector<32 x f32>
memref.dealloc %alloc : memref<32 x f32>
return %0: vector<32 x f32>
}
@@ -151,7 +154,7 @@ func.func @failAllocWrite(%vec: vector<32 x f32>, %out: memref<? x f32>) {
%f0 = arith.constant 0.0: f32
%alloc = memref.alloc() : memref<32 x f32>
%subview = memref.subview %alloc[0][16][1] : memref<32 x f32> to memref<16 x f32>
- vector.transfer_write %vec, %alloc[%c0] : vector<32 x f32>, memref<32 x f32>
+ vector.transfer_write %vec, %alloc[%c0] {in_bounds = [false]} : vector<32 x f32>, memref<32 x f32>
"some_interleaved_use"(%subview) : (memref<16 x f32>) -> ()
memref.copy %subview, %out : memref<16 x f32> to memref<? x f32>
memref.dealloc %alloc : memref<32 x f32>
diff --git a/mlir/test/Dialect/Linalg/hoisting.mlir b/mlir/test/Dialect/Linalg/hoisting.mlir
index 241b8a486c012..653e5a3df1d30 100644
--- a/mlir/test/Dialect/Linalg/hoisting.mlir
+++ b/mlir/test/Dialect/Linalg/hoisting.mlir
@@ -46,13 +46,13 @@ func.func @hoist_vector_transfer_pairs(
// CHECK: "unrelated_use"(%[[MEMREF1]]) : (memref<?x?xf32>) -> ()
scf.for %i = %lb to %ub step %step {
scf.for %j = %lb to %ub step %step {
- %r0 = vector.transfer_read %memref1[%c0, %c0], %cst: memref<?x?xf32>, vector<1xf32>
- %r1 = vector.transfer_read %memref0[%i, %i], %cst: memref<?x?xf32>, vector<2xf32>
- %r2 = vector.transfer_read %memref2[%c0, %c0], %cst: memref<?x?xf32>, vector<3xf32>
- %r3 = vector.transfer_read %memref3[%c0, %c0], %cst: memref<?x?xf32>, vector<4xf32>
+ %r0 = vector.transfer_read %memref1[%c0, %c0], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<1xf32>
+ %r1 = vector.transfer_read %memref0[%i, %i], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<2xf32>
+ %r2 = vector.transfer_read %memref2[%c0, %c0], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<3xf32>
+ %r3 = vector.transfer_read %memref3[%c0, %c0], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<4xf32>
"some_crippling_use"(%memref4) : (memref<?x?xf32>) -> ()
- %r4 = vector.transfer_read %memref4[%c0, %c0], %cst: memref<?x?xf32>, vector<5xf32>
- %r5 = vector.transfer_read %memref5[%c0, %c0], %cst: memref<?x?xf32>, vector<6xf32>
+ %r4 = vector.transfer_read %memref4[%c0, %c0], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<5xf32>
+ %r5 = vector.transfer_read %memref5[%c0, %c0], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<6xf32>
"some_crippling_use"(%memref5) : (memref<?x?xf32>) -> ()
%u0 = "some_use"(%r0) : (vector<1xf32>) -> vector<1xf32>
%u1 = "some_use"(%r1) : (vector<2xf32>) -> vector<2xf32>
@@ -60,12 +60,12 @@ func.func @hoist_vector_transfer_pairs(
%u3 = "some_use"(%r3) : (vector<4xf32>) -> vector<4xf32>
%u4 = "some_use"(%r4) : (vector<5xf32>) -> vector<5xf32>
%u5 = "some_use"(%r5) : (vector<6xf32>) -> vector<6xf32>
- vector.transfer_write %u0, %memref1[%c0, %c0] : vector<1xf32>, memref<?x?xf32>
- vector.transfer_write %u1, %memref0[%i, %i] : vector<2xf32>, memref<?x?xf32>
- vector.transfer_write %u2, %memref2[%c0, %c0] : vector<3xf32>, memref<?x?xf32>
- vector.transfer_write %u3, %memref3[%c0, %c0] : vector<4xf32>, memref<?x?xf32>
- vector.transfer_write %u4, %memref4[%c0, %c0] : vector<5xf32>, memref<?x?xf32>
- vector.transfer_write %u5, %memref5[%c0, %c0] : vector<6xf32>, memref<?x?xf32>
+ vector.transfer_write %u0, %memref1[%c0, %c0] {in_bounds=[false]} : vector<1xf32>, memref<?x?xf32>
+ vector.transfer_write %u1, %memref0[%i, %i] {in_bounds=[false]} : vector<2xf32>, memref<?x?xf32>
+ vector.transfer_write %u2, %memref2[%c0, %c0] {in_bounds=[false]} : vector<3xf32>, memref<?x?xf32>
+ vector.transfer_write %u3, %memref3[%c0, %c0] {in_bounds=[false]} : vector<4xf32>, memref<?x?xf32>
+ vector.transfer_write %u4, %memref4[%c0, %c0] {in_bounds=[false]} : vector<5xf32>, memref<?x?xf32>
+ vector.transfer_write %u5, %memref5[%c0, %c0] {in_bounds=[false]} : vector<6xf32>, memref<?x?xf32>
"some_crippling_use"(%memref3) : (memref<?x?xf32>) -> ()
}
"unrelated_use"(%memref0) : (memref<?x?xf32>) -> ()
@@ -136,14 +136,14 @@ func.func @hoist_vector_transfer_pairs_disjoint(
// CHECK: vector.transfer_write %{{.*}}, %[[MEMREF2]]{{.*}} : vector<3xf32>, memref<?x?xf32>
scf.for %i = %lb to %ub step %step {
scf.for %j = %lb to %ub step %step {
- %r00 = vector.transfer_read %memref1[%c0, %c0], %cst: memref<?x?xf32>, vector<2xf32>
- %r01 = vector.transfer_read %memref1[%c0, %c1], %cst: memref<?x?xf32>, vector<2xf32>
- %r20 = vector.transfer_read %memref2[%c0, %c0], %cst: memref<?x?xf32>, vector<3xf32>
- %r21 = vector.transfer_read %memref2[%c0, %c3], %cst: memref<?x?xf32>, vector<3xf32>
- %r30 = vector.transfer_read %memref3[%c0, %random_index], %cst: memref<?x?xf32>, vector<4xf32>
- %r31 = vector.transfer_read %memref3[%c1, %random_index], %cst: memref<?x?xf32>, vector<4xf32>
- %r10 = vector.transfer_read %memref0[%i, %i], %cst: memref<?x?xf32>, vector<2xf32>
- %r11 = vector.transfer_read %memref0[%random_index, %random_index], %cst: memref<?x?xf32>, vector<2xf32>
+ %r00 = vector.transfer_read %memref1[%c0, %c0], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<2xf32>
+ %r01 = vector.transfer_read %memref1[%c0, %c1], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<2xf32>
+ %r20 = vector.transfer_read %memref2[%c0, %c0], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<3xf32>
+ %r21 = vector.transfer_read %memref2[%c0, %c3], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<3xf32>
+ %r30 = vector.transfer_read %memref3[%c0, %random_index], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<4xf32>
+ %r31 = vector.transfer_read %memref3[%c1, %random_index], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<4xf32>
+ %r10 = vector.transfer_read %memref0[%i, %i], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<2xf32>
+ %r11 = vector.transfer_read %memref0[%random_index, %random_index], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<2xf32>
%u00 = "some_use"(%r00) : (vector<2xf32>) -> vector<2xf32>
%u01 = "some_use"(%r01) : (vector<2xf32>) -> vector<2xf32>
%u20 = "some_use"(%r20) : (vector<3xf32>) -> vector<3xf32>
@@ -152,14 +152,14 @@ func.func @hoist_vector_transfer_pairs_disjoint(
%u31 = "some_use"(%r31) : (vector<4xf32>) -> vector<4xf32>
%u10 = "some_use"(%r10) : (vector<2xf32>) -> vector<2xf32>
%u11 = "some_use"(%r11) : (vector<2xf32>) -> vector<2xf32>
- vector.transfer_write %u00, %memref1[%c0, %c0] : vector<2xf32>, memref<?x?xf32>
- vector.transfer_write %u01, %memref1[%c0, %c1] : vector<2xf32>, memref<?x?xf32>
- vector.transfer_write %u20, %memref2[%c0, %c0] : vector<3xf32>, memref<?x?xf32>
- vector.transfer_write %u21, %memref2[%c0, %c3] : vector<3xf32>, memref<?x?xf32>
- vector.transfer_write %u30, %memref3[%c0, %random_index] : vector<4xf32>, memref<?x?xf32>
- vector.transfer_write %u31, %memref3[%c1, %random_index] : vector<4xf32>, memref<?x?xf32>
- vector.transfer_write %u10, %memref0[%i, %i] : vector<2xf32>, memref<?x?xf32>
- vector.transfer_write %u11, %memref0[%random_index, %random_index] : vector<2xf32>, memref<?x?xf32>
+ vector.transfer_write %u00, %memref1[%c0, %c0] {in_bounds=[false]} : vector<2xf32>, memref<?x?xf32>
+ vector.transfer_write %u01, %memref1[%c0, %c1] {in_bounds=[false]} : vector<2xf32>, memref<?x?xf32>
+ vector.transfer_write %u20, %memref2[%c0, %c0] {in_bounds=[false]} : vector<3xf32>, memref<?x?xf32>
+ vector.transfer_write %u21, %memref2[%c0, %c3] {in_bounds=[false]} : vector<3xf32>, memref<?x?xf32>
+ vector.transfer_write %u30, %memref3[%c0, %random_index] {in_bounds=[false]} : vector<4xf32>, memref<?x?xf32>
+ vector.transfer_write %u31, %memref3[%c1, %random_index] {in_bounds=[false]} : vector<4xf32>, memref<?x?xf32>
+ vector.transfer_write %u10, %memref0[%i, %i] {in_bounds=[false]} : vector<2xf32>, memref<?x?xf32>
+ vector.transfer_write %u11, %memref0[%random_index, %random_index] {in_bounds=[false]} : vector<2xf32>, memref<?x?xf32>
}
}
return
@@ -184,7 +184,7 @@ module attributes {transform.with_named_sequence} {
// CHECK: %[[C0:.*]] = arith.constant 0 : i32
// CHECK: affine.for %[[I:.*]] = 0 to 64 {
// CHECK: affine.for %[[J:.*]] = 0 to 64 step 16 {
-// CHECK: %[[R0:.*]] = vector.transfer_read %[[MEMREF2]][%[[I]], %[[J]]], %[[C0]] : memref<64x64xi32>, vector<16xi32>
+// CHECK: %[[R0:.*]] = vector.transfer_read %[[MEMREF2]][%[[I]], %[[J]]], %[[C0]] {{.*}} : memref<64x64xi32>, vector<16xi32>
// CHECK: %[[R:.*]] = affine.for %[[K:.*]] = 0 to 64 iter_args(%[[ACC:.*]] = %[[R0]]) -> (vector<16xi32>) {
// CHECK: %[[AV:.*]] = vector.transfer_read %[[MEMREF0]][%[[I]], %[[K]]], %[[C0]] {{.*}}: memref<64x64xi32>, vector<16xi32>
// CHECK: %[[BV:.*]] = vector.transfer_read %[[MEMREF1]][%[[K]], %[[J]]], %[[C0]] {{.*}}: memref<64x64xi32>, vector<16xi32>
@@ -192,7 +192,7 @@ module attributes {transform.with_named_sequence} {
// CHECK: %[[T1:.*]] = arith.addi %[[ACC]], %[[T0]] : vector<16xi32>
// CHECK: affine.yield %[[T1]] : vector<16xi32>
// CHECK: }
-// CHECK: vector.transfer_write %[[R]], %[[MEMREF2]][%[[I]], %[[J]]] : vector<16xi32>, memref<64x64xi32>
+// CHECK: vector.transfer_write %[[R]], %[[MEMREF2]][%[[I]], %[[J]]] {{.*}} : vector<16xi32>, memref<64x64xi32>
// CHECK: }
// CHECK: }
func.func @hoist_vector_transfer_pairs_in_affine_loops(%memref0: memref<64x64xi32>, %memref1: memref<64x64xi32>, %memref2: memref<64x64xi32>) {
@@ -200,12 +200,12 @@ func.func @hoist_vector_transfer_pairs_in_affine_loops(%memref0: memref<64x64xi3
affine.for %arg3 = 0 to 64 {
affine.for %arg4 = 0 to 64 step 16 {
affine.for %arg5 = 0 to 64 {
- %0 = vector.transfer_read %memref0[%arg3, %arg5], %c0_i32 {permutation_map = affine_map<(d0, d1) -> (0)>} : memref<64x64xi32>, vector<16xi32>
- %1 = vector.transfer_read %memref1[%arg5, %arg4], %c0_i32 : memref<64x64xi32>, vector<16xi32>
- %2 = vector.transfer_read %memref2[%arg3, %arg4], %c0_i32 : memref<64x64xi32>, vector<16xi32>
+ %0 = vector.transfer_read %memref0[%arg3, %arg5], %c0_i32 {in_bounds=[true], permutation_map = affine_map<(d0, d1) -> (0)>} : memref<64x64xi32>, vector<16xi32>
+ %1 = vector.transfer_read %memref1[%arg5, %arg4], %c0_i32 {in_bounds=[false]} : memref<64x64xi32>, vector<16xi32>
+ %2 = vector.transfer_read %memref2[%arg3, %arg4], %c0_i32 {in_bounds=[false]} : memref<64x64xi32>, vector<16xi32>
%3 = arith.muli %0, %1 : vector<16xi32>
%4 = arith.addi %2, %3 : vector<16xi32>
- vector.transfer_write %4, %memref2[%arg3, %arg4] : vector<16xi32>, memref<64x64xi32>
+ vector.transfer_write %4, %memref2[%arg3, %arg4] {in_bounds=[false]} : vector<16xi32>, memref<64x64xi32>
}
}
}
@@ -458,17 +458,17 @@ func.func @hoist_vector_transfer_pairs_disjoint_dynamic(
scf.for %i = %lb to %ub step %step {
scf.for %j = %lb to %ub step %step {
- %r0 = vector.transfer_read %buffer[%i0, %i0], %cst: memref<?x?xf32>, vector<4xf32>
+ %r0 = vector.transfer_read %buffer[%i0, %i0], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<4xf32>
// Disjoint leading dim
- %r1 = vector.transfer_read %buffer[%i1, %i0], %cst: memref<?x?xf32>, vector<4xf32>
+ %r1 = vector.transfer_read %buffer[%i1, %i0], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<4xf32>
// Non-overlap trailing dim
- %r2 = vector.transfer_read %buffer[%i1, %i2], %cst: memref<?x?xf32>, vector<4xf32>
+ %r2 = vector.transfer_read %buffer[%i1, %i2], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<4xf32>
%u0 = "some_use"(%r0) : (vector<4xf32>) -> vector<4xf32>
%u1 = "some_use"(%r1) : (vector<4xf32>) -> vector<4xf32>
%u2 = "some_use"(%r2) : (vector<4xf32>) -> vector<4xf32>
- vector.transfer_write %u0, %buffer[%i0, %i0] : vector<4xf32>, memref<?x?xf32>
- vector.transfer_write %u1, %buffer[%i1, %i0] : vector<4xf32>, memref<?x?xf32>
- vector.transfer_write %u2, %buffer[%i1, %i2] : vector<4xf32>, memref<?x?xf32>
+ vector.transfer_write %u0, %buffer[%i0, %i0] {in_bounds=[false]} : vector<4xf32>, memref<?x?xf32>
+ vector.transfer_write %u1, %buffer[%i1, %i0] {in_bounds=[false]} : vector<4xf32>, memref<?x?xf32>
+ vector.transfer_write %u2, %buffer[%i1, %i2] {in_bounds=[false]} : vector<4xf32>, memref<?x?xf32>
}
}
return
@@ -500,13 +500,13 @@ func.func @hoist_vector_transfer_pairs_overlapping_dynamic(
scf.for %i = %lb to %ub step %step {
scf.for %j = %lb to %ub step %step {
- %r0 = vector.transfer_read %buffer[%i0, %i0], %cst: memref<?x?xf32>, vector<4xf32>
+ %r0 = vector.transfer_read %buffer[%i0, %i0], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<4xf32>
// Overlapping range with the above
- %r1 = vector.transfer_read %buffer[%i0, %i1], %cst: memref<?x?xf32>, vector<4xf32>
+ %r1 = vector.transfer_read %buffer[%i0, %i1], %cst {in_bounds=[false]}: memref<?x?xf32>, vector<4xf32>
%u0 = "some_use"(%r0) : (vector<4xf32>) -> vector<4xf32>
%u1 = "some_use"(%r1) : (vector<4xf32>) -> vector<4xf32>
- vector.transfer_write %u0, %buffer[%i0, %i0] : vector<4xf32>, memref<?x?xf32>
- vector.transfer_write %u1, %buffer[%i0, %i1] : vector<4xf32>, memref<?x?xf32>
+ vector.transfer_write %u0, %buffer[%i0, %i0] {in_bounds=[false]} : vector<4xf32>, memref<?x?xf32>
+ vector.transfer_write %u1, %buffer[%i0, %i1] {in_bounds=[false]} : vector<4xf32>, memref<?x?xf32>
}
}
return
@@ -542,15 +542,15 @@ func.func @hoist_vector_transfer_pairs_disjoint_dynamic(
scf.for %i = %lb to %ub step %step {
scf.for %j = %lb to %ub step %step {
- %r0 = vector.transfer_read %buffer[%i0, %i2], %cst: memref<?x?xf32>, vector<16x8xf32>
- %r1 = vector.transfer_read %buffer[%i0, %i3], %cst: memref<?x?xf32>, vector<16x8xf32>
- %r2 = vector.transfer_read %buffer[%i0, %i4], %cst: memref<?x?xf32>, vector<16x8xf32>
+ %r0 = vector.transfer_read %buffer[%i0, %i2], %cst {in_bounds=[false, false]}: memref<?x?xf32>, vector<16x8xf32>
+ %r1 = vector.transfer_read %buffer[%i0, %i3], %cst {in_bounds=[false, false]}: memref<?x?xf32>, vector<16x8xf32>
+ %r2 = vector.transfer_read %buffer[%i0, %i4], %cst {in_bounds=[false, false]}: memref<?x?xf32>, vector<16x8xf32>
%u0 = "some_use"(%r0) : (vector<16x8xf32>) -> vector<16x8xf32>
%u1 = "some_use"(%r1) : (vector<16x8xf32>) -> vector<16x8xf32>
%u2 = "some_use"(%r2) : (vector<16x8xf32>) -> vector<16x8xf32>
- vector.transfer_write %u2, %buffer[%i0, %i4] : vector<16x8xf32>, memref<?x?xf32>
- vector.transfer_write %u1, %buffer[%i0, %i3] : vector<16x8xf32>, memref<?x?xf32>
- vector.transfer_write %u0, %buffer[%i0, %i2] : vector<16x8xf32>, memref<?x?xf32>
+ vector.transfer_write %u2, %buffer[%i0, %i4] {in_bounds=[false, false]} : vector<16x8xf32>, memref<?x?xf32>
+ vector.transfer_write %u1, %buffer[%i0, %i3] {in_bounds=[false, false]} : vector<16x8xf32>, memref<?x?xf32>
+ vector.transfer_write %u0, %buffer[%i0, %i2] {in_bounds=[false, false]} : vector<16x8xf32>, memref<?x?xf32>
}
}
return
diff --git a/mlir/test/Dialect/Linalg/one-shot-bufferize.mlir b/mlir/test/Dialect/Linalg/one-shot-bufferize.mlir
index 9616a3e32a064..33915518a136f 100644
--- a/mlir/test/Dialect/Linalg/one-shot-bufferize.mlir
+++ b/mlir/test/Dialect/Linalg/one-shot-bufferize.mlir
@@ -105,7 +105,7 @@ func.func @vec_inplace(
%c0 = arith.constant 0 : index
// CHECK-NOT: alloc
- %r = vector.transfer_write %vec, %A[%c0] : vector<4xf32>, tensor<?xf32>
+ %r = vector.transfer_write %vec, %A[%c0] {in_bounds = [false]} : vector<4xf32>, tensor<?xf32>
// CHECK: return
// CHECK-NOT: tensor
@@ -127,12 +127,12 @@ func.func @vec_not_inplace(
// CHECK: %[[ALLOC:.*]] = memref.alloc
// CHECK: memref.copy {{.*}}, %[[ALLOC]]
// CHECK-NEXT: vector.transfer_write {{.*}}, %[[ALLOC]]
- %r0 = vector.transfer_write %vec, %A[%c0] : vector<4xf32>, tensor<?xf32>
+ %r0 = vector.transfer_write %vec, %A[%c0] {in_bounds = [false]} : vector<4xf32>, tensor<?xf32>
/// The second vector.transfer has no interfering reads and can reuse the buffer.
// CHECK-NOT: alloc
// CHECK-NEXT: vector.transfer_write {{.*}}, %[[A]]
- %r1 = vector.transfer_write %vec, %A[%c1] : vector<4xf32>, tensor<?xf32>
+ %r1 = vector.transfer_write %vec, %A[%c1] {in_bounds = [false]} : vector<4xf32>, tensor<?xf32>
// CHECK: return
// CHECK-NOT: tensor
diff --git a/mlir/test/Dialect/Linalg/transform-op-bufferize-to-allocation.mlir b/mlir/test/Dialect/Linalg/transform-op-bufferize-to-allocation.mlir
index 35cbd7725ec50..e65aa3b350913 100644
--- a/mlir/test/Dialect/Linalg/transform-op-bufferize-to-allocation.mlir
+++ b/mlir/test/Dialect/Linalg/transform-op-bufferize-to-allocation.mlir
@@ -205,7 +205,7 @@ module attributes {transform.with_named_sequence} {
// CHECK: memref.dealloc %[[alloc]]
// CHECK: return %[[r]]
func.func @vector_mask(%t: tensor<?xf32>, %val: vector<16xf32>, %idx: index, %m0: vector<16xi1>) -> tensor<?xf32> {
- %r = vector.mask %m0 { vector.transfer_write %val, %t[%idx] : vector<16xf32>, tensor<?xf32> } : vector<16xi1> -> tensor<?xf32>
+ %r = vector.mask %m0 { vector.transfer_write %val, %t[%idx] {in_bounds = [false]}: vector<16xf32>, tensor<?xf32> } : vector<16xi1> -> tensor<?xf32>
return %r : tensor<?xf32>
}
diff --git a/mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir b/mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir
index d7ff1ded9d933..a5789f58109ca 100644
--- a/mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir
+++ b/mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir
@@ -381,7 +381,7 @@ module attributes {transform.with_named_sequence} {
func.func @test_vectorize_fill_scalar(%A : memref<f32>, %arg0 : f32) {
// CHECK-SAME: (%[[M:.*]]: memref<f32>, %[[val:.*]]: f32)
// CHECK: %[[VEC:.*]] = vector.broadcast %[[val]] : f32 to vector<f32>
- // CHECK: vector.transfer_write %[[VEC]], %[[M]][] : vector<f32>, memref<f32>
+ // CHECK: vector.transfer_write %[[VEC]], %[[M]][] {{.*}} : vector<f32>, memref<f32>
linalg.fill ins(%arg0 : f32) outs(%A : memref<f32>)
return
}
@@ -422,7 +422,7 @@ func.func @test_vectorize_copy_scalar(%A : memref<f32>, %B : memref<f32>) {
// CHECK: %[[V:.*]] = vector.transfer_read %[[A]][]{{.*}} : memref<f32>, vector<f32>
// CHECK: %[[val:.*]] = vector.extractelement %[[V]][] : vector<f32>
// CHECK: %[[VV:.*]] = vector.broadcast %[[val]] : f32 to vector<f32>
- // CHECK: vector.transfer_write %[[VV]], %[[B]][] : vector<f32>, memref<f32>
+ // CHECK: vector.transfer_write %[[VV]], %[[B]][] {{.*}} : vector<f32>, memref<f32>
memref.copy %A, %B : memref<f32> to memref<f32>
return
}
@@ -950,7 +950,7 @@ module attributes {transform.with_named_sequence} {
// CHECK-NOT: tensor.pad
// CHECK-DAG: %[[C0:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[C5:.*]] = arith.constant 5.0
-// CHECK: %[[RESULT:.*]] = vector.transfer_read %[[ARG0]][%[[C0]], %[[C0]]], %[[C5]] : tensor<5x6xf32>, vector<7x9xf32>
+// CHECK: %[[RESULT:.*]] = vector.transfer_read %[[ARG0]][%[[C0]], %[[C0]]], %[[C5]] {in_bounds = [false, false]} : tensor<5x6xf32>, vector<7x9xf32>
// CHECK: return %[[RESULT]]
func.func @pad_and_transfer_read(%arg0: tensor<5x6xf32>) -> vector<7x9xf32> {
%c0 = arith.constant 0 : index
@@ -960,7 +960,7 @@ func.func @pad_and_transfer_read(%arg0: tensor<5x6xf32>) -> vector<7x9xf32> {
^bb0(%arg1: index, %arg2: index):
tensor.yield %c5 : f32
} : tensor<5x6xf32> to tensor<10x13xf32>
- %1 = vector.transfer_read %0[%c0, %c0], %c6
+ %1 = vector.transfer_read %0[%c0, %c0], %c6 {in_bounds = [true, true]}
: tensor<10x13xf32>, vector<7x9xf32>
return %1 : vector<7x9xf32>
}
@@ -984,7 +984,7 @@ func.func private @make_vector() -> vector<7x9xf32>
// CHECK-NOT: tensor.pad
// CHECK: %[[C0:.*]] = arith.constant 0 : index
// CHECK: %[[VEC0:.*]] = call @make_vector() : () -> vector<7x9xf32>
-// CHECK: %[[RESULT:.*]] = vector.transfer_write %[[VEC0]], %[[ARG0]][%[[C0]], %[[C0]]] : vector<7x9xf32>, tensor<5x6xf32>
+// CHECK: %[[RESULT:.*]] = vector.transfer_write %[[VEC0]], %[[ARG0]][%[[C0]], %[[C0]]] {in_bounds = [false, false]} : vector<7x9xf32>, tensor<5x6xf32>
// CHECK: return %[[RESULT]]
func.func @pad_and_transfer_write_static(
%arg0: tensor<5x6xf32>) -> tensor<5x6xf32> {
@@ -995,7 +995,7 @@ func.func @pad_and_transfer_write_static(
tensor.yield %c5 : f32
} : tensor<5x6xf32> to tensor<10x13xf32>
%1 = call @make_vector() : () -> vector<7x9xf32>
- %2 = vector.transfer_write %1, %0[%c0, %c0]
+ %2 = vector.transfer_write %1, %0[%c0, %c0] {in_bounds = [false, false]}
: vector<7x9xf32>, tensor<10x13xf32>
%3 = tensor.extract_slice %2[0, 0] [5, 6] [1, 1] : tensor<10x13xf32> to tensor<5x6xf32>
return %3 : tensor<5x6xf32>
@@ -1021,7 +1021,7 @@ func.func private @make_vector() -> vector<7x9xf32>
// CHECK: %[[C0:.*]] = arith.constant 0 : index
// CHECK: %[[SUB:.*]] = tensor.extract_slice %[[ARG0]][0, 0] [%[[SIZE]], 6] [1, 1] : tensor<?x?xf32> to tensor<?x6xf32>
// CHECK: %[[VEC0:.*]] = call @make_vector() : () -> vector<7x9xf32>
-// CHECK: %[[RESULT:.*]] = vector.transfer_write %[[VEC0]], %[[SUB]][%[[C0]], %[[C0]]] : vector<7x9xf32>, tensor<?x6xf32>
+// CHECK: %[[RESULT:.*]] = vector.transfer_write %[[VEC0]], %[[SUB]][%[[C0]], %[[C0]]] {in_bounds = [false, false]} : vector<7x9xf32>, tensor<?x6xf32>
// CHECK: return %[[RESULT]]
func.func @pad_and_transfer_write_dynamic_static(
%arg0: tensor<?x?xf32>, %size: index, %padding: index) -> tensor<?x6xf32> {
@@ -1034,7 +1034,7 @@ func.func @pad_and_transfer_write_dynamic_static(
tensor.yield %c5 : f32
} : tensor<?x6xf32> to tensor<?x13xf32>
%1 = call @make_vector() : () -> vector<7x9xf32>
- %2 = vector.transfer_write %1, %0[%c0, %c0]
+ %2 = vector.transfer_write %1, %0[%c0, %c0] {in_bounds = [false, false]}
: vector<7x9xf32>, tensor<?x13xf32>
%3 = tensor.extract_slice %2[0, 0] [%size, 6] [1, 1] : tensor<?x13xf32> to tensor<?x6xf32>
return %3 : tensor<?x6xf32>
@@ -1060,7 +1060,7 @@ func.func private @make_vector() -> tensor<12x13xf32>
// CHECK-DAG: %[[C0:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[C5:.*]] = arith.constant 5.0
// CHECK: %[[VEC0:.*]] = call @make_vector() : () -> tensor<12x13xf32>
-// CHECK: %[[READ:.*]] = vector.transfer_read %[[ARG0]][%[[C0]], %[[C0]]], %[[C5]] : tensor<5x6xf32>, vector<7x9xf32>
+// CHECK: %[[READ:.*]] = vector.transfer_read %[[ARG0]][%[[C0]], %[[C0]]], %[[C5]] {{.*}} : tensor<5x6xf32>, vector<7x9xf32>
// CHECK: %[[WRITE:.*]] = vector.transfer_write %[[READ]], %[[VEC0]][%[[C0]], %[[C0]]] {in_bounds = [true, true]} : vector<7x9xf32>, tensor<12x13xf32>
// CHECK: return %[[WRITE]]
func.func @pad_and_insert_slice_source(
diff --git a/mlir/test/Dialect/Linalg/vectorization.mlir b/mlir/test/Dialect/Linalg/vectorization.mlir
index bbeccc7fecd68..eeed9f27412be 100644
--- a/mlir/test/Dialect/Linalg/vectorization.mlir
+++ b/mlir/test/Dialect/Linalg/vectorization.mlir
@@ -130,7 +130,7 @@ func.func @vectorize_dynamic_1d_broadcast(%arg0: tensor<?xf32>,
// CHECK-LABEL: @vectorize_dynamic_1d_broadcast
// CHECK: %[[VAL_3:.*]] = arith.constant 0 : index
// CHECK: %[[VAL_4:.*]] = tensor.dim %{{.*}}, %[[VAL_3]] : tensor<?xf32>
-// CHECK: %[[VAL_7:.*]] = vector.transfer_read %{{.*}} {permutation_map = #{{.*}}} : tensor<?xf32>, vector<4xf32>
+// CHECK: %[[VAL_7:.*]] = vector.transfer_read %{{.*}} {{.*}} : tensor<?xf32>, vector<4xf32>
// CHECK: %[[VAL_9:.*]] = vector.create_mask %[[VAL_4]] : vector<4xi1>
// CHECK: %[[VAL_10:.*]] = vector.mask %[[VAL_9]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
// CHECK: %[[VAL_12:.*]] = vector.mask %[[VAL_9]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
diff --git a/mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir b/mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir
index 85e1c56dd45a0..c1afe0458779f 100644
--- a/mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir
+++ b/mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir
@@ -62,7 +62,7 @@ func.func @vectorize_nd_tensor_extract_constant_idx(%arg0: tensor<3x3xf32>, %arg
// CHECK-DAG: %[[C0_f32:.*]] = arith.constant 0.000000e+00 : f32
// CHECK: %[[READ:.*]] = vector.transfer_read %[[ARG_0]][%[[C1]], %[[C2]]], %[[C0_f32]] {in_bounds = [true, true, true], permutation_map = #[[$MAP]]} : tensor<3x3xf32>, vector<1x1x3xf32>
// CHECK: %[[C0_4:.*]] = arith.constant 0 : index
-// CHECK: vector.transfer_write %[[READ]], %[[ARG_1]][%[[C0_4]], %[[C0_4]], %[[C0_4]]] : vector<1x1x3xf32>, tensor<1x1x3xf32>
+// CHECK: vector.transfer_write %[[READ]], %[[ARG_1]][%[[C0_4]], %[[C0_4]], %[[C0_4]]] {{.*}} : vector<1x1x3xf32>, tensor<1x1x3xf32>
module attributes {transform.with_named_sequence} {
transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
diff --git a/mlir/test/Dialect/MemRef/extract-address-computations.mlir b/mlir/test/Dialect/MemRef/extract-address-computations.mlir
index eec3d5c62983b..69a70e2d481a1 100644
--- a/mlir/test/Dialect/MemRef/extract-address-computations.mlir
+++ b/mlir/test/Dialect/MemRef/extract-address-computations.mlir
@@ -281,13 +281,13 @@ module attributes {transform.with_named_sequence} {
// CHECK-DAG: %[[C0:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[CF0:.*]] = arith.constant 0.0{{0*e\+00}} : f16
// CHECK-DAG: %[[SUBVIEW:.*]] = memref.subview %[[BASE]][%[[DYN_OFFSET0]], %[[DYN_OFFSET1]], %[[DYN_OFFSET2]]] [%[[DYN_SIZE0]], %[[DYN_SIZE1]], %[[DYN_SIZE2]]] [1, 1, 1] : memref<?x?x?xf16> to memref<?x?x?xf16, strided<[?, ?, 1], offset: ?>>
-// CHECK: %[[LOADED_VAL:.*]] = vector.transfer_read %[[SUBVIEW]][%[[C0]], %[[C0]], %[[C0]]], %[[CF0]] {permutation_map = #[[$PERMUTATION_MAP]]} : memref<?x?x?xf16, strided<[?, ?, 1], offset: ?>>, vector<4x2xf16>
+// CHECK: %[[LOADED_VAL:.*]] = vector.transfer_read %[[SUBVIEW]][%[[C0]], %[[C0]], %[[C0]]], %[[CF0]] {in_bounds = [false, false], permutation_map = #[[$PERMUTATION_MAP]]} : memref<?x?x?xf16, strided<[?, ?, 1], offset: ?>>, vector<4x2xf16>
// CHECK: return %[[LOADED_VAL]] : vector<4x2xf16>
func.func @test_transfer_read_op(%base : memref<?x?x?xf16>,
%offset0 : index, %offset1: index, %offset2: index)
-> vector<4x2xf16> {
%cf0 = arith.constant 0.0 : f16
- %loaded_val = vector.transfer_read %base[%offset0, %offset1, %offset2], %cf0 { permutation_map = affine_map<(d0,d1,d2) -> (d2,d0)> } : memref<?x?x?xf16>, vector<4x2xf16>
+ %loaded_val = vector.transfer_read %base[%offset0, %offset1, %offset2], %cf0 { in_bounds = [false, false], permutation_map = affine_map<(d0,d1,d2) -> (d2,d0)> } : memref<?x?x?xf16>, vector<4x2xf16>
return %loaded_val : vector<4x2xf16>
}
@@ -313,13 +313,13 @@ module attributes {transform.with_named_sequence} {
// CHECK-SAME: %[[DYN_OFFSET1:[^:]*]]: index,
// CHECK-SAME: %[[DYN_OFFSET2:[^:]*]]: index)
// CHECK: %[[CF0:.*]] = arith.constant 0.0{{0*e\+00}} : f16
-// CHECK: %[[LOADED_VAL:.*]] = vector.transfer_read %[[BASE]][%[[DYN_OFFSET0]], %[[DYN_OFFSET1]], %[[DYN_OFFSET2]]], %[[CF0]] {permutation_map = #[[$PERMUTATION_MAP]]} : tensor<?x?x?xf16>, vector<4x2xf16>
+// CHECK: %[[LOADED_VAL:.*]] = vector.transfer_read %[[BASE]][%[[DYN_OFFSET0]], %[[DYN_OFFSET1]], %[[DYN_OFFSET2]]], %[[CF0]] {in_bounds = [false, false], permutation_map = #[[$PERMUTATION_MAP]]} : tensor<?x?x?xf16>, vector<4x2xf16>
// CHECK: return %[[LOADED_VAL]] : vector<4x2xf16>
func.func @test_transfer_read_op_with_tensor(%base : tensor<?x?x?xf16>,
%offset0 : index, %offset1: index, %offset2: index)
-> vector<4x2xf16> {
%cf0 = arith.constant 0.0 : f16
- %loaded_val = vector.transfer_read %base[%offset0, %offset1, %offset2], %cf0 { permutation_map = affine_map<(d0,d1,d2) -> (d2,d0)> } : tensor<?x?x?xf16>, vector<4x2xf16>
+ %loaded_val = vector.transfer_read %base[%offset0, %offset1, %offset2], %cf0 { in_bounds = [false, false], permutation_map = affine_map<(d0,d1,d2) -> (d2,d0)> } : tensor<?x?x?xf16>, vector<4x2xf16>
return %loaded_val : vector<4x2xf16>
}
@@ -352,12 +352,12 @@ module attributes {transform.with_named_sequence} {
// CHECK-DAG: %[[C0:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[VCF0:.*]] = arith.constant dense<0.0{{0*e\+00}}> : vector<4x2xf16>
// CHECK-DAG: %[[SUBVIEW:.*]] = memref.subview %[[BASE]][%[[DYN_OFFSET0]], %[[DYN_OFFSET1]], %[[DYN_OFFSET2]]] [%[[DYN_SIZE0]], %[[DYN_SIZE1]], %[[DYN_SIZE2]]] [1, 1, 1] : memref<?x?x?xf16> to memref<?x?x?xf16, strided<[?, ?, 1], offset: ?>>
-// CHECK: vector.transfer_write %[[VCF0]], %[[SUBVIEW]][%[[C0]], %[[C0]], %[[C0]]] {permutation_map = #[[$PERMUTATION_MAP]]} : vector<4x2xf16>, memref<?x?x?xf16, strided<[?, ?, 1], offset: ?>>
+// CHECK: vector.transfer_write %[[VCF0]], %[[SUBVIEW]][%[[C0]], %[[C0]], %[[C0]]] {in_bounds = [false, false], permutation_map = #[[$PERMUTATION_MAP]]} : vector<4x2xf16>, memref<?x?x?xf16, strided<[?, ?, 1], offset: ?>>
// CHECK: return
func.func @test_transfer_write_op(%base : memref<?x?x?xf16>,
%offset0 : index, %offset1: index, %offset2: index) {
%vcf0 = arith.constant dense<0.000000e+00> : vector<4x2xf16>
- vector.transfer_write %vcf0, %base[%offset0, %offset1, %offset2] { permutation_map = affine_map<(d0,d1,d2) -> (d2,d0)> } : vector<4x2xf16>, memref<?x?x?xf16>
+ vector.transfer_write %vcf0, %base[%offset0, %offset1, %offset2] { in_bounds = [false, false], permutation_map = affine_map<(d0,d1,d2) -> (d2,d0)> } : vector<4x2xf16>, memref<?x?x?xf16>
return
}
@@ -391,12 +391,12 @@ module attributes {transform.with_named_sequence} {
// CHECK-DAG: %[[C0:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[VCF0:.*]] = arith.constant dense<0.0{{0*e\+00}}> : vector<4x2xf16>
// CHECK-DAG: %[[SUBVIEW:.*]] = memref.subview %[[BASE]][%[[DYN_OFFSET0]], %[[DYN_OFFSET1]], %[[DYN_OFFSET2]]] [%[[DYN_SIZE0]], %[[DYN_SIZE1]], %[[DYN_SIZE2]]] [1, 1, 1] : memref<?x?x?xf16, strided<[329, 26, 12], offset: ?>> to memref<?x?x?xf16, strided<[329, 26, 12], offset: ?>>
-// CHECK: vector.transfer_write %[[VCF0]], %[[SUBVIEW]][%[[C0]], %[[C0]], %[[C0]]] {permutation_map = #[[$PERMUTATION_MAP]]} : vector<4x2xf16>, memref<?x?x?xf16, strided<[329, 26, 12], offset: ?>>
+// CHECK: vector.transfer_write %[[VCF0]], %[[SUBVIEW]][%[[C0]], %[[C0]], %[[C0]]] {in_bounds = [false, false], permutation_map = #[[$PERMUTATION_MAP]]} : vector<4x2xf16>, memref<?x?x?xf16, strided<[329, 26, 12], offset: ?>>
// CHECK: return
func.func @test_transfer_write_op_with_strides(%base : memref<?x?x?xf16, strided<[329, 26, 12], offset: ?>>,
%offset0 : index, %offset1: index, %offset2: index) {
%vcf0 = arith.constant dense<0.000000e+00> : vector<4x2xf16>
- vector.transfer_write %vcf0, %base[%offset0, %offset1, %offset2] { permutation_map = affine_map<(d0,d1,d2) -> (d2,d0)> } : vector<4x2xf16>, memref<?x?x?xf16, strided<[329, 26, 12], offset: ?>>
+ vector.transfer_write %vcf0, %base[%offset0, %offset1, %offset2] { in_bounds = [false, false], permutation_map = affine_map<(d0,d1,d2) -> (d2,d0)> } : vector<4x2xf16>, memref<?x?x?xf16, strided<[329, 26, 12], offset: ?>>
return
}
@@ -422,12 +422,12 @@ module attributes {transform.with_named_sequence} {
// CHECK-SAME: %[[DYN_OFFSET1:[^:]*]]: index,
// CHECK-SAME: %[[DYN_OFFSET2:[^:]*]]: index)
// CHECK-DAG: %[[VCF0:.*]] = arith.constant dense<0.0{{0*e\+00}}> : vector<4x2xf16>
-// CHECK: %[[RES:.*]] = vector.transfer_write %[[VCF0]], %[[BASE]][%[[DYN_OFFSET0]], %[[DYN_OFFSET1]], %[[DYN_OFFSET2]]] {permutation_map = #[[$PERMUTATION_MAP]]} : vector<4x2xf16>, tensor<?x?x?xf16>
+// CHECK: %[[RES:.*]] = vector.transfer_write %[[VCF0]], %[[BASE]][%[[DYN_OFFSET0]], %[[DYN_OFFSET1]], %[[DYN_OFFSET2]]] {in_bounds = [false, false], permutation_map = #[[$PERMUTATION_MAP]]} : vector<4x2xf16>, tensor<?x?x?xf16>
// CHECK: return %[[RES]] : tensor<?x?x?xf16>
func.func @test_transfer_write_op_with_tensor(%base : tensor<?x?x?xf16>,
%offset0 : index, %offset1: index, %offset2: index) -> tensor<?x?x?xf16> {
%vcf0 = arith.constant dense<0.000000e+00> : vector<4x2xf16>
- %res = vector.transfer_write %vcf0, %base[%offset0, %offset1, %offset2] { permutation_map = affine_map<(d0,d1,d2) -> (d2,d0)> } : vector<4x2xf16>, tensor<?x?x?xf16>
+ %res = vector.transfer_write %vcf0, %base[%offset0, %offset1, %offset2] { in_bounds = [false, false], permutation_map = affine_map<(d0,d1,d2) -> (d2,d0)> } : vector<4x2xf16>, tensor<?x?x?xf16>
return %res : tensor<?x?x?xf16>
}
diff --git a/mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir b/mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
index 327cacf7d9a20..6a91b9e22c44c 100644
--- a/mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
+++ b/mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
@@ -86,7 +86,7 @@ func.func @fold_subview_with_transfer_read_0d(
-> vector<f32> {
%f1 = arith.constant 1.0 : f32
%0 = memref.subview %arg0[%arg1, %arg2][1, 1][1, 1] : memref<12x32xf32> to memref<f32, strided<[], offset: ?>>
- %1 = vector.transfer_read %0[], %f1 : memref<f32, strided<[], offset: ?>>, vector<f32>
+ %1 = vector.transfer_read %0[], %f1 {in_bounds = []}: memref<f32, strided<[], offset: ?>>, vector<f32>
return %1 : vector<f32>
}
// CHECK: func @fold_subview_with_transfer_read_0d
diff --git a/mlir/test/Dialect/NVGPU/transform-pipeline-shared.mlir b/mlir/test/Dialect/NVGPU/transform-pipeline-shared.mlir
index e959949babd9e..5c89b748b807d 100644
--- a/mlir/test/Dialect/NVGPU/transform-pipeline-shared.mlir
+++ b/mlir/test/Dialect/NVGPU/transform-pipeline-shared.mlir
@@ -9,8 +9,8 @@ func.func @simple_depth_2_unpeeled(%global: memref<?xf32>, %result: memref<?xf32
// Predication is not currently implemented for transfer_read/write, so this is expected to fail.
// expected-note @below {{couldn't predicate}}
scf.for %i = %c0 to %c100 step %c4 iter_args(%accum = %c0f) -> f32 {
- %mem = vector.transfer_read %global[%i], %c0f : memref<?xf32>, vector<4xf32>
- vector.transfer_write %mem, %shared[%i] : vector<4xf32>, memref<?xf32, #gpu.address_space<workgroup>>
+ %mem = vector.transfer_read %global[%i], %c0f {in_bounds=[false]} : memref<?xf32>, vector<4xf32>
+ vector.transfer_write %mem, %shared[%i] {in_bounds=[false]} : vector<4xf32>, memref<?xf32, #gpu.address_space<workgroup>>
%0 = arith.addf %accum, %accum : f32
scf.yield %0 : f32
}
@@ -53,8 +53,8 @@ func.func @simple_depth_2_peeled(%global: memref<?xf32>) {
// CHECK: %[[LOCAL_LOADED:.+]] = vector.transfer_read %[[ARG]]
// CHECK: scf.yield %[[IA2]], %[[LOCAL_LOADED]]
scf.for %i = %c0 to %c100 step %c4 {
- %mem = vector.transfer_read %global[%i], %c0f : memref<?xf32>, vector<4xf32>
- vector.transfer_write %mem, %shared[%i] : vector<4xf32>, memref<?xf32, #gpu.address_space<workgroup>>
+ %mem = vector.transfer_read %global[%i], %c0f {in_bounds=[false]} : memref<?xf32>, vector<4xf32>
+ vector.transfer_write %mem, %shared[%i] {in_bounds=[false]} : vector<4xf32>, memref<?xf32, #gpu.address_space<workgroup>>
func.call @body(%i, %shared) : (index, memref<?xf32, #gpu.address_space<workgroup>>) -> ()
}
// CHECK: vector.transfer_write %[[LOOP]]#0
diff --git a/mlir/test/Dialect/SCF/one-shot-bufferize-analysis.mlir b/mlir/test/Dialect/SCF/one-shot-bufferize-analysis.mlir
index 9bb87ffbb2090..56e2d3064eefc 100644
--- a/mlir/test/Dialect/SCF/one-shot-bufferize-analysis.mlir
+++ b/mlir/test/Dialect/SCF/one-shot-bufferize-analysis.mlir
@@ -136,7 +136,7 @@ func.func @reading_scf_for(%t1: tensor<?xf32> {bufferization.writable = true},
// Write to %t1.
// CHECK: vector.transfer_write
// CHECK-SAME: __inplace_operands_attr__ = ["none", "false", "none"]
- %t3 = vector.transfer_write %v, %t1[%s] : vector<5xf32>, tensor<?xf32>
+ %t3 = vector.transfer_write %v, %t1[%s] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
// Read the old value of %t1 inside the loop via an alias.
// CHECK: scf.for {{.*}} {
@@ -146,7 +146,7 @@ func.func @reading_scf_for(%t1: tensor<?xf32> {bufferization.writable = true},
%e = tensor.extract_slice %t2[%s][%s][1] : tensor<?xf32> to tensor<?xf32>
// Read from %t1 via alias %e.
- %v2 = vector.transfer_read %e[%s], %cst : tensor<?xf32>, vector<5xf32>
+ %v2 = vector.transfer_read %e[%s], %cst {in_bounds=[false]} : tensor<?xf32>, vector<5xf32>
scf.yield %t2, %v2 : tensor<?xf32>, vector<5xf32>
}
// CHECK: } {__inplace_operands_attr__ = ["none", "none", "none", "true", "none"]}
@@ -184,7 +184,7 @@ func.func @non_reading_scf_for(%t1: tensor<?xf32> {bufferization.writable = true
// Write to %t1.
// CHECK: vector.transfer_write
// CHECK-SAME: __inplace_operands_attr__ = ["none", "true", "none"]
- %t3 = vector.transfer_write %v, %t1[%s] : vector<5xf32>, tensor<?xf32>
+ %t3 = vector.transfer_write %v, %t1[%s] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
// This loop does not read from %t1. It only writes to it.
// CHECK: scf.for
@@ -198,7 +198,7 @@ func.func @non_reading_scf_for(%t1: tensor<?xf32> {bufferization.writable = true
} -> (tensor<?xf32>)
// Read overwritten value. This is not a read of %t1.
- %v2 = vector.transfer_read %o2[%s], %cst : tensor<?xf32>, vector<5xf32>
+ %v2 = vector.transfer_read %o2[%s], %cst {in_bounds=[false]} : tensor<?xf32>, vector<5xf32>
scf.yield %o2, %v2 : tensor<?xf32>, vector<5xf32>
}
@@ -251,7 +251,7 @@ func.func @scf_if_inplace2(%t1: tensor<?xf32> {bufferization.writable = true},
} else {
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "true", "none"]
- %t2 = vector.transfer_write %v, %t1[%idx] : vector<5xf32>, tensor<?xf32>
+ %t2 = vector.transfer_write %v, %t1[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
scf.yield %t2 : tensor<?xf32>
}
// CHECK: return
@@ -271,7 +271,7 @@ func.func @scf_if_inplace3(%t1: tensor<?xf32> {bufferization.writable = true},
%r = scf.if %cond -> (tensor<?xf32>) {
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "true", "none"]
- %t2 = vector.transfer_write %v1, %e[%idx] : vector<5xf32>, tensor<?xf32>
+ %t2 = vector.transfer_write %v1, %e[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
// CHECK: scf.yield
// CHECK-SAME: {__inplace_operands_attr__ = ["true"]}
scf.yield %t2 : tensor<?xf32>
@@ -279,7 +279,7 @@ func.func @scf_if_inplace3(%t1: tensor<?xf32> {bufferization.writable = true},
// Writing the same tensor through an alias. This is OK.
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "true", "none"]
- %t3 = vector.transfer_write %v2, %t1[%idx] : vector<5xf32>, tensor<?xf32>
+ %t3 = vector.transfer_write %v2, %t1[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
// CHECK: scf.yield
// CHECK-SAME: {__inplace_operands_attr__ = ["true"]}
scf.yield %t3 : tensor<?xf32>
@@ -301,7 +301,7 @@ func.func @scf_if_in_place4(%t1: tensor<?xf32> {bufferization.writable = true},
} else {
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "true", "none"]
- %t2 = vector.transfer_write %v, %t1[%idx] : vector<5xf32>, tensor<?xf32>
+ %t2 = vector.transfer_write %v, %t1[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
// CHECK: scf.yield
// CHECK-SAME: {__inplace_operands_attr__ = ["true"]}
scf.yield %t2 : tensor<?xf32>
@@ -316,7 +316,7 @@ func.func @scf_if_in_place4(%t1: tensor<?xf32> {bufferization.writable = true},
// CHECK-SAME: {__inplace_operands_attr__ = ["true"]}
scf.yield %r : tensor<?xf32>
}
- %v2 = vector.transfer_read %r_alias[%idx], %cst : tensor<?xf32>, vector<10xf32>
+ %v2 = vector.transfer_read %r_alias[%idx], %cst {in_bounds=[false]} : tensor<?xf32>, vector<10xf32>
// CHECK: return
// CHECK-SAME: __equivalent_func_args__ = [0, -1]
@@ -367,14 +367,14 @@ func.func @scf_if_inplace6(%t1: tensor<?xf32> {bufferization.writable = true},
%t2 = scf.if %cond2 -> (tensor<?xf32>) {
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "true", "none"]
- %t3 = vector.transfer_write %v1, %t1[%idx] : vector<5xf32>, tensor<?xf32>
+ %t3 = vector.transfer_write %v1, %t1[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
// CHECK: scf.yield
// CHECK-SAME: {__inplace_operands_attr__ = ["true"]}
scf.yield %t3 : tensor<?xf32>
} else {
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "true", "none"]
- %t4 = vector.transfer_write %v3, %t1[%idx] : vector<5xf32>, tensor<?xf32>
+ %t4 = vector.transfer_write %v3, %t1[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
// CHECK: scf.yield
// CHECK-SAME: {__inplace_operands_attr__ = ["true"]}
scf.yield %t4 : tensor<?xf32>
@@ -385,7 +385,7 @@ func.func @scf_if_inplace6(%t1: tensor<?xf32> {bufferization.writable = true},
} else {
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "true", "none"]
- %t3 = vector.transfer_write %v2, %t1[%idx] : vector<5xf32>, tensor<?xf32>
+ %t3 = vector.transfer_write %v2, %t1[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
// CHECK: scf.yield
// CHECK-SAME: {__inplace_operands_attr__ = ["true"]}
scf.yield %t3 : tensor<?xf32>
@@ -406,7 +406,7 @@ func.func @scf_if_inplace7(%t1: tensor<?xf32> {bufferization.writable = true},
%r, %v_r2 = scf.if %cond -> (tensor<?xf32>, vector<5xf32>) {
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "true", "none"]
- %t2 = vector.transfer_write %v1, %t1[%idx] : vector<5xf32>, tensor<?xf32>
+ %t2 = vector.transfer_write %v1, %t1[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
// CHECK: scf.yield
// CHECK-SAME: {__inplace_operands_attr__ = ["true", "none"]}
scf.yield %t2, %v1 : tensor<?xf32>, vector<5xf32>
@@ -414,11 +414,11 @@ func.func @scf_if_inplace7(%t1: tensor<?xf32> {bufferization.writable = true},
// Writing the same tensor through an alias.
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "false", "none"]
- %t3 = vector.transfer_write %v2, %t1[%idx] : vector<5xf32>, tensor<?xf32>
+ %t3 = vector.transfer_write %v2, %t1[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
// Read the original value of %t1. This requires the write in this branch
// to be out-of-place. But the write in the other branch can still be
// inplace.
- %v_r = vector.transfer_read %t1[%idx2], %cst : tensor<?xf32>, vector<5xf32>
+ %v_r = vector.transfer_read %t1[%idx2], %cst {in_bounds=[false]} : tensor<?xf32>, vector<5xf32>
// CHECK: scf.yield
// CHECK-SAME: {__inplace_operands_attr__ = ["true", "none"]}
scf.yield %t3, %v_r : tensor<?xf32>, vector<5xf32>
@@ -532,7 +532,7 @@ func.func @scf_if_out_of_place2(%t1: tensor<?xf32> {bufferization.writable = tru
} else {
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "false", "none"]
- %t2 = vector.transfer_write %v, %t1[%idx] : vector<5xf32>, tensor<?xf32>
+ %t2 = vector.transfer_write %v, %t1[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
// CHECK: scf.yield
// CHECK-SAME: {__inplace_operands_attr__ = ["true"]}
scf.yield %t2 : tensor<?xf32>
@@ -540,7 +540,7 @@ func.func @scf_if_out_of_place2(%t1: tensor<?xf32> {bufferization.writable = tru
// Read the old value of %t1. Forces the transfer_write to bufferize
// out-of-place.
- %v2 = vector.transfer_read %t1[%idx], %cst : tensor<?xf32>, vector<10xf32>
+ %v2 = vector.transfer_read %t1[%idx], %cst {in_bounds=[false]} : tensor<?xf32>, vector<10xf32>
return %r, %v2 : tensor<?xf32>, vector<10xf32>
}
@@ -556,7 +556,7 @@ func.func @scf_if_out_of_place3(%t1: tensor<?xf32> {bufferization.writable = tru
} else {
// CHECK: vector.transfer_write
// CHECK-SAME: {__inplace_operands_attr__ = ["none", "false", "none"]
- %t2 = vector.transfer_write %v, %t1[%idx] : vector<5xf32>, tensor<?xf32>
+ %t2 = vector.transfer_write %v, %t1[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
// CHECK: scf.yield
// CHECK-SAME: {__inplace_operands_attr__ = ["true"]}
scf.yield %t2 : tensor<?xf32>
@@ -571,7 +571,7 @@ func.func @scf_if_out_of_place3(%t1: tensor<?xf32> {bufferization.writable = tru
// CHECK-SAME: {__inplace_operands_attr__ = ["true"]}
scf.yield %t1 : tensor<?xf32>
}
- %v2 = vector.transfer_read %t1_alias[%idx], %cst : tensor<?xf32>, vector<10xf32>
+ %v2 = vector.transfer_read %t1_alias[%idx], %cst {in_bounds=[false]} : tensor<?xf32>, vector<10xf32>
return %r, %v2 : tensor<?xf32>, vector<10xf32>
}
diff --git a/mlir/test/Dialect/SCF/one-shot-bufferize.mlir b/mlir/test/Dialect/SCF/one-shot-bufferize.mlir
index bb9f7dfdba83f..6fd3de405ce2e 100644
--- a/mlir/test/Dialect/SCF/one-shot-bufferize.mlir
+++ b/mlir/test/Dialect/SCF/one-shot-bufferize.mlir
@@ -75,7 +75,7 @@ func.func @nested_scf_for(%A : tensor<?xf32> {bufferization.writable = true},
%c10 = arith.constant 10 : index
%r1 = scf.for %i = %c0 to %c10 step %c1 iter_args(%B = %A) -> tensor<?xf32> {
%r2 = scf.for %j = %c0 to %c10 step %c1 iter_args(%C = %B) -> tensor<?xf32> {
- %w = vector.transfer_write %v, %C[%c0] : vector<5xf32>, tensor<?xf32>
+ %w = vector.transfer_write %v, %C[%c0] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
scf.yield %w : tensor<?xf32>
}
scf.yield %r2 : tensor<?xf32>
@@ -162,12 +162,13 @@ func.func @scf_if_inplace(%cond: i1,
// CHECK: scf.if %[[cond]] {
// CHECK-NEXT: } else {
// CHECK-NEXT: vector.transfer_write %[[v]], %[[t1]]
+ // CHECK-SAME: {in_bounds = [false]}
// CHECK-NEXT: }
// CHECK-NEXT: return
%r = scf.if %cond -> (tensor<?xf32>) {
scf.yield %t1 : tensor<?xf32>
} else {
- %t2 = vector.transfer_write %v, %t1[%idx] : vector<5xf32>, tensor<?xf32>
+ %t2 = vector.transfer_write %v, %t1[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
scf.yield %t2 : tensor<?xf32>
}
return %r : tensor<?xf32>
@@ -198,7 +199,7 @@ func.func @scf_if_inside_scf_for(
%r2 = scf.if %cond -> (tensor<?xf32>) {
scf.yield %bb : tensor<?xf32>
} else {
- %t2 = vector.transfer_write %v, %bb[%idx] : vector<5xf32>, tensor<?xf32>
+ %t2 = vector.transfer_write %v, %bb[%idx] {in_bounds=[false]} : vector<5xf32>, tensor<?xf32>
scf.yield %t2 : tensor<?xf32>
}
scf.yield %r2 : tensor<?xf32>
diff --git a/mlir/test/Dialect/Tensor/fold-tensor-subset-ops.mlir b/mlir/test/Dialect/Tensor/fold-tensor-subset-ops.mlir
index 1a84e14104932..beda0dc46f362 100644
--- a/mlir/test/Dialect/Tensor/fold-tensor-subset-ops.mlir
+++ b/mlir/test/Dialect/Tensor/fold-tensor-subset-ops.mlir
@@ -69,7 +69,7 @@ func.func @fold_extract_slice_with_transfer_read_0d(
-> vector<f32> {
%f1 = arith.constant 1.0 : f32
%0 = tensor.extract_slice %arg0[%arg1, %arg2][1, 1][1, 1] : tensor<12x32xf32> to tensor<f32>
- %1 = vector.transfer_read %0[], %f1 : tensor<f32>, vector<f32>
+ %1 = vector.transfer_read %0[], %f1 {in_bounds = []} : tensor<f32>, vector<f32>
return %1 : vector<f32>
}
// CHECK: func @fold_extract_slice_with_transfer_read_0d
diff --git a/mlir/test/Dialect/Tensor/one-shot-bufferize.mlir b/mlir/test/Dialect/Tensor/one-shot-bufferize.mlir
index e2169fe1404c8..12947087f6a6f 100644
--- a/mlir/test/Dialect/Tensor/one-shot-bufferize.mlir
+++ b/mlir/test/Dialect/Tensor/one-shot-bufferize.mlir
@@ -265,7 +265,7 @@ func.func @insert_slice_regression(%t: tensor<10xf32>, %b: tensor<5xf32>) -> ten
%1 = linalg.fill ins(%cst : f32) outs(%t : tensor<10xf32>) -> tensor<10xf32>
// Read %1 so that it does not DCE away.
- %vec = vector.transfer_read %1[%c0], %cst : tensor<10xf32>, vector<10xf32>
+ %vec = vector.transfer_read %1[%c0], %cst {in_bounds=[false]} : tensor<10xf32>, vector<10xf32>
vector.print %vec : vector<10xf32>
// Write back a different value (not %1).
@@ -286,7 +286,7 @@ func.func @insert_slice_full_overwrite(%t: tensor<10xf32>, %b: tensor<10xf32>) -
%1 = linalg.fill ins(%cst : f32) outs(%t : tensor<10xf32>) -> tensor<10xf32>
// Read %1 so that it does not DCE away.
- %vec = vector.transfer_read %1[%c0], %cst : tensor<10xf32>, vector<10xf32>
+ %vec = vector.transfer_read %1[%c0], %cst {in_bounds=[false]} : tensor<10xf32>, vector<10xf32>
vector.print %vec : vector<10xf32>
// Write back a different value (not %1).
diff --git a/mlir/test/Dialect/Vector/bufferize-invalid.mlir b/mlir/test/Dialect/Vector/bufferize-invalid.mlir
index bcca50a0fe79a..dde79d643b2c2 100644
--- a/mlir/test/Dialect/Vector/bufferize-invalid.mlir
+++ b/mlir/test/Dialect/Vector/bufferize-invalid.mlir
@@ -3,6 +3,6 @@
// CHECK-LABEL: func @mask(
func.func @mask(%t0: tensor<?xf32>, %val: vector<16xf32>, %idx: index, %m0: vector<16xi1>) -> tensor<?xf32> {
// expected-error @+1 {{'vector.mask' op body must bufferize in-place}}
- %0 = vector.mask %m0 { vector.transfer_write %val, %t0[%idx] : vector<16xf32>, tensor<?xf32> } : vector<16xi1> -> tensor<?xf32>
+ %0 = vector.mask %m0 { vector.transfer_write %val, %t0[%idx] {in_bounds = [false]} : vector<16xf32>, tensor<?xf32> } : vector<16xi1> -> tensor<?xf32>
return %0 : tensor<?xf32>
}
diff --git a/mlir/test/Dialect/Vector/canonicalize.mlir b/mlir/test/Dialect/Vector/canonicalize.mlir
index fc5651f5bb02f..dc13fcb71a9e8 100644
--- a/mlir/test/Dialect/Vector/canonicalize.mlir
+++ b/mlir/test/Dialect/Vector/canonicalize.mlir
@@ -443,10 +443,10 @@ func.func @cast_transfers(%A: memref<4x8xf32>) -> (vector<4x8xf32>) {
%0 = memref.cast %A : memref<4x8xf32> to memref<?x?xf32>
// CHECK: vector.transfer_read %{{.*}} {in_bounds = [true, true]} : memref<4x8xf32>, vector<4x8xf32>
- %1 = vector.transfer_read %0[%c0, %c0], %f0 : memref<?x?xf32>, vector<4x8xf32>
+ %1 = vector.transfer_read %0[%c0, %c0], %f0 {in_bounds=[false, false]} : memref<?x?xf32>, vector<4x8xf32>
// CHECK: vector.transfer_write %{{.*}} {in_bounds = [true, true]} : vector<4x8xf32>, memref<4x8xf32>
- vector.transfer_write %1, %0[%c0, %c0] : vector<4x8xf32>, memref<?x?xf32>
+ vector.transfer_write %1, %0[%c0, %c0] {in_bounds=[false, false]} : vector<4x8xf32>, memref<?x?xf32>
return %1 : vector<4x8xf32>
}
@@ -459,7 +459,7 @@ func.func @cast_transfers(%A: tensor<4x8xf32>) -> (vector<4x8xf32>) {
%0 = tensor.cast %A : tensor<4x8xf32> to tensor<?x?xf32>
// CHECK: vector.transfer_read %{{.*}} {in_bounds = [true, true]} : tensor<4x8xf32>, vector<4x8xf32>
- %1 = vector.transfer_read %0[%c0, %c0], %f0 : tensor<?x?xf32>, vector<4x8xf32>
+ %1 = vector.transfer_read %0[%c0, %c0], %f0 {in_bounds=[false, false]} : tensor<?x?xf32>, vector<4x8xf32>
return %1 : vector<4x8xf32>
}
@@ -878,19 +878,19 @@ func.func @fold_vector_transfer_masks(%A: memref<?x?xf32>) -> (vector<4x8xf32>,
%arith_all_true_mask = arith.constant dense<true> : vector<4x[4]xi1>
- // CHECK: vector.transfer_read %{{.*}}, %[[F0]] {permutation_map
+ // CHECK: vector.transfer_read %{{.*}}, %[[F0]] {in_bounds
%1 = vector.transfer_read %A[%c0, %c0], %f0, %mask
- {permutation_map = affine_map<(d0, d1) -> (d1, d0)>} : memref<?x?xf32>, vector<4x8xf32>
+ {in_bounds = [false, false], permutation_map = affine_map<(d0, d1) -> (d1, d0)>} : memref<?x?xf32>, vector<4x8xf32>
- // CHECK: vector.transfer_write {{.*}}[%[[C0]], %[[C0]]] {permutation_map
+ // CHECK: vector.transfer_write {{.*}}[%[[C0]], %[[C0]]] {in_bounds
vector.transfer_write %1, %A[%c0, %c0], %mask
- {permutation_map = affine_map<(d0, d1) -> (d1, d0)>} : vector<4x8xf32>, memref<?x?xf32>
+ {in_bounds = [false, false], permutation_map = affine_map<(d0, d1) -> (d1, d0)>} : vector<4x8xf32>, memref<?x?xf32>
- // CHECK: vector.transfer_read %{{.*}}, %[[F0]] :
- %2 = vector.transfer_read %A[%c0, %c0], %f0, %arith_all_true_mask : memref<?x?xf32>, vector<4x[4]xf32>
+ // CHECK: vector.transfer_read %{{.*}}, %[[F0]] {in_bounds
+ %2 = vector.transfer_read %A[%c0, %c0], %f0, %arith_all_true_mask {in_bounds = [false, false]} : memref<?x?xf32>, vector<4x[4]xf32>
- // CHECK: vector.transfer_write {{.*}}[%[[C0]], %[[C0]]] :
- vector.transfer_write %2, %A[%c0, %c0], %arith_all_true_mask : vector<4x[4]xf32>, memref<?x?xf32>
+ // CHECK: vector.transfer_write {{.*}}[%[[C0]], %[[C0]]] {in_bounds
+ vector.transfer_write %2, %A[%c0, %c0], %arith_all_true_mask {in_bounds = [false, false]} : vector<4x[4]xf32>, memref<?x?xf32>
// CHECK: return
return %1, %2 : vector<4x8xf32>, vector<4x[4]xf32>
@@ -904,20 +904,20 @@ func.func @fold_vector_transfers(%A: memref<?x8xf32>) -> (vector<4x8xf32>, vecto
%f0 = arith.constant 0.0 : f32
// CHECK: vector.transfer_read %{{.*}} {in_bounds = [false, true]}
- %1 = vector.transfer_read %A[%c0, %c0], %f0 : memref<?x8xf32>, vector<4x8xf32>
+ %1 = vector.transfer_read %A[%c0, %c0], %f0 {in_bounds=[false, false]} : memref<?x8xf32>, vector<4x8xf32>
// CHECK: vector.transfer_write %{{.*}} {in_bounds = [false, true]}
- vector.transfer_write %1, %A[%c0, %c0] : vector<4x8xf32>, memref<?x8xf32>
+ vector.transfer_write %1, %A[%c0, %c0] {in_bounds=[false, false]} : vector<4x8xf32>, memref<?x8xf32>
// Both dims may be out-of-bounds, attribute is elided.
// CHECK: vector.transfer_read %{{.*}}
// CHECK-NOT: in_bounds
- %2 = vector.transfer_read %A[%c0, %c0], %f0 : memref<?x8xf32>, vector<4x9xf32>
+ %2 = vector.transfer_read %A[%c0, %c0], %f0 {in_bounds=[false, false]} : memref<?x8xf32>, vector<4x9xf32>
// Both dims may be out-of-bounds, attribute is elided.
// CHECK: vector.transfer_write %{{.*}}
// CHECK-NOT: in_bounds
- vector.transfer_write %2, %A[%c0, %c0] : vector<4x9xf32>, memref<?x8xf32>
+ vector.transfer_write %2, %A[%c0, %c0] {in_bounds=[false, false]} : vector<4x9xf32>, memref<?x8xf32>
// CHECK: return
return %1, %2 : vector<4x8xf32>, vector<4x9xf32>
@@ -1109,9 +1109,9 @@ func.func @dead_transfer_op(%arg0 : tensor<4x4xf32>, %arg1 : memref<4x4xf32>,
%v0 : vector<1x4xf32>) {
%c0 = arith.constant 0 : index
%cf0 = arith.constant 0.0 : f32
- %r = vector.transfer_read %arg1[%c0, %c0], %cf0 :
+ %r = vector.transfer_read %arg1[%c0, %c0], %cf0 {in_bounds=[false, false]} :
memref<4x4xf32>, vector<1x4xf32>
- %w = vector.transfer_write %v0, %arg0[%c0, %c0] :
+ %w = vector.transfer_write %v0, %arg0[%c0, %c0] {in_bounds=[false, false]} :
vector<1x4xf32>, tensor<4x4xf32>
return
}
@@ -1221,9 +1221,9 @@ func.func @store_after_load_tensor(%arg0 : tensor<4x4xf32>) -> tensor<4x4xf32> {
%c1 = arith.constant 1 : index
%c0 = arith.constant 0 : index
%cf0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg0[%c1, %c0], %cf0 :
+ %0 = vector.transfer_read %arg0[%c1, %c0], %cf0 {in_bounds=[false, false]} :
tensor<4x4xf32>, vector<1x4xf32>
- %w0 = vector.transfer_write %0, %arg0[%c1, %c0] :
+ %w0 = vector.transfer_write %0, %arg0[%c1, %c0] {in_bounds=[false, false]} :
vector<1x4xf32>, tensor<4x4xf32>
return %w0 : tensor<4x4xf32>
}
@@ -1238,9 +1238,9 @@ func.func @store_after_load_tensor_negative(%arg0 : tensor<4x4xf32>) -> tensor<4
%c1 = arith.constant 1 : index
%c0 = arith.constant 0 : index
%cf0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg0[%c1, %c0], %cf0 :
+ %0 = vector.transfer_read %arg0[%c1, %c0], %cf0 {in_bounds=[false, false]} :
tensor<4x4xf32>, vector<1x4xf32>
- %w0 = vector.transfer_write %0, %arg0[%c0, %c0] :
+ %w0 = vector.transfer_write %0, %arg0[%c0, %c0] {in_bounds=[false, false]} :
vector<1x4xf32>, tensor<4x4xf32>
return %w0 : tensor<4x4xf32>
}
@@ -2489,7 +2489,7 @@ func.func @all_true_vector_mask_no_result(%a : vector<3x4xf32>, %m : memref<3x4x
// CHECK: vector.transfer_write
%c0 = arith.constant 0 : index
%all_true = vector.constant_mask [3, 4] : vector<3x4xi1>
- vector.mask %all_true { vector.transfer_write %a, %m[%c0, %c0] : vector<3x4xf32>, memref<3x4xf32> } : vector<3x4xi1>
+ vector.mask %all_true { vector.transfer_write %a, %m[%c0, %c0] {in_bounds = [false, false]} : vector<3x4xf32>, memref<3x4xf32> } : vector<3x4xi1>
return
}
diff --git a/mlir/test/Dialect/Vector/invalid.mlir b/mlir/test/Dialect/Vector/invalid.mlir
index d0eaed8f98cc5..1a6a4fab4c76c 100644
--- a/mlir/test/Dialect/Vector/invalid.mlir
+++ b/mlir/test/Dialect/Vector/invalid.mlir
@@ -332,7 +332,7 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant 3.0 : f32
// expected-error@+1 {{requires two types}}
- %0 = vector.transfer_read %arg0[%c3, %c3], %cst { permutation_map = affine_map<()->(0)> } : memref<?x?xf32>
+ %0 = vector.transfer_read %arg0[%c3, %c3], %cst { in_bounds=[false, false], permutation_map = affine_map<()->(0)> } : memref<?x?xf32>
}
// -----
@@ -342,7 +342,7 @@ func.func @main(%m: memref<1xi32>, %2: vector<1x32xi1>) -> vector<1x32xi32> {
%0 = arith.constant 1 : index
%1 = arith.constant 1 : i32
// expected-error@+1 {{expected the same rank for the vector and the results of the permutation map}}
- %3 = vector.transfer_read %m[%0], %1, %2 { permutation_map = #map1 } : memref<1xi32>, vector<1x32xi32>
+ %3 = vector.transfer_read %m[%0], %1, %2 { in_bounds=[false, false], permutation_map = #map1 } : memref<1xi32>, vector<1x32xi32>
return %3 : vector<1x32xi32>
}
@@ -353,7 +353,7 @@ func.func @test_vector.transfer_write(%m: memref<1xi32>, %2: vector<1x32xi32>)
%0 = arith.constant 1 : index
%1 = arith.constant 1 : i32
// expected-error@+1 {{expected the same rank for the vector and the results of the permutation map}}
- %3 = vector.transfer_write %2, %m[%0], %1 { permutation_map = #map1 } : vector<1x32xi32>, memref<1xi32>
+ %3 = vector.transfer_write %2, %m[%0], %1 { in_bounds=[false, false], permutation_map = #map1 } : vector<1x32xi32>, memref<1xi32>
return %3 : vector<1x32xi32>
}
@@ -364,7 +364,7 @@ func.func @test_vector.transfer_read(%arg0: vector<4x3xf32>) {
%f0 = arith.constant 0.0 : f32
%vf0 = vector.splat %f0 : vector<4x3xf32>
// expected-error@+1 {{ requires memref or ranked tensor type}}
- %0 = vector.transfer_read %arg0[%c3, %c3], %vf0 : vector<4x3xf32>, vector<1x1x2x3xf32>
+ %0 = vector.transfer_read %arg0[%c3, %c3], %vf0 { in_bounds=[false, false] } : vector<4x3xf32>, vector<1x1x2x3xf32>
}
// -----
@@ -374,7 +374,7 @@ func.func @test_vector.transfer_read(%arg0: memref<4x3xf32>) {
%f0 = arith.constant 0.0 : f32
%vf0 = vector.splat %f0 : vector<4x3xf32>
// expected-error@+1 {{ requires vector type}}
- %0 = vector.transfer_read %arg0[%c3, %c3], %vf0 : memref<4x3xf32>, f32
+ %0 = vector.transfer_read %arg0[%c3, %c3], %vf0 { in_bounds=[] } : memref<4x3xf32>, f32
}
// -----
@@ -383,7 +383,7 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant 3.0 : f32
// expected-error@+1 {{requires 2 indices}}
- %0 = vector.transfer_read %arg0[%c3, %c3, %c3], %cst { permutation_map = affine_map<()->(0)> } : memref<?x?xf32>, vector<128xf32>
+ %0 = vector.transfer_read %arg0[%c3, %c3, %c3], %cst { in_bounds=[false, false], permutation_map = affine_map<()->(0)> } : memref<?x?xf32>, vector<128xf32>
}
// -----
@@ -392,7 +392,7 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant 3.0 : f32
// expected-error@+1 {{requires 2 indices}}
- %0 = vector.transfer_read %arg0[%c3], %cst { permutation_map = affine_map<()->(0)> } : memref<?x?xf32>, vector<128xf32>
+ %0 = vector.transfer_read %arg0[%c3], %cst { in_bounds=[false], permutation_map = affine_map<()->(0)> } : memref<?x?xf32>, vector<128xf32>
}
// -----
@@ -401,7 +401,7 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant 3.0 : f32
// expected-error@+1 {{requires a permutation_map with input dims of the same rank as the source type}}
- %0 = vector.transfer_read %arg0[%c3, %c3], %cst {permutation_map = affine_map<(d0)->(d0)>} : memref<?x?xf32>, vector<128xf32>
+ %0 = vector.transfer_read %arg0[%c3, %c3], %cst {in_bounds=[false], permutation_map = affine_map<(d0)->(d0)>} : memref<?x?xf32>, vector<128xf32>
}
// -----
@@ -410,7 +410,7 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant 3.0 : f32
// expected-error@+1 {{requires a permutation_map with result dims of the same rank as the vector type}}
- %0 = vector.transfer_read %arg0[%c3, %c3], %cst {permutation_map = affine_map<(d0, d1)->(d0, d1)>} : memref<?x?xf32>, vector<128xf32>
+ %0 = vector.transfer_read %arg0[%c3, %c3], %cst {in_bounds=[false], permutation_map = affine_map<(d0, d1)->(d0, d1)>} : memref<?x?xf32>, vector<128xf32>
}
// -----
@@ -419,7 +419,7 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant 3.0 : f32
// expected-error@+1 {{requires a projected permutation_map (at most one dim or the zero constant can appear in each result)}}
- %0 = vector.transfer_read %arg0[%c3, %c3], %cst {permutation_map = affine_map<(d0, d1)->(d0 + d1)>} : memref<?x?xf32>, vector<128xf32>
+ %0 = vector.transfer_read %arg0[%c3, %c3], %cst {in_bounds=[false], permutation_map = affine_map<(d0, d1)->(d0 + d1)>} : memref<?x?xf32>, vector<128xf32>
}
// -----
@@ -428,7 +428,7 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant 3.0 : f32
// expected-error@+1 {{requires a projected permutation_map (at most one dim or the zero constant can appear in each result)}}
- %0 = vector.transfer_read %arg0[%c3, %c3], %cst {permutation_map = affine_map<(d0, d1)->(d0 + 1)>} : memref<?x?xf32>, vector<128xf32>
+ %0 = vector.transfer_read %arg0[%c3, %c3], %cst {in_bounds=[false], permutation_map = affine_map<(d0, d1)->(d0 + 1)>} : memref<?x?xf32>, vector<128xf32>
}
// -----
@@ -437,7 +437,7 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant 3.0 : f32
// expected-error@+1 {{requires a permutation_map that is a permutation (found one dim used more than once)}}
- %0 = vector.transfer_read %arg0[%c3, %c3, %c3], %cst {permutation_map = affine_map<(d0, d1, d2)->(d0, d0)>} : memref<?x?x?xf32>, vector<3x7xf32>
+ %0 = vector.transfer_read %arg0[%c3, %c3, %c3], %cst {in_bounds=[false, false], permutation_map = affine_map<(d0, d1, d2)->(d0, d0)>} : memref<?x?x?xf32>, vector<3x7xf32>
}
// -----
@@ -449,7 +449,7 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?x?xf32>) {
// expected-note@+1 {{prior use here}}
%mask = vector.splat %c1 : vector<3x8x7xi1>
// expected-error@+1 {{expects different type than prior uses: 'vector<3x7xi1>' vs 'vector<3x8x7xi1>'}}
- %0 = vector.transfer_read %arg0[%c3, %c3, %c3], %cst, %mask {permutation_map = affine_map<(d0, d1, d2)->(d0, 0, d2)>} : memref<?x?x?xf32>, vector<3x8x7xf32>
+ %0 = vector.transfer_read %arg0[%c3, %c3, %c3], %cst, %mask {in_bounds=[false, false, false], permutation_map = affine_map<(d0, d1, d2)->(d0, 0, d2)>} : memref<?x?x?xf32>, vector<3x8x7xf32>
}
// -----
@@ -459,7 +459,7 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xvector<4x3xf32>>) {
%f0 = arith.constant 0.0 : f32
%vf0 = vector.splat %f0 : vector<4x3xf32>
// expected-error@+1 {{requires source vector element and vector result ranks to match}}
- %0 = vector.transfer_read %arg0[%c3, %c3], %vf0 {permutation_map = affine_map<(d0, d1)->(d0, d1)>} : memref<?x?xvector<4x3xf32>>, vector<3xf32>
+ %0 = vector.transfer_read %arg0[%c3, %c3], %vf0 {in_bounds = [false], permutation_map = affine_map<(d0, d1)->(d0, d1)>} : memref<?x?xvector<4x3xf32>>, vector<3xf32>
}
// -----
@@ -469,7 +469,7 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xvector<6xf32>>) {
%f0 = arith.constant 0.0 : f32
%vf0 = vector.splat %f0 : vector<6xf32>
// expected-error@+1 {{requires the bitwidth of the minor 1-D vector to be an integral multiple of the bitwidth of the minor 1-D vector of the source}}
- %0 = vector.transfer_read %arg0[%c3, %c3], %vf0 : memref<?x?xvector<6xf32>>, vector<3xf32>
+ %0 = vector.transfer_read %arg0[%c3, %c3], %vf0 {in_bounds = [false]} : memref<?x?xvector<6xf32>>, vector<3xf32>
}
// -----
@@ -478,12 +478,13 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xvector<2x3xf32>>) {
%c3 = arith.constant 3 : index
%f0 = arith.constant 0.0 : f32
%vf0 = vector.splat %f0 : vector<2x3xf32>
- // expected-error@+1 {{ expects the optional in_bounds attr of same rank as permutation_map results: affine_map<(d0, d1) -> (d0, d1)>}}
- %0 = vector.transfer_read %arg0[%c3, %c3], %vf0 {in_bounds = [true], permutation_map = affine_map<(d0, d1)->(d0, d1)>} : memref<?x?xvector<2x3xf32>>, vector<1x1x2x3xf32>
+ // expected-error@+1 {{ expects the in_bounds attr of same rank as permutation_map results: affine_map<(d0, d1) -> (d0, d1)>}}
+ %0 = vector.transfer_read %arg0[%c3, %c3], %vf0 {in_bounds = [false], permutation_map = affine_map<(d0, d1)->(d0, d1)>} : memref<?x?xvector<2x3xf32>>, vector<1x1x2x3xf32>
}
// -----
+//FIXME - doesn't trigger the expected error
func.func @test_vector.transfer_read(%arg0: memref<?x?xvector<2x3xf32>>) {
%c3 = arith.constant 3 : index
%f0 = arith.constant 0.0 : f32
@@ -500,7 +501,7 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xvector<2x3xf32>>) {
%vf0 = vector.splat %f0 : vector<2x3xf32>
%mask = vector.splat %c1 : vector<2x3xi1>
// expected-error@+1 {{does not support masks with vector element type}}
- %0 = vector.transfer_read %arg0[%c3, %c3], %vf0, %mask {permutation_map = affine_map<(d0, d1)->(d0, d1)>} : memref<?x?xvector<2x3xf32>>, vector<1x1x2x3xf32>
+ %0 = vector.transfer_read %arg0[%c3, %c3], %vf0, %mask {in_bounds = [false, false, false], permutation_map = affine_map<(d0, d1)->(d0, d1)>} : memref<?x?xvector<2x3xf32>>, vector<1x1x2x3xf32>
}
// -----
@@ -547,7 +548,7 @@ func.func @test_vector.transfer_write(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant dense<3.0> : vector<128 x f32>
// expected-error@+1 {{requires 2 indices}}
- vector.transfer_write %cst, %arg0[%c3, %c3, %c3] {permutation_map = affine_map<()->(0)>} : vector<128xf32>, memref<?x?xf32>
+ vector.transfer_write %cst, %arg0[%c3, %c3, %c3] {in_bounds = [false], permutation_map = affine_map<()->(0)>} : vector<128xf32>, memref<?x?xf32>
}
// -----
@@ -556,7 +557,7 @@ func.func @test_vector.transfer_write(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant dense<3.0> : vector<128 x f32>
// expected-error@+1 {{requires 2 indices}}
- vector.transfer_write %cst, %arg0[%c3] {permutation_map = affine_map<()->(0)>} : vector<128xf32>, memref<?x?xf32>
+ vector.transfer_write %cst, %arg0[%c3] {in_bounds = [false], permutation_map = affine_map<()->(0)>} : vector<128xf32>, memref<?x?xf32>
}
// -----
@@ -565,7 +566,7 @@ func.func @test_vector.transfer_write(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant dense<3.0> : vector<128 x f32>
// expected-error@+1 {{requires a permutation_map with input dims of the same rank as the source type}}
- vector.transfer_write %cst, %arg0[%c3, %c3] {permutation_map = affine_map<(d0)->(d0)>} : vector<128xf32>, memref<?x?xf32>
+ vector.transfer_write %cst, %arg0[%c3, %c3] {in_bounds = [false], permutation_map = affine_map<(d0)->(d0)>} : vector<128xf32>, memref<?x?xf32>
}
// -----
@@ -574,7 +575,7 @@ func.func @test_vector.transfer_write(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant dense<3.0> : vector<128 x f32>
// expected-error@+1 {{requires a permutation_map with result dims of the same rank as the vector type}}
- vector.transfer_write %cst, %arg0[%c3, %c3] {permutation_map = affine_map<(d0, d1)->(d0, d1)>} : vector<128xf32>, memref<?x?xf32>
+ vector.transfer_write %cst, %arg0[%c3, %c3] {in_bounds = [false], permutation_map = affine_map<(d0, d1)->(d0, d1)>} : vector<128xf32>, memref<?x?xf32>
}
// -----
@@ -583,7 +584,7 @@ func.func @test_vector.transfer_write(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant dense<3.0> : vector<128 x f32>
// expected-error@+1 {{requires a projected permutation_map (at most one dim or the zero constant can appear in each result)}}
- vector.transfer_write %cst, %arg0[%c3, %c3] {permutation_map = affine_map<(d0, d1)->(d0 + d1)>} : vector<128xf32>, memref<?x?xf32>
+ vector.transfer_write %cst, %arg0[%c3, %c3] {in_bounds = [false], permutation_map = affine_map<(d0, d1)->(d0 + d1)>} : vector<128xf32>, memref<?x?xf32>
}
// -----
@@ -592,7 +593,7 @@ func.func @test_vector.transfer_write(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant dense<3.0> : vector<128 x f32>
// expected-error@+1 {{requires a projected permutation_map (at most one dim or the zero constant can appear in each result)}}
- vector.transfer_write %cst, %arg0[%c3, %c3] {permutation_map = affine_map<(d0, d1)->(d0 + 1)>} : vector<128xf32>, memref<?x?xf32>
+ vector.transfer_write %cst, %arg0[%c3, %c3] {in_bounds = [false], permutation_map = affine_map<(d0, d1)->(d0 + 1)>} : vector<128xf32>, memref<?x?xf32>
}
// -----
@@ -601,7 +602,7 @@ func.func @test_vector.transfer_write(%arg0: memref<?x?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant dense<3.0> : vector<3 x 7 x f32>
// expected-error@+1 {{requires a permutation_map that is a permutation (found one dim used more than once)}}
- vector.transfer_write %cst, %arg0[%c3, %c3, %c3] {permutation_map = affine_map<(d0, d1, d2)->(d0, d0)>} : vector<3x7xf32>, memref<?x?x?xf32>
+ vector.transfer_write %cst, %arg0[%c3, %c3, %c3] {in_bounds = [false, false], permutation_map = affine_map<(d0, d1, d2)->(d0, d0)>} : vector<3x7xf32>, memref<?x?x?xf32>
}
// -----
@@ -611,7 +612,7 @@ func.func @test_vector.transfer_write(%arg0: memref<?xf32>, %arg1: vector<7xf32>
%cst = arith.constant 3.0 : f32
// expected-error@+1 {{should not have broadcast dimensions}}
vector.transfer_write %arg1, %arg0[%c3]
- {permutation_map = affine_map<(d0) -> (0)>}
+ {in_bounds = [false], permutation_map = affine_map<(d0) -> (0)>}
: vector<7xf32>, memref<?xf32>
}
diff --git a/mlir/test/Dialect/Vector/lower-vector-mask.mlir b/mlir/test/Dialect/Vector/lower-vector-mask.mlir
index a8a1164e2f762..8f9411adcd9e5 100644
--- a/mlir/test/Dialect/Vector/lower-vector-mask.mlir
+++ b/mlir/test/Dialect/Vector/lower-vector-mask.mlir
@@ -2,7 +2,7 @@
func.func @vector_transfer_read(%t0: tensor<?xf32>, %idx: index, %m0: vector<16xi1>) -> vector<16xf32> {
%ft0 = arith.constant 0.0 : f32
- %0 = vector.mask %m0 { vector.transfer_read %t0[%idx], %ft0 : tensor<?xf32>, vector<16xf32> } : vector<16xi1> -> vector<16xf32>
+ %0 = vector.mask %m0 { vector.transfer_read %t0[%idx], %ft0 {in_bounds = [false]} : tensor<?xf32>, vector<16xf32> } : vector<16xi1> -> vector<16xf32>
return %0 : vector<16xf32>
}
@@ -11,14 +11,14 @@ func.func @vector_transfer_read(%t0: tensor<?xf32>, %idx: index, %m0: vector<16x
// CHECK-SAME: %[[VAL_1:.*]]: index,
// CHECK-SAME: %[[VAL_2:.*]]: vector<16xi1>) -> vector<16xf32> {
// CHECK-NOT: vector.mask
-// CHECK: %[[VAL_4:.*]] = vector.transfer_read {{.*}}, %[[VAL_2]] : tensor<?xf32>, vector<16xf32>
+// CHECK: %[[VAL_4:.*]] = vector.transfer_read {{.*}}, %[[VAL_2]] {{.*}} : tensor<?xf32>, vector<16xf32>
// CHECK: return %[[VAL_4]] : vector<16xf32>
// CHECK: }
// -----
func.func @vector_transfer_write_on_memref(%val: vector<16xf32>, %t0: memref<?xf32>, %idx: index, %m0: vector<16xi1>) {
- vector.mask %m0 { vector.transfer_write %val, %t0[%idx] : vector<16xf32>, memref<?xf32> } : vector<16xi1>
+ vector.mask %m0 { vector.transfer_write %val, %t0[%idx] {in_bounds = [false]} : vector<16xf32>, memref<?xf32> } : vector<16xi1>
return
}
@@ -28,14 +28,14 @@ func.func @vector_transfer_write_on_memref(%val: vector<16xf32>, %t0: memref<?xf
// CHECK-SAME: %[[VAL_2:.*]]: index,
// CHECK-SAME: %[[VAL_3:.*]]: vector<16xi1>) {
//CHECK-NOT: vector.mask
-// CHECK: vector.transfer_write %[[VAL_0]], {{.*}}, %[[VAL_3]] : vector<16xf32>, memref<?xf32>
+// CHECK: vector.transfer_write %[[VAL_0]], {{.*}}, %[[VAL_3]] {{.*}} : vector<16xf32>, memref<?xf32>
// CHECK: return
// CHECK: }
// -----
func.func @vector_transfer_write_on_tensor(%val: vector<16xf32>, %t0: tensor<?xf32>, %idx: index, %m0: vector<16xi1>) -> tensor<?xf32> {
- %res = vector.mask %m0 { vector.transfer_write %val, %t0[%idx] : vector<16xf32>, tensor<?xf32> } : vector<16xi1> -> tensor<?xf32>
+ %res = vector.mask %m0 { vector.transfer_write %val, %t0[%idx] {in_bounds = [false]} : vector<16xf32>, tensor<?xf32> } : vector<16xi1> -> tensor<?xf32>
return %res : tensor<?xf32>
}
@@ -44,7 +44,7 @@ func.func @vector_transfer_write_on_tensor(%val: vector<16xf32>, %t0: tensor<?xf
// CHECK-SAME: %[[VAL_1:.*]]: tensor<?xf32>,
// CHECK-SAME: %[[VAL_2:.*]]: index,
// CHECK-SAME: %[[VAL_3:.*]]: vector<16xi1>) -> tensor<?xf32> {
-// CHECK: %[[VAL_4:.*]] = vector.transfer_write %[[VAL_0]], {{.*}}, %[[VAL_3]] : vector<16xf32>, tensor<?xf32>
+// CHECK: %[[VAL_4:.*]] = vector.transfer_write %[[VAL_0]], {{.*}}, %[[VAL_3]] {{.*}} : vector<16xf32>, tensor<?xf32>
// CHECK: return %[[VAL_4]] : tensor<?xf32>
// CHECK: }
diff --git a/mlir/test/Dialect/Vector/one-shot-bufferize.mlir b/mlir/test/Dialect/Vector/one-shot-bufferize.mlir
index 64238c3c08a6f..7651a31c2c1cb 100644
--- a/mlir/test/Dialect/Vector/one-shot-bufferize.mlir
+++ b/mlir/test/Dialect/Vector/one-shot-bufferize.mlir
@@ -6,8 +6,8 @@
func.func @mask(%t0: tensor<?xf32>, %val: vector<16xf32>, %idx: index, %m0: vector<16xi1>) -> tensor<?xf32> {
// CHECK-NOT: alloc
// CHECK-NOT: copy
- // CHECK: vector.mask %{{.*}} { vector.transfer_write %{{.*}}, %[[t0]][%{{.*}}] : vector<16xf32>, memref<?xf32, strided<[?], offset: ?>> } : vector<16xi1>
- %0 = vector.mask %m0 { vector.transfer_write %val, %t0[%idx] : vector<16xf32>, tensor<?xf32> } : vector<16xi1> -> tensor<?xf32>
+ // CHECK: vector.mask %{{.*}} { vector.transfer_write %{{.*}}, %[[t0]][%{{.*}}] {{.*}}: vector<16xf32>, memref<?xf32, strided<[?], offset: ?>> } : vector<16xi1>
+ %0 = vector.mask %m0 { vector.transfer_write %val, %t0[%idx] {in_bounds = [false]} : vector<16xf32>, tensor<?xf32> } : vector<16xi1> -> tensor<?xf32>
// CHECK: return %[[t0]]
return %0 : tensor<?xf32>
}
@@ -18,7 +18,7 @@ func.func @mask(%t0: tensor<?xf32>, %val: vector<16xf32>, %idx: index, %m0: vect
// CHECK-ANALYSIS-SAME: tensor<5x10xf32> {bufferization.access = "write"}
func.func @non_reading_xfer_write(%t: tensor<5x10xf32>, %v: vector<6x11xf32>) -> tensor<5x10xf32> {
%c0 = arith.constant 0 : index
- %1 = vector.transfer_write %v, %t[%c0, %c0] : vector<6x11xf32>, tensor<5x10xf32>
+ %1 = vector.transfer_write %v, %t[%c0, %c0] {in_bounds = [false, false]} : vector<6x11xf32>, tensor<5x10xf32>
return %1 : tensor<5x10xf32>
}
// -----
@@ -27,6 +27,6 @@ func.func @non_reading_xfer_write(%t: tensor<5x10xf32>, %v: vector<6x11xf32>) ->
// CHECK-ANALYSIS-SAME: tensor<5x10xf32> {bufferization.access = "read-write"}
func.func @reading_xfer_write(%t: tensor<5x10xf32>, %v: vector<4x11xf32>) -> tensor<5x10xf32> {
%c0 = arith.constant 0 : index
- %1 = vector.transfer_write %v, %t[%c0, %c0] : vector<4x11xf32>, tensor<5x10xf32>
+ %1 = vector.transfer_write %v, %t[%c0, %c0] {in_bounds = [false, false]} : vector<4x11xf32>, tensor<5x10xf32>
return %1 : tensor<5x10xf32>
}
diff --git a/mlir/test/Dialect/Vector/ops.mlir b/mlir/test/Dialect/Vector/ops.mlir
index 4da09584db88b..31b36bb510d93 100644
--- a/mlir/test/Dialect/Vector/ops.mlir
+++ b/mlir/test/Dialect/Vector/ops.mlir
@@ -4,13 +4,13 @@
func.func @vector_transfer_ops_0d(%arg0: tensor<f32>, %arg1: memref<f32>)
-> tensor<f32> {
%f0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg0[], %f0 {permutation_map = affine_map<()->()>} :
+ %0 = vector.transfer_read %arg0[], %f0 {in_bounds = [], permutation_map = affine_map<()->()>} :
tensor<f32>, vector<f32>
- %1 = vector.transfer_write %0, %arg0[] {permutation_map = affine_map<()->()>} :
+ %1 = vector.transfer_write %0, %arg0[] {in_bounds = [], permutation_map = affine_map<()->()>} :
vector<f32>, tensor<f32>
- %2 = vector.transfer_read %arg1[], %f0 {permutation_map = affine_map<()->()>} :
+ %2 = vector.transfer_read %arg1[], %f0 {in_bounds = [], permutation_map = affine_map<()->()>} :
memref<f32>, vector<f32>
- vector.transfer_write %2, %arg1[] {permutation_map = affine_map<()->()>} :
+ vector.transfer_write %2, %arg1[] {in_bounds = [], permutation_map = affine_map<()->()>} :
vector<f32>, memref<f32>
return %1: tensor<f32>
}
@@ -20,13 +20,13 @@ func.func @vector_transfer_ops_0d_from_higher_d(%arg0: tensor<?xf32>, %arg1: mem
-> tensor<?xf32> {
%c0 = arith.constant 0 : index
%f0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg0[%c0], %f0 {permutation_map = affine_map<(d0)->()>} :
+ %0 = vector.transfer_read %arg0[%c0], %f0 {in_bounds = [], permutation_map = affine_map<(d0)->()>} :
tensor<?xf32>, vector<f32>
- %1 = vector.transfer_write %0, %arg0[%c0] {permutation_map = affine_map<(d0)->()>} :
+ %1 = vector.transfer_write %0, %arg0[%c0] {in_bounds = [], permutation_map = affine_map<(d0)->()>} :
vector<f32>, tensor<?xf32>
- %2 = vector.transfer_read %arg1[%c0, %c0], %f0 {permutation_map = affine_map<(d0, d1)->()>} :
+ %2 = vector.transfer_read %arg1[%c0, %c0], %f0 {in_bounds = [], permutation_map = affine_map<(d0, d1)->()>} :
memref<?x?xf32>, vector<f32>
- vector.transfer_write %2, %arg1[%c0, %c0] {permutation_map = affine_map<(d0, d1)->()>} :
+ vector.transfer_write %2, %arg1[%c0, %c0] {in_bounds = [], permutation_map = affine_map<(d0, d1)->()>} :
vector<f32>, memref<?x?xf32>
return %1: tensor<?xf32>
}
@@ -52,40 +52,40 @@ func.func @vector_transfer_ops(%arg0: memref<?x?xf32>,
%m2 = vector.splat %i1 : vector<4x5xi1>
//
// CHECK: vector.transfer_read
- %0 = vector.transfer_read %arg0[%c3, %c3], %f0 {permutation_map = affine_map<(d0, d1)->(d0)>} : memref<?x?xf32>, vector<128xf32>
+ %0 = vector.transfer_read %arg0[%c3, %c3], %f0 {in_bounds = [false], permutation_map = affine_map<(d0, d1)->(d0)>} : memref<?x?xf32>, vector<128xf32>
// CHECK: vector.transfer_read
- %1 = vector.transfer_read %arg0[%c3, %c3], %f0 {permutation_map = affine_map<(d0, d1)->(d1, d0)>} : memref<?x?xf32>, vector<3x7xf32>
+ %1 = vector.transfer_read %arg0[%c3, %c3], %f0 {in_bounds = [false, false], permutation_map = affine_map<(d0, d1)->(d1, d0)>} : memref<?x?xf32>, vector<3x7xf32>
// CHECK: vector.transfer_read
- %2 = vector.transfer_read %arg0[%c3, %c3], %cst {permutation_map = affine_map<(d0, d1)->(d0)>} : memref<?x?xf32>, vector<128xf32>
+ %2 = vector.transfer_read %arg0[%c3, %c3], %cst {in_bounds = [false], permutation_map = affine_map<(d0, d1)->(d0)>} : memref<?x?xf32>, vector<128xf32>
// CHECK: vector.transfer_read
- %3 = vector.transfer_read %arg0[%c3, %c3], %cst {permutation_map = affine_map<(d0, d1)->(d1)>} : memref<?x?xf32>, vector<128xf32>
+ %3 = vector.transfer_read %arg0[%c3, %c3], %cst {in_bounds = [false], permutation_map = affine_map<(d0, d1)->(d1)>} : memref<?x?xf32>, vector<128xf32>
// CHECK: vector.transfer_read %{{.*}}[%[[C3]], %[[C3]]], %{{.*}} : memref<?x?xvector<4x3xf32>>, vector<1x1x4x3xf32>
- %4 = vector.transfer_read %arg1[%c3, %c3], %vf0 {permutation_map = affine_map<(d0, d1)->(d0, d1)>} : memref<?x?xvector<4x3xf32>>, vector<1x1x4x3xf32>
+ %4 = vector.transfer_read %arg1[%c3, %c3], %vf0 {in_bounds = [false, false], permutation_map = affine_map<(d0, d1)->(d0, d1)>} : memref<?x?xvector<4x3xf32>>, vector<1x1x4x3xf32>
// CHECK: vector.transfer_read %{{.*}}[%[[C3]], %[[C3]]], %{{.*}} {in_bounds = [false, true]} : memref<?x?xvector<4x3xf32>>, vector<1x1x4x3xf32>
%5 = vector.transfer_read %arg1[%c3, %c3], %vf0 {in_bounds = [false, true]} : memref<?x?xvector<4x3xf32>>, vector<1x1x4x3xf32>
// CHECK: vector.transfer_read %{{.*}}[%[[C3]], %[[C3]]], %{{.*}} : memref<?x?xvector<4x3xi32>>, vector<5x24xi8>
- %6 = vector.transfer_read %arg2[%c3, %c3], %v0 : memref<?x?xvector<4x3xi32>>, vector<5x24xi8>
+ %6 = vector.transfer_read %arg2[%c3, %c3], %v0 {in_bounds = []} : memref<?x?xvector<4x3xi32>>, vector<5x24xi8>
// CHECK: vector.transfer_read %{{.*}}[%[[C3]], %[[C3]]], %{{.*}} : memref<?x?xvector<4x3xindex>>, vector<5x48xi8>
- %7 = vector.transfer_read %arg3[%c3, %c3], %vi0 : memref<?x?xvector<4x3xindex>>, vector<5x48xi8>
+ %7 = vector.transfer_read %arg3[%c3, %c3], %vi0 {in_bounds = []} : memref<?x?xvector<4x3xindex>>, vector<5x48xi8>
// CHECK: vector.transfer_read %{{.*}}[%[[C3]], %[[C3]]], %{{.*}}, %{{.*}} : memref<?x?xf32>, vector<5xf32>
- %8 = vector.transfer_read %arg0[%c3, %c3], %f0, %m : memref<?x?xf32>, vector<5xf32>
+ %8 = vector.transfer_read %arg0[%c3, %c3], %f0, %m {in_bounds=[false]} : memref<?x?xf32>, vector<5xf32>
// CHECK: vector.transfer_read %{{.*}}[%[[C3]], %[[C3]], %[[C3]]], %{{.*}}, %{{.*}} : memref<?x?x?xf32>, vector<5x4x8xf32>
- %9 = vector.transfer_read %arg4[%c3, %c3, %c3], %f0, %m2 {permutation_map = affine_map<(d0, d1, d2)->(d1, d0, 0)>} : memref<?x?x?xf32>, vector<5x4x8xf32>
+ %9 = vector.transfer_read %arg4[%c3, %c3, %c3], %f0, %m2 {in_bounds = [false, false, true], permutation_map = affine_map<(d0, d1, d2)->(d1, d0, 0)>} : memref<?x?x?xf32>, vector<5x4x8xf32>
// CHECK: vector.transfer_write
- vector.transfer_write %0, %arg0[%c3, %c3] {permutation_map = affine_map<(d0, d1)->(d0)>} : vector<128xf32>, memref<?x?xf32>
+ vector.transfer_write %0, %arg0[%c3, %c3] {in_bounds = [false], permutation_map = affine_map<(d0, d1)->(d0)>} : vector<128xf32>, memref<?x?xf32>
// CHECK: vector.transfer_write
- vector.transfer_write %1, %arg0[%c3, %c3] {permutation_map = affine_map<(d0, d1)->(d1, d0)>} : vector<3x7xf32>, memref<?x?xf32>
- // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] : vector<1x1x4x3xf32>, memref<?x?xvector<4x3xf32>>
- vector.transfer_write %4, %arg1[%c3, %c3] {permutation_map = affine_map<(d0, d1)->(d0, d1)>} : vector<1x1x4x3xf32>, memref<?x?xvector<4x3xf32>>
- // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] : vector<1x1x4x3xf32>, memref<?x?xvector<4x3xf32>>
+ vector.transfer_write %1, %arg0[%c3, %c3] {in_bounds = [false, false], permutation_map = affine_map<(d0, d1)->(d1, d0)>} : vector<3x7xf32>, memref<?x?xf32>
+ // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] {{.*}} : vector<1x1x4x3xf32>, memref<?x?xvector<4x3xf32>>
+ vector.transfer_write %4, %arg1[%c3, %c3] {in_bounds = [false, false], permutation_map = affine_map<(d0, d1)->(d0, d1)>} : vector<1x1x4x3xf32>, memref<?x?xvector<4x3xf32>>
+ // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] {in_bounds = [false, false]} : vector<1x1x4x3xf32>, memref<?x?xvector<4x3xf32>>
vector.transfer_write %5, %arg1[%c3, %c3] {in_bounds = [false, false]} : vector<1x1x4x3xf32>, memref<?x?xvector<4x3xf32>>
- // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] : vector<5x24xi8>, memref<?x?xvector<4x3xi32>>
- vector.transfer_write %6, %arg2[%c3, %c3] : vector<5x24xi8>, memref<?x?xvector<4x3xi32>>
- // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] : vector<5x48xi8>, memref<?x?xvector<4x3xindex>>
- vector.transfer_write %7, %arg3[%c3, %c3] : vector<5x48xi8>, memref<?x?xvector<4x3xindex>>
+ // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] {{.*}} : vector<5x24xi8>, memref<?x?xvector<4x3xi32>>
+ vector.transfer_write %6, %arg2[%c3, %c3] {in_bounds = []} : vector<5x24xi8>, memref<?x?xvector<4x3xi32>>
+ // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] {{.*}} : vector<5x48xi8>, memref<?x?xvector<4x3xindex>>
+ vector.transfer_write %7, %arg3[%c3, %c3] {in_bounds = []} : vector<5x48xi8>, memref<?x?xvector<4x3xindex>>
// CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]], %{{.*}} : vector<5xf32>, memref<?x?xf32>
- vector.transfer_write %8, %arg0[%c3, %c3], %m : vector<5xf32>, memref<?x?xf32>
+ vector.transfer_write %8, %arg0[%c3, %c3], %m {in_bounds = [false]} : vector<5xf32>, memref<?x?xf32>
return
}
@@ -112,35 +112,35 @@ func.func @vector_transfer_ops_tensor(%arg0: tensor<?x?xf32>,
//
// CHECK: vector.transfer_read
- %0 = vector.transfer_read %arg0[%c3, %c3], %f0 {permutation_map = affine_map<(d0, d1)->(d0)>} : tensor<?x?xf32>, vector<128xf32>
+ %0 = vector.transfer_read %arg0[%c3, %c3], %f0 {in_bounds = [false], permutation_map = affine_map<(d0, d1)->(d0)>} : tensor<?x?xf32>, vector<128xf32>
// CHECK: vector.transfer_read
- %1 = vector.transfer_read %arg0[%c3, %c3], %f0 {permutation_map = affine_map<(d0, d1)->(d1, d0)>} : tensor<?x?xf32>, vector<3x7xf32>
+ %1 = vector.transfer_read %arg0[%c3, %c3], %f0 {in_bounds = [false, false], permutation_map = affine_map<(d0, d1)->(d1, d0)>} : tensor<?x?xf32>, vector<3x7xf32>
// CHECK: vector.transfer_read
- %2 = vector.transfer_read %arg0[%c3, %c3], %cst {permutation_map = affine_map<(d0, d1)->(d0)>} : tensor<?x?xf32>, vector<128xf32>
+ %2 = vector.transfer_read %arg0[%c3, %c3], %cst {in_bounds = [false], permutation_map = affine_map<(d0, d1)->(d0)>} : tensor<?x?xf32>, vector<128xf32>
// CHECK: vector.transfer_read
- %3 = vector.transfer_read %arg0[%c3, %c3], %cst {permutation_map = affine_map<(d0, d1)->(d1)>} : tensor<?x?xf32>, vector<128xf32>
+ %3 = vector.transfer_read %arg0[%c3, %c3], %cst {in_bounds = [false], permutation_map = affine_map<(d0, d1)->(d1)>} : tensor<?x?xf32>, vector<128xf32>
// CHECK: vector.transfer_read %{{.*}}[%[[C3]], %[[C3]]], %{{.*}} : tensor<?x?xvector<4x3xf32>>, vector<1x1x4x3xf32>
- %4 = vector.transfer_read %arg1[%c3, %c3], %vf0 {permutation_map = affine_map<(d0, d1)->(d0, d1)>} : tensor<?x?xvector<4x3xf32>>, vector<1x1x4x3xf32>
+ %4 = vector.transfer_read %arg1[%c3, %c3], %vf0 {in_bounds = [false, false], permutation_map = affine_map<(d0, d1)->(d0, d1)>} : tensor<?x?xvector<4x3xf32>>, vector<1x1x4x3xf32>
// CHECK: vector.transfer_read %{{.*}}[%[[C3]], %[[C3]]], %{{.*}} {in_bounds = [false, true]} : tensor<?x?xvector<4x3xf32>>, vector<1x1x4x3xf32>
%5 = vector.transfer_read %arg1[%c3, %c3], %vf0 {in_bounds = [false, true]} : tensor<?x?xvector<4x3xf32>>, vector<1x1x4x3xf32>
// CHECK: vector.transfer_read %{{.*}}[%[[C3]], %[[C3]]], %{{.*}} : tensor<?x?xvector<4x3xi32>>, vector<5x24xi8>
- %6 = vector.transfer_read %arg2[%c3, %c3], %v0 : tensor<?x?xvector<4x3xi32>>, vector<5x24xi8>
+ %6 = vector.transfer_read %arg2[%c3, %c3], %v0 {in_bounds = []} : tensor<?x?xvector<4x3xi32>>, vector<5x24xi8>
// CHECK: vector.transfer_read %{{.*}}[%[[C3]], %[[C3]]], %{{.*}} : tensor<?x?xvector<4x3xindex>>, vector<5x48xi8>
- %7 = vector.transfer_read %arg3[%c3, %c3], %vi0 : tensor<?x?xvector<4x3xindex>>, vector<5x48xi8>
+ %7 = vector.transfer_read %arg3[%c3, %c3], %vi0 {in_bounds = []} : tensor<?x?xvector<4x3xindex>>, vector<5x48xi8>
// CHECK: vector.transfer_write
- %8 = vector.transfer_write %0, %arg0[%c3, %c3] {permutation_map = affine_map<(d0, d1)->(d0)>} : vector<128xf32>, tensor<?x?xf32>
+ %8 = vector.transfer_write %0, %arg0[%c3, %c3] {in_bounds = [false], permutation_map = affine_map<(d0, d1)->(d0)>} : vector<128xf32>, tensor<?x?xf32>
// CHECK: vector.transfer_write
- %9 = vector.transfer_write %1, %arg0[%c3, %c3] {permutation_map = affine_map<(d0, d1)->(d1, d0)>} : vector<3x7xf32>, tensor<?x?xf32>
- // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] : vector<1x1x4x3xf32>, tensor<?x?xvector<4x3xf32>>
- %10 = vector.transfer_write %4, %arg1[%c3, %c3] {permutation_map = affine_map<(d0, d1)->(d0, d1)>} : vector<1x1x4x3xf32>, tensor<?x?xvector<4x3xf32>>
- // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] : vector<1x1x4x3xf32>, tensor<?x?xvector<4x3xf32>>
+ %9 = vector.transfer_write %1, %arg0[%c3, %c3] {in_bounds = [false, false], permutation_map = affine_map<(d0, d1)->(d1, d0)>} : vector<3x7xf32>, tensor<?x?xf32>
+ // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] {{.*}} : vector<1x1x4x3xf32>, tensor<?x?xvector<4x3xf32>>
+ %10 = vector.transfer_write %4, %arg1[%c3, %c3] {in_bounds = [false, false], permutation_map = affine_map<(d0, d1)->(d0, d1)>} : vector<1x1x4x3xf32>, tensor<?x?xvector<4x3xf32>>
+ // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] {in_bounds = [false, false]} : vector<1x1x4x3xf32>, tensor<?x?xvector<4x3xf32>>
%11 = vector.transfer_write %5, %arg1[%c3, %c3] {in_bounds = [false, false]} : vector<1x1x4x3xf32>, tensor<?x?xvector<4x3xf32>>
- // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] : vector<5x24xi8>, tensor<?x?xvector<4x3xi32>>
- %12 = vector.transfer_write %6, %arg2[%c3, %c3] : vector<5x24xi8>, tensor<?x?xvector<4x3xi32>>
- // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] : vector<5x48xi8>, tensor<?x?xvector<4x3xindex>>
- %13 = vector.transfer_write %7, %arg3[%c3, %c3] : vector<5x48xi8>, tensor<?x?xvector<4x3xindex>>
+ // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] {{.*}} : vector<5x24xi8>, tensor<?x?xvector<4x3xi32>>
+ %12 = vector.transfer_write %6, %arg2[%c3, %c3] {in_bounds = []} : vector<5x24xi8>, tensor<?x?xvector<4x3xi32>>
+ // CHECK: vector.transfer_write %{{.*}}, %{{.*}}[%[[C3]], %[[C3]]] {{.*}} : vector<5x48xi8>, tensor<?x?xvector<4x3xindex>>
+ %13 = vector.transfer_write %7, %arg3[%c3, %c3] {in_bounds = []} : vector<5x48xi8>, tensor<?x?xvector<4x3xindex>>
return %8, %9, %10, %11, %12, %13 :
tensor<?x?xf32>, tensor<?x?xf32>, tensor<?x?xvector<4x3xf32>>,
@@ -986,21 +986,21 @@ func.func @vector_mask(%a: vector<8xi32>, %m0: vector<8xi1>) -> i32 {
func.func @vector_mask_passthru(%t0: tensor<?xf32>, %idx: index, %m0: vector<16xi1>, %pt0: vector<16xf32>) -> vector<16xf32> {
%ft0 = arith.constant 0.0 : f32
// CHECK: %{{.*}} = vector.mask %{{.*}}, %{{.*}} { vector.transfer_read %{{.*}}[%{{.*}}], %{{.*}} : tensor<?xf32>, vector<16xf32> } : vector<16xi1> -> vector<16xf32>
- %0 = vector.mask %m0, %pt0 { vector.transfer_read %t0[%idx], %ft0 : tensor<?xf32>, vector<16xf32> } : vector<16xi1> -> vector<16xf32>
+ %0 = vector.mask %m0, %pt0 { vector.transfer_read %t0[%idx], %ft0{ in_bounds = [false]} : tensor<?xf32>, vector<16xf32> } : vector<16xi1> -> vector<16xf32>
return %0 : vector<16xf32>
}
// CHECK-LABEL: func @vector_mask_no_return
func.func @vector_mask_no_return(%val: vector<16xf32>, %t0: memref<?xf32>, %idx: index, %m0: vector<16xi1>) {
-// CHECK-NEXT: vector.mask %{{.*}} { vector.transfer_write %{{.*}}, %{{.*}}[%{{.*}}] : vector<16xf32>, memref<?xf32> } : vector<16xi1>
- vector.mask %m0 { vector.transfer_write %val, %t0[%idx] : vector<16xf32>, memref<?xf32> } : vector<16xi1>
+// CHECK-NEXT: vector.mask %{{.*}} { vector.transfer_write %{{.*}}, %{{.*}}[%{{.*}}] {{.*}} : vector<16xf32>, memref<?xf32> } : vector<16xi1>
+ vector.mask %m0 { vector.transfer_write %val, %t0[%idx] {in_bounds = [false]} : vector<16xf32>, memref<?xf32> } : vector<16xi1>
return
}
// CHECK-LABEL: func @vector_mask_tensor_return
func.func @vector_mask_tensor_return(%val: vector<16xf32>, %t0: tensor<?xf32>, %idx: index, %m0: vector<16xi1>) {
-// CHECK-NEXT: vector.mask %{{.*}} { vector.transfer_write %{{.*}}, %{{.*}}[%{{.*}}] : vector<16xf32>, tensor<?xf32> } : vector<16xi1> -> tensor<?xf32>
- vector.mask %m0 { vector.transfer_write %val, %t0[%idx] : vector<16xf32>, tensor<?xf32> } : vector<16xi1> -> tensor<?xf32>
+// CHECK-NEXT: vector.mask %{{.*}} { vector.transfer_write %{{.*}}, %{{.*}}[%{{.*}}] {{.*}} : vector<16xf32>, tensor<?xf32> } : vector<16xi1> -> tensor<?xf32>
+ vector.mask %m0 { vector.transfer_write %val, %t0[%idx] {in_bounds = [false]} : vector<16xf32>, tensor<?xf32> } : vector<16xi1> -> tensor<?xf32>
return
}
@@ -1171,4 +1171,4 @@ func.func @from_elements(%a: f32, %b: f32) -> (vector<f32>, vector<1xf32>, vecto
// CHECK: vector.from_elements %[[b]], %[[b]], %[[a]], %[[a]] : vector<2x2xf32>
%3 = vector.from_elements %b, %b, %a, %a : vector<2x2xf32>
return %0, %1, %2, %3 : vector<f32>, vector<1xf32>, vector<1x2xf32>, vector<2x2xf32>
-}
\ No newline at end of file
+}
diff --git a/mlir/test/Dialect/Vector/scalar-vector-transfer-to-memref.mlir b/mlir/test/Dialect/Vector/scalar-vector-transfer-to-memref.mlir
index c5cb09b9aa9f9..256519959d9f2 100644
--- a/mlir/test/Dialect/Vector/scalar-vector-transfer-to-memref.mlir
+++ b/mlir/test/Dialect/Vector/scalar-vector-transfer-to-memref.mlir
@@ -7,7 +7,7 @@
// CHECK: return %[[r]]
func.func @transfer_read_0d(%m: memref<?x?x?xf32>, %idx: index) -> f32 {
%cst = arith.constant 0.0 : f32
- %0 = vector.transfer_read %m[%idx, %idx, %idx], %cst : memref<?x?x?xf32>, vector<f32>
+ %0 = vector.transfer_read %m[%idx, %idx, %idx], %cst {in_bounds = []} : memref<?x?x?xf32>, vector<f32>
%1 = vector.extractelement %0[] : vector<f32>
return %1 : f32
}
@@ -36,7 +36,7 @@ func.func @transfer_read_1d(%m: memref<?x?x?xf32>, %idx: index, %idx2: index) ->
// CHECK: return %[[r]]
func.func @tensor_transfer_read_0d(%t: tensor<?x?x?xf32>, %idx: index) -> f32 {
%cst = arith.constant 0.0 : f32
- %0 = vector.transfer_read %t[%idx, %idx, %idx], %cst : tensor<?x?x?xf32>, vector<f32>
+ %0 = vector.transfer_read %t[%idx, %idx, %idx], %cst {in_bounds = []} : tensor<?x?x?xf32>, vector<f32>
%1 = vector.extractelement %0[] : vector<f32>
return %1 : f32
}
@@ -50,7 +50,7 @@ func.func @tensor_transfer_read_0d(%t: tensor<?x?x?xf32>, %idx: index) -> f32 {
// CHECK: memref.store %[[extract]], %[[m]][%[[idx]], %[[idx]], %[[idx]]]
func.func @transfer_write_0d(%m: memref<?x?x?xf32>, %idx: index, %f: f32) {
%0 = vector.broadcast %f : f32 to vector<f32>
- vector.transfer_write %0, %m[%idx, %idx, %idx] : vector<f32>, memref<?x?x?xf32>
+ vector.transfer_write %0, %m[%idx, %idx, %idx] {in_bounds = []} : vector<f32>, memref<?x?x?xf32>
return
}
@@ -61,7 +61,7 @@ func.func @transfer_write_0d(%m: memref<?x?x?xf32>, %idx: index, %f: f32) {
// CHECK: memref.store %[[f]], %[[m]][%[[idx]], %[[idx]], %[[idx]]]
func.func @transfer_write_1d(%m: memref<?x?x?xf32>, %idx: index, %f: f32) {
%0 = vector.broadcast %f : f32 to vector<1xf32>
- vector.transfer_write %0, %m[%idx, %idx, %idx] : vector<1xf32>, memref<?x?x?xf32>
+ vector.transfer_write %0, %m[%idx, %idx, %idx] {in_bounds = [false]} : vector<1xf32>, memref<?x?x?xf32>
return
}
@@ -75,7 +75,7 @@ func.func @transfer_write_1d(%m: memref<?x?x?xf32>, %idx: index, %f: f32) {
// CHECK: return %[[r]]
func.func @tensor_transfer_write_0d(%t: tensor<?x?x?xf32>, %idx: index, %f: f32) -> tensor<?x?x?xf32> {
%0 = vector.broadcast %f : f32 to vector<f32>
- %1 = vector.transfer_write %0, %t[%idx, %idx, %idx] : vector<f32>, tensor<?x?x?xf32>
+ %1 = vector.transfer_write %0, %t[%idx, %idx, %idx] {in_bounds = []} : vector<f32>, tensor<?x?x?xf32>
return %1 : tensor<?x?x?xf32>
}
@@ -106,7 +106,7 @@ func.func @transfer_read_2d_extract(%m: memref<?x?x?x?xf32>, %idx: index, %idx2:
// CHECK: memref.store %[[extract]], %[[m]][%[[idx]], %[[idx]], %[[idx]]]
func.func @transfer_write_arith_constant(%m: memref<?x?x?xf32>, %idx: index) {
%cst = arith.constant dense<5.000000e+00> : vector<1x1xf32>
- vector.transfer_write %cst, %m[%idx, %idx, %idx] : vector<1x1xf32>, memref<?x?x?xf32>
+ vector.transfer_write %cst, %m[%idx, %idx, %idx] {in_bounds = [false, false]} : vector<1x1xf32>, memref<?x?x?xf32>
return
}
diff --git a/mlir/test/Dialect/Vector/value-bounds-op-interface-impl.mlir b/mlir/test/Dialect/Vector/value-bounds-op-interface-impl.mlir
index c04c82970f9c0..d1ed61ebe1447 100644
--- a/mlir/test/Dialect/Vector/value-bounds-op-interface-impl.mlir
+++ b/mlir/test/Dialect/Vector/value-bounds-op-interface-impl.mlir
@@ -7,7 +7,7 @@
// CHECK: %[[dim:.*]] = tensor.dim %[[t]], %[[c0]]
// CHECK: return %[[dim]]
func.func @vector_transfer_write(%t: tensor<?xf32>, %v: vector<5xf32>, %pos: index) -> index {
- %0 = vector.transfer_write %v, %t[%pos] : vector<5xf32>, tensor<?xf32>
+ %0 = vector.transfer_write %v, %t[%pos] {in_bounds = [false]} : vector<5xf32>, tensor<?xf32>
%1 = "test.reify_bound"(%0) {dim = 0} : (tensor<?xf32>) -> (index)
return %1 : index
}
diff --git a/mlir/test/Dialect/Vector/vector-emulate-narrow-type.mlir b/mlir/test/Dialect/Vector/vector-emulate-narrow-type.mlir
index cba299b2a1d95..477e645c3907e 100644
--- a/mlir/test/Dialect/Vector/vector-emulate-narrow-type.mlir
+++ b/mlir/test/Dialect/Vector/vector-emulate-narrow-type.mlir
@@ -96,7 +96,7 @@ func.func @vector_transfer_read_i4(%arg1: index, %arg2: index) -> vector<8xi4> {
// CHECK: %[[ALLOC:.+]] = memref.alloc() : memref<12xi8>
// CHECK: %[[PAD:.+]] = arith.extui %[[CONST]] : i4 to i8
// CHECK: %[[INDEX:.+]] = affine.apply #[[MAP]]()[%[[ARG0]], %[[ARG1]]]
-// CHECK: %[[VEC:.+]] = vector.transfer_read %[[ALLOC]][%[[INDEX]]], %[[PAD]] : memref<12xi8>, vector<4xi8>
+// CHECK: %[[VEC:.+]] = vector.transfer_read %[[ALLOC]][%[[INDEX]]], %[[PAD]] {{.*}} : memref<12xi8>, vector<4xi8>
// CHECK: %[[VEC_I4:.+]] = vector.bitcast %[[VEC]] : vector<4xi8> to vector<8xi4>
// CHECK32-DAG: #[[MAP:.+]] = affine_map<()[s0, s1] -> (s0 + s1 floordiv 8)>
@@ -106,7 +106,7 @@ func.func @vector_transfer_read_i4(%arg1: index, %arg2: index) -> vector<8xi4> {
// CHECK32: %[[ALLOC:.+]] = memref.alloc() : memref<3xi32>
// CHECK32: %[[PAD:.+]] = arith.extui %[[CONST]] : i4 to i32
// CHECK32: %[[INDEX:.+]] = affine.apply #[[MAP]]()[%[[ARG0]], %[[ARG1]]]
-// CHECK32: %[[VEC:.+]] = vector.transfer_read %[[ALLOC]][%[[INDEX]]], %[[PAD]] : memref<3xi32>, vector<1xi32>
+// CHECK32: %[[VEC:.+]] = vector.transfer_read %[[ALLOC]][%[[INDEX]]], %[[PAD]] {{.*}} : memref<3xi32>, vector<1xi32>
// CHECK32: %[[VEC_I4:.+]] = vector.bitcast %[[VEC]] : vector<1xi32> to vector<8xi4>
// -----
diff --git a/mlir/test/Dialect/Vector/vector-transfer-collapse-inner-most-dims.mlir b/mlir/test/Dialect/Vector/vector-transfer-collapse-inner-most-dims.mlir
index bd6845d1c7cda..56f05bc5c7257 100644
--- a/mlir/test/Dialect/Vector/vector-transfer-collapse-inner-most-dims.mlir
+++ b/mlir/test/Dialect/Vector/vector-transfer-collapse-inner-most-dims.mlir
@@ -116,7 +116,7 @@ func.func @contiguous_inner_most_outer_dim_dyn_scalable_inner_dim(%a: index, %b:
func.func @contiguous_inner_most_dim_non_zero_idx(%A: memref<16x1xf32>, %i:index) -> (vector<8x1xf32>) {
%c0 = arith.constant 0 : index
%f0 = arith.constant 0.0 : f32
- %1 = vector.transfer_read %A[%i, %c0], %f0 : memref<16x1xf32>, vector<8x1xf32>
+ %1 = vector.transfer_read %A[%i, %c0], %f0 {in_bounds = [false, false]} : memref<16x1xf32>, vector<8x1xf32>
return %1 : vector<8x1xf32>
}
// CHECK: func @contiguous_inner_most_dim_non_zero_idx(%[[SRC:.+]]: memref<16x1xf32>, %[[I:.+]]: index) -> vector<8x1xf32>
@@ -129,7 +129,7 @@ func.func @contiguous_inner_most_dim_non_zero_idx(%A: memref<16x1xf32>, %i:index
// The index to be dropped is != 0 - this is currently not supported.
func.func @negative_contiguous_inner_most_dim_non_zero_idxs(%A: memref<16x1xf32>, %i:index) -> (vector<8x1xf32>) {
%f0 = arith.constant 0.0 : f32
- %1 = vector.transfer_read %A[%i, %i], %f0 : memref<16x1xf32>, vector<8x1xf32>
+ %1 = vector.transfer_read %A[%i, %i], %f0 {in_bounds = [false, false]} : memref<16x1xf32>, vector<8x1xf32>
return %1 : vector<8x1xf32>
}
// CHECK-LABEL: func @negative_contiguous_inner_most_dim_non_zero_idxs
@@ -138,12 +138,12 @@ func.func @negative_contiguous_inner_most_dim_non_zero_idxs(%A: memref<16x1xf32>
// Same as the top example within this split, but with the outer vector
// dim scalable. Note that this example only makes sense when "8 = [8]" (i.e.
-// vscale = 1). This is assumed (implicitly) via the `in_bounds` attribute.
+// vscale = 1). This is assumed via the `in_bounds` attribute.
func.func @contiguous_inner_most_dim_non_zero_idx_scalable_inner_dim(%A: memref<16x1xf32>, %i:index) -> (vector<[8]x1xf32>) {
%c0 = arith.constant 0 : index
%f0 = arith.constant 0.0 : f32
- %1 = vector.transfer_read %A[%i, %c0], %f0 : memref<16x1xf32>, vector<[8]x1xf32>
+ %1 = vector.transfer_read %A[%i, %c0], %f0 {in_bounds = [true, true]} : memref<16x1xf32>, vector<[8]x1xf32>
return %1 : vector<[8]x1xf32>
}
// CHECK-LABEL: func @contiguous_inner_most_dim_non_zero_idx_scalable_inner_dim(
@@ -206,7 +206,7 @@ func.func @contiguous_inner_most_dim_with_subview_2d(%A: memref<1000x1x1xf32>, %
// Same as the top example within this split, but with the outer vector
// dim scalable. Note that this example only makes sense when "4 = [4]" (i.e.
-// vscale = 1). This is assumed (implicitly) via the `in_bounds` attribute.
+// vscale = 1). This is assumed via the `in_bounds` attribute.
func.func @contiguous_inner_most_dim_with_subview_2d_scalable_inner_dim(%A: memref<1000x1x1xf32>, %i:index, %ii:index) -> (vector<[4]x1x1xf32>) {
%c0 = arith.constant 0 : index
@@ -231,7 +231,7 @@ func.func @contiguous_inner_most_dim_with_subview_2d_scalable_inner_dim(%A: memr
func.func @negative_non_unit_inner_vec_dim(%arg0: memref<4x1xf32>) -> vector<4x8xf32> {
%c0 = arith.constant 0 : index
%cst = arith.constant 0.000000e+00 : f32
- %0 = vector.transfer_read %arg0[%c0, %c0], %cst : memref<4x1xf32>, vector<4x8xf32>
+ %0 = vector.transfer_read %arg0[%c0, %c0], %cst {in_bounds = [false, false]}: memref<4x1xf32>, vector<4x8xf32>
return %0 : vector<4x8xf32>
}
// CHECK: func.func @negative_non_unit_inner_vec_dim
@@ -243,7 +243,7 @@ func.func @negative_non_unit_inner_vec_dim(%arg0: memref<4x1xf32>) -> vector<4x8
func.func @negative_non_unit_inner_memref_dim(%arg0: memref<4x8xf32>) -> vector<4x1xf32> {
%c0 = arith.constant 0 : index
%cst = arith.constant 0.000000e+00 : f32
- %0 = vector.transfer_read %arg0[%c0, %c0], %cst : memref<4x8xf32>, vector<4x1xf32>
+ %0 = vector.transfer_read %arg0[%c0, %c0], %cst {in_bounds = [false, false]} : memref<4x8xf32>, vector<4x1xf32>
return %0 : vector<4x1xf32>
}
// CHECK: func.func @negative_non_unit_inner_memref_dim
diff --git a/mlir/test/Dialect/Vector/vector-transfer-drop-unit-dims-patterns.mlir b/mlir/test/Dialect/Vector/vector-transfer-drop-unit-dims-patterns.mlir
index e9d12b044e2c7..97c3179ccdced 100644
--- a/mlir/test/Dialect/Vector/vector-transfer-drop-unit-dims-patterns.mlir
+++ b/mlir/test/Dialect/Vector/vector-transfer-drop-unit-dims-patterns.mlir
@@ -4,7 +4,7 @@ func.func @transfer_read_rank_reducing(
%arg : memref<1x1x3x2xi8, strided<[6, 6, 2, 1], offset: ?>>) -> vector<3x2xi8> {
%c0 = arith.constant 0 : index
%cst = arith.constant 0 : i8
- %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst :
+ %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst {in_bounds=[false, false]} :
memref<1x1x3x2xi8, strided<[6, 6, 2, 1], offset: ?>>, vector<3x2xi8>
return %v : vector<3x2xi8>
}
@@ -16,7 +16,7 @@ func.func @transfer_read_rank_reducing(
func.func @transfer_write_rank_reducing(%arg : memref<1x1x3x2xi8, strided<[6, 6, 2, 1], offset: ?>>, %vec : vector<3x2xi8>) {
%c0 = arith.constant 0 : index
- vector.transfer_write %vec, %arg [%c0, %c0, %c0, %c0] :
+ vector.transfer_write %vec, %arg [%c0, %c0, %c0, %c0] {in_bounds = [false, false]}:
vector<3x2xi8>, memref<1x1x3x2xi8, strided<[6, 6, 2, 1], offset: ?>>
return
}
@@ -30,7 +30,7 @@ func.func @transfer_read_and_vector_rank_reducing(
%arg : memref<1x1x3x2x1xf32>) -> vector<3x2x1xf32> {
%c0 = arith.constant 0 : index
%cst = arith.constant 0.0 : f32
- %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0, %c0], %cst :
+ %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0, %c0], %cst {in_bounds=[false, false, false]} :
memref<1x1x3x2x1xf32>, vector<3x2x1xf32>
return %v : vector<3x2x1xf32>
}
@@ -44,7 +44,7 @@ func.func @transfer_write_and_vector_rank_reducing(
%arg : memref<1x1x3x2x1xf32>,
%vec : vector<3x2x1xf32>) {
%c0 = arith.constant 0 : index
- vector.transfer_write %vec, %arg [%c0, %c0, %c0, %c0, %c0] :
+ vector.transfer_write %vec, %arg [%c0, %c0, %c0, %c0, %c0] {in_bounds = [false, false, false]}:
vector<3x2x1xf32>, memref<1x1x3x2x1xf32>
return
}
@@ -58,7 +58,7 @@ func.func @transfer_read_and_vector_rank_reducing_to_0d(
%arg : memref<1x1x1x1x1xf32>) -> vector<1x1x1xf32> {
%c0 = arith.constant 0 : index
%cst = arith.constant 0.0 : f32
- %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0, %c0], %cst :
+ %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0, %c0], %cst {in_bounds=[false, false, false]} :
memref<1x1x1x1x1xf32>, vector<1x1x1xf32>
return %v : vector<1x1x1xf32>
}
@@ -72,7 +72,7 @@ func.func @transfer_write_and_vector_rank_reducing_to_0d(
%arg : memref<1x1x1x1x1xf32>,
%vec : vector<1x1x1xf32>) {
%c0 = arith.constant 0 : index
- vector.transfer_write %vec, %arg [%c0, %c0, %c0, %c0, %c0] :
+ vector.transfer_write %vec, %arg [%c0, %c0, %c0, %c0, %c0] {in_bounds = [false, false, false]} :
vector<1x1x1xf32>, memref<1x1x1x1x1xf32>
return
}
@@ -152,7 +152,7 @@ func.func @masked_transfer_write_and_vector_rank_reducing(
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%mask = vector.create_mask %c1, %mask_dim1, %c1, %mask_dim2, %c1 : vector<1x3x1x16x1xi1>
- vector.transfer_write %vec, %arg[%c0, %c0, %c0, %c0, %c0, %c0], %mask :
+ vector.transfer_write %vec, %arg[%c0, %c0, %c0, %c0, %c0, %c0], %mask {in_bounds = [false, false, false, false, false]} :
vector<1x3x1x16x1xf32>, memref<1x1x3x1x16x1xf32>
return
}
diff --git a/mlir/test/Dialect/Vector/vector-transfer-flatten.mlir b/mlir/test/Dialect/Vector/vector-transfer-flatten.mlir
index 3a5041fca53fc..4c57edad07d07 100644
--- a/mlir/test/Dialect/Vector/vector-transfer-flatten.mlir
+++ b/mlir/test/Dialect/Vector/vector-transfer-flatten.mlir
@@ -11,7 +11,7 @@ func.func @transfer_read_dims_match_contiguous(
%c0 = arith.constant 0 : index
%cst = arith.constant 0 : i8
- %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst :
+ %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst {in_bounds=[false, false, false, false]} :
memref<5x4x3x2xi8, strided<[24, 6, 2, 1], offset: ?>>, vector<5x4x3x2xi8>
return %v : vector<5x4x3x2xi8>
}
@@ -33,7 +33,7 @@ func.func @transfer_read_dims_match_contiguous_empty_stride(
%c0 = arith.constant 0 : index
%cst = arith.constant 0 : i8
- %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst :
+ %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst {in_bounds=[false, false, false, false]} :
memref<5x4x3x2xi8>, vector<5x4x3x2xi8>
return %v : vector<5x4x3x2xi8>
}
@@ -58,7 +58,7 @@ func.func @transfer_read_dims_mismatch_contiguous(
%c0 = arith.constant 0 : index
%cst = arith.constant 0 : i8
- %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst :
+ %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst {in_bounds=[false, false, false, false]} :
memref<5x4x3x2xi8, strided<[24, 6, 2, 1], offset: ?>>, vector<1x1x2x2xi8>
return %v : vector<1x1x2x2xi8>
}
@@ -163,7 +163,7 @@ func.func @transfer_read_dims_mismatch_non_contiguous_slice(
%c0 = arith.constant 0 : index
%cst = arith.constant 0 : i8
- %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst :
+ %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst {in_bounds=[false, false, false, false]} :
memref<5x4x3x2xi8>, vector<2x1x2x2xi8>
return %v : vector<2x1x2x2xi8>
}
@@ -181,7 +181,7 @@ func.func @transfer_read_0d(
%arg : memref<i8>) -> vector<i8> {
%cst = arith.constant 0 : i8
- %0 = vector.transfer_read %arg[], %cst : memref<i8>, vector<i8>
+ %0 = vector.transfer_read %arg[], %cst {in_bounds=[]} : memref<i8>, vector<i8>
return %0 : vector<i8>
}
@@ -202,7 +202,7 @@ func.func @transfer_read_non_contiguous_src(
%c0 = arith.constant 0 : index
%cst = arith.constant 0 : i8
- %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst :
+ %v = vector.transfer_read %arg[%c0, %c0, %c0, %c0], %cst {in_bounds=[false, false, false, false]} :
memref<5x4x3x2xi8, strided<[24, 8, 2, 1], offset: ?>>, vector<5x4x3x2xi8>
return %v : vector<5x4x3x2xi8>
}
@@ -227,7 +227,7 @@ func.func @transfer_write_dims_match_contiguous(
%vec : vector<5x4x3x2xi8>) {
%c0 = arith.constant 0 : index
- vector.transfer_write %vec, %arg [%c0, %c0, %c0, %c0] :
+ vector.transfer_write %vec, %arg[%c0, %c0, %c0, %c0] {in_bounds=[false, false, false, false]} :
vector<5x4x3x2xi8>, memref<5x4x3x2xi8, strided<[24, 6, 2, 1], offset: ?>>
return
}
@@ -249,7 +249,7 @@ func.func @transfer_write_dims_match_contiguous_empty_stride(
%vec : vector<5x4x3x2xi8>) {
%c0 = arith.constant 0 : index
- vector.transfer_write %vec, %arg [%c0, %c0, %c0, %c0] :
+ vector.transfer_write %vec, %arg[%c0, %c0, %c0, %c0] {in_bounds=[false, false, false, false]} :
vector<5x4x3x2xi8>, memref<5x4x3x2xi8>
return
}
@@ -271,7 +271,7 @@ func.func @transfer_write_dims_mismatch_contiguous(
%vec : vector<1x1x2x2xi8>) {
%c0 = arith.constant 0 : index
- vector.transfer_write %vec, %arg [%c0, %c0, %c0, %c0] :
+ vector.transfer_write %vec, %arg[%c0, %c0, %c0, %c0] {in_bounds=[false, false, false, false]} :
vector<1x1x2x2xi8>, memref<5x4x3x2xi8, strided<[24, 6, 2, 1], offset: ?>>
return
}
@@ -379,7 +379,7 @@ func.func @transfer_write_dims_mismatch_non_contiguous_slice(
%c0 = arith.constant 0 : index
%cst = arith.constant 0 : i8
- vector.transfer_write %vec, %arg[%c0, %c0, %c0, %c0] :
+ vector.transfer_write %vec, %arg[%c0, %c0, %c0, %c0] {in_bounds=[false, false, false, false]} :
vector<2x1x2x2xi8>, memref<5x4x3x2xi8>
return
}
@@ -397,7 +397,7 @@ func.func @transfer_write_0d(
%arg : memref<i8>,
%vec : vector<i8>) {
- vector.transfer_write %vec, %arg[] : vector<i8>, memref<i8>
+ vector.transfer_write %vec, %arg[] {in_bounds=[]} : vector<i8>, memref<i8>
return
}
@@ -418,7 +418,7 @@ func.func @transfer_write_non_contiguous_src(
%vec : vector<5x4x3x2xi8>) {
%c0 = arith.constant 0 : index
- vector.transfer_write %vec, %arg[%c0, %c0, %c0, %c0] :
+ vector.transfer_write %vec, %arg[%c0, %c0, %c0, %c0] {in_bounds=[false, false, false, false]} :
vector<5x4x3x2xi8>, memref<5x4x3x2xi8, strided<[24, 8, 2, 1], offset: ?>>
return
}
diff --git a/mlir/test/Dialect/Vector/vector-transfer-full-partial-split-copy-transform.mlir b/mlir/test/Dialect/Vector/vector-transfer-full-partial-split-copy-transform.mlir
index 483147c6f6a40..8e1b8e0be9155 100644
--- a/mlir/test/Dialect/Vector/vector-transfer-full-partial-split-copy-transform.mlir
+++ b/mlir/test/Dialect/Vector/vector-transfer-full-partial-split-copy-transform.mlir
@@ -47,7 +47,7 @@ func.func @split_vector_transfer_read_2d(%A: memref<?x8xf32>, %i: index, %j: ind
// CHECK: }
// CHECK: %[[res:.*]] = vector.transfer_read %[[ifres]]#0[%[[ifres]]#1, %[[ifres]]#2], %cst
// CHECK-SAME: {in_bounds = [true, true]} : memref<?x8xf32>, vector<4x8xf32>
- %1 = vector.transfer_read %A[%i, %j], %f0 : memref<?x8xf32>, vector<4x8xf32>
+ %1 = vector.transfer_read %A[%i, %j], %f0 {in_bounds = [false, false]} : memref<?x8xf32>, vector<4x8xf32>
// CHECK: return %[[res]] : vector<4x8xf32>
return %1: vector<4x8xf32>
@@ -100,7 +100,7 @@ func.func @split_vector_transfer_read_strided_2d(
// CHECK: }
// CHECK: %[[res:.*]] = vector.transfer_read {{.*}} {in_bounds = [true, true]} :
// CHECK-SAME: memref<?x8xf32, strided<[?, 1], offset: ?>>, vector<4x8xf32>
- %1 = vector.transfer_read %A[%i, %j], %f0 :
+ %1 = vector.transfer_read %A[%i, %j], %f0 {in_bounds = [false, false]} :
memref<7x8xf32, strided<[?, 1], offset: ?>>, vector<4x8xf32>
return %1 : vector<4x8xf32>
@@ -119,7 +119,7 @@ module attributes {transform.with_named_sequence} {
// -----
func.func @split_vector_transfer_write_2d(%V: vector<4x8xf32>, %A: memref<?x8xf32>, %i: index, %j: index) {
- vector.transfer_write %V, %A[%i, %j] :
+ vector.transfer_write %V, %A[%i, %j] {in_bounds = [false, false]} :
vector<4x8xf32>, memref<?x8xf32>
return
}
@@ -185,7 +185,7 @@ module attributes {transform.with_named_sequence} {
func.func @split_vector_transfer_write_strided_2d(
%V: vector<4x8xf32>, %A: memref<7x8xf32, strided<[?, 1], offset: ?>>,
%i: index, %j: index) {
- vector.transfer_write %V, %A[%i, %j] :
+ vector.transfer_write %V, %A[%i, %j] {in_bounds = [false, false]} :
vector<4x8xf32>, memref<7x8xf32, strided<[?, 1], offset: ?>>
return
}
diff --git a/mlir/test/Dialect/Vector/vector-transfer-full-partial-split.mlir b/mlir/test/Dialect/Vector/vector-transfer-full-partial-split.mlir
index a9c7bf8e8b327..8e2e66e34754e 100644
--- a/mlir/test/Dialect/Vector/vector-transfer-full-partial-split.mlir
+++ b/mlir/test/Dialect/Vector/vector-transfer-full-partial-split.mlir
@@ -31,7 +31,7 @@ func.func @split_vector_transfer_read_2d(%A: memref<?x8xf32>, %i: index, %j: ind
// CHECK: scf.yield %[[A]], %[[i]], %[[j]] : memref<?x8xf32>, index, index
// CHECK: } else {
// slow path, fill tmp alloc and yield a memref_casted version of it
- // CHECK: %[[slow:.*]] = vector.transfer_read %[[A]][%[[i]], %[[j]]], %cst :
+ // CHECK: %[[slow:.*]] = vector.transfer_read %[[A]][%[[i]], %[[j]]], %cst {{.*}} :
// CHECK-SAME: memref<?x8xf32>, vector<4x8xf32>
// CHECK: %[[cast_alloc:.*]] = vector.type_cast %[[alloc]] :
// CHECK-SAME: memref<4x8xf32> to memref<vector<4x8xf32>>
@@ -44,7 +44,7 @@ func.func @split_vector_transfer_read_2d(%A: memref<?x8xf32>, %i: index, %j: ind
// CHECK: %[[res:.*]] = vector.transfer_read %[[ifres]]#0[%[[ifres]]#1, %[[ifres]]#2], %cst
// CHECK-SAME: {in_bounds = [true, true]} : memref<?x8xf32>, vector<4x8xf32>
- %1 = vector.transfer_read %A[%i, %j], %f0 : memref<?x8xf32>, vector<4x8xf32>
+ %1 = vector.transfer_read %A[%i, %j], %f0 {in_bounds = [false, false]}: memref<?x8xf32>, vector<4x8xf32>
return %1: vector<4x8xf32>
}
@@ -81,7 +81,7 @@ func.func @split_vector_transfer_read_strided_2d(
// CHECK-SAME: memref<?x8xf32, strided<[?, 1], offset: ?>>, index, index
// CHECK: } else {
// slow path, fill tmp alloc and yield a memref_casted version of it
- // CHECK: %[[slow:.*]] = vector.transfer_read %[[A]][%[[i]], %[[j]]], %cst :
+ // CHECK: %[[slow:.*]] = vector.transfer_read %[[A]][%[[i]], %[[j]]], %cst {{.*}} :
// CHECK-SAME: memref<7x8xf32, strided<[?, 1], offset: ?>>, vector<4x8xf32>
// CHECK: %[[cast_alloc:.*]] = vector.type_cast %[[alloc]] :
// CHECK-SAME: memref<4x8xf32> to memref<vector<4x8xf32>>
@@ -94,7 +94,7 @@ func.func @split_vector_transfer_read_strided_2d(
// CHECK: }
// CHECK: %[[res:.*]] = vector.transfer_read {{.*}} {in_bounds = [true, true]} :
// CHECK-SAME: memref<?x8xf32, strided<[?, 1], offset: ?>>, vector<4x8xf32>
- %1 = vector.transfer_read %A[%i, %j], %f0 :
+ %1 = vector.transfer_read %A[%i, %j], %f0 {in_bounds = [false, false]} :
memref<7x8xf32, strided<[?, 1], offset: ?>>, vector<4x8xf32>
// CHECK: return %[[res]] : vector<4x8xf32>
@@ -114,7 +114,7 @@ func.func @split_vector_transfer_read_mem_space(%A: memref<?x8xf32, 3>, %i: inde
// CHECK: scf.yield %[[cast]], {{.*}} : memref<?x8xf32, strided<[8, 1]>>, index, index
// CHECK: } else {
// slow path, fill tmp alloc and yield a memref_casted version of it
- // CHECK: %[[slow:.*]] = vector.transfer_read %[[A]][%[[i]], %[[j]]], %cst :
+ // CHECK: %[[slow:.*]] = vector.transfer_read %[[A]][%[[i]], %[[j]]], %cst {{.*}} :
// CHECK-SAME: memref<?x8xf32, 3>, vector<4x8xf32>
// CHECK: %[[cast_alloc:.*]] = vector.type_cast %[[alloc]] :
// CHECK-SAME: memref<4x8xf32> to memref<vector<4x8xf32>>
@@ -127,7 +127,7 @@ func.func @split_vector_transfer_read_mem_space(%A: memref<?x8xf32, 3>, %i: inde
// CHECK: %[[res:.*]] = vector.transfer_read %[[ifres]]#0[%[[ifres]]#1, %[[ifres]]#2], %cst
// CHECK-SAME: {in_bounds = [true, true]} : memref<?x8xf32, strided<[8, 1]>>, vector<4x8xf32>
- %1 = vector.transfer_read %A[%i, %j], %f0 : memref<?x8xf32, 3>, vector<4x8xf32>
+ %1 = vector.transfer_read %A[%i, %j], %f0 {in_bounds = [false, false]} : memref<?x8xf32, 3>, vector<4x8xf32>
return %1: vector<4x8xf32>
}
@@ -145,7 +145,7 @@ module attributes {transform.with_named_sequence} {
// -----
func.func @split_vector_transfer_write_2d(%V: vector<4x8xf32>, %A: memref<?x8xf32>, %i: index, %j: index) {
- vector.transfer_write %V, %A[%i, %j] :
+ vector.transfer_write %V, %A[%i, %j] {in_bounds = [false, false]} :
vector<4x8xf32>, memref<?x8xf32>
return
}
@@ -208,7 +208,7 @@ module attributes {transform.with_named_sequence} {
func.func @split_vector_transfer_write_strided_2d(
%V: vector<4x8xf32>, %A: memref<7x8xf32, strided<[?, 1], offset: ?>>,
%i: index, %j: index) {
- vector.transfer_write %V, %A[%i, %j] :
+ vector.transfer_write %V, %A[%i, %j] {in_bounds = [false, false]} :
vector<4x8xf32>, memref<7x8xf32, strided<[?, 1], offset: ?>>
return
}
@@ -271,7 +271,7 @@ module attributes {transform.with_named_sequence} {
// -----
func.func @split_vector_transfer_write_mem_space(%V: vector<4x8xf32>, %A: memref<?x8xf32, 3>, %i: index, %j: index) {
- vector.transfer_write %V, %A[%i, %j] :
+ vector.transfer_write %V, %A[%i, %j] {in_bounds = [false, false]} :
vector<4x8xf32>, memref<?x8xf32, 3>
return
}
@@ -317,7 +317,7 @@ func.func @transfer_read_within_async_execute(%A : memref<?x?xf32>) -> !async.to
// CHECK: async.execute
// CHECK: alloca
%token = async.execute {
- %0 = vector.transfer_read %A[%c0, %c0], %f0 : memref<?x?xf32>, vector<2x2xf32>
+ %0 = vector.transfer_read %A[%c0, %c0], %f0 {in_bounds = [false, false]} : memref<?x?xf32>, vector<2x2xf32>
func.call @fake_side_effecting_fun(%0) : (vector<2x2xf32>) -> ()
async.yield
}
@@ -334,7 +334,7 @@ func.func @transfer_read_within_scf_for(%A : memref<?x?xf32>, %lb : index, %ub :
// CHECK: scf.for
// CHECK-NOT: memref.alloca
scf.for %i = %lb to %ub step %step {
- %0 = vector.transfer_read %A[%c0, %c0], %f0 : memref<?x?xf32>, vector<2x2xf32>
+ %0 = vector.transfer_read %A[%c0, %c0], %f0 {in_bounds = [false, false]} : memref<?x?xf32>, vector<2x2xf32>
func.call @fake_side_effecting_fun(%0) : (vector<2x2xf32>) -> ()
}
return
diff --git a/mlir/test/Dialect/Vector/vector-transfer-permutation-lowering.mlir b/mlir/test/Dialect/Vector/vector-transfer-permutation-lowering.mlir
index 35418b38df9b2..cfee767dee58c 100644
--- a/mlir/test/Dialect/Vector/vector-transfer-permutation-lowering.mlir
+++ b/mlir/test/Dialect/Vector/vector-transfer-permutation-lowering.mlir
@@ -128,7 +128,7 @@ func.func @permutation_with_mask_xfer_write_scalable(%arg0: vector<4x[8]xi16>, %
// CHECK-NOT: vector.transpose
// CHECK: %[[RES:.*]] = vector.mask %[[MASK]] { vector.transfer_write %[[ARG_1]], %[[ARG_0]]{{.*}} vector<16xf32>, tensor<?x?xf32> } : vector<16xi1> -> tensor<?x?xf32>
func.func @masked_permutation_xfer_write_fixed_width(%t: tensor<?x?xf32>, %val: vector<16xf32>, %idx: index, %mask: vector<16xi1>) -> tensor<?x?xf32> {
- %r = vector.mask %mask { vector.transfer_write %val, %t[%idx, %idx] {permutation_map = affine_map<(d0, d1) -> (d0)>} : vector<16xf32>, tensor<?x?xf32> } : vector<16xi1> -> tensor<?x?xf32>
+ %r = vector.mask %mask { vector.transfer_write %val, %t[%idx, %idx] {in_bounds = [false], permutation_map = affine_map<(d0, d1) -> (d0)>} : vector<16xf32>, tensor<?x?xf32> } : vector<16xi1> -> tensor<?x?xf32>
return %r : tensor<?x?xf32>
}
@@ -228,7 +228,7 @@ func.func @permutation_with_mask_xfer_read_scalable(%mem: memref<?x?xf32>, %dim_
func.func @masked_permutation_xfer_read_fixed_width(%arg0: tensor<?x1xf32>, %mask : vector<4x1xi1>) {
%cst = arith.constant 0.000000e+00 : f32
%c0 = arith.constant 0 : index
- %3 = vector.mask %mask { vector.transfer_read %arg0[%c0, %c0], %cst {permutation_map = affine_map<(d0, d1) -> (d1, 0, d0)>} : tensor<?x1xf32>, vector<1x4x4xf32> } : vector<4x1xi1> -> vector<1x4x4xf32>
+ %3 = vector.mask %mask { vector.transfer_read %arg0[%c0, %c0], %cst {in_bounds = [false, true, false], permutation_map = affine_map<(d0, d1) -> (d1, 0, d0)>} : tensor<?x1xf32>, vector<1x4x4xf32> } : vector<4x1xi1> -> vector<1x4x4xf32>
call @test.some_use(%3) : (vector<1x4x4xf32>) -> ()
return
}
diff --git a/mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir b/mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir
index d169e6d5878e2..5f1db9be09371 100644
--- a/mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir
+++ b/mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir
@@ -8,11 +8,11 @@ func.func @vector_transfer_ops_0d_memref(%M: memref<f32>, %v: vector<1x1x1xf32>)
// CHECK-NEXT: %[[s:.*]] = memref.load %[[MEM]][] : memref<f32>
// CHECK-NEXT: %[[V:.*]] = vector.broadcast %[[s]] : f32 to vector<f32>
- %0 = vector.transfer_read %M[], %f0 : memref<f32>, vector<f32>
+ %0 = vector.transfer_read %M[], %f0 {in_bounds = []} : memref<f32>, vector<f32>
// CHECK-NEXT: %[[ss:.*]] = vector.extractelement %[[V]][] : vector<f32>
// CHECK-NEXT: memref.store %[[ss]], %[[MEM]][] : memref<f32>
- vector.transfer_write %0, %M[] : vector<f32>, memref<f32>
+ vector.transfer_write %0, %M[] {in_bounds = []} : vector<f32>, memref<f32>
// CHECK-NEXT: %[[VV:.*]] = vector.extract %arg1[0, 0, 0] : f32 from vector<1x1x1xf32>
// CHECK-NEXT: memref.store %[[VV]], %[[MEM]][] : memref<f32>
@@ -28,7 +28,7 @@ func.func @vector_transfer_ops_0d_tensor(%M: tensor<f32>) -> vector<1xf32> {
// CHECK-NEXT: %[[S:.*]] = tensor.extract %[[SOURCE]][] : tensor<f32>
// CHECK-NEXT: %[[V:.*]] = vector.broadcast %[[S]] : f32 to vector<1xf32>
- %0 = vector.transfer_read %M[], %f0 {permutation_map = affine_map<()->(0)>} :
+ %0 = vector.transfer_read %M[], %f0 {in_bounds = [true], permutation_map = affine_map<()->(0)>} :
tensor<f32>, vector<1xf32>
// CHECK-NEXT: return %[[V]]
@@ -95,8 +95,8 @@ func.func @transfer_2D(%mem : memref<8x8xf32>, %i : index) -> vector<2x4xf32> {
func.func @transfer_vector_element(%mem : memref<8x8xvector<2x4xf32>>, %i : index) -> vector<2x4xf32> {
%cf0 = arith.constant dense<0.0> : vector<2x4xf32>
- %res = vector.transfer_read %mem[%i, %i], %cf0 : memref<8x8xvector<2x4xf32>>, vector<2x4xf32>
- vector.transfer_write %res, %mem[%i, %i] : vector<2x4xf32>, memref<8x8xvector<2x4xf32>>
+ %res = vector.transfer_read %mem[%i, %i], %cf0 {in_bounds = []} : memref<8x8xvector<2x4xf32>>, vector<2x4xf32>
+ vector.transfer_write %res, %mem[%i, %i] {in_bounds = []} : vector<2x4xf32>, memref<8x8xvector<2x4xf32>>
return %res : vector<2x4xf32>
}
@@ -142,15 +142,15 @@ func.func @transfer_2D_not_inbounds(%mem : memref<8x8xf32>, %i : index) -> vecto
// CHECK-SAME: %[[MEM:.*]]: memref<8x8xf32>,
// CHECK-SAME: %[[IDX:.*]]: index) -> vector<4xf32> {
// CHECK-NEXT: %[[CF0:.*]] = arith.constant 0.000000e+00 : f32
-// CHECK-NEXT: %[[RES:.*]] = vector.transfer_read %[[MEM]][%[[IDX]], %[[IDX]]], %[[CF0]] : memref<8x8xf32>, vector<4xf32>
-// CHECK-NEXT: vector.transfer_write %[[RES]], %[[MEM]][%[[IDX]], %[[IDX]]] : vector<4xf32>, memref<8x8xf32>
+// CHECK-NEXT: %[[RES:.*]] = vector.transfer_read %[[MEM]][%[[IDX]], %[[IDX]]], %[[CF0]] {{.*}} : memref<8x8xf32>, vector<4xf32>
+// CHECK-NEXT: vector.transfer_write %[[RES]], %[[MEM]][%[[IDX]], %[[IDX]]] {{.*}} : vector<4xf32>, memref<8x8xf32>
// CHECK-NEXT: return %[[RES]] : vector<4xf32>
// CHECK-NEXT: }
func.func @transfer_not_inbounds(%mem : memref<8x8xf32>, %i : index) -> vector<4xf32> {
%cf0 = arith.constant 0.0 : f32
- %res = vector.transfer_read %mem[%i, %i], %cf0 : memref<8x8xf32>, vector<4xf32>
- vector.transfer_write %res, %mem[%i, %i] : vector<4xf32>, memref<8x8xf32>
+ %res = vector.transfer_read %mem[%i, %i], %cf0 {in_bounds = [false]} : memref<8x8xf32>, vector<4xf32>
+ vector.transfer_write %res, %mem[%i, %i] {in_bounds = [false]} : vector<4xf32>, memref<8x8xf32>
return %res : vector<4xf32>
}
@@ -296,8 +296,8 @@ func.func @transfer_read_permutations(%arg0 : memref<?x?xf32>, %arg1 : memref<?x
// CHECK: %[[MASK1:.*]] = vector.splat %{{.*}} : vector<16x14xi1>
%mask1 = vector.splat %m : vector<16x14xi1>
- %1 = vector.transfer_read %arg1[%c0, %c0, %c0, %c0], %cst, %mask1 {permutation_map = #map1} : memref<?x?x?x?xf32>, vector<7x14x8x16xf32>
-// CHECK: vector.transfer_read {{.*}} %[[MASK1]] {permutation_map = #[[$MAP0]]} : memref<?x?x?x?xf32>, vector<16x14x7x8xf32>
+ %1 = vector.transfer_read %arg1[%c0, %c0, %c0, %c0], %cst, %mask1 {in_bounds = [true, false, true, false], permutation_map = #map1} : memref<?x?x?x?xf32>, vector<7x14x8x16xf32>
+// CHECK: vector.transfer_read {{.*}} %[[MASK1]] {{{.*}}, permutation_map = #[[$MAP0]]} : memref<?x?x?x?xf32>, vector<16x14x7x8xf32>
// CHECK: vector.transpose %{{.*}}, [2, 1, 3, 0] : vector<16x14x7x8xf32> to vector<7x14x8x16xf32>
// CHECK: %[[MASK3:.*]] = vector.splat %{{.*}} : vector<14x7xi1>
@@ -307,21 +307,21 @@ func.func @transfer_read_permutations(%arg0 : memref<?x?xf32>, %arg1 : memref<?x
// CHECK: vector.broadcast %{{.*}} : vector<14x16x7xf32> to vector<8x14x16x7xf32>
// CHECK: vector.transpose %{{.*}}, [3, 1, 0, 2] : vector<8x14x16x7xf32> to vector<7x14x8x16xf32>
- %3 = vector.transfer_read %arg0[%c0, %c0], %cst {permutation_map = #map3} : memref<?x?xf32>, vector<7x14x8x16xf32>
-// CHECK: vector.transfer_read %{{.*}}[%[[C0]], %[[C0]]], %[[CF0]] : memref<?x?xf32>, vector<14x7xf32>
+ %3 = vector.transfer_read %arg0[%c0, %c0], %cst {in_bounds = [false, false, true, true], permutation_map = #map3} : memref<?x?xf32>, vector<7x14x8x16xf32>
+// CHECK: vector.transfer_read %{{.*}}[%[[C0]], %[[C0]]], %[[CF0]] {{.*}} : memref<?x?xf32>, vector<14x7xf32>
// CHECK: vector.broadcast %{{.*}} : vector<14x7xf32> to vector<8x16x14x7xf32>
// CHECK: vector.transpose %{{.*}}, [3, 2, 0, 1] : vector<8x16x14x7xf32> to vector<7x14x8x16xf32>
- %4 = vector.transfer_read %arg0[%c0, %c0], %cst {permutation_map = #map4} : memref<?x?xf32>, vector<7x14x8x16xf32>
-// CHECK: vector.transfer_read %{{.*}}[%[[C0]], %[[C0]]], %[[CF0]] : memref<?x?xf32>, vector<16x14xf32>
+ %4 = vector.transfer_read %arg0[%c0, %c0], %cst {in_bounds = [true, false, true, false], permutation_map = #map4} : memref<?x?xf32>, vector<7x14x8x16xf32>
+// CHECK: vector.transfer_read %{{.*}}[%[[C0]], %[[C0]]], %[[CF0]] {{.*}} : memref<?x?xf32>, vector<16x14xf32>
// CHECK: vector.broadcast %{{.*}} : vector<16x14xf32> to vector<7x8x16x14xf32>
// CHECK: vector.transpose %{{.*}}, [0, 3, 1, 2] : vector<7x8x16x14xf32> to vector<7x14x8x16xf32>
- %5 = vector.transfer_read %arg1[%c0, %c0, %c0, %c0], %cst {permutation_map = #map5} : memref<?x?x?x?xf32>, vector<7x14x8x16xf32>
-// CHECK: vector.transfer_read %{{.*}}[%[[C0]], %[[C0]], %[[C0]], %[[C0]]], %[[CF0]] : memref<?x?x?x?xf32>, vector<16x14x7x8xf32>
+ %5 = vector.transfer_read %arg1[%c0, %c0, %c0, %c0], %cst {in_bounds = [false, false, false, false], permutation_map = #map5} : memref<?x?x?x?xf32>, vector<7x14x8x16xf32>
+// CHECK: vector.transfer_read %{{.*}}[%[[C0]], %[[C0]], %[[C0]], %[[C0]]], %[[CF0]] {{.*}} : memref<?x?x?x?xf32>, vector<16x14x7x8xf32>
// CHECK: vector.transpose %{{.*}}, [2, 1, 3, 0] : vector<16x14x7x8xf32> to vector<7x14x8x16xf32>
- %6 = vector.transfer_read %arg0[%c0, %c0], %cst {permutation_map = #map6} : memref<?x?xf32>, vector<8xf32>
+ %6 = vector.transfer_read %arg0[%c0, %c0], %cst {in_bounds = [true], permutation_map = #map6} : memref<?x?xf32>, vector<8xf32>
// CHECK: memref.load %{{.*}}[%[[C0]], %[[C0]]] : memref<?x?xf32>
// CHECK: vector.broadcast %{{.*}} : f32 to vector<8xf32>
@@ -348,9 +348,9 @@ func.func @transfer_write_permutations(
// CHECK: %[[NEW_VEC0:.*]] = vector.transpose %{{.*}} [3, 1, 0, 2] : vector<7x14x8x16xf32> to vector<16x14x7x8xf32>
// CHECK: %[[NEW_RES0:.*]] = vector.transfer_write %[[NEW_VEC0]], %[[ARG1]][%c0, %c0, %c0, %c0], %[[MASK]] {in_bounds = [true, false, true, false]} : vector<16x14x7x8xf32>, tensor<?x?x?x?xf32>
- vector.transfer_write %v2, %arg0[%c0, %c0, %c0, %c0] {permutation_map = affine_map<(d0, d1, d2, d3) -> (d3, d2)>} : vector<8x16xf32>, memref<?x?x?x?xf32>
+ vector.transfer_write %v2, %arg0[%c0, %c0, %c0, %c0] {in_bounds = [false, false], permutation_map = affine_map<(d0, d1, d2, d3) -> (d3, d2)>} : vector<8x16xf32>, memref<?x?x?x?xf32>
// CHECK: %[[NEW_VEC1:.*]] = vector.transpose %{{.*}} [1, 0] : vector<8x16xf32> to vector<16x8xf32>
- // CHECK: vector.transfer_write %[[NEW_VEC1]], %[[ARG0]][%c0, %c0, %c0, %c0] : vector<16x8xf32>, memref<?x?x?x?xf32>
+ // CHECK: vector.transfer_write %[[NEW_VEC1]], %[[ARG0]][%c0, %c0, %c0, %c0] {{.*}} : vector<16x8xf32>, memref<?x?x?x?xf32>
return %0 : tensor<?x?x?x?xf32>
}
@@ -372,7 +372,7 @@ func.func @transfer_write_broadcast_unit_dim(
// CHECK: %[[NEW_VEC1:.*]] = vector.transpose %[[NEW_VEC0]], [1, 2, 0, 3] : vector<1x14x8x16xf32> to vector<14x8x1x16xf32>
// CHECK: %[[NEW_RES0:.*]] = vector.transfer_write %[[NEW_VEC1]], %[[ARG1]][%[[C0]], %[[C0]], %[[C0]], %[[C0]]] {in_bounds = [false, false, true, true]} : vector<14x8x1x16xf32>, tensor<?x?x?x?xf32>
- vector.transfer_write %v2, %arg0[%c0, %c0, %c0, %c0] {permutation_map = affine_map<(d0, d1, d2, d3) -> (d1, d2)>} : vector<8x16xf32>, memref<?x?x?x?xf32>
+ vector.transfer_write %v2, %arg0[%c0, %c0, %c0, %c0] {in_bounds = [false, false], permutation_map = affine_map<(d0, d1, d2, d3) -> (d1, d2)>} : vector<8x16xf32>, memref<?x?x?x?xf32>
// CHECK: %[[NEW_VEC2:.*]] = vector.broadcast %{{.*}} : vector<8x16xf32> to vector<1x8x16xf32>
// CHECK: %[[NEW_VEC3:.*]] = vector.transpose %[[NEW_VEC2]], [1, 2, 0] : vector<1x8x16xf32> to vector<8x16x1xf32>
// CHECK: vector.transfer_write %[[NEW_VEC3]], %[[ARG0]][%[[C0]], %[[C0]], %[[C0]], %[[C0]]] {in_bounds = [false, false, true]} : vector<8x16x1xf32>, memref<?x?x?x?xf32>
diff --git a/mlir/test/Dialect/Vector/vector-transfer-unroll.mlir b/mlir/test/Dialect/Vector/vector-transfer-unroll.mlir
index 578d845a27ad4..734da4375e00e 100644
--- a/mlir/test/Dialect/Vector/vector-transfer-unroll.mlir
+++ b/mlir/test/Dialect/Vector/vector-transfer-unroll.mlir
@@ -30,7 +30,7 @@
func.func @transfer_read_unroll(%arg0 : memref<4x4xf32>) -> vector<4x4xf32> {
%c0 = arith.constant 0 : index
%cf0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 : memref<4x4xf32>, vector<4x4xf32>
+ %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 {in_bounds = [false, false]} : memref<4x4xf32>, vector<4x4xf32>
return %0 : vector<4x4xf32>
}
@@ -62,7 +62,7 @@ func.func @transfer_read_unroll(%arg0 : memref<4x4xf32>) -> vector<4x4xf32> {
func.func @transfer_write_unroll(%arg0 : memref<4x4xf32>, %arg1 : vector<4x4xf32>) {
%c0 = arith.constant 0 : index
- vector.transfer_write %arg1, %arg0[%c0, %c0] : vector<4x4xf32>, memref<4x4xf32>
+ vector.transfer_write %arg1, %arg0[%c0, %c0] {in_bounds = [false, false]} : vector<4x4xf32>, memref<4x4xf32>
return
}
@@ -82,8 +82,8 @@ func.func @transfer_write_unroll(%arg0 : memref<4x4xf32>, %arg1 : vector<4x4xf32
func.func @transfer_readwrite_unroll(%arg0 : memref<4x4xf32>) {
%c0 = arith.constant 0 : index
%cf0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 : memref<4x4xf32>, vector<4x4xf32>
- vector.transfer_write %0, %arg0[%c0, %c0] : vector<4x4xf32>, memref<4x4xf32>
+ %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 {in_bounds = [false, false]} : memref<4x4xf32>, vector<4x4xf32>
+ vector.transfer_write %0, %arg0[%c0, %c0] {in_bounds = [false, false]} : vector<4x4xf32>, memref<4x4xf32>
return
}
@@ -103,7 +103,7 @@ func.func @transfer_readwrite_unroll(%arg0 : memref<4x4xf32>) {
func.func @transfer_read_unroll_tensor(%arg0 : tensor<4x4xf32>) -> vector<4x4xf32> {
%c0 = arith.constant 0 : index
%cf0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 : tensor<4x4xf32>, vector<4x4xf32>
+ %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 {in_bounds = [false, false]} : tensor<4x4xf32>, vector<4x4xf32>
return %0 : vector<4x4xf32>
}
@@ -123,7 +123,7 @@ func.func @transfer_read_unroll_tensor(%arg0 : tensor<4x4xf32>) -> vector<4x4xf3
func.func @transfer_write_unroll_tensor(%arg0 : tensor<4x4xf32>,
%arg1 : vector<4x4xf32>) -> tensor<4x4xf32> {
%c0 = arith.constant 0 : index
- %r = vector.transfer_write %arg1, %arg0[%c0, %c0] :
+ %r = vector.transfer_write %arg1, %arg0[%c0, %c0] {in_bounds = [false, false]} :
vector<4x4xf32>, tensor<4x4xf32>
return %r: tensor<4x4xf32>
}
@@ -145,8 +145,8 @@ func.func @transfer_readwrite_unroll_tensor(%arg0 : tensor<4x4xf32>, %arg1 : ten
tensor<4x4xf32> {
%c0 = arith.constant 0 : index
%cf0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 : tensor<4x4xf32>, vector<4x4xf32>
- %r = vector.transfer_write %0, %arg1[%c0, %c0] : vector<4x4xf32>, tensor<4x4xf32>
+ %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 {in_bounds = [false, false]} : tensor<4x4xf32>, vector<4x4xf32>
+ %r = vector.transfer_write %0, %arg1[%c0, %c0] {in_bounds = [false, false]} : vector<4x4xf32>, tensor<4x4xf32>
return %r: tensor<4x4xf32>
}
@@ -173,7 +173,7 @@ func.func @transfer_readwrite_unroll_tensor(%arg0 : tensor<4x4xf32>, %arg1 : ten
func.func @transfer_read_unroll_permutation(%arg0 : memref<6x4xf32>) -> vector<4x6xf32> {
%c0 = arith.constant 0 : index
%cf0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 {permutation_map = #map0} : memref<6x4xf32>, vector<4x6xf32>
+ %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 {in_bounds = [false, false], permutation_map = #map0} : memref<6x4xf32>, vector<4x6xf32>
return %0 : vector<4x6xf32>
}
@@ -199,7 +199,7 @@ func.func @transfer_read_unroll_permutation(%arg0 : memref<6x4xf32>) -> vector<4
func.func @transfer_read_unroll_broadcast(%arg0 : memref<6x4xf32>) -> vector<6x4xf32> {
%c0 = arith.constant 0 : index
%cf0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 {permutation_map = #map0} : memref<6x4xf32>, vector<6x4xf32>
+ %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 {in_bounds = [true, false], permutation_map = #map0} : memref<6x4xf32>, vector<6x4xf32>
return %0 : vector<6x4xf32>
}
@@ -226,7 +226,7 @@ func.func @transfer_read_unroll_broadcast(%arg0 : memref<6x4xf32>) -> vector<6x4
func.func @transfer_read_unroll_broadcast_permuation(%arg0 : memref<6x4xf32>) -> vector<4x6xf32> {
%c0 = arith.constant 0 : index
%cf0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 {permutation_map = #map0} : memref<6x4xf32>, vector<4x6xf32>
+ %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 {in_bounds = [true, false], permutation_map = #map0} : memref<6x4xf32>, vector<4x6xf32>
return %0 : vector<4x6xf32>
}
@@ -272,7 +272,7 @@ func.func @transfer_read_unroll_broadcast_permuation(%arg0 : memref<6x4xf32>) ->
func.func @transfer_read_unroll_different_rank(%arg0 : memref<?x?x?xf32>) -> vector<6x4xf32> {
%c0 = arith.constant 0 : index
%cf0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg0[%c0, %c0, %c0], %cf0 {permutation_map = #map0} : memref<?x?x?xf32>, vector<6x4xf32>
+ %0 = vector.transfer_read %arg0[%c0, %c0, %c0], %cf0 {in_bounds = [false, false], permutation_map = #map0} : memref<?x?x?xf32>, vector<6x4xf32>
return %0 : vector<6x4xf32>
}
diff --git a/mlir/test/Dialect/Vector/vector-transforms.mlir b/mlir/test/Dialect/Vector/vector-transforms.mlir
index eda6a5cc40d99..5da1c6086deef 100644
--- a/mlir/test/Dialect/Vector/vector-transforms.mlir
+++ b/mlir/test/Dialect/Vector/vector-transforms.mlir
@@ -140,22 +140,22 @@ func.func @contraction4x4_ikj_xfer_read(%arg0 : memref<4x2xf32>,
%cf0 = arith.constant 0.0 : f32
%0 = vector.transfer_read %arg0[%c0, %c0], %cf0
- { permutation_map = affine_map<(d0, d1) -> (d0, d1)> }
+ { permutation_map = affine_map<(d0, d1) -> (d0, d1)>, in_bounds = [false, false]}
: memref<4x2xf32>, vector<4x2xf32>
%1 = vector.transfer_read %arg1[%c0, %c0], %cf0
- { permutation_map = affine_map<(d0, d1) -> (d0, d1)> }
+ { permutation_map = affine_map<(d0, d1) -> (d0, d1)>, in_bounds = [false, false]}
: memref<2x4xf32>, vector<2x4xf32>
%2 = vector.transfer_read %arg2[%c0, %c0], %cf0
- { permutation_map = affine_map<(d0, d1) -> (d0, d1)> }
+ { permutation_map = affine_map<(d0, d1) -> (d0, d1)>, in_bounds = [false, false]}
: memref<4x4xf32>, vector<4x4xf32>
%3 = vector.contract #contraction_trait1 %0, %1, %2
: vector<4x2xf32>, vector<2x4xf32> into vector<4x4xf32>
vector.transfer_write %3, %arg2[%c0, %c0]
- {permutation_map = affine_map<(d0, d1) -> (d0, d1)>}
+ {permutation_map = affine_map<(d0, d1) -> (d0, d1)>, in_bounds = [false, false]}
: vector<4x4xf32>, memref<4x4xf32>
return
}
@@ -175,10 +175,10 @@ func.func @vector_transfers(%arg0: index, %arg1: index) {
%cst_1 = arith.constant 2.000000e+00 : f32
affine.for %arg2 = 0 to %arg0 step 4 {
affine.for %arg3 = 0 to %arg1 step 4 {
- %4 = vector.transfer_read %0[%arg2, %arg3], %cst {permutation_map = affine_map<(d0, d1) -> (d0, d1)>} : memref<?x?xf32>, vector<4x4xf32>
- %5 = vector.transfer_read %1[%arg2, %arg3], %cst {permutation_map = affine_map<(d0, d1) -> (d0, d1)>} : memref<?x?xf32>, vector<4x4xf32>
+ %4 = vector.transfer_read %0[%arg2, %arg3], %cst {permutation_map = affine_map<(d0, d1) -> (d0, d1)>, in_bounds = [false, false]} : memref<?x?xf32>, vector<4x4xf32>
+ %5 = vector.transfer_read %1[%arg2, %arg3], %cst {permutation_map = affine_map<(d0, d1) -> (d0, d1)>, in_bounds = [false, false]} : memref<?x?xf32>, vector<4x4xf32>
%6 = arith.addf %4, %5 : vector<4x4xf32>
- vector.transfer_write %6, %2[%arg2, %arg3] {permutation_map = affine_map<(d0, d1) -> (d0, d1)>} : vector<4x4xf32>, memref<?x?xf32>
+ vector.transfer_write %6, %2[%arg2, %arg3] {permutation_map = affine_map<(d0, d1) -> (d0, d1)>, in_bounds = [false, false]} : vector<4x4xf32>, memref<?x?xf32>
}
}
return
@@ -228,14 +228,14 @@ func.func @cancelling_shape_cast_ops(%arg0 : vector<2x4xf32>) -> vector<2x4xf32>
func.func @elementwise_unroll(%arg0 : memref<4x4xf32>, %arg1 : memref<4x4xf32>) {
%c0 = arith.constant 0 : index
%cf0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 : memref<4x4xf32>, vector<4x4xf32>
- %1 = vector.transfer_read %arg1[%c0, %c0], %cf0 : memref<4x4xf32>, vector<4x4xf32>
+ %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 {in_bounds = [false, false]} : memref<4x4xf32>, vector<4x4xf32>
+ %1 = vector.transfer_read %arg1[%c0, %c0], %cf0 {in_bounds = [false, false]} : memref<4x4xf32>, vector<4x4xf32>
%cond = arith.cmpf ult, %0, %1 : vector<4x4xf32>
// Vector transfer split pattern only support single user right now.
- %2 = vector.transfer_read %arg0[%c0, %c0], %cf0 : memref<4x4xf32>, vector<4x4xf32>
- %3 = vector.transfer_read %arg1[%c0, %c0], %cf0 : memref<4x4xf32>, vector<4x4xf32>
+ %2 = vector.transfer_read %arg0[%c0, %c0], %cf0 {in_bounds = [false, false]} : memref<4x4xf32>, vector<4x4xf32>
+ %3 = vector.transfer_read %arg1[%c0, %c0], %cf0 {in_bounds = [false, false]} : memref<4x4xf32>, vector<4x4xf32>
%4 = arith.select %cond, %2, %3 : vector<4x4xi1>, vector<4x4xf32>
- vector.transfer_write %4, %arg0[%c0, %c0] : vector<4x4xf32>, memref<4x4xf32>
+ vector.transfer_write %4, %arg0[%c0, %c0] {in_bounds = [false, false]} : vector<4x4xf32>, memref<4x4xf32>
return
}
@@ -268,16 +268,16 @@ func.func @contraction4x4_ikj_xfer_read_tensor(%arg0 : tensor<4x2xf32>,
tensor<4x4xf32> {
%c0 = arith.constant 0 : index
%cf0 = arith.constant 0.0 : f32
- %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 :
+ %0 = vector.transfer_read %arg0[%c0, %c0], %cf0 {in_bounds = [false, false]} :
tensor<4x2xf32>, vector<4x2xf32>
- %1 = vector.transfer_read %arg1[%c0, %c0], %cf0 :
+ %1 = vector.transfer_read %arg1[%c0, %c0], %cf0 {in_bounds = [false, false]} :
tensor<2x4xf32>, vector<2x4xf32>
- %2 = vector.transfer_read %arg2[%c0, %c0], %cf0 :
+ %2 = vector.transfer_read %arg2[%c0, %c0], %cf0 {in_bounds = [false, false]} :
tensor<4x4xf32>, vector<4x4xf32>
- %3 = vector.contract #contraction_trait1 %0, %1, %2
- : vector<4x2xf32>, vector<2x4xf32> into vector<4x4xf32>
- %r = vector.transfer_write %3, %arg2[%c0, %c0]
- : vector<4x4xf32>, tensor<4x4xf32>
+ %3 = vector.contract #contraction_trait1 %0, %1, %2 :
+ vector<4x2xf32>, vector<2x4xf32> into vector<4x4xf32>
+ %r = vector.transfer_write %3, %arg2[%c0, %c0] {in_bounds = [false, false]} :
+ vector<4x4xf32>, tensor<4x4xf32>
return %r : tensor<4x4xf32>
}
diff --git a/mlir/test/Dialect/Vector/vector-warp-distribute.mlir b/mlir/test/Dialect/Vector/vector-warp-distribute.mlir
index bf90c4a6ebb3c..65050f4dd928d 100644
--- a/mlir/test/Dialect/Vector/vector-warp-distribute.mlir
+++ b/mlir/test/Dialect/Vector/vector-warp-distribute.mlir
@@ -106,14 +106,14 @@ func.func @warp(%laneid: index, %arg1: memref<1024xf32>, %arg2: memref<1024xf32>
%c0 = arith.constant 0 : index
%c32 = arith.constant 32 : index
%cst = arith.constant 0.000000e+00 : f32
- %2 = vector.transfer_read %sa[%c0], %cst : memref<128xf32, strided<[1], offset: ?>>, vector<32xf32>
- %3 = vector.transfer_read %sa[%c32], %cst : memref<128xf32, strided<[1], offset: ?>>, vector<32xf32>
- %4 = vector.transfer_read %sb[%c0], %cst : memref<128xf32, strided<[1], offset: ?>>, vector<64xf32>
- %5 = vector.transfer_read %sb[%c32], %cst : memref<128xf32, strided<[1], offset: ?>>, vector<64xf32>
+ %2 = vector.transfer_read %sa[%c0], %cst {in_bounds=[false]} : memref<128xf32, strided<[1], offset: ?>>, vector<32xf32>
+ %3 = vector.transfer_read %sa[%c32], %cst {in_bounds=[false]} : memref<128xf32, strided<[1], offset: ?>>, vector<32xf32>
+ %4 = vector.transfer_read %sb[%c0], %cst {in_bounds=[false]} : memref<128xf32, strided<[1], offset: ?>>, vector<64xf32>
+ %5 = vector.transfer_read %sb[%c32], %cst {in_bounds=[false]} : memref<128xf32, strided<[1], offset: ?>>, vector<64xf32>
%6 = arith.addf %2, %3 : vector<32xf32>
%7 = arith.addf %4, %5 : vector<64xf32>
- vector.transfer_write %6, %sc[%c0] : vector<32xf32>, memref<128xf32, strided<[1], offset: ?>>
- vector.transfer_write %7, %sc[%c32] : vector<64xf32>, memref<128xf32, strided<[1], offset: ?>>
+ vector.transfer_write %6, %sc[%c0] {in_bounds=[false]} : vector<32xf32>, memref<128xf32, strided<[1], offset: ?>>
+ vector.transfer_write %7, %sc[%c32] {in_bounds=[false]} : vector<64xf32>, memref<128xf32, strided<[1], offset: ?>>
}
return
}
@@ -138,8 +138,8 @@ func.func @warp_extract(%laneid: index, %arg1: memref<1024x1024xf32>, %gid : ind
%c0 = arith.constant 0 : index
%v = "test.dummy_op"() : () -> (vector<1xf32>)
%v1 = "test.dummy_op"() : () -> (vector<1x1xf32>)
- vector.transfer_write %v1, %arg1[%c0, %c0] : vector<1x1xf32>, memref<1024x1024xf32>
- vector.transfer_write %v, %arg1[%c0, %c0] : vector<1xf32>, memref<1024x1024xf32>
+ vector.transfer_write %v1, %arg1[%c0, %c0] {in_bounds=[false, false]} : vector<1x1xf32>, memref<1024x1024xf32>
+ vector.transfer_write %v, %arg1[%c0, %c0] {in_bounds=[false]} : vector<1xf32>, memref<1024x1024xf32>
}
return
}
@@ -166,8 +166,8 @@ func.func @warp_extract_4_elems(%laneid: index, %arg1: memref<1024x1024xf32>, %g
%c0 = arith.constant 0 : index
%v = "test.dummy_op"() : () -> (vector<4xf32>)
%v1 = "test.dummy_op"() : () -> (vector<4x1xf32>)
- vector.transfer_write %v1, %arg1[%c0, %c0] : vector<4x1xf32>, memref<1024x1024xf32>
- vector.transfer_write %v, %arg1[%c0, %c0] : vector<4xf32>, memref<1024x1024xf32>
+ vector.transfer_write %v1, %arg1[%c0, %c0] {in_bounds=[false, false]} : vector<4x1xf32>, memref<1024x1024xf32>
+ vector.transfer_write %v, %arg1[%c0, %c0] {in_bounds=[false]} : vector<4xf32>, memref<1024x1024xf32>
}
return
}
@@ -191,8 +191,8 @@ func.func @warp_extract_5_elems(%laneid: index, %arg1: memref<1024x1024xf32>, %g
%c0 = arith.constant 0 : index
%v = "test.dummy_op"() : () -> (vector<5xf32>)
%v1 = "test.dummy_op"() : () -> (vector<5x1xf32>)
- vector.transfer_write %v1, %arg1[%c0, %c0] : vector<5x1xf32>, memref<1024x1024xf32>
- vector.transfer_write %v, %arg1[%c0, %c0] : vector<5xf32>, memref<1024x1024xf32>
+ vector.transfer_write %v1, %arg1[%c0, %c0] {in_bounds=[false, false]} : vector<5x1xf32>, memref<1024x1024xf32>
+ vector.transfer_write %v, %arg1[%c0, %c0] {in_bounds=[false]} : vector<5xf32>, memref<1024x1024xf32>
}
return
}
@@ -216,8 +216,8 @@ func.func @warp_extract_8_elems(%laneid: index, %arg1: memref<1024x1024xf32>, %g
%c0 = arith.constant 0 : index
%v = "test.dummy_op"() : () -> (vector<8xf32>)
%v1 = "test.dummy_op"() : () -> (vector<8x1xf32>)
- vector.transfer_write %v1, %arg1[%c0, %c0] : vector<8x1xf32>, memref<1024x1024xf32>
- vector.transfer_write %v, %arg1[%c0, %c0] : vector<8xf32>, memref<1024x1024xf32>
+ vector.transfer_write %v1, %arg1[%c0, %c0] {in_bounds=[false, false]} : vector<8x1xf32>, memref<1024x1024xf32>
+ vector.transfer_write %v, %arg1[%c0, %c0] {in_bounds=[false]} : vector<8xf32>, memref<1024x1024xf32>
}
return
}
@@ -284,8 +284,8 @@ func.func @warp_propagate_elementwise(%laneid: index, %dest: memref<1024xf32>) {
%id2 = affine.apply #map0()[%laneid]
// CHECK-PROP: vector.transfer_write %[[A1]], {{.*}} : vector<1xf32>, memref<1024xf32>
// CHECK-PROP: vector.transfer_write %[[A0]], {{.*}} : vector<2xf32>, memref<1024xf32>
- vector.transfer_write %r#0, %dest[%laneid] : vector<1xf32>, memref<1024xf32>
- vector.transfer_write %r#1, %dest[%id2] : vector<2xf32>, memref<1024xf32>
+ vector.transfer_write %r#0, %dest[%laneid] {in_bounds = [false]} : vector<1xf32>, memref<1024xf32>
+ vector.transfer_write %r#1, %dest[%id2] {in_bounds = [false]} : vector<2xf32>, memref<1024xf32>
return
}
@@ -342,13 +342,13 @@ func.func @warp_propagate_read(%laneid: index, %src: memref<1024xf32>, %dest: me
%c32 = arith.constant 0 : index
%cst = arith.constant 0.000000e+00 : f32
%r:2 = vector.warp_execute_on_lane_0(%laneid)[32] ->(vector<1xf32>, vector<2xf32>) {
- %2 = vector.transfer_read %src[%c0], %cst : memref<1024xf32>, vector<32xf32>
- %3 = vector.transfer_read %src[%c32], %cst : memref<1024xf32>, vector<64xf32>
+ %2 = vector.transfer_read %src[%c0], %cst {in_bounds=[false]} : memref<1024xf32>, vector<32xf32>
+ %3 = vector.transfer_read %src[%c32], %cst {in_bounds=[false]} : memref<1024xf32>, vector<64xf32>
vector.yield %2, %3 : vector<32xf32>, vector<64xf32>
}
%id2 = affine.apply #map0()[%laneid]
- vector.transfer_write %r#0, %dest[%laneid] : vector<1xf32>, memref<1024xf32>
- vector.transfer_write %r#1, %dest[%id2] : vector<2xf32>, memref<1024xf32>
+ vector.transfer_write %r#0, %dest[%laneid] {in_bounds = [false]} : vector<1xf32>, memref<1024xf32>
+ vector.transfer_write %r#1, %dest[%id2] {in_bounds = [false]} : vector<2xf32>, memref<1024xf32>
return
}
@@ -625,15 +625,15 @@ func.func @vector_reduction(%laneid: index, %m0: memref<4x2x32xf32>, %m1: memref
%f0 = arith.constant 0.0: f32
// CHECK-D: %[[R:.*]] = vector.warp_execute_on_lane_0(%{{.*}})[32] -> (vector<f32>) {
// CHECK-D: vector.warp_execute_on_lane_0(%{{.*}})[32] {
- // CHECK-D: vector.transfer_write %[[R]], %{{.*}}[] : vector<f32>, memref<f32>
+ // CHECK-D: vector.transfer_write %[[R]], %{{.*}}[] {{.*}} : vector<f32>, memref<f32>
vector.warp_execute_on_lane_0(%laneid)[32] {
%0 = vector.transfer_read %m0[%c0, %c0, %c0], %f0 {in_bounds = [true]} : memref<4x2x32xf32>, vector<32xf32>
- %1 = vector.transfer_read %m1[], %f0 : memref<f32>, vector<f32>
+ %1 = vector.transfer_read %m1[], %f0 {in_bounds=[]} : memref<f32>, vector<f32>
%2 = vector.extractelement %1[] : vector<f32>
%3 = vector.reduction <add>, %0 : vector<32xf32> into f32
%4 = arith.addf %3, %2 : f32
%5 = vector.broadcast %4 : f32 to vector<f32>
- vector.transfer_write %5, %m1[] : vector<f32>, memref<f32>
+ vector.transfer_write %5, %m1[] {in_bounds=[]} : vector<f32>, memref<f32>
}
return
}
@@ -929,10 +929,10 @@ func.func @lane_dependent_warp_propagate_read(
%c0 = arith.constant 0 : index
%cst = arith.constant 0.000000e+00 : f32
%r = vector.warp_execute_on_lane_0(%laneid)[32] -> (vector<1x1xf32>) {
- %2 = vector.transfer_read %src[%c0, %c0], %cst : memref<1x1024xf32>, vector<1x32xf32>
+ %2 = vector.transfer_read %src[%c0, %c0], %cst {in_bounds=[false, false]} : memref<1x1024xf32>, vector<1x32xf32>
vector.yield %2 : vector<1x32xf32>
}
- vector.transfer_write %r, %dest[%c0, %laneid] : vector<1x1xf32>, memref<1x1024xf32>
+ vector.transfer_write %r, %dest[%c0, %laneid] {in_bounds=[false, false]} : vector<1x1xf32>, memref<1x1024xf32>
return
}
@@ -942,7 +942,7 @@ func.func @warp_propagate_read_3d(%laneid: index, %src: memref<32x4x32xf32>) ->
%c0 = arith.constant 0 : index
%cst = arith.constant 0.000000e+00 : f32
%r = vector.warp_execute_on_lane_0(%laneid)[1024] -> (vector<1x1x4xf32>) {
- %2 = vector.transfer_read %src[%c0, %c0, %c0], %cst : memref<32x4x32xf32>, vector<32x4x32xf32>
+ %2 = vector.transfer_read %src[%c0, %c0, %c0], %cst {in_bounds=[false, false, false]} : memref<32x4x32xf32>, vector<32x4x32xf32>
vector.yield %2 : vector<32x4x32xf32>
}
return %r : vector<1x1x4xf32>
@@ -992,7 +992,7 @@ func.func @dont_duplicate_read(
// CHECK-PROP-NEXT: "blocking_use"
// CHECK-PROP-NEXT: vector.yield
%r = vector.warp_execute_on_lane_0(%laneid)[32] -> (vector<1xf32>) {
- %2 = vector.transfer_read %src[%c0], %cst : memref<1024xf32>, vector<32xf32>
+ %2 = vector.transfer_read %src[%c0], %cst {in_bounds=[false]} : memref<1024xf32>, vector<32xf32>
"blocking_use"(%2) : (vector<32xf32>) -> ()
vector.yield %2 : vector<32xf32>
}
@@ -1042,7 +1042,7 @@ func.func @warp_execute_has_broadcast_semantics(%laneid: index, %s0: f32, %v0: v
// CHECK-SCF-IF: "some_def_1"(%{{.*}}) : (vector<1xf32>) -> vector<1xf32>
// CHECK-SCF-IF: "some_def_1"(%{{.*}}) : (vector<1x1xf32>) -> vector<1x1xf32>
// CHECK-SCF-IF: memref.store {{.*}}[%[[C0]]] : memref<1xf32, 3>
- // CHECK-SCF-IF: vector.transfer_write {{.*}}[] : vector<f32>, memref<f32, 3>
+ // CHECK-SCF-IF: vector.transfer_write {{.*}}[] {{.*}} : vector<f32>, memref<f32, 3>
// CHECK-SCF-IF: vector.transfer_write {{.*}}[%[[C0]]] {in_bounds = [true]} : vector<1xf32>, memref<1xf32, 3>
// CHECK-SCF-IF: vector.transfer_write {{.*}}[%[[C0]], %[[C0]]] {in_bounds = [true, true]} : vector<1x1xf32>, memref<1x1xf32, 3>
@@ -1336,7 +1336,7 @@ func.func @warp_propagate_shape_cast(%laneid: index, %src: memref<32x4x32xf32>)
%c0 = arith.constant 0 : index
%cst = arith.constant 0.000000e+00 : f32
%r = vector.warp_execute_on_lane_0(%laneid)[1024] -> (vector<4xf32>) {
- %2 = vector.transfer_read %src[%c0, %c0, %c0], %cst : memref<32x4x32xf32>, vector<32x4x32xf32>
+ %2 = vector.transfer_read %src[%c0, %c0, %c0], %cst {in_bounds=[false, false, false]} : memref<32x4x32xf32>, vector<32x4x32xf32>
%3 = vector.shape_cast %2 : vector<32x4x32xf32> to vector<4096xf32>
vector.yield %3 : vector<4096xf32>
}
@@ -1410,8 +1410,8 @@ func.func @warp_propagate_masked_write(%laneid: index, %dest: memref<4096xf32>)
%mask2 = "mask_def_1"() : () -> (vector<32xi1>)
%0 = "some_def_0"() : () -> (vector<4096xf32>)
%1 = "some_def_1"() : () -> (vector<32xf32>)
- vector.transfer_write %0, %dest[%c0], %mask : vector<4096xf32>, memref<4096xf32>
- vector.transfer_write %1, %dest[%c0], %mask2 : vector<32xf32>, memref<4096xf32>
+ vector.transfer_write %0, %dest[%c0], %mask {in_bounds=[false]} : vector<4096xf32>, memref<4096xf32>
+ vector.transfer_write %1, %dest[%c0], %mask2 {in_bounds=[false]} : vector<32xf32>, memref<4096xf32>
vector.yield
}
return
@@ -1513,7 +1513,7 @@ func.func @warp_propagate_unconnected_read_write(%laneid: index, %buffer: memref
%r:2 = vector.warp_execute_on_lane_0(%laneid)[32] -> (vector<2xf32>, vector<4xf32>) {
%cst = arith.constant dense<2.0> : vector<128xf32>
%0 = vector.transfer_read %buffer[%c0], %f0 {in_bounds = [true]} : memref<128xf32>, vector<128xf32>
- vector.transfer_write %cst, %buffer[%c0] : vector<128xf32>, memref<128xf32>
+ vector.transfer_write %cst, %buffer[%c0] {in_bounds=[false]} : vector<128xf32>, memref<128xf32>
%1 = vector.broadcast %f1 : f32 to vector<64xf32>
vector.yield %1, %0 : vector<64xf32>, vector<128xf32>
}
@@ -1566,7 +1566,7 @@ func.func @warp_propagate_nd_write(%laneid: index, %dest: memref<4x1024xf32>) {
%c0 = arith.constant 0 : index
vector.warp_execute_on_lane_0(%laneid)[32] -> () {
%0 = "some_def"() : () -> (vector<4x1024xf32>)
- vector.transfer_write %0, %dest[%c0, %c0] : vector<4x1024xf32>, memref<4x1024xf32>
+ vector.transfer_write %0, %dest[%c0, %c0] {in_bounds=[false, false]} : vector<4x1024xf32>, memref<4x1024xf32>
vector.yield
}
return
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/dual_sparse_conv_2d.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/dual_sparse_conv_2d.mlir
index f33a3abc7a5f7..d00f20cfd1695 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/dual_sparse_conv_2d.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/dual_sparse_conv_2d.mlir
@@ -150,7 +150,7 @@ module {
// CHECK-SAME: ( 0, 0, 3, 6, -3, -6 ),
// CHECK-SAME: ( 2, -1, 3, 0, -3, 0 ) )
//
- %v = vector.transfer_read %0[%c0, %c0], %i0
+ %v = vector.transfer_read %0[%c0, %c0], %i0 {in_bounds=[false, false]}
: tensor<6x6xi32>, vector<6x6xi32>
vector.print %v : vector<6x6xi32>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/padded_sparse_conv_2d.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/padded_sparse_conv_2d.mlir
index 50dd989416e2a..c2e822f9bf0a5 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/padded_sparse_conv_2d.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/padded_sparse_conv_2d.mlir
@@ -125,7 +125,7 @@ func.func @main() {
// CHECK-SAME: ( ( 180 ), ( 240 ), ( 300 ), ( 300 ), ( 300 ), ( 300 ), ( 240 ), ( 180 ) ),
// CHECK-SAME: ( ( 144 ), ( 192 ), ( 240 ), ( 240 ), ( 240 ), ( 240 ), ( 192 ), ( 144 ) ),
// CHECK-SAME: ( ( 108 ), ( 144 ), ( 180 ), ( 180 ), ( 180 ), ( 180 ), ( 144 ), ( 108 ) ) ) )
- %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0, %c0], %zero
+ %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<3x8x8x1xf32>, vector<3x8x8x1xf32>
vector.print %dense_v : vector<3x8x8x1xf32>
@@ -153,7 +153,7 @@ func.func @main() {
// CHECK-SAME: ( ( 180 ), ( 240 ), ( 300 ), ( 300 ), ( 300 ), ( 300 ), ( 240 ), ( 180 ) ),
// CHECK-SAME: ( ( 144 ), ( 192 ), ( 240 ), ( 240 ), ( 240 ), ( 240 ), ( 192 ), ( 144 ) ),
// CHECK-SAME: ( ( 108 ), ( 144 ), ( 180 ), ( 180 ), ( 180 ), ( 180 ), ( 144 ), ( 108 ) ) ) )
- %CDCC_NHWC_v = vector.transfer_read %CDCC_NHWC_ret[%c0, %c0, %c0, %c0], %zero
+ %CDCC_NHWC_v = vector.transfer_read %CDCC_NHWC_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<3x8x8x1xf32>, vector<3x8x8x1xf32>
vector.print %CDCC_NHWC_v : vector<3x8x8x1xf32>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_block_matmul.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_block_matmul.mlir
index efef01155cc78..a1d74fb632dca 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_block_matmul.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_block_matmul.mlir
@@ -126,7 +126,7 @@ module {
func.func @dump_dense_f64(%arg0: tensor<4x4xf64>) {
%c0 = arith.constant 0 : index
%d0 = arith.constant -1.0 : f64
- %0 = vector.transfer_read %arg0[%c0, %c0], %d0: tensor<4x4xf64>, vector<4x4xf64>
+ %0 = vector.transfer_read %arg0[%c0, %c0], %d0 {in_bounds=[false, false]}: tensor<4x4xf64>, vector<4x4xf64>
vector.print %0 : vector<4x4xf64>
return
}
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_cast.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_cast.mlir
index 3b5168db23c58..30f86d37aae22 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_cast.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_cast.mlir
@@ -207,28 +207,28 @@ module {
// CHECK: ( -4, -3, -2, -1, 0, 1, 2, 3, 4, 305 )
//
%c0 = call @sparse_cast_s32_to_f32(%1, %zero_f) : (tensor<10xi32, #SV>, tensor<10xf32>) -> tensor<10xf32>
- %v0 = vector.transfer_read %c0[%z], %f: tensor<10xf32>, vector<10xf32>
+ %v0 = vector.transfer_read %c0[%z], %f {in_bounds=[false]}: tensor<10xf32>, vector<10xf32>
vector.print %v0 : vector<10xf32>
//
// CHECK: ( 4.29497e+09, 4.29497e+09, 4.29497e+09, 4.29497e+09, 0, 1, 2, 3, 4, 305 )
//
%c1 = call @sparse_cast_u32_to_f32(%1, %zero_f) : (tensor<10xi32, #SV>, tensor<10xf32>) -> tensor<10xf32>
- %v1 = vector.transfer_read %c1[%z], %f: tensor<10xf32>, vector<10xf32>
+ %v1 = vector.transfer_read %c1[%z], %f {in_bounds=[false]}: tensor<10xf32>, vector<10xf32>
vector.print %v1 : vector<10xf32>
//
// CHECK: ( -4, -3, -2, -1, 0, 1, 2, 3, 4, 305 )
//
%c2 = call @sparse_cast_f32_to_s32(%3, %zero_i) : (tensor<10xf32, #SV>, tensor<10xi32>) -> tensor<10xi32>
- %v2 = vector.transfer_read %c2[%z], %i: tensor<10xi32>, vector<10xi32>
+ %v2 = vector.transfer_read %c2[%z], %i {in_bounds=[false]}: tensor<10xi32>, vector<10xi32>
vector.print %v2 : vector<10xi32>
//
// CHECK: ( 4294967295, 4294967294, 4294967293, 4294967292, 0, 1, 2, 3, 4, 305 )
//
%c3 = call @sparse_cast_f64_to_u32(%7, %zero_i) : (tensor<10xf64, #SV>, tensor<10xi32>) -> tensor<10xi32>
- %v3 = vector.transfer_read %c3[%z], %i: tensor<10xi32>, vector<10xi32>
+ %v3 = vector.transfer_read %c3[%z], %i {in_bounds=[false]}: tensor<10xi32>, vector<10xi32>
%vu = vector.bitcast %v3 : vector<10xi32> to vector<10xui32>
vector.print %vu : vector<10xui32>
@@ -236,42 +236,42 @@ module {
// CHECK: ( -4.4, -3.3, -2.2, -1.1, 0, 1.1, 2.2, 3.3, 4.4, 305.5 )
//
%c4 = call @sparse_cast_f32_to_f64(%3, %zero_d) : (tensor<10xf32, #SV>, tensor<10xf64>) -> tensor<10xf64>
- %v4 = vector.transfer_read %c4[%z], %d: tensor<10xf64>, vector<10xf64>
+ %v4 = vector.transfer_read %c4[%z], %d {in_bounds=[false]}: tensor<10xf64>, vector<10xf64>
vector.print %v4 : vector<10xf64>
//
// CHECK: ( -4.4, -3.3, -2.2, -1.1, 0, 1.1, 2.2, 3.3, 4.4, 305.5 )
//
%c5 = call @sparse_cast_f64_to_f32(%5, %zero_f) : (tensor<10xf64, #SV>, tensor<10xf32>) -> tensor<10xf32>
- %v5 = vector.transfer_read %c5[%z], %f: tensor<10xf32>, vector<10xf32>
+ %v5 = vector.transfer_read %c5[%z], %f {in_bounds=[false]}: tensor<10xf32>, vector<10xf32>
vector.print %v5 : vector<10xf32>
//
// CHECK: ( -4, -3, -2, -1, 0, 1, 2, 3, 4, 305 )
//
%c6 = call @sparse_cast_s32_to_u64(%1, %zero_l) : (tensor<10xi32, #SV>, tensor<10xi64>) -> tensor<10xi64>
- %v6 = vector.transfer_read %c6[%z], %l: tensor<10xi64>, vector<10xi64>
+ %v6 = vector.transfer_read %c6[%z], %l {in_bounds=[false]}: tensor<10xi64>, vector<10xi64>
vector.print %v6 : vector<10xi64>
//
// CHECK: ( 4294967292, 4294967293, 4294967294, 4294967295, 0, 1, 2, 3, 4, 305 )
//
%c7 = call @sparse_cast_u32_to_s64(%1, %zero_l) : (tensor<10xi32, #SV>, tensor<10xi64>) -> tensor<10xi64>
- %v7 = vector.transfer_read %c7[%z], %l: tensor<10xi64>, vector<10xi64>
+ %v7 = vector.transfer_read %c7[%z], %l {in_bounds=[false]}: tensor<10xi64>, vector<10xi64>
vector.print %v7 : vector<10xi64>
//
// CHECK: ( -4, -3, -2, -1, 0, 1, 2, 3, 4, 49 )
//
%c8 = call @sparse_cast_i32_to_i8(%1, %zero_b) : (tensor<10xi32, #SV>, tensor<10xi8>) -> tensor<10xi8>
- %v8 = vector.transfer_read %c8[%z], %b: tensor<10xi8>, vector<10xi8>
+ %v8 = vector.transfer_read %c8[%z], %b {in_bounds=[false]}: tensor<10xi8>, vector<10xi8>
vector.print %v8 : vector<10xi8>
//
// CHECK: ( -1064514355, -1068289229, -1072902963, -1081291571, 0, 1066192077, 1074580685, 1079194419, 1082969293, 1134084096 )
//
%c9 = call @sparse_cast_f32_as_s32(%3, %zero_i) : (tensor<10xf32, #SV>, tensor<10xi32>) -> tensor<10xi32>
- %v9 = vector.transfer_read %c9[%z], %i: tensor<10xi32>, vector<10xi32>
+ %v9 = vector.transfer_read %c9[%z], %i {in_bounds=[false]}: tensor<10xi32>, vector<10xi32>
vector.print %v9 : vector<10xi32>
// Release the resources.
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_cmp.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_cmp.mlir
index edeffea211717..51b0aebad8d02 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_cmp.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_cmp.mlir
@@ -150,7 +150,7 @@ module {
// CHECK-NEXT: values : ( 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0 )
// CHECK-NEXT: ----
//
- %v = vector.transfer_read %all_dn_out[%c0, %c0], %d0
+ %v = vector.transfer_read %all_dn_out[%c0, %c0], %d0 {in_bounds=[false, false]}
: tensor<4x4xi8>, vector<4x4xi8>
vector.print %v : vector<4x4xi8>
sparse_tensor.print %lhs_sp_out : tensor<4x4xi8, #DCSR>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_collapse_shape.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_collapse_shape.mlir
index 12132155e7cb3..934ddb9088b0a 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_collapse_shape.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_collapse_shape.mlir
@@ -226,23 +226,23 @@ module {
// CHECK-NEXT: values : ( 1, 3, 5, 7, 9, 21, 23, 25, 27, 29, 41, 43, 45, 47, 49 )
// CHECK-NEXT: ----
//
- %v0 = vector.transfer_read %collapse0[%c0], %df: tensor<12xf64>, vector<12xf64>
+ %v0 = vector.transfer_read %collapse0[%c0], %df {in_bounds=[false]}: tensor<12xf64>, vector<12xf64>
vector.print %v0 : vector<12xf64>
- %v1 = vector.transfer_read %collapse1[%c0], %df: tensor<12xf64>, vector<12xf64>
+ %v1 = vector.transfer_read %collapse1[%c0], %df {in_bounds=[false]}: tensor<12xf64>, vector<12xf64>
vector.print %v1 : vector<12xf64>
sparse_tensor.print %collapse2 : tensor<12xf64, #SparseVector>
sparse_tensor.print %collapse3 : tensor<12xf64, #SparseVector>
- %v4 = vector.transfer_read %collapse4[%c0, %c0], %df: tensor<6x10xf64>, vector<6x10xf64>
+ %v4 = vector.transfer_read %collapse4[%c0, %c0], %df {in_bounds=[false, false]}: tensor<6x10xf64>, vector<6x10xf64>
vector.print %v4 : vector<6x10xf64>
- %v5 = vector.transfer_read %collapse5[%c0, %c0], %df: tensor<6x10xf64>, vector<6x10xf64>
+ %v5 = vector.transfer_read %collapse5[%c0, %c0], %df {in_bounds=[false, false]}: tensor<6x10xf64>, vector<6x10xf64>
vector.print %v5 : vector<6x10xf64>
sparse_tensor.print %collapse6 : tensor<6x10xf64, #SparseMatrix>
sparse_tensor.print %collapse7 : tensor<6x10xf64, #SparseMatrix>
- %v8 = vector.transfer_read %collapse8[%c0, %c0], %df: tensor<?x?xf64>, vector<6x10xf64>
+ %v8 = vector.transfer_read %collapse8[%c0, %c0], %df {in_bounds=[false, false]}: tensor<?x?xf64>, vector<6x10xf64>
vector.print %v8 : vector<6x10xf64>
- %v9 = vector.transfer_read %collapse9[%c0, %c0], %df: tensor<?x?xf64>, vector<6x10xf64>
+ %v9 = vector.transfer_read %collapse9[%c0, %c0], %df {in_bounds=[false, false]}: tensor<?x?xf64>, vector<6x10xf64>
vector.print %v9 : vector<6x10xf64>
sparse_tensor.print %collapse10 : tensor<?x?xf64, #SparseMatrix>
sparse_tensor.print %collapse11 : tensor<?x?xf64, #SparseMatrix>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_1d_nwc_wcf.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_1d_nwc_wcf.mlir
index 3e46b6d65112f..3ffb6214197a7 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_1d_nwc_wcf.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_1d_nwc_wcf.mlir
@@ -107,7 +107,7 @@ func.func @main() {
// CHECK: ( ( ( 12 ), ( 28 ), ( 28 ), ( 28 ), ( 12 ), ( 12 ) ),
// CHECK-SAME: ( ( 12 ), ( 12 ), ( 12 ), ( 12 ), ( 12 ), ( 12 ) ),
// CHECK-SAME: ( ( 12 ), ( 12 ), ( 12 ), ( 12 ), ( 12 ), ( 12 ) ) )
- %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0], %zero
+ %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0], %zero {in_bounds=[false, false, false]}
: tensor<?x?x?xf32>, vector<3x6x1xf32>
vector.print %dense_v : vector<3x6x1xf32>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d.mlir
index 97e9d1783f67f..14418644190b0 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d.mlir
@@ -176,7 +176,7 @@ module {
// CHECK-SAME: ( 0, 0, 3, 6, -3, -6 ),
// CHECK-SAME: ( 2, -1, 3, 0, -3, 0 ) )
//
- %v = vector.transfer_read %0[%c0, %c0], %i0
+ %v = vector.transfer_read %0[%c0, %c0], %i0 {in_bounds=[false, false]}
: tensor<6x6xi32>, vector<6x6xi32>
vector.print %v : vector<6x6xi32>
@@ -263,7 +263,7 @@ module {
// CHECK-SAME: ( 0, 0, 3, 6, -3, -6 ),
// CHECK-SAME: ( 2, -1, 3, 0, -3, 0 ) )
//
- %v6 = vector.transfer_read %6[%c0, %c0], %i0
+ %v6 = vector.transfer_read %6[%c0, %c0], %i0 {in_bounds=[false, false]}
: tensor<6x6xi32>, vector<6x6xi32>
vector.print %v : vector<6x6xi32>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_55.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_55.mlir
index 00805d198013d..b16677ec5d017 100755
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_55.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_55.mlir
@@ -194,17 +194,17 @@ module {
// CHECK-SAME: ( 0, 0, 0, -12, 0, -6 ),
// CHECK-SAME: ( -60, -27, -50, 0, -16, 0 ) )
//
- %v0 = vector.transfer_read %0[%c0, %c0], %i0 : tensor<6x6xi32>, vector<6x6xi32>
+ %v0 = vector.transfer_read %0[%c0, %c0], %i0 {in_bounds=[false, false]} : tensor<6x6xi32>, vector<6x6xi32>
vector.print %v0 : vector<6x6xi32>
- %v1 = vector.transfer_read %1[%c0, %c0], %i0 : tensor<6x6xi32>, vector<6x6xi32>
+ %v1 = vector.transfer_read %1[%c0, %c0], %i0 {in_bounds=[false, false]} : tensor<6x6xi32>, vector<6x6xi32>
vector.print %v1 : vector<6x6xi32>
- %v2 = vector.transfer_read %2[%c0, %c0], %i0 : tensor<6x6xi32>, vector<6x6xi32>
+ %v2 = vector.transfer_read %2[%c0, %c0], %i0 {in_bounds=[false, false]} : tensor<6x6xi32>, vector<6x6xi32>
vector.print %v2 : vector<6x6xi32>
- %v3 = vector.transfer_read %3[%c0, %c0], %i0 : tensor<6x6xi32>, vector<6x6xi32>
+ %v3 = vector.transfer_read %3[%c0, %c0], %i0 {in_bounds=[false, false]} : tensor<6x6xi32>, vector<6x6xi32>
vector.print %v3 : vector<6x6xi32>
- %v4 = vector.transfer_read %4[%c0, %c0], %i0 : tensor<6x6xi32>, vector<6x6xi32>
+ %v4 = vector.transfer_read %4[%c0, %c0], %i0 {in_bounds=[false, false]} : tensor<6x6xi32>, vector<6x6xi32>
vector.print %v4 : vector<6x6xi32>
- %v5 = vector.transfer_read %5[%c0, %c0], %i0 : tensor<6x6xi32>, vector<6x6xi32>
+ %v5 = vector.transfer_read %5[%c0, %c0], %i0 {in_bounds=[false, false]} : tensor<6x6xi32>, vector<6x6xi32>
vector.print %v5 : vector<6x6xi32>
// Release resources.
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_nchw_fchw.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_nchw_fchw.mlir
index 9150e97e72481..76c47084e1f07 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_nchw_fchw.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_nchw_fchw.mlir
@@ -131,7 +131,7 @@ func.func @main() {
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ),
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ),
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ) ) ) )
- %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0, %c0], %zero
+ %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<?x?x?x?xf32>, vector<3x1x6x6xf32>
vector.print %dense_v : vector<3x1x6x6xf32>
@@ -153,7 +153,7 @@ func.func @main() {
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ),
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ),
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ) ) ) )
- %v1 = vector.transfer_read %CCCC_ret[%c0, %c0, %c0, %c0], %zero
+ %v1 = vector.transfer_read %CCCC_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<?x?x?x?xf32>, vector<3x1x6x6xf32>
vector.print %v1 : vector<3x1x6x6xf32>
@@ -175,7 +175,7 @@ func.func @main() {
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ),
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ),
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ) ) ) )
- %v2 = vector.transfer_read %CDCD_ret[%c0, %c0, %c0, %c0], %zero
+ %v2 = vector.transfer_read %CDCD_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<?x?x?x?xf32>, vector<3x1x6x6xf32>
vector.print %v2 : vector<3x1x6x6xf32>
@@ -197,7 +197,7 @@ func.func @main() {
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ),
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ),
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ) ) ) )
- %v3 = vector.transfer_read %dual_CCCC_ret[%c0, %c0, %c0, %c0], %zero
+ %v3 = vector.transfer_read %dual_CCCC_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<?x?x?x?xf32>, vector<3x1x6x6xf32>
vector.print %v3 : vector<3x1x6x6xf32>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_nhwc_hwcf.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_nhwc_hwcf.mlir
index 429175c1a1645..b50d91a09c462 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_nhwc_hwcf.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_nhwc_hwcf.mlir
@@ -138,7 +138,7 @@ func.func @main() {
// CHECK-SAME: ( ( 108 ), ( 108 ), ( 108 ), ( 108 ), ( 108 ), ( 108 ) ),
// CHECK-SAME: ( ( 108 ), ( 108 ), ( 108 ), ( 108 ), ( 108 ), ( 108 ) ),
// CHECK-SAME: ( ( 108 ), ( 108 ), ( 108 ), ( 108 ), ( 108 ), ( 108 ) ) ) )
- %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0, %c0], %zero
+ %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<?x?x?x?xf32>, vector<3x6x6x1xf32>
vector.print %dense_v : vector<3x6x6x1xf32>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_3d.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_3d.mlir
index b23b2dcc173d9..7d9283fe17d79 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_3d.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_3d.mlir
@@ -162,7 +162,7 @@ func.func @main() {
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ),
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ),
// CHECK-SAME: ( 108, 108, 108, 108, 108, 108 ) ) )
- %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0], %zero
+ %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0], %zero {in_bounds=[false, false, false]}
: tensor<?x?x?xf32>, vector<6x6x6xf32>
vector.print %dense_v : vector<6x6x6xf32>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_3d_ndhwc_dhwcf.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_3d_ndhwc_dhwcf.mlir
index 8fb6704c7f509..b2c31411c565d 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_3d_ndhwc_dhwcf.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_3d_ndhwc_dhwcf.mlir
@@ -142,7 +142,7 @@ func.func @main() {
// CHECK-SAME: ( ( 108 ), ( 108 ), ( 108 ), ( 108 ), ( 108 ), ( 108 ) ) ) ) )
%dense_ret = call @conv_3d_ndhwc_dhwcf(%in3D_ndhwc, %filter3D_ndhwc, %out3D_ndhwc)
: (tensor<?x?x?x?x?xf32>, tensor<?x?x?x?x?xf32>, tensor<?x?x?x?x?xf32>) -> (tensor<?x?x?x?x?xf32>)
- %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0, %c0, %c0], %zero
+ %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false, false]}
: tensor<?x?x?x?x?xf32>, vector<1x6x6x6x1xf32>
vector.print %dense_v : vector<1x6x6x6x1xf32>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_element.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_element.mlir
index a2ec6df392aaa..9772b81d66224 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_element.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_element.mlir
@@ -47,7 +47,7 @@ module {
func.func @dump(%arg0: tensor<2x3x4xf32>) {
%c0 = arith.constant 0 : index
%d0 = arith.constant -1.0 : f32
- %0 = vector.transfer_read %arg0[%c0, %c0, %c0], %d0: tensor<2x3x4xf32>, vector<2x3x4xf32>
+ %0 = vector.transfer_read %arg0[%c0, %c0, %c0], %d0 {in_bounds=[false, false, false]}: tensor<2x3x4xf32>, vector<2x3x4xf32>
vector.print %0 : vector<2x3x4xf32>
return
}
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir
index e145c4542a7bf..a32e63261a869 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir
@@ -64,7 +64,7 @@ module {
func.func @dump_234(%arg0: tensor<2x3x4xf64>) {
%c0 = arith.constant 0 : index
%d0 = arith.constant -1.0 : f64
- %0 = vector.transfer_read %arg0[%c0, %c0, %c0], %d0: tensor<2x3x4xf64>, vector<2x3x4xf64>
+ %0 = vector.transfer_read %arg0[%c0, %c0, %c0], %d0 {in_bounds=[false, false, false]}: tensor<2x3x4xf64>, vector<2x3x4xf64>
vector.print %0 : vector<2x3x4xf64>
return
}
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2sparse.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2sparse.mlir
index 12f8e34500d81..eb68395e0a72e 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2sparse.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2sparse.mlir
@@ -66,7 +66,7 @@ module {
func.func @dump(%arg0: tensor<2x3x4xf64>) {
%c0 = arith.constant 0 : index
%d0 = arith.constant -1.0 : f64
- %0 = vector.transfer_read %arg0[%c0, %c0, %c0], %d0: tensor<2x3x4xf64>, vector<2x3x4xf64>
+ %0 = vector.transfer_read %arg0[%c0, %c0, %c0], %d0 {in_bounds=[false, false, false]}: tensor<2x3x4xf64>, vector<2x3x4xf64>
vector.print %0 : vector<2x3x4xf64>
return
}
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_coo_test.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_coo_test.mlir
index c16ae0de18203..f987889fe8102 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_coo_test.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_coo_test.mlir
@@ -188,13 +188,13 @@ module {
//
%f0 = arith.constant 0.0 : f32
scf.for %i = %c0 to %c8 step %c1 {
- %v1 = vector.transfer_read %C1[%i, %c0], %f0
+ %v1 = vector.transfer_read %C1[%i, %c0], %f0 {in_bounds=[false]}
: tensor<8x8xf32>, vector<8xf32>
- %v2 = vector.transfer_read %C2[%i, %c0], %f0
+ %v2 = vector.transfer_read %C2[%i, %c0], %f0 {in_bounds=[false]}
: tensor<8x8xf32>, vector<8xf32>
- %v3 = vector.transfer_read %C3[%i, %c0], %f0
+ %v3 = vector.transfer_read %C3[%i, %c0], %f0 {in_bounds=[false]}
: tensor<8x8xf32>, vector<8xf32>
- %v4 = vector.transfer_read %C4[%i, %c0], %f0
+ %v4 = vector.transfer_read %C4[%i, %c0], %f0 {in_bounds=[false]}
: tensor<8x8xf32>, vector<8xf32>
vector.print %v1 : vector<8xf32>
vector.print %v2 : vector<8xf32>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_dilated_conv_2d_nhwc_hwcf.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_dilated_conv_2d_nhwc_hwcf.mlir
index 40738a9f7d7f1..61c111f08cb35 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_dilated_conv_2d_nhwc_hwcf.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_dilated_conv_2d_nhwc_hwcf.mlir
@@ -111,28 +111,28 @@ func.func @main() {
// CHECK: ( ( ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 520 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ) ),
// CHECK-SAME: ( ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ) ),
// CHECK-SAME: ( ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ) ) )
- %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0, %c0], %zero
+ %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<?x?x?x?xf32>, vector<3x3x3x1xf32>
vector.print %dense_v : vector<3x3x3x1xf32>
// CHECK-NEXT: ( ( ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 520 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ) ),
// CHECK-SAME: ( ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ) ),
// CHECK-SAME: ( ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ) ) )
- %v_dual = vector.transfer_read %dual_CDCC_ret[%c0, %c0, %c0, %c0], %zero
+ %v_dual = vector.transfer_read %dual_CDCC_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<?x?x?x?xf32>, vector<3x3x3x1xf32>
vector.print %v_dual : vector<3x3x3x1xf32>
// CHECK-NEXT: ( ( ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 520 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ) ),
// CHECK-SAME: ( ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ) ),
// CHECK-SAME: ( ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ) ) )
- %v1 = vector.transfer_read %CCCC_ret[%c0, %c0, %c0, %c0], %zero
+ %v1 = vector.transfer_read %CCCC_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<?x?x?x?xf32>, vector<3x3x3x1xf32>
vector.print %v1 : vector<3x3x3x1xf32>
// CHECK-NEXT: ( ( ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 520 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ) ),
// CHECK-SAME: ( ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ) ),
// CHECK-SAME: ( ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ), ( ( 540 ), ( 540 ), ( 540 ) ) ) )
- %v2 = vector.transfer_read %CDCC_ret[%c0, %c0, %c0, %c0], %zero
+ %v2 = vector.transfer_read %CDCC_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<?x?x?x?xf32>, vector<3x3x3x1xf32>
vector.print %v1 : vector<3x3x3x1xf32>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_expand_shape.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_expand_shape.mlir
index 5e021596efea6..eaa34ebde16fb 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_expand_shape.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_expand_shape.mlir
@@ -180,17 +180,17 @@ module {
// CHECK-NEXT: ( ( ( 1.1, 1.2 ), ( 1.3, 1.4 ) ), ( ( 2.1, 2.2 ), ( 2.3, 2.4 ) ), ( ( 3.1, 3.2 ), ( 3.3, 3.4 ) ) )
// CHECK-NEXT: ( ( ( 1.1, 1.2 ), ( 1.3, 1.4 ) ), ( ( 2.1, 2.2 ), ( 2.3, 2.4 ) ), ( ( 3.1, 3.2 ), ( 3.3, 3.4 ) ) )
//
- %m0 = vector.transfer_read %expand0[%c0, %c0], %df: tensor<3x4xf64>, vector<3x4xf64>
+ %m0 = vector.transfer_read %expand0[%c0, %c0], %df {in_bounds=[false, false]}: tensor<3x4xf64>, vector<3x4xf64>
vector.print %m0 : vector<3x4xf64>
- %m1 = vector.transfer_read %expand1[%c0, %c0], %df: tensor<3x4xf64>, vector<3x4xf64>
+ %m1 = vector.transfer_read %expand1[%c0, %c0], %df {in_bounds=[false, false]}: tensor<3x4xf64>, vector<3x4xf64>
vector.print %m1 : vector<3x4xf64>
- %m4 = vector.transfer_read %expand4[%c0, %c0, %c0], %df: tensor<3x2x2xf64>, vector<3x2x2xf64>
+ %m4 = vector.transfer_read %expand4[%c0, %c0, %c0], %df {in_bounds=[false, false, false]}: tensor<3x2x2xf64>, vector<3x2x2xf64>
vector.print %m4 : vector<3x2x2xf64>
- %m5 = vector.transfer_read %expand5[%c0, %c0, %c0], %df: tensor<3x2x2xf64>, vector<3x2x2xf64>
+ %m5 = vector.transfer_read %expand5[%c0, %c0, %c0], %df {in_bounds=[false, false, false]}: tensor<3x2x2xf64>, vector<3x2x2xf64>
vector.print %m5 : vector<3x2x2xf64>
- %m8 = vector.transfer_read %expand8[%c0, %c0, %c0], %df: tensor<?x2x?xf64>, vector<3x2x2xf64>
+ %m8 = vector.transfer_read %expand8[%c0, %c0, %c0], %df {in_bounds=[false, false, false]}: tensor<?x2x?xf64>, vector<3x2x2xf64>
vector.print %m8 : vector<3x2x2xf64>
- %m9 = vector.transfer_read %expand9[%c0, %c0, %c0], %df: tensor<?x2x?xf64>, vector<3x2x2xf64>
+ %m9 = vector.transfer_read %expand9[%c0, %c0, %c0], %df {in_bounds=[false, false, false]}: tensor<?x2x?xf64>, vector<3x2x2xf64>
vector.print %m9 : vector<3x2x2xf64>
//
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_filter_conv2d.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_filter_conv2d.mlir
index 93b8eda2c2aec..b5fb4cd378ec8 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_filter_conv2d.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_filter_conv2d.mlir
@@ -95,7 +95,7 @@ module {
// CHECK-SAME: ( 0, 0, 3, 6, -3, -6 ),
// CHECK-SAME: ( 2, -1, 3, 0, -3, 0 ) )
//
- %v = vector.transfer_read %0[%c0, %c0], %i0
+ %v = vector.transfer_read %0[%c0, %c0], %i0 {in_bounds=[false, false]}
: tensor<6x6xi32>, vector<6x6xi32>
vector.print %v : vector<6x6xi32>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_index_dense.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_index_dense.mlir
index fc7b82fdecea3..f87c7f8316160 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_index_dense.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_index_dense.mlir
@@ -187,14 +187,14 @@ module {
// CHECK-NEXT: ( ( 0, 0, 0, 0 ), ( 0, 2, 2, 3 ), ( 0, 2, 12, 24 ) )
// CHECK-NEXT: ( ( 1, 2, 3, 4 ), ( 2, 4, 4, 5 ), ( 3, 4, 7, 9 ) )
//
- %vv0 = vector.transfer_read %0[%c0], %du: tensor<8xi64>, vector<8xi64>
- %vv1 = vector.transfer_read %1[%c0], %du: tensor<8xi64>, vector<8xi64>
- %vv2 = vector.transfer_read %2[%c0], %du: tensor<8xi64>, vector<8xi64>
- %vv3 = vector.transfer_read %3[%c0], %du: tensor<8xi64>, vector<8xi64>
- %vv4 = vector.transfer_read %4[%c0,%c0], %du: tensor<3x4xi64>, vector<3x4xi64>
- %vv5 = vector.transfer_read %5[%c0,%c0], %du: tensor<3x4xi64>, vector<3x4xi64>
- %vv6 = vector.transfer_read %6[%c0,%c0], %du: tensor<3x4xi64>, vector<3x4xi64>
- %vv7 = vector.transfer_read %7[%c0,%c0], %du: tensor<3x4xi64>, vector<3x4xi64>
+ %vv0 = vector.transfer_read %0[%c0], %du {in_bounds=[false]}: tensor<8xi64>, vector<8xi64>
+ %vv1 = vector.transfer_read %1[%c0], %du {in_bounds=[false]}: tensor<8xi64>, vector<8xi64>
+ %vv2 = vector.transfer_read %2[%c0], %du {in_bounds=[false]}: tensor<8xi64>, vector<8xi64>
+ %vv3 = vector.transfer_read %3[%c0], %du {in_bounds=[false]}: tensor<8xi64>, vector<8xi64>
+ %vv4 = vector.transfer_read %4[%c0,%c0], %du {in_bounds=[false, false]}: tensor<3x4xi64>, vector<3x4xi64>
+ %vv5 = vector.transfer_read %5[%c0,%c0], %du {in_bounds=[false, false]}: tensor<3x4xi64>, vector<3x4xi64>
+ %vv6 = vector.transfer_read %6[%c0,%c0], %du {in_bounds=[false, false]}: tensor<3x4xi64>, vector<3x4xi64>
+ %vv7 = vector.transfer_read %7[%c0,%c0], %du {in_bounds=[false, false]}: tensor<3x4xi64>, vector<3x4xi64>
vector.print %vv0 : vector<8xi64>
vector.print %vv1 : vector<8xi64>
vector.print %vv2 : vector<8xi64>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matvec.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matvec.mlir
index b9d1148301dd1..221cc209952b7 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matvec.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matvec.mlir
@@ -120,7 +120,7 @@ module {
//
// CHECK: ( 889, 1514, -21, -3431 )
//
- %v = vector.transfer_read %0[%c0], %i0: tensor<?xi32>, vector<4xi32>
+ %v = vector.transfer_read %0[%c0], %i0 {in_bounds=[false]}: tensor<?xi32>, vector<4xi32>
vector.print %v : vector<4xi32>
// Release the resources.
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_pack.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_pack.mlir
index 5415625ff05d6..a4edc02e62c91 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_pack.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_pack.mlir
@@ -237,11 +237,11 @@ module {
-> (tensor<2xi32>, tensor<3x2xi32>), tensor<3xf64>, (i32, i64), index
// CHECK-NEXT: ( 1, 2, 3 )
- %vd = vector.transfer_read %d[%c0], %f0 : tensor<3xf64>, vector<3xf64>
+ %vd = vector.transfer_read %d[%c0], %f0 {in_bounds=[false]} : tensor<3xf64>, vector<3xf64>
vector.print %vd : vector<3xf64>
// CHECK-NEXT: ( ( 1, 2 ), ( 5, 6 ), ( 7, 8 ) )
- %vi = vector.transfer_read %i[%c0, %c0], %i0 : tensor<3x2xi32>, vector<3x2xi32>
+ %vi = vector.transfer_read %i[%c0, %c0], %i0 {in_bounds=[false, false]} : tensor<3x2xi32>, vector<3x2xi32>
vector.print %vi : vector<3x2xi32>
// CHECK-NEXT: 3
@@ -256,7 +256,7 @@ module {
-> (tensor<3xi32>, tensor<3xi32>), tensor<4xf64>, (i32, i64), index
// CHECK-NEXT: ( 1, 2, 3 )
- %vd_csr = vector.transfer_read %rd_csr[%c0], %f0 : tensor<4xf64>, vector<3xf64>
+ %vd_csr = vector.transfer_read %rd_csr[%c0], %f0 {in_bounds=[false]} : tensor<4xf64>, vector<3xf64>
vector.print %vd_csr : vector<3xf64>
// CHECK-NEXT: 3
@@ -271,14 +271,14 @@ module {
-> (tensor<4xindex>, tensor<6x2xindex>), tensor<6xf64>, (i32, tensor<i64>), index
// CHECK-NEXT: ( 1, 2, 3, 4, 5 )
- %vbd = vector.transfer_read %bd[%c0], %f0 : tensor<6xf64>, vector<5xf64>
+ %vbd = vector.transfer_read %bd[%c0], %f0 {in_bounds=[false]} : tensor<6xf64>, vector<5xf64>
vector.print %vbd : vector<5xf64>
// CHECK-NEXT: 5
vector.print %ld : index
// CHECK-NEXT: ( ( 1, 2 ), ( 5, 6 ), ( 7, 8 ), ( 2, 3 ), ( 4, 2 ), ( {{.*}}, {{.*}} ) )
- %vbi = vector.transfer_read %bi[%c0, %c0], %c0 : tensor<6x2xindex>, vector<6x2xindex>
+ %vbi = vector.transfer_read %bi[%c0, %c0], %c0 {in_bounds=[false, false]} : tensor<6x2xindex>, vector<6x2xindex>
vector.print %vbi : vector<6x2xindex>
// CHECK-NEXT: 10
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_permute.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_permute.mlir
index 664a86c7ad58f..7aabbd8dcf98b 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_permute.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_permute.mlir
@@ -70,7 +70,7 @@ module {
func.func @dump(%a: tensor<2x3x4xf64>) {
%c0 = arith.constant 0 : index
%f0 = arith.constant 0.0 : f64
- %v = vector.transfer_read %a[%c0, %c0, %c0], %f0 : tensor<2x3x4xf64>, vector<2x3x4xf64>
+ %v = vector.transfer_read %a[%c0, %c0, %c0], %f0 {in_bounds=[false, false, false]} : tensor<2x3x4xf64>, vector<2x3x4xf64>
vector.print %v : vector<2x3x4xf64>
return
}
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_pooling_nhwc.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_pooling_nhwc.mlir
index 7c78bfc362007..65c5752f918a4 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_pooling_nhwc.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_pooling_nhwc.mlir
@@ -69,7 +69,7 @@ func.func @main() {
%CCCC_ret = call @pooling_nhwc_sum_CCCC(%in_CCCC, %filter) : (tensor<1x4x4x1xf32, #CCCC>, tensor<2x2xf32>) -> tensor<1x3x3x1xf32, #CCCC>
// CHECK: ( ( ( ( 6 ), ( 6 ), ( 6 ) ), ( ( 6 ), ( 6 ), ( 6 ) ), ( ( 6 ), ( 6 ), ( 6 ) ) ) )
- %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0, %c0], %zero
+ %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<1x3x3x1xf32>, vector<1x3x3x1xf32>
vector.print %dense_v : vector<1x3x3x1xf32>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_quantized_matmul.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_quantized_matmul.mlir
index 873322929232a..2f606bdf5c23d 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_quantized_matmul.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_quantized_matmul.mlir
@@ -85,7 +85,7 @@ module {
// CHECK-SAME: ( -254, 0, 256, -300, -30, -6 ),
// CHECK-SAME: ( 1397, 0, -1408, 100, 10, 33 ) )
//
- %v = vector.transfer_read %0[%c0, %c0], %i0
+ %v = vector.transfer_read %0[%c0, %c0], %i0 {in_bounds=[false, false]}
: tensor<5x6xi32>, vector<5x6xi32>
vector.print %v : vector<5x6xi32>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_rewrite_push_back.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_rewrite_push_back.mlir
index 1536249e60f28..6799326ffda91 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_rewrite_push_back.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_rewrite_push_back.mlir
@@ -51,7 +51,7 @@ module {
vector.print %s1 : index
// CHECK ( 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 )
- %values = vector.transfer_read %buffer3[%c0], %d0: memref<?xf32>, vector<11xf32>
+ %values = vector.transfer_read %buffer3[%c0], %d0 {in_bounds=[false]}: memref<?xf32>, vector<11xf32>
vector.print %values : vector<11xf32>
// Release the buffers.
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_rewrite_sort_coo.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_rewrite_sort_coo.mlir
index 0682bc6f314fd..fdcebc490a84f 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_rewrite_sort_coo.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_rewrite_sort_coo.mlir
@@ -114,15 +114,15 @@ module {
sparse_tensor.sort quick_sort %i5, %xy jointly %y1 {perm_map = #ID_MAP, ny = 1 : index}
: memref<?xi32> jointly memref<?xi32>
// Dumps memory in the same order as the perm_map such that the output is ordered.
- %x1v = vector.transfer_read %x1[%i0], %c100: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
+ %x1v = vector.transfer_read %x1[%i0], %c100 {in_bounds=[false]}: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
vector.print %x1v : vector<5xi32>
- %x2v = vector.transfer_read %x2[%i0], %c100: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
+ %x2v = vector.transfer_read %x2[%i0], %c100 {in_bounds=[false]}: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
vector.print %x2v : vector<5xi32>
- %x0v = vector.transfer_read %x0[%i0], %c100: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
+ %x0v = vector.transfer_read %x0[%i0], %c100 {in_bounds=[false]}: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
vector.print %x0v : vector<5xi32>
- %y0v = vector.transfer_read %y0[%i0], %c100: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
+ %y0v = vector.transfer_read %y0[%i0], %c100 {in_bounds=[false]}: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
vector.print %y0v : vector<5xi32>
- %y1v = vector.transfer_read %y1[%i0], %c100: memref<?xi32>, vector<5xi32>
+ %y1v = vector.transfer_read %y1[%i0], %c100 {in_bounds=[false]}: memref<?xi32>, vector<5xi32>
vector.print %y1v : vector<5xi32>
// Stable sort.
// CHECK: ( 1, 1, 2, 5, 10 )
@@ -142,15 +142,15 @@ module {
: (memref<?xi32>, i32, i32, i32, i32, i32) -> ()
sparse_tensor.sort insertion_sort_stable %i5, %xy jointly %y1 {perm_map = #ID_MAP, ny = 1 : index}
: memref<?xi32> jointly memref<?xi32>
- %x1v2 = vector.transfer_read %x1[%i0], %c100: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
+ %x1v2 = vector.transfer_read %x1[%i0], %c100 {in_bounds=[false]}: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
vector.print %x1v2 : vector<5xi32>
- %x2v2 = vector.transfer_read %x2[%i0], %c100: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
+ %x2v2 = vector.transfer_read %x2[%i0], %c100 {in_bounds=[false]}: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
vector.print %x2v2 : vector<5xi32>
- %x0v2 = vector.transfer_read %x0[%i0], %c100: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
+ %x0v2 = vector.transfer_read %x0[%i0], %c100 {in_bounds=[false]}: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
vector.print %x0v2 : vector<5xi32>
- %y0v2 = vector.transfer_read %y0[%i0], %c100: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
+ %y0v2 = vector.transfer_read %y0[%i0], %c100 {in_bounds=[false]}: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
vector.print %y0v2 : vector<5xi32>
- %y1v2 = vector.transfer_read %y1[%i0], %c100: memref<?xi32>, vector<5xi32>
+ %y1v2 = vector.transfer_read %y1[%i0], %c100 {in_bounds=[false]}: memref<?xi32>, vector<5xi32>
vector.print %y1v2 : vector<5xi32>
// Heap sort.
// CHECK: ( 1, 1, 2, 5, 10 )
@@ -170,15 +170,15 @@ module {
: (memref<?xi32>, i32, i32, i32, i32, i32) -> ()
sparse_tensor.sort heap_sort %i5, %xy jointly %y1 {perm_map = #ID_MAP, ny = 1 : index}
: memref<?xi32> jointly memref<?xi32>
- %x1v3 = vector.transfer_read %x1[%i0], %c100: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
+ %x1v3 = vector.transfer_read %x1[%i0], %c100 {in_bounds=[false]}: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
vector.print %x1v3 : vector<5xi32>
- %x2v3 = vector.transfer_read %x2[%i0], %c100: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
+ %x2v3 = vector.transfer_read %x2[%i0], %c100 {in_bounds=[false]}: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
vector.print %x2v3 : vector<5xi32>
- %x0v3 = vector.transfer_read %x0[%i0], %c100: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
+ %x0v3 = vector.transfer_read %x0[%i0], %c100 {in_bounds=[false]}: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
vector.print %x0v3 : vector<5xi32>
- %y0v3 = vector.transfer_read %y0[%i0], %c100: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
+ %y0v3 = vector.transfer_read %y0[%i0], %c100 {in_bounds=[false]}: memref<?xi32, strided<[4], offset: ?>>, vector<5xi32>
vector.print %y0v3 : vector<5xi32>
- %y1v3 = vector.transfer_read %y1[%i0], %c100: memref<?xi32>, vector<5xi32>
+ %y1v3 = vector.transfer_read %y1[%i0], %c100 {in_bounds=[false]}: memref<?xi32>, vector<5xi32>
vector.print %y1v3 : vector<5xi32>
// Release the buffers.
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sampled_matmul.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sampled_matmul.mlir
index 085b36a368704..ba150e00ac67e 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sampled_matmul.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sampled_matmul.mlir
@@ -128,7 +128,7 @@ module {
// CHECK: ( 0, 520, 0, 0, 1250 )
//
scf.for %i = %c0 to %c5 step %c1 {
- %v = vector.transfer_read %0[%i, %c0], %d0: tensor<?x?xf32>, vector<5xf32>
+ %v = vector.transfer_read %0[%i, %c0], %d0 {in_bounds=[false]}: tensor<?x?xf32>, vector<5xf32>
vector.print %v : vector<5xf32>
}
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sampled_mm_fusion.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sampled_mm_fusion.mlir
index eecd970e01ac9..060ce29f1eda9 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sampled_mm_fusion.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sampled_mm_fusion.mlir
@@ -229,9 +229,9 @@ module {
// CHECK-NEXT: values : ( 96, 192 )
// CHECK-NEXT: ----
//
- %v0 = vector.transfer_read %0[%c0, %c0], %d0
+ %v0 = vector.transfer_read %0[%c0, %c0], %d0 {in_bounds=[false, false]}
: tensor<8x8xf64>, vector<8x8xf64>
- %v1 = vector.transfer_read %1[%c0, %c0], %d0
+ %v1 = vector.transfer_read %1[%c0, %c0], %d0 {in_bounds=[false, false]}
: tensor<8x8xf64>, vector<8x8xf64>
vector.print %v0 : vector<8x8xf64>
vector.print %v1 : vector<8x8xf64>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_spmm.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_spmm.mlir
index ca8bcd7744c8f..3aece2d9ca3e4 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_spmm.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_spmm.mlir
@@ -110,7 +110,7 @@ module {
//
// CHECK: ( ( 3548, 3550, 3552, 3554 ), ( 6052, 6053, 6054, 6055 ), ( -56, -63, -70, -77 ), ( -13704, -13709, -13714, -13719 ) )
//
- %v = vector.transfer_read %0[%c0, %c0], %i0: tensor<?x?xf64>, vector<4x4xf64>
+ %v = vector.transfer_read %0[%c0, %c0], %i0 {in_bounds=[false, false]}: tensor<?x?xf64>, vector<4x4xf64>
vector.print %v : vector<4x4xf64>
// Release the resources.
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_strided_conv_2d_nhwc_hwcf.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_strided_conv_2d_nhwc_hwcf.mlir
index 2b2b8536fe39e..886a7f31eaee0 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_strided_conv_2d_nhwc_hwcf.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_strided_conv_2d_nhwc_hwcf.mlir
@@ -110,28 +110,28 @@ func.func @main() {
// CHECK: ( ( ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 20 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ) ),
// CHECK-SAME: ( ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ) ),
// CHECK-SAME: ( ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ) ) )
- %v_dual = vector.transfer_read %dual_CDCC_ret[%c0, %c0, %c0, %c0], %zero
+ %v_dual = vector.transfer_read %dual_CDCC_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<?x?x?x?xf32>, vector<3x3x3x1xf32>
vector.print %v_dual : vector<3x3x3x1xf32>
// CHECK-NEXT: ( ( ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 20 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ) ),
// CHECK-SAME: ( ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ) ),
// CHECK-SAME: ( ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ) ) )
- %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0, %c0], %zero
+ %dense_v = vector.transfer_read %dense_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<?x?x?x?xf32>, vector<3x3x3x1xf32>
vector.print %dense_v : vector<3x3x3x1xf32>
// CHECK-NEXT: ( ( ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 20 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ) ),
// CHECK-SAME: ( ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ) ),
// CHECK-SAME: ( ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ) ) )
- %v1 = vector.transfer_read %CCCC_ret[%c0, %c0, %c0, %c0], %zero
+ %v1 = vector.transfer_read %CCCC_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<?x?x?x?xf32>, vector<3x3x3x1xf32>
vector.print %v1 : vector<3x3x3x1xf32>
// CHECK-NEXT: ( ( ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 20 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ) ),
// CHECK-SAME: ( ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ) ),
// CHECK-SAME: ( ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ), ( ( 0 ), ( 0 ), ( 0 ) ) ) )
- %v2 = vector.transfer_read %CDCC_ret[%c0, %c0, %c0, %c0], %zero
+ %v2 = vector.transfer_read %CDCC_ret[%c0, %c0, %c0, %c0], %zero {in_bounds=[false, false, false, false]}
: tensor<?x?x?x?xf32>, vector<3x3x3x1xf32>
vector.print %v1 : vector<3x3x3x1xf32>
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_unary.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_unary.mlir
index acb7a99a34180..1dc9bfd482bd7 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_unary.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_unary.mlir
@@ -303,7 +303,7 @@ module {
sparse_tensor.print %2 : tensor<?xf64, #SparseVector>
sparse_tensor.print %3 : tensor<?x?xf64, #DCSR>
sparse_tensor.print %4 : tensor<?x?xf64, #DCSR>
- %v = vector.transfer_read %5[%c0], %cmu: tensor<?xi32>, vector<32xi32>
+ %v = vector.transfer_read %5[%c0], %cmu {in_bounds=[false]} : tensor<?xi32>, vector<32xi32>
vector.print %v : vector<32xi32>
// Release the resources.
diff --git a/mlir/test/Integration/Dialect/Standard/CPU/test-ceil-floor-pos-neg.mlir b/mlir/test/Integration/Dialect/Standard/CPU/test-ceil-floor-pos-neg.mlir
index a7013eacc9849..d6a677e5b365a 100644
--- a/mlir/test/Integration/Dialect/Standard/CPU/test-ceil-floor-pos-neg.mlir
+++ b/mlir/test/Integration/Dialect/Standard/CPU/test-ceil-floor-pos-neg.mlir
@@ -10,7 +10,7 @@
func.func @transfer_read_2d(%A : memref<40xi32>, %base1: index) {
%i42 = arith.constant -42: i32
%f = vector.transfer_read %A[%base1], %i42
- {permutation_map = affine_map<(d0) -> (d0)>} :
+ {permutation_map = affine_map<(d0) -> (d0)>, in_bounds = [false]} :
memref<40xi32>, vector<40xi32>
vector.print %f: vector<40xi32>
return
diff --git a/mlir/test/Integration/Dialect/Vector/CPU/realloc.mlir b/mlir/test/Integration/Dialect/Vector/CPU/realloc.mlir
index 6a988000d67d2..17ae068a5b4dc 100644
--- a/mlir/test/Integration/Dialect/Vector/CPU/realloc.mlir
+++ b/mlir/test/Integration/Dialect/Vector/CPU/realloc.mlir
@@ -21,7 +21,7 @@ func.func @entry() {
}
%d0 = arith.constant -1.0 : f32
- %Av = vector.transfer_read %A[%c0], %d0: memref<8xf32>, vector<8xf32>
+ %Av = vector.transfer_read %A[%c0], %d0 {in_bounds=[false]} : memref<8xf32>, vector<8xf32>
vector.print %Av : vector<8xf32>
// CHECK: ( 0, 1, 2, 3, 4, 5, 6, 7 )
@@ -35,7 +35,7 @@ func.func @entry() {
memref.store %fi, %B[%i] : memref<10xf32>
}
- %Bv = vector.transfer_read %B[%c0], %d0: memref<10xf32>, vector<10xf32>
+ %Bv = vector.transfer_read %B[%c0], %d0 {in_bounds=[false]} : memref<10xf32>, vector<10xf32>
vector.print %Bv : vector<10xf32>
// CHECK: ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 )
@@ -51,7 +51,7 @@ func.func @entry() {
memref.store %fi, %C[%i] : memref<13xf32>
}
- %Cv = vector.transfer_read %C[%c0], %d0: memref<13xf32>, vector<13xf32>
+ %Cv = vector.transfer_read %C[%c0], %d0 {in_bounds=[false]} : memref<13xf32>, vector<13xf32>
vector.print %Cv : vector<13xf32>
// CHECK: ( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 )
diff --git a/mlir/test/Integration/Dialect/Vector/CPU/transfer-read-1d.mlir b/mlir/test/Integration/Dialect/Vector/CPU/transfer-read-1d.mlir
index 8a98d39e657f2..c8a306d572092 100644
--- a/mlir/test/Integration/Dialect/Vector/CPU/transfer-read-1d.mlir
+++ b/mlir/test/Integration/Dialect/Vector/CPU/transfer-read-1d.mlir
@@ -21,7 +21,7 @@ memref.global "private" @gv : memref<5x6xf32> =
func.func @transfer_read_1d(%A : memref<?x?xf32>, %base1 : index, %base2 : index) {
%fm42 = arith.constant -42.0: f32
%f = vector.transfer_read %A[%base1, %base2], %fm42
- {permutation_map = affine_map<(d0, d1) -> (d0)>}
+ {permutation_map = affine_map<(d0, d1) -> (d0)>, in_bounds = [false]}
: memref<?x?xf32>, vector<9xf32>
vector.print %f: vector<9xf32>
return
@@ -82,7 +82,7 @@ func.func @transfer_read_1d_broadcast(
%A : memref<?x?xf32>, %base1 : index, %base2 : index) {
%fm42 = arith.constant -42.0: f32
%f = vector.transfer_read %A[%base1, %base2], %fm42
- {permutation_map = affine_map<(d0, d1) -> (0)>}
+ {permutation_map = affine_map<(d0, d1) -> (0)>, in_bounds = [true]}
: memref<?x?xf32>, vector<9xf32>
vector.print %f: vector<9xf32>
return
@@ -105,7 +105,7 @@ func.func @transfer_read_1d_mask(
%fm42 = arith.constant -42.0: f32
%mask = arith.constant dense<[1, 0, 1, 0, 1, 1, 1, 0, 1]> : vector<9xi1>
%f = vector.transfer_read %A[%base1, %base2], %fm42, %mask
- {permutation_map = affine_map<(d0, d1) -> (d0)>}
+ {permutation_map = affine_map<(d0, d1) -> (d0)>, in_bounds = [false]}
: memref<?x?xf32>, vector<9xf32>
vector.print %f: vector<9xf32>
return
@@ -139,7 +139,7 @@ func.func @transfer_write_1d(%A : memref<?x?xf32>, %base1 : index, %base2 : inde
%fn1 = arith.constant -1.0 : f32
%vf0 = vector.splat %fn1 : vector<7xf32>
vector.transfer_write %vf0, %A[%base1, %base2]
- {permutation_map = affine_map<(d0, d1) -> (d0)>}
+ {permutation_map = affine_map<(d0, d1) -> (d0)>, in_bounds = [false]}
: vector<7xf32>, memref<?x?xf32>
return
}
@@ -150,7 +150,7 @@ func.func @transfer_write_1d_mask(%A : memref<?x?xf32>, %base1 : index, %base2 :
%vf0 = vector.splat %fn1 : vector<7xf32>
%mask = arith.constant dense<[1, 0, 1, 0, 1, 1, 1]> : vector<7xi1>
vector.transfer_write %vf0, %A[%base1, %base2], %mask
- {permutation_map = affine_map<(d0, d1) -> (d0)>}
+ {permutation_map = affine_map<(d0, d1) -> (d0)>, in_bounds = [false]}
: vector<7xf32>, memref<?x?xf32>
return
}
diff --git a/mlir/test/Integration/Dialect/Vector/CPU/transfer-read-2d.mlir b/mlir/test/Integration/Dialect/Vector/CPU/transfer-read-2d.mlir
index cb8a8ce8ab0b0..b6415dae77592 100644
--- a/mlir/test/Integration/Dialect/Vector/CPU/transfer-read-2d.mlir
+++ b/mlir/test/Integration/Dialect/Vector/CPU/transfer-read-2d.mlir
@@ -16,7 +16,7 @@ memref.global "private" @gv : memref<3x4xf32> = dense<[[0. , 1. , 2. , 3. ],
func.func @transfer_read_2d(%A : memref<?x?xf32>, %base1: index, %base2: index) {
%fm42 = arith.constant -42.0: f32
%f = vector.transfer_read %A[%base1, %base2], %fm42
- {permutation_map = affine_map<(d0, d1) -> (d0, d1)>} :
+ {in_bounds = [ false, false], permutation_map = affine_map<(d0, d1) -> (d0, d1)>} :
memref<?x?xf32>, vector<4x9xf32>
vector.print %f: vector<4x9xf32>
return
@@ -30,7 +30,7 @@ func.func @transfer_read_2d_mask(%A : memref<?x?xf32>, %base1: index, %base2: in
[1, 1, 1, 1, 1, 1, 1, 0, 1],
[0, 0, 1, 0, 1, 1, 1, 0, 1]]> : vector<4x9xi1>
%f = vector.transfer_read %A[%base1, %base2], %fm42, %mask
- {permutation_map = affine_map<(d0, d1) -> (d0, d1)>} :
+ {in_bounds = [false, false], permutation_map = affine_map<(d0, d1) -> (d0, d1)>} :
memref<?x?xf32>, vector<4x9xf32>
vector.print %f: vector<4x9xf32>
return
@@ -45,7 +45,7 @@ func.func @transfer_read_2d_mask_transposed(
[1, 1, 1, 1, 1, 1, 1, 0, 1],
[0, 0, 1, 0, 1, 1, 1, 0, 1]]> : vector<4x9xi1>
%f = vector.transfer_read %A[%base1, %base2], %fm42, %mask
- {permutation_map = affine_map<(d0, d1) -> (d1, d0)>} :
+ {in_bounds = [false, false], permutation_map = affine_map<(d0, d1) -> (d1, d0)>} :
memref<?x?xf32>, vector<9x4xf32>
vector.print %f: vector<9x4xf32>
return
@@ -57,7 +57,7 @@ func.func @transfer_read_2d_mask_broadcast(
%fm42 = arith.constant -42.0: f32
%mask = arith.constant dense<[1, 0, 1, 0, 1, 1, 1, 0, 1]> : vector<9xi1>
%f = vector.transfer_read %A[%base1, %base2], %fm42, %mask
- {permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
+ {in_bounds = [true, false], permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
memref<?x?xf32>, vector<4x9xf32>
vector.print %f: vector<4x9xf32>
return
@@ -69,7 +69,7 @@ func.func @transfer_read_2d_mask_transpose_broadcast_last_dim(
%fm42 = arith.constant -42.0: f32
%mask = arith.constant dense<[1, 0, 1, 1]> : vector<4xi1>
%f = vector.transfer_read %A[%base1, %base2], %fm42, %mask
- {permutation_map = affine_map<(d0, d1) -> (d1, 0)>} :
+ {in_bounds = [false, true], permutation_map = affine_map<(d0, d1) -> (d1, 0)>} :
memref<?x?xf32>, vector<4x9xf32>
vector.print %f: vector<4x9xf32>
return
@@ -80,7 +80,7 @@ func.func @transfer_read_2d_transposed(
%A : memref<?x?xf32>, %base1: index, %base2: index) {
%fm42 = arith.constant -42.0: f32
%f = vector.transfer_read %A[%base1, %base2], %fm42
- {permutation_map = affine_map<(d0, d1) -> (d1, d0)>} :
+ {in_bounds = [false, false], permutation_map = affine_map<(d0, d1) -> (d1, d0)>} :
memref<?x?xf32>, vector<4x9xf32>
vector.print %f: vector<4x9xf32>
return
@@ -91,7 +91,7 @@ func.func @transfer_read_2d_broadcast(
%A : memref<?x?xf32>, %base1: index, %base2: index) {
%fm42 = arith.constant -42.0: f32
%f = vector.transfer_read %A[%base1, %base2], %fm42
- {permutation_map = affine_map<(d0, d1) -> (d1, 0)>} :
+ {in_bounds = [false, true], permutation_map = affine_map<(d0, d1) -> (d1, 0)>} :
memref<?x?xf32>, vector<4x9xf32>
vector.print %f: vector<4x9xf32>
return
@@ -102,7 +102,7 @@ func.func @transfer_write_2d(%A : memref<?x?xf32>, %base1: index, %base2: index)
%fn1 = arith.constant -1.0 : f32
%vf0 = vector.splat %fn1 : vector<1x4xf32>
vector.transfer_write %vf0, %A[%base1, %base2]
- {permutation_map = affine_map<(d0, d1) -> (d0, d1)>} :
+ {in_bounds = [false, false], permutation_map = affine_map<(d0, d1) -> (d0, d1)>} :
vector<1x4xf32>, memref<?x?xf32>
return
}
@@ -113,7 +113,7 @@ func.func @transfer_write_2d_mask(%A : memref<?x?xf32>, %base1: index, %base2: i
%mask = arith.constant dense<[[1, 0, 1, 0]]> : vector<1x4xi1>
%vf0 = vector.splat %fn1 : vector<1x4xf32>
vector.transfer_write %vf0, %A[%base1, %base2], %mask
- {permutation_map = affine_map<(d0, d1) -> (d0, d1)>} :
+ {in_bounds = [false, false], permutation_map = affine_map<(d0, d1) -> (d0, d1)>} :
vector<1x4xf32>, memref<?x?xf32>
return
}
diff --git a/mlir/test/Integration/Dialect/Vector/CPU/transfer-read-3d.mlir b/mlir/test/Integration/Dialect/Vector/CPU/transfer-read-3d.mlir
index 4aecca3d6891e..8fc38348bb26d 100644
--- a/mlir/test/Integration/Dialect/Vector/CPU/transfer-read-3d.mlir
+++ b/mlir/test/Integration/Dialect/Vector/CPU/transfer-read-3d.mlir
@@ -12,6 +12,7 @@ func.func @transfer_read_3d(%A : memref<?x?x?x?xf32>,
%o: index, %a: index, %b: index, %c: index) {
%fm42 = arith.constant -42.0: f32
%f = vector.transfer_read %A[%o, %a, %b, %c], %fm42
+ {in_bounds = [false, false, false]}
: memref<?x?x?x?xf32>, vector<2x5x3xf32>
vector.print %f: vector<2x5x3xf32>
return
@@ -32,7 +33,8 @@ func.func @transfer_read_3d_broadcast(%A : memref<?x?x?x?xf32>,
%o: index, %a: index, %b: index, %c: index) {
%fm42 = arith.constant -42.0: f32
%f = vector.transfer_read %A[%o, %a, %b, %c], %fm42
- {permutation_map = affine_map<(d0, d1, d2, d3) -> (d1, 0, d3)>}
+ {permutation_map = affine_map<(d0, d1, d2, d3) -> (d1, 0, d3)>,
+ in_bounds = [false, true, false]}
: memref<?x?x?x?xf32>, vector<2x5x3xf32>
vector.print %f: vector<2x5x3xf32>
return
@@ -43,7 +45,8 @@ func.func @transfer_read_3d_mask_broadcast(
%fm42 = arith.constant -42.0: f32
%mask = arith.constant dense<[0, 1]> : vector<2xi1>
%f = vector.transfer_read %A[%o, %a, %b, %c], %fm42, %mask
- {permutation_map = affine_map<(d0, d1, d2, d3) -> (d1, 0, 0)>}
+ {permutation_map = affine_map<(d0, d1, d2, d3) -> (d1, 0, 0)>,
+ in_bounds = [false, true, true]}
: memref<?x?x?x?xf32>, vector<2x5x3xf32>
vector.print %f: vector<2x5x3xf32>
return
@@ -53,7 +56,8 @@ func.func @transfer_read_3d_transposed(%A : memref<?x?x?x?xf32>,
%o: index, %a: index, %b: index, %c: index) {
%fm42 = arith.constant -42.0: f32
%f = vector.transfer_read %A[%o, %a, %b, %c], %fm42
- {permutation_map = affine_map<(d0, d1, d2, d3) -> (d3, d0, d1)>}
+ {permutation_map = affine_map<(d0, d1, d2, d3) -> (d3, d0, d1)>,
+ in_bounds = [false, false, false]}
: memref<?x?x?x?xf32>, vector<3x5x3xf32>
vector.print %f: vector<3x5x3xf32>
return
@@ -63,7 +67,8 @@ func.func @transfer_write_3d(%A : memref<?x?x?x?xf32>,
%o: index, %a: index, %b: index, %c: index) {
%fn1 = arith.constant -1.0 : f32
%vf0 = vector.splat %fn1 : vector<2x9x3xf32>
- vector.transfer_write %vf0, %A[%o, %a, %b, %c]
+ vector.transfer_write %vf0, %A[%o, %a, %b, %c]
+ {in_bounds = [false, false, false] }
: vector<2x9x3xf32>, memref<?x?x?x?xf32>
return
}
diff --git a/mlir/test/Integration/Dialect/Vector/CPU/transfer-read.mlir b/mlir/test/Integration/Dialect/Vector/CPU/transfer-read.mlir
index 91dc945cd3432..afe94ddcde515 100644
--- a/mlir/test/Integration/Dialect/Vector/CPU/transfer-read.mlir
+++ b/mlir/test/Integration/Dialect/Vector/CPU/transfer-read.mlir
@@ -11,7 +11,7 @@
func.func @transfer_read_1d(%A : memref<?xf32>, %base: index) {
%fm42 = arith.constant -42.0: f32
%f = vector.transfer_read %A[%base], %fm42
- {permutation_map = affine_map<(d0) -> (d0)>} :
+ {permutation_map = affine_map<(d0) -> (d0)>, in_bounds = [false]} :
memref<?xf32>, vector<13xf32>
vector.print %f: vector<13xf32>
return
@@ -20,7 +20,7 @@ func.func @transfer_read_1d(%A : memref<?xf32>, %base: index) {
func.func @transfer_read_mask_1d(%A : memref<?xf32>, %base: index) {
%fm42 = arith.constant -42.0: f32
%m = arith.constant dense<[0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]> : vector<13xi1>
- %f = vector.transfer_read %A[%base], %fm42, %m : memref<?xf32>, vector<13xf32>
+ %f = vector.transfer_read %A[%base], %fm42, %m {in_bounds=[false]} : memref<?xf32>, vector<13xf32>
vector.print %f: vector<13xf32>
return
}
@@ -47,7 +47,7 @@ func.func @transfer_write_1d(%A : memref<?xf32>, %base: index) {
%f0 = arith.constant 0.0 : f32
%vf0 = vector.splat %f0 : vector<4xf32>
vector.transfer_write %vf0, %A[%base]
- {permutation_map = affine_map<(d0) -> (d0)>} :
+ {permutation_map = affine_map<(d0) -> (d0)>, in_bounds = [false]} :
vector<4xf32>, memref<?xf32>
return
}
diff --git a/mlir/test/Integration/Dialect/Vector/CPU/transfer-to-loops.mlir b/mlir/test/Integration/Dialect/Vector/CPU/transfer-to-loops.mlir
index 2c1f3e2b6fd52..05dfeca9a07f6 100644
--- a/mlir/test/Integration/Dialect/Vector/CPU/transfer-to-loops.mlir
+++ b/mlir/test/Integration/Dialect/Vector/CPU/transfer-to-loops.mlir
@@ -50,7 +50,7 @@ func.func @main() {
// CHECK-NEXT: [4, 104, 204, 304, 404, 504],
// CHECK-NEXT: [5, 105, 205, 305, 405, 505]]
- %init = vector.transfer_read %0[%c1, %c1], %cst : memref<?x?xf32>, vector<5x5xf32>
+ %init = vector.transfer_read %0[%c1, %c1], %cst {in_bounds=[false, false]} : memref<?x?xf32>, vector<5x5xf32>
vector.print %init : vector<5x5xf32>
// 5x5 block rooted at {1, 1}
// CHECK-NEXT: ( ( 101, 201, 301, 401, 501 ),
@@ -59,7 +59,7 @@ func.func @main() {
// CHECK-SAME: ( 104, 204, 304, 404, 504 ),
// CHECK-SAME: ( 105, 205, 305, 405, 505 ) )
- %1 = vector.transfer_read %0[%c1, %c1], %cst {permutation_map = #map0} : memref<?x?xf32>, vector<5x5xf32>
+ %1 = vector.transfer_read %0[%c1, %c1], %cst {permutation_map = #map0, in_bounds = [false, false]} : memref<?x?xf32>, vector<5x5xf32>
vector.print %1 : vector<5x5xf32>
// Transposed 5x5 block rooted @{1, 1} in memory.
// CHECK-NEXT: ( ( 101, 102, 103, 104, 105 ),
@@ -69,9 +69,9 @@ func.func @main() {
// CHECK-SAME: ( 501, 502, 503, 504, 505 ) )
// Transpose-write the transposed 5x5 block @{0, 0} in memory.
- vector.transfer_write %1, %0[%c0, %c0] {permutation_map = #map0} : vector<5x5xf32>, memref<?x?xf32>
+ vector.transfer_write %1, %0[%c0, %c0] {permutation_map = #map0, in_bounds = [false, false]} : vector<5x5xf32>, memref<?x?xf32>
- %2 = vector.transfer_read %0[%c1, %c1], %cst : memref<?x?xf32>, vector<5x5xf32>
+ %2 = vector.transfer_read %0[%c1, %c1], %cst {in_bounds=[false, false]} : memref<?x?xf32>, vector<5x5xf32>
vector.print %2 : vector<5x5xf32>
// New 5x5 block rooted @{1, 1} in memory.
// Here we expect the boundaries from the original data
@@ -83,7 +83,7 @@ func.func @main() {
// CHECK-SAME: ( 205, 305, 405, 505, 504 ),
// CHECK-SAME: ( 105, 205, 305, 405, 505 ) )
- %3 = vector.transfer_read %0[%c2, %c3], %cst : memref<?x?xf32>, vector<5x5xf32>
+ %3 = vector.transfer_read %0[%c2, %c3], %cst {in_bounds=[false, false]} : memref<?x?xf32>, vector<5x5xf32>
vector.print %3 : vector<5x5xf32>
// New 5x5 block rooted @{2, 3} in memory.
// CHECK-NEXT: ( ( 403, 503, 502, -42, -42 ),
@@ -92,7 +92,7 @@ func.func @main() {
// CHECK-SAME: ( 305, 405, 505, -42, -42 ),
// CHECK-SAME: ( -42, -42, -42, -42, -42 ) )
- %4 = vector.transfer_read %0[%c2, %c3], %cst {permutation_map = #map0} : memref<?x?xf32>, vector<5x5xf32>
+ %4 = vector.transfer_read %0[%c2, %c3], %cst {permutation_map = #map0, in_bounds = [false, false]} : memref<?x?xf32>, vector<5x5xf32>
vector.print %4 : vector<5x5xf32>
// Transposed 5x5 block rooted @{2, 3} in memory.
// CHECK-NEXT: ( ( 403, 404, 405, 305, -42 ),
@@ -101,7 +101,7 @@ func.func @main() {
// CHECK-SAME: ( -42, -42, -42, -42, -42 ),
// CHECK-SAME: ( -42, -42, -42, -42, -42 ) )
- %5 = vector.transfer_read %0[%c2, %c3], %cst {permutation_map = #map1} : memref<?x?xf32>, vector<5xf32>
+ %5 = vector.transfer_read %0[%c2, %c3], %cst {permutation_map = #map1, in_bounds = [false]} : memref<?x?xf32>, vector<5xf32>
vector.print %5 : vector<5xf32>
// CHECK-NEXT: ( 403, 503, 502, -42, -42 )
diff --git a/mlir/test/Integration/Dialect/Vector/CPU/transfer-write.mlir b/mlir/test/Integration/Dialect/Vector/CPU/transfer-write.mlir
index cc6763e54c1cb..492542b41e88b 100644
--- a/mlir/test/Integration/Dialect/Vector/CPU/transfer-write.mlir
+++ b/mlir/test/Integration/Dialect/Vector/CPU/transfer-write.mlir
@@ -16,7 +16,7 @@ func.func @transfer_write13_1d(%A : memref<?xf32>, %base: index) {
%f = arith.constant 13.0 : f32
%v = vector.splat %f : vector<13xf32>
vector.transfer_write %v, %A[%base]
- {permutation_map = affine_map<(d0) -> (d0)>}
+ {permutation_map = affine_map<(d0) -> (d0)>, in_bounds = [false]}
: vector<13xf32>, memref<?xf32>
return
}
@@ -25,7 +25,7 @@ func.func @transfer_write17_1d(%A : memref<?xf32>, %base: index) {
%f = arith.constant 17.0 : f32
%v = vector.splat %f : vector<17xf32>
vector.transfer_write %v, %A[%base]
- {permutation_map = affine_map<(d0) -> (d0)>}
+ {permutation_map = affine_map<(d0) -> (d0)>, in_bounds = [false]}
: vector<17xf32>, memref<?xf32>
return
}
@@ -34,7 +34,7 @@ func.func @transfer_read_1d(%A : memref<?xf32>) -> vector<32xf32> {
%z = arith.constant 0: index
%f = arith.constant 0.0: f32
%r = vector.transfer_read %A[%z], %f
- {permutation_map = affine_map<(d0) -> (d0)>}
+ {permutation_map = affine_map<(d0) -> (d0)>, in_bounds = [false]}
: memref<?xf32>, vector<32xf32>
return %r : vector<32xf32>
}
@@ -133,6 +133,7 @@ func.func @entry() {
call @transfer_write_inbounds_3d(%A1) : (memref<4x4x4xf32>) -> ()
%f = arith.constant 0.0: f32
%r = vector.transfer_read %A1[%c0, %c0, %c0], %f
+ {in_bounds = [true, true, true]}
: memref<4x4x4xf32>, vector<4x4x4xf32>
vector.print %r : vector<4x4x4xf32>
diff --git a/mlir/test/Transforms/loop-invariant-subset-hoisting.mlir b/mlir/test/Transforms/loop-invariant-subset-hoisting.mlir
index 3a78287a0dcad..3bb62fef4324f 100644
--- a/mlir/test/Transforms/loop-invariant-subset-hoisting.mlir
+++ b/mlir/test/Transforms/loop-invariant-subset-hoisting.mlir
@@ -326,12 +326,12 @@ func.func @hoist_vector_transfer_pairs_tensor(
%arg9 = %arg3, %arg10 = %arg4, %arg11 = %arg5)
-> (tensor<?x?xf32>, tensor<?x?xf32>, tensor<?x?xf32>, tensor<?x?xf32>,
tensor<?x?xf32>, tensor<?x?xf32>) {
- %r0 = vector.transfer_read %arg7[%c0, %c0], %cst: tensor<?x?xf32>, vector<1xf32>
- %r1 = vector.transfer_read %arg6[%i, %i], %cst: tensor<?x?xf32>, vector<2xf32>
- %r3 = vector.transfer_read %arg9[%c0, %c0], %cst: tensor<?x?xf32>, vector<4xf32>
+ %r0 = vector.transfer_read %arg7[%c0, %c0], %cst {in_bounds=[true]}: tensor<?x?xf32>, vector<1xf32>
+ %r1 = vector.transfer_read %arg6[%i, %i], %cst {in_bounds=[true]}: tensor<?x?xf32>, vector<2xf32>
+ %r3 = vector.transfer_read %arg9[%c0, %c0], %cst {in_bounds=[true]}: tensor<?x?xf32>, vector<4xf32>
"test.some_crippling_use"(%arg10) : (tensor<?x?xf32>) -> ()
- %r4 = vector.transfer_read %arg10[%c0, %c0], %cst: tensor<?x?xf32>, vector<5xf32>
- %r5 = vector.transfer_read %arg11[%c0, %c0], %cst: tensor<?x?xf32>, vector<6xf32>
+ %r4 = vector.transfer_read %arg10[%c0, %c0], %cst {in_bounds=[true]}: tensor<?x?xf32>, vector<5xf32>
+ %r5 = vector.transfer_read %arg11[%c0, %c0], %cst {in_bounds=[true]}: tensor<?x?xf32>, vector<6xf32>
"test.some_crippling_use"(%arg11) : (tensor<?x?xf32>) -> ()
%u0 = "test.some_use"(%r0) : (vector<1xf32>) -> vector<1xf32>
%u1 = "test.some_use"(%r1) : (vector<2xf32>) -> vector<2xf32>
@@ -339,12 +339,12 @@ func.func @hoist_vector_transfer_pairs_tensor(
%u3 = "test.some_use"(%r3) : (vector<4xf32>) -> vector<4xf32>
%u4 = "test.some_use"(%r4) : (vector<5xf32>) -> vector<5xf32>
%u5 = "test.some_use"(%r5) : (vector<6xf32>) -> vector<6xf32>
- %w1 = vector.transfer_write %u0, %arg7[%c0, %c0] : vector<1xf32>, tensor<?x?xf32>
- %w0 = vector.transfer_write %u1, %arg6[%i, %i] : vector<2xf32>, tensor<?x?xf32>
- %w2 = vector.transfer_write %u2, %arg8[%c0, %c0] : vector<3xf32>, tensor<?x?xf32>
- %w3 = vector.transfer_write %u3, %arg9[%c0, %c0] : vector<4xf32>, tensor<?x?xf32>
- %w4 = vector.transfer_write %u4, %arg10[%c0, %c0] : vector<5xf32>, tensor<?x?xf32>
- %w5 = vector.transfer_write %u5, %arg11[%c0, %c0] : vector<6xf32>, tensor<?x?xf32>
+ %w1 = vector.transfer_write %u0, %arg7[%c0, %c0] {in_bounds=[true]} : vector<1xf32>, tensor<?x?xf32>
+ %w0 = vector.transfer_write %u1, %arg6[%i, %i] {in_bounds=[true]} : vector<2xf32>, tensor<?x?xf32>
+ %w2 = vector.transfer_write %u2, %arg8[%c0, %c0] {in_bounds=[true]} : vector<3xf32>, tensor<?x?xf32>
+ %w3 = vector.transfer_write %u3, %arg9[%c0, %c0] {in_bounds=[true]} : vector<4xf32>, tensor<?x?xf32>
+ %w4 = vector.transfer_write %u4, %arg10[%c0, %c0] {in_bounds=[true]} : vector<5xf32>, tensor<?x?xf32>
+ %w5 = vector.transfer_write %u5, %arg11[%c0, %c0] {in_bounds=[true]} : vector<6xf32>, tensor<?x?xf32>
"test.some_crippling_use"(%w3) : (tensor<?x?xf32>) -> ()
scf.yield %w0, %w1, %w2, %w3, %w4, %w5 :
tensor<?x?xf32>, tensor<?x?xf32>, tensor<?x?xf32>, tensor<?x?xf32>,
@@ -415,14 +415,14 @@ func.func @hoist_vector_transfer_pairs_disjoint_tensor(
iter_args(%arg4 = %arg0, %arg5 = %arg1, %arg6 = %arg2,
%arg7 = %arg3)
-> (tensor<?x?xf32>, tensor<?x?xf32>, tensor<?x?xf32>, tensor<?x?xf32>) {
- %r00 = vector.transfer_read %arg5[%c0, %c0], %cst: tensor<?x?xf32>, vector<2xf32>
- %r01 = vector.transfer_read %arg5[%c0, %c1], %cst: tensor<?x?xf32>, vector<2xf32>
- %r20 = vector.transfer_read %arg6[%c0, %c0], %cst: tensor<?x?xf32>, vector<3xf32>
- %r21 = vector.transfer_read %arg6[%c0, %c3], %cst: tensor<?x?xf32>, vector<3xf32>
- %r30 = vector.transfer_read %arg7[%c0, %random_index], %cst: tensor<?x?xf32>, vector<4xf32>
- %r31 = vector.transfer_read %arg7[%c1, %random_index], %cst: tensor<?x?xf32>, vector<4xf32>
- %r10 = vector.transfer_read %arg4[%i, %i], %cst: tensor<?x?xf32>, vector<2xf32>
- %r11 = vector.transfer_read %arg4[%random_index, %random_index], %cst: tensor<?x?xf32>, vector<2xf32>
+ %r00 = vector.transfer_read %arg5[%c0, %c0], %cst {in_bounds=[true]}: tensor<?x?xf32>, vector<2xf32>
+ %r01 = vector.transfer_read %arg5[%c0, %c1], %cst {in_bounds=[true]}: tensor<?x?xf32>, vector<2xf32>
+ %r20 = vector.transfer_read %arg6[%c0, %c0], %cst {in_bounds=[true]}: tensor<?x?xf32>, vector<3xf32>
+ %r21 = vector.transfer_read %arg6[%c0, %c3], %cst {in_bounds=[true]}: tensor<?x?xf32>, vector<3xf32>
+ %r30 = vector.transfer_read %arg7[%c0, %random_index], %cst {in_bounds=[true]}: tensor<?x?xf32>, vector<4xf32>
+ %r31 = vector.transfer_read %arg7[%c1, %random_index], %cst {in_bounds=[true]}: tensor<?x?xf32>, vector<4xf32>
+ %r10 = vector.transfer_read %arg4[%i, %i], %cst {in_bounds=[true]}: tensor<?x?xf32>, vector<2xf32>
+ %r11 = vector.transfer_read %arg4[%random_index, %random_index], %cst {in_bounds=[true]}: tensor<?x?xf32>, vector<2xf32>
%u00 = "test.some_use"(%r00) : (vector<2xf32>) -> vector<2xf32>
%u01 = "test.some_use"(%r01) : (vector<2xf32>) -> vector<2xf32>
%u20 = "test.some_use"(%r20) : (vector<3xf32>) -> vector<3xf32>
@@ -431,14 +431,14 @@ func.func @hoist_vector_transfer_pairs_disjoint_tensor(
%u31 = "test.some_use"(%r31) : (vector<4xf32>) -> vector<4xf32>
%u10 = "test.some_use"(%r10) : (vector<2xf32>) -> vector<2xf32>
%u11 = "test.some_use"(%r11) : (vector<2xf32>) -> vector<2xf32>
- %w10 = vector.transfer_write %u00, %arg5[%c0, %c0] : vector<2xf32>, tensor<?x?xf32>
- %w11 = vector.transfer_write %u01, %w10[%c0, %c1] : vector<2xf32>, tensor<?x?xf32>
- %w20 = vector.transfer_write %u20, %arg6[%c0, %c0] : vector<3xf32>, tensor<?x?xf32>
- %w21 = vector.transfer_write %u21, %w20[%c0, %c3] : vector<3xf32>, tensor<?x?xf32>
- %w30 = vector.transfer_write %u30, %arg7[%c0, %random_index] : vector<4xf32>, tensor<?x?xf32>
- %w31 = vector.transfer_write %u31, %w30[%c1, %random_index] : vector<4xf32>, tensor<?x?xf32>
- %w00 = vector.transfer_write %u10, %arg4[%i, %i] : vector<2xf32>, tensor<?x?xf32>
- %w01 = vector.transfer_write %u11, %w00[%random_index, %random_index] : vector<2xf32>, tensor<?x?xf32>
+ %w10 = vector.transfer_write %u00, %arg5[%c0, %c0] {in_bounds=[true]} : vector<2xf32>, tensor<?x?xf32>
+ %w11 = vector.transfer_write %u01, %w10[%c0, %c1] {in_bounds=[true]} : vector<2xf32>, tensor<?x?xf32>
+ %w20 = vector.transfer_write %u20, %arg6[%c0, %c0] {in_bounds=[true]} : vector<3xf32>, tensor<?x?xf32>
+ %w21 = vector.transfer_write %u21, %w20[%c0, %c3] {in_bounds=[true]} : vector<3xf32>, tensor<?x?xf32>
+ %w30 = vector.transfer_write %u30, %arg7[%c0, %random_index] {in_bounds=[true]} : vector<4xf32>, tensor<?x?xf32>
+ %w31 = vector.transfer_write %u31, %w30[%c1, %random_index] {in_bounds=[true]} : vector<4xf32>, tensor<?x?xf32>
+ %w00 = vector.transfer_write %u10, %arg4[%i, %i] {in_bounds=[true]} : vector<2xf32>, tensor<?x?xf32>
+ %w01 = vector.transfer_write %u11, %w00[%random_index, %random_index] {in_bounds=[true]} : vector<2xf32>, tensor<?x?xf32>
scf.yield %w01, %w11, %w21, %w31 : tensor<?x?xf32>, tensor<?x?xf32>, tensor<?x?xf32>, tensor<?x?xf32>
}
scf.yield %1#0, %1#1, %1#2, %1#3 : tensor<?x?xf32>, tensor<?x?xf32>, tensor<?x?xf32>, tensor<?x?xf32>
@@ -492,19 +492,19 @@ func.func @hoist_vector_transfer_pairs_tensor_and_slices(
-> (tensor<?x?xf32>, tensor<?x?xf32>, tensor<?x?xf32>) {
// Hoists.
%st0 = tensor.extract_slice %arg6[%i, %i][%step, %step][1, 1] : tensor<?x?xf32> to tensor<?x?xf32>
- %r0 = vector.transfer_read %st0[%c0, %c0], %cst: tensor<?x?xf32>, vector<1xf32>
+ %r0 = vector.transfer_read %st0[%c0, %c0], %cst {in_bounds=[false]}: tensor<?x?xf32>, vector<1xf32>
// CHECK: %[[ST1:.*]] = tensor.extract_slice %[[TENSOR1_ARG_L2]][%[[J]],{{.*}}: tensor<?x?xf32> to tensor<?x?xf32>
// CHECK: %[[V1:.*]] = vector.transfer_read %[[ST1]]{{.*}} : tensor<?x?xf32>, vector<2xf32>
// Does not hoist (slice depends on %j)
%st1 = tensor.extract_slice %arg7[%j, %c0][%step, %step][1, 1] : tensor<?x?xf32> to tensor<?x?xf32>
- %r1 = vector.transfer_read %st1[%c0, %c0], %cst: tensor<?x?xf32>, vector<2xf32>
+ %r1 = vector.transfer_read %st1[%c0, %c0], %cst {in_bounds=[false]}: tensor<?x?xf32>, vector<2xf32>
// CHECK: %[[ST2:.*]] = tensor.extract_slice %[[TENSOR2_ARG_L2]][%[[I]],{{.*}}: tensor<?x?xf32> to tensor<?x?xf32>
// CHECK: %[[V2:.*]] = vector.transfer_read %[[ST2]]{{.*}} : tensor<?x?xf32>, vector<3xf32>
// Does not hoist, 2 slice %arg8.
%st2 = tensor.extract_slice %arg8[%i, %c0][%step, %step][1, 1] : tensor<?x?xf32> to tensor<?x?xf32>
- %r2 = vector.transfer_read %st2[%c0, %c0], %cst: tensor<?x?xf32>, vector<3xf32>
+ %r2 = vector.transfer_read %st2[%c0, %c0], %cst {in_bounds=[false]}: tensor<?x?xf32>, vector<3xf32>
// CHECK: %[[U0:.*]] = "test.some_use"(%[[V0_ARG_L2]]) : (vector<1xf32>) -> vector<1xf32>
// CHECK: %[[U1:.*]] = "test.some_use"(%[[V1]]) : (vector<2xf32>) -> vector<2xf32>
@@ -514,15 +514,15 @@ func.func @hoist_vector_transfer_pairs_tensor_and_slices(
%u2 = "test.some_use"(%r2) : (vector<3xf32>) -> vector<3xf32>
// Hoists
- %w0 = vector.transfer_write %u0, %st0[%c0, %c0] : vector<1xf32>, tensor<?x?xf32>
+ %w0 = vector.transfer_write %u0, %st0[%c0, %c0] {in_bounds=[false]} : vector<1xf32>, tensor<?x?xf32>
// CHECK-DAG: %[[STI1:.*]] = vector.transfer_write %[[U1]], %{{.*}} : vector<2xf32>, tensor<?x?xf32>
// Does not hoist (associated slice depends on %j).
- %w1 = vector.transfer_write %u1, %st1[%i, %i] : vector<2xf32>, tensor<?x?xf32>
+ %w1 = vector.transfer_write %u1, %st1[%i, %i] {in_bounds=[false]} : vector<2xf32>, tensor<?x?xf32>
// CHECK-DAG: %[[STI2:.*]] = vector.transfer_write %[[U2]], %{{.*}} : vector<3xf32>, tensor<?x?xf32>
// Does not hoist, 2 slice / insert_slice for %arg8.
- %w2 = vector.transfer_write %u2, %st2[%c0, %c0] : vector<3xf32>, tensor<?x?xf32>
+ %w2 = vector.transfer_write %u2, %st2[%c0, %c0] {in_bounds=[false]} : vector<3xf32>, tensor<?x?xf32>
// Hoists.
%sti0 = tensor.insert_slice %w0 into %arg6[%i, %i][%step, %step][1, 1] : tensor<?x?xf32> into tensor<?x?xf32>
@@ -570,8 +570,8 @@ func.func @hoist_vector_transfer_pairs_tensor_and_slices(
// CHECK: %[[R5:.*]] = "test.some_use"(%[[R3]]) : (vector<2xf32>) -> vector<2xf32>
// CHECK: scf.yield %[[TL]], %[[R4]], %[[R5]] : tensor<?x?xf32>, vector<2xf32>, vector<2xf32>
// CHECK: }
-// CHECK: %[[W0:.*]] = vector.transfer_write %[[F]]#2, %[[F]]#0[%[[C0]], %[[C3]]] : vector<2xf32>, tensor<?x?xf32>
-// CHECK: %[[W1:.*]] = vector.transfer_write %[[F]]#1, %[[W0]][%[[C0]], %[[C0]]] : vector<2xf32>, tensor<?x?xf32>
+// CHECK: %[[W0:.*]] = vector.transfer_write %[[F]]#2, %[[F]]#0[%[[C0]], %[[C3]]] {{.*}} : vector<2xf32>, tensor<?x?xf32>
+// CHECK: %[[W1:.*]] = vector.transfer_write %[[F]]#1, %[[W0]][%[[C0]], %[[C0]]] {{.*}} : vector<2xf32>, tensor<?x?xf32>
// CHECK: return %[[W1]] : tensor<?x?xf32>
func.func @hoist_vector_transfer_write_pairs_disjoint_tensor(
%tensor: tensor<?x?xf32>,
@@ -583,14 +583,14 @@ func.func @hoist_vector_transfer_write_pairs_disjoint_tensor(
%cst = arith.constant 0.0 : f32
%1 = scf.for %j = %lb to %ub step %step iter_args(%arg5 = %tensor)
-> (tensor<?x?xf32>) {
- %r00 = vector.transfer_read %arg5[%c0, %c0], %cst: tensor<?x?xf32>, vector<2xf32>
+ %r00 = vector.transfer_read %arg5[%c0, %c0], %cst {in_bounds=[false]}: tensor<?x?xf32>, vector<2xf32>
%u00 = "test.some_use"(%r00) : (vector<2xf32>) -> vector<2xf32>
- %w10 = vector.transfer_write %u00, %arg5[%c0, %c0] : vector<2xf32>, tensor<?x?xf32>
+ %w10 = vector.transfer_write %u00, %arg5[%c0, %c0] {in_bounds=[false]} : vector<2xf32>, tensor<?x?xf32>
// Hoist by properly bypassing the disjoint write %w10.
- %r01 = vector.transfer_read %w10[%c0, %c3], %cst: tensor<?x?xf32>, vector<2xf32>
+ %r01 = vector.transfer_read %w10[%c0, %c3], %cst {in_bounds=[false]}: tensor<?x?xf32>, vector<2xf32>
%u01 = "test.some_use"(%r01) : (vector<2xf32>) -> vector<2xf32>
- %w11 = vector.transfer_write %u01, %w10[%c0, %c3] : vector<2xf32>, tensor<?x?xf32>
+ %w11 = vector.transfer_write %u01, %w10[%c0, %c3] {in_bounds=[false]} : vector<2xf32>, tensor<?x?xf32>
scf.yield %w11 : tensor<?x?xf32>
}
return %1 : tensor<?x?xf32>