[all-commits] [llvm/llvm-project] 2de936: [mlir][vector] Fix emulation of "narrow" type `vec...
Andrzej Warzyński via All-commits
all-commits at lists.llvm.org
Thu Apr 24 10:06:03 PDT 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 2de936b6eb38e7a37224a97c2a22aa79b9dfb9dc
https://github.com/llvm/llvm-project/commit/2de936b6eb38e7a37224a97c2a22aa79b9dfb9dc
Author: Andrzej Warzyński <andrzej.warzynski at arm.com>
Date: 2025-04-24 (Thu, 24 Apr 2025)
Changed paths:
M mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp
M mlir/test/Dialect/Vector/vector-emulate-narrow-type-unaligned.mlir
M mlir/test/Dialect/Vector/vector-emulate-narrow-type.mlir
Log Message:
-----------
[mlir][vector] Fix emulation of "narrow" type `vector.store` (#133231)
Below are two examples of "narrow" `vector.stores`. The first example
does not require partial stores and hence no RMW stores. This is
currently emulated correctly.
```mlir
func.func @example_1(%arg0: vector<4xi2>) {
  %0 = memref.alloc() : memref<13xi2>
  %c4 = arith.constant 4 : index
  vector.store %arg0, %0[%c4] : memref<13xi2>, vector<4xi2>
  return
}
```
The second example requires a partial (and hence RMW) store due to the
offset pointing outside the emulated type boundary (`%c3`).
```mlir
func.func @example_2(%arg0: vector<4xi2>) {
  %0 = memref.alloc() : memref<13xi2>
  %c3 = arith.constant 3 : index
  vector.store %arg0, %0[%c3] : memref<13xi2>, vector<4xi2>
  return
}
```
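The difference between the two examples comes down to index arithmetic.
The sketch below is only an illustration of that arithmetic, not code
from the patch: `LinearizedOffset`, `linearize` and `containersTouched`
are hypothetical names, and the only thing borrowed from the source is
the notion of an intra-container offset (`intraDataOffset`).
```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type (not part of MLIR): an index expressed in narrow
// elements, split into a container index and an offset inside the container.
struct LinearizedOffset {
  int64_t containerIndex;  // which i8 the access starts in
  int64_t intraDataOffset; // how many narrow elements into that i8
};

// Splits `elemIndex` (counted in narrow elements, e.g. i2) into a container
// index and an intra-container offset.
LinearizedOffset linearize(int64_t elemIndex, int64_t elemBits,
                           int64_t containerBits) {
  assert(elemIndex >= 0);
  assert(elemBits > 0 && containerBits % elemBits == 0);
  const int64_t scale = containerBits / elemBits; // 4 for i2 packed in i8
  return {elemIndex / scale, elemIndex % scale};
}

// Number of containers that a store of `numElems` narrow elements starting
// at `elemIndex` overlaps.
int64_t containersTouched(int64_t elemIndex, int64_t numElems,
                          int64_t elemBits, int64_t containerBits) {
  assert(numElems > 0);
  const LinearizedOffset first = linearize(elemIndex, elemBits, containerBits);
  const LinearizedOffset last =
      linearize(elemIndex + numElems - 1, elemBits, containerBits);
  return last.containerIndex - first.containerIndex + 1;
}
```
With i2 elements packed into i8, `linearize(4, 2, 8)` gives container 1
with intra-offset 0, so the store in `@example_1` covers exactly one
byte. `linearize(3, 2, 8)` gives container 0 with intra-offset 3, so the
store in `@example_2` starts in the last slot of byte 0 and spills into
byte 1.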
This is currently incorrectly emulated as a single "full" store (note
that the offset is incorrect) instead of partial stores:
```mlir
func.func @example_2(%arg0: vector<4xi2>) {
  %alloc = memref.alloc() : memref<4xi8>
  %0 = vector.bitcast %arg0 : vector<4xi2> to vector<1xi8>
  %c0 = arith.constant 0 : index
  vector.store %0, %alloc[%c0] : memref<4xi8>, vector<1xi8>
  return
}
```
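To make the difference observable, here is a small byte-level model of
the store in `@example_2`. It is a sketch under stated assumptions and
does not reproduce the IR that the pattern emits: all function names are
hypothetical, and the bit layout (element `k` occupies bits
`2*(k%4)..2*(k%4)+1` of byte `k/4`) is an assumption chosen for
illustration.
```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Backing storage for memref<13xi2>: 26 bits rounded up to 4 bytes.
using Bytes = std::array<uint8_t, 4>;

// ASSUMPTION: element k lives in byte k / 4, at bit position 2 * (k % 4).
uint8_t loadI2(const Bytes &mem, int64_t k) {
  assert(k >= 0 && k < 16);
  return static_cast<uint8_t>((mem[k / 4] >> (2 * (k % 4))) & 0x3);
}

// Read-modify-write of a single i2 element: neighbours in the same byte
// are preserved.
void storeI2(Bytes &mem, int64_t k, uint8_t value) {
  assert(k >= 0 && k < 16);
  const int shift = static_cast<int>(2 * (k % 4));
  unsigned byte = mem[k / 4];       // read
  byte &= ~(0x3u << shift);         // clear the 2-bit slot
  byte |= (value & 0x3u) << shift;  // insert the new value
  mem[k / 4] = static_cast<uint8_t>(byte); // write back
}

// Reference semantics of storing 4 x i2 at element offset `offset`.
void storeVec4(Bytes &mem, int64_t offset, const std::array<uint8_t, 4> &v) {
  for (int64_t i = 0; i < 4; ++i)
    storeI2(mem, offset + i, v[i]);
}

// Model of the incorrect emulation: the vector is bitcast to one byte and
// written as a full store at byte 0, ignoring the element offset.
void storeVec4AsFullByteAtZero(Bytes &mem, const std::array<uint8_t, 4> &v) {
  const unsigned packed = (v[0] & 0x3u) | ((v[1] & 0x3u) << 2) |
                          ((v[2] & 0x3u) << 4) | ((v[3] & 0x3u) << 6);
  mem[0] = static_cast<uint8_t>(packed);
}
```
Starting from zeroed memory, `storeVec4(mem, 3, v)` modifies the top two
bits of byte 0 and the low six bits of byte 1, whereas the full-byte
variant overwrites all of byte 0 and leaves byte 1 untouched.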
The incorrect emulation stems from this simplified (i.e. incomplete)
calculation of the front padding:
```cpp
std::optional<int64_t> foldedNumFrontPadElems =
    isDivisibleInSize ? 0
                      : getConstantIntValue(linearizedInfo.intraDataOffset);
```
Since `isDivisibleInSize` is `true` (i8 / i2 = 4):
* front padding is set to `0` and, as a result,
* the input offset (`%c3`) is ignored, and
* we incorrectly assume that partial stores won't be needed.
Note that in both examples we are storing `vector<4xi2>` into
`memref<13xi2>` (note _different_ trailing dims) and hence partial
stores might in fact be required. The condition above is updated to:
```cpp
std::optional<int64_t> foldedNumFrontPadElems =
    (isDivisibleInSize && trailingDimsMatch)
        ? 0
        : getConstantIntValue(linearizedInfo.intraDataOffset);
```
This change ensures that the input offset is properly taken into
account, which fixes the issue. It doesn't affect `@example_1`.
Additional comments are added to clarify the current logic.
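For reference, the effect of the updated condition on the two examples
can be sketched with plain integers. This is a simplification, not the
code in `VectorEmulateNarrowType.cpp`: the real expression works on
`std::optional<int64_t>` via `getConstantIntValue`, and the function
names below are hypothetical. Both flags are passed in as inputs rather
than derived.
```cpp
#include <cassert>
#include <cstdint>

// Simplified model of the condition before the fix: the intra-container
// offset is dropped whenever the sizes divide evenly.
int64_t frontPadBefore(bool isDivisibleInSize, int64_t intraDataOffset) {
  assert(intraDataOffset >= 0);
  return isDivisibleInSize ? 0 : intraDataOffset;
}

// Simplified model of the condition after the fix: the offset is dropped
// only if the trailing dims of source and destination also match.
int64_t frontPadAfter(bool isDivisibleInSize, bool trailingDimsMatch,
                      int64_t intraDataOffset) {
  assert(intraDataOffset >= 0);
  return (isDivisibleInSize && trailingDimsMatch) ? 0 : intraDataOffset;
}
```
For `@example_2` (`isDivisibleInSize` true, trailing dims 13 vs 4, intra
offset 3), `frontPadBefore(true, 3)` yields 0 while
`frontPadAfter(true, false, 3)` yields 3. For `@example_1` the intra
offset is already 0, so both versions yield 0.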