[Mlir-commits] [mlir] [MLIR][XeGPU] Update XeGPU create_tdesc, update_offset, load, store and prefetch. (PR #154653)
Artem Kroviakov
llvmlistbot at llvm.org
Thu Aug 21 11:21:11 PDT 2025
akroviakov wrote:
Another possible inconsistency for SIMT code in the current upstream: LoadGather can have a memref source, and in that case the chunk size needs to be part of the op. In verification, when we check the mask shape, we have:
```cpp
llvm::SmallVector<int64_t> expectedMaskShape(valueShape);
if (chunkSize > 1)
  expectedMaskShape.pop_back();
if (expectedMaskShape != maskShape)
  return emitError() << "Mask should match value except the chunk size dim.";
```
For SIMT code (i.e., 1D `valueShape`) with a chunk size greater than 1, this means we always hit `(expectedMaskShape != maskShape)`: `expectedMaskShape` starts as a copy of the 1D `valueShape`, so the pop leaves it empty, while the mask is still 1D.
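To make the mismatch concrete, here is a rough standalone sketch of the same comparison. It is not the upstream verifier: `maskShapeMatches` is a hypothetical name, `std::vector` stands in for `llvm::SmallVector`, and the empty-shape guard is added only so the sketch is safe to call on its own.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical standalone mirror of the mask-shape check quoted above.
// Assumption: std::vector replaces llvm::SmallVector so this builds without LLVM.
bool maskShapeMatches(const std::vector<int64_t> &valueShape,
                      const std::vector<int64_t> &maskShape,
                      int64_t chunkSize) {
  std::vector<int64_t> expectedMaskShape(valueShape);
  // Drop the trailing (chunk) dimension when chunking is in use.
  // The empty() guard is not in the quoted snippet; it only avoids
  // calling pop_back() on an empty vector in this sketch.
  if (chunkSize > 1 && !expectedMaskShape.empty())
    expectedMaskShape.pop_back();
  return expectedMaskShape == maskShape;
}
```

With the SIMT shapes from this comment, `maskShapeMatches({8}, {1}, 8)` compares an empty expected shape against `{1}` and returns false. A 2D case such as `maskShapeMatches({16, 8}, {16}, 8)` returns true, which is why only the 1D case trips the check.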
Example IR:
```mlir
gpu.func @scatter_ops(%arg0: memref<256xf16>, %arg1: vector<16xindex>) {
  %cst = arith.constant dense<true> : vector<1xi1>
  %0 = gpu.lane_id
  %1 = vector.extract %arg1[%0] : index from vector<16xindex>
  %2 = vector.broadcast %1 : index to vector<1xindex>
  %3 = xegpu.load %arg0[%2], %cst <{chunk_size = 8 : i64}> : memref<256xf16>, vector<1xindex>, vector<1xi1> -> vector<8xf16>
  xegpu.store %3, %arg0[%2], %cst <{chunk_size = 8 : i64}> : vector<8xf16>, memref<256xf16>, vector<1xindex>, vector<1xi1>
  gpu.return
}
```
Could you please add this example (maybe a simplified version of it) to see whether this PR addresses the SIMT verification inconsistency?
https://github.com/llvm/llvm-project/pull/154653