[Mlir-commits] [mlir] [MLIR][XeGPU] Update XeGPU create_tdesc, update_offset, load, store and prefetch. (PR #154653)

Thu Aug 21 11:21:11 PDT 2025

akroviakov wrote:

Another possible inconsistency for SIMT code in the current upstream: LoadGather can have a memref source, and in that case the chunk size needs to be part of the op. In verification, when we check the mask shape, we have:
```cpp
  llvm::SmallVector<int64_t> expectedMaskShape(valueShape);
  if (chunkSize > 1)
    expectedMaskShape.pop_back();
  if (expectedMaskShape != maskShape)
    return emitError() << "Mask should match value except the chunk size dim.";
```

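For SIMT code (i.e., 1D `valueShape`) with a chunk size greater than 1, this means that we always hit `(expectedMaskShape != maskShape)`: `expectedMaskShape` is copied from the 1D `valueShape`, so `pop_back()` removes its only dimension and leaves it empty, while the mask is still 1D.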
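To make the failure easy to reproduce outside the verifier, here is a minimal standalone sketch of the same check. It is not the upstream code: `maskShapeMatches` is a hypothetical helper name, `std::vector` stands in for `llvm::SmallVector`, and the `empty()` guard is my addition so the sketch stays well-defined for a rank-0 input.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical standalone mirror of the mask-shape check quoted above.
// Returns true when the mask shape would pass the check.
bool maskShapeMatches(const std::vector<int64_t> &valueShape,
                      const std::vector<int64_t> &maskShape,
                      int64_t chunkSize) {
  std::vector<int64_t> expectedMaskShape(valueShape);
  // Drop the trailing (chunk) dimension when chunking is used.
  // The empty() guard is an addition for this sketch only.
  if (chunkSize > 1 && !expectedMaskShape.empty())
    expectedMaskShape.pop_back();
  // For a 1D valueShape with chunkSize > 1, expectedMaskShape is now
  // empty, so it can never equal a 1D maskShape.
  return expectedMaskShape == maskShape;
}
```

For the IR below, the value is `vector<8xf16>` and the mask is `vector<1xi1>`, so the call would be `maskShapeMatches({8}, {1}, 8)`, which returns `false`. A 2D value such as `{16, 8}` with mask `{16}` passes, which is why only the SIMT form is affected.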

Example IR:
```mlir
    gpu.func @scatter_ops(%arg0: memref<256xf16>, %arg1: vector<16xindex>) {
      %cst = arith.constant dense<true> : vector<1xi1>
      %0 = gpu.lane_id
      %1 = vector.extract %arg1[%0] : index from vector<16xindex>
      %2 = vector.broadcast %1 : index to vector<1xindex>
      %3 = xegpu.load %arg0[%2], %cst <{chunk_size = 8 : i64}> : memref<256xf16>, vector<1xindex>, vector<1xi1> -> vector<8xf16>
      xegpu.store %3, %arg0[%2], %cst <{chunk_size = 8 : i64}> : vector<8xf16>, memref<256xf16>, vector<1xindex>, vector<1xi1>
      gpu.return
    }
```

Could you please add this example (maybe a simplified version of it) to see whether this PR addresses the SIMT verification inconsistency?

https://github.com/llvm/llvm-project/pull/154653