[Mlir-commits] [mlir] [MLIR][XeGPU] Add unroll pattern for load_gather and store_scatter with offsets (PR #159453)

Tue Sep 23 14:47:35 PDT 2025

================
@@ -342,6 +408,31 @@ gpu.module @test {
     gpu.return
   }
 
+//-----
+  // CHECK-LABEL: store_with_offsets_chunk
+  // CHECK-SAME: [[arg0:%.+]]: memref<64xf32>
+  // CHECK: [[cst:%.+]] = arith.constant dense<1.023000e+03> : vector<16x2xf32
+  // CHECK: [[cst0:%.+]] = arith.constant dense<[130, 138, 146, 154, 162, 170, 178, 186, 194, 202, 210, 218, 226, 234, 242, 250]> : vector<16xindex>
+  // CHECK: [[cst1:%.+]] = arith.constant dense<[2, 10, 18, 26, 34, 42, 50, 58, 66, 74, 82, 90, 98, 106, 114, 122]> : vector<16xindex>
+  // CHECK: [[cst2:%.+]] = arith.constant dense<[128, 136, 144, 152, 160, 168, 176, 184, 192, 200, 208, 216, 224, 232, 240, 248]> : vector<16xindex>
+  // CHECK: [[cst3:%.+]] = arith.constant dense<[0, 8, 16, 24, 32, 40, 48, 56, 64, 72, 80, 88, 96, 104, 112, 120]> : vector<16xindex>
+  // CHECK-COUNT-4: xegpu.store  {{.*}}[{{.*}}], {{.*}} <{chunk_size = 2 : i64, l1_hint = #xegpu.cache_hint<cached>}> : vector<16x2xf32>, memref<64xf32>, vector<16xindex>, vector<16xi1>
+  gpu.func @store_with_offsets_chunk(%src: memref<64xf32>) {
----------------
Jianhui-Li wrote:

Consider changing memref<64xf32> to ui64: xegpu.store doesn't require the structure info, and the 64xf32 shape actually causes out-of-bounds accesses here, since the offsets go well past 64 elements.
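
To make the out-of-bounds concern concrete, here is a rough arithmetic check. It is only a sketch: it assumes the offsets are counted in f32 elements, that each lane writes chunk_size consecutive elements starting at its offset, and that no lane is masked off (the mask is not visible in this hunk). The offset values are taken from the four CHECK constants above; the names below are illustrative and not part of the PR.

```python
# Sketch: check the unrolled store offsets against a memref<64xf32>.
# Assumptions (not confirmed by the PR): offsets are in elements, each lane
# writes CHUNK_SIZE consecutive elements, and every lane is enabled.

NUM_ELEMENTS = 64  # from memref<64xf32>
CHUNK_SIZE = 2     # from chunk_size = 2 on the unrolled xegpu.store

# Offsets from the four CHECK constants in the hunk (16 lanes each, stride 8).
offset_vectors = {
    "cst0": list(range(130, 251, 8)),  # 130, 138, ..., 250
    "cst1": list(range(2, 123, 8)),    # 2, 10, ..., 122
    "cst2": list(range(128, 249, 8)),  # 128, 136, ..., 248
    "cst3": list(range(0, 121, 8)),    # 0, 8, ..., 120
}


def out_of_bounds(offsets):
    """Return the offsets whose last written element falls outside the memref."""
    return [o for o in offsets if o + CHUNK_SIZE - 1 >= NUM_ELEMENTS]


# Highest element index any lane would touch.
max_touched = max(max(v) for v in offset_vectors.values()) + CHUNK_SIZE - 1

# Number of out-of-bounds lanes per constant.
oob_counts = {name: len(out_of_bounds(v)) for name, v in offset_vectors.items()}

if __name__ == "__main__":
    print("highest element index touched:", max_touched)
    for name, count in oob_counts.items():
        print(f"{name}: {count} of {len(offset_vectors[name])} lanes out of bounds")
```

Under those assumptions the highest element index touched is 251, and 48 of the 64 lanes land outside the 64-element buffer, which is consistent with the concern raised in the comment.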

https://github.com/llvm/llvm-project/pull/159453