[Mlir-commits] [mlir] [MLIR][NVGPU] Introduce `warpgroup.init.accumulator` Op (PR #67530)

Mon Oct 9 15:47:33 PDT 2023

================
@@ -727,4 +727,15 @@ def NVGPU_WarpgroupMmaOp : NVGPU_Op<"warpgroup.mma"> {
   let hasVerifier = 1;
 }
 
+def NVGPU_WarpgroupMmaInitAccumulatorOp : NVGPU_Op<"warpgroup.mma.init.accumulator"> {  
+  let summary = "Initialize accumulator matrix for `warppgroup.mma`";
+
+  let description = [{
+    This Op generates and initilizes the accumulator matrix for 
+    `nvgpu.warpgroup.mma` op to perform matrix-multiply-and-accumulate (mma).
+  }];
+  let results = (outs Variadic<NVGPU_WarpgroupAccumulator>:$matrixC);
----------------
grypp wrote:

I decided to simplify the `nvgpu.warpgroup.accumulator` type. I am going to get rid of the variadic, but that means changing all 3 ops that use it.

The IR will look like this, instead of the first option I wrote above:

```
// Init
%matrixC1, %matrixC2 = nvgpu.warpgroup.mma.init.accumulator ->
                    !nvgpu.warpgroup.accumulator<fragmented = vector<128x128xf32>>

// GEMM
%matrixD = nvgpu.warpgroup.mma %descA, %descB, %matrixC ...

// Epilogue
nvgpu.warpgroup.mma.store [%matrixD1, %matrixD2] to %sharedMemoryBuffer
  : !nvgpu.warpgroup.accumulator<fragmented = vector<128x128xf32>>
    into memref<128x128xf32,3>
```
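
For the op in the hunk above, dropping the variadic would presumably come down to a one-line change in the result list. This is a minimal sketch only: it assumes the `NVGPU_WarpgroupAccumulator` type constraint keeps its current name and that the result is still called `$matrixC`. The updated patch may differ.

```
def NVGPU_WarpgroupMmaInitAccumulatorOp : NVGPU_Op<"warpgroup.mma.init.accumulator"> {
  let summary = "Initialize accumulator matrix for `warpgroup.mma`";
  // Assumption: a single result carries the whole fragmented accumulator,
  // replacing `Variadic<NVGPU_WarpgroupAccumulator>`.
  let results = (outs NVGPU_WarpgroupAccumulator:$matrixC);
}
```

With a single result, the same non-variadic operand form would apply to `nvgpu.warpgroup.mma` and `nvgpu.warpgroup.mma.store`.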

https://github.com/llvm/llvm-project/pull/67530