[all-commits] [llvm/llvm-project] cce3e8: [MLIR][NVGPU] Introduction of wgmma.generate.descr...
    Guray Ozen via All-commits 
    all-commits at lists.llvm.org
       
    Tue Aug 22 07:12:42 PDT 2023
    
    
  
  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: cce3e8ed895b2d4c1396929c363c071e15fdbf8b
      https://github.com/llvm/llvm-project/commit/cce3e8ed895b2d4c1396929c363c071e15fdbf8b
  Author: Guray Ozen <guray.ozen at gmail.com>
  Date:   2023-08-22 (Tue, 22 Aug 2023)
  Changed paths:
    M mlir/include/mlir/Dialect/NVGPU/IR/NVGPU.td
    M mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
    M mlir/lib/Dialect/NVGPU/IR/NVGPUDialect.cpp
    M mlir/test/Conversion/NVGPUToNVVM/nvgpu-to-nvvm.mlir
  Log Message:
  -----------
  [MLIR][NVGPU] Introduction of wgmma.generate.descriptor Op
This work introduces a new Op, `wgmma.generate.descriptor`, which creates a wgmma descriptor for the inputs of matrix multiply-and-accumulate operations that use the `wgmma.mma_async` PTX instruction.
The descriptor format is specified at the following link:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#asynchronous-warpgroup-level-matrix-shared-memory-layout-matrix-descriptor
Note that this op is in its initial phase and has limitations: it only supports 128b swizzling and does not support interleaving. Other descriptor calculations will be added in separate follow-up work, expanding the capabilities of the op.
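As a rough illustration of what such a descriptor holds, the sketch below packs a 64-bit value in plain Python. It is not the lowering from this commit: the field positions (start address in bits 13-0, leading-dimension byte offset in bits 29-16, stride-dimension byte offset in bits 45-32, base offset in bits 51-49, swizzle mode in bits 63-62) and the `(x & 0x3FFFF) >> 4` encoding are assumptions taken from a reading of the PTX documentation linked above and should be checked against it. The function names and the example operand values are hypothetical.

```python
# Assumed swizzle-mode encodings for bits 63-62 (verify against the PTX ISA).
SWIZZLE_NONE = 0
SWIZZLE_128B = 1
SWIZZLE_64B = 2
SWIZZLE_32B = 3


def encode_field(value):
    """Assumed matrix-descriptor encoding: keep 18 bits, drop the low 4."""
    return (value & 0x3FFFF) >> 4


def make_wgmma_descriptor(smem_addr, leading_byte_offset, stride_byte_offset,
                          base_offset=0, swizzle=SWIZZLE_128B):
    """Pack a 64-bit descriptor (hypothetical helper, not an MLIR API)."""
    desc = encode_field(smem_addr)                   # bits 13-0
    desc |= encode_field(leading_byte_offset) << 16  # bits 29-16
    desc |= encode_field(stride_byte_offset) << 32   # bits 45-32
    desc |= (base_offset & 0x7) << 49                # bits 51-49
    desc |= (swizzle & 0x3) << 62                    # bits 63-62
    return desc


# Arbitrary example values, chosen only to make each field visible.
example = make_wgmma_descriptor(smem_addr=0x1000,
                                leading_byte_offset=16,
                                stride_byte_offset=1024)
print(hex(example))  # 0x4000004000010100
```

The default `swizzle=SWIZZLE_128B` mirrors the limitation stated above that only 128b swizzling is supported at this stage; the other modes are listed only to show where they would go.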
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D157382
    
    