[all-commits] [llvm/llvm-project] 40deed: [mlir][Transform] Introduce nvgpu transform extens...

Mon Jun 26 09:22:10 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 40deed40ae77ba22f7c72693903752ab6bfeb4e7
      https://github.com/llvm/llvm-project/commit/40deed40ae77ba22f7c72693903752ab6bfeb4e7
  Author: Nicolas Vasilache <nicolasvasilache at users.noreply.github.com>
  Date:   2023-06-26 (Mon, 26 Jun 2023)

  Changed paths:
    M mlir/include/mlir/Dialect/NVGPU/CMakeLists.txt
    A mlir/include/mlir/Dialect/NVGPU/TransformOps/CMakeLists.txt
    A mlir/include/mlir/Dialect/NVGPU/TransformOps/NVGPUTransformOps.h
    A mlir/include/mlir/Dialect/NVGPU/TransformOps/NVGPUTransformOps.td
    M mlir/include/mlir/InitAllDialects.h
    M mlir/lib/Dialect/NVGPU/CMakeLists.txt
    A mlir/lib/Dialect/NVGPU/TransformOps/CMakeLists.txt
    A mlir/lib/Dialect/NVGPU/TransformOps/NVGPUTransformOps.cpp
    A mlir/test/Dialect/NVGPU/transform-matmul-to-nvvm.mlir
    A mlir/test/Integration/GPU/CUDA/TensorCore/transform-mma-sync-matmul-f32.mlir
    M utils/bazel/llvm-project-overlay/mlir/BUILD.bazel

  Log Message:
  -----------
  [mlir][Transform] Introduce nvgpu transform extensions

Mapping to NVGPU operations such as mma.sync with mixed precision, or ldmatrix with transposes
and various data types, requires complex pattern matching on low-level IR.
This amounts to raising complex patterns after structural information has been needlessly lost.
To avoid such unnecessary complexity, introduce a direct mapping step from a matmul on memrefs
to distributed NVGPU vector abstractions.
In this context, mapping to specific mma.sync operations is straightforward: it consists of
translating the documentation into indexing expressions.
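
As an illustration of what "translating the documentation into indexing expressions" can look
like, here is a minimal sketch, not the code from this commit. It assumes the m16n8k4 shape with
tf32 operands and f32 accumulators, and the per-lane fragment layout as I read it in the PTX ISA
documentation. The commit message does not say which shape is used. All function names are
hypothetical.

```python
# Per-lane (row, col) indices for a warp-level mma.sync of shape m16n8k4.
# Assumption: layout follows the PTX ISA fragment description for tf32 operands.
WARP_SIZE = 32


def group_id(lane):
    # Lanes are organized in 8 groups of 4 consecutive lanes.
    return lane >> 2


def thread_in_group(lane):
    return lane % 4


def lhs_indices(lane):
    # A is 16x4; each lane holds 2 elements.
    g, t = group_id(lane), thread_in_group(lane)
    return [(g, t), (g + 8, t)]


def rhs_indices(lane):
    # B is 4x8; each lane holds 1 element.
    g, t = group_id(lane), thread_in_group(lane)
    return [(t, g)]


def acc_indices(lane):
    # C/D is 16x8; each lane holds 4 elements.
    g, t = group_id(lane), thread_in_group(lane)
    return [(g, 2 * t), (g, 2 * t + 1), (g + 8, 2 * t), (g + 8, 2 * t + 1)]


def covers_exactly_once(index_fn, rows, cols):
    # Sanity check: the warp as a whole touches every matrix element exactly once.
    seen = sorted(idx for lane in range(WARP_SIZE) for idx in index_fn(lane))
    return seen == [(r, c) for r in range(rows) for c in range(cols)]
```

Checking `covers_exactly_once(lhs_indices, 16, 4)`, `covers_exactly_once(rhs_indices, 4, 8)`
and `covers_exactly_once(acc_indices, 16, 8)` confirms that the three maps partition their
matrices across the 32 lanes of a warp.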

Correctness is demonstrated with an end-to-end integration test.

Differential Revision: https://reviews.llvm.org/D153420