[all-commits] [llvm/llvm-project] 585cbe: [mlir][gpu] Improving Cubin Serialization with ptx...

Mon Jul 24 03:30:07 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 585cbe3f639783bf0307b47504acbd205f135310
      https://github.com/llvm/llvm-project/commit/585cbe3f639783bf0307b47504acbd205f135310
  Author: Guray Ozen <guray.ozen at gmail.com>
  Date:   2023-07-24 (Mon, 24 Jul 2023)

  Changed paths:
    M mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
    M mlir/lib/Dialect/GPU/Transforms/SerializeToCubin.cpp
    M mlir/lib/Dialect/SparseTensor/Pipelines/SparseTensorPipelines.cpp
    M mlir/test/lib/Dialect/GPU/TestLowerToNVVM.cpp

  Log Message:
  -----------
  [mlir][gpu] Improving Cubin Serialization with ptxas Compiler

This work improves how we compile the generated PTX code by using the `ptxas` compiler. Currently, we rely on the driver's JIT API to compile the PTX code. However, this approach has limitations: it does not always produce the same binary output as the `ptxas` compiler, which can lead to inconsistencies in the generated cubin files.

This work compiles PTX by invoking the `ptxas` compiler directly, which yields more consistent and reliable cubin files. Key benefits:
- Using the `ptxas` compiler directly ensures that the cubin files generated during the build process remain consistent with CUDA compilation using `nvcc` or `clang`.
- It also allows developers to experiment with different `ptxas` versions without changing the compiler. Performance varies across `ptxas` versions, so being able to try different ones easily is useful.
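
For illustration only (this is not code from the commit): a minimal sketch of how a command line for an external `ptxas` invocation might be assembled. The helper name `buildPtxasCommand` and its parameters are hypothetical. The flags used (`-arch=`, `-O<N>`, `-o`) are standard `ptxas` options, but the actual pass may assemble its invocation differently.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of the commit: builds an argv-style command
// for compiling a PTX file to a cubin with an external ptxas binary.
// `ptxasPath` selects which ptxas to run, so a different toolkit version can
// be tried by changing only this argument.
std::vector<std::string> buildPtxasCommand(const std::string &ptxasPath,
                                           const std::string &chip,
                                           const std::string &ptxFile,
                                           const std::string &cubinFile,
                                           int optLevel = 3) {
  return {
      ptxasPath,
      "-arch=" + chip,                 // target GPU architecture, e.g. sm_80
      "-O" + std::to_string(optLevel), // ptxas optimization level
      "-o",
      cubinFile,                       // output cubin path
      ptxFile,                         // input PTX path
  };
}
```

For example, `buildPtxasCommand("/usr/local/cuda/bin/ptxas", "sm_80", "kernel.ptx", "kernel.cubin")` corresponds to running `ptxas -arch=sm_80 -O3 -o kernel.cubin kernel.ptx` with that specific binary; the install path here is an assumption.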

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D155563