[all-commits] [llvm/llvm-project] 585cbe: [mlir][gpu] Improving Cubin Serialization with ptx...
Guray Ozen via All-commits
all-commits at lists.llvm.org
Mon Jul 24 03:30:07 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 585cbe3f639783bf0307b47504acbd205f135310
https://github.com/llvm/llvm-project/commit/585cbe3f639783bf0307b47504acbd205f135310
Author: Guray Ozen <guray.ozen at gmail.com>
Date: 2023-07-24 (Mon, 24 Jul 2023)
Changed paths:
M mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
M mlir/lib/Dialect/GPU/Transforms/SerializeToCubin.cpp
M mlir/lib/Dialect/SparseTensor/Pipelines/SparseTensorPipelines.cpp
M mlir/test/lib/Dialect/GPU/TestLowerToNVVM.cpp
Log Message:
-----------
[mlir][gpu] Improving Cubin Serialization with ptxas Compiler
This work improves how we compile the generated PTX code by using the `ptxas` compiler. Currently, we rely on the driver's JIT API to compile the PTX code. However, this approach has a limitation: it doesn't always produce the same binary output as the `ptxas` compiler, which can lead to inconsistencies in the generated cubin files.
This work invokes the `ptxas` compiler directly for PTX compilation, which gives more consistent and reliable results when generating cubin files.
Key benefits:
- Using the `ptxas` compiler directly ensures that the cubin files generated during the build process remain consistent with those produced by CUDA compilation using `nvcc` or `clang`.
- It also allows developers to experiment with different `ptxas` compilers without changing the compiler itself. Performance varies across `ptxas` versions, so being able to swap in a different `ptxas` easily is useful.
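
As a rough illustration of what calling `ptxas` directly involves, the sketch below builds the command line for a standalone PTX-to-cubin compile. It is not taken from this commit: the helper names `buildPtxasCommand` and `joinCommand`, the file names, and the default optimization level are assumptions made for the example, and the actual logic in SerializeToCubin.cpp may differ. Only the `ptxas` options `-arch`, `-O`, and `-o` are relied on.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of the commit): builds the argument list for
// a standalone ptxas run that compiles a PTX file into a cubin file.
std::vector<std::string> buildPtxasCommand(const std::string &ptxasPath,
                                           const std::string &chip,
                                           const std::string &ptxFile,
                                           const std::string &cubinFile,
                                           int optLevel = 3) {
  // A tool path and a target chip are required to form a usable command.
  assert(!ptxasPath.empty() && !chip.empty());
  return {ptxasPath,
          "-arch=" + chip,                 // target GPU, e.g. sm_80
          "-O" + std::to_string(optLevel), // ptxas optimization level
          ptxFile,                         // input PTX
          "-o", cubinFile};                // output cubin
}

// Hypothetical helper: joins the arguments with spaces for logging.
// It does no shell quoting, so it is meant for display only.
std::string joinCommand(const std::vector<std::string> &args) {
  std::string out;
  for (const std::string &arg : args) {
    if (!out.empty())
      out += ' ';
    out += arg;
  }
  return out;
}
```

Because the `ptxas` path is an ordinary parameter here, trying a different `ptxas` version amounts to passing a different path, for example `buildPtxasCommand("/opt/cuda-12/bin/ptxas", "sm_80", "kernel.ptx", "kernel.cubin")`, where that path is again only an assumed example.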
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D155563