[Mlir-commits] [mlir] [mlir][gpu][NVPTX] Enable NVIDIA GPU JIT compilation path (PR #66220)
Fabian Mora
llvmlistbot at llvm.org
Thu Sep 14 06:20:40 PDT 2023
================
@@ -144,6 +144,22 @@ struct SparseCompilerOptions
desc("GPU target architecture")};
PassOptions::Option<std::string> gpuFeatures{*this, "gpu-features",
desc("GPU target features")};
+ /// For NVIDIA GPUs there are 3 compilation format options:
+ /// 1. `isa`: the compiler generates PTX and the runtime JITs the PTX.
+ /// 2. `bin`: generates a CUBIN object for `chip=gpuChip`.
+ /// 3. `fatbin`: generates a fat binary with a CUBIN object for `gpuChip` and
+ /// also embeds the PTX in the fat binary.
+ /// Notes:
+ /// Option 1 adds a significant runtime performance hit, however, tests are
+ /// more likely to pass with this option.
+ /// Option 2 is better for execution time as there is no JIT; however, the
+ /// program will fail if there's an arch mismatch between `gpuChip` and the
+ /// GPU running the program.
+ /// Option 3 is the best compromise between options 1 & 2 as it can JIT in
+ /// case of an arch mismatch, however, it's only possible to JIT to a higher
+ /// CC than `gpuChip`.
----------------
fabianmcg wrote:
It's never specified; that's why `gpu-to-cubin` always worked: it's always JITted to the running arch.
If there's an arch mismatch, then options 1 and 3 have the same performance hit. However, if the compiled arch matches the running arch, then option 3 behaves like option 2 and there's no performance hit.
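To make the comparison concrete, here is a minimal sketch of the behavior described in the doc comment and in this reply. It is a simplified model, not MLIR or CUDA driver code: `Format`, `Outcome`, and `launchOutcome` are hypothetical names, compute capabilities are encoded as plain integers (e.g. 70 for `sm_70`), and any CC difference is treated as an arch mismatch, as the doc comment words it. It assumes C++14 or later.

```cpp
// Simplified model of the three compilation formats; all names are
// hypothetical and do not exist in MLIR.
enum class Format { Isa, Bin, Fatbin };
enum class Outcome { RunsNative, JitsPtx, Fails };

// `compiledCC` is the compute capability implied by `gpuChip`;
// `runningCC` is that of the GPU executing the program.
constexpr Outcome launchOutcome(Format format, int compiledCC, int runningCC) {
  switch (format) {
  case Format::Isa:
    // Option 1: only PTX is shipped, so the runtime always JITs it.
    return Outcome::JitsPtx;
  case Format::Bin:
    // Option 2: only a CUBIN is shipped; no JIT fallback on a mismatch.
    return compiledCC == runningCC ? Outcome::RunsNative : Outcome::Fails;
  case Format::Fatbin:
    // Option 3: use the CUBIN on a match, otherwise JIT the embedded PTX,
    // which is only possible towards a higher CC than `gpuChip`.
    if (compiledCC == runningCC)
      return Outcome::RunsNative;
    return runningCC > compiledCC ? Outcome::JitsPtx : Outcome::Fails;
  }
  return Outcome::Fails; // Unreachable for valid enum values.
}
```

Under this model, `launchOutcome(Format::Fatbin, 70, 70)` takes the same path as `bin`, while `launchOutcome(Format::Fatbin, 70, 80)` takes the same JIT path as `isa`.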
https://github.com/llvm/llvm-project/pull/66220