[Mlir-commits] [mlir] [mlir][gpu][nvptx] Remove null terminator when outputting PTX (PR #133019)

Fri Mar 28 18:15:50 PDT 2025

modiking wrote:

> > Looking around I couldn't find if there was a convention to what should be outputted by `moduleToObject`. I did see that the ROCDL target omits the null terminator (https://github.com/llvm/llvm-project/blob/main/mlir/lib/Target/LLVM/ROCDL/Target.cpp#L440).
> 
> They don't have it because there's no JIT path for ROCm.
> 
> > @fabianmcg do you remember if there was a specific reason the null terminator got added in [5093413](https://github.com/llvm/llvm-project/commit/5093413a5007b017a530edbeed42d32bfd18b126) and only for NVPTX?
> 
> Yes, that was added for the ptx-JIT execution path, see the driver function https://github.com/llvm/llvm-project/blob/main/mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp#L143-L144 Which according to the [CUDA driver docs](https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__MODULE.html#group__CUDA__MODULE_1g9e8047e9dbf725f0cd7cafd18bfd4d12) it requires a null-terminated string.

That's good to know, thanks! If this is specific to the JIT execution path, it may be better to add the null terminator there rather than to the PTX emitted under `gpu::CompilationTarget::Assembly`. WDYT?
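As a rough illustration of adding the terminator on the consumer side, here is a minimal sketch. It assumes the serialized PTX arrives as a byte range without a terminator; `toJitInput` is a hypothetical helper name, not an existing MLIR function.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not part of MLIR): copy the serialized PTX into a
// std::string right before JIT loading. The input is assumed to carry no
// trailing '\0'. std::string::c_str() always yields a null-terminated
// pointer, so the terminator exists only at the JIT call site.
std::string toJitInput(std::string_view serializedPtx) {
  return std::string(serializedPtx);
}
```

The JIT wrapper could then pass `toJitInput(ptx).c_str()` to a loader such as `cuModuleLoadData`, while the bytes returned for `gpu::CompilationTarget::Assembly` stay free of the terminator.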

> > The image may be a cubin or fatbin as output by nvcc, or a NULL-terminated PTX, either as output by nvcc or hand-written.
> 
> Also, FWIW `\0` is ASCII https://en.wikipedia.org/wiki/ASCII

That's true 😅. Nevertheless, the ptxas team does want to exclude `\0` as well, so a way to do that would still be great.

> If this is about `ptxas` then the fix may rather be at the point where we call `ptxas`. We have to write this down to a file somehow to pass it to `ptxas` and this is where we write an extra null character to the file.

On that path we're in `gpu::CompilationTarget::Binary` or `gpu::CompilationTarget::Fatbin` and there's no extra null character written to the file. `serializedISA` has no null character at the end, but one gets added under the `gpu::CompilationTarget::Assembly` path:

https://github.com/llvm/llvm-project/blob/1c9fe8c8af7e473e37e141d36837d196dd9a5256/mlir/lib/Target/LLVM/NVVM/Target.cpp#L724-L739

On the specific path that's causing issues, `gpu::CompilationTarget::Assembly` is requested and the contents of the buffer are then dumped to a file. That file contains a null character, which causes an error when it is passed to ptxas 12.8+.
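For anyone who has to dump an `Assembly` buffer to disk in the meantime, a workaround along these lines might help. This is only a sketch: `dumpPtxForPtxas` is a hypothetical helper, and it assumes the only unwanted bytes are `\0` characters at the end of the buffer.

```cpp
#include <fstream>
#include <string>
#include <string_view>

// Hypothetical helper (not part of MLIR): write PTX to `path`, dropping any
// trailing '\0' bytes first so that ptxas never sees them.
// Returns false if the file could not be opened or written.
bool dumpPtxForPtxas(const std::string &path, std::string_view ptx) {
  // Strip terminators the producer may have appended.
  while (!ptx.empty() && ptx.back() == '\0')
    ptx.remove_suffix(1);

  // Binary mode avoids any platform newline translation.
  std::ofstream out(path, std::ios::binary | std::ios::trunc);
  if (!out)
    return false;
  out.write(ptx.data(), static_cast<std::streamsize>(ptx.size()));
  return static_cast<bool>(out.flush());
}
```

Stripping at the point where the file is written keeps the workaround local to the ptxas path and does not change what the JIT path receives.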

https://github.com/llvm/llvm-project/pull/133019