[PATCH] D140226: [NVPTX] Introduce attribute to mark kernels without a language mode

Sun Dec 18 08:46:33 PST 2022

jhuber6 added a comment.

In D140226#4003781 <https://reviews.llvm.org/D140226#4003781>, @keryell wrote:

> I wonder whether we could not factorize some code/attribute/logic with AMDGPU or SYCL.
> Is the use case to have, for example, CUDA+HIP+SYCL in the same TU, so that different attributes are needed?

It would probably be good to factor out the high-level concept of a "kernel", since it is common to all the offloading languages. The implementation it gets lowered to would still need to be distinct per target, since a kernel usually gets turned into some magic bits stashed in the executable for the runtime to read. The use case for this patch is simply to let people compile pure C/C++ code for the NVPTX architecture while still being able to attach the necessary metadata to kernels and globals.
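To make the use case concrete, here is a minimal sketch of a kernel written in plain C++ with no CUDA language mode. It assumes the attribute is spelled `nvptx_kernel` and that the compiler defines `__NVPTX__` when targeting NVPTX; the `KERNEL` macro and the `scale` function are hypothetical names used only for illustration.

```cpp
// KERNEL is a hypothetical helper macro. It expands to the (assumed)
// `nvptx_kernel` attribute only when compiling for NVPTX, so the same
// file still builds as ordinary host C++.
#if defined(__NVPTX__)
#define KERNEL __attribute__((nvptx_kernel))
#else
#define KERNEL
#endif

// An ordinary C++ function marked as a kernel entry point.
// extern "C" keeps the symbol name unmangled so a runtime can look it up.
extern "C" KERNEL void scale(float *data, float factor, int n) {
  // Deliberately simple: the calling thread walks the whole array.
  for (int i = 0; i < n; ++i)
    data[i] *= factor;
}
```

On the host the macro expands to nothing and `scale` is a regular function; when targeting NVPTX (for example with a triple such as `nvptx64-nvidia-cuda`) the attribute would be what tells the backend to emit kernel metadata for it.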

I've recently wondered whether we could just apply the logic used for shared objects to GPU images: in CUDA terms, globals without `hidden` visibility would be treated as `__global__`, and ones with `hidden` visibility as `__device__`. I think the only thing preventing us from treating a kernel call as a dynamic symbol load is probably the launch parameters. But this is purely theoretical; I don't think we need to worry about moving away from offloading languages or anything.
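As a rough illustration of that analogy (not something any toolchain does today, as far as this thread states), the standard GNU-style visibility attributes could play the role of the CUDA qualifiers. The function names `entry_point` and `helper` are hypothetical.

```cpp
// Hidden visibility: not exported from the image. Under the analogy this
// would correspond to a `__device__` function, callable only from inside.
extern "C" __attribute__((visibility("hidden"))) int helper(int x) {
  return x * 2;
}

// Default visibility: exported in the image's dynamic symbol table. Under
// the analogy this would correspond to a `__global__` kernel that the
// runtime could find by name, much like a dynamic symbol lookup.
extern "C" __attribute__((visibility("default"))) int entry_point(int x) {
  return helper(x) + 1;
}
```

The sketch leaves out exactly the part identified above as the obstacle: a symbol lookup has no place to carry launch parameters such as grid and block dimensions.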

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140226/new/

https://reviews.llvm.org/D140226