[llvm-dev] AMDGPU mimics JIT?

Tue Feb 25 09:17:24 PST 2020

Hi!

I'm looking into ways to port our application to machines with AMD
GPUs.

The application is currently set up for accelerated computing as
follows:

1) LLVM IR modules are built in memory at runtime.

2) For NVIDIA GPUs, each module is compiled with the NVPTX backend and
the resulting assembly (PTX) is extracted.

3) The PTX is loaded through the NVIDIA driver, which JIT-compiles it
for the GPU that is actually installed, and the kernel is then launched
on the GPU for execution.
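
To make step 3 concrete, here is a minimal C++ sketch of that flow
using the CUDA driver API. It is not our actual code. The function name
launch_ptx and the status kDriverUnavailable are hypothetical, and the
type aliases stand in for the declarations in cuda.h under the
assumption that CUresult is int-sized. The driver is loaded at runtime
with dlopen only so that the sketch builds without the CUDA toolkit; a
real program would normally include cuda.h and link against libcuda.

```cpp
#include <dlfcn.h>

// Stand-ins for types declared in cuda.h (assumption: CUresult is
// int-sized with 0 meaning success; the handles are opaque pointers).
using CUresult = int;
using CUdevice = int;
using CUcontext = void*;
using CUmodule = void*;
using CUfunction = void*;
using CUstream = void*;

// Hypothetical status used only by this sketch, not a driver error code.
constexpr int kDriverUnavailable = -1;

// Look up one driver entry point and cast it to the given function type.
template <typename Fn>
static Fn sym(void* lib, const char* name) {
    return reinterpret_cast<Fn>(dlsym(lib, name));
}

// Hands NUL-terminated PTX text to the driver, which JIT-compiles it for
// device 0, then launches the zero-argument kernel `kernel_name` with a
// single thread. Returns 0 on success, kDriverUnavailable if the driver
// cannot be loaded, or the failing CUresult otherwise.
// Cleanup (module unload, context release) is omitted for brevity.
int launch_ptx(const char* ptx, const char* kernel_name) {
    // The driver library stays loaded for the life of the process.
    void* lib = dlopen("libcuda.so.1", RTLD_NOW);
    if (!lib) return kDriverUnavailable;

    auto init = sym<CUresult (*)(unsigned)>(lib, "cuInit");
    auto device_get = sym<CUresult (*)(CUdevice*, int)>(lib, "cuDeviceGet");
    auto ctx_retain = sym<CUresult (*)(CUcontext*, CUdevice)>(
        lib, "cuDevicePrimaryCtxRetain");
    auto ctx_set = sym<CUresult (*)(CUcontext)>(lib, "cuCtxSetCurrent");
    auto module_load = sym<CUresult (*)(CUmodule*, const void*)>(
        lib, "cuModuleLoadData");
    auto get_function =
        sym<CUresult (*)(CUfunction*, CUmodule, const char*)>(
            lib, "cuModuleGetFunction");
    auto launch = sym<CUresult (*)(CUfunction, unsigned, unsigned, unsigned,
                                   unsigned, unsigned, unsigned, unsigned,
                                   CUstream, void**, void**)>(
        lib, "cuLaunchKernel");
    auto synchronize = sym<CUresult (*)()>(lib, "cuCtxSynchronize");
    if (!init || !device_get || !ctx_retain || !ctx_set || !module_load ||
        !get_function || !launch || !synchronize) {
        return kDriverUnavailable;
    }

    CUdevice dev = 0;
    CUcontext ctx = nullptr;
    CUmodule mod = nullptr;
    CUfunction fn = nullptr;
    CUresult rc;
    if ((rc = init(0)) != 0) return rc;
    if ((rc = device_get(&dev, 0)) != 0) return rc;
    if ((rc = ctx_retain(&ctx, dev)) != 0) return rc;
    if ((rc = ctx_set(ctx)) != 0) return rc;
    // The driver JIT-compiles the PTX for the installed GPU here.
    if ((rc = module_load(&mod, ptx)) != 0) return rc;
    if ((rc = get_function(&fn, mod, kernel_name)) != 0) return rc;
    // Grid of 1x1x1 blocks, 1x1x1 threads each, no shared memory,
    // default stream, no kernel parameters.
    rc = launch(fn, 1, 1, 1, 1, 1, 1, 0, nullptr, nullptr, nullptr);
    if (rc != 0) return rc;
    return synchronize();  // wait for the kernel to finish
}
```

In the application, the PTX string passed to launch_ptx would be the
output of the NVPTX backend from step 2, and the kernel name would be
the name of the entry function in the LLVM IR module.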

Now, AMD doesn't seem to use an intermediate representation comparable
to PTX. As far as I understand, the AMDGPU backend generates binary code
(AMDGCN) for the GPU kernels directly. This makes me wonder whether
there is any way to execute (launch) such a kernel after it has been
compiled by the AMDGPU backend.

This will certainly not work in the typical HIP way with the ROCm
tools, where a kernel is declared in source with the __global__
attribute, like

__global__ void kernel(const T* in, T* out) {}

Compiling a kernel with the AMDGPU backend returns at most a raw
pointer (and only if the backend supports JIT, which it probably
doesn't?); otherwise it produces a library. Is it possible to 'dlopen'
such a (kernel) library into the address space and launch it?

Does anyone have an idea how to launch such a kernel within the same
program context/execution?

Thanks,

Frank