[Openmp-commits] [PATCH] D139287: [WIP][OpenMP] Introduce basic JIT support to OpenMP target offloading

Mon Dec 5 07:52:20 PST 2022

tianshilei1992 added a comment.

In D139287#3971062 <https://reviews.llvm.org/D139287#3971062>, @jhuber6 wrote:

> In D139287#3971024 <https://reviews.llvm.org/D139287#3971024>, @tianshilei1992 wrote:
>
>> In D139287#3970996 <https://reviews.llvm.org/D139287#3970996>, @jhuber6 wrote:
>>
>>> Why do we have the JIT in the nextgen plugins? I figured that JIT would be handled by `libomptarget` proper rather than the plugins. I guess this is needed for per-kernel specialization? My idea of the rough pseudocode would be like this, and we wouldn't need a complex class hierarchy. Also I don't know if we can skip `ptxas` by giving CUDA the PTX directly; we will probably need to invoke `lld` on the command line, though, right?
>>>
>>>   for each image:
>>>     if image is bitcode
>>>       image = compile(image)
>>>   register(image)
>>
>> We could handle them in `libomptarget`, but that would require adding two more interface functions: `is_valid_bitcode_image` and `compile_bitcode_image`. It is doable. Handling them in the plugin as a separate module lets us reuse the two existing interfaces.
>
> Would we need to consult the plugin? We can just check the `magic` directly, if it's bitcode we just compile it for its triple. If this was wrong then when the plugin gets the compiled image it will error.

I prefer to error out at an earlier stage. If we have a bitcode image and both the NVIDIA and AMD plugins support JIT, then both will report it as a valid binary and go on to compile the image, initialize the plugin, etc., which could give us wrong results.
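
To make the early-rejection point concrete, here is a minimal sketch, not the code in this patch. The helper names `isBitcode` and `pluginAcceptsImage` and the way the triple is passed in as a string are assumptions for illustration only; a real check would read the triple from the bitcode module. The only fact relied on is that raw LLVM bitcode begins with the magic bytes `'B' 'C' 0xC0 0xDE`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Raw LLVM bitcode starts with the bytes 'B', 'C', 0xC0, 0xDE.
// (Wrapper-format bitcode uses a different magic and is not handled here.)
static bool isBitcode(const uint8_t *Data, size_t Size) {
  assert((Data || Size == 0) && "null image with non-zero size");
  return Size >= 4 && Data[0] == 'B' && Data[1] == 'C' && Data[2] == 0xC0 &&
         Data[3] == 0xDE;
}

// Hypothetical per-plugin validity check: the magic alone is not enough,
// because every JIT-capable plugin would accept any bitcode image. Also
// requiring the image triple to start with the plugin's architecture prefix
// lets the wrong plugin reject the image before any compilation happens.
static bool pluginAcceptsImage(const uint8_t *Data, size_t Size,
                               const std::string &ImageTriple,
                               const std::string &PluginTriplePrefix) {
  if (!isBitcode(Data, Size))
    return false;
  return ImageTriple.compare(0, PluginTriplePrefix.size(),
                             PluginTriplePrefix) == 0;
}
```

With a check like this, an `amdgcn-amd-amdhsa` bitcode image would be rejected by a plugin whose prefix is `nvptx64` instead of being compiled and failing later.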

>>> Also I don't know if we can skip `ptxas` by giving CUDA the PTX directly; we will probably need to invoke `lld` on the command line, though, right?
>>>
>>>   for each image:
>>>     if image is bitcode
>>>       image = compile(image)
>>>   register(image)
>>
>> We can give CUDA the PTX directly, since the CUDA JIT just calls `ptxas` rather than `ptxas -c`; the latter requires `nvlink` afterwards.
>
> That makes it easier for us, so the only command line tool we need to call is `lld` for AMDGPU.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139287/new/

https://reviews.llvm.org/D139287