[Openmp-commits] [PATCH] D139287: [WIP][OpenMP] Introduce basic JIT support to OpenMP target offloading
Shilei Tian via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Mon Dec 5 07:52:20 PST 2022
tianshilei1992 added a comment.
In D139287#3971062 <https://reviews.llvm.org/D139287#3971062>, @jhuber6 wrote:
> In D139287#3971024 <https://reviews.llvm.org/D139287#3971024>, @tianshilei1992 wrote:
>
>> In D139287#3970996 <https://reviews.llvm.org/D139287#3970996>, @jhuber6 wrote:
>>
>>> Why do we have the JIT in the nextgen plugins? I figured that JIT would be handled by `libomptarget` proper rather than the plugins. I guess this is needed for per-kernel specialization? My idea of the rough pseudocode would be like this, and we wouldn't need a complex class hierarchy. Also, I don't know if we can skip `ptxas` by giving CUDA the PTX directly. We will probably need to invoke `lld` on the command line, though, right?
>>>
>>>   for each image:
>>>     if image is bitcode
>>>       image = compile(image)
>>>     register(image)
>>
>> We could handle them in `libomptarget`, but that would require adding two more interface functions: `is_valid_bitcode_image` and `compile_bitcode_image`. It is doable. Handling them in the plugin as a separate module lets us simply reuse the two existing interfaces.
>
> Would we need to consult the plugin? We can just check the `magic` directly, if it's bitcode we just compile it for its triple. If this was wrong then when the plugin gets the compiled image it will error.
I prefer to error out at an earlier stage. If we have a bitcode image and both NVIDIA and AMD support JIT, then both the NVIDIA and the AMD plugin will report a valid binary and continue with compiling the image, initializing the plugin, etc., which could give us the wrong results.
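For illustration only, the magic check and the per-image loop discussed above could be sketched as follows. This is a minimal sketch, not the code in the patch: `Image`, `isBitcode`, `registerImages`, and the `Compile`/`Register` callbacks are hypothetical names, and only the bitcode magic values (raw `BC 0xC0DE` and the wrapper magic `0x0B17C0DE`) come from the LLVM bitcode format. Note that a magic check alone says nothing about the target triple, which is the ambiguity raised above.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical image representation: a raw byte buffer.
using Image = std::vector<std::uint8_t>;

// Returns true if the buffer starts with the raw LLVM bitcode magic
// ('B', 'C', 0xC0, 0xDE) or with the bitcode wrapper magic 0x0B17C0DE
// stored in little-endian byte order.
bool isBitcode(const Image &Img) {
  if (Img.size() < 4)
    return false;
  const bool Raw =
      Img[0] == 'B' && Img[1] == 'C' && Img[2] == 0xC0 && Img[3] == 0xDE;
  const bool Wrapped =
      Img[0] == 0xDE && Img[1] == 0xC0 && Img[2] == 0x17 && Img[3] == 0x0B;
  return Raw || Wrapped;
}

// Walks all images, compiles the ones that are bitcode, and registers every
// image. Compile and Register are hypothetical callbacks standing in for the
// JIT and for image registration. Returns the number of images compiled.
std::size_t
registerImages(std::vector<Image> Images,
               const std::function<Image(const Image &)> &Compile,
               const std::function<void(const Image &)> &Register) {
  std::size_t NumCompiled = 0;
  for (Image &Img : Images) {
    if (isBitcode(Img)) {
      Img = Compile(Img); // replace the bitcode with the compiled image
      ++NumCompiled;
    }
    Register(Img);
  }
  return NumCompiled;
}
```

In this sketch a non-bitcode image (for example an ELF object) is passed to `Register` unchanged, while a bitcode image goes through `Compile` first.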
>>> Also, I don't know if we can skip `ptxas` by giving CUDA the PTX directly. We will probably need to invoke `lld` on the command line, though, right?
>>>
>>>   for each image:
>>>     if image is bitcode
>>>       image = compile(image)
>>>     register(image)
>>
>> We can give CUDA the PTX directly: the CUDA JIT effectively just calls `ptxas`, rather than `ptxas -c`, which would require `nvlink` afterwards.
>
> That makes it easier for us, so the only command line tool we need to call is `lld` for AMDGPU.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D139287/new/
https://reviews.llvm.org/D139287