[Openmp-dev] Declare variant + Nvidia Device Offload

Mon May 18 09:18:41 PDT 2020

Oh, I forgot about this one.

The math stuff works because all declare variant functions are static.

I think we need to replace the `.` with a symbol that the user cannot
use but that the ptxas assembler is not upset about. We should also move
`getOpenMPVariantManglingSeparatorStr` from `Decl.h` into
`llvm/lib/Frontends/OpenMP/OMPContext.h`; I forgot why I didn't.
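
To illustrate why the `.` is a problem, here is a rough sketch of a check
against the identifier grammar given in the PTX ISA documentation (a letter
followed by letters, digits, `_` or `$`, or one of `_`, `$`, `%` followed by
at least one such character). The helpers `is_ptx_identifier` and
`replace_separator` are hypothetical and not part of clang or LLVM, and `$`
is only one candidate separator; whether it also satisfies the "user cannot
use" requirement is not settled by this sketch.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `name` matches the PTX identifier
// grammar: [a-zA-Z][a-zA-Z0-9_$]*  or  [_$%][a-zA-Z0-9_$]+
bool is_ptx_identifier(const std::string &name) {
  if (name.empty())
    return false;
  auto is_follow = [](unsigned char c) {
    return std::isalnum(c) != 0 || c == '_' || c == '$';
  };
  unsigned char first = static_cast<unsigned char>(name[0]);
  bool letter_start = std::isalpha(first) != 0;
  bool symbol_start = first == '_' || first == '$' || first == '%';
  if (!letter_start && !symbol_start)
    return false;
  // A leading '_', '$' or '%' must be followed by at least one character.
  if (symbol_start && name.size() < 2)
    return false;
  for (std::size_t i = 1; i < name.size(); ++i) {
    if (!is_follow(static_cast<unsigned char>(name[i])))
      return false;
  }
  return true;
}

// Hypothetical helper: swaps every occurrence of one separator character
// for another in a mangled name.
std::string replace_separator(std::string name, char from, char to) {
  std::replace(name.begin(), name.end(), from, to);
  return name;
}
```

Under that grammar, the name from the ptxas error,
`_Z33atom_add.ompvariant.S2.s6.PnohostPdd`, is rejected because of the `.`
characters, while the same name with `$` in their place is accepted.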

You should also be able to use the clang builtin atomics, and even
`omp atomic` should eventually resolve to the same thing (I hope).
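
As a sketch of that workaround, the two functions below avoid both
`declare variant` and CUDA's `atomicAdd`. The names `atom_add_omp` and
`atom_add_builtin` are made up for this example, and I have not verified
what code either one produces when offloading to nvptx64, so treat this as
a starting point rather than a tested solution.

```cpp
// Variant 1: rely on OpenMP's own atomic construct. The same source is
// valid for host and device compilation, so no declare variant is needed.
// Without -fopenmp the pragma is ignored and this is a plain update.
void atom_add_omp(double *address, double val) {
#pragma omp atomic update
  *address += val;
}

// Variant 2: use the generic GCC/Clang __atomic builtins in a
// compare-and-swap loop. Relaxed ordering is an assumption made for this
// sketch; pick a stronger order if the surrounding code needs it.
void atom_add_builtin(double *address, double val) {
  double expected;
  double desired;
  __atomic_load(address, &expected, __ATOMIC_RELAXED);
  do {
    desired = expected + val;
    // On failure, `expected` is refreshed with the current value and the
    // sum is recomputed on the next iteration.
  } while (!__atomic_compare_exchange(address, &expected, &desired,
                                      /*weak=*/true, __ATOMIC_RELAXED,
                                      __ATOMIC_RELAXED));
}
```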

Let me know if that helps,

   Johannes

On 5/18/20 10:33 AM, Lukas Sommer via Openmp-dev wrote:
> Hi all,
>
> what's the current status of declare variant when compiling for Nvidia
> GPUs?
>
> In my code, I have declared a variant of a function, that uses CUDA's
> built-in atomicAdd (using the syntax from OpenMP TR8):
>
>> #pragma omp begin declare variant match(device={kind(nohost)})
>>
>> void atom_add(double* address, double val){
>>          atomicAdd(address, val);
>> }
>>
>> #pragma omp end declare variant
> When compiling with Clang from master, ptxas fails:
>
>> clang++ -fopenmp   -O3 -std=c++11 -fopenmp
>> -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target -march=sm_72 -v
>> [...]
>> ptxas kernel-openmp-nvptx64-nvidia-cuda.s, line 322; fatal   : Parsing
>> error near '.ompvariant': syntax error
>> ptxas fatal   : Ptx assembly aborted due to errors
>> [...]
>> clang-11: error: ptxas command failed with exit code 255 (use -v to
>> see invocation)
> The line mentioned in the ptxas error looks like this:
>
>>          // .globl       _Z33atom_add.ompvariant.S2.s6.PnohostPdd
>> .visible .func _Z33atom_add.ompvariant.S2.s6.PnohostPdd(
>>          .param .b64 _Z33atom_add.ompvariant.S2.s6.PnohostPdd_param_0,
>>          .param .b64 _Z33atom_add.ompvariant.S2.s6.PnohostPdd_param_1
>> )
>> {
> My guess was that ptxas stumbles across the ".ompvariant"-part of the
> mangled function name.
>
> Is declare variant currently supported when compiling for Nvidia GPUs?
> If not, is there a workaround (macro defined only for device
> compilation, access to the atomic CUDA functions, ...)?
>
> Thanks in advance,
>
> Best
>
> Lukas
>