[Openmp-dev] Declare variant + Nvidia Device Offload

Mon May 18 08:33:02 PDT 2020

Hi all,

what's the current status of declare variant when compiling for Nvidia
GPUs?

In my code, I have declared a variant of a function that uses CUDA's
built-in atomicAdd (using the syntax from OpenMP TR8):

> #pragma omp begin declare variant match(device={kind(nohost)})
>
> void atom_add(double* address, double val){
>         atomicAdd(address, val);
> }
>
> #pragma omp end declare variant

When compiling with Clang from master, ptxas fails:

> clang++ -fopenmp   -O3 -std=c++11 -fopenmp
> -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target -march=sm_72 -v
> [...]
> ptxas kernel-openmp-nvptx64-nvidia-cuda.s, line 322; fatal   : Parsing
> error near '.ompvariant': syntax error
> ptxas fatal   : Ptx assembly aborted due to errors
> [...]
> clang-11: error: ptxas command failed with exit code 255 (use -v to
> see invocation)

The line mentioned in the ptxas error looks like this:

>         // .globl       _Z33atom_add.ompvariant.S2.s6.PnohostPdd
> .visible .func _Z33atom_add.ompvariant.S2.s6.PnohostPdd(
>         .param .b64 _Z33atom_add.ompvariant.S2.s6.PnohostPdd_param_0,
>         .param .b64 _Z33atom_add.ompvariant.S2.s6.PnohostPdd_param_1
> )
> {

My guess is that ptxas trips over the ".ompvariant" part of the
mangled function name, since '.' is not a valid character in a PTX
identifier.

Is declare variant currently supported when compiling for Nvidia GPUs?
If not, is there a workaround (macro defined only for device
compilation, access to the atomic CUDA functions, ...)?
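
For reference, one portable fallback that avoids both declare variant
and the CUDA built-in would be a plain OpenMP atomic update, roughly as
sketched below. This is only a minimal sketch under assumptions: the
driver function sum_on_device is hypothetical and not part of my code,
and I have not verified whether Clang lowers the atomic update on a
double to a native atomic add on sm_72.

```cpp
// Sketch only: portable alternative to calling CUDA's atomicAdd directly.
// The function is marked declare target so it can be called from
// target regions; on the host it is an ordinary function.
#pragma omp declare target
void atom_add(double* address, double val) {
    // Standard OpenMP atomic update; no CUDA-specific built-in required.
    #pragma omp atomic update
    *address += val;
}
#pragma omp end declare target

// Hypothetical driver (not from the original code): accumulates all
// elements of data into a single value via atom_add.
double sum_on_device(const double* data, int n) {
    double total = 0.0;
    #pragma omp target teams distribute parallel for \
        map(to: data[0:n]) map(tofrom: total)
    for (int i = 0; i < n; ++i) {
        atom_add(&total, data[i]);
    }
    return total;
}
```

Without any offload target (or without -fopenmp at all) the loop simply
runs on the host, so the same source builds either way. It does not
answer the question about declare variant itself, though.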

Thanks in advance,

Best

Lukas

-- 
Lukas Sommer, M.Sc.
TU Darmstadt
Embedded Systems and Applications Group (ESA)
Hochschulstr. 10, 64289 Darmstadt, Germany
Phone: +49 6151 1622429
www.esa.informatik.tu-darmstadt.de