[Openmp-dev] GPU kernels are called with the wrong number of arguments, deliberate?

Johannes Doerfert via Openmp-dev openmp-dev at lists.llvm.org
Mon Aug 10 15:37:33 PDT 2020


It certainly looks weird.

Here is the simplified version in which I cannot tell how we would 
determine in `libomptarget` how many arguments to pass on, the calls 
look all the same to me:

https://clang.godbolt.org/z/TdroxE



On 8/10/20 10:22 AM, Jon Chesterfield via Openmp-dev wrote:
> The code appended compiles to a single kernel. It uses the data mapping
> logic. The kernel that expects one argument gets invoked by cuda's rtl with
> two arguments. That looks wrong to me.
>
> Is this a deliberate feature of target mapper, and if not, can anyone point
> me to roughly where to start looking?
>
> Thanks,
>
> Jon
>
> Trailing test ompiled with:
> ~/llvm/bin/clang++  -O2  -target x86_64-pc-linux-gnu -fopenmp
> -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvidia-cuda
> -march=sm_50  test.cpp -o test -L/usr/local/cuda/targets/x86_64-linux/lib
> -lcudart -save-temps
>
> This errors because __dummy.omp_offloading.entry isn't marked weak, but
> ignoring that error for now because we still get the temp files out,
>
> The device kernel has one argument:
> llvm-dis test-openmp-nvptx64-nvidia-cuda.bc -o - | grep __omp_offloading
>
> define weak void @__omp_offloading_802_9c0dba_main_l23([10 x i32*]* nonnull
> align 8 dereferenceable(80) %0) % <- one argument
>
> It is passed to tgt_target_mapper,
> llvm-dis test-host-x86_64-pc-linux-gnu.bc -o - | grep __omp_offloading
> call i32 @__tgt_target_mapper(i64 -1, i8*
> @.__omp_offloading_802_9c0dba_main_l23.region_id, i32 2, <- arg num here
> ...)
>
>
> // Smallest example from aomp's test suite that hits this case
> #include <stdio.h>
> #include <omp.h>
> #include <stdint.h>
>
> #define N 640
> #define C 64
> #define P 10
>
> int A[N];
> int *p[P];
>
> int main()
> {
>    int i;
>    for(i=0; i<N; i++) A[i] = i;
>    for(i=0; i<P; i++) p[i] = &A[i*C];
>
>    #pragma omp target enter data map(to: A) map(alloc: p)
>    for(i=0; i<P; i++) {
>      #pragma omp target enter data map(alloc: p[i][0:C])
>    }
>
>    #pragma omp target map(alloc: A, p)
>    {
>      int i, j;
>      for(i=0; i<P; i++) {
>        for(j=0; j<C; j++) {
>          p[i][j]++;
>        }
>      }
>    }
>
>    #pragma omp target update from( A)
>
>    int error = 0;
>    for(i=0; i<N; i++) {
>      if (A[i] != i+1) printf("%4d: got %d, expected %d, error %d\n", i,
> A[i], i+1, ++error);
>    }
>    printf("completed TEST ARRAY with %d errors\n", error);
>
>    return (error == 0)?0:1;
> }
>
>
> _______________________________________________
> Openmp-dev mailing list
> Openmp-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev


More information about the Openmp-dev mailing list