[Openmp-commits] [PATCH] D115279: [OpenMP][AMDGPU] Switch host-device memory copy to asynchronous version

Carlo Bertolli via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Wed Dec 8 09:51:27 PST 2021


carlo.bertolli added a comment.

In D115279#3179916 <https://reviews.llvm.org/D115279#3179916>, @ye-luo wrote:

> I was unable to verify that your first lock behaves as intended.
>
>   #include <hsa/hsa.h>
>   #include <hsa/hsa_ext_amd.h>
>   #include <omp.h>
>   #include <stdio.h>
>   
>   #define N 100293
>   
>   int checkLocked(void *ptr) {
>     hsa_amd_pointer_info_t info;
>     hsa_status_t herr;
>   
>     herr = hsa_amd_pointer_info(ptr, &info, NULL, NULL, NULL);
>     if (herr != HSA_STATUS_SUCCESS) {
>       printf("  hsa_amd_pointer_info failed\n");
>       return 1;
>     }
>   
>     if (info.type != HSA_EXT_POINTER_TYPE_LOCKED) {
>       printf("  pointer is noooooooooooot locked\n");
>       return 1;
>     } else
>       printf("  pointer is locked\n");
>   
>     return 0;
>   }
>   
>   int main() {
>     int n = N;
>     int *a = new int[n];
>     for (int i = 0; i < n; i++)
>       a[i] = 0;
>   
>     int *a_locked = nullptr;
>     hsa_status_t herr =
>         hsa_amd_memory_lock(a, n * sizeof(int), nullptr, 0, (void **)&a_locked);
>     if (herr != HSA_STATUS_SUCCESS) {
>       printf("Locking failed\n");
>       return 1;
>     }
>   
>     checkLocked(a);
>   
>   #pragma omp target parallel for map(tofrom : a_locked[:n])
>     for (int i = 0; i < n; i++)
>       a_locked[i] = i;
>   
>     herr = hsa_amd_memory_unlock(a);
>     if (herr != HSA_STATUS_SUCCESS) {
>       printf("Unlocking failed\n");
>       return 1;
>     }
>   
>     int err = 0;
>     for (int i = 0; i < n; i++)
>       if (a[i] != i) {
>         err++;
>         printf("Err at %d, got %d expected %d\n", i, a[i], i);
>         if (err > 10)
>           break;
>       }
>   
>     delete[] a;
>   
>     return err;
>   }
>
> I got a failure at the first check with "hsa_amd_pointer_info failed". Could you take a look?

Thanks for the test. This works for me and this is what I get:

  pointer is locked

I believe that is what you expect?

Tracing shows that we are running on the GPU correctly:

  export LIBOMPTARGET_KERNEL_TRACE=2
  ./user_memory_locks

  pointer is locked
  DEVID: 0 SGN:2 ConstWGSize:256  args: 2 teamsXthrds:(   1X 256) reqd:(   1X   0) lds_usage:11304B sgpr_count:39 vgpr_count:22 sgpr_spill_count:0 vgpr_spill_count:0 tripcount:0 n:__omp_offloading_fd00_5882c9d_main_l43

In this run, I am using the latest trunk with ROCm 4.5 installed on the machine. The GPU is a gfx90a.
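
One thing that may be worth ruling out when the same test behaves differently on two machines: according to the comments in hsa_ext_amd.h, the size member of hsa_amd_pointer_info_t should be set to the size of the structure before calling hsa_amd_pointer_info, and checkLocked in the test above leaves it uninitialized. The following is a minimal sketch of the query with that field set, not a confirmed explanation of the failure. The helper name queryLocked, its return convention, and the SKETCH_HAVE_HSA macro are made up for illustration, and the __has_include guard is only there so that the sketch still compiles on a machine without the ROCm headers.

```cpp
#include <cstdio>

// Assumption: the HSA headers live under <hsa/...>, as in the test above.
#if defined(__has_include)
#if __has_include(<hsa/hsa_ext_amd.h>)
#include <hsa/hsa.h>
#include <hsa/hsa_ext_amd.h>
#define SKETCH_HAVE_HSA 1
#endif
#endif

#ifndef SKETCH_HAVE_HSA
#define SKETCH_HAVE_HSA 0
#endif

// Hypothetical helper (not part of the patch or of the original test).
// Returns 1 if ptr is reported as locked, 0 if it is reported as some other
// pointer type, and -1 if the query could not be made.
int queryLocked(void *ptr) {
#if SKETCH_HAVE_HSA
  hsa_amd_pointer_info_t info;
  // Set the structure size before the call instead of leaving it as
  // whatever happened to be on the stack.
  info.size = sizeof(info);

  hsa_status_t herr =
      hsa_amd_pointer_info(ptr, &info, nullptr, nullptr, nullptr);
  if (herr != HSA_STATUS_SUCCESS) {
    // Print the numeric status so different failure causes can be told apart.
    printf("  hsa_amd_pointer_info failed with status %d\n", (int)herr);
    return -1;
  }
  return info.type == HSA_EXT_POINTER_TYPE_LOCKED ? 1 : 0;
#else
  // No HSA headers available: report that the query could not be made.
  (void)ptr;
  return -1;
#endif
}
```

Calling queryLocked(a) right after hsa_amd_memory_lock, in place of checkLocked(a), and looking at the printed status on failure might help tell an invalid-argument error apart from other causes.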


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115279/new/

https://reviews.llvm.org/D115279


