[Openmp-commits] [PATCH] D115279: [OpenMP][AMDGPU] Switch host-device memory copy to asynchronous version
Carlo Bertolli via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Wed Dec 8 09:51:27 PST 2021
carlo.bertolli added a comment.
In D115279#3179916 <https://reviews.llvm.org/D115279#3179916>, @ye-luo wrote:
> I failed to verify your first lock behaves as intended.
>
> #include <hsa/hsa.h>
> #include <hsa/hsa_ext_amd.h>
> #include <omp.h>
> #include <stdio.h>
>
> #define N 100293
>
> int checkLocked(void *ptr) {
>   hsa_amd_pointer_info_t info;
>   hsa_status_t herr;
>
>   herr = hsa_amd_pointer_info(ptr, &info, NULL, NULL, NULL);
>   if (herr != HSA_STATUS_SUCCESS) {
>     printf(" hsa_amd_pointer_info failed\n");
>     return 1;
>   }
>
>   if (info.type != HSA_EXT_POINTER_TYPE_LOCKED) {
>     printf(" pointer is noooooooooooot locked\n");
>     return 1;
>   } else
>     printf(" pointer is locked\n");
>
>   return 0;
> }
>
> int main() {
>   int n = N;
>   int *a = new int[n];
>   for (int i = 0; i < n; i++)
>     a[i] = 0;
>
>   int *a_locked = nullptr;
>   hsa_status_t herr =
>       hsa_amd_memory_lock(a, n * sizeof(int), nullptr, 0, (void **)&a_locked);
>   if (herr != HSA_STATUS_SUCCESS) {
>     printf("Locking failed\n");
>     return 1;
>   }
>
>   checkLocked(a);
>
>   #pragma omp target parallel for map(tofrom : a_locked[:n])
>   for (int i = 0; i < n; i++)
>     a_locked[i] = i;
>
>   herr = hsa_amd_memory_unlock(a);
>   if (herr != HSA_STATUS_SUCCESS) {
>     printf("Unlocking failed\n");
>     return 1;
>   }
>
>   int err = 0;
>   for (int i = 0; i < n; i++)
>     if (a[i] != i) {
>       err++;
>       printf("Err at %d, got %d expected %d\n", i, a[i], i);
>       if (err > 10)
>         break;
>     }
>
>   delete[] a;
>
>   return err;
> }
>
> I got failure at the first check with "hsa_amd_pointer_info failed". Could you take a look?
Thanks for the test. This works for me and this is what I get:
pointer is locked
I believe that is what you expect?
Kernel tracing shows that the target region runs on the GPU as expected:
export LIBOMPTARGET_KERNEL_TRACE=2
./user_memory_locks
pointer is locked
DEVID: 0 SGN:2 ConstWGSize:256 args: 2 teamsXthrds:( 1X 256) reqd:( 1X 0) lds_usage:11304B sgpr_count:39 vgpr_count:22 sgpr_spill_count:0 vgpr_spill_count:0 tripcount:0 n:__omp_offloading_fd00_5882c9d_main_l43
For this run, I used the latest trunk with ROCm 4.5 installed on the machine. The GPU is a gfx90a.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115279/new/
https://reviews.llvm.org/D115279