[Openmp-commits] [PATCH] D115279: [OpenMP][AMDGPU] Switch host-device memory copy to asynchronous version
Carlo Bertolli via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Wed Dec 8 08:40:26 PST 2021
carlo.bertolli added a comment.
In D115279#3179687 <https://reviews.llvm.org/D115279#3179687>, @ye-luo wrote:
>> I have not tried with memory that has already been locked, but I will. In any case, with this patch, if locking fails, then we revert to malloc+lock+unlock+free. This is not ideal, and this case is added for other reasons, but it should be supporting the case.
>
> It will be better skipping lock/free if the memory is known to HSA already. I think IBM XL skips its pinned memory optimization when it sees the pointer pinned already for CUDA.
> I have code managing lock/unlock via HIP. Even if a lock call from the plugin succeeds, and then a plugin unlock call succeeds, the user unlock call fails.
> For this reason, check memory info is required.
> fallback to "malloc+lock+unlock+free" is the worst option.
The following test works for me:
#include<stdio.h>
#include<omp.h>
#include<hsa/hsa.h>
#include <hsa/hsa_ext_amd.h>
#define N 100293
int main() {
int n = N;
int *a = new int[n];
int *a_locked = nullptr;
hsa_status_t herr = hsa_amd_memory_lock(a, n*sizeof(int), nullptr, 0, (void **)&a_locked);
if (herr != HSA_STATUS_SUCCESS) {
printf("Locking failed\n");
return 1;
}
#pragma omp target parallel for map(tofrom:a_locked[:n])
for(int i = 0; i < n; i++)
a_locked[i] = i;
herr = hsa_amd_memory_unlock(a);
if (herr != HSA_STATUS_SUCCESS) {
printf("Unlocking failed\n");
return 1;
}
int err = 0;
for(int i = 0; i < n; i++)
if (a[i] != i) {
err++;
printf("Err at %d, got %d expected %d\n", i, a[i], i);
if (err >10) break;
}
delete[] a;
return err;
}
@ye-luo can you please share a minimal test that is failing for you? Thanks!
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115279/new/
https://reviews.llvm.org/D115279
More information about the Openmp-commits
mailing list