[Openmp-commits] [PATCH] D115279: [OpenMP][AMDGPU] Switch host-device memory copy to asynchronous version
Ye Luo via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Wed Dec 8 08:09:33 PST 2021
ye-luo added a comment.
> I have not tried with memory that has already been locked, but I will. In any case, with this patch, if locking fails, then we revert to malloc+lock+unlock+free. This is not ideal, and this case is added for other reasons, but it should be supporting the case.
It would be better to skip the lock/unlock entirely if the memory is already known to HSA. I think IBM XL skips its pinned-memory optimization when it sees that the pointer is already pinned for CUDA.
I have code that manages lock/unlock via HIP. Even if the plugin's lock call succeeds and its subsequent unlock call also succeeds, the user's later unlock call fails.
For this reason, checking the memory info before locking is required.
Falling back to "malloc+lock+unlock+free" is the worst option.
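To illustrate the failure mode, here is a minimal, self-contained sketch. It does not call HSA or HIP. `MockRuntime`, `copyBlind`, `copyChecked` and `userUnlockSucceeds` are hypothetical names, and the mock assumes pinning is not reference counted, which is what the behavior I observed suggests. In the real plugin the `isPinned` query would presumably be backed by a pointer-info query such as `hsa_amd_pointer_info`; that mapping is an assumption, not something this patch does.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the runtime's pinning table; NOT the HSA API.
// Assumption: a pointer is either pinned or not (no reference counting).
struct MockRuntime {
  std::set<const void *> Pinned;

  // Locking succeeds even when the pointer is already pinned.
  bool lock(const void *Ptr) {
    Pinned.insert(Ptr);
    return true;
  }
  // Unlocking fails when the pointer is not currently pinned.
  bool unlock(const void *Ptr) { return Pinned.erase(Ptr) == 1; }
  bool isPinned(const void *Ptr) const { return Pinned.count(Ptr) != 0; }
};

// Blind strategy: always lock before the copy and unlock afterwards.
void copyBlind(MockRuntime &RT, const void *HostPtr) {
  RT.lock(HostPtr);
  // ... the asynchronous copy would be issued and awaited here ...
  RT.unlock(HostPtr); // also drops a pin that the user created earlier
}

// Check-first strategy: leave memory alone if someone else pinned it.
void copyChecked(MockRuntime &RT, const void *HostPtr) {
  const bool AlreadyPinned = RT.isPinned(HostPtr);
  if (!AlreadyPinned)
    RT.lock(HostPtr);
  // ... the asynchronous copy would be issued and awaited here ...
  if (!AlreadyPinned)
    RT.unlock(HostPtr);
}

// The user pins a buffer, the plugin copies from it, then the user unpins.
// Returns whether the user's final unlock call succeeds.
bool userUnlockSucceeds(bool CheckFirst) {
  MockRuntime RT;
  int Buffer[4] = {0, 1, 2, 3};
  RT.lock(Buffer); // user-side pin, e.g. done through HIP
  if (CheckFirst)
    copyChecked(RT, Buffer);
  else
    copyBlind(RT, Buffer);
  return RT.unlock(Buffer);
}
```

Under these assumptions, `userUnlockSucceeds(false)` reports a failed user unlock, while `userUnlockSucceeds(true)` leaves the user's pin intact. The check-first path also avoids both the redundant lock and the "malloc+lock+unlock+free" fallback for memory the user has already pinned.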
rG LLVM Github Monorepo