[Openmp-commits] [PATCH] D84381: [OpenMP] Wait for kernel prior to memory deallocation
Ye Luo via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Wed Jul 22 20:12:33 PDT 2020
ye-luo added a comment.
Indeed, target_data_begin should be split as well. cudaMalloc blocks the whole device, so alternating cudaMalloc calls with transfers only slows the whole process down further. It is better to do all the allocations first and then start queuing the transfers.
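To make the ordering argument concrete, here is a minimal host-only sketch in plain C++. It does not call CUDA or libomptarget; it only models the order in which operations would be issued. All names (Mapping, Op, schedule_interleaved, schedule_split, transfers_followed_by_alloc) are hypothetical and exist only for illustration. The sketch assumes, as stated above, that an allocation blocks the device, so any transfer queued before a later allocation could be stalled by it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one mapped buffer (not a libomptarget type).
struct Mapping {
    std::string name;
    std::size_t bytes;
    bool copy_to_device;  // true if the buffer needs a host-to-device copy
};

enum class OpKind { Alloc, Transfer };

// One step in the modeled issue order.
struct Op {
    OpKind kind;
    std::string name;
};

// Interleaved order: alloc A, transfer A, alloc B, transfer B, ...
std::vector<Op> schedule_interleaved(const std::vector<Mapping>& maps) {
    std::vector<Op> ops;
    for (const auto& m : maps) {
        ops.push_back({OpKind::Alloc, m.name});
        if (m.copy_to_device) ops.push_back({OpKind::Transfer, m.name});
    }
    return ops;
}

// Split order: every allocation first, then every transfer.
std::vector<Op> schedule_split(const std::vector<Mapping>& maps) {
    std::vector<Op> ops;
    for (const auto& m : maps) ops.push_back({OpKind::Alloc, m.name});
    for (const auto& m : maps)
        if (m.copy_to_device) ops.push_back({OpKind::Transfer, m.name});
    return ops;
}

// Counts transfers that have at least one allocation issued after them,
// i.e. transfers that a later blocking allocation could stall.
std::size_t transfers_followed_by_alloc(const std::vector<Op>& ops) {
    std::size_t count = 0;
    bool alloc_seen_later = false;
    for (auto it = ops.rbegin(); it != ops.rend(); ++it) {
        if (it->kind == OpKind::Alloc) {
            alloc_seen_later = true;
        } else if (alloc_seen_later) {
            ++count;
        }
    }
    return count;
}
```

With three mappings where the first two need a copy, the interleaved schedule leaves both transfers ahead of a later allocation, while the split schedule leaves none. This only models issue order; it says nothing about actual timings.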
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D84381/new/
https://reviews.llvm.org/D84381