[Openmp-commits] [PATCH] D84381: [OpenMP] Wait for kernel prior to memory deallocation

Ye Luo via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Wed Jul 22 20:12:33 PDT 2020


ye-luo added a comment.

Indeed, target_data_begin should be split as well. cudaMalloc blocks the whole device, so alternating cudaMalloc calls and transfers only slows the whole process down further. It is better to do all the allocations first and then start queuing the transfers.
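
To illustrate the ordering being suggested, here is a minimal sketch. It is not the libomptarget code: Mapping, device_alloc, enqueue_copy, map_interleaved and map_split are hypothetical names, and the device operations are stand-ins that only record the order in which they would be issued. In a real plugin the allocation step would be something like cudaMalloc and the copy step an asynchronous transfer on a stream.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one mapped buffer.
struct Mapping {
  std::string name;
  std::size_t bytes;
};

using Log = std::vector<std::string>;

// Stand-in for a device allocation (assumed to block the whole device).
void device_alloc(Log &log, const Mapping &m) {
  log.push_back("alloc " + m.name);
}

// Stand-in for queuing an asynchronous host-to-device transfer.
void enqueue_copy(Log &log, const Mapping &m) {
  log.push_back("copy " + m.name);
}

// Interleaved order: every allocation lands between transfers, so each
// blocking allocation can stall transfers that are already queued.
Log map_interleaved(const std::vector<Mapping> &maps) {
  Log log;
  for (const auto &m : maps) {
    device_alloc(log, m);
    enqueue_copy(log, m);
  }
  return log;
}

// Split order: do all allocations first, then queue all transfers, so no
// blocking allocation is issued once transfers are in flight.
Log map_split(const std::vector<Mapping> &maps) {
  Log log;
  for (const auto &m : maps)
    device_alloc(log, m);
  for (const auto &m : maps)
    enqueue_copy(log, m);
  // Sanity check: one allocation and one copy were recorded per mapping.
  assert(log.size() == 2 * maps.size());
  return log;
}
```

With two mappings a and b, map_interleaved issues alloc a, copy a, alloc b, copy b, while map_split issues alloc a, alloc b, copy a, copy b. The second order is the one proposed above.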


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84381/new/

https://reviews.llvm.org/D84381




