[PATCH] D70010: [OpenMP][Offloading] Replaced default stream with an actual per-device unblocking stream in NVPTX implementation
Shilei Tian via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 8 10:08:14 PST 2019
tianshilei1992 added a comment.
In D70010#1738930 <https://reviews.llvm.org/D70010#1738930>, @ABataev wrote:
> Also, the main question: how does it affect the existing execution model? What if we have a target region in a parallel region, will they be executed asynchronously? We need some tests for this if we don't have such tests.
According to https://developer.download.nvidia.com/CUDA/training/StreamsAndConcurrencyWebinar.pdf, using a non-default stream can improve performance. This patch is the first step toward using multiple streams, which I plan to implement later.
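For reference, the scenario raised above (a target region inside a parallel region) could be exercised with something like the sketch below. This is not a test from the patch: the helper name `run_parallel_target` and the sizes are made up for illustration, and it only checks that results are correct when several host threads offload at once. It does not measure whether the regions actually overlap, which depends on the runtime and the device. Without an offloading device the target regions should simply fall back to the host.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of the patch): each host thread offloads
// its own slice of a shared buffer through a separate target region.
// Returns the number of elements that ended up with the wrong value.
int run_parallel_target(int num_threads, int chunk) {
  assert(num_threads > 0 && chunk > 0);
  const int total = num_threads * chunk;
  std::vector<int> data(total, 0);
  int *p = data.data();

  // One loop iteration per slice; iterations are spread over host threads.
#pragma omp parallel for num_threads(num_threads)
  for (int t = 0; t < num_threads; ++t) {
    int *slice = p + t * chunk;
    // One target region per slice, issued from inside the parallel region.
#pragma omp target teams distribute parallel for map(tofrom : slice[0:chunk])
    for (int i = 0; i < chunk; ++i)
      slice[i] = t * chunk + i;
  }

  // Verify on the host that every element holds its global index.
  int errors = 0;
  for (int i = 0; i < total; ++i)
    if (data[i] != i)
      ++errors;
  return errors;
}
```

Calling `run_parallel_target(4, 1024)` and checking that it returns 0 would at least confirm that concurrent offloading from multiple host threads still produces correct results.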
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D70010/new/
https://reviews.llvm.org/D70010