[PATCH] D70010: [OpenMP][Offloading] Replaced default stream with an actual per-device unblocking stream in NVPTX implementation

Alexey Bataev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 8 10:17:20 PST 2019


ABataev added a comment.

In D70010#1739049 <https://reviews.llvm.org/D70010#1739049>, @tianshilei1992 wrote:

> In D70010#1738930 <https://reviews.llvm.org/D70010#1738930>, @ABataev wrote:
>
> > Also, the main question: how does it affect the existing execution model? What if we have a target region in a parallel region, will they be executed asynchronously? We need some tests for this if we don't have such tests.
>
>
> According to https://developer.download.nvidia.com/CUDA/training/StreamsAndConcurrencyWebinar.pdf, non-default stream can improve performance. This is actually the first step to use multiple streams I'm gonna implement later.


My question is different. Does it affect the execution of existing code in any way?
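
For illustration only, the kind of test asked about above (a target region inside a parallel region) might look roughly like the sketch below. It is not taken from D70010 or from the existing test suite; the helper name `count_mismatches` and the size `N` are hypothetical. Each host thread launches its own target region, and the check only verifies that every region produced the expected value, whichever stream the plugin uses and whether or not the regions overlap in time.

```c
#include <stdio.h>

#define N 64 /* hypothetical number of target regions to launch */

/* Hypothetical helper, not part of D70010: runs one target region per
   iteration of a host parallel loop and returns how many results are wrong. */
int count_mismatches(void) {
  int out[N];
  for (int i = 0; i < N; ++i)
    out[i] = -1;

  /* Host threads share the loop; each iteration offloads independently. */
#pragma omp parallel for
  for (int i = 0; i < N; ++i) {
    int v = -1; /* private to the host thread running this iteration */
#pragma omp target map(tofrom: v) firstprivate(i)
    { v = 2 * i + 1; }
    out[i] = v;
  }

  /* Count every iteration whose target region did not write its value. */
  int bad = 0;
  for (int i = 0; i < N; ++i)
    if (out[i] != 2 * i + 1)
      ++bad;
  return bad;
}
```

A result of 0 from `count_mismatches()` would suggest that the results of existing code are unchanged; it says nothing about whether the regions actually ran concurrently on the device. Without offloading support the target regions fall back to the host, so the check still passes there.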


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70010/new/

https://reviews.llvm.org/D70010




