[PATCH] D70010: [OpenMP][Offloading] Replaced default stream with an actual per-device unblocking stream in NVPTX implementation

Shilei Tian via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 8 10:08:14 PST 2019


tianshilei1992 added a comment.

In D70010#1738930 <https://reviews.llvm.org/D70010#1738930>, @ABataev wrote:

> Also, the main question: how does it affect the existing execution model? What if we have a target region in a parallel region, will they be executed asynchronously? We need some tests for this if we don't have such tests.


According to https://developer.download.nvidia.com/CUDA/training/StreamsAndConcurrencyWebinar.pdf, using a non-default stream can improve performance. This patch is the first step toward using multiple streams, which I plan to implement later.
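
For the target-in-parallel case raised above, a test could look roughly like the sketch below. This is only an illustration and not part of D70010: the function name `run_target_in_parallel` and its parameters are hypothetical. Each host thread launches its own target region on a disjoint slice of one buffer, and the function counts how many elements hold the expected value. It checks correctness only; it does not show whether the kernels actually overlap on the device.

```cpp
#include <vector>

// Hypothetical test helper (an assumption for illustration, not from the patch).
// Every host thread in the parallel region launches one target region.
// Returns the number of elements that ended up with the expected value.
int run_target_in_parallel(int threads, int n) {
  std::vector<int> out(threads * n, 0);
  int *data = out.data();

#pragma omp parallel for num_threads(threads)
  for (int t = 0; t < threads; ++t) {
    // Each iteration works on its own slice, so concurrent launches
    // never touch the same host or device memory.
    int *slice = data + t * n;
#pragma omp target teams distribute parallel for map(tofrom: slice[0:n])
    for (int i = 0; i < n; ++i)
      slice[i] = t * n + i;
  }

  // Verify on the host after all target regions have completed.
  int correct = 0;
  for (int k = 0; k < threads * n; ++k)
    if (out[k] == k)
      ++correct;
  return correct;
}
```

With 4 threads and 64 elements per thread, the call `run_target_in_parallel(4, 64)` should return 256 if every target region wrote its slice correctly. Without an offloading toolchain the target regions fall back to the host, so the check still passes but exercises nothing on the device.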


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70010/new/

https://reviews.llvm.org/D70010
