[Openmp-commits] [PATCH] D74145: [OpenMP][Offloading] Added support for multiple streams so that multiple kernels can be executed concurrently

Jon Chesterfield via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Fri Feb 7 11:30:59 PST 2020


JonChesterfield added inline comments.


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:95
   std::vector<std::list<FuncOrGblEntryTy>> FuncGblEntries;
+  std::vector<std::unique_ptr<std::atomic_int>> NextStreamId;
 
----------------
tianshilei1992 wrote:
> JonChesterfield wrote:
> > tianshilei1992 wrote:
> > > jdoerfert wrote:
> > > > Make it `uint` please.
> > > Right, in case of integer overflow, my bad...
> > vector of pointers to atomic_int is interesting. What's the advantage over vector<atomic_int>?
> > 
> > It might be worth putting a few asserts in the code to the effect that resizing the vector after the initial construction will break access from other threads.
> `atomic_int` is not copyable. And the initialization of all these pointers are after the resize of vector, so we might not need to consider that.
Sure, but you're not copying the element anywhere.

Sadly I think we would need to provide the size of the vector up front. reserve() calls copy constructors (which I didn't expect) and they're deleted for atomic_int. I'm not sure the cuda api will permit that.

Which leads to the suggestion:
```
std::unique_ptr<std::atomic_int[]>> NextStreamId;
// ...
NextStreamId = std::make_unique<std::atomic_int[]>(NumberOfDevices);
```

This elides the NumberOfDevices heap allocations and the associated indirection on every access and makes it somewhat more obvious that we can't call various vector api functions.

It has the disadvantage that the integers will now definitely be in the same cache line, whereas previously there was a chance that the allocator would put them on different cache lines. 

Overall I'm fine with either structure.


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:259
+    for (std::unique_ptr<std::atomic_int> &Ptr : NextStreamId) {
+      Ptr = std::unique_ptr<std::atomic_int>(new std::atomic_int(0));
+    }
----------------
jdoerfert wrote:
> tianshilei1992 wrote:
> > JonChesterfield wrote:
> > > If we do need the pointer wrapper, this should be make_unique
> > `make_unique` only works since C++14.
> Do we have llvm::make_unique? But maybe not necessarily good to use it here anyway. @jon ok to stick with this for now?
llvm::make_unique was removed by D66259, as we're now assuming C++14. They're semantically identical in this context so it doesn't matter much.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74145/new/

https://reviews.llvm.org/D74145





More information about the Openmp-commits mailing list