[Parallel_libs-commits] [PATCH] D25701: Initial check-in of Acxxel (StreamExecutor renamed)
Justin Lebar via Parallel_libs-commits
parallel_libs-commits at lists.llvm.org
Tue Oct 18 18:29:31 PDT 2016
jlebar added inline comments.
================
Comment at: acxxel/cuda_acxxel.cpp:29
+/// Index of active device for this thread.
+thread_local int ActiveDeviceIndex;
+
----------------
Does this have to be per-thread rather than per stream? Kind of a bummer that one thread can't operate on two streams for different devices (among other things).
At least maybe setActiveDevice should be called setActiveDeviceForThread.
================
Comment at: acxxel/cuda_acxxel.cpp:169
+ /// Map from device index to context.
+ std::map<int, CUcontext> TheContexts;
+};
----------------
It looks like this map always has the keys 0, 1, ..., n. So, could we make it a vector? That will make lookups substantially cheaper, although I don't know if we care.
================
Comment at: acxxel/cuda_acxxel.cpp:193
+
+ ActiveDeviceIndex = 0;
+ return CUDAPlatform(Contexts);
----------------
How does this work, exactly? We set this thread's active device index to 0, but new threads have it as uninitialized memory? That seems not so good.
================
Comment at: acxxel/cuda_acxxel.cpp:283
+ return getCUError(Result, "cuEventElapsedTime");
+ return Milliseconds;
+}
----------------
Forgot to convert to seconds.
================
Comment at: acxxel/cuda_acxxel.cpp:476
+ std::unique_ptr<StreamCallbackUserData> UserData(
+ new StreamCallbackUserData(Stream, Callback));
+ return getCUError(cuStreamAddCallback(Stream, cuStreamCallbackShim,
----------------
std::move(Callback)
================
Comment at: acxxel/cuda_acxxel.cpp:497
+ ErrorLogBuffer,
+ const_cast<int *>(&LogBufferSizeBytes)};
+ constexpr size_t OptionNameCount =
----------------
Could we use std::array so we don't have to do these ARRAYSIZE shenanigans?
https://reviews.llvm.org/D25701
More information about the Parallel_libs-commits
mailing list