[Parallel_libs-commits] [PATCH] D24538: [SE] Add CUDA platform
Justin Lebar via Parallel_libs-commits
parallel_libs-commits at lists.llvm.org
Tue Sep 13 21:18:39 PDT 2016
jlebar added inline comments.
================
Comment at: streamexecutor/CMakeLists.txt:6
@@ -5,1 +5,3 @@
option(STREAM_EXECUTOR_ENABLE_CONFIG_TOOL "enable building streamexecutor-config tool" ON)
+option(STREAM_EXECUTOR_ENABLE_CUDA_PLATFORM "enable building the CUDA StreamExecutor platform" OFF)
+
----------------
Not necessarily in this patch, but we should document how to configure this with CUDA enabled and (see below) how to point cmake at different CUDA installs.
================
Comment at: streamexecutor/include/streamexecutor/PlatformOptions.h.in:4
@@ +3,3 @@
+
+#cmakedefine STREAM_EXECUTOR_ENABLE_CUDA_PLATFORM
+
----------------
Maybe we should have a comment in this file explaining what it's for.
================
Comment at: streamexecutor/include/streamexecutor/platforms/cuda/CUDAPlatformDevice.h:37
@@ +36,3 @@
+
+ std::string getName() const override { return "CUDA"; }
+
----------------
Do we want the device number in the name?
================
Comment at: streamexecutor/lib/CMakeLists.txt:41
@@ -20,1 +40,3 @@
+ LINK_LIBS
+ ${STREAM_EXECUTOR_LIBCUDA_LIBRARIES})
----------------
Note to self, need to patch this in and try it out with the in-tree build.
================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatform.cpp:44
@@ +43,3 @@
+Expected<Device> CUDAPlatform::getDevice(size_t DeviceIndex) {
+ static CUresult InitResult = []() { return cuInit(0); }();
+
----------------
Do you want to wrap the cuInit(0) call in a helper function so we only ever call it once?
static CUResult ensureCudaInitialized() {
static InitResult = ...;
return InitResult;
}
I guess it doesn't make a big difference either way.
================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatform.cpp:59
@@ +58,3 @@
+ }
+ return Device(&Iterator->second);
+}
----------------
Hm. Do we care if getDevice() is slow? We could probably do this without a lock without too much trouble (essentially using the same mechanism used to ensure that function-static variables are initialized only once, except we'd be using it for member variables). Maybe for another patch, if we care about performance. If we don't care, even better. :)
================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatformDevice.cpp:37
@@ +36,3 @@
+ if (CUresult Result = cuCtxSetCurrent(ContextHandle))
+ return CUresultToError(Result);
+
----------------
Hm, I wonder if we want to be more descriptive in our error handling and give some sort of "backtrace", indicating where we failed. e.g. something as simple as
return CUresultToError(result, "cuCtxSetCurrent");
Maybe not for this patch, though.
================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatformDevice.cpp:56
@@ +55,3 @@
+ CUresult Result = cuDevicePrimaryCtxRelease(DeviceIndex);
+ // TODO(jhen): Log error.
+}
----------------
Is this an "unused variable" warning? If so maybe do
(void) Result;
================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatformDevice.cpp:80
@@ +79,3 @@
+ if (!Code)
+ return make_error("no suitable CUDA source found for compute capability " +
+ llvm::Twine(ComputeCapabilityMajor) + "." +
----------------
Nit, "CUDA source" is probably not the right phrase, since this is all compiled CUDA code (PTX and SASS).
================
Comment at: streamexecutor/lib/platforms/cuda/cmake/modules/FindLibcuda.cmake:7
@@ +6,3 @@
+
+find_path(LIBCUDA_INCLUDE_DIR cuda.h /usr/local/cuda/include)
+find_library(LIBCUDA_LIBRARY cuda)
----------------
Can we override this on the command line somehow? If so, is that obvious, or is it worth documenting?
================
Comment at: streamexecutor/lib/platforms/cuda/cmake/modules/FindLibcuda.cmake:8
@@ +7,3 @@
+find_path(LIBCUDA_INCLUDE_DIR cuda.h /usr/local/cuda/include)
+find_library(LIBCUDA_LIBRARY cuda)
+
----------------
The libcuda binary we use should come from the same cuda install as the headers we use, I think?
https://reviews.llvm.org/D24538
More information about the Parallel_libs-commits
mailing list