[Parallel_libs-commits] [PATCH] D24538: [SE] Add CUDA platform

Tue Sep 13 21:18:39 PDT 2016

jlebar added inline comments.

================
Comment at: streamexecutor/CMakeLists.txt:6
@@ -5,1 +5,3 @@
 option(STREAM_EXECUTOR_ENABLE_CONFIG_TOOL "enable building streamexecutor-config tool" ON)
+option(STREAM_EXECUTOR_ENABLE_CUDA_PLATFORM "enable building the CUDA StreamExecutor platform" OFF)
+
----------------
Not necessarily in this patch, but we should document how to configure this with CUDA enabled and (see below) how to point cmake at different CUDA installs.

================
Comment at: streamexecutor/include/streamexecutor/PlatformOptions.h.in:4
@@ +3,3 @@
+
+#cmakedefine STREAM_EXECUTOR_ENABLE_CUDA_PLATFORM
+
----------------
Maybe we should have a comment in this file explaining what it's for.

================
Comment at: streamexecutor/include/streamexecutor/platforms/cuda/CUDAPlatformDevice.h:37
@@ +36,3 @@
+
+  std::string getName() const override { return "CUDA"; }
+
----------------
Do we want the device number in the name?

================
Comment at: streamexecutor/lib/CMakeLists.txt:41
@@ -20,1 +40,3 @@
+    LINK_LIBS
+    ${STREAM_EXECUTOR_LIBCUDA_LIBRARIES})
 
----------------
Note to self, need to patch this in and try it out with the in-tree build.

================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatform.cpp:44
@@ +43,3 @@
+Expected<Device> CUDAPlatform::getDevice(size_t DeviceIndex) {
+  static CUresult InitResult = []() { return cuInit(0); }();
+
----------------
Do you want to wrap the cuInit(0) call in a helper function so we only ever call it once?

  static CUResult ensureCudaInitialized() {
    static InitResult = ...;
    return InitResult;
  }

I guess it doesn't make a big difference either way.

================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatform.cpp:59
@@ +58,3 @@
+  }
+  return Device(&Iterator->second);
+}
----------------
Hm.  Do we care if getDevice() is slow?  We could probably do this without a lock without too much trouble (essentially using the same mechanism used to ensure that function-static variables are initialized only once, except we'd be using it for member variables).  Maybe for another patch, if we care about performance.  If we don't care, even better.  :)

================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatformDevice.cpp:37
@@ +36,3 @@
+  if (CUresult Result = cuCtxSetCurrent(ContextHandle))
+    return CUresultToError(Result);
+
----------------
Hm, I wonder if we want to be more descriptive in our error handling and give some sort of "backtrace", indicating where we failed.  e.g. something as simple as

  return CUresultToError(result, "cuCtxSetCurrent");

Maybe not for this patch, though.

================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatformDevice.cpp:56
@@ +55,3 @@
+  CUresult Result = cuDevicePrimaryCtxRelease(DeviceIndex);
+  // TODO(jhen): Log error.
+}
----------------
Is this an "unused variable" warning?  If so maybe do

  (void) Result;

================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatformDevice.cpp:80
@@ +79,3 @@
+  if (!Code)
+    return make_error("no suitable CUDA source found for compute capability " +
+                      llvm::Twine(ComputeCapabilityMajor) + "." +
----------------
Nit, "CUDA source" is probably not the right phrase, since this is all compiled CUDA code (PTX and SASS).

================
Comment at: streamexecutor/lib/platforms/cuda/cmake/modules/FindLibcuda.cmake:7
@@ +6,3 @@
+
+find_path(LIBCUDA_INCLUDE_DIR cuda.h /usr/local/cuda/include)
+find_library(LIBCUDA_LIBRARY cuda)
----------------
Can we override this on the command line somehow?  If so, is that obvious, or is it worth documenting?

================
Comment at: streamexecutor/lib/platforms/cuda/cmake/modules/FindLibcuda.cmake:8
@@ +7,3 @@
+find_path(LIBCUDA_INCLUDE_DIR cuda.h /usr/local/cuda/include)
+find_library(LIBCUDA_LIBRARY cuda)
+
----------------
The libcuda binary we use should come from the same cuda install as the headers we use, I think?


https://reviews.llvm.org/D24538