[Parallel_libs-commits] [PATCH] D24538: [SE] Add CUDA platform

Jason Henline via Parallel_libs-commits parallel_libs-commits at lists.llvm.org
Wed Sep 14 10:13:48 PDT 2016


jhen added inline comments.

================
Comment at: streamexecutor/CMakeLists.txt:6
@@ -5,1 +5,3 @@
 option(STREAM_EXECUTOR_ENABLE_CONFIG_TOOL "enable building streamexecutor-config tool" ON)
+option(STREAM_EXECUTOR_ENABLE_CUDA_PLATFORM "enable building the CUDA StreamExecutor platform" OFF)
+
----------------
jlebar wrote:
> Not necessarily in this patch, but we should document how to configure this with CUDA enabled and (see below) how to point cmake at different CUDA installs.
Sounds good. I'll tackle this in the next patch when I enable pointing cmake at different CUDA installs.
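For reference, the minimal configuration with the platform turned on would look something like this (assuming cuda.h and libcuda live in the default /usr/local/cuda location that FindLibcuda.cmake searches; overriding that location is exactly what the follow-up patch will add):

  cmake -DSTREAM_EXECUTOR_ENABLE_CUDA_PLATFORM=ON <path-to-streamexecutor-source>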

================
Comment at: streamexecutor/include/streamexecutor/platforms/cuda/CUDAPlatformDevice.h:23
@@ +22,3 @@
+
+Error CUresultToError(int CUResult);
+
----------------
jprice wrote:
> Where is this defined? I get undefined references to this function when I try to build this patch.
Oops, I failed at version control. The definition is now in CUDAPlatformDevice.cpp.

================
Comment at: streamexecutor/include/streamexecutor/platforms/cuda/CUDAPlatformDevice.h:37
@@ +36,3 @@
+
+  std::string getName() const override { return "CUDA"; }
+
----------------
jprice wrote:
> jlebar wrote:
> > Do we want the device number in the name?
> Or even the actual name of the device (`cuDeviceGetName`)? This is certainly more useful for client code (in my experience), particularly when working with systems that have more than one type of GPU.
> 
> It strikes me that "CUDA" is really the name of the platform, not the device. In which case, maybe it makes sense to also have a `getPlatformName()` or `getPlatform()->getName()` that gives you the "CUDA" vs "OpenCL" vs "Host" which is used for the error strings.
> 
I agree that more information is better here. Just as jprice suggested, I changed `getName` to return the device index along with the name from `cuDeviceGetName`, and I added a new `getPlatformName` function for error messages.
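Roughly, the accessors now look like this (just a sketch; `DeviceIndex` stands for the stored device ordinal, and the exact string format is open to bikeshedding):

  std::string getName() const override {
    // Ask the driver for the human-readable device name, e.g. "Tesla K40c".
    char DeviceName[256];
    if (cuDeviceGetName(DeviceName, sizeof(DeviceName), DeviceIndex) != CUDA_SUCCESS)
      return "CUDA device " + std::to_string(DeviceIndex);
    return "CUDA device " + std::to_string(DeviceIndex) + ": " + DeviceName;
  }

  std::string getPlatformName() const { return "CUDA"; }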

================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatform.cpp:44
@@ +43,3 @@
+Expected<Device> CUDAPlatform::getDevice(size_t DeviceIndex) {
+  static CUresult InitResult = []() { return cuInit(0); }();
+
----------------
jlebar wrote:
> Do you want to wrap the cuInit(0) call in a helper function so we only ever call it once?
> 
>   static CUresult ensureCudaInitialized() {
>     static CUresult InitResult = ...;
>     return InitResult;
>   }
> 
> I guess it doesn't make a big difference either way.
Great idea! I like that much better.
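Concretely, something along these lines (names illustrative):

  // Runs cuInit(0) exactly once, the first time any getDevice() call needs it,
  // and caches the result so every later call sees the same status.
  static CUresult ensureCUDAInitialized() {
    static CUresult InitResult = cuInit(0);
    return InitResult;
  }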

================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatform.cpp:59
@@ +58,3 @@
+  }
+  return Device(&Iterator->second);
+}
----------------
jlebar wrote:
> Hm.  Do we care if getDevice() is slow?  We could probably do this without a lock without too much trouble (essentially using the same mechanism used to ensure that function-static variables are initialized only once, except we'd be using it for member variables).  Maybe for another patch, if we care about performance.  If we don't care, even better.  :)
I think it doesn't matter for now, and we can come back to it later if it ever seems to matter.

================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatformDevice.cpp:37
@@ +36,3 @@
+  if (CUresult Result = cuCtxSetCurrent(ContextHandle))
+    return CUresultToError(Result);
+
----------------
jlebar wrote:
> Hm, I wonder if we want to be more descriptive in our error handling and give some sort of "backtrace", indicating where we failed.  e.g. something as simple as
> 
>   return CUresultToError(result, "cuCtxSetCurrent");
> 
> Maybe not for this patch, though.
Yes, I could already have used that while debugging. I've implemented the simple method-name reporting for now.
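For the record, the simple version looks roughly like this (sketch; the exact parameter type for the message string may differ):

  // Declaration with the extra context string:
  Error CUresultToError(int CUResult, const llvm::Twine &Message);

  // At a call site, pass the name of the driver call that failed:
  if (CUresult Result = cuCtxSetCurrent(ContextHandle))
    return CUresultToError(Result, "cuCtxSetCurrent");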

================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatformDevice.cpp:56
@@ +55,3 @@
+  CUresult Result = cuDevicePrimaryCtxRelease(DeviceIndex);
+  // TODO(jhen): Log error.
+}
----------------
jlebar wrote:
> Is this an "unused variable" warning?  If so maybe do
> 
>   (void) Result;
Yes, it was a warning. Thanks for the suggestion. I've done this for all these cases where error handling has yet to be implemented.
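At this particular site that comes out to something like:

  CUresult Result = cuDevicePrimaryCtxRelease(DeviceIndex);
  // TODO(jhen): Log error.
  (void)Result; // Suppress the unused-variable warning until then.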

================
Comment at: streamexecutor/lib/platforms/cuda/CUDAPlatformDevice.cpp:80
@@ +79,3 @@
+  if (!Code)
+    return make_error("no suitable CUDA source found for compute capability " +
+                      llvm::Twine(ComputeCapabilityMajor) + "." +
----------------
jlebar wrote:
> Nit, "CUDA source" is probably not the right phrase, since this is all compiled CUDA code (PTX and SASS).
I changed it to "CUDA code". I think that should be correct.

================
Comment at: streamexecutor/lib/platforms/cuda/cmake/modules/FindLibcuda.cmake:7
@@ +6,3 @@
+
+find_path(LIBCUDA_INCLUDE_DIR cuda.h /usr/local/cuda/include)
+find_library(LIBCUDA_LIBRARY cuda)
----------------
jlebar wrote:
> Can we override this on the command line somehow?  If so, is that obvious, or is it worth documenting?
I added a TODO for this functionality; I plan to implement it in the next patch.
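(Hypothetical, to be confirmed in that patch: since find_path and find_library store their results as cache variables, setting them explicitly on the command line should already work as an escape hatch, with made-up paths here just for illustration:)

  cmake -DLIBCUDA_INCLUDE_DIR=/opt/cuda/include \
        -DLIBCUDA_LIBRARY=/opt/cuda/lib64/stubs/libcuda.so <source-dir>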

================
Comment at: streamexecutor/lib/platforms/cuda/cmake/modules/FindLibcuda.cmake:8
@@ +7,3 @@
+find_path(LIBCUDA_INCLUDE_DIR cuda.h /usr/local/cuda/include)
+find_library(LIBCUDA_LIBRARY cuda)
+
----------------
jlebar wrote:
> The libcuda binary we use should come from the same cuda install as the headers we use, I think?
I agree. I added a TODO here to fix it in the next patch.
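One possible shape for that follow-up (sketch only): derive the search hint for the library from wherever the header was found, e.g.:

  # Keep libcuda from the same install as cuda.h by hinting the library search
  # with the prefix the header was found under.
  get_filename_component(LIBCUDA_PREFIX "${LIBCUDA_INCLUDE_DIR}" DIRECTORY)
  find_library(LIBCUDA_LIBRARY cuda
               HINTS "${LIBCUDA_PREFIX}/lib64" "${LIBCUDA_PREFIX}/lib64/stubs")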


https://reviews.llvm.org/D24538




