[Openmp-commits] [PATCH] D106627: [OpenMP] Add environment variables to change stack / heap size in the CUDA plugin

Johannes Doerfert via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Thu Jul 22 22:01:12 PDT 2021


jdoerfert added inline comments.


================
Comment at: openmp/libomptarget/plugins/cuda/src/rtl.cpp:652
+    } else {
+      if (cuCtxGetLimit(&StackLimit, CU_LIMIT_STACK_SIZE) != CUDA_SUCCESS)
+        return OFFLOAD_FAIL;
----------------
abhinavgaba wrote:
> abhinavgaba wrote:
> > jdoerfert wrote:
> > > abhinavgaba wrote:
> > > > These enums don't seem to be defined in `cuda.h` or anywhere else. Can you please take a look?
> > > My cuda.h defines them.
> > > ```
> > > /opt/cuda/targets/x86_64-linux/include/cuda.h
> > > 1130:    CU_LIMIT_STACK_SIZE                       = 0x00, /**< GPU thread stack size */
> > > ```
> > > https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__TYPES.html lists them as well.
> > > Are you sure they are not there?
> > I meant in the compiler sources. The other enums used in libomptarget seem to be defined in `openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h`. The absence of these definitions on a machine that doesn't have the CUDA drivers installed is causing a build failure for me with these errors:
> > ```
> > llvm/openmp/libomptarget/plugins/cuda/src/rtl.cpp: In member function ‘int {anonymous}::DeviceRTLTy::initDevice(int)’:
> > llvm/openmp/libomptarget/plugins/cuda/src/rtl.cpp:649:25: error: ‘CU_LIMIT_STACK_SIZE’ was not declared in this scope
> > if (cuCtxSetLimit(CU_LIMIT_STACK_SIZE, StackLimit) != CUDA_SUCCESS)
> >                   ^~~~~~~~~~~~~~~~~~~
> > ```
> > 
> > I looked at the buildbots, but they skip building the CUDA plugin altogether, so they don't report any failures. For instance, https://lab.llvm.org/buildbot/#/builders/84/builds/12107/steps/4/logs/stdio has this in the log:
> > 
> > ```
> > -- Could NOT find LIBOMPTARGET_DEP_CUDA_DRIVER (missing: LIBOMPTARGET_DEP_CUDA_DRIVER_LIBRARIES) 
> > -- Could NOT find LIBOMPTARGET_DEP_VEO (missing: LIBOMPTARGET_DEP_VEO_LIBRARIES LIBOMPTARGET_DEP_VEOSINFO_LIBRARIES LIBOMPTARGET_DEP_VEO_INCLUDE_DIRS) 
> > -- LIBOMPTARGET: Building offloading runtime library libomptarget.
> > ...
> > -- LIBOMPTARGET: Not building CUDA offloading plugin: libelf dependency not found.
> > ...
> > -- check-libomptarget does nothing.
> > ```
> This patch makes the build pass for me, but I have no way to verify it. @jhuber6 can you please take a look?
> 
> ```
> diff --git a/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.cpp b/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.cpp
> index c84b3814065e..235efd2728de 100644
> --- a/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.cpp
> +++ b/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.cpp
> @@ -61,6 +61,9 @@ DLWRAP(cuDeviceCanAccessPeer, 3);
>  DLWRAP(cuCtxEnablePeerAccess, 2);
>  DLWRAP(cuMemcpyPeerAsync, 6);
> 
> +DLWRAP(cuCtxGetLimit, 2);
> +DLWRAP(cuCtxSetLimit, 2);
> +
>  DLWRAP_FINALIZE();
> 
>  #ifndef DYNAMIC_CUDA_PATH
> diff --git a/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h b/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h
> index 045c39cacc97..17aa2a12ef6c 100644
> --- a/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h
> +++ b/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h
> @@ -34,6 +34,17 @@ typedef enum CUstream_flags_enum {
>    CU_STREAM_NON_BLOCKING = 0x1,
>  } CUstream_flags;
> 
> +typedef enum CUlimit_enum {
> +  CU_LIMIT_STACK_SIZE = 0x0,
> +  CU_LIMIT_PRINTF_FIFO_SIZE = 0x1,
> +  CU_LIMIT_MALLOC_HEAP_SIZE = 0x2,
> +  CU_LIMIT_DEV_RUNTIME_SYNC_DEPTH = 0x3,
> +  CU_LIMIT_DEV_RUNTIME_PENDING_LAUNCH_COUNT = 0x4,
> +  CU_LIMIT_MAX_L2_FETCH_GRANULARITY = 0x5,
> +  CU_LIMIT_PERSISTING_L2_CACHE_SIZE = 0x6,
> +  CU_LIMIT_MAX
> +} CUlimit;
> +
>  typedef enum CUdevice_attribute_enum {
>    CU_DEVICE_ATTRIBUTE_MAX_BLOCK_DIM_X = 2,
>    CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_X = 5,
> @@ -100,4 +111,7 @@ CUresult cuCtxEnablePeerAccess(CUcontext, unsigned);
>  CUresult cuMemcpyPeerAsync(CUdeviceptr, CUcontext, CUdeviceptr, CUcontext,
>                             size_t, CUstream);
> 
> +CUresult cuCtxGetLimit(size_t *, CUlimit);
> +CUresult cuCtxSetLimit(CUlimit, size_t);
> +
> ```
Right, that makes sense. 

The above looks good to me. Could you commit it?
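
For reference, here is a minimal sketch of the set-or-query pattern that the snippet at the top of this comment belongs to, written so it builds on a machine without CUDA installed. Everything below is an assumption for illustration, not the code in rtl.cpp: the `cuCtxGetLimit` / `cuCtxSetLimit` bodies are local stubs with the same signatures as the declarations in the patch above, `applyLimitFromEnv` is a hypothetical helper, and the `OFFLOAD_*` constants are redefined locally.

```cpp
#include <cctype>
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <cassert>

// Local stand-ins so this sketch compiles without any CUDA headers.
// Enum values follow the ones quoted in the patch; the rest is hypothetical.
typedef enum { CUDA_SUCCESS = 0, CUDA_ERROR_INVALID_VALUE = 1 } CUresult;
typedef enum CUlimit_enum {
  CU_LIMIT_STACK_SIZE = 0x0,
  CU_LIMIT_PRINTF_FIFO_SIZE = 0x1,
  CU_LIMIT_MALLOC_HEAP_SIZE = 0x2,
  CU_LIMIT_MAX
} CUlimit;

// Locally defined result codes (assumed values, for this sketch only).
constexpr int OFFLOAD_SUCCESS = 0;
constexpr int OFFLOAD_FAIL = ~0;

// Fake per-context limits; a real driver would own this state.
static size_t FakeLimits[CU_LIMIT_MAX] = {1024, 1 << 20, 8 << 20};

// Stub with the same signature as the declaration in the patch.
CUresult cuCtxGetLimit(size_t *Value, CUlimit Limit) {
  if (!Value || Limit >= CU_LIMIT_MAX)
    return CUDA_ERROR_INVALID_VALUE;
  *Value = FakeLimits[Limit];
  return CUDA_SUCCESS;
}

// Stub with the same signature as the declaration in the patch.
CUresult cuCtxSetLimit(CUlimit Limit, size_t Value) {
  if (Limit >= CU_LIMIT_MAX)
    return CUDA_ERROR_INVALID_VALUE;
  FakeLimits[Limit] = Value;
  return CUDA_SUCCESS;
}

// Hypothetical helper: if the environment variable Name is set to a decimal
// number, set the limit to that value; if it is unset or empty, query the
// current limit instead. Result receives the value in effect afterwards.
int applyLimitFromEnv(const char *Name, CUlimit Limit, size_t &Result) {
  const char *Str = std::getenv(Name);
  if (Str && *Str) {
    // Require a leading digit so signs and whitespace are rejected.
    if (!std::isdigit(static_cast<unsigned char>(*Str)))
      return OFFLOAD_FAIL;
    errno = 0;
    char *End = nullptr;
    unsigned long long Parsed = std::strtoull(Str, &End, 10);
    if (errno != 0 || *End != '\0')
      return OFFLOAD_FAIL;
    Result = static_cast<size_t>(Parsed);
    if (cuCtxSetLimit(Limit, Result) != CUDA_SUCCESS)
      return OFFLOAD_FAIL;
  } else {
    if (cuCtxGetLimit(&Result, Limit) != CUDA_SUCCESS)
      return OFFLOAD_FAIL;
  }
  return OFFLOAD_SUCCESS;
}
```

A caller would invoke the helper once per limit during device initialization, passing whatever environment variable name the plugin documents together with `CU_LIMIT_STACK_SIZE` or `CU_LIMIT_MALLOC_HEAP_SIZE`. Both branches need the enum and both function declarations to be visible at compile time, which is why the header additions above are required when building without the CUDA toolkit.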


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106627/new/

https://reviews.llvm.org/D106627


