[Openmp-commits] [openmp] cfa6e79 - [Libomptarget] Don't report lack of CUDA devices
Joel E. Denny via Openmp-commits
openmp-commits at lists.llvm.org
Fri Jul 22 11:50:30 PDT 2022
Author: Joel E. Denny
Date: 2022-07-22T14:46:45-04:00
New Revision: cfa6e79df30c7f9ea319d304670dcce7e9376787
URL: https://github.com/llvm/llvm-project/commit/cfa6e79df30c7f9ea319d304670dcce7e9376787
DIFF: https://github.com/llvm/llvm-project/commit/cfa6e79df30c7f9ea319d304670dcce7e9376787.diff
LOG: [Libomptarget] Don't report lack of CUDA devices
Sometimes libomptarget's CUDA plugin produces unhelpful diagnostics
about a lack of CUDA devices before an application runs:
```
$ clang -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa hello-world.c
$ ./a.out
CUDA error: Error returned from cuInit
CUDA error: no CUDA-capable device is detected
Hello World: 4
```
This can happen when the CUDA plugin was built but all CUDA devices
are currently disabled in some manner, perhaps because
`CUDA_VISIBLE_DEVICES` is set to the empty string. As shown in the
above example, it can even happen when we haven't compiled the
application for offloading to CUDA.
The following code from `openmp/libomptarget/plugins/cuda/src/rtl.cpp`
appears to be intended to handle this case, and it chooses not to
write a diagnostic to stderr unless debugging is enabled:
```
if (NumberOfDevices == 0) {
  DP("There are no devices supporting CUDA.\n");
  return;
}
```
The problem is that the above code is never reached because the
earlier `cuInit` returns `CUDA_ERROR_NO_DEVICE`. This patch handles
that `cuInit` case in the same manner as the above code handles the
`NumberOfDevices == 0` case.
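The ordering problem described above can be illustrated with a minimal, self-contained sketch. It is not the plugin's actual code: `InitOutcome` and `initPlugin` are hypothetical names introduced here, the `CUresult` enum is a local stand-in for the driver API type, and the real `DP` macro is modeled as a list of debug-only messages. The point is only that the `CUDA_ERROR_NO_DEVICE` check has to precede the generic error check for the quiet path to be taken.

```cpp
#include <string>
#include <vector>

// Local stand-in for the driver API result codes named in this patch.
enum CUresult {
  CUDA_SUCCESS = 0,
  CUDA_ERROR_INVALID_VALUE = 1,
  CUDA_ERROR_NO_DEVICE = 100,
};

// Hypothetical record of what initialization did in this sketch.
struct InitOutcome {
  bool Initialized;                // true if at least one device is usable
  std::vector<std::string> Stderr; // diagnostics that are always printed
  std::vector<std::string> Debug;  // messages printed only when debugging (like DP)
};

// Hypothetical model of the initialization order after the patch.
// InitErr plays the role of the value returned by cuInit.
InitOutcome initPlugin(CUresult InitErr, int NumberOfDevices) {
  InitOutcome Out{false, {}, {}};

  // New check: "no device" is an expected condition, so stay quiet.
  if (InitErr == CUDA_ERROR_NO_DEVICE) {
    Out.Debug.push_back("There are no devices supporting CUDA.");
    return Out;
  }

  // Any other cuInit failure is still reported unconditionally.
  if (InitErr != CUDA_SUCCESS) {
    Out.Stderr.push_back("CUDA error: Error returned from cuInit");
    return Out;
  }

  // Pre-existing check, reachable only when cuInit succeeded.
  if (NumberOfDevices == 0) {
    Out.Debug.push_back("There are no devices supporting CUDA.");
    return Out;
  }

  Out.Initialized = true;
  return Out;
}
```

In this model, `initPlugin(CUDA_ERROR_NO_DEVICE, 0)` leaves `Stderr` empty, whereas moving the first check below the generic one would reproduce the unwanted diagnostic.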
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D130371
Added:
openmp/libomptarget/test/offloading/cuda_no_devices.c
Modified:
openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h
openmp/libomptarget/plugins/cuda/src/rtl.cpp
Removed:
################################################################################
diff --git a/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h b/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h
index 51a59480473ee..04e3c1f6ce2c2 100644
--- a/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h
+++ b/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h
@@ -27,6 +27,7 @@ typedef struct CUevent_st *CUevent;
 typedef enum cudaError_enum {
   CUDA_SUCCESS = 0,
   CUDA_ERROR_INVALID_VALUE = 1,
+  CUDA_ERROR_NO_DEVICE = 100,
   CUDA_ERROR_INVALID_HANDLE = 400,
 } CUresult;
diff --git a/openmp/libomptarget/plugins/cuda/src/rtl.cpp b/openmp/libomptarget/plugins/cuda/src/rtl.cpp
index 2ab4d6017b5ed..97fc3e9908eea 100644
--- a/openmp/libomptarget/plugins/cuda/src/rtl.cpp
+++ b/openmp/libomptarget/plugins/cuda/src/rtl.cpp
@@ -507,6 +507,10 @@ class DeviceRTLTy {
       DP("Failed to load CUDA shared library\n");
       return;
     }
+    if (Err == CUDA_ERROR_NO_DEVICE) {
+      DP("There are no devices supporting CUDA.\n");
+      return;
+    }
     if (!checkResult(Err, "Error returned from cuInit\n")) {
       return;
     }
diff --git a/openmp/libomptarget/test/offloading/cuda_no_devices.c b/openmp/libomptarget/test/offloading/cuda_no_devices.c
new file mode 100644
index 0000000000000..ff3e3a6f5560e
--- /dev/null
+++ b/openmp/libomptarget/test/offloading/cuda_no_devices.c
@@ -0,0 +1,20 @@
+// The CUDA plugin used to complain on stderr when no CUDA devices were enabled,
+// and then it let the application run anyway. Check that there's no such
+// complaint anymore, especially when the user isn't targeting CUDA.
+
+// RUN: %libomptarget-compile-generic
+// RUN: env CUDA_VISIBLE_DEVICES= \
+// RUN: %libomptarget-run-generic 2>&1 | %fcheck-generic
+
+#include <stdio.h>
+
+// CHECK-NOT: {{.}}
+// CHECK: Hello World: 4
+// CHECK-NOT: {{.}}
+int main() {
+  int x = 0;
+  #pragma omp target teams num_teams(2) reduction(+:x)
+  x += 2;
+  printf("Hello World: %d\n", x);
+  return 0;
+}
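The expected output `Hello World: 4` in the test above follows from the `reduction(+:x)` clause. As a rough sketch, assuming exactly two teams execute the region: each team works on a private copy of `x` initialized to 0, adds 2 to it, and the private copies are then summed into the original `x`. The helper below, `simulateTeamsReduction`, is a hypothetical name that models this arithmetic sequentially. It does not use OpenMP and says nothing about how libomptarget implements reductions.

```cpp
#include <vector>

// Hypothetical sequential model of `reduction(+:x)` across teams.
// Assumes every one of NumTeams teams runs the region exactly once.
int simulateTeamsReduction(int NumTeams, int Initial, int PerTeamIncrement) {
  // Each team gets a private copy initialized to 0, the identity for '+'.
  std::vector<int> Private(NumTeams, 0);
  for (int &P : Private)
    P += PerTeamIncrement; // the body of the target region: x += 2

  // The private copies are combined with the original value of x.
  int X = Initial;
  for (int P : Private)
    X += P;
  return X;
}
```

With the test's values, `simulateTeamsReduction(2, 0, 2)` yields 4, matching the `CHECK` line.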