[Openmp-commits] [openmp] fdbb153 - [Libomptarget][CUDA] Check CUDA compatibility correctly

Joseph Huber via Openmp-commits openmp-commits at lists.llvm.org
Wed Aug 10 08:15:34 PDT 2022


Author: Joseph Huber
Date: 2022-08-10T11:15:27-04:00
New Revision: fdbb15355e7977b914cbd7e753b5e909d735ad83

URL: https://github.com/llvm/llvm-project/commit/fdbb15355e7977b914cbd7e753b5e909d735ad83
DIFF: https://github.com/llvm/llvm-project/commit/fdbb15355e7977b914cbd7e753b5e909d735ad83.diff

LOG: [Libomptarget][CUDA] Check CUDA compatibility correctly

We recently added support for multi-architecture binaries in
libomptarget. This is done by extracting the architecture from the
embedded image and comparing it with the major and minor compute
capability version reported by each available CUDA device. Previously we
required an exact match, which was not correct for binary compatibility.
The CUDA documentation states that an image is compatible with any
device that has the same major version and an equal or greater minor
version.
Change the check to use this new logic in the CUDA plugin.
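
As a rough illustration of the rule described above, here is a minimal
standalone sketch. The helper name isImageCompatible is hypothetical and
is not part of libomptarget; it assumes the architecture string has the
form "sm_XY" with a single-digit major and a single-digit minor version,
which is the same assumption the patch makes.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper for illustration only (not the plugin's code).
// Returns true if an image built for ImageArch (e.g. "sm_70") may run on a
// device whose compute capability is DeviceMajor.DeviceMinor.
// Assumption: ImageArch is exactly "sm_" followed by two decimal digits.
bool isImageCompatible(const std::string &ImageArch, int32_t DeviceMajor,
                       int32_t DeviceMinor) {
  const std::string Prefix = "sm_";
  // Reject anything that is not "sm_" plus exactly two characters.
  if (ImageArch.size() != Prefix.size() + 2 ||
      ImageArch.compare(0, Prefix.size(), Prefix) != 0)
    return false;

  char MajorCh = ImageArch[Prefix.size()];
  char MinorCh = ImageArch[Prefix.size() + 1];
  // Both trailing characters must be decimal digits.
  if (MajorCh < '0' || MajorCh > '9' || MinorCh < '0' || MinorCh > '9')
    return false;

  int32_t ImageMajor = MajorCh - '0';
  int32_t ImageMinor = MinorCh - '0';
  // Same major revision, and the device minor is at least the image minor.
  return DeviceMajor == ImageMajor && DeviceMinor >= ImageMinor;
}
```

Under these assumptions, an "sm_70" image would be accepted on a 7.5
device, while an "sm_75" image would be rejected on a 7.0 device and an
"sm_70" image would be rejected on an 8.0 device.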

Fixes #57049

Reviewed By: jdoerfert, ye-luo

Differential Revision: https://reviews.llvm.org/D131567

Added: 
    

Modified: 
    openmp/libomptarget/plugins/cuda/src/rtl.cpp

Removed: 
    


################################################################################
diff  --git a/openmp/libomptarget/plugins/cuda/src/rtl.cpp b/openmp/libomptarget/plugins/cuda/src/rtl.cpp
index 2b83878fba0ef..2916a2d723381 100644
--- a/openmp/libomptarget/plugins/cuda/src/rtl.cpp
+++ b/openmp/libomptarget/plugins/cuda/src/rtl.cpp
@@ -10,6 +10,8 @@
 //
 //===----------------------------------------------------------------------===//
 
+#include "llvm/ADT/StringRef.h"
+
 #include <algorithm>
 #include <cassert>
 #include <cstddef>
@@ -33,6 +35,8 @@
 
 #include "llvm/Frontend/OpenMP/OMPConstants.h"
 
+using namespace llvm;
+
 // Utility for retrieving and printing CUDA error string.
 #ifdef OMPTARGET_DEBUG
 #define CUDA_ERR_STRING(err)                                                   \
@@ -1529,13 +1533,14 @@ int32_t __tgt_rtl_is_valid_binary_info(__tgt_device_image *image,
     return false;
 
   // A subarchitecture was not specified. Assume it is compatible.
-  if (!info->Arch)
+  if (!info || !info->Arch)
     return true;
 
   int32_t NumberOfDevices = 0;
   if (cuDeviceGetCount(&NumberOfDevices) != CUDA_SUCCESS)
     return false;
 
+  StringRef ArchStr = StringRef(info->Arch).drop_front(sizeof("sm_") - 1);
   for (int32_t DeviceId = 0; DeviceId < NumberOfDevices; ++DeviceId) {
     CUdevice Device;
     if (cuDeviceGet(&Device, DeviceId) != CUDA_SUCCESS)
@@ -1551,8 +1556,11 @@ int32_t __tgt_rtl_is_valid_binary_info(__tgt_device_image *image,
                              Device) != CUDA_SUCCESS)
       return false;
 
-    std::string ArchStr = "sm_" + std::to_string(Major) + std::to_string(Minor);
-    if (ArchStr != info->Arch)
+    // A cubin generated for a certain compute capability is supported to run on
+    // any GPU with the same major revision and same or higher minor revision.
+    int32_t ImageMajor = ArchStr[0] - '0';
+    int32_t ImageMinor = ArchStr[1] - '0';
+    if (Major != ImageMajor || Minor < ImageMinor)
       return false;
   }
 


        

