[Openmp-commits] [PATCH] D127505: [Libomptarget] Add checks for CUDA subarchitecture using new info
Joseph Huber via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Fri Jun 10 08:31:41 PDT 2022
jhuber6 created this revision.
jhuber6 added reviewers: jdoerfert, saiislam, tianshilei1992, JonChesterfield.
Herald added subscribers: mattd, yaxunl.
Herald added a project: All.
jhuber6 requested review of this revision.
Herald added a project: OpenMP.
Herald added a subscriber: openmp-commits.
This patch extends the `is_valid_binary` routine to also check whether the
binary's architecture string matches the compute capability queried from
the CUDA runtime for any available device. This lets us use only the binary
whose compute capability matches, which enables basic multi-architecture
binaries for CUDA.
Depends on D127432 <https://reviews.llvm.org/D127432>
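
As a rough illustration of the matching rule described above, the sketch below builds an `sm_XY` string from a major/minor pair and compares it to the image's architecture string. It is not the plugin code: `makeArchString` and `anyDeviceMatches` are hypothetical names, and the device list is a plain vector of (major, minor) pairs standing in for the values the CUDA driver would report.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: format a compute capability as "sm_<major><minor>",
// e.g. (7, 0) becomes "sm_70".
std::string makeArchString(int Major, int Minor) {
  assert(Major >= 0 && Minor >= 0 && "compute capability must be non-negative");
  return "sm_" + std::to_string(Major) + std::to_string(Minor);
}

// Hypothetical stand-in for the device loop. Devices holds (major, minor)
// pairs supplied by the caller instead of being queried from the driver.
bool anyDeviceMatches(const char *ImageArch,
                      const std::vector<std::pair<int, int>> &Devices) {
  // No subarchitecture recorded in the image: assume it is compatible.
  if (!ImageArch)
    return true;

  // Accept the image as soon as one device reports the same architecture.
  for (const auto &D : Devices)
    if (makeArchString(D.first, D.second) == ImageArch)
      return true;

  return false;
}
```

Under these assumptions, an image tagged `sm_80` would be accepted on a system exposing an `sm_70` and an `sm_80` device, while an image tagged `sm_35` would be rejected on a system with only an `sm_70` device. The comparison is an exact string match, so an image built for an older compute capability is not treated as compatible with a newer device.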
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D127505
Files:
openmp/libomptarget/plugins/cuda/src/rtl.cpp
Index: openmp/libomptarget/plugins/cuda/src/rtl.cpp
===================================================================
--- openmp/libomptarget/plugins/cuda/src/rtl.cpp
+++ openmp/libomptarget/plugins/cuda/src/rtl.cpp
@@ -1484,7 +1484,40 @@
 #endif
 int32_t __tgt_rtl_is_valid_binary(__tgt_device_binary *image) {
-  return elf_check_machine(image, /* EM_CUDA */ 190);
+  if (!elf_check_machine(image, /* EM_CUDA */ 190))
+    return false;
+
+  // A subarchitecture was not specified. Assume it is compatible.
+  if (!image->Info.Arch)
+    return true;
+
+  DP("The binary's compute capability is %s\n", image->Info.Arch);
+
+  int32_t NumberOfDevices = 0;
+  if (cuDeviceGetCount(&NumberOfDevices) != CUDA_SUCCESS)
+    return false;
+
+  for (int32_t DeviceId = 0; DeviceId < NumberOfDevices; ++DeviceId) {
+    CUdevice Device;
+    if (cuDeviceGet(&Device, DeviceId) != CUDA_SUCCESS)
+      return false;
+
+    int32_t Major, Minor;
+    if (cuDeviceGetAttribute(&Major,
+                             CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR,
+                             Device) != CUDA_SUCCESS)
+      return false;
+    if (cuDeviceGetAttribute(&Minor,
+                             CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR,
+                             Device) != CUDA_SUCCESS)
+      return false;
+
+    std::string ArchStr = "sm_" + std::to_string(Major) + std::to_string(Minor);
+    DP("Device %d has compute capability %s\n", DeviceId, ArchStr.c_str());
+    if (ArchStr == image->Info.Arch)
+      return true;
+  }
+  return false;
 }
 int32_t __tgt_rtl_number_of_devices() { return DeviceRTL.getNumOfDevices(); }