[Openmp-commits] [PATCH] D106752: [OpenMP][Tool] Introducing the `llvm-deviceinfo` tool
Jose Manuel Monsalve Diaz via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Sat Jul 24 10:58:40 PDT 2021
josemonsalve2 created this revision.
Herald added subscribers: guansong, yaxunl, mgorny.
josemonsalve2 requested review of this revision.
Herald added a reviewer: jdoerfert.
Herald added subscribers: openmp-commits, sstefan1.
Herald added a project: OpenMP.
This patch introduces the `llvm-deviceinfo` tool, which uses the
omptarget library and interface to query device information from all
devices visible to OpenMP. It is inspired by PGI's `pgaccelinfo`.
Since omptarget usually requires a description structure with executable
kernels, I split the initialization of the RTLs and Devices to be able to
initialize all possible devices and query each of them.
This revision depends on the patch that introduces `print_device_info`.
A limitation is that the order in which the devices are initialized, and
therefore the device ID each one receives, is not necessarily the order
seen by OpenMP.
The changes are as follows:
1. Separate the RTL initialization that was performed in `RegisterLib` into its own `initRTLonce` function.
2. Create an `initAllRTLs` method that initializes all available RTLs at runtime.
3. Create the `llvm-deviceinfo.cpp` tool, which uses `omptarget` to query each device and print its information.
Example output of `llvm-deviceinfo`:
Device (0):
print_device_info not implemented
Device (1):
print_device_info not implemented
Device (2):
print_device_info not implemented
Device (3):
print_device_info not implemented
Device (4):
CUDA Driver Version: 11000
CUDA Device Number: 0
Device Name: Quadro P1000
Global Memory Size: 4236312576 bytes
Number of Multiprocessors: 5
Concurrent Copy and Execution: Yes
Total Constant Memory: 65536 bytes
Max Shared Memory per Block: 49152 bytes
Registers per Block: 65536
Warp Size: 32 Threads
Maximum Threads per Block: 1024
Maximum Block Dimensions: 1024, 1024, 64
Maximum Grid Dimensions: 2147483647 x 65535 x 65535
Maximum Memory Pitch: 2147483647 bytes
Texture Alignment: 512 bytes
Clock Rate: 1480500 kHz
Execution Timeout: Yes
Integrated Device: No
Can Map Host Memory: Yes
Compute Mode: DEFAULT
Concurrent Kernels: Yes
ECC Enabled: No
Memory Clock Rate: 2505000 kHz
Memory Bus Width: 128 bits
L2 Cache Size: 1048576 bytes
Max Threads Per SMP: 2048
Async Engines: Yes (2)
Unified Addressing: Yes
Managed Memory: Yes
Concurrent Managed Memory: Yes
Preemption Supported: Yes
Cooperative Launch: Yes
Multi-Device Boars: No
Compute Capabilities: 61
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D106752
Files:
openmp/libomptarget/CMakeLists.txt
openmp/libomptarget/include/omptarget.h
openmp/libomptarget/plugins/cuda/CMakeLists.txt
openmp/libomptarget/src/exports
openmp/libomptarget/src/interface.cpp
openmp/libomptarget/src/rtl.cpp
openmp/libomptarget/src/rtl.h
openmp/libomptarget/tools/CMakeLists.txt
openmp/libomptarget/tools/deviceinfo/CMakeLists.txt
openmp/libomptarget/tools/deviceinfo/llvm-deviceinfo.cpp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D106752.361471.patch
Type: text/x-patch
Size: 8190 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/openmp-commits/attachments/20210724/620a5d5e/attachment-0001.bin>