[clang] [llvm] [openmp] [Libomptarget] Statically link all plugin runtimes (PR #87009)

via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 28 14:33:56 PDT 2024


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-clang

Author: Joseph Huber (jhuber6)

<details>
<summary>Changes</summary>

This patch overhauls the `libomptarget` and plugin interface. Currently,
we define a C API and compile each plugin as a separate shared library.
Then, `libomptarget` loads these API functions and forwards its internal
calls to them. This was originally designed to allow multiple
implementations of a library to be live. However, since then no one has
used this functionality and it prevents us from using much nicer
interfaces. If the old behavior is desired it should instead be
implemented as a separate plugin.

This patch replaces the `PluginAdaptorTy` interface with the
`GenericPluginTy` that is used by the plugins. Each plugin exports a
`createPlugin_<name>` function that is used to get the specific
implementation. This code is now shared with `libomptarget`.
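
As a rough illustration of that factory pattern, the sketch below shows how a list of `PLUGIN_TARGET(<name>)` entries could be expanded into calls to `createPlugin_<name>`. It is a self-contained toy, not the code from this patch: the `GenericPluginTy` here is a stand-in with a single hypothetical `getName()` method, the factory signature is assumed, and the `PLUGIN_TARGETS` macro stands in for the generated `Targets.def` file.

```cpp
#include <memory>
#include <string>
#include <vector>

// Stand-in for the real GenericPluginTy; getName() is hypothetical.
struct GenericPluginTy {
  virtual ~GenericPluginTy() = default;
  virtual std::string getName() const = 0;
};

struct HostPluginTy final : GenericPluginTy {
  std::string getName() const override { return "host"; }
};

struct CudaPluginTy final : GenericPluginTy {
  std::string getName() const override { return "cuda"; }
};

// One factory per plugin. The signature is an assumption for this sketch.
GenericPluginTy *createPlugin_host() { return new HostPluginTy(); }
GenericPluginTy *createPlugin_cuda() { return new CudaPluginTy(); }

// Stands in for the configured Targets.def, which holds one
// PLUGIN_TARGET(<name>) line per enabled plugin.
#define PLUGIN_TARGETS(PLUGIN_TARGET)                                          \
  PLUGIN_TARGET(host)                                                          \
  PLUGIN_TARGET(cuda)

// Expand the target list into one factory call per plugin and take
// ownership of each returned object.
std::vector<std::unique_ptr<GenericPluginTy>> createAllPlugins() {
  std::vector<std::unique_ptr<GenericPluginTy>> Plugins;
#define CREATE_PLUGIN(Name)                                                    \
  Plugins.push_back(std::unique_ptr<GenericPluginTy>(createPlugin_##Name()));
  PLUGIN_TARGETS(CREATE_PLUGIN)
#undef CREATE_PLUGIN
  return Plugins;
}
```

Because the list is expanded at compile time, every enabled plugin is resolved by the linker rather than looked up by symbol name at run time.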

There are some notable improvements to this.
1. Massively improved lifetimes of live runtime objects
2. The plugins can use a C++ interface
3. Global state does not need to be duplicated for each plugin +
   libomptarget
4. Easier to use, easier to extend with new features, and better error handling
5. Less function call overhead / Improved LTO performance.

Additional changes in this patch are related to contending with the
fact that state is now shared. Initialization and deinitialization are
now handled correctly and in phase with the underlying runtime, allowing
us to actually know when something is getting deallocated.
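
The reordering in `createRegisterFunction` relies on exit handlers running in reverse order of registration, which is how `atexit` behaves. The sketch below models that last-in, first-out rule with a small hypothetical `ExitHandlers` class rather than the real `atexit`, so the order can be observed directly. It assumes the runtime's own teardown is registered while the library is being registered; the handler names are illustrative only.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy model of atexit: handlers run in reverse order of registration.
class ExitHandlers {
public:
  void add(std::function<void()> Handler) {
    Handlers.push_back(std::move(Handler));
  }

  // Run and remove every handler, most recently registered first.
  void runAll() {
    while (!Handlers.empty()) {
      std::function<void()> Handler = std::move(Handlers.back());
      Handlers.pop_back();
      Handler();
    }
  }

private:
  std::vector<std::function<void()>> Handlers;
};

// Returns the order in which the two cleanups run at "exit".
std::vector<std::string> simulateShutdown() {
  ExitHandlers AtExit;
  std::vector<std::string> Log;

  // Step 1: registering the library initializes the runtime, which (by
  // assumption) registers its own teardown at that point.
  AtExit.add([&Log] { Log.push_back("destroy plugin runtime"); });

  // Step 2: the descriptor unregistration is registered afterwards, so it
  // runs first while the runtime is still alive.
  AtExit.add([&Log] { Log.push_back("unregister descriptor"); });

  AtExit.runAll();
  return Log;
}
```

With the old order (unregistration registered first), the two log entries would be swapped and the descriptor would be unregistered after the runtime had already been torn down.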

Depends on:
- https://github.com/llvm/llvm-project/pull/86971
- https://github.com/llvm/llvm-project/pull/86875
- https://github.com/llvm/llvm-project/pull/86868


---

Patch is 71.10 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/87009.diff


28 Files Affected:

- (modified) clang/test/Driver/linker-wrapper-image.c (+1-1) 
- (modified) llvm/lib/Frontend/Offloading/OffloadWrapper.cpp (+4-3) 
- (modified) openmp/libomptarget/CMakeLists.txt (+26) 
- (modified) openmp/libomptarget/include/PluginManager.h (+24-70) 
- (removed) openmp/libomptarget/include/Shared/PluginAPI.h (-232) 
- (removed) openmp/libomptarget/include/Shared/PluginAPI.inc (-51) 
- (added) openmp/libomptarget/include/Shared/Targets.def.in (+20) 
- (modified) openmp/libomptarget/include/device.h (+5-3) 
- (modified) openmp/libomptarget/plugins-nextgen/CMakeLists.txt (+9-19) 
- (modified) openmp/libomptarget/plugins-nextgen/amdgpu/CMakeLists.txt (-5) 
- (modified) openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp (+8-6) 
- (modified) openmp/libomptarget/plugins-nextgen/common/CMakeLists.txt (+2-3) 
- (modified) openmp/libomptarget/plugins-nextgen/common/include/PluginInterface.h (+4-90) 
- (modified) openmp/libomptarget/plugins-nextgen/common/include/Utils/ELF.h (-2) 
- (modified) openmp/libomptarget/plugins-nextgen/common/src/JIT.cpp (+17-23) 
- (modified) openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp (-205) 
- (modified) openmp/libomptarget/plugins-nextgen/cuda/CMakeLists.txt (-5) 
- (modified) openmp/libomptarget/plugins-nextgen/cuda/src/rtl.cpp (+8-6) 
- (modified) openmp/libomptarget/plugins-nextgen/host/CMakeLists.txt (+16-24) 
- (modified) openmp/libomptarget/plugins-nextgen/host/src/rtl.cpp (+8-6) 
- (modified) openmp/libomptarget/src/CMakeLists.txt (+8-18) 
- (modified) openmp/libomptarget/src/OffloadRTL.cpp (+1) 
- (modified) openmp/libomptarget/src/OpenMP/InteropAPI.cpp (+2-2) 
- (modified) openmp/libomptarget/src/PluginManager.cpp (+72-113) 
- (modified) openmp/libomptarget/src/device.cpp (+1-2) 
- (modified) openmp/libomptarget/src/interface.cpp (+1-1) 
- (modified) openmp/libomptarget/tools/kernelreplay/llvm-omp-kernel-replay.cpp (-2) 
- (modified) openmp/libomptarget/unittests/Plugins/NextgenPluginsTest.cpp (-1) 


``````````diff
diff --git a/clang/test/Driver/linker-wrapper-image.c b/clang/test/Driver/linker-wrapper-image.c
index d01445e3aed04e..5d5d62805e174d 100644
--- a/clang/test/Driver/linker-wrapper-image.c
+++ b/clang/test/Driver/linker-wrapper-image.c
@@ -30,8 +30,8 @@
 
 //      OPENMP: define internal void @.omp_offloading.descriptor_reg() section ".text.startup" {
 // OPENMP-NEXT: entry:
-// OPENMP-NEXT:   %0 = call i32 @atexit(ptr @.omp_offloading.descriptor_unreg)
 // OPENMP-NEXT:   call void @__tgt_register_lib(ptr @.omp_offloading.descriptor)
+// OPENMP-NEXT:   %0 = call i32 @atexit(ptr @.omp_offloading.descriptor_unreg)
 // OPENMP-NEXT:   ret void
 // OPENMP-NEXT: }
 
diff --git a/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp b/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp
index 7241d15ed1c670..8b6f9ea1f4cca3 100644
--- a/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp
+++ b/llvm/lib/Frontend/Offloading/OffloadWrapper.cpp
@@ -232,12 +232,13 @@ void createRegisterFunction(Module &M, GlobalVariable *BinDesc,
   // Construct function body
   IRBuilder<> Builder(BasicBlock::Create(C, "entry", Func));
 
+  Builder.CreateCall(RegFuncC, BinDesc);
+
   // Register the destructors with 'atexit'. This is expected by the CUDA
   // runtime and ensures that we clean up before dynamic objects are destroyed.
-  // This needs to be done before the runtime is called and registers its own.
+  // This needs to be done after plugin initialization to ensure that it is
+  // called before the plugin runtime is destroyed.
   Builder.CreateCall(AtExit, UnregFunc);
-
-  Builder.CreateCall(RegFuncC, BinDesc);
   Builder.CreateRetVoid();
 
   // Add this function to constructors.
diff --git a/openmp/libomptarget/CMakeLists.txt b/openmp/libomptarget/CMakeLists.txt
index 531198fae01699..786f1cbcc4fc53 100644
--- a/openmp/libomptarget/CMakeLists.txt
+++ b/openmp/libomptarget/CMakeLists.txt
@@ -41,6 +41,25 @@ if (NOT LIBOMPTARGET_LLVM_INCLUDE_DIRS)
   message(FATAL_ERROR "Missing definition for LIBOMPTARGET_LLVM_INCLUDE_DIRS")
 endif()
 
+set(LIBOMPTARGET_ALL_PLUGIN_TARGETS amdgpu cuda host)
+set(LIBOMPTARGET_PLUGINS_TO_BUILD "all" CACHE STRING
+    "Semicolon-separated list of plugins to use, or \"all\".")
+
+if(LIBOMPTARGET_PLUGINS_TO_BUILD STREQUAL "all")
+  set(LIBOMPTARGET_PLUGINS_TO_BUILD ${LIBOMPTARGET_ALL_PLUGIN_TARGETS})
+endif()
+
+set(LIBOMPTARGET_ENUM_PLUGIN_TARGETS "")
+foreach(plugin IN LISTS LIBOMPTARGET_PLUGINS_TO_BUILD)
+  set(LIBOMPTARGET_ENUM_PLUGIN_TARGETS
+      "${LIBOMPTARGET_ENUM_PLUGIN_TARGETS}PLUGIN_TARGET(${plugin})\n")
+endforeach()
+string(STRIP ${LIBOMPTARGET_ENUM_PLUGIN_TARGETS} LIBOMPTARGET_ENUM_PLUGIN_TARGETS)
+configure_file(
+  ${CMAKE_CURRENT_SOURCE_DIR}/include/Shared/Targets.def.in
+  ${CMAKE_CURRENT_BINARY_DIR}/include/Shared/Targets.def
+)
+
 include_directories(${LIBOMPTARGET_LLVM_INCLUDE_DIRS})
 
 # This is a list of all the targets that are supported/tested right now.
@@ -126,6 +145,7 @@ set(LIBOMPTARGET_GPU_LIBC_SUPPORT ${LLVM_LIBC_GPU_BUILD} CACHE BOOL
 pythonize_bool(LIBOMPTARGET_GPU_LIBC_SUPPORT)
 
 set(LIBOMPTARGET_INCLUDE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/include)
+set(LIBOMPTARGET_BINARY_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/include)
 message(STATUS "OpenMP tools dir in libomptarget: ${LIBOMP_OMP_TOOLS_INCLUDE_DIR}")
 include_directories(${LIBOMP_OMP_TOOLS_INCLUDE_DIR})
 
@@ -144,6 +164,12 @@ add_subdirectory(plugins-nextgen)
 add_subdirectory(DeviceRTL)
 add_subdirectory(tools)
 
+macro(check_plugin_target target)
+if (TARGET omptarget.rtl.${target})
+	list(APPEND LIBOMPTARGET_PLUGINS_TO_LOAD ${target})
+endif()
+endmacro()
+
 # Build target agnostic offloading library.
 add_subdirectory(src)
 
diff --git a/openmp/libomptarget/include/PluginManager.h b/openmp/libomptarget/include/PluginManager.h
index 77684285ddf15e..e49a4b24cab57b 100644
--- a/openmp/libomptarget/include/PluginManager.h
+++ b/openmp/libomptarget/include/PluginManager.h
@@ -16,7 +16,6 @@
 #include "DeviceImage.h"
 #include "ExclusiveAccess.h"
 #include "Shared/APITypes.h"
-#include "Shared/PluginAPI.h"
 #include "Shared/Requirements.h"
 
 #include "device.h"
@@ -34,66 +33,10 @@
 #include <mutex>
 #include <string>
 
-struct PluginManager;
-
-/// Plugin adaptors should be created via `PluginAdaptorTy::create` which will
-/// invoke the constructor and call `PluginAdaptorTy::init`. Eventual errors are
-/// reported back to the caller, otherwise a valid and initialized adaptor is
-/// returned.
-struct PluginAdaptorTy {
-  /// Try to create a plugin adaptor from a filename.
-  static llvm::Expected<std::unique_ptr<PluginAdaptorTy>>
-  create(const std::string &Name);
-
-  /// Initialize as many devices as possible for this plugin adaptor. Devices
-  /// that fail to initialize are ignored.
-  void initDevices(PluginManager &PM);
-
-  bool isUsed() const { return DeviceOffset >= 0; }
-
-  /// Return the number of devices visible to the underlying plugin.
-  int32_t getNumberOfPluginDevices() const { return NumberOfPluginDevices; }
-
-  /// Return the number of devices successfully initialized and visible to the
-  /// user.
-  int32_t getNumberOfUserDevices() const { return NumberOfUserDevices; }
-
-  /// RTL index, index is the number of devices of other RTLs that were
-  /// registered before, i.e. the OpenMP index of the first device to be
-  /// registered with this RTL.
-  int32_t DeviceOffset = -1;
-
-  /// Name of the shared object file representing the plugin.
-  std::string Name;
-
-  /// Access to the shared object file representing the plugin.
-  std::unique_ptr<llvm::sys::DynamicLibrary> LibraryHandler;
-
-#define PLUGIN_API_HANDLE(NAME)                                                \
-  using NAME##_ty = decltype(__tgt_rtl_##NAME);                                \
-  NAME##_ty *NAME = nullptr;
-
-#include "Shared/PluginAPI.inc"
-#undef PLUGIN_API_HANDLE
-
-  llvm::DenseSet<const __tgt_device_image *> UsedImages;
+#include "PluginInterface.h"
+using GenericPluginTy = llvm::omp::target::plugin::GenericPluginTy;
 
-private:
-  /// Number of devices the underling plugins sees.
-  int32_t NumberOfPluginDevices = -1;
-
-  /// Number of devices exposed to the user. This can be less than the number of
-  /// devices for the plugin if some failed to initialize.
-  int32_t NumberOfUserDevices = 0;
-
-  /// Create a plugin adaptor for filename \p Name with a dynamic library \p DL.
-  PluginAdaptorTy(const std::string &Name,
-                  std::unique_ptr<llvm::sys::DynamicLibrary> DL);
-
-  /// Initialize the plugin adaptor, this can fail in which case the adaptor is
-  /// useless.
-  llvm::Error init();
-};
+struct PluginManager;
 
 /// Struct for the data required to handle plugins
 struct PluginManager {
@@ -108,6 +51,8 @@ struct PluginManager {
 
   void init();
 
+  void deinit();
+
   // Register a shared library with all (compatible) RTLs.
   void registerLib(__tgt_bin_desc *Desc);
 
@@ -120,6 +65,11 @@ struct PluginManager {
         std::make_unique<DeviceImageTy>(TgtBinDesc, TgtDeviceImage));
   }
 
+  /// Initialize as many devices as possible for this plugin. Devices  that fail
+  /// to initialize are ignored. Returns the offset the devices were registered
+  /// at.
+  void initDevices(GenericPluginTy &RTL);
+
   /// Return the device presented to the user as device \p DeviceNo if it is
   /// initialized and ready. Otherwise return an error explaining the problem.
   llvm::Expected<DeviceTy &> getDevice(uint32_t DeviceNo);
@@ -169,18 +119,13 @@ struct PluginManager {
     return Devices.getExclusiveAccessor();
   }
 
-  int getNumUsedPlugins() const {
-    int NCI = 0;
-    for (auto &P : PluginAdaptors)
-      NCI += P->isUsed();
-    return NCI;
-  }
+  int getNumUsedPlugins() const { return DeviceOffsets.size(); }
 
   // Initialize all plugins.
   void initAllPlugins();
 
-  /// Iterator range for all plugin adaptors (in use or not, but always valid).
-  auto pluginAdaptors() { return llvm::make_pointee_range(PluginAdaptors); }
+  /// Iterator range for all plugins (in use or not, but always valid).
+  auto plugins() { return llvm::make_pointee_range(Plugins); }
 
   /// Return the user provided requirements.
   int64_t getRequirements() const { return Requirements.getRequirements(); }
@@ -192,8 +137,17 @@ struct PluginManager {
   bool RTLsLoaded = false;
   llvm::SmallVector<__tgt_bin_desc *> DelayedBinDesc;
 
-  // List of all plugin adaptors, in use or not.
-  llvm::SmallVector<std::unique_ptr<PluginAdaptorTy>> PluginAdaptors;
+  // List of all plugins, in use or not.
+  llvm::SmallVector<std::unique_ptr<GenericPluginTy>> Plugins;
+
+  // Mapping of plugins to offsets in the device table.
+  llvm::DenseMap<const GenericPluginTy *, int32_t> DeviceOffsets;
+
+  // Mapping of plugins to the number of used devices.
+  llvm::DenseMap<const GenericPluginTy *, int32_t> DeviceUsed;
+
+  // Set of all device images currently in use.
+  llvm::DenseSet<const __tgt_device_image *> UsedImages;
 
   /// Executable images and information extracted from the input images passed
   /// to the runtime.
diff --git a/openmp/libomptarget/include/Shared/PluginAPI.h b/openmp/libomptarget/include/Shared/PluginAPI.h
deleted file mode 100644
index ecf669c774f142..00000000000000
--- a/openmp/libomptarget/include/Shared/PluginAPI.h
+++ /dev/null
@@ -1,232 +0,0 @@
-//===-- Shared/PluginAPI.h - Target independent plugin API ------*- C++ -*-===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-//
-// This file defines an interface between target independent OpenMP offload
-// runtime library libomptarget and target dependent plugin.
-//
-//===----------------------------------------------------------------------===//
-
-#ifndef OMPTARGET_SHARED_PLUGIN_API_H
-#define OMPTARGET_SHARED_PLUGIN_API_H
-
-#include <cstddef>
-#include <cstdint>
-
-#include "Shared/APITypes.h"
-
-extern "C" {
-
-// First method called on the plugin
-int32_t __tgt_rtl_init_plugin();
-
-// Return the number of available devices of the type supported by the
-// target RTL.
-int32_t __tgt_rtl_number_of_devices(void);
-
-// Return an integer different from zero if the provided device image can be
-// supported by the runtime. The functionality is similar to comparing the
-// result of __tgt__rtl__load__binary to NULL. However, this is meant to be a
-// lightweight query to determine if the RTL is suitable for an image without
-// having to load the library, which can be expensive.
-int32_t __tgt_rtl_is_valid_binary(__tgt_device_image *Image);
-
-// Return an integer other than zero if the data can be exchaned from SrcDevId
-// to DstDevId. If it is data exchangable, the device plugin should provide
-// function to move data from source device to destination device directly.
-int32_t __tgt_rtl_is_data_exchangable(int32_t SrcDevId, int32_t DstDevId);
-
-// Initialize the requires flags for the device.
-int64_t __tgt_rtl_init_requires(int64_t RequiresFlags);
-
-// Initialize the specified device. In case of success return 0; otherwise
-// return an error code.
-int32_t __tgt_rtl_init_device(int32_t ID);
-
-// Pass an executable image section described by image to the specified
-// device and prepare an address table of target entities. In case of error,
-// return NULL. Otherwise, return a pointer to the built address table.
-// Individual entries in the table may also be NULL, when the corresponding
-// offload region is not supported on the target device.
-int32_t __tgt_rtl_load_binary(int32_t ID, __tgt_device_image *Image,
-                              __tgt_device_binary *Binary);
-
-// Look up the device address of the named symbol in the given binary. Returns
-// non-zero on failure.
-int32_t __tgt_rtl_get_global(__tgt_device_binary Binary, uint64_t Size,
-                             const char *Name, void **DevicePtr);
-
-// Look up the device address of the named kernel in the given binary. Returns
-// non-zero on failure.
-int32_t __tgt_rtl_get_function(__tgt_device_binary Binary, const char *Name,
-                               void **DevicePtr);
-
-// Allocate data on the particular target device, of the specified size.
-// HostPtr is a address of the host data the allocated target data
-// will be associated with (HostPtr may be NULL if it is not known at
-// allocation time, like for example it would be for target data that
-// is allocated by omp_target_alloc() API). Return address of the
-// allocated data on the target that will be used by libomptarget.so to
-// initialize the target data mapping structures. These addresses are
-// used to generate a table of target variables to pass to
-// __tgt_rtl_run_region(). The __tgt_rtl_data_alloc() returns NULL in
-// case an error occurred on the target device. Kind dictates what allocator
-// to use (e.g. shared, host, device).
-void *__tgt_rtl_data_alloc(int32_t ID, int64_t Size, void *HostPtr,
-                           int32_t Kind);
-
-// Pass the data content to the target device using the target address. In case
-// of success, return zero. Otherwise, return an error code.
-int32_t __tgt_rtl_data_submit(int32_t ID, void *TargetPtr, void *HostPtr,
-                              int64_t Size);
-
-int32_t __tgt_rtl_data_submit_async(int32_t ID, void *TargetPtr, void *HostPtr,
-                                    int64_t Size, __tgt_async_info *AsyncInfo);
-
-// Retrieve the data content from the target device using its address. In case
-// of success, return zero. Otherwise, return an error code.
-int32_t __tgt_rtl_data_retrieve(int32_t ID, void *HostPtr, void *TargetPtr,
-                                int64_t Size);
-
-// Asynchronous version of __tgt_rtl_data_retrieve
-int32_t __tgt_rtl_data_retrieve_async(int32_t ID, void *HostPtr,
-                                      void *TargetPtr, int64_t Size,
-                                      __tgt_async_info *AsyncInfo);
-
-// Copy the data content from one target device to another target device using
-// its address. This operation does not need to copy data back to host and then
-// from host to another device. In case of success, return zero. Otherwise,
-// return an error code.
-int32_t __tgt_rtl_data_exchange(int32_t SrcID, void *SrcPtr, int32_t DstID,
-                                void *DstPtr, int64_t Size);
-
-// Asynchronous version of __tgt_rtl_data_exchange
-int32_t __tgt_rtl_data_exchange_async(int32_t SrcID, void *SrcPtr,
-                                      int32_t DesID, void *DstPtr, int64_t Size,
-                                      __tgt_async_info *AsyncInfo);
-
-// De-allocate the data referenced by target ptr on the device. In case of
-// success, return zero. Otherwise, return an error code. Kind dictates what
-// allocator to use (e.g. shared, host, device).
-int32_t __tgt_rtl_data_delete(int32_t ID, void *TargetPtr, int32_t Kind);
-
-// Transfer control to the offloaded entry Entry on the target device.
-// Args and Offsets are arrays of NumArgs size of target addresses and
-// offsets. An offset should be added to the target address before passing it
-// to the outlined function on device side. If AsyncInfo is nullptr, it is
-// synchronous; otherwise it is asynchronous. However, AsyncInfo may be
-// ignored on some platforms, like x86_64. In that case, it is synchronous. In
-// case of success, return zero. Otherwise, return an error code.
-int32_t __tgt_rtl_run_target_region(int32_t ID, void *Entry, void **Args,
-                                    ptrdiff_t *Offsets, int32_t NumArgs);
-
-// Asynchronous version of __tgt_rtl_run_target_region
-int32_t __tgt_rtl_run_target_region_async(int32_t ID, void *Entry, void **Args,
-                                          ptrdiff_t *Offsets, int32_t NumArgs,
-                                          __tgt_async_info *AsyncInfo);
-
-// Similar to __tgt_rtl_run_target_region, but additionally specify the
-// number of teams to be created and a number of threads in each team. If
-// AsyncInfo is nullptr, it is synchronous; otherwise it is asynchronous.
-// However, AsyncInfo may be ignored on some platforms, like x86_64. In that
-// case, it is synchronous.
-int32_t __tgt_rtl_run_target_team_region(int32_t ID, void *Entry, void **Args,
-                                         ptrdiff_t *Offsets, int32_t NumArgs,
-                                         int32_t NumTeams, int32_t ThreadLimit,
-                                         uint64_t LoopTripcount);
-
-// Asynchronous version of __tgt_rtl_run_target_team_region
-int32_t __tgt_rtl_run_target_team_region_async(
-    int32_t ID, void *Entry, void **Args, ptrdiff_t *Offsets, int32_t NumArgs,
-    int32_t NumTeams, int32_t ThreadLimit, uint64_t LoopTripcount,
-    __tgt_async_info *AsyncInfo);
-
-// Device synchronization. In case of success, return zero. Otherwise, return an
-// error code.
-int32_t __tgt_rtl_synchronize(int32_t ID, __tgt_async_info *AsyncInfo);
-
-// Queries for the completion of asynchronous operations. Instead of blocking
-// the calling thread as __tgt_rtl_synchronize, the progress of the operations
-// stored in AsyncInfo->Queue is queried in a non-blocking manner, partially
-// advancing their execution. If all operations are completed, AsyncInfo->Queue
-// is set to nullptr. If there are still pending operations, AsyncInfo->Queue is
-// kept as a valid queue. In any case of success (i.e., successful query
-// with/without completing all operations), return zero. Otherwise, return an
-// error code.
-int32_t __tgt_rtl_query_async(int32_t ID, __tgt_async_info *AsyncInfo);
-
-// Set plugin's internal information flag externally.
-void __tgt_rtl_set_info_flag(uint32_t);
-
-// Print the device information
-void __tgt_rtl_print_device_info(int32_t ID);
-
-// Event related interfaces. It is expected to use the interfaces in the
-// following way:
-// 1) Create an event on the target device (__tgt_rtl_create_event).
-// 2) Record the event based on the status of \p AsyncInfo->Queue at the moment
-// of function call to __tgt_rtl_record_event. An event becomes "meaningful"
-// once it is recorded, such that others can depend on it.
-// 3) Call __tgt_rtl_wait_event to set dependence on the event. Whether the
-// operation is blocking or non-blocking depends on the target. It is expected
-// to be non-blocking, just set dependence and return.
-// 4) Call __tgt_rtl_sync_event to sync the event. It is expected to block the
-// thread calling the function.
-// 5) Destroy the event (__tgt_rtl_destroy_event).
-// {
-int32_t __tgt_rtl_create_event(int32_t ID, void **Event);
-
-int32_t __tgt_rtl_record_event(int32_t ID, void *Event,
-                               __tgt_async_info *AsyncInfo);
-
-int32_t __tgt_rtl_wait_event(int32_t ID, void *Event,
-                             __tgt_async_info *AsyncInfo);
-
-int32_t __tgt_rtl_sync_event(int32_t ID, void *Event);
-
-int32_t __tgt_rtl_destroy_event(int32_t ID, void *Event);
-// }
-
-int32_t __tgt_rtl_init_async_info(int32_t ID, __tgt_async_info **AsyncInfoPtr);
-int32_t __tgt_rtl_init_device_info(int32_t ID, __tgt_device_info *DeviceInfoPtr,
-                                   const char **ErrStr);
-
-// lock/pin host memory
-int32_t __tgt_rtl_data_lock(int32_t ID, void *HstPtr, int64_t Size,
-                            void **LockedPtr);
-
-// unlock/unpin host memory
-int32_t __tgt_rtl_data_unlock(int32_t ID, void *HstPtr);
-
-// Notify the plugin about a new mapping starting at the host address \p HstPtr
-// and \p Size bytes. The plugin may lock/pin that buffer to achieve optimal
-// memory transfers involving that buffer.
-int32_t __tgt_rtl_data_notify_mapped(int32_t ID, void *HstPtr, int64_t Size);
-
-// Notify the plugin about an existing mapping being unmapped, starting at the
-// host address \p HstPtr and \p Size bytes.
-int32_t __tgt_rtl_data_notify_unmapped(int32_t ID, void *HstPtr);
-
-// Set the global device identifier offset, such that the plugin may determine a
-// unique device number.
-int32_t __tgt_rtl_set_device_offset(int32_t DeviceIdOffset);
-
-int32_t __tgt_rtl_launch_kernel(int32_t DeviceId, void *TgtEntryPtr,
-                                void **TgtArgs, ptrdiff_t *TgtOffsets,
-                                KernelArgsTy *KernelArgs,
-                            ...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/87009

