[Openmp-commits] [PATCH] D134396: [OpenMP][libomptarget] New plugin infrastructure and new CUDA plugin

Kevin Sala Penad├ęs via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Thu Sep 22 11:59:24 PDT 2022

kevinsala added a comment.

Thanks @jhuber6 @tianshilei1992 for your comments! I'll be addressing those issues. I added inline replies to some specific comments.

Comment at: llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h:59
   /// The size reserved for data in a shared memory slot.
-  const unsigned GV_Slot_Size;
+  unsigned GV_Slot_Size;
   /// The default value of maximum number of threads in a worker warp.
jhuber6 wrote:
> Is removing `const` really necessary here?
We are changing some values at run-time (e.g., the maximum number of teams) on each device. But since that modification is at device initialization, I believe we could construct a new GridValue with the definitive value instead of modifying the fields. I'll revert those changes.

Comment at: openmp/libomptarget/plugins-nextgen/common/PluginInterface/GlobalHandler.h:105
+  /// Map to store the ELF object files that have been loaded
+  std::unordered_map<int32_t, ELF64LEObjectFile> ELFObjectFiles;
+  std::mutex ELFObjectFilesMutex;
jhuber6 wrote:
Does `llvm::DenseMap` keep the references to the previously inserted elements valid when someone else is inserting a new element? I understood it invalidates them, but maybe I'm wrong. In case they are invalidated, we couldn't switch to `llvm::DenseMap` directly. I implemented this section thinking on multiple threads checking ELF object files, where some threads may be consulting an already inserted ELFObjectFile (without holding the map lock) while another potential thread may be inserting a new one (with the lock acquired).

Comment at: openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h:112
+    __tgt_target_table TTTablePtr;
+    std::map<void *, unsigned> EntryMap;
+    std::vector<__tgt_offload_entry> Entries;
jhuber6 wrote:
> Do we need these sorted? `std::map` is very slow compared to `std::unordered_map` or `llvm::DenseMap`. prefer the LLVM data structures whenever possible as they are faster.
Actually, I think we can remove this map entirely and `OffloadEntryTableTy::getEntry()` function. It is not used anywhere.

Comment at: openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h:615-616
+  /// Get a string description from a status code.
+  static const char *getErrorStr(const StatusCode &SC);
jhuber6 wrote:
> If we use LLVM's errors this is just `toString`
Yes, using LLVM's errors seems a good idea and should simplify many points where we are reporting errors. I'm going to integrate it too.

Comment at: openmp/libomptarget/plugins-nextgen/cuda/src/rtl.cpp:1002
+GenericPluginTy &Plugin::get() {
+  static CUDAPluginTy CUDAPlugin;
+  assert(Plugin::isActive() && "Plugin is not active");
tianshilei1992 wrote:
> I'm not sure this is a good idea in terms of lifetime, but we can come back to this later once Joseph removes the "edge" between `libomptarget` and plugins when destructing `libomptarget`.
Sorry, I should have added a comment related to this issue. Initially, my plan was to use `__tgt_rtl_deinit_plugin` for deinitializing the plugin and the CUDA resources. The problem was that this plugin API function is called from a libomptarget destructor, and at that time, the plugin library and the CUDA driver are already shut down, so it was not possible (as far as I know) to perform a clean deinitialization using that mechanism.

Is there the possibility to move the `__tgt_rtl_deinit_plugin` call to a specific point before the destructors phase? Somewhere where we know the main function has finalized but the program is not exiting yet.

  rG LLVM Github Monorepo



More information about the Openmp-commits mailing list