[Openmp-commits] [PATCH] D94256: [OpenMP] Fixed the issue that memory manager calls plugin interface to release buffered memory even after the plugin has been destroyed

Vyacheslav Zakharin via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Thu Jan 7 19:24:27 PST 2021


vzakhari added inline comments.


================
Comment at: openmp/libomptarget/src/rtl.cpp:444-445
 
   // TODO: Remove RTL and the devices it manages if it's not used anymore?
   // TODO: Write some RTL->unload_image(...) function?
 
----------------
tianshilei1992 wrote:
> vzakhari wrote:
> > tianshilei1992 wrote:
> > > vzakhari wrote:
> > > > tianshilei1992 wrote:
> > > > > vzakhari wrote:
> > > > > > Can we just implement this instead?
> > > > > At this point, the plugin has already been destroyed.
> > > > How is that possible, since we never call dlclose() for the plugins?
> > > > Is the problem specific to some particular execution conditions?
> > > > 
> > > > Should we just "deinitialize" the memory manager here instead of postponing the "deinitialization" to MemoryManagerTy desctructor, which will obviously be executed at the wrong time.
> > > That's most tough part of this question. `libomptarget` has no idea when its opened plugin will be destroyed. Actually, you can see there is no `dlclose` in current code. What we can know is, plugin will be destroyed before `libomptarget`. You can easily verify it by adding `DP(...)` to the destructor of the plugin.
> > This sounds really strange at least for Linux, since:
> > > The dynamic linker maintains reference counts for object handles, so a dynamically loaded shared object is not deallocated until dlclose() has been called on it as many times as dlopen() has succeeded on it. 
> > 
> > Do you have a reproducer for this kind of an issue?  I must be missing something here...
> > 
> > 
> You can use `openmp/libomptarget/test/offloading/memory_manager.cpp` and turn debug print on via setting env `LIBOMPTARGET_DEBUG` to a large number like `10`. Then you can see a lot of logs.
> ```
> ...
> Libomptarget --> omp_target_free deallocated device ptr
> PASS
> (1) Target CUDA RTL --> CUDA plugin destroyed
> (2) Libomptarget --> __tgt_unregister_lib
> Libomptarget --> Unloading target library!
> Libomptarget --> Image 0x0000000000403870 is compatible with RTL 0x0000000000e4a8b0!
> Libomptarget --> Unregistered image 0x0000000000403870 from RTL 0x0000000000e4a8b0!
> Libomptarget --> Done unregistering images!
> Libomptarget --> Removing translation table for descriptor 0x00000000004abb60
> Libomptarget --> Done unregistering library!
> Libomptarget --> Deinit target library!
> Target CUDA RTL --> Error returned from cuCtxSetCurrent
> Target CUDA RTL --> Unresolved CUDA error code: 4
> Target CUDA RTL --> Unsuccessful cuGetErrorString return status: 4
> Target CUDA RTL --> Error returned from cuCtxSetCurrent
> ...
> ```
> Line (1) and (2) were printed because I added `DP` to the destructor of CUDA plugin and function `__tgt_unregister_lib`. The last tens of lines are all error messages because CUDA plugin was destroyed before `libomptarget`.
Ah, I see.  The global destructor for `DeviceRTL` is called before `__tgt_unregister_lib` is called.  The plugin (the shared object) is not actually deallocated yet.  Thank you for the explanation.

I am not sure libomptarget has to be involved here, since this may be done exclusively at the plugin level, i.e. `DeviceRTL` may be dynamically allocated/deallocated using global init/fini functions, and then all plugin APIs may return early, if they are invoked after `DeviceRTL` has been deallocated.

Just to be clear, I am not against this change-set :)

Anyway, libomptarget has to be redesigned a bit because the plugins loading initiated in a global constructor (which calls `RTLsTy::RegisterLib`) is violating Microsoft conventions for DLL initialization (e.g. you cannot create a DLL with offload and then `LoadLibrary` this DLL, because this will result in `LoadLibrary`->...->`LoadLibrary` call stack, which is not allowed).  We should probably load the plugins on the first OpenMP construct/API encountered during the execution.  I assume we will have to define clear entry/exit points for plugins then, and this will solve the issue that you are addressing here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94256/new/

https://reviews.llvm.org/D94256



More information about the Openmp-commits mailing list