[Openmp-commits] [PATCH] D94256: [OpenMP] Fixed the issue that memory manager calls plugin interface to release buffered memory even after the plugin has been destroyed
Vyacheslav Zakharin via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Thu Jan 7 19:24:27 PST 2021
vzakhari added inline comments.
================
Comment at: openmp/libomptarget/src/rtl.cpp:444-445
// TODO: Remove RTL and the devices it manages if it's not used anymore?
// TODO: Write some RTL->unload_image(...) function?
----------------
tianshilei1992 wrote:
> vzakhari wrote:
> > tianshilei1992 wrote:
> > > vzakhari wrote:
> > > > tianshilei1992 wrote:
> > > > > vzakhari wrote:
> > > > > > Can we just implement this instead?
> > > > > At this point, the plugin has already been destroyed.
> > > > How is that possible, since we never call dlclose() for the plugins?
> > > > Is the problem specific to some particular execution conditions?
> > > >
> > > > Should we just "deinitialize" the memory manager here instead of postponing the "deinitialization" to MemoryManagerTy desctructor, which will obviously be executed at the wrong time.
> > > That's most tough part of this question. `libomptarget` has no idea when its opened plugin will be destroyed. Actually, you can see there is no `dlclose` in current code. What we can know is, plugin will be destroyed before `libomptarget`. You can easily verify it by adding `DP(...)` to the destructor of the plugin.
> > This sounds really strange at least for Linux, since:
> > > The dynamic linker maintains reference counts for object handles, so a dynamically loaded shared object is not deallocated until dlclose() has been called on it as many times as dlopen() has succeeded on it.
> >
> > Do you have a reproducer for this kind of an issue? I must be missing something here...
> >
> >
> You can use `openmp/libomptarget/test/offloading/memory_manager.cpp` and turn debug print on via setting env `LIBOMPTARGET_DEBUG` to a large number like `10`. Then you can see a lot of logs.
> ```
> ...
> Libomptarget --> omp_target_free deallocated device ptr
> PASS
> (1) Target CUDA RTL --> CUDA plugin destroyed
> (2) Libomptarget --> __tgt_unregister_lib
> Libomptarget --> Unloading target library!
> Libomptarget --> Image 0x0000000000403870 is compatible with RTL 0x0000000000e4a8b0!
> Libomptarget --> Unregistered image 0x0000000000403870 from RTL 0x0000000000e4a8b0!
> Libomptarget --> Done unregistering images!
> Libomptarget --> Removing translation table for descriptor 0x00000000004abb60
> Libomptarget --> Done unregistering library!
> Libomptarget --> Deinit target library!
> Target CUDA RTL --> Error returned from cuCtxSetCurrent
> Target CUDA RTL --> Unresolved CUDA error code: 4
> Target CUDA RTL --> Unsuccessful cuGetErrorString return status: 4
> Target CUDA RTL --> Error returned from cuCtxSetCurrent
> ...
> ```
> Line (1) and (2) were printed because I added `DP` to the destructor of CUDA plugin and function `__tgt_unregister_lib`. The last tens of lines are all error messages because CUDA plugin was destroyed before `libomptarget`.
Ah, I see. The global destructor for `DeviceRTL` is called before `__tgt_unregister_lib` is called. The plugin (the shared object) is not actually deallocated yet. Thank you for the explanation.
I am not sure libomptarget has to be involved here, since this may be done exclusively at the plugin level, i.e. `DeviceRTL` may be dynamically allocated/deallocated using global init/fini functions, and then all plugin APIs may return early, if they are invoked after `DeviceRTL` has been deallocated.
Just to be clear, I am not against this change-set :)
Anyway, libomptarget has to be redesigned a bit because the plugins loading initiated in a global constructor (which calls `RTLsTy::RegisterLib`) is violating Microsoft conventions for DLL initialization (e.g. you cannot create a DLL with offload and then `LoadLibrary` this DLL, because this will result in `LoadLibrary`->...->`LoadLibrary` call stack, which is not allowed). We should probably load the plugins on the first OpenMP construct/API encountered during the execution. I assume we will have to define clear entry/exit points for plugins then, and this will solve the issue that you are addressing here.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D94256/new/
https://reviews.llvm.org/D94256
More information about the Openmp-commits
mailing list