[Openmp-dev] Declare target functions and libomptarget image registration order
Johannes Doerfert via Openmp-dev
openmp-dev at lists.llvm.org
Mon Oct 26 15:58:37 PDT 2020
This looks like a problem.
To make sure I understand this correctly: do you explicitly call any target
library functions? If so, which ones, and in which order?
@Ravi, @Deepak, @Kelvin, please take a look as well
On 10/26/20 3:23 PM, Manoel Römmer wrote:
> Hi Johannes,
>
> so the debug output is:
>
> Libomptarget --> Loading RTLs...
> Libomptarget --> Loading library 'libomptarget.rtl.ve.so'...
> Target ve RTL --> Found 8 VE devices
> Libomptarget --> Successfully loaded library 'libomptarget.rtl.ve.so'!
> Libomptarget --> Registering RTL libomptarget.rtl.ve.so supporting 8
> devices!
> Libomptarget --> Loading library 'libomptarget.rtl.ppc64.so'...
> Libomptarget --> Unable to load library 'libomptarget.rtl.ppc64.so':
> libomptarget.rtl.ppc64.so: cannot open shared object file: No such
> file or directory!
> Libomptarget --> Loading library 'libomptarget.rtl.x86_64.so'...
> Libomptarget --> Successfully loaded library
> 'libomptarget.rtl.x86_64.so'!
> Libomptarget --> Registering RTL libomptarget.rtl.x86_64.so supporting
> 4 devices!
> Libomptarget --> Loading library 'libomptarget.rtl.cuda.so'...
> Libomptarget --> Unable to load library 'libomptarget.rtl.cuda.so':
> libomptarget.rtl.cuda.so: cannot open shared object file: No such file
> or directory!
> Libomptarget --> Loading library 'libomptarget.rtl.aarch64.so'...
> Libomptarget --> Unable to load library 'libomptarget.rtl.aarch64.so':
> libomptarget.rtl.aarch64.so: cannot open shared object file: No such
> file or directory!
> Libomptarget --> RTLs loaded!
> Libomptarget --> Image 0x00002b57e6062a20 is compatible with RTL
> libomptarget.rtl.ve.so!
> Libomptarget --> RTL 0x0000000000617580 has index 0!
> Libomptarget --> Registering image 0x00002b57e6062a20 with RTL
> libomptarget.rtl.ve.so!
> Libomptarget --> Done registering entries!
> Libomptarget --> New requires flags 1 compatible with existing 1!
> Libomptarget --> Image is compatible with RTL libomptarget.rtl.ve.so!
> Libomptarget --> Registering image 0x0000000000400bc0 with RTL
> libomptarget.rtl.ve.so!
> Libomptarget --> Done registering entries!
> Libomptarget --> Call to omp_get_num_devices returning 8
> Libomptarget --> Default TARGET OFFLOAD policy is now mandatory
> (devices were found)
> Libomptarget --> Entering target region with entry point
> 0x0000000000400b40 and device Id -1
> Libomptarget --> Checking whether device 0 is ready.
> Libomptarget --> Is the device 0 (local ID 0) initialized? 0
> Target ve RTL --> Available VEO version: 9
> Libomptarget --> Device 0 is ready to use.
> Target ve RTL --> Dev 0: load binary from 0x0000000000400bc0 image
> Target ve RTL --> Expecting to have 1 entries defined.
> Target ve RTL --> Wrote target image to /tmp/tmpfile_XwAVwB.
> ImageSize=5648
> Target ve RTL --> ELF Type: 3
> Target ve RTL --> Aurora device successfully initialized with loaded
> binary: proc_handle=0x621d10, ctx=0x623720
> [VE] ERROR: loadlib_handler() dlerror: /tmp/tmpfile_XwAVwB: undefined
> symbol: target_func_in_lib
> Target ve RTL --> veo_load_library() failed: LibHandle=0
> Name=/tmp/tmpfile_XwAVwB. Set env VEORUN_BIN for static linked target
> code.
> Libomptarget --> Unable to generate entries table for device id 0.
> Libomptarget --> Failed to init globals on device 0
> Libomptarget --> Failed to get device 0 ready
> Libomptarget fatal error 1: failure of target construct while
> offloading is mandatory
> Libomptarget --> Unloading target library!
> Libomptarget --> Image 0x0000000000400bc0 is compatible with RTL
> 0x0000000000617580!
> Libomptarget --> Unregistered image 0x0000000000400bc0 from RTL
> 0x0000000000617580!
> Libomptarget --> Done unregistering images!
> Libomptarget --> Removing translation table for descriptor
> 0x0000000000402210
> Libomptarget --> Done unregistering library!
> Libomptarget --> Unloading target library!
> Libomptarget --> Image 0x00002b57e6062a20 is compatible with RTL
> 0x0000000000617580!
> Libomptarget --> Unregistered image 0x00002b57e6062a20 from RTL
> 0x0000000000617580!
> Libomptarget --> Done unregistering images!
> Libomptarget --> Removing translation table for descriptor
> 0x00002b57e6064400
> Libomptarget --> Done unregistering library!
> Libomptarget --> Deinit target library!
>
>
> As you can see, libomptarget finds two images 0x00002b57e6062a20
> (the shared library) and 0x0000000000400bc0 (the main program) in the
> correct order (the main program requires a symbol in the library so
> the library gets loaded first):
>
> > Libomptarget --> Registering image 0x00002b57e6062a20 with RTL
> libomptarget.rtl.ve.so!
> > ...
> > Libomptarget --> Registering image 0x0000000000400bc0 with RTL
> libomptarget.rtl.ve.so!
>
>
> But then libomptarget calls __tgt_rtl_load_binary() with the image
> 0x0000000000400bc0 (the main program) without first calling
> __tgt_rtl_load_binary() with the image for the library:
>
> > Target ve RTL --> Dev 0: load binary from 0x0000000000400bc0 image
>
> This then leads to our plugin being unable to load the
> image due to unresolved symbols.
>
>
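The mismatch between registration order and load order seen in the log can be illustrated with a small model. This is only a sketch of the hypothesis raised in this thread (that images are visited by address rather than by registration order), not libomptarget's actual code: `image_t`, `registered_images`, `compare_by_address`, and `load_order_by_address` are hypothetical names introduced for illustration. The two addresses are the ones printed in the debug output above.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a registered device image.
 * This is NOT a libomptarget type; it exists only for this illustration. */
typedef struct {
    uint64_t addr;     /* image start address, as printed in the debug log */
    const char *name;  /* label used only in this sketch */
} image_t;

/* The two images from the log, listed in the order they were registered. */
const image_t registered_images[2] = {
    { 0x00002b57e6062a20ULL, "shared library" },
    { 0x0000000000400bc0ULL, "main program" },
};

/* Compare two images by start address (ascending). */
static int compare_by_address(const void *a, const void *b) {
    uint64_t x = ((const image_t *)a)->addr;
    uint64_t y = ((const image_t *)b)->addr;
    return (x > y) - (x < y);
}

/* Copy n images into out and sort the copy by address. This models what
 * would happen if images were kept in a container keyed by address and
 * then visited in key order (an assumption, not a confirmed behaviour). */
void load_order_by_address(const image_t *in, image_t *out, size_t n) {
    memcpy(out, in, n * sizeof *in);
    qsort(out, n, sizeof *out, compare_by_address);
}
```

Calling `load_order_by_address(registered_images, out, 2)` puts the main program (0x0000000000400bc0) ahead of the shared library, which matches the order in which `__tgt_rtl_load_binary()` was observed to be called.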
> On 10/21/20 5:18 AM, Johannes Doerfert wrote:
>> Hi Manoel,
>>
>> we briefly discussed it today in our meeting
>> https://docs.google.com/document/d/1Tz8WFN13n7yJ-SCE0Qjqf9LmjGUw0dWO9Ts1ss4YOdg/edit?usp=sharing
>> If we don't solve this over the list, feel free to join next week.
>> In the meantime, I'm unsure I grasp the situation.
>> Could you use a debug enabled libomptarget and use the
>> LIBOMPTARGET_DEBUG environment variable to get the sequence of events?
>>
>> ~ Johannes
>>
>>
>> On 10/20/20 5:48 AM, Römmer, Manoel via Openmp-dev wrote:
>>> Hi,
>>>
>>> We have the following problem: We have a shared library containing a
>>> function which is declared with '#pragma omp declare target', and a
>>> main executable with a target region in which this function is
>>> called. Now the target image in the shared library is registered
>>> with libomptarget (__tgt_register_lib()) before the target image of
>>> the main executable.
>>> However, libomptarget then passes the target image of the main
>>> executable to our RTL plugin (with __tgt_rtl_load_binary()) before
>>> the target image of the shared library.
>>> This is a problem for us because our plugin then tries to load the
>>> main executable's image first and fails due to unresolved symbols.
>>>
>>> So it seems to me that libomptarget calls __tgt_rtl_load_binary()
>>> with images not in the order in which they were registered but in the
>>> order in which they are placed in memory.
>>>
>>>
>>> Is this intended behaviour?
>>>
>>>
>>> Thanks,
>>>
>>> Manoel
>>>
>>>
>
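For reference, the setup described in the original report can be sketched roughly as follows. This is a minimal, assumed reconstruction and not the reporter's actual code: only the symbol name `target_func_in_lib` comes from the error message in the log, while its signature, its body, and the wrapper `run_target_region` are hypothetical. Both parts are shown in one translation unit for brevity; in the reported scenario the first part would be built into a shared library and the second into the main executable, which would then also need a declaration of the function inside a `declare target` region.

```c
/* --- Part 1: would live in the shared library --- */
/* The function is marked for device compilation with 'declare target'.
 * Signature and body are placeholders chosen for this sketch. */
#pragma omp declare target
int target_func_in_lib(int x) {
    return x + 1;
}
#pragma omp end declare target

/* --- Part 2: would live in the main executable --- */
/* A target region that calls the library function. On a device, the
 * image containing this region has an unresolved reference to
 * target_func_in_lib until the library's image has been loaded. */
int run_target_region(int input) {
    int result = 0;
    #pragma omp target map(to: input) map(tofrom: result)
    {
        result = target_func_in_lib(input);
    }
    return result;
}
```

Compiled without OpenMP offloading support, the pragmas are ignored and the code simply runs on the host. The load-order problem discussed in this thread only appears when the two parts are built as separate offload images.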