[Openmp-dev] Declare target functions and libomptarget image registration order

Johannes Doerfert via Openmp-dev openmp-dev at lists.llvm.org
Mon Oct 26 15:58:37 PDT 2020


This looks like a problem.

To make sure I understand this correctly: do you explicitly call any
target library functions? If so, which ones and in what order?


@Ravi, @Deepak, @Kelvin, please take a look as well


On 10/26/20 3:23 PM, Manoel Römmer wrote:
> Hi Johannes,
>
> So the debug output is:
>
> Libomptarget --> Loading RTLs...
> Libomptarget --> Loading library 'libomptarget.rtl.ve.so'...
> Target ve RTL --> Found 8 VE devices
> Libomptarget --> Successfully loaded library 'libomptarget.rtl.ve.so'!
> Libomptarget --> Registering RTL libomptarget.rtl.ve.so supporting 8 
> devices!
> Libomptarget --> Loading library 'libomptarget.rtl.ppc64.so'...
> Libomptarget --> Unable to load library 'libomptarget.rtl.ppc64.so': 
> libomptarget.rtl.ppc64.so: cannot open shared object file: No such 
> file or directory!
> Libomptarget --> Loading library 'libomptarget.rtl.x86_64.so'...
> Libomptarget --> Successfully loaded library 
> 'libomptarget.rtl.x86_64.so'!
> Libomptarget --> Registering RTL libomptarget.rtl.x86_64.so supporting 
> 4 devices!
> Libomptarget --> Loading library 'libomptarget.rtl.cuda.so'...
> Libomptarget --> Unable to load library 'libomptarget.rtl.cuda.so': 
> libomptarget.rtl.cuda.so: cannot open shared object file: No such file 
> or directory!
> Libomptarget --> Loading library 'libomptarget.rtl.aarch64.so'...
> Libomptarget --> Unable to load library 'libomptarget.rtl.aarch64.so': 
> libomptarget.rtl.aarch64.so: cannot open shared object file: No such 
> file or directory!
> Libomptarget --> RTLs loaded!
> Libomptarget --> Image 0x00002b57e6062a20 is compatible with RTL 
> libomptarget.rtl.ve.so!
> Libomptarget --> RTL 0x0000000000617580 has index 0!
> Libomptarget --> Registering image 0x00002b57e6062a20 with RTL 
> libomptarget.rtl.ve.so!
> Libomptarget --> Done registering entries!
> Libomptarget --> New requires flags 1 compatible with existing 1!
> Libomptarget --> Image is compatible with RTL libomptarget.rtl.ve.so!
> Libomptarget --> Registering image 0x0000000000400bc0 with RTL 
> libomptarget.rtl.ve.so!
> Libomptarget --> Done registering entries!
> Libomptarget --> Call to omp_get_num_devices returning 8
> Libomptarget --> Default TARGET OFFLOAD policy is now mandatory 
> (devices were found)
> Libomptarget --> Entering target region with entry point 
> 0x0000000000400b40 and device Id -1
> Libomptarget --> Checking whether device 0 is ready.
> Libomptarget --> Is the device 0 (local ID 0) initialized? 0
> Target ve RTL --> Available VEO version: 9
> Libomptarget --> Device 0 is ready to use.
> Target ve RTL --> Dev 0: load binary from 0x0000000000400bc0 image
> Target ve RTL --> Expecting to have 1 entries defined.
> Target ve RTL --> Wrote target image to /tmp/tmpfile_XwAVwB. 
> ImageSize=5648
> Target ve RTL --> ELF Type: 3
> Target ve RTL --> Aurora device successfully initialized with loaded 
> binary: proc_handle=0x621d10, ctx=0x623720
> [VE] ERROR: loadlib_handler() dlerror: /tmp/tmpfile_XwAVwB: undefined 
> symbol: target_func_in_lib
> Target ve RTL --> veo_load_library() failed: LibHandle=0 
> Name=/tmp/tmpfile_XwAVwB. Set env VEORUN_BIN for static linked target 
> code.
> Libomptarget --> Unable to generate entries table for device id 0.
> Libomptarget --> Failed to init globals on device 0
> Libomptarget --> Failed to get device 0 ready
> Libomptarget fatal error 1: failure of target construct while 
> offloading is mandatory
> Libomptarget --> Unloading target library!
> Libomptarget --> Image 0x0000000000400bc0 is compatible with RTL 
> 0x0000000000617580!
> Libomptarget --> Unregistered image 0x0000000000400bc0 from RTL 
> 0x0000000000617580!
> Libomptarget --> Done unregistering images!
> Libomptarget --> Removing translation table for descriptor 
> 0x0000000000402210
> Libomptarget --> Done unregistering library!
> Libomptarget --> Unloading target library!
> Libomptarget --> Image 0x00002b57e6062a20 is compatible with RTL 
> 0x0000000000617580!
> Libomptarget --> Unregistered image 0x00002b57e6062a20 from RTL 
> 0x0000000000617580!
> Libomptarget --> Done unregistering images!
> Libomptarget --> Removing translation table for descriptor 
> 0x00002b57e6064400
> Libomptarget --> Done unregistering library!
> Libomptarget --> Deinit target library!
>
>
> As you can see, libomptarget finds two images 0x00002b57e6062a20 
> (the shared library) and 0x0000000000400bc0 (the main program) in the 
> correct order (the main program requires a symbol in the library so 
> the library gets loaded first):
>
> > Libomptarget --> Registering image 0x00002b57e6062a20 with RTL libomptarget.rtl.ve.so!
> > ...
> > Libomptarget --> Registering image 0x0000000000400bc0 with RTL libomptarget.rtl.ve.so!
>
>
> But then libomptarget calls __tgt_rtl_load_binary() with the image 
> 0x0000000000400bc0 (the main program) without first calling 
> __tgt_rtl_load_binary() with the image for the library:
>
> > Target ve RTL --> Dev 0: load binary from 0x0000000000400bc0 image
>
> This then leads to our plugin being unable to load the image due to 
> unresolved symbols.
>
>
> On 10/21/20 5:18 AM, Johannes Doerfert wrote:
>> Hi Manoel,
>>
>> we briefly discussed it today in our meeting 
>> https://docs.google.com/document/d/1Tz8WFN13n7yJ-SCE0Qjqf9LmjGUw0dWO9Ts1ss4YOdg/edit?usp=sharing
>> If we don't solve this over the list, feel free to join next week.
>> In the meantime, I'm unsure I grasp the situation.
>> Could you use a debug enabled libomptarget and use the 
>> LIBOMPTARGET_DEBUG environment variable to get the sequence of events?
>>
>> ~ Johannes
>>
>>
>> On 10/20/20 5:48 AM, Römmer, Manoel via Openmp-dev wrote:
>>> Hi,
>>>
>>> We have the following problem: We have a shared library containing a 
>>> function which is declared with '#pragma omp declare target', and a 
>>> main executable with a target region in which this function is 
>>> called. Now the target image in the shared library is registered 
>>> with libomptarget (__tgt_register_lib()) before the target image of 
>>> the main executable.
>>> However, libomptarget then passes the target image of the main 
>>> executable to our RTL plugin (with __tgt_rtl_load_binary()) before 
>>> the target image of the shared library.
>>> This is a problem for us because our plugin then tries to load the 
>>> main executable's image first and fails due to unresolved symbols.
>>>
>>> So it seems to me that libomptarget calls __tgt_rtl_load_binary() 
>>> with images not in the order in which they were registered, but in 
>>> the order in which they are placed in memory.
>>>
>>>
>>> Is this intended behaviour?
>>>
>>>
>>> Thanks,
>>>
>>> Manoel
>>>
>>>
>>> _______________________________________________
>>> Openmp-dev mailing list
>>> Openmp-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev
>
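
For readers following the thread, the setup described in the original
report can be sketched roughly as below. This is only an illustration,
not the reporter's actual code: the function body and the name
run_target_region are hypothetical, and both parts are shown in a single
translation unit so the sketch compiles on its own. Only the symbol name
target_func_in_lib is taken from the debug output.

```c
/* Sketch only. In the reported setup the first part is built into a
   shared library and the second part into the main executable. */

/* --- part that would live in the shared library --- */
#pragma omp declare target
int target_func_in_lib(int x) {
  /* Hypothetical body; the real function is not shown in the thread. */
  return 2 * x + 1;
}
#pragma omp end declare target

/* --- part that would live in the main executable --- */
/* Hypothetical helper wrapping the target region that calls into the
   library. Without offloading support the region runs on the host. */
int run_target_region(int input) {
  int result = 0;
  #pragma omp target map(to: input) map(from: result)
  {
    result = target_func_in_lib(input);
  }
  return result;
}
```

When the two parts are split as described, the device image of the main
executable references target_func_in_lib, which is defined only in the
device image of the library. That is why the plugin needs to see the
library image first.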

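The ordering hypothesis from the original report (images handed to the
plugin by memory address rather than by registration order) can be
modelled with a small sketch. This is not libomptarget code; the struct,
the comparators and first_loaded are hypothetical names used only for
illustration. The two addresses are the image addresses from the debug
output.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record for a registered target image. */
struct image {
  const char *name;
  uint64_t addr;   /* image address as printed in the debug log */
  int reg_order;   /* position in the registration sequence */
};

/* Order images by ascending address. */
static int by_addr(const void *a, const void *b) {
  const struct image *x = a, *y = b;
  return (x->addr > y->addr) - (x->addr < y->addr);
}

/* Order images by the sequence in which they were registered. */
static int by_reg(const void *a, const void *b) {
  const struct image *x = a, *y = b;
  return x->reg_order - y->reg_order;
}

/* Returns the name of the image that would be loaded first under the
   chosen ordering. */
const char *first_loaded(int use_address_order) {
  struct image imgs[] = {
    {"shared library", 0x00002b57e6062a20ULL, 0},
    {"main program",   0x0000000000400bc0ULL, 1},
  };
  qsort(imgs, 2, sizeof imgs[0], use_address_order ? by_addr : by_reg);
  return imgs[0].name;
}
```

Under address ordering the main program sorts first, because its image
sits at the lower address; under registration ordering the shared
library comes first. The first case matches the load sequence seen in
the log, which is consistent with the hypothesis but does not prove it.
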
