[Openmp-dev] undefined symbol: ompt_start_tool

Jim Cownie via Openmp-dev openmp-dev at lists.llvm.org
Tue Oct 27 01:46:21 PDT 2020


> Message: 1
> Date: Mon, 26 Oct 2020 15:18:45 -0500
> From: Kelvin Li via Openmp-dev <openmp-dev at lists.llvm.org>
> To: openmp-dev at lists.llvm.org
> Subject: [Openmp-dev] undefined symbol: ompt_start_tool
> Message-ID:
> 	<OFF5259549.0EC65D66-ON8525860D.006EC181-8525860D.006F94A6 at notes.na.collabserv.com>
> 	
> Content-Type: text/plain; charset="utf-8"
> 
> Has anyone encountered the following error?  I am wondering if it is 
> something to do with how I build libomp.so.
> 
> $ LD_LIBRARY_PATH=/home/kli/clang-install/lib mpirun -np 1 ./a.out
> a.out: symbol lookup error: /home/kli/clang-install/lib/libomp.so: 
> undefined symbol: ompt_start_tool
> --------------------------------------------------------------------------
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status, 
> thus causing
> the job to be terminated. The first process to do so was:
> 
>  Process name: [[14546,1],0]
>  Exit code:    127
> --------------------------------------------------------------------------
> 
> But it works without mpirun.
> 
> $ LD_LIBRARY_PATH=/home/kli/clang-install/lib ./a.out
> 0
> 1
> 2
> 3
> 
> 
> Kelvin
Are you confident that /home/kli/clang-install/lib is the same on all of the nodes used by the MPI program,
and that it contains the same version of libomp.so everywhere?

Perhaps you should also set an environment variable to have the OpenMP runtime print its version, something like this:
$ KMP_VERSION=1 ./a.out
LLVM OMP version: 5.0.20140926
LLVM OMP library type: performance
LLVM OMP link type: dynamic
LLVM OMP build time: no_timestamp
LLVM OMP build compiler: Clang 12.0
LLVM OMP alternative compiler support: yes
LLVM OMP API version: 5.0 (201611)
LLVM OMP dynamic error checking: no
LLVM OMP thread affinity support: no
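
To compare what the loader resolves with and without mpirun, something along these lines may help. It is only a sketch: libomp_from_ldd is a hypothetical helper name, it assumes a glibc-style ldd whose lines look like "name => path (address)", and ./a.out is the binary from the report. If the two paths differ, mpirun is changing the library search environment; if they match, comparing checksums of libomp.so on each node would be a reasonable next step.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print the path that
# libomp resolves to. Assumes lines of the form "libomp.so => /path (0x...)".
libomp_from_ldd() {
    awk '$1 ~ /^libomp\.so/ && $2 == "=>" { print ($3 == "not" ? "not found" : $3) }'
}

# Only run the comparison when mpirun and the test binary are present.
if command -v mpirun >/dev/null 2>&1 && [ -x ./a.out ]; then
    echo "direct: $(ldd ./a.out | libomp_from_ldd)"
    echo "mpirun: $(mpirun -np 1 ldd ./a.out | libomp_from_ldd)"
fi
```

Running ldd under mpirun, rather than the program itself, avoids the symbol lookup failure while still showing which library would be picked up in the launched environment.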

> On 26 Oct 2020, at 23:44, via Openmp-dev <openmp-dev at lists.llvm.org> wrote:
> 
> 
> Today's Topics:
> 
>   1. undefined symbol: ompt_start_tool (Kelvin Li via Openmp-dev)
>   2. Re: Declare target functions and libomptarget image
>      registration order (Manoel Römmer via Openmp-dev)
>   3. Re: Declare target functions and libomptarget image
>      registration order (Johannes Doerfert via Openmp-dev)
>   4. Re: Declare target functions and libomptarget image
>      registration order (Narayanaswamy, Ravi via Openmp-dev)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 2
> Date: Mon, 26 Oct 2020 21:23:28 +0100
> From: Manoel Römmer via Openmp-dev <openmp-dev at lists.llvm.org>
> To: Johannes Doerfert <johannesdoerfert at gmail.com>,
> 	"openmp-dev at lists.llvm.org" <openmp-dev at lists.llvm.org>
> Subject: Re: [Openmp-dev] Declare target functions and libomptarget
> 	image registration order
> Message-ID: <5e5949e5-732c-81a8-5fa6-a274d5ac5e31 at itc.rwth-aachen.de>
> Content-Type: text/plain; charset="utf-8"; format=flowed
> 
> Hi Johannes,
> 
> so the debug output is:
> 
> Libomptarget --> Loading RTLs...
> Libomptarget --> Loading library 'libomptarget.rtl.ve.so'...
> Target ve RTL --> Found 8 VE devices
> Libomptarget --> Successfully loaded library 'libomptarget.rtl.ve.so'!
> Libomptarget --> Registering RTL libomptarget.rtl.ve.so supporting 8 devices!
> Libomptarget --> Loading library 'libomptarget.rtl.ppc64.so'...
> Libomptarget --> Unable to load library 'libomptarget.rtl.ppc64.so': libomptarget.rtl.ppc64.so: cannot open shared object file: No such file or directory!
> Libomptarget --> Loading library 'libomptarget.rtl.x86_64.so'...
> Libomptarget --> Successfully loaded library 'libomptarget.rtl.x86_64.so'!
> Libomptarget --> Registering RTL libomptarget.rtl.x86_64.so supporting 4 devices!
> Libomptarget --> Loading library 'libomptarget.rtl.cuda.so'...
> Libomptarget --> Unable to load library 'libomptarget.rtl.cuda.so': libomptarget.rtl.cuda.so: cannot open shared object file: No such file or directory!
> Libomptarget --> Loading library 'libomptarget.rtl.aarch64.so'...
> Libomptarget --> Unable to load library 'libomptarget.rtl.aarch64.so': libomptarget.rtl.aarch64.so: cannot open shared object file: No such file or directory!
> Libomptarget --> RTLs loaded!
> Libomptarget --> Image 0x00002b57e6062a20 is compatible with RTL libomptarget.rtl.ve.so!
> Libomptarget --> RTL 0x0000000000617580 has index 0!
> Libomptarget --> Registering image 0x00002b57e6062a20 with RTL libomptarget.rtl.ve.so!
> Libomptarget --> Done registering entries!
> Libomptarget --> New requires flags 1 compatible with existing 1!
> Libomptarget --> Image is compatible with RTL libomptarget.rtl.ve.so!
> Libomptarget --> Registering image 0x0000000000400bc0 with RTL libomptarget.rtl.ve.so!
> Libomptarget --> Done registering entries!
> Libomptarget --> Call to omp_get_num_devices returning 8
> Libomptarget --> Default TARGET OFFLOAD policy is now mandatory (devices were found)
> Libomptarget --> Entering target region with entry point 0x0000000000400b40 and device Id -1
> Libomptarget --> Checking whether device 0 is ready.
> Libomptarget --> Is the device 0 (local ID 0) initialized? 0
> Target ve RTL --> Available VEO version: 9
> Libomptarget --> Device 0 is ready to use.
> Target ve RTL --> Dev 0: load binary from 0x0000000000400bc0 image
> Target ve RTL --> Expecting to have 1 entries defined.
> Target ve RTL --> Wrote target image to /tmp/tmpfile_XwAVwB. ImageSize=5648
> Target ve RTL --> ELF Type: 3
> Target ve RTL --> Aurora device successfully initialized with loaded binary: proc_handle=0x621d10, ctx=0x623720
> [VE] ERROR: loadlib_handler() dlerror: /tmp/tmpfile_XwAVwB: undefined symbol: target_func_in_lib
> Target ve RTL --> veo_load_library() failed: LibHandle=0 Name=/tmp/tmpfile_XwAVwB. Set env VEORUN_BIN for static linked target code.
> Libomptarget --> Unable to generate entries table for device id 0.
> Libomptarget --> Failed to init globals on device 0
> Libomptarget --> Failed to get device 0 ready
> Libomptarget fatal error 1: failure of target construct while offloading is mandatory
> Libomptarget --> Unloading target library!
> Libomptarget --> Image 0x0000000000400bc0 is compatible with RTL 0x0000000000617580!
> Libomptarget --> Unregistered image 0x0000000000400bc0 from RTL 0x0000000000617580!
> Libomptarget --> Done unregistering images!
> Libomptarget --> Removing translation table for descriptor 0x0000000000402210
> Libomptarget --> Done unregistering library!
> Libomptarget --> Unloading target library!
> Libomptarget --> Image 0x00002b57e6062a20 is compatible with RTL 0x0000000000617580!
> Libomptarget --> Unregistered image 0x00002b57e6062a20 from RTL 0x0000000000617580!
> Libomptarget --> Done unregistering images!
> Libomptarget --> Removing translation table for descriptor 0x00002b57e6064400
> Libomptarget --> Done unregistering library!
> Libomptarget --> Deinit target library!
> 
> 
> As you can see, libomptarget finds two images, 0x00002b57e6062a20 (the 
> shared library) and 0x0000000000400bc0 (the main program), in the correct 
> order (the main program requires a symbol in the library, so the library 
> gets loaded first):
> 
>> Libomptarget --> Registering image 0x00002b57e6062a20 with RTL libomptarget.rtl.ve.so!
>> ...
>> Libomptarget --> Registering image 0x0000000000400bc0 with RTL libomptarget.rtl.ve.so!
> 
> 
> But then libomptarget calls __tgt_rtl_load_binary() with the image 
> 0x0000000000400bc0 (the main program) without first calling 
> __tgt_rtl_load_binary() with the image for the library:
> 
>> Target ve RTL --> Dev 0: load binary from 0x0000000000400bc0 image
> 
> This then leads to our plugin being unable to load the image 
> due to unresolved symbols.
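
A toy illustration of the suspected ordering, using the two image addresses from the log. This is not libomptarget code; it only assumes, as suggested in the original report, that the images end up being visited in address order, as a container keyed by address would do. The labels and the address_order name are made up for the example.

```shell
#!/bin/sh
# Image addresses copied from the debug log; the labels are added here.
# They are listed in the order libomptarget reports registering them.
registered='0x00002b57e6062a20 shared-library
0x0000000000400bc0 main-program'

# The addresses are zero-padded to equal width, so a byte-wise sort orders
# them numerically. This stands in for iterating an address-keyed container.
address_order() {
    LC_ALL=C sort
}

echo "registration order:"
printf '%s\n' "$registered"

echo "address order:"
printf '%s\n' "$registered" | address_order
```

With these two addresses the main program's image sorts ahead of the shared library's, which would match the load order seen in the log even though the library was registered first.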
> 
> 
> On 10/21/20 5:18 AM, Johannes Doerfert wrote:
>> Hi Manoel,
>> 
>> we briefly discussed it today in our meeting 
>> https://docs.google.com/document/d/1Tz8WFN13n7yJ-SCE0Qjqf9LmjGUw0dWO9Ts1ss4YOdg/edit?usp=sharing
>> If we don't solve this over the list, feel free to join next week.
>> In the meantime, I'm unsure I grasp the situation.
>> Could you use a debug enabled libomptarget and use the 
>> LIBOMPTARGET_DEBUG environment variable to get the sequence of events?
>> 
>> ~ Johannes
>> 
>> 
>> On 10/20/20 5:48 AM, Römmer, Manoel via Openmp-dev wrote:
>>> Hi,
>>> 
>>> We have the following problem: We have a shared library containing a 
>>> function which is declared with '#pragma omp declare target', and a 
>>> main executable with a target region in which this function is 
>>> called. Now the target image in the shared library is registered with 
>>> libomptarget (__tgt_register_lib()) before the target image of the 
>>> main executable.
>>> However, libomptarget then passes the target image of the main 
>>> executable to our RTL plugin (with __tgt_rtl_load_binary()) before 
>>> the target image of the shared library.
>>> This is a problem for us because our plugin then tries to load the 
>>> main executable's image first and fails due to unresolved symbols.
>>> 
>>> So, it seems to me, that libomptarget calls __tgt_rtl_load_binary() 
>>> with images not in the order which they were registered but in the 
>>> order they are placed in memory.
>>> 
>>> 
>>> Is this intended behaviour?
>>> 
>>> 
>>> Thanks,
>>> 
>>> Manoel
>>> 
>>> 
>>> _______________________________________________
>>> Openmp-dev mailing list
>>> Openmp-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev
> 
> 
> 
> ------------------------------
> 
> Message: 3
> Date: Mon, 26 Oct 2020 17:58:37 -0500
> From: Johannes Doerfert via Openmp-dev <openmp-dev at lists.llvm.org>
> To: Manoel Römmer <roemmer at itc.rwth-aachen.de>,
> 	"openmp-dev at lists.llvm.org" <openmp-dev at lists.llvm.org>
> Cc: "Eachempati, Deepak" <deepak.eachempati at hpe.com>
> Subject: Re: [Openmp-dev] Declare target functions and libomptarget
> 	image registration order
> Message-ID: <18160c18-e591-71f8-e5a7-208e729a1fda at gmail.com>
> Content-Type: text/plain; charset=utf-8; format=flowed
> 
> This looks like a problem.
> 
> For me to understand this right, do you explicitly call any target 
> library functions, if so, which and in which order?
> 
> 
> @Ravi, @Deepak, @Kelvin, please take a look as well
> 
> 
> 
> ------------------------------
> 
> Message: 4
> Date: Mon, 26 Oct 2020 23:42:14 +0000
> From: "Narayanaswamy, Ravi via Openmp-dev" <openmp-dev at lists.llvm.org>
> To: Johannes Doerfert <johannesdoerfert at gmail.com>, Manoel Römmer
> 	<roemmer at itc.rwth-aachen.de>, "openmp-dev at lists.llvm.org"
> 	<openmp-dev at lists.llvm.org>
> Cc: "Eachempati, Deepak" <deepak.eachempati at hpe.com>
> Subject: Re: [Openmp-dev] Declare target functions and libomptarget
> 	image registration order
> Message-ID:
> 	<BY5PR11MB4210ED46396813590081925897190 at BY5PR11MB4210.namprd11.prod.outlook.com>
> 	
> Content-Type: text/plain; charset="utf-8"
> 
> The different scenarios of offloading are:
> 1)  Offloading from the main executable.
>   a)  The kernel is self-contained, i.e., it has no calls to routines outside the main image.
>   b)  It has calls to routines in known device libraries such as the device RTL and math.
>   c)  It has calls to user libraries.  I am considering only .so here, since .a would have been linked in at compile time.
> 
> 2)  Offloading from a dynamic library (.so).
>   a)  The kernel is self-contained, i.e., it has no calls to routines outside the .so image.
>   b)  It has calls to routines in known device libraries such as the device RTL and math.
>   c)  It has calls to user libraries.
> 
> You are requesting support for 1c and 2c.  All we need is to register the dynamic libraries with the plugin, since they are registered first with libomptarget.
> The plugin keeps a table of all registered images, and when any binary is compiled it also looks up all the registered images to resolve any required symbols.
> So your plugin needs to support some sort of dynamic linking.  Does your plugin for NEC SX-Aurora have this support?
> 
> 
> ------------------------------
> 
> End of Openmp-dev Digest, Vol 82, Issue 10
> ******************************************

-- Jim
James Cownie <jcownie at gmail.com>
Mob: +44 780 637 7146



