[Openmp-dev] : undefined symbol: ompt_start_tool

Kelvin Li via Openmp-dev openmp-dev at lists.llvm.org
Wed Oct 28 09:16:24 PDT 2020


Hi Joachim,

Thanks.  I still think that both "./a.out" and "mpirun -np 1 ./a.out" use 
the same library.  Here is the ldd output.

$ ldd a.out
        linux-vdso64.so.1 (0x00007fffa4810000)
        libomp.so => /home/kli/clang-install/lib/libomp.so (0x00007fffa46b0000)
        libpthread.so.0 => /lib64/power9/libpthread.so.0 (0x00007fffa4650000)
        libc.so.6 => /lib64/power9/libc.so.6 (0x00007fffa4440000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fffa4410000)
        /lib64/ld64.so.2 (0x00007fffa4830000)

$ LD_LIBRARY_PATH=/home/kli/clang-install/lib ldd a.out
        linux-vdso64.so.1 (0x00007fffb54d0000)
        libomp.so => /home/kli/clang-install/lib/libomp.so (0x00007fffb5370000)
        libpthread.so.0 => /lib64/power9/libpthread.so.0 (0x00007fffb5310000)
        libc.so.6 => /lib64/power9/libc.so.6 (0x00007fffb5100000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fffb50d0000)
        /lib64/ld64.so.2 (0x00007fffb54f0000)

$ LD_LIBRARY_PATH=/home/kli/clang-install/lib mpirun -np 1 ldd a.out
        linux-vdso64.so.1 (0x0000200000050000)
        .../spectrum_mpi/latest/container/../lib/libpami_cudahook.so (0x0000200000070000)
        libomp.so => /home/kli/clang-install/lib/libomp.so (0x00002000000a0000)
        libpthread.so.0 => /lib64/power9/libpthread.so.0 (0x0000200000210000)
        libc.so.6 => /lib64/power9/libc.so.6 (0x0000200000260000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000200000470000)
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00002000004a0000)
        libm.so.6 => /lib64/power9/libm.so.6 (0x00002000006d0000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000200000800000)
        /lib64/ld64.so.2 (0x0000200000000000)


Preloading the libomp.so does not seem to help.

$ LD_PRELOAD=/home/kli/clang-install/lib/libomp.so mpirun -np 1 ./a.out
./a.out: symbol lookup error: /home/kli/clang-install/lib/libomp.so: undefined symbol: ompt_start_tool
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[53352,1],0]
  Exit code:    127
--------------------------------------------------------------------------

I tried both commands with LD_DEBUG.  It seems that somehow the 
libarcher.so cannot be found to resolve ompt_start_tool.

$ LD_DEBUG=bindings LD_LIBRARY_PATH=/home/kli/clang-install/lib ./a.out 2>&1| grep ompt_start_tool
      9105:     binding file /home/kli/clang-install/lib/libomp.so [0] to /home/kli/clang-install/lib/libomp.so [0]: normal symbol `ompt_start_tool' [VERSION]
      9105:     /home/kli/clang-install/lib/libomp.so: error: symbol lookup error: undefined symbol: ompt_start_tool (fatal)
      9105:     binding file /home/kli/clang-install/lib/libarcher.so [0] to /home/kli/clang-install/lib/libarcher.so [0]: normal symbol `ompt_start_tool'

$ LD_DEBUG=bindings LD_LIBRARY_PATH=/home/kli/clang-install/lib mpirun -np 1 ./a.out 2>&1| grep ompt_start_tool
     27652:     binding file /home/kli/clang-install/lib/libomp.so [0] to /home/kli/clang-install/lib/libomp.so [0]: normal symbol `ompt_start_tool' [VERSION]
     27652:     /home/kli/clang-install/lib/libomp.so: error: symbol lookup error: undefined symbol: ompt_start_tool (fatal)
./a.out: symbol lookup error: /home/kli/clang-install/lib/libomp.so: undefined symbol: ompt_start_tool
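
The two traces above differ in which libraries end up providing ompt_start_tool. A small filter can make that easier to see at a glance. This is only a sketch: the function name binding_summary is made up for illustration, and it assumes the unwrapped, one-line-per-binding format that glibc prints for LD_DEBUG=bindings.

```shell
# Summarise LD_DEBUG=bindings output (read from stdin) for one symbol.
# $1: symbol name to look for (hypothetical helper, not part of any tool).
# Prints one "referencing-file -> defining-file" line per binding record;
# error lines and bindings of other symbols are dropped.
binding_summary() {
    sed -n "s/.*binding file \([^ ]*\) \[[0-9]*\] to \([^ ]*\) \[[0-9]*\]: normal symbol \`$1'.*/\1 -> \2/p"
}

# Possible usage, with the commands from this thread:
#   LD_DEBUG=bindings LD_LIBRARY_PATH=/home/kli/clang-install/lib ./a.out 2>&1 \
#       | binding_summary ompt_start_tool
```

Running it once with and once without mpirun should show whether a libarcher.so binding line appears in only one of the two cases.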


Thanks,
Kelvin




From:   Joachim Protze <protze.joachim at gmail.com>
To:     Kelvin Li <kli at ca.ibm.com>, openmp-dev at lists.llvm.org
Cc:     Jim Cownie <jcownie at gmail.com>
Date:   2020/10/28 03:57 AM
Subject:        [EXTERNAL] Re: [Openmp-dev] : undefined symbol: 
ompt_start_tool



Hi Kelvin,

While this LD_PRELOAD is a workaround for the symptom (as libarcher.so
happens to also implement ompt_start_tool), it does not explain your issue.

For my local installation I get:

$ readelf --syms libomp.so | grep ompt_start_tool
   658: 0000000000098050    54 FUNC    WEAK   DEFAULT   12 ompt_start_tool@@VERSION
   776: 00000000000c8f40     8 OBJECT  LOCAL  DEFAULT   26 _ZL22ompt_start_tool_resu
  2821: 0000000000098050    54 FUNC    WEAK   DEFAULT   12 ompt_start_tool

So, the runtime has a (weak) implementation of this function.
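
To run the same check on another build, the readelf output can be reduced to the two columns that matter here. This is a minimal sketch: the function name symbol_binding is hypothetical, and it assumes the one-line-per-symbol layout of `readelf -W --syms` (Num, Value, Size, Type, Bind, Vis, Ndx, Name). An Ndx of UND means the library only references the symbol and does not define it.

```shell
# Report binding and section index for one symbol from readelf output.
# $1: symbol name; stdin: output of `readelf -W --syms <library>`.
# (symbol_binding is an illustrative helper, not an existing tool.)
symbol_binding() {
    awk -v sym="$1" '{
        name = $8
        sub(/@.*/, "", name)        # drop a version suffix such as @@VERSION
        if (name == sym) print $5, $7   # Bind column, Ndx column
    }' | sort -u
}

# Possible usage:
#   readelf -W --syms /home/kli/clang-install/lib/libomp.so \
#       | symbol_binding ompt_start_tool
```

For the symbol table shown above this prints "WEAK 12", i.e. a weak definition in section 12.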


My suspicion is that mpirun adds some path to LD_LIBRARY_PATH, so that a
different libomp is loaded.  You might compare

$ ldd a.out
and
$ LD_LIBRARY_PATH=/home/kli/clang-install/lib ldd a.out
and
$ LD_LIBRARY_PATH=/home/kli/clang-install/lib mpirun -np 1 ldd a.out
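
The comparison can be narrowed to the one line of interest with a small filter. This is a sketch under the assumption that ldd prints its usual "name => path (address)" lines; the function name resolved_path is made up for illustration.

```shell
# Print the path that ldd resolved for one library.
# $1: library name as shown in the first ldd column, e.g. libomp.so
# stdin: ldd output. (resolved_path is an illustrative helper.)
resolved_path() {
    awk -v lib="$1" '$1 == lib && $2 == "=>" { print $3 }'
}

# Possible usage, one call per case to compare:
#   ldd a.out | resolved_path libomp.so
#   LD_LIBRARY_PATH=/home/kli/clang-install/lib ldd a.out | resolved_path libomp.so
#   LD_LIBRARY_PATH=/home/kli/clang-install/lib mpirun -np 1 ldd a.out | resolved_path libomp.so
```

If all three calls print the same path, the same libomp.so file is being picked up in each case.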

Instead of preloading libarcher, you can also preload a specific OpenMP
runtime to be used for execution with:

$ LD_PRELOAD=/home/kli/clang-install/lib/libomp.so

Is this the same runtime that is used when you execute without mpirun?
Do you get the same error when preloading this runtime without mpirun?

Best
Joachim


Am 27.10.20 um 21:22 schrieb Kelvin Li via Openmp-dev:
> I figured out how to make it work.  I need to preload libarcher.so.  I 
> don't understand why it cannot be done automatically in the "mpirun ... 
> ./a.out" case.
> 
> $ LD_PRELOAD=/home/kli/clang-install/lib/libarcher.so LD_LIBRARY_PATH=/home/kli/clang-install/lib KMP_VERSION=1 mpirun -np 1 ./a.out
> LLVM OMP version: 5.0.20140926
> LLVM OMP library type: performance
> LLVM OMP link type: dynamic
> LLVM OMP build time: no_timestamp
> LLVM OMP build compiler: Clang 11.0
> LLVM OMP alternative compiler support: yes
> LLVM OMP API version: 5.0 (201611)
> LLVM OMP dynamic error checking: no
> LLVM OMP plain barrier branch bits: gather=2, release=2
> LLVM OMP forkjoin barrier branch bits: gather=2, release=2
> LLVM OMP reduction barrier branch bits: gather=1, release=1
> LLVM OMP plain barrier pattern: gather=hyper, release=hyper
> LLVM OMP forkjoin barrier pattern: gather=hyper, release=hyper
> LLVM OMP reduction barrier pattern: gather=hyper, release=hyper
> LLVM OMP lock type: run time selectable
> LLVM OMP thread affinity support: not used
> 0
> 1
> 2
> 3
> 
> Kelvin
> 
> 
> 
> 
> From:   Kelvin Li via Openmp-dev <openmp-dev at lists.llvm.org>
> To:     Jim Cownie <jcownie at gmail.com>
> Cc:     via Openmp-dev <openmp-dev at lists.llvm.org>
> Date:   2020/10/27 01:24 PM
> Subject:        [EXTERNAL] Re: [Openmp-dev] :  undefined symbol: 
> ompt_start_tool
> Sent by:        "Openmp-dev" <openmp-dev-bounces at lists.llvm.org>
> 
> 
> 
> Hi Jim,
> 
> Here is what I get with KMP_VERSION=1.
> 
> $ LD_LIBRARY_PATH=$HOME/clang-install/lib KMP_VERSION=1 ./a.out
> LLVM OMP version: 5.0.20140926
> LLVM OMP library type: performance
> LLVM OMP link type: dynamic
> LLVM OMP build time: no_timestamp
> LLVM OMP build compiler: Clang 11.0
> LLVM OMP alternative compiler support: yes
> LLVM OMP API version: 5.0 (201611)
> LLVM OMP dynamic error checking: no
> LLVM OMP plain barrier branch bits: gather=2, release=2
> LLVM OMP forkjoin barrier branch bits: gather=2, release=2
> LLVM OMP reduction barrier branch bits: gather=1, release=1
> LLVM OMP plain barrier pattern: gather=hyper, release=hyper
> LLVM OMP forkjoin barrier pattern: gather=hyper, release=hyper
> LLVM OMP reduction barrier pattern: gather=hyper, release=hyper
> LLVM OMP lock type: run time selectable
> LLVM OMP thread affinity support: not used
> 0
> 1
> 3
> 2
> 
> For the mpirun case, 
> 
> $ KMP_VERSION=1 mpirun -np 1 ./a.out
> ./a.out: symbol lookup error: /home/kli/clang-install/lib/libomp.so: undefined symbol: ompt_start_tool
> --------------------------------------------------------------------------
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status, thus causing
> the job to be terminated. The first process to do so was:
> 
>   Process name: [[52030,1],0]
>   Exit code:    127
> --------------------------------------------------------------------------
> 
> 
> Kelvin
> 
> 
> 
> 
> From:        Jim Cownie <jcownie at gmail.com>
> To:        Kelvin Li <kli at ca.ibm.com>
> Cc:        via Openmp-dev <openmp-dev at lists.llvm.org>
> Date:        2020/10/27 11:09 AM
> Subject:        [EXTERNAL] Re: [Openmp-dev] :  undefined symbol: 
> ompt_start_tool
> 
> 
> 
> On 27 Oct 2020, at 15:00, Kelvin Li <kli at ca.ibm.com> wrote:
> 
> I don't think that is the case.  There is only one task "-np 1" on one 
> node.  Both './a.out' and 'mpirun -np 1 ./a.out' are issued on the same 
> node which has the same library in /home/kli/clang-install/lib.  That is 
> puzzling me!
> 
> It really looks as if you’re getting two different versions of the 
> runtime, though, so having the runtime tell you its properties is still 
> likely useful.
> If nothing else, it may show up that you’re not propagating envirables as 
> you might have hoped (if the MPI version doesn’t print anything!)
> 
> -- Jim
> James Cownie <jcownie at gmail.com>
> Mob: +44 780 637 7146
> Kelvin
> 
> 
> 
> 
> From:        Jim Cownie via Openmp-dev <openmp-dev at lists.llvm.org>
> To:        via Openmp-dev <openmp-dev at lists.llvm.org>, 
> openmp-dev-request at lists.llvm.org
> Date:        2020/10/27 04:46 AM
> Subject:        [EXTERNAL] Re: [Openmp-dev] :  undefined symbol: 
> ompt_start_tool
> Sent by:        "Openmp-dev" <openmp-dev-bounces at lists.llvm.org>
> 
> 
> 
> Message: 1
> Date: Mon, 26 Oct 2020 15:18:45 -0500
> From: Kelvin Li via Openmp-dev <openmp-dev at lists.llvm.org>
> To: openmp-dev at lists.llvm.org
> Subject: [Openmp-dev] undefined symbol: ompt_start_tool
> Message-ID:
> <OFF5259549.0EC65D66-ON8525860D.006EC181-8525860D.006F94A6 at notes.na.collabserv.com>
> 
> Content-Type: text/plain; charset="utf-8"
> 
> Has anyone encountered the following error?  I am wondering if it has 
> something to do with how I build libomp.so.
> 
> $ LD_LIBRARY_PATH=/home/kli/clang-install/lib mpirun -np 1 ./a.out
> a.out: symbol lookup error: /home/kli/clang-install/lib/libomp.so: undefined symbol: ompt_start_tool
> --------------------------------------------------------------------------
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status, thus causing
> the job to be terminated. The first process to do so was:
> 
> Process name: [[14546,1],0]
> Exit code:    127
> --------------------------------------------------------------------------
> 
> But it works without mpirun.
> 
> $ LD_LIBRARY_PATH=/home/kli/clang-install/lib ./a.out
> 0
> 1
> 2
> 3
> 
> 
> Kelvin
> Are you confident that /home/kli/clang-install/lib is the same on all of 
> the nodes used by the MPI program?
> And that it contains the same version of libomp.so everywhere?
> 
> Perhaps you should also set an envirable to have the OpenMP runtime print 
> its version, something like this:
> $ KMP_VERSION=1 ./a.out
> LLVM OMP version: 5.0.20140926
> LLVM OMP library type: performance
> LLVM OMP link type: dynamic
> LLVM OMP build time: no_timestamp
> LLVM OMP build compiler: Clang 12.0
> LLVM OMP alternative compiler support: yes
> LLVM OMP API version: 5.0 (201611)
> LLVM OMP dynamic error checking: no
> LLVM OMP thread affinity support: no
> 
> _______________________________________________
> Openmp-dev mailing list
> Openmp-dev at lists.llvm.org
> 
https://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev 
 
> 
> 
> 
> 
> 
> 




