[cfe-users] Segmentation fault with memory sanitizer and OpenMPI's mpirun

Schlottke-Lakemper, Michael via cfe-users cfe-users at lists.llvm.org
Tue Nov 24 06:08:07 PST 2015


Hi Evgenii,

I can confirm that adding 

LIBS="-lutil"

to the configure command for OpenMPI resolved the reported issue. Thanks again for your help!
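
For the archives, here is a sketch of what the full invocation looks like with that change. It is the configure command from my original mail plus LIBS="-lutil". The MSAN_FLAGS and CONFIGURE variables and the existence check are only there to keep the sketch short and safe to paste; they are not part of what I originally ran. Paths and versions are from my setup, so adjust as needed.

```shell
# Sketch only: original configure command from this thread, plus LIBS="-lutil".
# MSAN_FLAGS and CONFIGURE are helper variables introduced for this example.
MSAN_FLAGS="-g -fsanitize=memory -fno-omit-frame-pointer -fsanitize-memory-track-origins -fsanitize-blacklist=$(pwd)/blacklist.txt"
CONFIGURE=../openmpi-1.8.7/configure

# Wildcard blacklist, as in the original mail.
printf "fun:*\n" > blacklist.txt

# Run configure only if the source tree is where this sketch expects it.
if [ -x "$CONFIGURE" ]; then
  CC=clang CXX=clang++ \
  CFLAGS="$MSAN_FLAGS" CXXFLAGS="$MSAN_FLAGS" \
  LIBS="-lutil" \
  "$CONFIGURE" --prefix=/pds/opt/openmpi-1.8.7-clang-msan --disable-mpi-fortran
else
  echo "configure not found at $CONFIGURE" >&2
fi
```

Note that I passed -lutil via LIBS rather than LDFLAGS, so that it ends up after the object files on the link line.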

Michael


> On 23 Nov 2015, at 20:05 , Evgenii Stepanov <eugeni.stepanov at gmail.com> wrote:
> 
> I think so. What probably happens here is MSan confuses the configure
> script into thinking that openpty is available without -lutil, but
> what's actually available is just a stub that tries calling the real
> openpty and fails, unless libutil is linked.
> 
> On Mon, Nov 23, 2015 at 10:41 AM, Schlottke-Lakemper, Michael
> <m.schlottke-lakemper at aia.rwth-aachen.de> wrote:
>> Hi Evgenii,
>> 
>> Just to clarify: you mean I should re-compile OpenMPI and add "-lutil" to LDFLAGS at configure time?
>> 
>> Yours
>> 
>> Michael
>> 
>>> On 23 Nov 2015, at 18:07 , Evgenii Stepanov <eugeni.stepanov at gmail.com> wrote:
>>> 
>>> This is caused by missing -lutil.
>>> FTR, http://llvm.org/viewvc/llvm-project?rev=245619&view=rev
>>> 
>>> On Mon, Nov 23, 2015 at 7:33 AM, Schlottke-Lakemper, Michael via
>>> cfe-users <cfe-users at lists.llvm.org> wrote:
>>>> Hi folks,
>>>> 
>>>> When running "mpirun" from an MSan-instrumented installation of OpenMPI, I get the following error:
>>>> 
>>>> $> mpirun -n 1 hostname
>>>> [aia308:48324] *** Process received signal ***
>>>> [aia308:48324] Signal: Segmentation fault (11)
>>>> [aia308:48324] Signal code: Address not mapped (1)
>>>> [aia308:48324] Failing at address: (nil)
>>>> [aia308:48324] [ 0] /pds/opt/openmpi-1.8.7-clang-msan/lib64/libopen-pal.so.6(+0x123ca1)[0x7f9e21c90ca1]
>>>> [aia308:48324] [ 1] mpirun[0x42b602]
>>>> [aia308:48324] [ 2] /lib64/libpthread.so.0(+0xf890)[0x7f9e21250890]
>>>> [aia308:48324] *** End of error message ***
>>>> Segmentation fault
>>>> 
>>>> Running it through gdb and printing the stacktrace, I get the following additional information:
>>>> $> gdb -ex r --args mpirun -n 1 hostname
>>>> #0  0x0000000000000000 in ?? ()
>>>> #1  0x000000000042bd5b in __interceptor_openpty () at /pds/opt/install/llvm/llvm-20151121-r253770-src/projects/compiler-rt/lib/msan/msan_interceptors.cc:1355
>>>> #2  0x00007ffff7705a7c in opal_openpty (amaster=0x7fffffffacf8, aslave=0x7fffffffacfc, name=0x0, termp=0x0, winp=0x0) at ../../../openmpi-1.8.7/opal/util/opal_pty.c:116
>>>> #3  0x00007ffff7b31e9a in orte_iof_base_setup_prefork (opts=0x7ffffffface8) at ../../../../openmpi-1.8.7/orte/mca/iof/base/iof_base_setup.c:89
>>>> #4  0x00007fffefea465a in odls_default_fork_local_proc (context=0x72400000bb80, child=0x72400000b880, environ_copy=0x73400001ec00, jobdat=0x722000009d80) at ../../../../../openmpi-1.8.7/orte/mca/odls/default/odls_default_module.c:860
>>>> #5  0x00007ffff7b3cfb8 in orte_odls_base_default_launch_local (fd=-1, sd=4, cbdata=0x70600002da80) at ../../../../openmpi-1.8.7/orte/mca/odls/base/odls_base_default_fns.c:1544
>>>> #6  0x00007ffff77459d7 in event_process_active_single_queue (base=0x72a00000fc80, activeq=0x71000000bdc0) at ../../../../../../openmpi-1.8.7/opal/mca/event/libevent2021/libevent/event.c:1367
>>>> #7  0x00007ffff773bb92 in event_process_active (base=0x72a00000fc80) at ../../../../../../openmpi-1.8.7/opal/mca/event/libevent2021/libevent/event.c:1437
>>>> #8  0x00007ffff7738fd7 in opal_libevent2021_event_base_loop (base=0x72a00000fc80, flags=1) at ../../../../../../openmpi-1.8.7/opal/mca/event/libevent2021/libevent/event.c:1647
>>>> #9  0x000000000048f651 in orterun (argc=4, argv=0x7fffffffcea8) at ../../../../openmpi-1.8.7/orte/tools/orterun/orterun.c:1133
>>>> #10 0x000000000048b20c in main (argc=4, argv=0x7fffffffcea8) at ../../../../openmpi-1.8.7/orte/tools/orterun/main.c:13
>>>> 
>>>> I suspect this has something to do with using some non-MSan-instrumented system libraries, but how can I find out which library is the problem and how to fix it? Any ideas?
>>>> 
>>>> Regards,
>>>> 
>>>> Michael
>>>> 
>>>> P.S.: I compiled OpenMPI with the following configure command (the wildcard blacklist was necessary because OpenMPI just has too many issues with msan…):
>>>> 
>>>> printf "fun:*\n" > blacklist.txt && \
>>>> CC=clang CXX=clang++ \
>>>> CFLAGS="-g -fsanitize=memory -fno-omit-frame-pointer -fsanitize-memory-track-origins -fsanitize-blacklist=`pwd`/blacklist.txt" \
>>>> CXXFLAGS="-g -fsanitize=memory -fno-omit-frame-pointer -fsanitize-memory-track-origins -fsanitize-blacklist=`pwd`/blacklist.txt" \
>>>> ../openmpi-1.8.7/configure --prefix=/pds/opt/openmpi-1.8.7-clang-msan --disable-mpi-fortran
>>>> _______________________________________________
>>>> cfe-users mailing list
>>>> cfe-users at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-users