[LLVMdev] radr://12777299, "potential pthread/eh bug exposed by libsanitizer"
Jack Howarth
howarth at bromo.med.uc.edu
Fri Nov 30 06:56:28 PST 2012
On Fri, Nov 30, 2012 at 01:41:05PM +0400, Kostya Serebryany wrote:
> Just want to remind everyone that we plan to stop using mach_override in
> asan in favor of OSX's native function interposition.
> So, we probably don't want to spend too much effort fixing mach_override.
>
> --kcc
Kostya,
Is the native function interposition that is being adopted based on...
https://github.com/rentzsch/mach_inject
? I assume that any method used will be transparent to the user and not require
manually setting DYLD_INSERT_LIBRARIES, correct?
Jack
>
> On Fri, Nov 30, 2012 at 4:46 AM, Alexander Potapenko <glider at google.com> wrote:
>
> > Looks like this happens on x86_64 because the position of __cxa_throw
> > is too far from the allocated branch island (should be <2G). This can
> > be solved by allocating the branch islands somewhere near the text
> > segment (look for kIslandEnd in asan_mac.cc, this is currently
> > 0x7fffffdf0000) or by patching the function with a longer instruction
> > sequence that stores the jump target in a register and jumps to that
> > target (which is a bit more complex to implement).
> >
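To make the <2G constraint concrete, here is a minimal C sketch (not taken from mach_override or asan) of the reach check for a 5-byte jmp rel32. The function names are hypothetical. The addresses used in the note below are the __cxa_throw address from Nick's gdb session and the kIslandEnd value quoted above; the island address 0x7fffffd27000 is an assumption chosen only for illustration, since the thread does not say where the island was actually allocated.

```c
#include <stdbool.h>
#include <stdint.h>

enum { kJmpRel32Len = 5 };  /* E9 opcode byte + 4-byte displacement */

/* True if a 5-byte `jmp rel32` placed at `from` can reach `to`.
 * The displacement is relative to the end of the jump instruction. */
static bool fits_rel32(uint64_t from, uint64_t to) {
    int64_t disp = (int64_t)(to - (from + kJmpRel32Len));
    return disp >= INT32_MIN && disp <= INT32_MAX;
}

/* Where execution lands if the displacement is silently truncated
 * to 32 bits instead of being rejected (hypothetical helper). */
static uint64_t truncated_jmp_target(uint64_t from, uint64_t to) {
    uint64_t next = from + kJmpRel32Len;
    int32_t rel32 = (int32_t)(uint32_t)(to - next);  /* keep low 32 bits */
    return next + (uint64_t)(int64_t)rel32;          /* sign-extend, add */
}
```

With these definitions, fits_rel32(0x102244870, 0x7fffffdf0000) is false, and truncated_jmp_target(0x102244870, 0x7fffffd27000) evaluates to 0xffd27000. That would be consistent with the bogus jmpq target in the debugger output if the island sat at that assumed address.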
> > Once this problem is fixed, another one is going to arise. This is what
> > the first bytes of __cxa_throw look like:
> >
> > 0x0020c49ba5d916e0 <__cxa_throw+0>: lea 0xb4f01(%rip),%rax #
> > 0x20c49ba5e465e8 <_ZN10__cxxabiv120__unexpected_handlerE>
> > 0x0020c49ba5d916e7 <__cxa_throw+7>: push %rbx
> > 0x0020c49ba5d916e8 <__cxa_throw+8>: lea -0x20(%rdi),%rbx
> >
> > If we move the relative LEA instruction somewhere, we must fix the
> > constant in order to keep it pointing to the same address.
> > mach_override already does this for relative CALL and JMP
> > instructions, but not for LEA. This should be fairly simple to fix.
> >
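The LEA fix-up described above amounts to recomputing the 32-bit displacement so that the moved instruction still refers to the same absolute address. The following is a rough C sketch of that arithmetic only; it is not mach_override's code, and the function names are hypothetical. It assumes the instruction length is already known (7 bytes for the LEA shown above).

```c
#include <stdint.h>

/* Absolute address referenced by a RIP-relative instruction:
 * address of the next instruction plus the signed displacement. */
static uint64_t rip_target(uint64_t insn_addr, unsigned insn_len,
                           int32_t disp) {
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}

/* Compute the displacement needed after copying the instruction from
 * old_addr to new_addr. Returns 0 on success, or -1 if the original
 * target is no longer within +/-2G of the new location. */
static int relocate_disp(uint64_t old_addr, uint64_t new_addr,
                         unsigned insn_len, int32_t old_disp,
                         int32_t *new_disp) {
    uint64_t target = rip_target(old_addr, insn_len, old_disp);
    int64_t d = (int64_t)(target - (new_addr + insn_len));
    if (d < INT32_MIN || d > INT32_MAX)
        return -1;
    *new_disp = (int32_t)d;
    return 0;
}
```

For the LEA above, rip_target(0x0020c49ba5d916e0, 7, 0xb4f01) gives 0x20c49ba5e465e8, which matches the address in the disassembly comment. Note that the fix-up can fail for the same reason as the jump itself: an island that is too far from the text segment cannot express the displacement in 32 bits.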
> > Note that the 32-bit variant crashes on another invalid address:
> >
> > ASAN:SIGSEGV
> > =================================================================
> > ==89768== ERROR: AddressSanitizer: SEGV on unknown address 0xcccccccc
> > (pc 0x00061f8c sp 0xbffa8bd0 bp 0xbffa8cc8 T0)
> > AddressSanitizer can not provide additional info.
> > #0 0x61f8b
> > (/Users/glider/src/gcc-asan/inst/lib/i386/libstdc++.6.dylib+0x3f8b)
> > #1 0x91391724 (/usr/lib/system/libdyld.dylib+0x2724)
> > #2 0x0
> > Stats: 0M malloced (0M for red zones) by 3 calls
> > Stats: 0M realloced by 0 calls
> > Stats: 0M freed by 0 calls
> > Stats: 0M really freed by 0 calls
> > Stats: 1M (256 full pages) mmaped in 2 calls
> > mmaps by size class: 7:4095; 8:2047;
> > mallocs by size class: 7:1; 8:2;
> > frees by size class:
> > rfrees by size class:
> > Stats: malloc large: 0 small slow: 2
> > ==89768== ABORTING
> >
> > My guess is that this is caused by the following code being moved to a
> > branch island:
> >
> > Dump of assembler code for function __cxa_throw:
> > 0x00008f60 <__cxa_throw+0>: push %esi
> > 0x00008f61 <__cxa_throw+1>: push %ebx
> > 0x00008f62 <__cxa_throw+2>: call 0x7a60 <__x86.get_pc_thunk.bx>
> >
> > Perhaps this makes __x86.get_pc_thunk.bx return an incorrect value.
> >
> > Since libstdc++-v3 is built together with gcc, the two issues related
> > to instructions being moved to another place can be solved by padding
> > __cxa_throw() with five NOP instructions (enough to hold a JMP). I
> > believe this should be acceptable, because the performance penalty for
> > additional NOPs is negligible, and __cxa_throw() isn't a hot path.
> >
> > On Thu, Nov 29, 2012 at 1:01 PM, Nick Kledzik <kledzik at apple.com> wrote:
> > > I debugged this a bit and it seems the mach_override patching of
> > __cxa_throw is bogus. The start of that function is patched to jump to
> > garbage.
> > >
> > > Breakpoint 1, 0x0000000100001c19 in main ()
> > > (gdb) display/i $pc
> > > 2: x/i $pc 0x100001c19 <main+318>: callq 0x100016386
> > <dyld_stub___cxa_throw>
> > > (gdb) si
> > > 0x0000000100016386 in dyld_stub___cxa_throw ()
> > > 2: x/i $pc 0x100016386 <dyld_stub___cxa_throw>: jmpq
> > *0xae1c(%rip) # 0x1000211a8
> > > (gdb)
> > > 0x0000000102244870 in __cxa_throw ()
> > > 2: x/i $pc 0x102244870 <__cxa_throw>: jmpq 0xffd27000
> > > (gdb) # the above is __cxa_throw in gcc's libstdc++.6.dylib. The
> > first instruction has been patched to jump to a garbage address.
> > >
> > > (gdb) x/8i 0x102244870-8
> > > 0x102244868
> > <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+56>:
> > std
> > > 0x102244869
> > <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+57>:
> > (bad)
> > > 0x10224486a
> > <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+58>:
> > decl (%rdi)
> > > 0x10224486c
> > <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+60>:
> > (bad)
> > > 0x10224486d
> > <_ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP17_Unwind_Exception+61>:
> > add %r8b,(%rax)
> > > 0x102244870 <__cxa_throw>: jmpq 0xffd27000
> > > 0x102244875 <__cxa_throw+5>: or (%rax),%eax
> > > 0x102244877 <__cxa_throw+7>: push %rbx
> > > (gdb)
> > > (gdb) watch *0x102244870
> > > Hardware watchpoint 2: *4330899568
> > > (gdb) r
> > >
> > > Old value = -788165304
> > > New value = -1373139991
> > > 0x0000000100016203 in __asan_mach_override_ptr_custom ()
> > > (gdb) bt
> > > #0 0x0000000100016203 in __asan_mach_override_ptr_custom ()
> > > #1 0x0000000100015a9e in __interception::OverrideFunction ()
> > > #2 0x00007fff5fc13378 in ImageLoaderMachO::doModInitFunctions ()
> > > #3 0x00007fff5fc13762 in ImageLoaderMachO::doInitialization ()
> > > #4 0x00007fff5fc1006e in ImageLoader::recursiveInitialization ()
> > > #5 0x00007fff5fc0feba in ImageLoader::runInitializers ()
> > > #6 0x00007fff5fc01fc0 in dyld::initializeMainExecutable ()
> > > #7 0x00007fff5fc05b04 in dyld::_main ()
> > > #8 0x00007fff5fc01397 in dyldbootstrap::start ()
> > > #9 0x00007fff5fc0105e in _dyld_start ()
> > > (gdb) x/8i 0x102244870
> > > 0x102244870 <__cxa_throw>: jmpq 0xffd27000
> > > 0x102244875 <__cxa_throw+5>: or (%rax),%eax
> > > 0x102244877 <__cxa_throw+7>: push %rbx
> > > 0x102244878 <__cxa_throw+8>: lea -0x20(%rdi),%rbx
> > > 0x10224487c <__cxa_throw+12>: mov %rsi,-0x70(%rdi)
> > > # Here is where the patching is being done
> > >
> > > -Nick
> > >
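The old and new watchpoint values in the gdb session above are signed 32-bit integers, which hides the instruction bytes they represent. A small C sketch (the helper name is hypothetical) that lays the values out in little-endian memory order makes the patch easier to read:

```c
#include <stdint.h>

/* Write the four bytes of a 32-bit value in the order they appear
 * in memory on little-endian x86. */
static void le_bytes(int32_t value, uint8_t out[4]) {
    uint32_t u = (uint32_t)value;
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(u >> (8 * i));
}
```

Applied to the old value -788165304 this yields 48 8d 05 d1, the start of a RIP-relative "lea disp32(%rip),%rax". Applied to the new value -1373139991 it yields e9 8b 27 ae, that is, the E9 opcode of a jmp rel32 followed by the first three displacement bytes. This agrees with the disassembly, where the original LEA at __cxa_throw+0 has been overwritten by a 5-byte jump and its leftover bytes decode as the stray "or (%rax),%eax".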
> > > On Nov 29, 2012, at 11:07 AM, Alexander Potapenko wrote:
> > >>> On Thu, Nov 29, 2012 at 9:55 PM, Jack Howarth <
> > howarth at bromo.med.uc.edu>
> > >>> wrote:
> > >>>>
> > >>>> Nick,
> > >>>> Can you take a quick look at the asan_eh_bug.tar.bz testcase
> > >>>> I uploaded into the newly opened radr://12777299, "potential
> > >>>> pthread/eh bug exposed by libsanitizer". The FSF gcc developers
> > >>>> have ported llvm.org's asan code into FSF gcc (and are keeping
> > >>>> it synced to the upstream llvm.org code). I have been helping
> > >>>> with the darwin build and testing -fsanitize=address against the
> > >>>> complete FSF gcc testsuite. This seems to have exposed a potential
> > >>>> bug in pthread or eh on darwin under libasan. Hundreds of test cases
> > >>>> in the g++ and libstdc++ testsuites fail under -fsanitize=address
> > >>>> in the following manner...
> > >>>>
> > >>>> ASAN:SIGSEGV
> > >>>> =================================================================
> > >>>> ==2738== ERROR: AddressSanitizer: SEGV on unknown address
> > 0x0000ffd27000
> > >>>> (pc 0x0000ffd27000 sp 0x7fff55e40828 bp 0x7fff55e408f0 T0)
> > >>>> AddressSanitizer can not provide additional info.
> > >>>> #0 0xffd26fff
> > (/Users/howarth/asan_eh_bug/./cond1_asan.exe+0xf5f67fff)
> > >>>> #1 0x7fff8bd827e0 (/usr/lib/system/libdyld.dylib+0x27e0)
> > >>>> #2 0x0
> > >>>> Stats: 0M malloced (0M for red zones) by 3 calls
> > >>>> Stats: 0M realloced by 0 calls
> > >>>> Stats: 0M freed by 0 calls
> > >>>> Stats: 0M really freed by 0 calls
> > >>>> Stats: 1M (384 full pages) mmaped in 3 calls
> > >>>> mmaps by size class: 7:4095; 8:2047; 9:1023;
> > >>>> mallocs by size class: 7:1; 8:1; 9:1;
> > >>>> frees by size class:
> > >>>> rfrees by size class:
> > >>>> Stats: malloc large: 0 small slow: 3
> > >>>> ==2738== ABORTING
> > >>>>
> > >>>> The failure of...
> > >>>>
> > >>>> FAIL: g++.dg/eh/cond1.C -std=c++98 execution test
> > >>>>
> > >>>> was used as the test case for the radar report and compiled with...
> > >>>>
> > >>>> g++-fsf-4.8 -static-libasan -fsanitize=address -std=c++98 cond1.C -g
> > -O0
> > >>>> -o cond1_asan.exe
> > >>>>
> > >>>> to produce the above failure. When compiled without libasan as...
> > >>>>
> > >>>> g++-fsf-4.8 -std=c++98 cond1.C -g -O0 -o cond1_no_asan.exe
> > >>>>
> > >>>> the resulting executable runs fine. Debugging this in gdb seems to
> > show
> > >>>> that the failure
> > >>>> is occurring in the final call to dyld_stub_pthread_once (). The same
> > test
> > >>>> case
> > >>>> compiles fine with -fsanitize=address under llvm 3.2 clang++ and
> > produces
> > >>>> no runtime errors
> > >>>> but the code execution path is very different in that case (because
> > of the
> > >>>> different
> > >>>> libstdc++).
> > >>>> Can you take a quick peek at this and determine if this is a darwin
> > >>>> pthread or unwinder
> > >>>> bug or an issue with libasan that FSF gcc's compiler is exposing?
> > Thanks
> > >>>> in advance for
> > >>>> any help on this.
> > >>>> Jack
> > >>>> _______________________________________________
> > >>>> LLVM Developers mailing list
> > >>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> > >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> > >>>
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >> Alexander Potapenko
> > >> Software Engineer
> > >> Google Moscow
> > >
> >
> >
> >
> > --
> > Alexander Potapenko
> > Software Engineer
> > Google Moscow
> >