[PATCH] D30818: [lsan] Don't handle DTLS of thread under destruction

Mon Apr 3 11:08:58 PDT 2017

m.ostapenko updated this revision to Diff 93888.
m.ostapenko added a comment.

It turned out I was wrong again. Looking at the verbose log more carefully, one can see that the segfault happens when **GetRegistersAndSP** fails with errno 3 (ESRCH):

  ==13657==Attached to thread 14860.
  ==13657==Attached to thread 14868.
  ==13657==Attached to thread 13349.
  ==13657==Attached to thread 13357.
  ==13657==Could not get registers from thread 13349 (errno 3).
  ==13657==Unable to get registers from thread 13349.
  Tracer caught signal 11: addr=0x7ff4b259b000 pc=0x4239d0 sp=0x7ff4b1d99d90
  ==13657==Process memory map follows:
  ...
          0x7ff4b1d9b000-0x7ff4b259b000    0x000000000003 (rw)
          0x7ff4b259b000-0x7ff4b259c000    0x000000000000
  ...
  ==13657==End of process memory map.
  ==13657==Detached from thread 14860.
  ==13657==Detached from thread 14868.
  ==13657==Could not detach from thread 13349 (errno 3).
  ==13657==Detached from thread 13357.
  ==14860==LeakSanitizer has encountered a fatal error.
  ==14860==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1
  ==14860==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc)

Although LSan successfully attached to thread 13349, it seems that this thread was then killed by a concurrent SIGKILL. Thus the stack boundaries extracted by **GetRegistersAndSP** are already invalid and we access "bad" memory (a guard page in this case).
According to the ptrace manual, a tracer that requests information from a ptrace-stopped thread must always be prepared to handle an ESRCH error:

  The tracer cannot assume that the ptrace-stopped tracee exists.
  There are many scenarios when the tracee may die while stopped (such
  as SIGKILL).  Therefore, the tracer must be prepared to handle an
  ESRCH error on any ptrace operation.  Unfortunately, the same error
  is returned if the tracee exists but is not ptrace-stopped (for
  commands which require a stopped tracee), or if it is not traced by
  the process which issued the ptrace call.  The tracer needs to keep
  track of the stopped/running state of the tracee, and interpret ESRCH
  as "tracee died unexpectedly" only if it knows that the tracee has
  been observed to enter ptrace-stop.  Note that there is no guarantee
  that waitpid(WNOHANG) will reliably report the tracee's death status
  if a ptrace operation returned ESRCH.  waitpid(WNOHANG) may return 0
  instead.  In other words, the tracee may be "not yet fully dead", but
  already refusing ptrace requests.

I'm adjusting the patch to handle ESRCH properly in LSan.
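
To illustrate the rule quoted from the manual, here is a minimal sketch of how a failed register read could be classified. This is not the code from the attached patch: the names RegsStatus, ClassifyRegsResult, ShouldScanThread and ShouldAbortScan are hypothetical and exist only for this example. The assumption is that the caller passes the ptrace return value, the errno captured right after the call, and a flag saying whether the tracee was actually observed entering ptrace-stop.

```cpp
#include <cerrno>

// Hypothetical status values, for illustration only.
enum class RegsStatus {
  kAvailable,   // registers were read; the stack pointer can be trusted
  kThreadGone,  // tracee died while stopped; skip it and keep going
  kFatal        // unexpected failure; the whole scan should be aborted
};

// rc:           return value of the register-read ptrace request
// err:          errno captured immediately after that request
// seen_stopped: true only if the tracer observed the tracee in ptrace-stop
//
// Per the ptrace manual, ESRCH means "tracee died unexpectedly" only when
// the tracee is known to have entered ptrace-stop; otherwise it may mean
// "not stopped" or "not traced by us", which is treated as fatal here.
constexpr RegsStatus ClassifyRegsResult(long rc, int err, bool seen_stopped) {
  return rc == 0                          ? RegsStatus::kAvailable
         : (err == ESRCH && seen_stopped) ? RegsStatus::kThreadGone
                                          : RegsStatus::kFatal;
}

// Only a thread with valid registers has a stack range worth scanning.
constexpr bool ShouldScanThread(RegsStatus s) {
  return s == RegsStatus::kAvailable;
}

// A dead thread is skipped silently; anything else unexpected stops the scan.
constexpr bool ShouldAbortScan(RegsStatus s) {
  return s == RegsStatus::kFatal;
}
```

With a classification like this, the case from the log above (thread 13349, errno 3 after a successful attach) would fall into the "skip this thread" bucket instead of leading to a scan over stale stack boundaries.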

Repository:
  rL LLVM

https://reviews.llvm.org/D30818

Files:
  lib/lsan/lsan_common.cc
  lib/sanitizer_common/sanitizer_stoptheworld.h
  lib/sanitizer_common/sanitizer_stoptheworld_linux_libcdep.cc
  lib/sanitizer_common/sanitizer_tls_get_addr.cc
  lib/sanitizer_common/sanitizer_tls_get_addr.h
