[cfe-dev] Why clang needs to fork into itself?

Wed Jan 29 08:18:40 PST 2014

On 01/29/2014 06:18 AM, Yuri wrote:
> On 01/29/2014 02:57, Yury Gribov wrote:
>> I'm not sure e.g. exec calls realloc and I'm not sure what kinds of
>> crazy stuff CreateProcess may want to do behind your back.
>
> Create process calls (clone or fork) never call realloc, they are atomic
> system calls. They don't do any crazy stuff. Same should be true on
> windows. But, in the suggested way no additional process creation is
> required.

This is not quite true.  fork() is not async-signal-safe in a
multi-threaded process.  Quoting from POSIX 2008
(http://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html):

<quote>
When the application calls fork() from a signal handler and any of the 
fork handlers registered by pthread_atfork() calls a function that is 
not async-signal-safe, the behavior is undefined.
</quote>

In particular, glibc establishes pthread_atfork() handlers that can 
cause calls to fork() to hang in a signal handler (in a multi-threaded 
process).  I've experienced this many times when a thread was 
interrupted while in a call to malloc() (usually due to some memory 
mismanagement issue).  The hang occurs because one of the handlers
that glibc registers via pthread_atfork() attempts to acquire a lock
that the interrupted malloc() call already holds.

The Austin Group has a defect report tracking the addition of an _Fork() 
interface that would be async-signal-safe in all processes.  See 
http://austingroupbugs.net/view.php?id=18.  I'm not sure what the 
current state of that proposal is.

The above applies only to multi-threaded processes and only to the use
of fork() in a signal handler, so it does not necessarily apply here
since, as far as I know, the Clang driver is not currently multi-threaded.
However, the driver would have to be specifically built and linked 
without support for threads (whether they are created or not) to avoid 
problems with the use of fork() in a signal handler (or would need to 
use an equivalent to _Fork()).

Despite the above, I agree that writing a signal handler to produce 
defect reports from a crashing process is *possible*.  I've done it 
before (and had to write my own implementation of _Fork() - available as 
sigfork() at 
http://permalink.gmane.org/gmane.comp.standards.posix.austin.general/160). 
However, it is *extremely* difficult to make reliable across all
platforms and will very likely require platform-specific coding.
Reliable signal programming is *hard*.  And in this case, the
requirements are even stricter than async-signal-safety because
synchronous signals raised due to a processor exception (such as
SIGSEGV) cannot be blocked, which means that critical sections
protected by pthread_sigmask()/sigprocmask() may be interrupted.  This
means you can't even reliably call async-signal-safe functions from the
signal handler.

Tom.