[PATCH] D45646: [tsan] Zero out the shadow memory for the stack and TLS in ThreadFinish

Tue Apr 17 11:32:52 PDT 2018

kubamracek added a comment.

In https://reviews.llvm.org/D45646#1068580, @dvyukov wrote:

> Two things that I don't like here:
>
> 1. This imposes cost of zeroing of up to 32MB (standard 8MB stack x 4x shadow) per thread creation/destruction for all OSes. Some programs create threads like insane.
> 2. I don't think this fixes the actual root cause, only makes it even harder to localize. Note that cur_thread_finalize already clears the shadow slot, so if pthread reuses stack/tls wholesale, then the slot should be zero already. However, tsan does not generally keep shadow clear (e.g. munmap does not clear shadow either, and most likely a bunch of other things). So if the slot reuses memory from a previous mmap, it will crash the same way. I wonder if moving the slot to _meta_ shadow is the right thing to do. We actually clear meta shadow on unmap. I don't see where we clear stack, but we should, otherwise we can leak lots of sync objects on stack.

Is it performance or memory usage you're worried about? Don't we *already* fill the shadow memory for the entire stack at thread creation? Shouldn't MemoryResetRange followed by DontNeedShadowFor release the pages back to the OS anyway?

The second problem is that I still don't know the exact root cause, nor am I able to reliably reproduce this in a small example. We have a giant app that triggers this after running for a long time, and so far my theory is that a thread's stack is allocated where a previous thread's stack used to be, but not in exactly the same region; the two merely overlap. In that case, the value left by the previous MemoryRangeImitateWrite in ThreadStart is still in the shadow slot.
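
To make the overlap theory concrete, here is a toy model of it. This is only a sketch, not the tsan runtime: ToyShadow, ToyStack, the fixed slot offset, and the stack addresses are all made up for illustration, and the real shadow mapping is more involved. It shows why clearing only the old thread's own slot leaves a stale value where a shifted, overlapping stack would place its slot, while zeroing the whole range would not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: one 64-bit shadow cell per application "byte".
// This is NOT the tsan runtime; every name here is hypothetical.
struct ToyShadow {
  std::vector<uint64_t> cells;
  explicit ToyShadow(size_t size) : cells(size, 0) {}

  // Stand-in for the imitated write over a new thread's stack.
  void ImitateWrite(size_t begin, size_t end, uint64_t value) {
    assert(begin <= end && end <= cells.size());
    for (size_t i = begin; i < end; ++i) cells[i] = value;
  }

  // Stand-in for zeroing a shadow range.
  void Reset(size_t begin, size_t end) { ImitateWrite(begin, end, 0); }
};

struct ToyStack {
  size_t begin;
  size_t end;
  // Assumption: the per-thread slot sits at a fixed offset from the start.
  size_t Slot() const { return begin + 8; }
};

// Returns true if the new thread finds a non-zero shadow value in its slot
// before its own start-up code has touched the shadow.
bool SlotIsDirtyAfterReuse(bool zero_whole_range_on_finish) {
  ToyShadow shadow(128);
  const ToyStack old_stack{0, 64};
  const ToyStack new_stack{16, 80};  // overlaps old_stack, shifted by 16

  // Old thread starts: its whole stack range is marked as written.
  shadow.ImitateWrite(old_stack.begin, old_stack.end, /*value=*/0xABCD);

  // Old thread finishes.
  if (zero_whole_range_on_finish) {
    shadow.Reset(old_stack.begin, old_stack.end);
  } else {
    // Only the old thread's own slot is cleared.
    shadow.Reset(old_stack.Slot(), old_stack.Slot() + 1);
  }

  // New thread looks at its slot, which lies inside the old stack range.
  return shadow.cells[new_stack.Slot()] != 0;
}
```

In this model SlotIsDirtyAfterReuse(false) reports a stale slot and SlotIsDirtyAfterReuse(true) does not. If the new stack were placed at exactly the old address, the slot-only clearing would be enough, which matches the "reuses stack/tls wholesale" case above.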

Repository:
  rCRT Compiler Runtime

https://reviews.llvm.org/D45646