[PATCH] D27656: [tsan] Implement __tsan_get_alloc_stack to query the allocation stack of a heap pointer

Thu Dec 15 04:09:50 PST 2016

dvyukov added inline comments.

================
Comment at: lib/tsan/rtl/tsan_debugging.cc:168
+int __tsan_get_alloc_stack(uptr addr, uptr *trace, uptr size, int *thread_id,
+                           uptr *os_id) {
+  MBlock *b = 0;
----------------
kubabrecka wrote:
> dvyukov wrote:
> > Since the interface is different from asan, it may makes sense to return block begin and size as well. What do you think?
> Makes sense.  What about going further and adding a TSan version of `__asan_locate_address` (see asan_debugging.cc), which will "describe" a pointer and return its bounds if known, i.e. having two APIs:
> 
>     __tsan_locate_address(addr) ... returns region name, start address, block size
>     __tsan_get_alloc_stack(addr) ... returns trace, thread ID
> 
> Does that sound better or worse than having just one
> 
>     __tsan_get_alloc_stack(addr) ... returns trace, thread ID, start address and block size
> 
> ?
__tsan_locate_address(addr) ... returns region name, start address, block size
__tsan_get_alloc_stack(addr) ... returns trace, thread ID

sounds better, more orthogonal and some use cases may not need stack/thread

================
Comment at: lib/tsan/rtl/tsan_debugging.cc:181
+  ThreadContextBase *tctx = ctx->thread_registry->GetThreadLocked(b->tid);
+  *os_id = tctx->os_id;
+
----------------
kubabrecka wrote:
> dvyukov wrote:
> > os_id can be stale if status is not ThreadStatusRunning
> > I think it's better to return something more predictable in such case, e.g. 0.
> Hm.  At least on Darwin, GetTid() returns a unique ID for the current boot of the OS.  So it's correct even after the thread finished, and even after the process has ended (as long as you don't reboot).  Is gettid() on Linux different?
> 
> I'm already using os_id to "match" threads in TSan reports:  If we see the same os_id, we know it's the same thread even when that thread doesn't exist anymore.  I'd like not to lose this feature.
> 
> Does Linux reuse the thread ID soon enough for this to be a problem?  If yes, how about we reset the os_id in the ThreadRegistry when a thread ends (for Linux, and keep the value for Darwin)?
Linux reuses PIDs FIFO, but it still reuses them.
On closer inspection, there is a more realistic source of bogus values -- tid's are recycled, and ThreadContext we obtain can be for a different thread. Tid does not uniquely identify ThreadContext, it's tid+epoch that does. But we don't have epoch in MBLock.
So I think it's fine to leave code as is.

Repository:
  rL LLVM

https://reviews.llvm.org/D27656