[PATCH] D12554: tsan: speed up race deduplication

Dmitry Vyukov via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 3 04:20:28 PDT 2015


dvyukov added inline comments.

================
Comment at: lib/tsan/rtl/tsan_interface_ann.cc:66
@@ -65,3 +65,3 @@
   ExpectRace *prev;
-  int hitcount;
-  int addcount;
+  atomic_uintptr_t hitcount;
+  atomic_uintptr_t addcount;
----------------
samsonov wrote:
> I don't see why you're not using atomic_uint32_t here, but leaving this to you.
Because there is no reason to use atomic_uint32_t.

================
Comment at: lib/tsan/rtl/tsan_interface_ann.cc:163
@@ -161,3 +162,3 @@
     (*unique_count)++;
-    if (race->*counter == 0)
+    if (atomic_load_relaxed(&(race->*counter)) == 0)
       continue;
----------------
samsonov wrote:
> Any reason for not hoisting all the loads from race->*counter here?
done

================
Comment at: lib/tsan/rtl/tsan_rtl_report.cc:384
@@ -391,2 +383,3 @@
           tid, (uptr)epoch, (uptr)ebegin, (uptr)eend, partidx);
-  InternalScopedBuffer<uptr> stack(kShadowStackSize);
+  Vector<uptr> stack(MBlockReportStack);
+  stack.Resize(hdr->stack0.size + 64);
----------------
samsonov wrote:
> Why do you need this change? If you want to improve performance and get rid of mmaps, consider changing `__tsan::VarSizeStackTrace` to somehow take ownership of the buffer passed into it. Or just make it use Vector internally. 
__tsan::VarSizeStackTrace is OK: it uses internal_alloc, which is cheap for small blocks.
The issue is exactly with this call: it allocates a huge chunk of memory for the worst case, and internal_alloc falls back to mmap for huge blocks. To fix this, we need to avoid allocating a huge block up front. That's why I use a growable Vector.


http://reviews.llvm.org/D12554
