[PATCH] [LSan] Common leak checking module.

Wed May 15 04:49:39 PDT 2013

  > In general: can you split this CL even further and commit it in parts, each part with tests?

  The only way to split it that I can think of is into (ReportLeaked + everything needed for it to compile and pass tests) and (DoLeakCheck). That would lead to some painful merging though.

================
Comment at: lib/lsan/lsan_common.h:97
@@ +96,3 @@
+void ProcessGlobalRegions(InternalVector<uptr> *frontier);
+void ProcessSpecialAllocations(InternalVector<uptr> *frontier, Flags *flags);
+
----------------
Kostya Serebryany wrote:
> is this called anywhere at all? Do you need to include dead code into this CL?
Who says it's dead code? ClassifyAllChunks calls it.

================
Comment at: lib/lsan/lsan_common.cc:331
@@ +330,3 @@
+  bool is_directly_leaked = (tag == TAG_DIRECTLY_LEAKED);
+  for (uptr i = 0; i < leaks_.size(); i++)
+    if (leaks_[i].stack_trace_id == stack_trace_id &&
----------------
Kostya Serebryany wrote:
> stop this after e.g. 10000 leaks (configurable by a flag. can we reuse max_leaks?)
> otherwise you may die in quadratic algorithm w/o being able to report anything useful. 
The number of distinct leaks is typically in the dozens, so it's not that bad. Having a cut-off point here doesn't make a lot of sense because we'd end up reporting an arbitrary subset of leaks.

Generally I'm planning to replace this with an InternalHashmap (this is not the only place where we would need one).

================
Comment at: lib/lsan/lsan_common.cc:350
@@ +349,3 @@
+  InternalSort(&leaks_, leaks_.size(), IsLarger);
+  max_leaks = max_leaks > 0 ? Min(max_leaks, leaks_.size()) : leaks_.size();
+  for (uptr i = 0; i < max_leaks; i++) {
----------------
Kostya Serebryany wrote:
> you can simplify the logic if you set max_leaks to e.g. 10000 by default. 
Is removing this line really worth introducing another hard-coded limit?

================
Comment at: lib/lsan/lsan_common_linux.cc:99
@@ +98,3 @@
+  if (m.allocated() && m.tag() != TAG_REACHABLE) {
+    if (linker->containsAddress(GetCallerPC(m.stack_trace_id()))) {
+      m.set_tag(TAG_REACHABLE);
----------------
Kostya Serebryany wrote:
> is this functionality testable? 
> 
There's a test (LeakSanitizer.DynamicTLS) that fails if the call to ProcessSpecialAllocations is commented out.

================
Comment at: lib/lsan/lsan_common.cc:122
@@ +121,3 @@
+    if (!thread_found) {
+      // If a thread can't be found in the thread registry, it's probably in the
+      // process of destruction. Log this event and move on.
----------------
Kostya Serebryany wrote:
> isn't it better to continue here instead of using else?
> 
> if (!thread_found) {
> ...
>   continue;
> }
> // the rest w/o else
> 
> maybe also outline the loop body into ProcessOnThread() 
Ok. I also refactored the loop body somewhat.

================
Comment at: lib/lsan/lsan_common_linux.cc:11
@@ +10,3 @@
+// This file is a part of LeakSanitizer.
+// Implementation of common leak checking functionality.
+//
----------------
Kostya Serebryany wrote:
> update comment (about being linux-specific)
fixed

================
Comment at: lib/lsan/lsan_common_linux.cc:27
@@ +26,3 @@
+
+const char kLinkerName[] = "ld";
+char linker_placeholder[sizeof(LoadedModule)] ALIGNED(64);
----------------
Kostya Serebryany wrote:
> any reason to not make these static?
fixed

================
Comment at: lib/lsan/lsan_common_linux.cc:60
@@ +59,3 @@
+    uptr end = begin + phdr->p_memsz;
+    uptr allocator_begin, allocator_end;
+    GetAllocatorGlobalRange(&allocator_begin, &allocator_end);
----------------
Kostya Serebryany wrote:
> initialize with 0, just in case.
done

================
Comment at: lib/lsan/lsan_common.h:74
@@ +73,3 @@
+// Normal leak check. Find leaks and print a report according to flags.
+void DoLeakCheck();
+
----------------
Kostya Serebryany wrote:
> Sergey Matveev wrote:
> > Kostya Serebryany wrote:
> > > Can DoLeakCheck call ReportLeaked? 
> > > This way we will be testing the actual stuff, not some test code.
> > No, because there's boilerplate code in both functions (locks, stoptheworld).
> So? the boilerplate code is exactly the thing which I want *not* to duplicate.
Duplication is a completely separate issue. I factored out some of the boilerplate code. Still, ReportLeaked reports less information than DoLeakCheck needs (metadata is not included - and once we have resumed threads, we can't get that old metadata by chunk address anymore because it could have changed).

Eventually ReportLeaked would probably change to a two-stage algorithm (i.e., gather all leaked blocks and the associated metadata / pointer graph structure into a hashmap, and do any post-processing / apply suppressions later). When that happens, we could probably unify those two functions. However, 1) there's no real need for that until we have suppressions, 2) there are data structures that must be implemented first, 3) there might be performance implications that need to be considered.

http://llvm-reviews.chandlerc.com/D787