[PATCH] D32190: Make sure to scan mmap'd memory regions for root pointers on OS X

Wed Apr 19 10:11:40 PDT 2017

kubamracek added inline comments.

================
Comment at: lib/lsan/lsan_common_mac.cc:150-156
   while (err == KERN_SUCCESS) {
     struct vm_region_submap_info_64 info;
     err = vm_region_recurse_64(port, &address, &size, &depth,
                                (vm_region_info_t)&info, &count);
-    if (info.user_tag == VM_MEMORY_OS_ALLOC_ONCE) {
-      ScanRangeForPointers(address, address + size, frontier,
-                           "GLOBAL", kReachable);
-      return;
-    }
+    callback(frontier, address, size, info, arg);
     address += size;
   }
----------------
fjricci wrote:
> kubamracek wrote:
> > fjricci wrote:
> > > fjricci wrote:
> > > > kubamracek wrote:
> > > > > How many times is this called?  This should really be called only once per the whole scan, and it's hard to see if that's the case or not.
> > > > > 
> > > > > The naming is weird: `ProcessMemoryRegion` suggests that this processes some specific region, but in fact, this function *scans* for a new region.  I don't follow why this is called from ProcessPlatformSpecificRootRegion.
> > > > This would be called once when checking for the kernel alloc once page, and once when processing the root regions. I could probably change it to do both in one loop, if that's preferable, but it would require changing the ProcessPlatformSpecificAllocations API to take root region pointers (probably not a big deal, since only mac uses that API currently).
> > > > 
> > > > It needs to be called from ProcessPlatformSpecificRootRegion to find root regions contained within mmap'd memory regions.
> > > Actually that refactor would probably be even more complicated than this version, as we'd need to encapsulate the functionality of `ProcessRootRegions` (which iterates over the root regions, scanning each one). Currently we avoid duplicating this behavior by doing the platform-specific scan inside the call tree of that function.
> > Can you try doing something else then:  Create a `GetKernelAllocOnceRegion`, which will return the start and size of the VM_MEMORY_OS_ALLOC_ONCE region, and then it will cache the result; 2nd calls to this will just return the cached values.  Then you can call this function as many times as you need.
> I don't think that will change the performance at all, but perhaps the design here isn't super clear.
> 
> `ProcessMemoryRegion` is called (1 + num_root_regions) times, but it's only called with `AllocOncePageCb` one time. So there should be no duplication or caching benefit there.
> 
> I do see now how this could become slow if a large number of root regions was registered, since we'll iterate through all the pages for each root region. Perhaps I should refactor this such that `ProcessMemoryRegion` is only called once, and for each block does the `VM_MEMORY_OS_ALLOC_ONCE` check, in addition to iterating through the set of registered root regions to check if any are a match.
> 
> I think it comes down to whether we'd rather:
> 1) Iterate through the root regions once, and on each iteration, traverse the list of memory regions
> 2) Iterate through the list of memory regions once, and on each iteration run through all of the root regions.
> 
> My assumption is that 1) is simpler, but slower (and is what this patch does). 2) should be quite a bit more performant, but will also be a bit messier.
> 
> I'll put up a diff for 2) but save a copy of this one in case that turns out to be too ugly.
Right, my suggestion is not to address performance, but to improve readability, and to make it clear what each function does.  Currently, the name `ProcessMemoryRegion` doesn't suggest why we should be iterating over memory chunks with `vm_region_recurse_64`.

In any case, if the purpose of this loop is to find the alloc once region, we should have a separate function that does just that and nothing else.

https://reviews.llvm.org/D32190