[PATCH] D153771: [BOLT][Instrumentation] Fix hash table memory corruption and append-pid option

Denis Revunov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 26 07:37:44 PDT 2023


treapster created this revision.
treapster added a project: bolt.
Herald added a reviewer: rafauler.
Herald added a subscriber: ayermolo.
Herald added a reviewer: Amir.
Herald added a reviewer: maksfb.
Herald added a project: All.
treapster requested review of this revision.
Herald added subscribers: llvm-commits, yota9.
Herald added a project: LLVM.

This diff fixes a few related issues:

Shared counters when using instrumentation-file-append-pid.  
The point of append-pid option is to record separate profiles for separate forks, which is impossible when counters are the same for every process. It leads to a sum of all profiles in every file, plus GlobalWriteProfileMutex located in a shared memory prevents some processes from dumping their data at all. So, in this patch we only map counters as shared when append-pid is not used, and provide a test to ensure that different processes don't pollute each other's profiles.

Hash table corruption  
In absence of instrumentation-file-append-pid option, global allocator uses shared pages for allocation. However, since it is a global variable, it gets COW'd after fork if instrumentation-sleep-time is used, or each time a process forks by itself. This means it handles the same pages to every process which causes hash table corruption - different entries overwrite each other, sometimes creating endless cycles. Thus, if we want shared pages, we need to put the allocator itself in a shared page, which we do in patch commit in __bolt_instr_setup.

Unspecified behavior of instrumentation-file-append-pid combined with instrumentation-{sleep-time,wait-forks}  
The point of instrumentation-sleep-time option is to have a watcher process which shares memory with all other forks and dumps a common profile each n seconds. The append-pid, is the opposite - it should record a private profile of each process. Combining the two suggests that we should get a private profile of each fork every n seconds, but such behavior is not implemented currently and is not easy to implement in general, because we somehow need to intercept each individual fork, launch a watcher process just for that fork, and also map counters so that they're only shared with that single fork. Since we're not doing that, the most reasonable thing to do seems to be to disallow such combination of options. I can make a separate diff for that if you think it doesn't fit here.

Also, while debugging all that a created a simple report() function to understand what's happening, which i include it here in case other hash table issues arise.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D153771

Files:
  bolt/lib/RuntimeLibs/InstrumentationRuntimeLibrary.cpp
  bolt/runtime/common.h
  bolt/runtime/instr.cpp
  bolt/test/runtime/instrumentation-indirect-2.c

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D153771.534547.patch
Type: text/x-patch
Size: 15008 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230626/5eb650e1/attachment-0001.bin>


More information about the llvm-commits mailing list