[llvm] [llvm-profgen][SPGO] Support profiles with multiple concurrent processes (PR #169353)

via llvm-commits llvm-commits at lists.llvm.org
Fri Dec 12 07:50:39 PST 2025


================
@@ -198,7 +198,14 @@ class ProfiledBinary {
   // Options used to configure the symbolizer
   symbolize::LLVMSymbolizer::Options SymbolizerOpts;
   // The runtime base address that the first executable segment is loaded at.
-  uint64_t BaseAddress = 0;
+  // The binary may be loaded at different addresses in different processes,
+  // so we use a map to store base address by PID.
+  // If the profile doesn't contain PID info we use the default PID value
+  // (DefaultPID defined in PerfReaderBase).
+  std::unordered_map<int32_t, uint64_t> BaseAddressByPID;
----------------
Heath123 wrote:

Okay, I didn't consider forking; it looks like forking changes the PID, but doesn't generate a new `PERF_RECORD_MMAP2` event. I'll need to update the code to listen for `PERF_RECORD_FORK`, copy the base address to the new PID, and write tests for that.
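As a rough illustration of the fork handling described above (not the code in this PR), the sketch below copies the parent's base address to the child PID when a fork is seen. `BaseAddressTable`, `onMMap`, `onFork` and `lookup` are hypothetical names, and the sketch assumes the parent and child PIDs have already been parsed out of the `PERF_RECORD_FORK` event.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>

// Hypothetical stand-in for the per-PID state; this is not the actual
// ProfiledBinary class from llvm-profgen.
class BaseAddressTable {
public:
  // Record the load address reported by an mmap event for this PID.
  void onMMap(int32_t PID, uint64_t BaseAddress) {
    BaseAddressByPID[PID] = BaseAddress;
  }

  // A forked child inherits the parent's memory layout, so copy the parent's
  // base address to the child. Returns false if the parent is unknown.
  bool onFork(int32_t ParentPID, int32_t ChildPID) {
    auto It = BaseAddressByPID.find(ParentPID);
    if (It == BaseAddressByPID.end())
      return false;
    // Copy the value first: inserting the child may rehash the map.
    uint64_t ParentBase = It->second;
    BaseAddressByPID[ChildPID] = ParentBase;
    return true;
  }

  // Return the base address for a PID, if one has been recorded.
  std::optional<uint64_t> lookup(int32_t PID) const {
    auto It = BaseAddressByPID.find(PID);
    if (It == BaseAddressByPID.end())
      return std::nullopt;
    return It->second;
  }

private:
  std::unordered_map<int32_t, uint64_t> BaseAddressByPID;
};
```

With this shape, a child that later loads a different image would simply overwrite its inherited entry through `onMMap`.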

The intent of the PR is to support profiles that contain multiple separately invoked instances of the same binary. For example, it can be used to profile a build where ninja invokes Clang many times.

The base address is not a property of the binary itself, but is decided when the binary is loaded. While a fork would copy the existing memory layout, separate invocations will cause this to change due to address space layout randomisation.

e.g.
```
$ perf script --show-mmap-events -ispgo.data | grep PERF_RECORD_MMAP2 | grep clang-22 | head -n 10
         clang++  884158 2271036.130764: PERF_RECORD_MMAP2 884158/884158: [0x5aa9b8aee000(0x4446000) @ 0x1f30000 103:02 17175371 442143544]: r-xp /path/removed/bin/clang-22
         clang++  884160 2271036.131184: PERF_RECORD_MMAP2 884160/884160: [0x6012266cb000(0x4446000) @ 0x1f30000 103:02 17175371 442143544]: r-xp /path/removed/bin/clang-22
         clang++  884162 2271036.131287: PERF_RECORD_MMAP2 884162/884162: [0x639094f1e000(0x4446000) @ 0x1f30000 103:02 17175371 442143544]: r-xp /path/removed/bin/clang-22
         clang++  884164 2271036.131608: PERF_RECORD_MMAP2 884164/884164: [0x586d53654000(0x4446000) @ 0x1f30000 103:02 17175371 442143544]: r-xp /path/removed/bin/clang-22
         clang++  884166 2271036.131697: PERF_RECORD_MMAP2 884166/884166: [0x578c30f63000(0x4446000) @ 0x1f30000 103:02 17175371 442143544]: r-xp /path/removed/bin/clang-22
         clang++  884168 2271036.131975: PERF_RECORD_MMAP2 884168/884168: [0x5cabe1dd0000(0x4446000) @ 0x1f30000 103:02 17175371 442143544]: r-xp /path/removed/bin/clang-22
         clang++  884170 2271036.132125: PERF_RECORD_MMAP2 884170/884170: [0x590878777000(0x4446000) @ 0x1f30000 103:02 17175371 442143544]: r-xp /path/removed/bin/clang-22
         clang++  884172 2271036.132365: PERF_RECORD_MMAP2 884172/884172: [0x571e17705000(0x4446000) @ 0x1f30000 103:02 17175371 442143544]: r-xp /path/removed/bin/clang-22
         clang++  884174 2271036.132528: PERF_RECORD_MMAP2 884174/884174: [0x63e26251b000(0x4446000) @ 0x1f30000 103:02 17175371 442143544]: r-xp /path/removed/bin/clang-22
         clang++  884176 2271036.132782: PERF_RECORD_MMAP2 884176/884176: [0x5c7d63604000(0x4446000) @ 0x1f30000 103:02 17175371 442143544]: r-xp /path/removed/bin/clang-22
```
The same Clang binary is loaded at `0x5aa9b8aee000`, `0x6012266cb000`, `0x639094f1e000` and so on due to ASLR.
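To make the consequence concrete, here is a minimal sketch (an illustration, not the PR's implementation) of why the base address has to be looked up per PID: the same code location has a different runtime address in each process, and only subtracting that process's own load address gives a consistent offset. `toLoadOffset` is a hypothetical helper name.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>

// Hypothetical helper: convert a sampled runtime address into an offset from
// the load address of the process it was sampled in. Returns std::nullopt if
// the PID is unknown or the address lies below that process's load address.
std::optional<uint64_t>
toLoadOffset(const std::unordered_map<int32_t, uint64_t> &BaseAddressByPID,
             int32_t PID, uint64_t RuntimeAddress) {
  auto It = BaseAddressByPID.find(PID);
  if (It == BaseAddressByPID.end() || RuntimeAddress < It->second)
    return std::nullopt;
  return RuntimeAddress - It->second;
}
```

For instance, with the first two mappings from the listing above, `0x5aa9b8aee000 + 0x1234` in PID 884158 and `0x6012266cb000 + 0x1234` in PID 884160 both resolve to offset `0x1234`, whereas a single shared base address would get at most one of them right.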

Loading at multiple addresses now works with my PR, but it seems I accidentally broke forking, and this will need to be fixed.

https://github.com/llvm/llvm-project/pull/169353

