[llvm] [BOLT] Heatmap fix on large binaries and printing mappings (PR #92815)

Paschalis Mpeis via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 25 02:18:07 PDT 2024


https://github.com/paschalis-mpeis updated https://github.com/llvm/llvm-project/pull/92815

From ff020d9ae49deae831e07922cad0f738d9ee4cef Mon Sep 17 00:00:00 2001
From: "Paschalis Mpeis (aws-mem-aarch64)" <paschalis.mpeis at arm.com>
Date: Wed, 24 Apr 2024 08:25:59 +0000
Subject: [PATCH] [BOLT] Fix heatmaps on large BOLTed binaries.

Large binaries get two text segments mapped when loaded into memory, i.e.:
```
2f7200000-2fabca000 r--p 00000000        bolted-binary <- 1st text segment
2fabd9000-2fe47c000 r-xp 039c9000        bolted-binary
2fe48b000-2fe61d000 r--p 0727b000        bolted-binary
2fe62c000-2fe660000 rw-p 0740c000        bolted-binary
2fe660000-2fea4c000 rw-p 00000000
2fec00000-303dad000 r-xp 07a00000        bolted-binary <- 2nd text segment (BOLTed binary only)
```

BOLT processes only the first mapping, which does not have a correct
BaseAddress, so the size recorded in BinaryMMapInfo is computed incorrectly.

Consequently, BOLT wrongly thinks that many of the samples fall outside
the binary and ignores them. As a result, the computed heatmap is
incomplete, and the section hotness statistics are wrong.
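
The effect can be sketched in isolation. The snippet below is a minimal
illustration, not BOLT's code: `Mapping`, `recordMapping` and `insideBinary`
are hypothetical names, the struct only mirrors the fields this patch reads,
and the range check is an assumed simplification of how samples are filtered.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical stand-in for the fields the patch reads; not BOLT's real type.
struct Mapping {
  int64_t PID;
  uint64_t BaseAddress;
  uint64_t MMapAddress;
  uint64_t Size;
};

// Record a mapping. If the PID is already known, extend the recorded size so
// that it reaches the end of the new mapping (same arithmetic as the patch).
void recordMapping(std::map<int64_t, Mapping> &Known, const Mapping &M) {
  auto Result = Known.insert(std::make_pair(M.PID, M));
  if (Result.second)
    return; // First mapping seen for this PID: stored as-is.
  Mapping &Existing = Result.first->second;
  const uint64_t EndAddress = M.MMapAddress + M.Size;
  assert(EndAddress >= Existing.BaseAddress && "mapping ends below the base");
  const uint64_t Size = EndAddress - Existing.BaseAddress;
  if (Size != Existing.Size)
    Existing.Size = Size;
}

// Assumed filter: a sample belongs to the binary if it lies in
// [BaseAddress, BaseAddress + Size).
bool insideBinary(const Mapping &M, uint64_t Addr) {
  return Addr >= M.BaseAddress && Addr < M.BaseAddress + M.Size;
}
```

With the addresses from the listing above, and assuming for illustration a
base address of 0x2f7200000 and an initial size that ends at 0x2fe47c000, a
sample at 0x2ff000000 is rejected after the first call to `recordMapping`.
Once the mapping at 0x2fec00000 is recorded, the size extends to 0x303dad000
and the same sample is accepted.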

This bug is present in both the AArch64 and x86 backends.
---
 bolt/lib/Profile/DataAggregator.cpp | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index ce6ec0a04ac16..1bac83d200961 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -2009,9 +2009,6 @@ std::error_code DataAggregator::parseMMapEvents() {
       return MI.second.PID == FileMMapInfo.second.PID;
     });
 
-    if (PIDExists)
-      continue;
-
     GlobalMMapInfo.insert(FileMMapInfo);
   }
 
@@ -2067,7 +2064,21 @@ std::error_code DataAggregator::parseMMapEvents() {
       }
     }
 
-    BinaryMMapInfo.insert(std::make_pair(MMapInfo.PID, MMapInfo));
+    // In some larger binaries, the loaded binary gets a second text segment
+    // memory mapped, right after its read-write segments. Below, we encounter
+    // and process such text segments, and recompute the size of the binary.
+    // When this happens, the correctly computed size comes from this second
+    // memory mapping, as the one processed earlier has an incorrect
+    // BaseAddress.
+    if (!BinaryMMapInfo.insert(std::make_pair(MMapInfo.PID, MMapInfo)).second) {
+      auto EndAddress = MMapInfo.MMapAddress + MMapInfo.Size;
+      auto Size = EndAddress - BinaryMMapInfo[MMapInfo.PID].BaseAddress;
+      if (Size != BinaryMMapInfo[MMapInfo.PID].Size) {
+        LLVM_DEBUG(outs() << "MMap size fixed: " << Twine::utohexstr(Size)
+                          << " \n");
+        BinaryMMapInfo[MMapInfo.PID].Size = Size;
+      }
+    }
   }
 
   if (BinaryMMapInfo.empty()) {


