[lld] [lld][InstrProf] Sort startup functions for compression (PR #107348)

Thu Sep 5 13:16:52 PDT 2024

ellishg wrote:

> However, I'm wondering how much of an impact this will have depending on how many startup symbols there are compared to the total number of symbols. If there aren't many startup symbols, maybe shuffling them around won't really change performance or size much. But if there are a lot, this could potentially make startup slower while reducing the size more significantly. Could you share some info on how many startup symbols are involved in this experiment?

In these tests I ordered about 45K functions for startup and 2.1M functions for compression. However, I suspect the functions in the startup set are quite large because ordering them does impact compressed size. I did not measure the size of these sets in bytes.
I'm actually trying to record page faults on my local device to compare these modes. I will post the results when I get them.

> Also, I'm a bit confused about what `Remove Startup Hashes` actually means. Are you completely ignoring the startup set and just focusing on compression to measure the upper bound of the compressed size? Or are you still keeping the startup set but not considering the startup traces when ordering them?

I ignored the startup traces when ordering those startup symbols. The purpose was to show an upper bound for how bad page faults can get and how good compression can get. Basically, this shows that "Startup Compression" is halfway between ordering completely for startup and ordering completely for compression.

```diff
diff --git a/lld/MachO/BPSectionOrderer.cpp b/lld/MachO/BPSectionOrderer.cpp
index 71758070ddc8..92032af67daa 100644
--- a/lld/MachO/BPSectionOrderer.cpp
+++ b/lld/MachO/BPSectionOrderer.cpp
@@ -286,6 +286,7 @@ DenseMap<const InputSection *, size_t> lld::macho::runBalancedPartitioning(

   for (auto &[sectionIdx, compressionUns] : unsForStartupFunctionCompression) {
     auto &uns = startupSectionIdxUNs[sectionIdx];
+    uns.clear();
     uns.append(compressionUns);
     llvm::sort(uns);
     uns.erase(std::unique(uns.begin(), uns.end()), uns.end());
```
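To make the effect of the added `uns.clear()` concrete, here is a minimal standalone sketch. It assumes the diff above is the local change behind the "Remove Startup Hashes" numbers. The function name `mergeUtilityNodes`, the `removeStartupHashes` flag, and the use of `std::vector<uint64_t>` in place of the LLVM container are all hypothetical and not part of lld.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the loop body in the diff: merges the
// compression utility nodes into a section's startup utility nodes.
// When removeStartupHashes is true, the trace-derived nodes are dropped
// first, which mirrors the added `uns.clear()` line.
std::vector<uint64_t> mergeUtilityNodes(
    std::vector<uint64_t> startupUns,
    const std::vector<uint64_t> &compressionUns,
    bool removeStartupHashes) {
  if (removeStartupHashes)
    startupUns.clear(); // ignore the startup traces for this section

  // Append, sort, and deduplicate, as in the original loop.
  startupUns.insert(startupUns.end(), compressionUns.begin(),
                    compressionUns.end());
  std::sort(startupUns.begin(), startupUns.end());
  startupUns.erase(std::unique(startupUns.begin(), startupUns.end()),
                   startupUns.end());
  return startupUns;
}
```

With the flag off, the result is the sorted union of both node sets. With it on, only the compression nodes remain, so the startup functions are ordered purely by compression similarity.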

https://github.com/llvm/llvm-project/pull/107348