[lld] [lld][ELF] Extend profile guided function ordering to ELF binaries (PR #117514)

Tue Dec 17 04:27:57 PST 2024

================
@@ -102,16 +102,41 @@ class BPSectionELF : public BPSectionBase {
                             &sectionToIdx) const override {
     constexpr unsigned windowSize = 4;
 
-    // Calculate content hashes
     size_t size = isec->content().size();
     for (size_t i = 0; i < size; i++) {
       auto window = isec->content().drop_front(i).take_front(windowSize);
       hashes.push_back(xxHash64(window));
     }
 
-    // TODO: Calculate relocation hashes.
-    // Since in ELF, relocations are complex, but the effect without them are
-    // good enough, we just use 0 as their hash.
+    for (const auto &r : isec->relocations) {
----------------
Colibrow wrote:

Hi Ellis,

I tested this on my project and realized the code I shared earlier was incorrect. When hashing `window + relocHash`, the loop needs to stop at the relocation's length (`r.length`), e.g., `2` for functions and `3` for data in Mach-O.
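
A minimal sketch of the loop shape described above, not the code from this PR. The names `Reloc`, `sectionHashes`, and the precomputed `hash` field are hypothetical, and FNV-1a is used only as a self-contained stand-in for `xxHash64`:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a relocation; not lld's actual type.
struct Reloc {
  uint64_t offset; // byte offset of the relocated field within the section
  uint32_t length; // assumed bound on the relocated span, e.g. 2 or 3
  uint64_t hash;   // assumed precomputed hash identifying the target
};

// FNV-1a, used here only as a substitute for xxHash64.
static uint64_t fnv1a(const std::string &s) {
  uint64_t h = 14695981039346656037ull;
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ull;
  }
  return h;
}

static std::vector<uint64_t> sectionHashes(const std::string &content,
                                           const std::vector<Reloc> &relocs) {
  constexpr size_t windowSize = 4;
  std::vector<uint64_t> hashes;

  // Content hashes: one per sliding window of up to windowSize bytes.
  for (size_t i = 0; i < content.size(); i++)
    hashes.push_back(fnv1a(content.substr(i, windowSize)));

  // Relocation hashes: re-hash every window that overlaps the relocated
  // bytes, mixing in the relocation's own hash.
  for (const Reloc &r : relocs) {
    if (r.length == 0 || r.offset >= content.size())
      continue;
    // First window whose last byte reaches r.offset.
    size_t start = r.offset < windowSize ? 0 : r.offset - windowSize + 1;
    // Stop at the relocation's length, clamped to the section size.
    size_t end = std::min<size_t>(r.offset + r.length, content.size());
    assert(start < end); // holds because r.offset < content.size()
    for (size_t i = start; i < end; i++)
      hashes.push_back(fnv1a(content.substr(i, windowSize)) + r.hash);
  }
  return hashes;
}
```

For an 8-byte section with one relocation at offset `4` and length `2`, this adds five relocation-aware hashes (windows starting at bytes 1 through 5) on top of the eight content hashes.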

I ran the test on my project, which builds an AArch64 ELF with the following relocation distribution:

| Relocation Type       | Count |
|-----------------------|-------|
| `R_AARCH64_ABS64`     | 387   |
| `R_AARCH64_GLOB_DAT`  | 34    |
| `R_AARCH64_JUMP_SLOT` | 711   |
| `R_AARCH64_RELATIVE`  | 9396  |

For testing, I hardcoded `r.length` as `3`, and here are the results:

| Binary Name                                        | Size (bytes) | Gzipped Size (bytes) |
|----------------------------------------------------|--------------|----------------------|
| `libsample-aarch64.so`                             | 3,181,560    | 1,512,245            |
| `libsample-aarch64-noreloc-compressed-function.so` | 3,181,560    | 1,487,043            |
| `libsample-aarch64-reloc-compressed-function.so`   | 3,181,560    | 1,487,032            |

Different `relType` values require different relocation forms, and the gain is minimal: with relocation hashes the gzipped size drops by only 11 more bytes (1,487,043 to 1,487,032). So I decided to revert the relocation-hash commit.

Do you have any ideas on how to proceed?


https://github.com/llvm/llvm-project/pull/117514