<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/61370">61370</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [BOLT] Issue aggregating no-LBR perf samples on aarch64
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          aaupov
      </td>
    </tr>
</table>

<pre>
Host: Ubuntu 22.04, Ampere Altra A1 on Oracle Cloud.
I sampled a bootstrapped clang-17 binary built from recent trunk, using [aaupov/llvm-project/nolbr](https://github.com/aaupov/llvm-project/commits/nolbr) and [aaupov/llvm-devmtg-2022/altra](https://github.com/aaupov/llvm-devmtg-2022/tree/altra).

perf2bolt fails with:
```
PERF2BOLT-WARNING: unable to find base address of the binary when memory mapped at 0xaaaaac8c4000 using file offset 0x1e14000. Ignoring profile data for this mapping
PERF2BOLT-ERROR: could not find a profile matching binary "/tmp/tmp.E1ksjF8aZ3/BOLT_BASELINE/tools/clang/stage2-bins/bin/clang-17". Profile for the following binary name(s) is available:
 clang-17
...
```

The warning hints at the source of the issue:
https://github.com/llvm/llvm-project/blob/b884f4ef0a2de3d0f24111411dff663fd68c2eb0/bolt/lib/Profile/DataAggregator.cpp#L2055-L2063

I sprinkled printf debug statements to look at the calculation done in `getBaseAddressForMapping`:
https://github.com/llvm/llvm-project/blob/b884f4ef0a2de3d0f24111411dff663fd68c2eb0/bolt/lib/Core/BinaryContext.cpp#L1871-L1880
```
0x1e14000 is the FileOffset
the following lines show SegInfo.FileOffset, SegInfo.Alignment, and alignDown(SegInfo.FileOffset, SegInfo.Alignment)
offset 0x0, align 0x10000 => 0x0 == 0x1e14000
offset 0x1e14240, align 0x10000 => 0x1e10000 == 0x1e14000
offset 0x5877c60, align 0x10000 => 0x5870000 == 0x1e14000
offset 0x5c59ad8, align 0x10000 => 0x5c50000 == 0x1e14000
```
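
The comparison above can be reproduced outside of BOLT. This is only a minimal sketch of the arithmetic, not BOLT code: `align_down` is a stand-in for LLVM's `alignDown`, and the constants are copied from the debug output.

```python
def align_down(value: int, align: int) -> int:
    """Round value down to a multiple of align."""
    return value - (value % align)

MMAP_FILE_OFFSET = 0x1E14000  # file offset reported in the PERF2BOLT warning
P_ALIGN = 0x10000             # Align printed for every segment above

# Segment file offsets as printed by the debug statements.
segment_offsets = [0x0, 0x1E14240, 0x5877C60, 0x5C59AD8]

# Segments whose aligned-down file offset equals the mmap file offset.
matches = [off for off in segment_offsets
           if align_down(off, P_ALIGN) == MMAP_FILE_OFFSET]

for off in segment_offsets:
    print(f"offset {off:#x}, align {P_ALIGN:#x} => {align_down(off, P_ALIGN):#x}")
print("matching segments:", [hex(m) for m in matches])
```

No segment matches, which is consistent with the "unable to find base address" warning.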

Checking ELF segments:
```
Program Headers:
  Type           Offset   VirtAddr           PhysAddr FileSiz  MemSiz   Flg Align
  LOAD           0x1e14240 0x0000000001e24240 0x0000000001e24240 0x3a63a20 0x3a63a20 R E 0x10000
```

and the output of `perf script --show-mmap-events`:
```
        clang-17 179425 160317.159883: PERF_RECORD_MMAP2 179425/179425: [0xaaaaac8c4000(0x3a64000) @ 0x1e14000 08:01 2411305 3556563589]: r-xp /tmp/tmp.E1ksjF8aZ3/BOLT_BASELINE/tools/clang/stage2-bins/bin/clang-17
```

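Clearly, the calculation we use to find the base address is incorrect: a segment's Align does not necessarily imply that the mmapped address or file offset is aligned to that value. Here the mapping's file offset 0x1e14000 is a multiple of 0x1000 but not of 0x10000. The ELF spec doesn't mandate that either: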
> [p_align] integral power of 2, and p_vaddr should equal p_offset, modulo p_align.
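
One possible direction, shown here only as a hedged sketch and not as the actual BOLT fix, is to align by the page size the kernel used for the mapping instead of by p_align. The 4 KiB page size below is an assumption inferred from the mmap file offset 0x1e14000; `Segment` and `base_address_for_mapping` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional

class Segment(NamedTuple):
    file_offset: int  # p_offset
    vaddr: int        # p_vaddr
    align: int        # p_align (kept for reference, unused in the match)

def align_down(value: int, align: int) -> int:
    """Round value down to a multiple of align."""
    return value - (value % align)

def base_address_for_mapping(mmap_addr: int, mmap_offset: int,
                             segments: list, page_size: int) -> Optional[int]:
    """Match a mapping to a segment using page_size, then derive the base."""
    for seg in segments:
        if align_down(seg.file_offset, page_size) == mmap_offset:
            return mmap_addr - align_down(seg.vaddr, page_size)
    return None

PAGE_SIZE = 0x1000  # assumption: 4 KiB pages on the profiled host

# Executable LOAD segment from the program headers above.
text = Segment(file_offset=0x1E14240, vaddr=0x1E24240, align=0x10000)

# Address and file offset from the PERF_RECORD_MMAP2 event above.
base = base_address_for_mapping(0xAAAAAC8C4000, 0x1E14000, [text], PAGE_SIZE)
print(hex(base))  # prints 0xaaaaaaaa0000
```

Passing `text.align` (0x10000) as `page_size` makes the same call return `None`, which reproduces the failure described in this issue.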
</pre>