<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/114605>114605</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            SIGSEGV in __llvm_write_binary_ids due to bad NOTE pointers
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          cuviper
      </td>
    </tr>
</table>

<pre>
    Ref: https://bugzilla.redhat.com/show_bug.cgi?id=2322754

Fedora rawhide has a new version of binutils that ends up splitting NOTEs into separate LOAD segments, and one of those has an extra 0x10000 memory offset relative to its file offset. This isn't accounted for in `__llvm_write_binary_ids`, and on aarch64 it ends up trying to read *between* segments, which causes a SIGSEGV. I believe the code is wrong regardless of arch though.

Steps to Reproduce:
1. `echo 'int main() {}' >main.c`
2. `clang -fprofile-instr-generate -fcoverage-mapping main.c -o main`
3. `./main`

With the old binutils, we would get `readelf -Wl` like this:

```
Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R   0x8
  INTERP         0x00027c 0x000000000040027c 0x000000000040027c 0x00001b 0x00001b R   0x1
      [Requesting program interpreter: /lib/ld-linux-aarch64.so.1]
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0093b8 0x0093b8 R E 0x10000
  LOAD           0x00fcf8 0x000000000041fcf8 0x000000000041fcf8 0x006510 0x008788 RW  0x10000
  DYNAMIC        0x00fd38 0x000000000041fd38 0x000000000041fd38 0x0001e0 0x0001e0 RW  0x8
  NOTE           0x000238 0x0000000000400238 0x0000000000400238 0x000044 0x000044 R   0x4
  GNU_EH_FRAME   0x0078c4 0x00000000004078c4 0x00000000004078c4 0x000474 0x000474 R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x00fcf8 0x000000000041fcf8 0x000000000041fcf8 0x000308 0x000308 R   0x1
```

After the update, the instrumented program crashes in `__llvm_write_binary_ids`. The headers look like:

```
Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x000230 0x000230 R   0x8
  INTERP         0x000294 0x0000000000400294 0x0000000000400294 0x00001b 0x00001b R   0x1
      [Requesting program interpreter: /lib/ld-linux-aarch64.so.1]
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0093f8 0x0093f8 R E 0x10000
  LOAD           0x00fcf8 0x000000000041fcf8 0x000000000041fcf8 0x0087a8 0x0087a8 RW  0x10000
  DYNAMIC        0x00fd38 0x000000000041fd38 0x000000000041fd38 0x0001e0 0x0001e0 RW  0x8
  NOTE           0x000270 0x0000000000400270 0x0000000000400270 0x000024 0x000024 R   0x4
  NOTE           0x018480 0x0000000000428480 0x0000000000428480 0x000020 0x000020 R   0x4
  GNU_EH_FRAME   0x007904 0x0000000000407904 0x0000000000407904 0x000474 0x000474 R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x00fcf8 0x000000000041fcf8 0x000000000041fcf8 0x000308 0x000308 R   0x1
```

It crashes because the following line computes the second "Note = 0x400000 + 0x018480 = 0x418480", which is between LOAD segments, and the note is actually loaded at `0x428480`.

https://github.com/llvm/llvm-project/blob/f54cdc5d6ee5532da117f2489c105148c94dcb39/compiler-rt/lib/profile/InstrProfilingPlatformLinux.c#L218-L219
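
To make the arithmetic concrete, here is a minimal sketch (not the compiler-rt code) that checks whether an address falls inside either LOAD segment from the new-binutils headers. The struct and function names are hypothetical, and it only compares against each segment's `[vaddr, vaddr + memsz)` range, ignoring page rounding by the loader.

```c
/* One PT_LOAD entry: file offset, virtual address, size in memory.
 * Hypothetical helper type for illustration, not an ELF definition. */
struct load_seg {
    unsigned long long offset;
    unsigned long long vaddr;
    unsigned long long memsz;
};

/* The two LOAD segments from the new-binutils `readelf -Wl` output. */
static const struct load_seg loads[] = {
    {0x000000ULL, 0x400000ULL, 0x0093f8ULL},
    {0x00fcf8ULL, 0x41fcf8ULL, 0x0087a8ULL},
};

/* Returns 1 if addr lies in some segment's [vaddr, vaddr + memsz) range. */
static int in_load_segment(unsigned long long addr) {
    unsigned n = sizeof(loads) / sizeof(loads[0]);
    for (unsigned i = 0; i != n; i++) {
        if (addr >= loads[i].vaddr) {
            if (loads[i].memsz > addr - loads[i].vaddr)
                return 1;
        }
    }
    return 0;
}

/* The file-style computation: ELF header address plus p_offset. */
static unsigned long long note_addr_from_offset(unsigned long long ehdr,
                                                unsigned long long p_offset) {
    return ehdr + p_offset;
}
```

With these values, `note_addr_from_offset(0x400000, 0x018480)` gives `0x418480`, which `in_load_segment` rejects, while the NOTE's actual `p_vaddr` of `0x428480` is accepted by the second LOAD.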

That code is in the first branch of an `if-else`, which by comments is intended for inspecting files, while the `else` is for inspecting memory. However, the condition for `memsz == filesz` is met regardless. The other case still wouldn't compute the right address either, because `ElfHeader + vaddr` would double-count the `0x400000` base address. That would probably be right for PIE or SOs though.

https://github.com/llvm/llvm-project/blob/f54cdc5d6ee5532da117f2489c105148c94dcb39/compiler-rt/lib/profile/InstrProfilingPlatformLinux.c#L227-L228
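
One way to get an address that works for both non-PIE and PIE is to go through a load bias. This is only a sketch under my own assumptions, not a proposed patch: the function names are hypothetical, and it assumes the LOAD segment containing file offset 0 is the one that maps the ELF header. The PIE base address in the usage note is made up for illustration.

```c
/* Load bias: runtime address of the ELF header minus the link-time
 * p_vaddr of the LOAD segment that contains file offset 0.
 * Hypothetical helper, assuming that segment maps the ELF header. */
static unsigned long long load_bias(unsigned long long ehdr_runtime,
                                    unsigned long long first_load_vaddr) {
    return ehdr_runtime - first_load_vaddr;
}

/* Runtime address of a NOTE: bias plus the NOTE's own p_vaddr. */
static unsigned long long note_runtime_addr(unsigned long long bias,
                                            unsigned long long note_vaddr) {
    return bias + note_vaddr;
}
```

For the non-PIE binary above the bias is `0x400000 - 0x400000 = 0`, so the second NOTE resolves to `0x428480`, whereas `ElfHeader + vaddr` would give `0x828480`. For a PIE whose first LOAD has `p_vaddr` 0 and which happens to be mapped at, say, `0x555555554000`, the bias equals the header address and a NOTE with `p_vaddr` `0x28480` resolves to `0x55555557c480`, matching `ElfHeader + vaddr` in that case.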

---

I reproduced this using Fedora's clang build, but the code in question hasn't changed since 2021 in commit f261e258ecc0fc5b8e8a70dbe45752d1bb3c2d69.
</pre>