[llvm] [BOLT] Check if symbol is in data area of function (PR #160143)

Asher Dobrescu via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 10 09:01:19 PST 2025


Asher8118 wrote:

Hello, I looked further into this issue and wanted to share my findings. I first tried to detect data being accessed as code on the postCFG route, which wasn't successful. I then briefly looked at detecting it in the `buildCFG` phase, which looks more promising, but I am struggling to find any branches or calls that hit this issue from `buildCFG`.

I have also tried to find a better way to get the symbol address, hoping to make the warning more useful to the user:
```
+std::optional<uint64_t> BinaryContext::findSymbolAddr(const BinaryFunction &BF,
+                                        const MCSymbol *Symbol) const {
+  for (const auto &KV : BF.Labels)
+    if (KV.second == Symbol)
+      return BF.getAddress() + KV.first;
+
+  if (BF.Islands)
+    for (const auto &KV : BF.Islands->Offsets)
+      if (KV.second == Symbol)
+        return BF.getAddress() + KV.first;
+
+  return std::nullopt;
+}
+
```
But this was unsuccessful.

I've spent some more time looking over this issue with @paschalis-mpeis. We tried to understand under what circumstances we hit this issue. Below are the test cases we have looked at:

Example1.s:
```
.text
.global main
main:
nop
$d.L1:
.Lp:
bl L2
$x.L2:
L2:
ret
```
Example2.s:
```
.text
.global main
main: # text
        add     x0, x1, x1
        bl      main_d1
        ret
$d.main1:
main_d1:
        add     x0, x1, x1
        ret
```
Example3.s:
```
.text
.global main
main:# text
        add     x0, x1, x1
        bl      foo_d1
        ret
$d.main:
foo_d1:
        bl      foo_x1
        ret
$x.main:
foo_x1:
        add     x0, x1, x1
        ret
```
Example4.s:
```
.text
.global main
main:
$d.main:
        bl      bar2_x1
        ret
$x.bar2x1:
bar2_x1:
        add     x0, x1, x1
        ret
```
Example5.s:
```
.text
.global main
main:
$d.main:
        b      bar2_x1
        ret
$x.bar2x1:
bar2_x1:
        add     x0, x1, x1
        ret  
```
Example6.s:
```
.text
.global main
main:
$d.main:
        .word 0x1241234
        .word 0x1241234
$x.bar3x1:
bar3_x1:
        add     x0, x1, x1
        ret
```
Example7.s:
```
.text
.global main
$d.main:
main:
        bl      bar_x1
        ret
$x.barx1:
bar_x1:
        add     x0, x1, x1
        ret
```
edge_case1.s (as per Paschalis' comment above):
```
.text
.global main
$d.main:
main:
        nop
$x.main:
L2:
        ret
```
edge_case2.s:
```
.text
.global main
$d.main:
main:
$x.main:
L2:
        ret
```

From this list, we can split the tests into two categories: those that hit the issue and those that do not.
Tests that hit the issue: Examples 3, 4, 5, 6, 7 and edge_case1.s.
Examples 1 and 2, as well as edge_case2.s, do not hit it.

Based on the tests that do hit this issue, what they seem to have in common is that they branch (`b`/`bl`) from a data area into a code area. The only exception seems to be edge_case1.s, where nothing connects the data area to the code area. This case is a bit of a mystery, and I have been unable to ascertain what exactly makes it crash. However, I have only looked at that test through `objdump` and `nm`; perhaps looking at it inside BOLT would make more sense of it. So we have a few test cases that hit the issue for a consistent reason, while others do so for reasons that aren't clear. This problem might be more complex than it initially appeared.
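
To make the "branch from a data area into a code area" pattern concrete, here is a minimal standalone sketch. It is not BOLT code: `RegionKind`, `MappingSymbols`, `regionAt` and `isDataToCodeBranch` are hypothetical names, and it assumes that each `$d`/`$x` mapping symbol starts a data/code region that lasts until the next mapping symbol, with everything before the first mapping symbol treated as code. Offsets assume 4-byte AArch64 instructions.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical model (not BOLT's API): the kind of a region is chosen by
// the most recent AArch64 mapping symbol ($x = code, $d = data).
enum class RegionKind { Code, Data };

// Offset within the function -> kind of the region starting at that offset.
using MappingSymbols = std::map<uint64_t, RegionKind>;

// Returns the kind of the region containing Offset. Offsets before the
// first mapping symbol are treated as code (an assumption of this sketch).
RegionKind regionAt(const MappingSymbols &Syms, uint64_t Offset) {
  auto It = Syms.upper_bound(Offset); // first symbol strictly after Offset
  if (It == Syms.begin())
    return RegionKind::Code;
  return std::prev(It)->second;
}

// True when a branch located at SrcOffset lies in a data region and its
// target DstOffset lies in a code region.
bool isDataToCodeBranch(const MappingSymbols &Syms, uint64_t SrcOffset,
                        uint64_t DstOffset) {
  return regionAt(Syms, SrcOffset) == RegionKind::Data &&
         regionAt(Syms, DstOffset) == RegionKind::Code;
}

// Example3.s: $d.main at offset 12, $x.main at offset 20.
void checkExample3() {
  MappingSymbols Syms = {{12, RegionKind::Data}, {20, RegionKind::Code}};
  assert(isDataToCodeBranch(Syms, 12, 20)); // bl foo_x1 inside the data area
  assert(!isDataToCodeBranch(Syms, 4, 12)); // bl foo_d1 goes code -> data
}
```

Under this model the `bl L2` in Example1.s would also count as a data-to-code branch, so the predicate on its own does not separate the two groups of tests; it only captures the common pattern described above.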

https://github.com/llvm/llvm-project/pull/160143

