[llvm] [BOLT] Check if symbol is in data area of function (PR #160143)
Asher Dobrescu via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 10 09:01:19 PST 2025
Asher8118 wrote:
Hello, I looked into this issue some more and wanted to write up my findings. I first tried looking into data being accessed as code via the postCFG route, which wasn't successful. I then briefly looked into data being accessed as code at the `buildCFG` phase, which looks more promising, but I am struggling to find any branches or calls that hit this issue from `buildCFG`.
I have also tried to find a better way to get the symbol address, hoping to make the warning more useful to the user:
```
+std::optional<uint64_t> BinaryContext::findSymbolAddr(const BinaryFunction &BF,
+ const MCSymbol *Symbol) const {
+ for (const auto &KV : BF.Labels)
+ if (KV.second == Symbol)
+ return BF.getAddress() + KV.first;
+
+ if (BF.Islands)
+ for (const auto &KV : BF.Islands->Offsets)
+ if (KV.second == Symbol)
+ return BF.getAddress() + KV.first;
+
+ return std::nullopt;
+}
+
```
But this was unsuccessful.
I've spent some more time looking over this issue with @paschalis-mpeis. We tried to understand under what circumstances we hit this issue. Below are the test cases we looked at:
Example1.s:
```
.text
.global main
main:
nop
$d.L1:
.Lp:
bl L2
$x.L2:
L2:
ret
```
Example2.s:
```
.text
.global main
main: # text
add x0, x1, x1
bl main_d1
ret
$d.main1:
main_d1:
add x0, x1, x1
ret
```
Example3.s:
```
.text
.global main
main:# text
add x0, x1, x1
bl foo_d1
ret
$d.main:
foo_d1:
bl foo_x1
ret
$x.main:
foo_x1:
add x0, x1, x1
ret
```
Example4.s:
```
.text
.global main
main:
$d.main:
bl bar2_x1
ret
$x.bar2x1:
bar2_x1:
add x0, x1, x1
ret
```
Example5.s:
```
.text
.global main
main:
$d.main:
b bar2_x1
ret
$x.bar2x1:
bar2_x1:
add x0, x1, x1
ret
```
Example6.s:
```
.text
.global main
main:
$d.main:
.word 0x1241234
.word 0x1241234
$x.bar3x1:
bar3_x1:
add x0, x1, x1
ret
```
Example7.s:
```
.text
.global main
$d.main:
main:
bl bar_x1
ret
$x.barx1:
bar_x1:
add x0, x1, x1
ret
```
edge_case1.s (as per Paschalis' comment above):
```
.text
.global main
$d.main:
main:
nop
$x.main:
L2:
ret
```
edge_case2.s:
```
.text
.global main
$d.main:
main:
$x.main:
L2:
ret
```
From this list, we can split the tests into two categories: those that hit the issue and those that do not.
Tests that hit the issue: Examples 3, 4, 5, 6, 7 and edge_case1.s.
Examples 1 and 2, as well as edge_case2.s, do not hit this issue.
Based on the tests that do hit this issue, what they seem to have in common is that they branch (b/bl) from a data area into a code area. The only exception seems to be edge_case1.s, as there is nothing connecting the data area to the code area. This case is a bit of a mystery, and I have been unable to ascertain what exactly makes it crash. However, I have only looked at that test through `objdump` and `nm`; perhaps it would make more sense if I looked at it inside BOLT. So we have a few test cases that consistently hit the issue for an identifiable reason, while others do so for reasons that aren't clear. This problem might be more complex than it initially appeared.
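For readers following along, here is a minimal sketch of what "data area" and "code area" mean in the tests above, assuming the usual AArch64 mapping-symbol convention where `$d` opens a data region and `$x` opens a code region. The names `RegionKind`, `MappingSymbols`, `isInDataArea` and `example3Markers` are hypothetical and are not BOLT APIs. The offsets model Example3.s under the assumption that every instruction is 4 bytes, and treating offsets before the first marker as code is also an assumption.

```cpp
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical stand-in for a function's mapping symbols; not a BOLT type.
enum class RegionKind { Code, Data };

// Maps the offset of each mapping symbol ($x or $d) to the kind of region
// that starts at that offset.
using MappingSymbols = std::map<uint64_t, RegionKind>;

// Returns true if Offset lies in a region opened by a $d marker.
// Assumption: offsets before the first marker are treated as code.
bool isInDataArea(const MappingSymbols &Markers, uint64_t Offset) {
  auto It = Markers.upper_bound(Offset); // first marker strictly after Offset
  if (It == Markers.begin())
    return false; // no marker at or before Offset
  return std::prev(It)->second == RegionKind::Data;
}

// Marker layout of Example3.s, assuming 4-byte AArch64 instructions:
// main occupies offsets 0..11, foo_d1 starts at 12, foo_x1 starts at 20.
MappingSymbols example3Markers() {
  return {{12, RegionKind::Data},  // $d.main, labels foo_d1
          {20, RegionKind::Code}}; // $x.main, labels foo_x1
}
```

With this layout, the `bl foo_d1` in `main` targets offset 12, which the sketch classifies as data, while `foo_x1` at offset 20 is classified as code again.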
https://github.com/llvm/llvm-project/pull/160143