[llvm] [feature][riscv] handle target address calculation in llvm-objdump disassembly for riscv (PR #109914)
Arjun Patel via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 24 08:25:10 PST 2024
arjunUpatel wrote:
I am consistently failing one of the tests and have racked my brain quite a bit over it. Here is what is going on and where I need guidance:
The test I am failing is the 5th test in llvm-project/llvm/test/tools/llvm-objdump/X86/disassemble-same-section-addr.test. This particular test ensures that the absolute symbol is used for target address resolution if no symbol is found in the candidate section. Here is the structure of the test ELF file:
Sections:
  - Name:    .caller
    Type:    SHT_PROGBITS
    Flags:   [SHF_ALLOC, SHF_EXECINSTR]
    Address: 0x0
    Content: e800000000 ## Call instruction to next address.
  - Name:    .first
    Type:    SHT_PROGBITS
    Flags:   [SHF_ALLOC, SHF_EXECINSTR]
    Address: 0x5
    Size:    [[SIZE1]]
  - Name:    .second
    Type:    SHT_PROGBITS
    Flags:   [SHF_ALLOC, SHF_EXECINSTR]
    Address: 0x5
    Size:    [[SIZE2]]
Symbols:
  - Name:    target
    Section: [[SECTION]]
    Value:   0x5
  - Name:    other
    Index:   [[INDEX]]
    Value:   0x0
Here is the disassembly of the file when the parameters are set in the following way (according to test 5):
SIZE1=1, SIZE2=0, SECTION=.caller, INDEX=SHN_ABS
Disassembly of section .caller:
0000000000000000 <.caller>:
0: e8 00 00 00 00 callq 0x5
Disassembly of section .first:
0000000000000005 <.first>:
5: 00 <unknown>
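For reference, the 0x5 operand shown for callq follows from the x86 near-call encoding: opcode e8 followed by a signed 32-bit little-endian displacement, taken relative to the address of the next instruction. A minimal Python sketch of that arithmetic (the helper name call_rel32_target is made up for illustration and is not an llvm-objdump function):

```python
import struct


def call_rel32_target(insn_addr: int, insn_bytes: bytes) -> int:
    """Return the target of an x86 `call rel32` (opcode 0xE8).

    Hypothetical helper for illustration only; it handles just this
    one 5-byte encoding.
    """
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE8:
        raise ValueError("expected a 5-byte e8 call instruction")
    # The displacement is a signed 32-bit little-endian integer.
    (rel32,) = struct.unpack("<i", insn_bytes[1:])
    # It is relative to the address of the *next* instruction.
    next_insn = insn_addr + len(insn_bytes)
    return (next_insn + rel32) & 0xFFFFFFFFFFFFFFFF


# The call in .caller sits at address 0x0 with bytes e8 00 00 00 00.
caller_target = call_rel32_target(0x0, bytes.fromhex("e800000000"))
```

With a zero displacement the target is simply the end of the instruction, so the call in .caller points at 0x5, which is also the start address of both .first and .second.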
Target address resolution takes place at the callq instruction in the .caller section. Currently, the expected resolution is <other+0x5>. This happens because the current implementation checks only the set of sections that share the start address closest to, and less than or equal to, the target. Let's call this set of sections $A$. In the current implementation, if all the sections in $A$ are empty, the absolute symbol is used. But what about symbols that occur in sections before $A$? Shouldn't the sections before $A$ also be checked for valid symbols, with the address resolution printed relative to one of those symbols? Under that scheme, the address resolution in this case would switch to <target>, since the address of target is 0x5.
Checking sections before $A$ is exactly what binutils seems to be doing, which leads to discrepancies and failing tests when I try to fully mimic its behavior. Which scheme should I proceed with?
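To make the two schemes concrete, here is a minimal Python model of the lookup as described above. It is only a sketch of the description in this message, not the actual llvm-objdump or binutils code: the names Symbol, Section, and resolve are made up for illustration, and it assumes that an "empty" section means one with no symbol at or below the target.

```python
from dataclasses import dataclass, field


@dataclass
class Symbol:
    name: str
    value: int


@dataclass
class Section:
    name: str
    address: int
    symbols: list = field(default_factory=list)


def best_symbol(symbols, target):
    """Pick the symbol with the greatest value that is <= target, if any."""
    candidates = [s for s in symbols if s.value <= target]
    return max(candidates, key=lambda s: s.value, default=None)


def label(sym, target):
    offset = target - sym.value
    return f"<{sym.name}>" if offset == 0 else f"<{sym.name}+{offset:#x}>"


def resolve(sections, abs_symbols, target, check_earlier_sections):
    """Model of the two schemes (hypothetical, for illustration).

    check_earlier_sections=False: only the set A (sections sharing the
    closest start address <= target) is searched before falling back to
    absolute symbols.
    check_earlier_sections=True: sections before A are searched as well.
    """
    starts = sorted({s.address for s in sections if s.address <= target},
                    reverse=True)
    if not check_earlier_sections:
        starts = starts[:1]  # restrict the search to the set A
    for start in starts:
        for sec in sections:
            if sec.address == start:
                sym = best_symbol(sec.symbols, target)
                if sym is not None:
                    return label(sym, target)
    sym = best_symbol(abs_symbols, target)
    return label(sym, target) if sym is not None else None


# Test 5: SIZE1=1, SIZE2=0, SECTION=.caller, INDEX=SHN_ABS
sections = [
    Section(".caller", 0x0, [Symbol("target", 0x5)]),
    Section(".first", 0x5),
    Section(".second", 0x5),
]
abs_symbols = [Symbol("other", 0x0)]

current_scheme = resolve(sections, abs_symbols, 0x5, False)
earlier_sections_scheme = resolve(sections, abs_symbols, 0x5, True)
```

In this model, current_scheme evaluates to <other+0x5> (what the test expects), while earlier_sections_scheme evaluates to <target>, which is the discrepancy in question.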
https://github.com/llvm/llvm-project/pull/109914