[PATCH] D144079: [AArch64InstPrinter][llvm-objdump] Print ADR PC-relative label as a target address hexadecimal form

Kristina Bessonova via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 15 07:37:05 PST 2023


krisb added a comment.

In D144079#4128441 <https://reviews.llvm.org/D144079#4128441>, @peter.smith wrote:

> No objections from me. I guess that `--no-print-imm-hex` is not honoured as we are not printing the immediate anymore?

Right.

> My one caveat is that this can give some strange answers for some code that I doubt anyone would write in practice like
>
>   .text
>   adr x0, -4096
>
> Which when disassembling an object will wrap around the address space. With gnu objdump:
>
>   0000000000000000 <.text>:
>      0:   10ff8000        adr     x0, fffffffffffff000 <.text+0xfffffffffffff000>
>
> which is less helpful than the current llvm-objdump (I'm not using --no-print-imm-hex)
>
>   0000000000000000 <$x.0>:
>          0: 10ff8000      adr     x0, #-0x1000
>
> However that should only happen for code that does a relative offset outside the section which is unlikely to work in practice.

I agree that llvm-objdump output w/o the patch looks a bit better, but the example seems to be rather artificial. So, I'm not sure we need to make this more complex, but we can, for example, check that a calculated address is still in +/-1MB range and opt out if it's not.

> I suggest updating the description with a slightly different example as it looks like it is just changing from decimal to hex. It is doing a bit more than that. It is changing a decimal offset from the PC to the absolute address of PC + offset.

Done.

In D144079#4128592 <https://reviews.llvm.org/D144079#4128592>, @simon_tatham wrote:

> I think that //most// ideally, if you don't know the true address of the instruction being disassembled, the best way to disassemble ADR is to indicate explicitly that the address you're going to end up with depends on that unknown, and not make it //look// as if it's a constant.
> So if you were disassembling a single encoding in the complete absence of any context, you'd write something like `adr x0, .+0x1234` (adjusted as necessary for the representation of 'here' in the assembler syntax you're outputting). And if you're in a context like an object file, where your ADR instruction is at a known offset from the start of the section but the section in turn might end up anywhere in memory after linking, you could say something relative to the section name or to one of the symbols in it, like `adr x0, .text+0x12` or `adr x0, .L.stringliteral`.

This looks like how a target address label works, which is mostly what I want to achieve by this patch. In `llvm/test/tools/llvm-objdump/ELF/AArch64/pcrel-address.yaml` the following example

  adr x0, 0x201004 <_start+0xf04>

demonstrates this. Label in '<>' (target address label, not sure about the notion, just using the wording from llvm-objdump code) printed relatively to a known symbol `_start`. 
But llvm-objdump doesn't output any label if it isn't able to resolve an address against any known symbol. May be we can improve this.



================
Comment at: llvm/test/tools/llvm-objdump/ELF/AArch64/pcrel-address.yaml:6
+# CHECK-NEXT:    adr x0, 0x220000
+# CHECK-NEXT:    adr x0, 0x201004 <_start+0xf04>
 # CHECK-NEXT:    adrp x2, 0x220000
----------------
simon_tatham wrote:
> Not that it matters for testing purposes, but did you mean to point that at 0x201000? Your other new adr matches the adrp that was there already.
I didn't mean to fully match addresses from `adrp` example, just reused the offsets w/o any specific intent.
It seems to be more readable if to remove `--no-leading-addr` and check for instruction address as well to demonstrate that the printed address is PC + offset. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144079/new/

https://reviews.llvm.org/D144079



More information about the llvm-commits mailing list