[Lldb-commits] [PATCH] D155905: lldb RFC: Exposing set/get address masks, Fix*Address methods in SBProcess

Thu Jul 27 09:51:50 PDT 2023

jasonmolenda added a comment.

In D155905#4537892 <https://reviews.llvm.org/D155905#4537892>, @DavidSpickett wrote:

>> but I could imagine some harvard architecture target that behaved differently (surely this is why Linux has two address masks)
>
> I'm not privy to the exact reasoning, but at least part of it comes from the architecture itself. You could have a target that enables top byte ignore and pointer authentication for data addresses, but only enables pointer authentication for code addresses. So ptrace will show you different values for the pointer authentication masks in that case. I'm not sure you can actually configure a kernel that way today, but it's viable.
>
> For the debugger, the result is the same. When top byte ignore is off, pointer authentication just uses that free space for itself. We end up removing the same set of bits either way.
>
> For very specific tools you might want to only remove pointer authentication bits. Making this up, but maybe you want to take pointers from a pointer authenticated ABI application and pass them to a shared library without those protections. Niche, but ptrace leaves the door open for that rather than breaking userspace later by adding it.

On macOS, fwiw, we have one set of system libraries built to use the ptrauth ABI ("arm64e"), but non-system processes all run a non-ptrauth ABI ("arm64"), with most of the signing keys zeroed out so that a "signed" function pointer is the same as the unsigned function pointer, making it possible to call between them.  IIRC there's one key that is still enabled so the ptrauth-using code can sign its link register before spilling it to memory, because that has no ABI impact.

As you suggested above, the FixAddress methods are used before lldb accesses memory; the goal is to clear/set all the non-addressable bits.  Whether those bits held TBI metadata or a ptrauth signature makes no difference to lldb: it needs to find the actual VA that this uint64_t refers to.  But maybe retaining that distinction will be useful for something some day.
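
To make "clear/set all the non-addressable bits" concrete, here is a minimal sketch in Python.  It is not lldb's implementation: the function name `fix_address`, the convention that a 1 bit in the mask means "not used for addressing", and the 48-bit example mask are all assumptions for illustration.  The one architectural fact it relies on is that AArch64 uses bit 55 to select the high or low half of the virtual address space.

```python
# Hypothetical sketch; names and mask convention are assumptions, not lldb's API.
U64 = 0xFFFF_FFFF_FFFF_FFFF
SELECT_BIT = 1 << 55  # AArch64: bit 55 picks the high or low half of the VA space

# Assumed example mask: 48 bits of addressing, so bits 48..63 are non-addressable.
EXAMPLE_MASK = 0xFFFF_0000_0000_0000


def fix_address(ptr: int, mask: int) -> int:
    """Return the virtual address that `ptr` refers to.

    `mask` has a 1 in every bit that is NOT used for addressing.
    It does not matter whether those bits carried a TBI tag or a
    ptrauth signature; they are removed the same way.
    """
    ptr &= U64
    if ptr & SELECT_BIT:
        # High-half address: the non-addressable bits must all be 1.
        return (ptr | mask) & U64
    # Low-half address: the non-addressable bits must all be 0.
    return ptr & ~mask & U64


# A low-half pointer carrying a tag/signature in its upper bits.
print(hex(fix_address(0xAB12_0000_1234_5678, EXAMPLE_MASK)))
# A high-half pointer whose upper bits were partly overwritten.
print(hex(fix_address(0x00F3_FFFF_8000_1000, EXAMPLE_MASK)))
```

A tool that wanted to strip only the ptrauth bits and keep a TBI tag would need two separate masks, which is the distinction discussed above.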

In the AArch64 macOS ABI I have it always clear/set the top byte, even if the address mask says "all bits are used for addressing".  So if the processor was not running in TBI mode and the user has a pointer variable with a smashed top byte, lldb would dereference it fine but the program would crash when it does the same.  I didn't think this was a problem worth trying to handle correctly (and we run our main application processors in TBI mode).
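
A small sketch of that corner case, again in Python and again only illustrative: `fix_address_always_tbi` is a made-up name, and the mask convention (1 = bit not used for addressing, so a mask of 0 means "all bits are used for addressing") is an assumption of the sketch rather than a statement about lldb's internals.

```python
# Hypothetical sketch of "always treat the top byte as non-addressable".
U64 = 0xFFFF_FFFF_FFFF_FFFF
TOP_BYTE = 0xFF << 56  # bits 56..63
SELECT_BIT = 1 << 55   # AArch64: bit 55 picks the high or low half of the VA space


def fix_address_always_tbi(ptr: int, mask: int) -> int:
    """Clear/set the non-addressable bits, always including the top byte.

    `mask` has a 1 in every bit that is NOT used for addressing (assumed
    convention); the top byte is added to it unconditionally.
    """
    mask = (mask | TOP_BYTE) & U64
    ptr &= U64
    if ptr & SELECT_BIT:
        return (ptr | mask) & U64
    return ptr & ~mask & U64


# The mask says every bit is an address bit, but the pointer's top byte
# has been overwritten with 0x41.
smashed = 0x4100_0001_0000_2000
print(hex(fix_address_always_tbi(smashed, 0)))
```

With a mask of 0 the debugger still resolves the smashed pointer to the address underneath, whereas a CPU running without TBI would fault on the same dereference.  That mismatch is the case described above as not worth handling.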

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D155905