[Lldb-commits] [PATCH] D155905: lldb RFC: Exposing set/get address masks, Fix*Address methods in SBProcess

Fri Jul 21 07:43:19 PDT 2023

jasonmolenda added a comment.

Thanks for the feedback David.

> You are missing a way to fix a high mem address, is that intentional? Or would you get the high mem mask, change the setting, fix the address, as needed by context.

The FixAddress method will choose whether to use the high mem mask or the low mem mask based on the address.

================
Comment at: lldb/include/lldb/API/SBProcess.h:401

+  /// Get the current address mask that may be applied to addresses
+  /// before reading from memory.  The bits which are not relevant/valid
----------------
DavidSpickett wrote:
> may -> should?
> 
> Though for API uses that only run on simple systems, calling this would just be unnecessary overhead.
Probably the right call.  The masking is implemented in the ABI so there are systems that don't have this concept and have no mask function, which is what I was thinking of with "maybe".  

It makes me wonder if doing it in the ABI is correct; if we have an address mask for the system, is it a part of the ABI?  On an ARMv8.3 ptrauth using process, the ABI defines what things are signed and need to be cleared before dereferencing (where we need to call FixAddress from within lldb), but we solve this by calling FixAddress at all sites that any ABI might use; ultimately they all need to resolve to an addressable address before reading, so if one ABI signs the C++ vtable pointer and one does not, having both call FixAddress on that is fine.

Of course I'm coming at this with an AArch64 perspective.  On AArch64 we need to look at b55 to decide if the bits that are masked off are set to 0 or 1  (bits 56-63 possibly being TBI non-address bits), which is not a generic behavior that would apply on other processors that might need to apply an address mask.  So maybe it's an architecture specific method instead of an ABI method, I should say.

================
Comment at: lldb/include/lldb/API/SBProcess.h:402-403
+  /// Get the current address mask that may be applied to addresses
+  /// before reading from memory.  The bits which are not relevant/valid
+  /// for addressing should be set.
+  ///
----------------
DavidSpickett wrote:
> "should be set in the mask". Or in reverse "should be cleared by the mask". Just be clear whether it's the mask or the address.
Ah very good point.

================
Comment at: lldb/include/lldb/API/SBProcess.h:416
+  lldb::addr_t GetCodeAddressMask();
+  lldb::addr_t GetDataAddressMask();
+
----------------
DavidSpickett wrote:
> Should we note somewhere, maybe below where the `any` method is, the logic behind the code/data split?
> 
> Such as it is. Overall it's that data (load store) and code (fetch) addresses may have a different arrangement of addressing bits, so choose the one that makes sense in context. If we don't know, use any. It'd be an obvious question from anyone integrating this into an IDE.
Yeah I mean the reality is that on all our systems where code and data addresses are the same, FixCodeAddress may take advantage of the alignment of instructions to add a couple of bits, whereas FixDataAddress has to be able to address at a byte boundary so we really have "FixCodeAddress" and "FixAnyAddress", and it's only the Linux address mask namings that got us "FixDataAddress".  

Of course there exist systems where instruction addresses and data addresses may not be as interchangeable, and I could even imagine some Harvard architecture system where the number of bits used for addressing of instructions is different than data, and applying the Data address mask to an instruction could corrupt it.  

But I should explain the ideal use of these methods.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155905/new/

https://reviews.llvm.org/D155905