[Lldb-commits] [PATCH] D149040: Refactor and generalize debugserver code for setting hardware watchpoints on AArch64

Fri Apr 28 16:38:06 PDT 2023

jasonmolenda marked 7 inline comments as done.
jasonmolenda added inline comments.

================
Comment at: lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp:833-835
+std::vector<DNBArchMachARM64::watchpoint_spec>
+DNBArchMachARM64::AlignRequestedWatchpoint(nub_addr_t user_addr,
+                                           nub_size_t user_size) {
----------------
jasonmolenda wrote:
> JDevlieghere wrote:
> > Do you ever expect this to return more than two watchpoints? Seems like this could be a `struct` that holds two optional `WatchpointSpec`s. I don't feel strongly about it, but it looks like you never iterate over the list and the way you have to check after the recursive call is a bit awkward. 
> > 
> > edit: nvm, it looks like you do actually iterate over them in `EnableHardwareWatchpoint`
> I was thinking about how this current scheme only ever splits 1 watchpoint into 2, but a future design that could expand a user requested watch into a larger number of hardware watchpoints would expand it further.  If a user asks to watch a 32 byte object, and the target only supports 8 byte watchpoints, and the object is doubleword aligned, we could watch it with 4 hardware watchpoints.  The older code in debugserver was written in terms of "one or two" but I switched to a vector of hardware implementable watchpoints that may expand as we evaluate the hardware capabilities of the target/stub.
There's some extent where this debugserver implementation is me sketching out the first part of the WatchpointLocation work I want to do in lldb.  I will likely be copying this code up in to lldb, so it's written with an eye to where I'm heading there, agreed it's not necessary tho.  tbh once that WatchpointLocation stuff exists in lldb, all of this can be pulled from debugserver.

================
Comment at: lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp:856
+  // user_size == 16 -> aligned_size == 16
+  aligned_size = 1ULL << (addr_bit_size - __builtin_clzll(aligned_size - 1));
+
----------------
tschuett wrote:
> JDevlieghere wrote:
> > Beautiful. Once we have C++20 we can use `std::bit_ceil` (https://en.cppreference.com/w/cpp/numeric/bit_ceil)
> Is the builtin available on all supported platforms and compilers? There are some alignment functions in MathExtras.h.
Ah, that would be very nice and a lot clearer for readers.  I might add that as a comment.

================
Comment at: lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.cpp:923

-  // Otherwise, can't watch more than 8 bytes per WVR/WCR pair
-  if (size > 8)
-    return INVALID_NUB_HW_INDEX;
-
-  // Aarch64 watchpoints are in one of two forms: (1) 1-8 bytes, aligned to
-  // an 8 byte address, or (2) a power-of-two size region of memory; minimum
-  // 8 bytes, maximum 2GB; the starting address must be aligned to that power
-  // of two.
-  //
-  // For (1), 1-8 byte watchpoints, using the Byte Address Selector field in
-  // DBGWCR<n>.BAS.  Any of the bytes may be watched, but if multiple bytes
-  // are watched, the bytes selected must be contiguous.  The start address
-  // watched must be doubleword (8-byte) aligned; if the start address is
-  // word (4-byte) aligned, only 4 bytes can be watched.
-  //
-  // For (2), the MASK field in DBGWCR<n>.MASK is used.
-  //
-  // See the ARM ARM, section "Watchpoint exceptions", and more specifically,
-  // "Watchpoint data address comparisons".
-  //
-  // debugserver today only supports (1) - the Byte Address Selector 1-8 byte
-  // watchpoints that are 8-byte aligned.  To support larger watchpoints,
-  // debugserver would need to interpret the mach exception when the watched
-  // region was hit, see if the address accessed lies within the subset
-  // of the power-of-two region that lldb asked us to watch (v. ARM ARM,
-  // "Determining the memory location that caused a Watchpoint exception"),
-  // and silently resume the inferior (disable watchpoint, stepi, re-enable
-  // watchpoint) if the address lies outside the region that lldb asked us
-  // to watch.
-  //
-  // Alternatively, lldb would need to be prepared for a larger region
-  // being watched than it requested, and silently resume the inferior if
-  // the accessed address is outside the region lldb wants to watch.
-
-  nub_addr_t aligned_wp_address = addr & ~0x7;
-  uint32_t addr_dword_offset = addr & 0x7;
-
-  // Do we need to split up this logical watchpoint into two hardware watchpoint
-  // registers?
-  // e.g. a watchpoint of length 4 on address 6.  We need do this with
-  //   one watchpoint on address 0 with bytes 6 & 7 being monitored
-  //   one watchpoint on address 8 with bytes 0, 1, 2, 3 being monitored
-
-  if (addr_dword_offset + size > 8) {
-    DNBLogThreadedIf(LOG_WATCHPOINTS, "DNBArchMachARM64::"
-                                      "EnableHardwareWatchpoint(addr = "
-                                      "0x%8.8llx, size = %zu) needs two "
-                                      "hardware watchpoints slots to monitor",
-                     (uint64_t)addr, size);
-    int low_watchpoint_size = 8 - addr_dword_offset;
-    int high_watchpoint_size = addr_dword_offset + size - 8;
-
-    uint32_t lo = EnableHardwareWatchpoint(addr, low_watchpoint_size, read,
-                                           write, also_set_on_task);
-    if (lo == INVALID_NUB_HW_INDEX)
-      return INVALID_NUB_HW_INDEX;
-    uint32_t hi =
-        EnableHardwareWatchpoint(aligned_wp_address + 8, high_watchpoint_size,
-                                 read, write, also_set_on_task);
-    if (hi == INVALID_NUB_HW_INDEX) {
-      DisableHardwareWatchpoint(lo, also_set_on_task);
+  if (wps.size() > 1) {
+    std::vector<uint32_t> wp_slots_used;
----------------
JDevlieghere wrote:
> Why is this `> 1` and not `>= 1` (i.e. no 0 which we checked on line 916). TBH I don't really understand what this function is doing. 
Clarified this.  We have two cases: the user's watchpoint request needs either one hardware watchpoint register (one WatchpointSpec) or two hardware watchpoint registers (two WatchpointSpecs).  There's the missing pass from debugserver which takes an aligned one or two WatchpointSpecs which could expand the WatchpointSpecs more if we only supported 8 byte watchpoints and wanted to use many hardware watchpoint registers to implement them.  So I'm writing more general code than debugserver is actually going to do any time (I'll be adding MASK watchpoints next, for larger regions).

================
Comment at: lldb/tools/debugserver/source/MacOSX/arm64/DNBArchImplARM64.h:82-83
                                  bool also_set_on_task) override;
+  std::vector<watchpoint_spec> AlignRequestedWatchpoint(nub_addr_t user_addr,
+                                                        nub_size_t user_size);
   uint32_t EnableHardwareWatchpoint(nub_addr_t addr, nub_size_t size, bool read,
----------------
JDevlieghere wrote:
> Nit: I know what `user_` means in the context of this patch, but I'd go with either `requested_addr` and `requested_size` or even just `addr` and `size`. Another option would be `user_specified_addr` but that seems pretty verbose, although it would be consistent with the members of the struct. 
Yeah, in the argument list it looks overly verbose, but then in the method where we're using a combination of the user's original intent and the actual aligned addresses & sizes, it gets a little confusing to call them addr & size.  I'm fine with `requested_`.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149040/new/

https://reviews.llvm.org/D149040