[Lldb-commits] [PATCH] D117299: [lldb] Ignore non-address bits in "memory find" arguments

Fri Jan 21 02:31:35 PST 2022

DavidSpickett marked an inline comment as done.
DavidSpickett added inline comments.

================
Comment at: lldb/test/API/linux/aarch64/tagged_memory_access/TestAArch64LinuxTaggedMemoryAccess.py:83
+        # Then that the location found does not display the tag bits.
+        self.assertNotRegexpMatches(out, "data found at location: 0x(34|56)[0-9A-Fa-f]+")
----------------
omjavaid wrote:
> so I am thinking do we actually need to omit tags from location address. Internally we want LLDB to ignore tags thats fine but user must be seeing the tagged address of buf so why omit tags in output of memory find?
> 
TLDR: The logic here is that the tag (or more generally non-address) bits of the pointer are a property of the pointer, not the memory pointed to.

More detail, since I've thought about this a lot. Please challenge if it doesn't make sense.

You could argue that you're showing the user a "view" of memory through that pointer. My problem there is that pointer auth and memory tagging break that argument because you cannot assume they stay the same as the memory location increments. Since you can't just assume that the signature for `ptr` and `ptr+16` would be the same, or the memory tag for those locations would be the same.

The only minor benefit there is the user can copy a tagged pointer to a later location, but those tags may not actually let the program access the memory. (though the debugger can always but it didn't need tags to do that in any case)

This is what the full output looks like. If you wanted to display the tags in the output which tag would you use? Well, simple choice is the start address tag. However, it's really an arbitrary choice that I think doesn't help much.

```
(lldb) p ptr1
(char *) $2 = 0x3400000000411029 ""
(lldb) p ptr2
(char *) $3 = 0x5600000000411029 ""
(lldb) memory find -s '?' ptr2 ptr1+16
data found at location: 0x411038
0x00411038: 3f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ?...............
0x00411048: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
```

I agree there is an initial "why did my pointer get changed" reaction but think I not showing non-address bits when showing actual memory is going to be clearer in the long run. Granted this is mostly intuition, we don't have a lot of user feedback about this at this stage.

Actual "memory tags" do have a tag that is attached to the memory location in hardware, so you may want to show that. That's something I've been addressing for "memory read" and will probably port that over to memory find as well. It'll look something like:

```
(lldb) memory find -s '?' ptr2 ptr1+16
data found at location: 0x411038
0x00411038: 3f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ?............... (tag: 0x1)
0x00411048: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................ (tag: 0x2)
```

Crucially here we're showing the real tag for each line, instead of assuming that the one in the pointer can be applied to them all.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117299/new/

https://reviews.llvm.org/D117299