[lldb-dev] RFC: AArch64 Linux Memory Tagging Support for LLDB

Mon Aug 10 03:41:39 PDT 2020

Hi all,

What follows is my proposal for supporting AArch64's memory tagging
extension in LLDB. I think the link in the first paragraph is a good
introduction if you haven't come across memory tagging before.

I've also put the document in a Google Doc if that's easier for you to
read: https://docs.google.com/document/d/13oRtTujCrWOS_2RSciYoaBPNPgxIvTF2qyOfhhUTj1U/edit?usp=sharing
(please keep comments to this list though)

Any and all comments welcome. Particularly I would like opinions on
the naming of the commands, as this extension is AArch64 specific but
the concept of memory tagging itself is not.
(I've added some people on Cc who might have particular interest)

Thanks,
David Spickett.

<begin doc>

# RFC: AArch64 Linux Memory Tagging Support for LLDB

## What is memory tagging?

Memory tagging is an extension added in the Armv8.5-a architecture for AArch64.
It allows tagging pointers and storing those tags so that hardware can validate
that a pointer matches the memory address it is trying to access. These paired
tags are stored in the upper bits of the pointer (the “logical” tag) and in
special memory in hardware (the “allocation” tag). Each tag is 4 bits in size.

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/enhancing-memory-safety

## Definitions

* memtag - This is the clang name for the extension as in
  “-march=armv8.5-a+memtag”
* mte - An alternative name for memtag, also the llvm backend name for
  the extension.
  This document may use memtag/memory tagging/MTE at times, they mean
  the same thing.
* logical tag - The tag stored inside a pointer variable (accessible
via normal shift and mask)
* allocation tag - The tag stored in tag memory (which the hardware provides)
  for a particular tag granule
* tag granule - The amount of memory that a single tag applies to,
which is 16 bytes.
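
To make the "normal shift and mask" remark concrete, here is a minimal Python sketch of reading and replacing a logical tag. It assumes the MTE tag lives in pointer bits 59:56; the helper names are made up for illustration and are not part of the proposal.

```python
# Assumption: the MTE logical tag occupies bits 59:56 of a 64 bit pointer.
TAG_SHIFT = 56
TAG_MASK = 0xF  # tags are 4 bits wide


def get_logical_tag(pointer: int) -> int:
    """Return the 4 bit logical tag stored in the pointer value."""
    return (pointer >> TAG_SHIFT) & TAG_MASK


def set_logical_tag(pointer: int, tag: int) -> int:
    """Return the pointer value with its logical tag replaced by `tag`."""
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag must be in the range 0x0 to 0xF")
    cleared = pointer & ~(TAG_MASK << TAG_SHIFT)
    return cleared | (tag << TAG_SHIFT)


tagged = 0x0F00FFFFF7FFA000
print(hex(get_logical_tag(tagged)))
print(hex(set_logical_tag(tagged, 0xE)))
```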

## Existing Tool Support

* GCC/Clang can generate MTE instructions
* Clang has an option to memory tag the stack (discussed later)
* QEMU support has been merged
* Linux Kernel patches are in progress
  (git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
“devel/mte-v5” branch)
* GDB support is in review and this design takes a lot of direction from that
  (https://sourceware.org/git/?p=binutils-gdb.git;a=shortlog;h=refs/heads/users/luisgpm/aarch64-mte-v2)
  (originally proposed
https://sourceware.org/pipermail/gdb-patches/2019-August/159881.html)

## New lldb features

Assuming your software is acting correctly, memory tagging can “just work”
without debugger support. This assumes the compiler/toolchain/user are
always correct.

For when that isn’t the case we want to be able to:
* Read/write the logical tags in a pointer
* Read/write the allocation tags assigned to a given area of memory
* Test whether the logical tag in a pointer matches the allocation tag of the
  memory it refers to
* Read/write memory even when tags are mismatched

The most obvious use case for this is working through issues where bugs in the
toolchain don’t generate correct code. On the other hand there’s a good case for
deliberately messing with pointers in your code to prove that such protection
actually works.

Note: potential extensions to scripting such as tags as attributes of values and
such are not being proposed here. Of course the new commands will be added in
the standard ways so you can use those.

## New Commands

### Command Availability

Note: commands will be listed in tab completion and help regardless of
these checks

* The remote server must support memory tagging packets. lldb will send/check
  for the “memory-tagging” feature in the qSupported packet. (this
name aligns with gdb)
* The process must have MTE available. We check HWCAP2_MTE for this.
* The process must have enabled tagged addressing using prctl
  (see “New Registers” for details)
* The address given must be in a range that has MTE enabled, since you can mmap
  with or without MTE. (this information is in /proc/.../smaps)

#### Interaction With Clang’s Stack Tagging

We’re relying on the kernel to tell us if MTE is enabled, so stack tagging will
not be visible to the debugger this way.
(https://github.com/google/sanitizers/wiki/Stack-instrumentation-with-ARM-Memory-Tagging-Extension-(MTE))

E.g. {int x; use(&x); } where use is void use(int* ptr);
“ptr” will have a memory tag but the kernel won’t know this.

To work around this a setting will be added to tell lldb to assume that MTE is
enabled, so that you can at least see the logical tags of a pointer.
(see “New Settings”)

### General Properties/Errors

* <address expression> must resolve to some value that can be handled as an
  address by lldb. (though it need not be a pointer specifically)
* Tags will be printed in hexadecimal to reflect the fact that they are a 4 bit
  field. (and since tags are randomly generated, ordering is unlikely
to be a concern)
* Packed tags will be 1 tag per byte (matches what ptrace expects)
* Addresses will be rounded down to the nearest granule (not always by lldb
  itself but what the user sees will look like this)
* Ranges are rounded up to a whole number of granules
* It is an error to use a command on an address that does not have MTE enabled.
  (with the exception of “mtag check”)
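
The rounding rules above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the address is assumed to have had its logical tag removed already, and I'm assuming the end of the range is computed from the unaligned address plus length before rounding up.

```python
GRANULE_SIZE = 16  # bytes covered by one allocation tag


def granule_range(addr: int, length: int) -> tuple:
    """Return (start, end) of the granule aligned range covering the request.

    The start is rounded down, the end is rounded up, and a length of 0 is
    treated as covering a single granule.
    """
    if length < 0:
        raise ValueError("length must not be negative")
    start = addr & ~(GRANULE_SIZE - 1)
    end = addr + max(length, 1)
    end = (end + GRANULE_SIZE - 1) & ~(GRANULE_SIZE - 1)
    return start, end


def granule_count(addr: int, length: int) -> int:
    """Number of allocation tags needed to describe the range."""
    start, end = granule_range(addr, length)
    return (end - start) // GRANULE_SIZE


print(granule_count(0xFFFFF7FFA000, 28))
```

For a granule aligned address, a length of 28 gives 2 granules and a length of 0 gives 1.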

### Commands

#### Avoiding Architecture Specific Naming

One problem you might see with the commands below is that they use l/a for
logical/allocation tags. These names are specific to MTE, for instance SPARC’s
ADI talks about “versions” instead. This limits the reuse of these
commands in the future.
(https://sourceware.org/gdb/current/onlinedocs/gdb/Sparc64.html#ADI-Support)

Instead we could first put them under “memory”, then merge the a/l tag commands
into “memory showtag” and “memory settag” (check -> checktag,
getconfig -> tagconfig).
This avoids the arch specific names, though the output will still be arch
specific.

(lldb) memory showtag <addr> <length in bytes>
<addr>: logical 0x1 allocation: 0x1 0x2 0x3 ...
(lldb) memory settag <addr> <logical tag> <length in bytes> <allocation tags...>

Length and allocation tags would be optional. We could assume that if we only
get the logical tag arg, we should set both kinds of tag. This accommodates
future systems where there is only one type of tag, or you can only set them all
at once.

Whatever way you do it, there’s some kind of Arch dependent behaviour.

Another option would be to call them the “pointer tag” and the “memory tag”.
(which lends itself to being “memory tag/ptag” not “mtag mtag/ptag”
which is just confusing)

(lldb) memory showptrtag <addr>
(lldb) memory showtag <addr> <length>
(lldb) memory checktag <addr>

This makes the most sense to me and avoids having variable numbers of arguments
to commands.

#### mtag showltag <address expression>

Show the logical tag contained in the address given.

(lldb) mtag showltag a_ptr
0xF

Error conditions:
* As described above

#### mtag setltag <address expression> <tag value expression>

Set the logical tag of the variable that <address expression> resolves to, to
the value <tag value expression> resolves to.

(lldb) mtag setltag a_ptr 0xE

Error conditions:
* The address expression is not writable. E.g. for ptr+10 we can compute a newly
  tagged value but have nowhere to write it back to.
* Tag value is out of 0x0 to 0xF range. (this limit is specific to AArch64)

#### mtag showatag <address expression> <optional length>

Show the allocation tag(s) associated with the granule of memory that
<address expression> points to. (this is reading target memory so the work will
be done in lldb-server)

<length> will default to 1 granule, otherwise you can provide a value in bytes
which will be rounded up to a whole number of granules. E.g 28 bytes becomes 32
bytes which is two granules so two tags.
(note that length of 0 also becomes 1 granule)

(lldb) mtag showatag a_ptr
[0xfffff7ffa000, 0xfffff7ffa010) : 0xE
(lldb) mtag showatag a_ptr 28
[0xfffff7ffa000, 0xfffff7ffa010) : 0xE
[0xfffff7ffa010, 0xfffff7ffa020) : 0xF

Error conditions:
* General failure to read tag memory on the target (a ptrace failure)
* Failure to read tags because MTE is not enabled
* Given <length> is less than zero

#### mtag setatag <address expression> <length> <tags...>

Set the allocation tags of the memory in range <address expression> to
<address expression> + <length> (where length is rounded up to a whole number of
granules, meaning length <16 = 1 granule) to the tags in <tags>.

Where <tags…> is one or more tag arguments, each in hex or decimal. Once these
are validated they will be packed 1 tag per byte in the data sent to
lldb-server.

Note: this is a break from the current gdb design that has the user type the raw
bytes. For example:
(gdb) mtag setatag a_ptr 32 040F

The gdb approach does make the command more flexible, as validation is done
server side, but we’re doing some validation client side for logical tags
anyway. The question is, is this added convenience enough to break with gdb?
(though if we go with the alternate “memory …” naming scheme proposed
above, we might as well)

In the example below we’re giving granule 1 at a_ptr a tag of 0x4 and granule 2
at a_ptr+16 a tag of 0xF. The second example sets the tag of the
granule at a_ptr to 0x5.

(lldb) mtag setatag a_ptr 32 0x4 15
(lldb) mtag setatag a_ptr 1 5

In the case that the number of tags given is not enough to cover the memory
range, lldb-server will keep repeating the set until it does. Meaning a set of 2
tags would be repeated once to cover 4 granules. A set of 3 tags would be
written once with the first tag used again for the 4th granule.
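
As a sketch of that repetition rule (the function name is hypothetical and this is not the lldb-server implementation), the given tags are treated as a pattern and cycled until every granule has one:

```python
from itertools import cycle, islice


def expand_tags(tags: list, granules: int) -> list:
    """Repeat `tags` as a pattern until there is one tag per granule.

    Extra tags beyond the number of granules are simply left unused.
    """
    if not tags:
        raise ValueError("at least one tag is required")
    if any(not 0 <= tag <= 0xF for tag in tags):
        raise ValueError("tags must be in the range 0x0 to 0xF")
    return list(islice(cycle(tags), granules))


print(expand_tags([0x4, 0xF], 4))
print(expand_tags([0x1, 0x2, 0x3], 4))
```

With 2 tags and 4 granules the pair is written twice. With 3 tags and 4 granules the first tag is reused for the last granule.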

Error conditions:
* Length is not a valid number or is less than 0
* One or more tags are out of the valid range of 0-0xF

#### mtag check <address expression>

Check that the logical tag in <address expression> matches the allocation tag
set for the granule it points to.

(lldb) mtag check a_ptr
Failed: logical tag 0x1 does not match allocation tag 0x2
(lldb) mtag check non_mte_ptr
Memory tagging is not enabled for address non_mte_ptr
(lldb) mtag check another_ptr
Passed: logical tag 0x1 matches allocation tag 0x1

Showing tags for a passed check seems redundant but I think it’s good to have as
a shortcut. That way you can use “mtag check” instead of “mtag showltag” then
“mtag showatag” if you want both tags.

Error conditions:
* Standard handling

#### mtag getconfig

This command will read the TAGGED_ADDR_CTRL register (see “New Registers”) and
pretty print its values. It's nice to have but certainly isn’t as good as being
able to pretty print a register in general. (which I don’t think is possible
right now)

(lldb) mtag getconfig
Tagged addressing: Enabled
Fault Mode: Synchronous
Included Tags: 0b1111000011110000
(lldb) mtag getconfig
Target process is not MTE enabled.

Formatting is up for debate of course, the point is you don’t have to shift
things in your head just to sanity check the debuggee’s usage.

Note: no “set” for this at this time as I think that’s going to be a much rarer
occurrence.

## Modified Commands

### memory region

Will use the extra information from the qMemoryRegionInfo packet to show the
VmFlags where possible. For example:

(lldb) memory region addr
[0x00007ffff7ed2000-0x00007ffff7fd2000) rw- /dev/zero (deleted)
flags:  rd ex mr mw me dw sd mt

### memory read

Will not check that logical and allocation tags match, allowing reads
regardless. Most of the time checking is not the user’s intent when doing a
read, and even if it is, there’s “mtag check” for that.

It will show allocation tags for memory that is MTE enabled. This is on by
default on the basis that some subset of memory will be MTE, so if you’re
working with it then tags are probably relevant. (new setting added to control
this)

In the ideal scenario this looks like:
(lldb) memory read the_page
<Allocation tag 0x1 for range [0xfffff7ffa000, 0xfffff7ffa010)>
0xfffff7ffa000: 66 66 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff..............
<Allocation tag 0x1 for range [0xfffff7ffa010, 0xfffff7ffa020)>
0xfffff7ffa010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Obviously there’s a lot of formatting freedom with the read command so this
won’t always be as neat. It could be better to put the tags in the lines like:
0xfffff7ffa000 (tag 0x1): 66 66 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ff..............

Then if the lines are <16 bytes each you can repeat the tag in the next line.
Or for >16 bytes do “(tag 0x1, 0x2)”. This needs some experimentation, it could
get very confusing if we’re showing the same tag next to two ranges and it looks
like two separate tags. For example here we’re showing the same tag twice:

0xfffff7ffa000 (tag 0x1): 66 66 00 00 00 00 00 00 ff......
0xfffff7ffa008 (tag 0x1): 00 00 00 00 00 00 00 00 ........

### memory write

Will allow writes where the tags are mismatched.

It will print warnings for granules where the tags do not match. Even if we
assume we’re writing a lot of data, if the program is MTE enabled then most of
the time tags will match. So it’ll only be noise in rare situations. A setting
will be added to disable them if needed.

lldb will read ahead for the tags. So for a write of 64 bytes we read 4 tags,
do the write then warn about any granules that didn’t match.

(lldb) memory write the_page 99
(lldb) memory write mismatched_ptr 99
Warning: Logical Tag 0x1 did not match Allocation tag 0x2 for range [0xfffff7ffb000, 0xfffff7ffb010)
(lldb) memory write mismatched_ptr <17 bytes of data>
Warning: Logical Tag 0x1 did not match Allocation tag 0x2 for range [0xfffff7ffb000, 0xfffff7ffb010)
Warning: Logical Tag 0x1 did not match Allocation tag 0x2 for range [0xfffff7ffb010, 0xfffff7ffb020)

Hopefully “Warning” is enough to indicate that the write was still
done despite the mismatch.
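
A rough Python sketch of the read ahead and warn step follows. The function name and argument layout are hypothetical. It assumes the allocation tags for the written range have already been read, one per granule, and that the start address is untagged and granule aligned.

```python
GRANULE_SIZE = 16  # bytes covered by one allocation tag


def mismatch_warnings(start: int, logical_tag: int, allocation_tags: list) -> list:
    """Return one warning string per granule whose allocation tag differs.

    `allocation_tags` holds the tags read ahead of the write, in address order.
    """
    warnings = []
    for index, allocation_tag in enumerate(allocation_tags):
        if allocation_tag == logical_tag:
            continue
        low = start + index * GRANULE_SIZE
        high = low + GRANULE_SIZE
        warnings.append(
            "Warning: Logical Tag 0x{:X} did not match Allocation tag 0x{:X} "
            "for range [0x{:x}, 0x{:x})".format(logical_tag, allocation_tag, low, high)
        )
    return warnings


# A 17 byte write touches two granules, so two tags are read ahead.
for line in mismatch_warnings(0xFFFFF7FFB000, 0x1, [0x2, 0x2]):
    print(line)
```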

## New Settings

Like the commands these settings will be present/visible in help even when MTE
is not available. The category name will be “memory-tagging”.

* assume-tagging-enabled - When handling logical tags in pointers assume that
  the memory they point to is MTE enabled. This allows you to debug/test things
  such as Clang’s stack tagging that are not handled by the kernel.
(default False)
* warn-on-write-tag-mismatch - Print warnings for each mismatching granule when
  writing with “memory write”. (default True)
* show-tags-in-read - Show tags in “memory read” output. (default True)

## New Registers

MTE adds 1 new register to the ptrace interface, which is the TAGGED_ADDR_CTRL
register. User programs use this same register via prctl to enable MTE.

It contains:
* A 16 bit include mask for tag generation. So with 0x0001 you only get tags of
  0, with 0x0003 you would get tags of 0 or 1, etc.
  (the hardware register GCR_EL1 actually has the opposite, an exclude mask)
* 1 bit to say whether tagged addresses are enabled at all
* A 2 bit field to set the fault mode for mismatched tags. This can be none
  (ignore failures), asynchronous or synchronous.
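
To illustrate what pretty printing this value could involve, here is a Python sketch that decodes it. The bit positions are an assumption taken from the PR_* definitions in the mainline Linux uapi header linux/prctl.h (enable in bit 0, fault mode in bits 2:1, include mask in bits 18:3); the patch series under discussion may differ, and the function name is made up.

```python
# Assumed layout, mirroring the mainline linux/prctl.h definitions.
PR_TAGGED_ADDR_ENABLE = 1 << 0
PR_MTE_TCF_SHIFT = 1
PR_MTE_TCF_MASK = 3 << PR_MTE_TCF_SHIFT
PR_MTE_TAG_SHIFT = 3
PR_MTE_TAG_MASK = 0xFFFF << PR_MTE_TAG_SHIFT

FAULT_MODES = {0: "None", 1: "Synchronous", 2: "Asynchronous"}


def describe_tagged_addr_ctrl(value: int) -> list:
    """Decode a TAGGED_ADDR_CTRL value into human readable lines."""
    enabled = bool(value & PR_TAGGED_ADDR_ENABLE)
    mode = (value & PR_MTE_TCF_MASK) >> PR_MTE_TCF_SHIFT
    include = (value & PR_MTE_TAG_MASK) >> PR_MTE_TAG_SHIFT
    return [
        "Tagged addressing: " + ("Enabled" if enabled else "Disabled"),
        "Fault Mode: " + FAULT_MODES.get(mode, "Unknown"),
        "Included Tags: 0b{:016b}".format(include),
    ]


example = PR_TAGGED_ADDR_ENABLE | (1 << PR_MTE_TCF_SHIFT) | (0xF0F0 << PR_MTE_TAG_SHIFT)
print("\n".join(describe_tagged_addr_ctrl(example)))
```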

So assuming we’re ok with pseudo registers like this being available via
“register read/write” it’ll be added to those. Probably under a “MTE Registers:”
or perhaps “Control Registers:” category.
(the latter could include future config regs such as pointer auth settings).

I say assuming because the current set are all what you’d call hardware
registers. (though SVE might change this I’m not sure)

In “New Commands” I’ve also sketched out a command to read and pretty print the
register, since I think most of the value will come from double checking that
you passed the right flags to prctl, rather than modifying it on the fly.
(which could be done manually with “register write” if you really wanted to)

## SIGSEGV Handling

MTE faults raise a SIGSEGV with a specific si_code for synchronous or
asynchronous faults. The former includes the address where the fault happened.
So this will fit into the existing handlers quite easily.

(lldb) run
<...>
Process 19648 stopped
* thread #1, name = 'main', stop reason = signal SIGSEGV: Asynchronous tag check fault
(lldb) run
<...>
Process 19648 stopped
* thread #1, name = 'main', stop reason = signal SIGSEGV: Synchronous tag check fault (fault address: 0x100000000, allocation tag: 0x1)

Showing the allocation tag here is a nice to have, though making an extra call
for this one fault might be awkward. You’d want to look at the logical tag for
the pointer that caused the fault anyway, and “mtag check <ptr>” gives you both
regardless.

Note that the fault address does not include the logical tag used to access it.
I think we could show the logical tag assuming lldb knows what the destination
register of the faulting instruction is. I haven’t done the research here so I’m
not proposing that we should do it for this round of support.

## Corefiles

The format of corefiles for MTE is currently undecided so there is nothing to
mention here yet. Obviously we want to use the new commands to work with them
once they’re available. Discussion on that design will start shortly.

## Remote Protocol Changes

Note: some of lldb-server’s interpretation of packed tags is also
described in the “mtag setatag” section above.

### Extending qMemoryRegionInfo

qMemoryRegionInfo currently gives us the start/size/permissions and name of a
mapping. For MTE we need to view the VmFlags line of the /proc/.../smaps file.
This contains the “mt” flag, showing MTE was enabled for that memory.

Example entry:
00400000-004f4000 r-xp 00000000 fc:00 6431901   /bin/bash
Size:                976 kB
<...>
VmFlags: rd ex mr mw me dw sd

To do this we will add an optional “flags” tuple to the response packet.

flags:<flags>;
Contains the flags shown on the VmFlags line, encoded as ASCII text just like
the “name” field is. (spaces remain as the delimiter)

This tuple will be optional because Linux kernels before 3.10 do not have this
file.
(also “flags” not “vmflags” to not be Linux specific)
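
For illustration, deciding whether a region is MTE enabled then comes down to a whole word match on the space separated flags. The sketch below works on already decoded flag text and on a raw VmFlags line from smaps; the function names are hypothetical and it does not model the packet encoding itself.

```python
def flags_from_smaps_line(line: str) -> str:
    """Return the flags text from a 'VmFlags: ...' line, or '' if absent."""
    key, _, value = line.partition(":")
    if key.strip() != "VmFlags":
        return ""
    return value.strip()


def region_has_mte(flags: str) -> bool:
    """True if the space separated flags include the 'mt' flag.

    Splitting first avoids false matches against other two letter flags.
    """
    return "mt" in flags.split()


print(region_has_mte(flags_from_smaps_line("VmFlags: rd ex mr mw me dw sd")))
print(region_has_mte("rd ex mr mw me dw sd mt"))
```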

### qSupported feature

The name will be “memory-tagging” to align with the GDB implementation. If this
feature is supported by the server it means it understands the new
packets and the target supports MTE.

### qMemTags (new)

Used to read memory tags from the target. (lldb-server will use
PTRACE_PEEKMTETAGS to do so)

(the “addr,length” format is derived from the existing m/M packets for
read/write memory)
qMemTags:addr,length:type

* addr - big endian hex address of the start of the range to read from.
  (the ptrace interface will take care of rounding this down to the
nearest granule)
* length - big endian hex number of bytes of memory to read tags from.
  This will be interpreted by the server to decide how many tags to return.
* type - a signed int indicating the type of tags being sent. This will just be
  one value at this time, meaning MTE, but leaves room for future
multi tag type systems.

Note: The length is interpreted by the server so the packet spec doesn’t tell
you how you should do that. For AArch64 MTE lldb-server will be rounding up to
the nearest granule then returning 1 tag per granule. So 24 bytes becomes 32,
meaning 2 tags.

The reply is either:
* “mXX...” - (literal ‘m’) where XX is the hex encoded bytes of the tags read.
  (one tag per byte)
* “E nn” - An error code if one occurred. This will only be ‘01’ for the time
  being.
  (it may prove useful to pass the ptrace error numbers through here but it’s
  not needed for the current implementation)
* Empty reply - meaning the server doesn’t support this packet
  (in case the client didn’t pre-check this)

Note: The ‘m’ to start the tag data is present to support potential multi part
replies, where the last part would have ‘l’ instead.

Example exchange, reading the tags for the next 24 bytes of memory:
$qMemTags:CAFEF00D,18:1#<checksum>
$m0E0F#<checksum>
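
A client side sketch of this exchange in Python, under a few assumptions: the usual gdb remote framing is used (checksum is the sum of the payload bytes modulo 256, as two hex digits), the type value 1 is just the placeholder from the example above, and all function names are made up for illustration.

```python
def frame_packet(payload: str) -> str:
    """Wrap a payload in gdb remote framing: $<payload>#<2 hex digit checksum>."""
    checksum = sum(payload.encode("ascii")) % 256
    return "${}#{:02x}".format(payload, checksum)


def qmemtags_request(addr: int, length: int, tag_type: int = 1) -> str:
    """Build a framed qMemTags packet for `length` bytes starting at `addr`."""
    return frame_packet("qMemTags:{:x},{:x}:{:x}".format(addr, length, tag_type))


def parse_qmemtags_reply(payload: str) -> list:
    """Turn an unframed reply payload into a list of tags, one per byte."""
    if payload == "":
        raise NotImplementedError("server does not support qMemTags")
    if payload.startswith("E"):
        raise RuntimeError("server reported an error: " + payload)
    if not payload.startswith("m"):
        raise ValueError("unexpected reply: " + payload)
    return list(bytes.fromhex(payload[1:]))


print(qmemtags_request(0xCAFEF00D, 24))
print(parse_qmemtags_reply("m0E0F"))
```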

### QMemTags (new)

Write memory tags to the target. (lldb-server will use
PTRACE_POKEMTETAGS to do this)

QMemTags:address,length:type:tags

* address - big endian hex address of the start of the range to write to.
  (which the ptrace interface will align for us)
* length - big endian hex length in bytes of the range to be written to
  (see note)
* type - signed int indicating the tag type. For now there will only be one
  value, which means MTE.
* tags - hex encoded bytes of the tags to be written
  (one tag per byte/per 2 hex chars)

Note: The length does not have to match the number of tags given. If it is more
than the given tags can cover, the tags are taken as a pattern to apply.
Examples: (remember 1 tag covers 1 granule/16 bytes)

Write 0 tags to 16 bytes -> Error, must have at least one tag to use
Write 1 tag to 0 bytes -> round up to next granule making 16, so write tag to 1st granule
Write 1 tag to 16 bytes -> writes 1st tag to 1st granule
Write 1 tag to 32 bytes -> writes 1st tag to the 1st and 2nd granule
Write 2 tags to 16 bytes -> writes the 1st tag to the 1st granule, 2nd tag is unused

In this way you can do bulk operations like clear all tags or stripe
them throughout some range.

Example packet, writing the tags for the next 24 bytes of memory. Setting
granule 1 to 0xE and the next to 0xF:
$QMemTags:CAFEF00D,18:1:0E0F#<checksum>
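
As a sketch of both ends of this packet, the Python below builds the unframed payload and splits it back into its fields, as a server might before applying the tags. Function names are hypothetical, the type value 1 is the placeholder from the example, and framing and checksum are left out.

```python
def qmemtags_write_payload(addr: int, length: int, tags: list, tag_type: int = 1) -> str:
    """Build the unframed QMemTags payload, packing one tag per byte."""
    if not tags:
        raise ValueError("must have at least one tag to use")
    if any(not 0 <= tag <= 0xF for tag in tags):
        raise ValueError("tags must be in the range 0x0 to 0xF")
    return "QMemTags:{:x},{:x}:{:x}:{}".format(addr, length, tag_type, bytes(tags).hex())


def parse_qmemtags_write(payload: str) -> tuple:
    """Split a QMemTags payload into (addr, length, type, tags)."""
    prefix = "QMemTags:"
    if not payload.startswith(prefix):
        raise ValueError("not a QMemTags packet")
    range_part, type_part, tags_part = payload[len(prefix):].split(":")
    addr_text, length_text = range_part.split(",")
    tags = list(bytes.fromhex(tags_part))
    return int(addr_text, 16), int(length_text, 16), int(type_part, 16), tags


payload = qmemtags_write_payload(0xCAFEF00D, 24, [0xE, 0xF])
print(payload)
print(parse_qmemtags_write(payload))
```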

## Toolchain Requirements

We’re just using ptrace interfaces for this, we do not need tools capable of
assembling MTE instructions to be able to build. So there’ll be another header
in source/Plugins/Process/Linux/ containing the ptrace defines.

For some of the testing we will need an MTE toolchain to compile the test
programs, same for corefiles.

## Testing

As much as possible will be done without needing an MTE system. Tests that need
an actual memory tagging enabled system will be run using QEMU system mode
emulation. A document will be added to lldb’s documentation describing how to
run the tests. (or added to the SVE testing docs)

## Interaction with Pointer Authentication

Armv8.3-a Pointer Authentication (PAC) also uses the upper bits of a pointer to
store metadata. PAC and MTE can be enabled in the same system and will share
those bits.
(ARM ARMv8 “Supported PAC field and relation to the use of address tagging”)

The position of the MTE tag does not change when PAC is enabled, so commands
do not need to check this first.

I think given the difference between the two schemes they should have separate
commands. (MTE being a bitfield, PAC involving keys stored elsewhere)
Generic features like reporting mismatched tags/keys when reading memory could
apply to both so settings regarding that could be named generically.

<end doc>