[PATCH] D142584: [CodeGen] Add a boolean flag to `Address::getPointer` and `Lvalue::getPointer` that indicates whether the pointer is known not to be null

John McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jan 25 19:58:20 PST 2023


rjmccall added a comment.

In D142584#4081345 <https://reviews.llvm.org/D142584#4081345>, @efriedma wrote:

> I'm having a little trouble following what the meaning of an "Address" is with this patch.  The `Pointer` member refers to an encoded value, and calling getPointer() emits IR to decode it?  Could use a few comments to

explain the API.

That's something of a future direction w.r.t this patch, but it's reasonable to talk about it, yeah.

Pointer authentication includes the ability to have signed data pointers.  Function pointers can only be used opaquely in C, but data pointers are subject to a lot of operations that boil down to pointer arithmetic, both as part of l-value accesses (such as `signedPtr->array[i] = 5`) and just in abstract pointer-typed expressions (such as `&ptr->array[i]`).  For l-values that are ultimately accessed, we really want to make sure that we do the authentication as close as possible to the actual access rather than authenticating it immediately and then having a raw, attackable pointer sitting around while we set up the access.  For pointer expressions in general, if they ultimately need to produce another signed pointer (e.g. we assign into a `__ptrauth`-qualified l-value), for similar reasons we really want to set up all the arithmetic ahead of time for a secure auth-add-and-resign sequence instead of authenticating and having raw, attackable pointers sitting around as we do the arithmetic.  We could do all that with a specialized pointer-with-signature emitter that covers literally every kind of pointer arithmetic, but that requires duplication of a *lot* of code, especially because we have a lot of sanitizers and other special logic also tied to those operations.  It seemed much less invasive to add a state to `Address` where it can be a pair of a signed pointer and an offset, then make a few basic GEP-formation operations accumulate into the offset when the pointer is in that state.  Of course, converting an `Address` into a normal pointer then becomes a non-trivial operation that needs a CGF.

Pointer authentication doesn't sign null pointers, so we have to do some explicit extra checks as part of this when the pointer is possibly null, and this patch is trying to make that easier to avoid.



================
Comment at: clang/lib/CodeGen/Address.h:26
+// Indicates whether a pointer is known not to be null.
+enum class KnownNonNull_t { False, True };
+
----------------
The usual idiom for this kind of boolean enum is that you just use an unscoped enum (not `enum class`), and you spell the enumerators something that reads well where it'll be used.  So you'd make the enumerators `KnownNotNull` and (I guess) something like `NotKnownNotNull` or `NullnessUnknown`.


================
Comment at: clang/lib/CodeGen/Address.h:97
+  llvm::Value *
+  getPointer(KnownNonNull_t KnownNonNull = KnownNonNull_t::False) const {
     assert(isValid());
----------------
Apologies if this rehashes a conversation we had earlier, but does it make more sense to track this internally in `Address`?  Some `Address` sources definitely known that they're non-null, and the places where we only know that *contextually* could definitely just pass it down to e.g. `EmitPointerWithAlignment`.  That would make it a less invasive change for all the clients of `getPointer()`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142584/new/

https://reviews.llvm.org/D142584



More information about the cfe-commits mailing list