[PATCH] D17993: [CodeGen] Apply 'nonnull' and 'dereferenceable(N)' to 'this' pointer arguments.

Richard Smith - zygoloid via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Nov 17 16:29:10 PST 2020


rsmith added inline comments.


================
Comment at: clang/lib/CodeGen/CGCall.cpp:2169
+    if (!CodeGenOpts.NullPointerIsValid &&
+        getContext().getTargetAddressSpace(FI.arg_begin()->type) == 0) {
+      Attrs.addAttribute(llvm::Attribute::NonNull);
----------------
jdoerfert wrote:
> rjmccall wrote:
> > rsmith wrote:
> > > jdoerfert wrote:
> > > > arichardson wrote:
> > > > > Isn't the `this` pointer also nonnull in other address spaces?
> > > > > 
> > > > > In our CHERI fork we use AS200 for the this pointer and would quite like to have the nonnull attribute.
> > > > > I can obviously change this line locally when I next merge from upstream, but I would like to avoid diffs and it seems to me like this restriction is unnecessary.
> > > > I also think `NullPointerIsValid` is sufficient. 
> > > It's my understanding that:
> > > * The LLVM `null` value in any address space is the all-zero-bits value.
> > > * In address space zero, the `null` value does not correspond to addressable memory, but this is not assumed to hold in other address spaces.
> > > * An address-space-zero `null` value that is addressspacecast to a different address space might not be the `null` in the target address space.
> > > * The `nonnull` attribute implies that the pointer value is not the `null` value.
> > > * A null pointer in the frontend in a non-zero address space corresponds to the value produced by an addressspacecast of an address-space-zero `null` value to the target address space.
> > > 
> > > That being the case, there is simply no connection between the C and C++ notion of a null pointer and a `null` LLVM pointer value in a non-zero address space in general, so it is not correct to use the `nonnull` attribute in a non-zero address space in general. Only if we know that a C++ null pointer is actually represented by the LLVM `null` value in the corresponding address space can we use the `nonnull` attribute to expose that fact to LLVM. And we do not know that in general.
> > I think all of this is correct except that the frontend does know what the bit-pattern of the null pointer is in any particular language-level address space, and it knows what the language-level address space of `this` is.  So we should be able to ask whether the null value in the `this` address space is the all-zero value and use that to decide whether to apply `nonnull`.
> Hm, I think the problem is that we don't couple `NullPointerIsValid` with the address space. As I said before. In LLVM-IR, if we don't have the `null-pointer-is-valid` property, then "memory access" implies `dereferenceable` implies `nonnull`. Now, we usually assume non-0 address spaces to have the above property, but that should be decoupled. The query if "null is valid" should take the function and the AS, as it does conceptually in LLVM-core, and then decide if `null-pointer-is-valid` or not. If we had that, @arichardson could define that AS200 does not have valid null pointers. If you do that right now the IR passes will at least deduce `nonnull` based on `derferenceable`.
> I think all of this is correct except that the frontend does know what the bit-pattern of the null pointer is in any particular language-level address space

Interesting. How do we get at that? Do we ask the middle-end to constant-fold `addrspacecast i8* null to i8* addrspace(N)` and see if we get a `null` back, or is there a better way?

In any case, this patch follows the same pattern we use for return values of reference type, parameters of reference type, and decayed array function parameters with static array bounds, all of which apply `nonnull` only in address space 0. If we want to use a different pattern, to consider whether LLVM's `nonnull` means "not a null pointer" rather than assuming that holds only in address space 0, we should make a holistic change for that throughout CGCall.cpp, rather than touching only the handling of the `this` pointer.

It'd also be interesting to consider what we want `__attribute__((nonnull))` to mean in address spaces where the null pointer isn't the zero pointer: should it mean non-zero at the source level / non-null in LLVM IR, or should it mean non-null at the source level (which might be unrepresentable in LLVM IR, but static analysis etc. could still detect it)? We're currently inconsistent on this: static analysis, constant evaluation, and sanitizers treat the attribute as meaning non-null, but IR generation emits the `nonnull` IR attribute, treating it as meaning non-zero instead.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D17993/new/

https://reviews.llvm.org/D17993



More information about the cfe-commits mailing list