[cfe-dev] RFC: Nullability qualifiers
Aaron Ballman
aaron at aaronballman.com
Fri Jun 26 14:40:51 PDT 2015
On Fri, Jun 26, 2015 at 5:36 PM, b17 c0de <b17c0de at gmail.com> wrote:
> How can one detect whether an Apple clang supports the new nullability
> attributes? I tried something like:
>
> #if __has_attribute(_Nonnull)
> #elif __has_attribute(__nonnull)
> #define _Nonnull __nonnull
> #else
> #define _Nonnull
> #endif
>
> But this didn't work. Why doesn't _Nonnull/__nonnull work with
> __has_attribute?
__has_attribute is used to test for GNU-style attribute support only.
To test for nullability, you should use: __has_feature(nullability)
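A minimal sketch of that check, in case it helps. The fallback definition of __has_feature for non-Clang compilers and the helper first_byte are illustrative assumptions, not anything shipped in Clang's headers:

```c
#include <stddef.h>

/* __has_feature is a Clang extension; give other compilers a harmless fallback. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Without nullability support, let the qualifiers expand to nothing. */
#if !__has_feature(nullability)
#define _Nonnull
#define _Nullable
#define _Null_unspecified
#endif

/* Hypothetical helper: callers are expected to pass a non-null string.
   The qualifier documents the contract; it adds no runtime check. */
static int first_byte(const char *_Nonnull s)
{
    return (unsigned char)s[0];
}
```

This does not cover a compiler that reports the feature but only knows the older double-underscore spellings; that case would need an extra mapping.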
~Aaron
>
> On Wed, Jun 24, 2015 at 10:39 PM, Douglas Gregor <dgregor at apple.com> wrote:
>>
>> Another addendum: due to the conflict with glibc’s __nonnull, we’ll be
>> renaming the __double_underscored keywords to _Big_underscored keywords,
>> e.g.,
>>
>> __nonnull -> _Nonnull
>> __nullable -> _Nullable
>> __null_unspecified -> _Null_unspecified
>>
>> On Darwin, we’ll add predefines
>>
>> #define __nonnull _Nonnull
>> #define __nullable _Nullable
>> #define __null_unspecified _Null_unspecified
>>
>> to keep the existing headers working.
>>
>> - Doug
>>
>> On Mar 2, 2015, at 1:22 PM, Douglas Gregor <dgregor at apple.com> wrote:
>>
>> Hello all,
>>
>> Null pointers are a significant source of problems in applications.
>> Whether it’s SIGSEGV taking down a process or a foolhardy attempt to recover
>> from NullPointerException breaking invariants everywhere, it’s a problem
>> that’s bad enough for Tony Hoare to call the invention of the null reference
>> his billion dollar mistake [1]. It’s not the ability to create a null
>> pointer that is a problem—having a common sentinel value meaning “no value”
>> is extremely useful—but that it’s very hard to determine whether, for a
>> particular pointer, one is expected to be able to use null. C doesn’t
>> distinguish between “nullable” and “nonnull” pointers, so we turn to
>> documentation and experimentation. Consider strchr from the C standard
>> library:
>>
>> char *strchr(const char *s, int c);
>>
>> It is “obvious” to a programmer who knows the semantics of strchr that
>> it’s important to check for a returned null, because null is used as the
>> sentinel for “not found”. Of course, your tools don’t know that, so they
>> cannot help when you completely forget to check for the null case. Bugs
>> ensue.
>>
>> Can I pass a null string to strchr? The standard is unclear [2], and my
>> platform’s implementation happily accepts a null parameter and returns null,
>> so obviously I shouldn’t worry about it… until I port my code, or the
>> underlying implementation changes because my expectations and the library
>> implementor’s expectations differ. Given the age of strchr, I suspect that
>> every implementation out there has an explicit, defensive check for a null
>> string, because it’s easier to add yet more defensive (and generally
>> useless) null checks than it is to ask your clients to fix their code. Scale
>> this up, and code bloat ensues, as well as wasted programmer effort that
>> obscures the places where checking for null really does matter.
>>
>> In a recent version of Xcode, Apple introduced an extension to
>> C/C++/Objective-C that expresses the nullability of pointers in the type
>> system via new nullability qualifiers. Nullability qualifiers express
>> nullability as part of the declaration of strchr [2]:
>>
>> __nullable char *strchr(__nonnull const char *s, int c);
>>
>> With this, programmers and tools alike can better reason about the use of
>> strchr with null pointers.
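For illustration, a rough sketch of how a caller might consume such a contract. It uses the _Nonnull/_Nullable spellings from the addendum, written after the `*`; find_char and offset_of are hypothetical helpers, and the fallback macros are only there so that compilers without the feature still accept the code:

```c
#include <stddef.h>
#include <string.h>

#ifndef __has_feature
#define __has_feature(x) 0   /* non-Clang compilers */
#endif
#if !__has_feature(nullability)
#define _Nonnull
#define _Nullable
#endif

/* Hypothetical wrapper around strchr with its contract spelled out:
   the input must not be null, the result may be null ("not found"). */
static const char *_Nullable find_char(const char *_Nonnull s, int c)
{
    return strchr(s, c);
}

/* Returns the offset of c in s, or -1 when c does not occur. */
static long offset_of(const char *_Nonnull s, int c)
{
    const char *_Nullable hit = find_char(s, c);
    if (hit == NULL)   /* the check that tools can now ask for */
        return -1;
    return (long)(hit - s);
}
```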
>>
>> We’d like to contribute the implementation (and there is a patch attached
>> at the end [3]), but since this is a nontrivial extension to all of the C
>> family of languages that Clang supports, we believe that it needs to be
>> discussed here first.
>>
>> Goals
>> We have several specific goals that informed the design of this feature.
>>
>> Allow the intended nullability to be expressed on all pointers: Pointers
>> are used throughout library interfaces, and the nullability of those
>> pointers is an important part of the API contract with users. It’s too
>> simplistic to only allow function parameters to have nullability, for
>> example, because it’s also important information for data members,
>> pointers-to-pointers (e.g., “a nonnull pointer to a nullable pointer to an
>> integer”), arrays of pointers, etc.
>>
>> Enable better tools support for detecting nullability problems: The
>> nullability annotations should be useful for tools (especially the static
>> analyzer) that can reason about the use of null, to give warnings about both
>> missed null checks (the result of strchr could be null…) and
>> unnecessarily defensive code.
>>
>> Support workflows where all interfaces provide nullability annotations: In
>> moving from a world where there are no nullability annotations to one where
>> we hope to see many such annotations, we’ve found it helpful to move
>> header-by-header, auditing a complete header to give it nullability
>> qualifiers. Once one has done that, additions to the header need to be held
>> to the same standard, so we need a design that allows us to warn about
>> pointers that don’t provide nullability annotations for some declarations in
>> a header that already has some nullability annotations.
>>
>> Zero effect on ABI or code generation: There are a huge number of
>> interfaces that could benefit from the use of nullability qualifiers, but we
>> won’t get widespread adoption if introducing the nullability qualifiers
>> means breaking existing code, either in the ABI (say, because nullability
>> qualifiers are mangled into the type) or at execution time (e.g., because a
>> non-null pointer ends up being null along some error path and causes
>> undefined behavior).
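As a sketch of the first goal, here is what annotations beyond function parameters could look like: a data member, and a pointer to an array of pointers. The struct, the helper names, and the fallback macros are assumptions made for this example, and the qualifiers use the _Nonnull/_Nullable spellings from the addendum:

```c
#include <stddef.h>

#ifndef __has_feature
#define __has_feature(x) 0   /* non-Clang compilers */
#endif
#if !__has_feature(nullability)
#define _Nonnull
#define _Nullable
#endif

/* Hypothetical singly linked list node. */
struct node {
    const char *_Nonnull name;     /* every node has a name */
    struct node *_Nullable next;   /* null marks the end of the list */
};

/* names: a non-null pointer to the first of `count` nullable strings. */
static size_t count_present(const char *_Nullable const *_Nonnull names,
                            size_t count)
{
    size_t present = 0;
    for (size_t i = 0; i < count; ++i)
        if (names[i] != NULL)
            ++present;
    return present;
}

static size_t list_length(const struct node *_Nullable head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        ++n;
    return n;
}

/* Small drivers showing how the annotated interfaces are called. */
static size_t demo_names(void)
{
    const char *const names[] = { "argc", NULL, "envp" };
    return count_present(names, 3);
}

static size_t demo_list(void)
{
    struct node tail = { "tail", NULL };
    struct node head = { "head", &tail };
    return list_length(&head);
}
```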
>>
>>
>>
>>
>> Why not __attribute__((nonnull))?
>> Clang already has an attribute to express nullability, “nonnull”, which we
>> inherited from GCC [4]. The “nonnull” attribute can be placed on functions
>> to indicate which parameters cannot be null: one either specifies the
>> indices of the arguments that cannot be null, e.g.,
>>
>> extern void *my_memcpy (void *dest, const void *src, size_t len)
>> __attribute__((nonnull (1, 2)));
>>
>> or omits the list of indices to state that all pointer arguments cannot be
>> null, e.g.,
>>
>> extern void *my_memcpy (void *dest, const void *src, size_t len)
>> __attribute__((nonnull));
>>
>> More recently, “nonnull” has grown the ability to be applied to
>> parameters, and one can use the companion attribute returns_nonnull to state
>> that a function returns a non-null pointer:
>>
>> extern void *my_memcpy (__attribute__((nonnull)) void *dest,
>> __attribute__((nonnull)) const void *src, size_t len)
>> __attribute__((returns_nonnull));
>>
>> There are a number of problems here. First, there are different attributes
>> to express the same idea at different places in the grammar, and the fact
>> that the “nonnull” attribute on the function actually has an effect on the
>> function parameters can get very, very confusing. Quick, which pointers are
>> nullable vs. non-null in this example?
>>
>> __attribute__((nonnull)) void *my_realloc (void *ptr, size_t size);
>>
>> According to that declaration, ptr is nonnull and the function returns a
>> nullable pointer… but that’s the opposite of how it reads (and behaves, if
>> this is anything like a realloc that cannot fail). Moreover, because these
>> two attributes are declaration attributes, not type attributes, you cannot
>> express the nullability of the inner pointer in a multi-level pointer or an
>> array of pointers, which makes these attributes verbose, confusing, and not
>> sufficiently general. These attributes fail the first of our goals.
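To illustrate the multi-level case, a sketch of an out-parameter whose outer pointer must be valid while the pointer it receives may be null, which a declaration attribute cannot say. The lookup function, its table, and the fallback macros are hypothetical; the qualifiers use the _Nonnull/_Nullable spellings from the addendum:

```c
#include <stddef.h>
#include <string.h>

#ifndef __has_feature
#define __has_feature(x) 0   /* non-Clang compilers */
#endif
#if !__has_feature(nullability)
#define _Nonnull
#define _Nullable
#endif

/* Hypothetical lookup: `out` itself must be valid, but the pointer stored
   through it is null when the key is not found. */
static int lookup(const char *_Nonnull key,
                  const char *_Nullable *_Nonnull out)
{
    static const char *const keys[]   = { "cc", "cxx" };
    static const char *const values[] = { "clang", "clang++" };
    for (size_t i = 0; i < sizeof keys / sizeof keys[0]; ++i) {
        if (strcmp(key, keys[i]) == 0) {
            *out = values[i];
            return 1;
        }
    }
    *out = NULL;
    return 0;
}

/* Convenience wrapper: returns the value, or null when the key is unknown. */
static const char *_Nullable lookup_value(const char *_Nonnull key)
{
    const char *value = NULL;
    (void)lookup(key, &value);
    return value;
}
```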
>>
>> These attributes aren’t as useful as they could be for tools support (the
>> second and third goals), because they only express the nonnull case, leaving
>> no way to distinguish between the unannotated case (nobody has documented
>> the nullability of some parameter) and the nullable case (we know the
>> pointer can be null). From a tooling perspective, this is a killer: the
>> static analyzer absolutely cannot warn that one has forgotten to check for
>> null for every unannotated pointer, because the false-positive rate would be
>> astronomical.
>>
>> Finally, we’ve recently started considering violations of the
>> __attribute__((nonnull)) contract to be undefined behavior, which fails the
>> last of our goals. This is something we could debate further if it were the
>> only problem, but these declaration attributes fail all of our criteria, so
>> it’s not worth discussing.
>>
>> Nullability Qualifiers
>> We propose the addition of a new set of type qualifiers, spelled
>> __nullable, __nonnull, and __null_unspecified, to Clang. These are
>> collectively known as nullability qualifiers and may be written anywhere any
>> other type qualifier may be written (such as const) on any type subject to
>> the following restrictions:
>>
>> Two nullability qualifiers shall not appear in the same set of qualifiers.
>>
>> A nullability qualifier shall qualify any pointer type, including pointers
>> to objects, pointers to functions, C++ pointers to members, block pointers,
>> and Objective-C object pointers.
>>
>> A nullability qualifier in the declaration-specifiers applies to the
>> innermost pointer type of each declarator (e.g., __nonnull int * is
>> equivalent to int * __nonnull).
>>
>> A nullability qualifier applied to a typedef of a nullability-qualified
>> pointer type shall specify the same nullability as the underlying type of
>> the typedef.
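A small sketch of a qualifier on a pointer to function, written in the same position where const would go for the pointer itself. The typedef and helper names are made up for this example, the fallback macros are assumptions for compilers without the feature, and the sketch deliberately sticks to the after-the-`*` placement rather than the declaration-specifier form:

```c
#include <stddef.h>

#ifndef __has_feature
#define __has_feature(x) 0   /* non-Clang compilers */
#endif
#if !__has_feature(nullability)
#define _Nonnull
#define _Nullable
#endif

/* A nullable pointer to function: NULL means "no transform". */
typedef int (*_Nullable transform_fn)(int);

/* Applies f to x when a transform was supplied, otherwise returns x. */
static int apply(transform_fn f, int x)
{
    return f != NULL ? f(x) : x;
}

static int twice(int x)
{
    return 2 * x;
}
```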
>>
>>
>> The meanings of the three nullability qualifiers are as follows:
>>
>> __nullable: the pointer may store a null value at runtime (as part of the
>> API contract).
>>
>> __nonnull: the pointer should not store a null value at runtime (as part
>> of the API contract). It is possible that the value can be null, e.g., in
>> erroneous historic uses of an API, and it is up to the library implementor
>> to decide to what degree she will accommodate such clients.
>>
>> __null_unspecified: it is unclear whether the pointer can be null or not.
>> Use of this type qualifier is extremely rare in practice, but it fills a
>> small but important niche when auditing a particular header to add
>> nullability qualifiers: sometimes, for historical reasons, the nullability
>> contract for a few APIs in the header is unclear even when looking at the
>> implementation, and establishing the contract requires more extensive
>> study. In such cases, it’s often best to mark that pointer as
>> __null_unspecified (which will help silence the warning about unannotated
>> pointers in a header) and move on, coming back to __null_unspecified
>> pointers when the appropriate graybeard has been summoned out of
>> retirement [5].
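A sketch of what the third qualifier might look like during an audit, using the _Null_unspecified spelling from the addendum. legacy_length is a hypothetical API whose contract has not been settled, and the fallback macros are assumptions for compilers without the feature:

```c
#include <stddef.h>
#include <string.h>

#ifndef __has_feature
#define __has_feature(x) 0   /* non-Clang compilers */
#endif
#if !__has_feature(nullability)
#define _Nonnull
#define _Nullable
#define _Null_unspecified
#endif

/* Hypothetical legacy API, mid-audit: nobody is sure yet whether callers
   pass null, so the parameter is marked as unspecified for now. */
static size_t legacy_length(const char *_Null_unspecified s)
{
    /* Until the contract is settled, the implementation stays defensive. */
    return s != NULL ? strlen(s) : 0;
}
```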
>>
>> Assumes-nonnull Regions
>> We’ve found that it's fairly common for the majority of pointers within a
>> particular header to be __nonnull. Therefore, we’ve introduced
>> assumes-nonnull regions that assume that certain unannotated pointers
>> implicitly get the __nonnull nullability qualifiers. Assumes-nonnull regions
>> are marked by pragmas:
>>
>> #pragma clang assume_nonnull begin
>> __nullable char *strchr(const char *s, int c); // s is inferred to be __nonnull
>> void *my_realloc (__nullable void *ptr, size_t size); // my_realloc is inferred to return __nonnull
>> #pragma clang assume_nonnull end
>>
>> We infer __nonnull within an assumes-nonnull region when:
>>
>> The pointer is a non-typedef declaration, such as a function parameter,
>> variable, or data member, or the result type of a function. It’s very rare
>> for one to want typedefs to specify nullability information; rather, it’s
>> usually the user of the typedef that needs to specify nullability.
>> The pointer is a single-level pointer, e.g., int* but not int**, because
>> we’ve found that programmers can get confused about the nullability of
>> multi-level pointers (is it a __nullable pointer to __nonnull pointers, or
>> the other way around?) and inferring nullability for any of the pointers in
>> a multi-level pointer compounds the situation.
>>
>>
>> Note that no #include may occur within an assumes-nonnull region, and
>> assumes-nonnull regions cannot cross header boundaries.
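A sketch of a region in use, with the includes kept outside it as required. The two functions are hypothetical, the pragma is guarded so that compilers without the feature skip it (an assumption of this example, not part of the proposal), and the one explicit qualifier uses the _Nullable spelling from the addendum:

```c
#include <stddef.h>
#include <string.h>

#ifndef __has_feature
#define __has_feature(x) 0   /* non-Clang compilers */
#endif
#if !__has_feature(nullability)
#define _Nonnull
#define _Nullable
#endif

#if __has_feature(nullability)
#pragma clang assume_nonnull begin
#endif
/* Both parameters below are unannotated single-level pointers, so inside
   the region they are treated as non-null. Only the exception is spelled out. */
static size_t safe_length(const char *s);
static const char *_Nullable skip_prefix(const char *s, const char *prefix);
#if __has_feature(nullability)
#pragma clang assume_nonnull end
#endif

static size_t safe_length(const char *s)
{
    return strlen(s);
}

/* Returns the part of s after prefix, or null when s lacks that prefix. */
static const char *_Nullable skip_prefix(const char *s, const char *prefix)
{
    size_t n = strlen(prefix);
    return strncmp(s, prefix, n) == 0 ? s + n : NULL;
}
```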
>>
>> Type System Impact
>> Nullability qualifiers are mapped to type attributes within the Clang type
>> system, but a nullability-qualified pointer type is not semantically
>> distinct from its unqualified pointer type. Therefore, one may freely
>> convert between nullability-qualified and non-nullability-qualified
>> pointers, or between nullability-qualified pointers with different
>> nullability qualifiers. One cannot overload on nullability qualifiers, write
>> C++ class template partial specializations that identify nullability
>> qualifiers, or inspect nullability via type traits in any way.
>>
>> Said more strongly, removing nullability qualifiers from a well-formed
>> program will not change its behavior in any way, nor will the semantics of a
>> program change when any set of (well-formed) nullability qualifiers are
>> added to it. Operationally, this means that nullability qualifiers are not
>> part of the canonical type in Clang’s type system, and that any warnings we
>> produce based on nullability information will necessarily be dependent on
>> Clang’s ability to retain type sugar during semantic analysis.
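A sketch of what "not semantically distinct" means in practice: pointers with different nullability convert to one another without a cast, and function pointer types that differ only in nullability are interchangeable. The names and the fallback macros are assumptions for this example; a compiler may still offer optional warnings on such conversions:

```c
#include <stddef.h>
#include <string.h>

#ifndef __has_feature
#define __has_feature(x) 0   /* non-Clang compilers */
#endif
#if !__has_feature(nullability)
#define _Nonnull
#define _Nullable
#endif

static size_t length_or_zero(const char *_Nullable s)
{
    return s != NULL ? strlen(s) : 0;
}

static size_t demo_conversions(void)
{
    const char *_Nullable maybe = "clang";
    const char *_Nonnull sure = maybe;   /* no cast needed */
    const char *plain = sure;            /* back to an unannotated pointer */

    /* The parameter nullability differs, but the function type does not. */
    size_t (*fp)(const char *_Nonnull) = length_or_zero;
    return fp(plain);
}
```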
>>
>> While it’s somewhat exceptional for us to introduce new type qualifiers
>> that don’t produce semantically distinct types, we feel that this is the
>> only plausible design and implementation strategy for this feature: pushing
>> nullability qualifiers into the type system semantically would cause
>> significant changes to the language (e.g., overloading, partial
>> specialization) and break ABI (due to name mangling) that would drastically
>> reduce the number of potential users, and we feel that Clang’s support for
>> maintaining type sugar throughout semantic analysis is generally good enough
>> [6] to get the benefits of nullability annotations in our tools.
>>
>> Looking forward to our discussion.
>>
>> - Doug (with Jordan Rose and Anna Zaks)
>>
>> [1] http://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retractions
>> [2] The standard description of strchr seems to imply that the parameter
>> cannot be null
>> [3] The patch is complete, but should be reviewed on cfe-commits rather
>> than here. There are also several logical parts to this monolithic patch:
>> (a) __nonnull/__nullable/__null_unspecified type specifiers
>> (b) nonnull/nullable/null_unspecified syntactic sugar for Objective-C
>> (c) Warning about inconsistent application of nullability specifiers
>> within a given header
>> (d) assume_nonnull begin/end pragmas
>> (e) Objective-C null_resettable property attribute
>> [4] https://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html (search
>> for “nonnull”)
>> [5] No graybeards were harmed in the making of this feature.
>> [6] Template instantiation is the notable exception here, because it
>> always canonicalizes types.
>>
>> <nullability.patch>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev