[cfe-dev] [RFC] Upstreaming Lifetime Function Annotations

Gábor Horváth via cfe-dev cfe-dev at lists.llvm.org
Wed Dec 4 15:02:12 PST 2019


Hi!


1. Introduction


As you might know, there is an ongoing effort to implement Herb Sutter’s
lifetime analysis [1] in Clang [2]. Not too long ago, we upstreamed the
Pointer and Owner annotations, including a set of statement-local checks and
some hard-coded STL function annotations. Those new warnings are on by
default in ToT, and they have proved very useful for finding bugs in a variety
of open source projects, including LLVM [3], while maintaining virtually zero
false positives (modulo potential bugs). As a next step, we would like to
propose function annotations. A prototype is available in a fork [4].


2. Motivation


Consider the following function and its call site:


  const char *find(const string &haystack, const string &needle);

  auto match = find("Hi cfe-dev!", someLValue);


We might easily see the dangling problem in the code above but it is much
harder for the compiler to diagnose it. The main problem is that the
compiler does not know what the relationship is between the arguments and
the return value. If the compiler knew that the return value is either null
or it will point into the first argument, it would be able to diagnose that
the returned pointer dangles immediately. We propose an annotation language
to describe such relationships. This language could be useful both for
statement-local, on-by-default, no-false-positive warnings and other bug
finding tools such as Clang Tidy or the Clang Static Analyzer. Information
like this is often hard-coded into checks in the compiler or in other tools.
Using this annotation language, we could have a uniform way to encode this
knowledge that is understood by all the tools under the Clang umbrella.
This could also improve the user experience and ease the introduction of
one Clang tool to a project that is already using other Clang tools.
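
To make the dangling problem concrete, here is a minimal sketch. The body of
find and the helper safe_usage are assumptions made for illustration; the
example above only gives the declaration.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical body for the `find` declared above: returns a pointer into
// `haystack`'s buffer, or nullptr when `needle` does not occur in it.
const char *find(const std::string &haystack, const std::string &needle) {
  const std::string::size_type pos = haystack.find(needle);
  return pos == std::string::npos ? nullptr : haystack.c_str() + pos;
}

// Safe usage: the haystack is a named object that outlives the result.
bool safe_usage() {
  const std::string haystack = "Hi cfe-dev!";
  const std::string needle = "cfe";
  const char *match = find(haystack, needle);
  // `match` points into `haystack`, which is still alive here.
  return match != nullptr && std::strncmp(match, "cfe", 3) == 0;
}

// Dangling usage (left commented out on purpose):
//   const char *match = find("Hi cfe-dev!", needle);
// The temporary std::string built from the literal is destroyed at the end
// of the full expression, so `match` would dangle right away.
```

Nothing in the signature alone tells the compiler that the result points into
the first argument, which is the gap the annotations are meant to fill.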


3. Design


There is no separate design document for the annotation language, but it is
part of the flow-sensitive lifetime analysis by Herb [6]. Despite this
fact, we believe that these annotations are universally useful for other
analyses as well.


To summarize the paper:


The annotations would be placed after the function declaration, where all the
symbols are available. We would like to follow the contracts proposal in
almost every detail. The plan is to turn those annotations into pre- and
postconditions once contracts are implemented.


The annotation for the motivational example would look like this:

  const char *find(const string &haystack, const string &needle)
      [[gsl::post(lifetime(find, {haystack, null}))]];


Whenever the annotation is about the return value, we use the function
name. We can read this annotation the following way: the value returned by
find is either null or its lifetime is tied to the lifetime of haystack. In
the current design, the target of the lifetime attribute must be a
parameter or the return value whose type is a raw pointer, a reference, or a
(possibly implicitly) gsl::Pointer-annotated user-defined type.

We also support output arguments and fields. Let’s look at a slightly
different variant of a find API.

  struct Match { const char *pos; /* ... */ };
  bool find(const string &hs, const string &n, Match *m)
    [[gsl::post(lifetime(deref(m).pos, {hs}))]];
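
As a rough sketch of what this postcondition describes, consider one possible
body for the output-argument variant. The implementation and the helper
output_argument_usage are assumptions for illustration, and the proposed
annotation itself is left out because it is not implemented in released
compilers.

```cpp
#include <cassert>
#include <string>

struct Match {
  const char *pos = nullptr;  // points into the searched string on success
};

// Hypothetical body for the output-argument `find` declared above.
// On success, m->pos points into the buffer owned by `hs`.
bool find(const std::string &hs, const std::string &n, Match *m) {
  const std::string::size_type at = hs.find(n);
  if (at == std::string::npos) return false;
  m->pos = hs.c_str() + at;
  return true;
}

bool output_argument_usage() {
  const std::string hs = "Hi cfe-dev!";
  const std::string n = "cfe";
  Match m;
  const bool found = find(hs, n, &m);
  // `hs` is still alive, so reading through m.pos is fine here.
  return found && m.pos == hs.c_str() + 3;
}
```

If hs were a temporary, m.pos would dangle after the call even though the
function returns only a bool, which is why the field of the output argument
needs its own annotation.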


We can also have preconditions, e.g. to express that two iterators must be
obtained from the same container:

  template<class It, class T>
  It find(It begin, It end, T val)
      [[gsl::pre(lifetime(begin, {end}))]];


This could help diagnose calls like:
  auto result = find(a.begin(), b.end(), val);


We plan to include a small library to support this. The identifiers used
within the gsl::pre/post annotations would be part of the gsl namespace.
The examples assume the presence of the corresponding using statements.
Users will need to specialize a template so their type can be dereferenced
in the annotations. The reason is that not all types use operator* for this
purpose; for example, there might be smart pointers in code bases where
operator overloading is forbidden.


3a. Relation to lifetimebound

Clang already has a similar attribute, lifetimebound [5], which can
express a subset of the proposed function annotations.

Example:

  const V& findOrDefault(const map<K, V>& m, const K& key,
                         const V& defvalue [[clang::lifetimebound]]);

is equivalent to

  const V& findOrDefault(const map<K, V>& m, const K& key,
                         const V& defvalue)
    [[gsl::post(lifetime(findOrDefault, {defvalue}))]];


  template<...> class basic_string_view {
    …
    basic_string_view(
      const basic_string& str [[clang::lifetimebound]]) noexcept;
  };

is equivalent to

  template<...> class basic_string_view {
    …
    basic_string_view(const basic_string& str) noexcept
      [[gsl::post(lifetime(*this, str))]];
  };


While the lifetimebound annotation has a similar purpose, it is less
expressive. It is not compatible with output arguments or field selection,
and it cannot express nullability.
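
For comparison, the existing attribute can be tried today. The following is a
minimal sketch; the LIFETIMEBOUND macro, the function body and the helper
lifetimebound_usage are assumptions added for illustration, while
[[clang::lifetimebound]] itself is the attribute documented in [5].

```cpp
#include <cassert>
#include <map>
#include <string>

// Expands to the Clang attribute when the compiler knows it and to nothing
// otherwise. LIFETIMEBOUND is a macro name chosen for this sketch only.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::lifetimebound)
#    define LIFETIMEBOUND [[clang::lifetimebound]]
#  endif
#endif
#ifndef LIFETIMEBOUND
#  define LIFETIMEBOUND
#endif

// Same shape as the findOrDefault example above, with a hypothetical body.
template <class K, class V>
const V &findOrDefault(const std::map<K, V> &m, const K &key,
                       const V &defvalue LIFETIMEBOUND) {
  auto it = m.find(key);
  return it == m.end() ? defvalue : it->second;
}

bool lifetimebound_usage() {
  const std::map<std::string, std::string> m = {{"a", "one"}};
  const std::string fallback = "none";
  const std::string hit_key = "a";
  const std::string miss_key = "b";
  // Both references stay valid: `m` and `fallback` outlive them.
  const std::string &hit = findOrDefault(m, hit_key, fallback);
  const std::string &miss = findOrDefault(m, miss_key, fallback);
  return hit == "one" && miss == "none";
}

// Problematic usage (left commented out on purpose):
//   const std::string &r = findOrDefault(m, hit_key, std::string("tmp"));
// The temporary default dies at the end of the full expression. With the
// attribute in place, Clang is expected to be able to warn about this call.
```

Note that the attribute can only say that the result may refer to defvalue; it
has no way to state that the result may also be null or that a field of an
output argument is involved.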


4. Bikeshedding


We are open to discussing any details of the syntax. We also need to discuss
whether we are willing to accept that using these annotations might require a
small support library, and how to handle this.


5. Upstreaming


We plan to contribute incremental changes based on the prototype we have.
The first part would only include the representation of the annotations
without any of the parsing and syntax. This could help us replace some of
the hard-coded ad-hoc information using this new format and give us (the
community) more time to think about bikeshedding. After the representation
is upstreamed, we plan to start contributing the actual parsing logic.


6. Conclusion


Adding an annotation language might be scary, but this is only intended for
library authors. The annotations could greatly improve the precision of
existing static analysis tools and increase the uniformity of the clang
ecosystem. We could also introduce new, useful, and efficient warnings that
have no false positives.


What do you think? Is the community open to seeing this upstreamed?


Cheers,

Gabor


[1]: https://herbsutter.com/2018/09/20/lifetime-profile-v1-0-posted/

[2]: http://lists.llvm.org/pipermail/cfe-dev/2018-November/060355.html

[3]: https://www.youtube.com/watch?v=d67kfSnhbpA

[4]: https://github.com/mgehre/llvm-project

[5]: https://clang.llvm.org/docs/AttributeReference.html#lifetimebound

[6]: https://github.com/isocpp/CppCoreGuidelines/blob/master/docs/Lifetime.pdf