[cfe-dev] [RFC] Adding lifetime analysis to clang

Mon Apr 15 05:26:35 PDT 2019

Thank you for summarizing the discussion and raising all these points.
Based on your response I propose the following plan moving forward:

* Start submitting patches regarding the non-controversial parts of the
analysis, including:
  - Adding the annotations to Clang
  - Generalize existing statement-local warnings
    - Technical question here: should we expect the STL vendors to annotate
the types, should we hard-code the annotations into the compiler, or should
we actually implement type inference but restrict it to STL types?
  - Adding some extra statement-local warnings
  - Adding the flow sensitive analysis

Can we add you as a reviewer to those patches?

While we add the non-controversial part we can continue to discuss the rest
on the mailing list and act according to the consensus we have later on.
Fortunately, not having the inference upstream will not stop us from
leveraging the rest of the work.

To start the discussions regarding the type category inference, as far as I
understand you are more worried about the false negatives that can give a
bad impression about the analysis rather than the false positives. One idea
we had is to have a refactoring that will automatically annotate the user
defined types based on the inference we have. This could give the users a
way to understand the root causes of such false negatives.
In case the community will not want automatic type category inference in
the compiler we are still likely to implement it as a Tidy check with a
refactoring to make it easier for users to adopt this analysis.

What do you think?

Cheers,
Gabor

On Fri, 12 Apr 2019 at 16:04, Dmitri Gribenko <gribozavr at gmail.com> wrote:

> Gabor and I chatted privately, and I wanted to summarize the discussion.
>
> === Concerns about inference of type categories ===
>
> I agree that having type categories is a good idea.  I think this is
> one of the most important user-facing features in the proposal.
>
> I don't agree with inferring type categories though.  The rules are
> too complex to allow the inference to be done implicitly without
> hurting readability.  I think the users should be always required to
> annotate their Owners and Pointers, but not Aggregates and not Values.
> The implementation could still check the rules, if they are crucial.
>
> If the rules were simpler (e.g., "Pointer is any type that satisfies
> the iterator concept"), I would consider inference more acceptable,
> but still borderline.  Iterator is a complex concept by itself, there
> are many broken iterators out there that don't satisfy the concept,
> but the simpler usages in practice compile anyway.  However, just the
> Pointer rules way more complex than that -- they are like half a page
> long.
>
> I don't think we can simplify the rules though -- they need to cover
> the necessary C++ concepts.
>
> My primary concern with inference is not false positives, but
> understandability and debuggability of the system.  When I get a
> warning, how do I, as a human, determine if the types I use are being
> correctly classified?  As an API vendor, how can I be sure that my
> types are correctly classified so that users will get correct
> warnings?  The answer is "add an annotation".  So wouldn't every API
> provider want to add an annotation regardless then?
>
> My worry is that a fully automatic system where the user is not
> required to understand the rules, will not match users' expectations.
> If a provider of a type thinks that it is a Pointer, but it actually
> isn't, the failure mode is false negatives, not false positives. Users
> will just think the analysis is too weak and can't detect the problem.
> So the API providers will have to test that their types are indeed
> recognized by the compiler with the correct type category.  At that
> point they could as well add an annotation.
>
> This is basically the same reason why C++ added the `override`
> keyword, and C++ having backwards compatibility constraints made
> `override` optional.  The inference rules for `override` are much
> simpler than inference of type categories.  However, style guides
> often require writing `override` where possible.
>
> Some types don't conform to the Pointer category formally, even though
> in spirit they are pointers.  For example, if style guide used in the
> project prohibits operator overloading: https://godbolt.org/z/IYJwHw .
> However, users who "don't fully understand the rules" could reasonably
> think that the compiler should be able to figure out that this type is
> a Pointer.  If the annotation was required, the problem would be
> instantly discovered when the compiler would rightfully complain that
> the type does not conform to Pointer.
>
> I understand there are two primary reasons why inference is desired:
> reduction of annotation burden, and third-party libraries.
>
> To evaluate the annotation burden in practice, I think LLVM is not a
> great example.  LLVM has an above-average number of Owners and
> Pointers because it tries to implement standard library bits from
> future C++ versions.  Also, those Owners and Pointers are in "ADT" and
> "Support" libraries, not in regular application code.
>
> Third-party libraries can be annotated through the "API Notes"
> approach, that Swift's Objective-C interop uses.  API notes allow the
> user to pass an annotations file that injects the necessary attributes
> into the library headers.
>
> === Future work idea: "Optional" type category ===
>
> I also think it would be great to have more type categories, in
> particular "optional" -- which will include pointers (that can contain
> null), `std::optional` (that can contain nullopt), and whatever other
> types with an empty state people might have in their projects.  Then
> we could use the same dataflow-sensitive rules as the proposal uses
> for tracking null pointers to track checks of empty optionals etc.
>
> However, the rules for assuming nullability seem to be loose (return
> values and members are assumed non-null), I'm not very happy about
> that: https://godbolt.org/z/CMxEVv . I understand this was probably
> done to lower the annotation burden.  I don't think it makes for a
> very understandable model though.  I also couldn't find a single place
> in the proposal that summarized these assumptions, I had to piece them
> together from different sections.
>
> Dmitri
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190415/b4b5fd48/attachment.html>