<div dir="ltr"><div>Thank you for summarizing the discussion and raising all these points. <br></div><div>Based on your response I propose the following plan moving forward:<br><br></div><div>* Start submitting patches regarding the non-controversial parts of the analysis, including:<br></div><div>  - Adding the annotations to Clang<br></div><div>  - Generalize existing statement-local warnings<br></div><div>    - Technical question here: should we expect the STL vendors to annotate the types, should we hard-code the annotations into the compiler, or should we actually implement type inference but restrict it to STL types?<br></div><div>  - Adding some extra statement-local warnings<br></div><div>  - Adding the flow sensitive analysis<br><br></div><div>Can we add you as a reviewer to those patches?<br><br></div><div>While we add the non-controversial part we can continue to discuss the rest on the mailing list and act according to the consensus we have later on.<br></div><div>Fortunately, not having the inference upstream will not stop us from leveraging the rest of the work. <br><br></div><div>To start the discussions regarding the type category inference, as far as I understand you are more worried about the false negatives that can give a bad impression about the analysis rather than the false positives. One idea we had is to have a refactoring that will automatically annotate the user defined types based on the inference we have. This could give the users a way to understand the root causes of such false negatives.<br></div><div>In case the community will not want automatic type category inference in the compiler we are still likely to implement it as a Tidy check with a refactoring to make it easier for users to adopt this analysis.<br><br></div><div>What do you think?<br><br></div><div>Cheers,<br></div><div>Gabor<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, 12 Apr 2019 at 16:04, Dmitri Gribenko <<a href="mailto:gribozavr@gmail.com">gribozavr@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Gabor and I chatted privately, and I wanted to summarize the discussion.<br>

<br>

=== Concerns about inference of type categories ===<br>

<br>

I agree that having type categories is a good idea.  I think this is<br>

one of the most important user-facing features in the proposal.<br>

<br>

I don't agree with inferring type categories though.  The rules are<br>

too complex to allow the inference to be done implicitly without<br>

hurting readability.  I think the users should be always required to<br>

annotate their Owners and Pointers, but not Aggregates and not Values.<br>

The implementation could still check the rules, if they are crucial.<br>

<br>

If the rules were simpler (e.g., "Pointer is any type that satisfies<br>

the iterator concept"), I would consider inference more acceptable,<br>

but still borderline.  Iterator is a complex concept by itself, there<br>

are many broken iterators out there that don't satisfy the concept,<br>

but the simpler usages in practice compile anyway.  However, just the<br>

Pointer rules way more complex than that -- they are like half a page<br>

long.<br>

<br>

I don't think we can simplify the rules though -- they need to cover<br>

the necessary C++ concepts.<br>

<br>

My primary concern with inference is not false positives, but<br>

understandability and debuggability of the system.  When I get a<br>

warning, how do I, as a human, determine if the types I use are being<br>

correctly classified?  As an API vendor, how can I be sure that my<br>

types are correctly classified so that users will get correct<br>

warnings?  The answer is "add an annotation".  So wouldn't every API<br>

provider want to add an annotation regardless then?<br>

<br>

My worry is that a fully automatic system where the user is not<br>

required to understand the rules, will not match users' expectations.<br>

If a provider of a type thinks that it is a Pointer, but it actually<br>

isn't, the failure mode is false negatives, not false positives. Users<br>

will just think the analysis is too weak and can't detect the problem.<br>

So the API providers will have to test that their types are indeed<br>

recognized by the compiler with the correct type category.  At that<br>

point they could as well add an annotation.<br>

<br>

This is basically the same reason why C++ added the `override`<br>

keyword, and C++ having backwards compatibility constraints made<br>

`override` optional.  The inference rules for `override` are much<br>

simpler than inference of type categories.  However, style guides<br>

often require writing `override` where possible.<br>

<br>

Some types don't conform to the Pointer category formally, even though<br>

in spirit they are pointers.  For example, if style guide used in the<br>

project prohibits operator overloading: <a href="https://godbolt.org/z/IYJwHw" rel="noreferrer" target="_blank">https://godbolt.org/z/IYJwHw</a> .<br>

However, users who "don't fully understand the rules" could reasonably<br>

think that the compiler should be able to figure out that this type is<br>

a Pointer.  If the annotation was required, the problem would be<br>

instantly discovered when the compiler would rightfully complain that<br>

the type does not conform to Pointer.<br>

<br>

I understand there are two primary reasons why inference is desired:<br>

reduction of annotation burden, and third-party libraries.<br>

<br>

To evaluate the annotation burden in practice, I think LLVM is not a<br>

great example.  LLVM has an above-average number of Owners and<br>

Pointers because it tries to implement standard library bits from<br>

future C++ versions.  Also, those Owners and Pointers are in "ADT" and<br>

"Support" libraries, not in regular application code.<br>

<br>

Third-party libraries can be annotated through the "API Notes"<br>

approach, that Swift's Objective-C interop uses.  API notes allow the<br>

user to pass an annotations file that injects the necessary attributes<br>

into the library headers.<br>

<br>

=== Future work idea: "Optional" type category ===<br>

<br>

I also think it would be great to have more type categories, in<br>

particular "optional" -- which will include pointers (that can contain<br>

null), `std::optional` (that can contain nullopt), and whatever other<br>

types with an empty state people might have in their projects.  Then<br>

we could use the same dataflow-sensitive rules as the proposal uses<br>

for tracking null pointers to track checks of empty optionals etc.<br>

<br>

However, the rules for assuming nullability seem to be loose (return<br>

values and members are assumed non-null), I'm not very happy about<br>

that: <a href="https://godbolt.org/z/CMxEVv" rel="noreferrer" target="_blank">https://godbolt.org/z/CMxEVv</a> . I understand this was probably<br>

done to lower the annotation burden.  I don't think it makes for a<br>

very understandable model though.  I also couldn't find a single place<br>

in the proposal that summarized these assumptions, I had to piece them<br>

together from different sections.<br>

<br>

Dmitri<br>

</blockquote></div></div>