[PATCH] D123352: [analyzer] Add FixItHint to `nullability.NullReturnedFromNonnull` and `nullability.NullableReturnedFromNonnull`

Tue Apr 26 13:22:40 PDT 2022

MosheBerman added a comment.

In D123352#3474033 <https://reviews.llvm.org/D123352#3474033>, @whisperity wrote:

> Regarding FixIts... FixIts are implemented in the "Diagnostic" library, which is non-specific to neither Clang-Tidy nor Sema whatsoever, they use the same infrastructure under the hood. Why the apply logic in CSA might do the same FixIt multiple times is beyond me, but I know that both `clang-apply-replacements` and `clang-tidy` go to length to ensure that in case multiple checkers report to the same location with potentially conflicting FixIts, then none gets applied, because applying all of them would result in ridiculously broken source code. They internally become an object in the `clang::tooling` namespace which is implemented as a core Clang library. The relevant entrypoint to this logic, at least in Clang-Tidy, should be this one: http://github.com/llvm/llvm-project/blob/8f9dd5e608c0ac201ab682ccc89ac3be2dfd0d29/clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.cpp#L115-L134
>
> FixIts are very rudimentary under the hood. Essentially, they allow you to, in a file at a given position, delete `N` characters and insert `M` characters. Removal fixits only delete, insertion fixits only insert (duh), and just like in version control, a //modification// is a "removal" and an "insertion" grouped into one action...
>
> If FixIts are so broken in CSA's infrastructure as @steakhal observed then it might very well be a good thing to hold off on them and fix the infrastructure first. Or at least remove the ability to auto-apply them. You don't **have to** apply the fixes, but you can always get the neat little green-coloured notes as to what could the user should insert.

The default behavior _does not change_ with this diff. These FixIts are gated by the flags`nullability.NilReturnedToNonnull:ShowFixIts` and `nullability.NilReturnedToNonnull:ShowFixIts`. Even then `-apply-fixits` is required to actually apply them.

> In D123352#3463063 <https://reviews.llvm.org/D123352#3463063>, @MosheBerman wrote:
>
>> There's a DeadStore checker that does have some fixit code, which is where I found the `FixItHint` class. I did notice it isn't used in too many other places.
>
>
>
> In D123352#3439390 <https://reviews.llvm.org/D123352#3439390>, @steakhal wrote:
>
>> For fixits, we should be confident that some property holds, but the static analyzer might conclude erroneously that a given pointer is null resulting in a **bad** fixit.

In this case, the consequence of a bad fixit is pretty harmless from the perspective of a developer compiling their code. These annotations are not detemining if a pointer is `nullptr`, but if it can be. The code will still compile, and callsites might have an extra check. (Swift or Obj-C/Cxx) would have an extra check, but it would always pass. The NullabilityChecker has lots of conservative logic to prevent false positives. If we trust the existing checker, we are already pretty low-risk to begin with.

> Dead stores and dead code are, for the most part, rather easy to reason about and not make dangerous fixits by their removal. (I'd go far enough to say the DeadStore checker should be a run-of-the-mill warning in Sema, and not some extra fluff only available in CSA...)
>
> In D123352#3439649 <https://reviews.llvm.org/D123352#3439649>, @MosheBerman wrote:
>
>> Where can I learn more about this?  Would it be possible and idiomatically/architecturally sounds to write a clang-tidy that processes output from this checker?
>
> Very unlikely. While Clang-Tidy can execute CSA checkers (because CSA is a library within Clang, but AFAIK calling CSA from Tidy doesn't handle things like CTU), Clang-Tidy does not contain **any** notion of dependency between checks and they are meant to be independent. Moreover, you would need to enforce a sequential execution by the user //across// binaries (not to mention the fact that the entire project needs to be parsed twice, which might not be a trivial cost!), which seems error-prone...

Ok, thanks!

> -----
>
> In general, why you are trying to achieve this with the static analyser? It looks to me as if what you want is to mechanically add an annotation to some types **_if they are in between specific macros / annotations._** More specifically, the table that is in your original post doesn't seem to involve path-sensitive information. So could this be a purely (platform-specific?) Tidy check? Your test code also doesn't contain anything that seems to depend on path-sensitive information.

We want is to mechanically add a **_nullable_** annotation to some types **_when they are incorrectly annotated as nonnull_**. A nonnull annotation is "incorrect" when a return value is a `nil` literal, or the result of a `nullable` expression. (Macros are only one way to specify a return type as `nonnull`.) The Nullability Checker emits warnings for exactly this, but doesn't go as far as displaying or applying fixits, and that's what this diff is about.

> 1. Where is this `NS_` annotation coming from? Is this a thing of a specific platform, or specific to your company's products?

You can disregard the `NS_` comment. It refers to Apple's NS_ prefixed Objective-C collection classes which the checker appears to special-case no-op on. I was thinking if we could leverage it for those cases, but this is completely auxiliary to what I'm trying to do here, though. I'll delete the comment about that.

> 2. If you are sure you will not depend on any path-sensitive symbolic execution information (which you currently appear you do not), and the answer to `1.` is meaningful enough to be useful for a wider community (that is, not just you or your company?), then I would suggest rewriting the whole thing within Clang-Tidy. Clang-Tidy works off of AST matchers directly.

I've looked at clang-tidy before, and found that matching on return types gets complex once you account for all of the various cases like:  
a) implicit cast nodes generated by arc vs no different AST trees when arc is disabled 
b) returns of calls to other functions or methods.

While working on this, I referenced NullabilityChecker for a better way to reliably detect these cases and realized I was looking right at what I wanted.

> is meaningful enough to be useful for a wider community (that is, not just you or your company

This change will make it easier for everyone to annotate codebases, although has greater impact for larger codebases. Annotating a smaller codebase manually might manageable for a single engineer through the following process:

1. Check the implementations for `nil`/`nullable` returns.
2. Write annotations for nullable return types.
3. We can optionally add NS_ASSUME_NONNULL macros to cover the rest.

This is tedious, and with the CSA, we can do better. Today, the CSA already does step 1. With our changes here, we get:

1. CSA checks the implementations for `nil`/`nullable` returns.
2. CSA optionally writes annotations for nullable return types.
3. We can optionally add `NS_ASSUME_NONNULL` macros to cover the rest.

> The nullability annotations are attributes, which are also AST nodes, so you can handle them similarly to matching types or Decls.

This raises another issue, which is that inferred nullability `Attr`ibutes do not carry any information (from what I can tell) that tells us if they are explicitly spelled out or are macro expansions.

They are added in SemaTypes and don't currently track that they've been generated. I think that we can use IsImplicit, but I'm not sure if that is in the spirit of the original diff where it was introduced.

I'm eager to hear your thoughts, and happy to answer any more questions.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123352/new/

https://reviews.llvm.org/D123352