[cfe-dev] Clang Static Analyzer: False Positive Suppression Support

Mon Aug 29 07:59:19 PDT 2016

Hi!

I will try to summarize the discussion, let me know if I forgot something.

So far the consensus is that, people are not convinced that false positive
suppression is useful for compiler warnings or clang tidy. For both the
warnings and clang tidy checks the false positives can usually be
suppressed by changing the code (often in a natural way). In case it is not
possible to do so, or the warning makes no sense, one should report a bug.
(And clang tidy already has a way to suppress results using comments,
however that is not general enough for the static analyzer.) However most
of us thinks that it is useful to have false positive suppression for the
clang static analyzer. The reason is that, there are cases, where code can
not be changed in a way that suppresses the warning. One can not use
asserts to assert that something is freed. And one can not use the
__clang_analyzer__ macro to suppress some of the warnings.

There are some false positive suppression code for the static analyzer
available privately and the might be open sourced.

There are two main approaches for suppression.

Hashes:
* Good for 3rd party code or when teams share the same codebase and have
different preferences regarding how to handle suppressions
* It is better to store it in json/yaml/textual format, so it can be
committed to source control systems
* The user do not need to think about how to suppress an issue

Comments or pragmas:
* For path sensitive checks either the uniqueing location must be
suppressed or multiple locations which is noisy. In the former case the
user might need to read the documentation to be able to properly suppress
an issue
* Using pragmas one can introduce convenience macros, it might also be
better to suppress code regions
* It is better to suppress based on bug type or something like that rather
than using the name of the checker (which is not intended to leak out)
* It is good idea to have line offsets to avoid too much noise when reading
code
* Anything against pragmas?

In both cases:
* It is crucial to think about suppression lifetime management, so make it
easy to review false positives later on (e.g.: a flag to report a note when
a suppression comment or hash is unused? also, it is crucial to be able to
add comments to suppressions, why is it a false positive, what is the
criterion to remove the suppression)
* We should not mix baselining with supression, we should think about them
as separate problems.
* Suppression inside clang increases the compatibility between 3rd party
tools
* In the long run clang might no longer provide HTML output, it might be
generated by a 3rd party tool (or another tool in the repository). It might
not be crucial to feed suppressions back to text output.

What about continuing the analysis after a sink node was suppressed?
* It is more likely to find more false positives on such paths.
* In some cases however (if the checker fails to understand the domain
specific logic, but the analyzer still has a good understanding of the
code), it might make sense. So maybe it might make sense to continue the
analysis, when a non-core check result was suppressed, and turn off that
check for the rest of the path?
* In the former case maybe the better alternative to make the check not to
generate a sink.

Some of us thinks that it might make sense to implement both hash and
pragma or comment based suppression.

Regards,
Gábor

On 25 August 2016 at 19:31, Devin Coughlin <dcoughlin at apple.com> wrote:

>
> On Aug 24, 2016, at 4:56 PM, p23 power <p23power at gmail.com> wrote:
>
> Hi Anna,
>
> The main in-source suppression mechanism that the clang static analyzer
>> supports is the ‘__clang_analyzer__’ macro. I am not sure if you tried
>> using it or not...  so I am curious what are the main limitations of it
>> that you are seeing.
>>
>
> The doc page you refer to says: "use this macro to selectively exclude
> code the analyzer examines" ... so wouldn't this potentially create false
> positives later in the code path? For example, by excluding code that
> initializes variables that is required later in the path, the analyzer may
> then warn about a garbage values.
>
> Additionally, our customers do not want to, and some times are not allow
> to, add such annotations into the code, especially when it's maintained by
> another team or company.
>
>
>
>>
>> The static analyzer already has support for suppression hashes in tree.
>> The CodeChecker tool (https://github.com/Ericsson/codechecker) is using
>> them to provide user workflows such as baselining and issue suppression.
>>
>
> Where I like this tool, it imposes on the developer that all warnings need
> to viewed through that interface and it also requires the company to setup
> the tool in the first place.  Many developers just want to run the analyzer
> locally and either view the stdout or html reports.
>
>
> I have found the text output to be not very helpful in understanding
> typical static analyzer reports — and in some cases actively harmful since
> an unhelpful report is more likely to be interpreted as a false positive. I
> view the text output as primarily a mechanism for testing the analyzer.
>
> I also think a good goal is to eventually get the static analyzer itself
> out of the business of generating HTML. It seems to me that decisions about
> the user workflow of how to present/filter/navigate issues are much better
> implemented outside of clang itself. I agree that it is important to be
> able to run the analyzer and view reports without setup — but I think we
> can achieve this without sticking the UI logic in the compiler.
>
> Lastly, you didn't comment on the idea of feeding a yaml/json file
> containing the issue hashes back into the analyzer for suppression the
> output.  What are your thoughts on this solution?
>
>
> I think the biggest benefit of using a yaml/json file is for issue
> baselining rather than suppression and I would support incorporating this
> functionality in clang for that purpose. It is important that the format be
> easily diffable and stored in version control — i.e. not sqlite database.
>
> In my view, the key different between baselining and suppression is that
> suppressions typically have lifecycle comments associated when them. These
> comments often include:
> - A justification for the suppression. This is typically an explanation of
> why the diagnostic is a false positive.
> - A criterion for when the suppression can be removed. This is often in
> the form of a bug filed against the static analyzer to fix the false
> positive or a bug filed against the codebase under analysis to follow
> proper coding conventions (e.g., for memory management and ownership).
>
> Suppression lifecycle comments are important so that when the code in
> question changes someone can evaluate whether the suppression is still
> needed and make sure it is not hiding a true positive in the changed
> codebase. These comments can either be represented in the code itself (in
> which case the external yaml/json suppressions file would presumably not be
> needed) or viewed and stored by some external tool like CodeChecker (in
> which case this suppression/filtering logic is probably better implemented
> there).
>
> Devin
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20160829/464d2a0b/attachment.html>