[cfe-dev] Clang Static Analyzer: False Positive Suppression Support

Mon Aug 29 09:49:34 PDT 2016

> On Aug 29, 2016, at 7:59 AM, Gábor Horváth <xazax.hun at gmail.com> wrote:
> 
> Hi!
> 
> I will try to summarize the discussion,
Thanks!

> let me know if I forgot something.
> 
> So far the consensus is that, people are not convinced that false positive suppression is useful for compiler warnings or clang tidy. For both the warnings and clang tidy checks the false positives can usually be suppressed by changing the code (often in a natural way). In case it is not possible to do so, or the warning makes no sense, one should report a bug.

Compiler warnings can already be suppressed with pragmas.

> (And clang tidy already has a way to suppress results using comments, however that is not general enough for the static analyzer.) However most of us thinks that it is useful to have false positive suppression for the clang static analyzer. The reason is that, there are cases, where code can not be changed in a way that suppresses the warning. One can not use asserts to assert that something is freed. And one can not use the __clang_analyzer__ macro to suppress some of the warnings. 
> 
> There are some false positive suppression code for the static analyzer available privately and the might be open sourced.
> 
> There are two main approaches for suppression.
> 
> Hashes:
> * Good for 3rd party code or when teams share the same codebase and have different preferences regarding how to handle suppressions
> * It is better to store it in json/yaml/textual format, so it can be committed to source control systems
> * The user do not need to think about how to suppress an issue
> 
> Comments or pragmas:
> * For path sensitive checks either the uniqueing location must be suppressed or multiple locations which is noisy. In the former case the user might need to read the documentation to be able to properly suppress an issue

I believe using the uniquing location can be done automatically in the analyzer as it can be inferred from the regular location.

> * Using pragmas one can introduce convenience macros, it might also be better to suppress code regions
> * It is better to suppress based on bug type or something like that rather than using the name of the checker (which is not intended to leak out)
> * It is good idea to have line offsets to avoid too much noise when reading code
> * Anything against pragmas?

Another reason pragmas are better than comments is because this is what the compiler uses for suppressions.

> 
> In both cases:
> * It is crucial to think about suppression lifetime management, so make it easy to review false positives later on (e.g.: a flag to report a note when a suppression comment or hash is unused? also, it is crucial to be able to add comments to suppressions, why is it a false positive, what is the criterion to remove the suppression)
> * We should not mix baselining with supression, we should think about them as separate problems.
> * Suppression inside clang increases the compatibility between 3rd party tools
> * In the long run clang might no longer provide HTML output, it might be generated by a 3rd party tool (or another tool in the repository). It might not be crucial to feed suppressions back to text output.
> 
> What about continuing the analysis after a sink node was suppressed?
> * It is more likely to find more false positives on such paths.
> * In some cases however (if the checker fails to understand the domain specific logic, but the analyzer still has a good understanding of the code), it might make sense. So maybe it might make sense to continue the analysis, when a non-core check result was suppressed, and turn off that check for the rest of the path?
> * In the former case maybe the better alternative to make the check not to generate a sink.
> 
> Some of us thinks that it might make sense to implement both hash and pragma or comment based suppression.
> 
> Regards,
> Gábor
> 
> On 25 August 2016 at 19:31, Devin Coughlin <dcoughlin at apple.com <mailto:dcoughlin at apple.com>> wrote:
> 
>> On Aug 24, 2016, at 4:56 PM, p23 power <p23power at gmail.com <mailto:p23power at gmail.com>> wrote:
>> 
>> Hi Anna,
>> 
>> The main in-source suppression mechanism that the clang static analyzer supports is the ‘__clang_analyzer__’ macro. I am not sure if you tried using it or not...  so I am curious what are the main limitations of it that you are seeing.
>> 
>> The doc page you refer to says: "use this macro to selectively exclude code the analyzer examines" ... so wouldn't this potentially create false positives later in the code path? For example, by excluding code that initializes variables that is required later in the path, the analyzer may then warn about a garbage values.
>> 
>> Additionally, our customers do not want to, and some times are not allow to, add such annotations into the code, especially when it's maintained by another team or company.
>> 
>>  
>>  
>> The static analyzer already has support for suppression hashes in tree. The CodeChecker tool (https://github.com/Ericsson/codechecker <https://github.com/Ericsson/codechecker>) is using them to provide user workflows such as baselining and issue suppression. 
>> 
>> Where I like this tool, it imposes on the developer that all warnings need to viewed through that interface and it also requires the company to setup the tool in the first place.  Many developers just want to run the analyzer locally and either view the stdout or html reports.  
> 
> I have found the text output to be not very helpful in understanding typical static analyzer reports — and in some cases actively harmful since an unhelpful report is more likely to be interpreted as a false positive. I view the text output as primarily a mechanism for testing the analyzer.
> 
> I also think a good goal is to eventually get the static analyzer itself out of the business of generating HTML. It seems to me that decisions about the user workflow of how to present/filter/navigate issues are much better implemented outside of clang itself. I agree that it is important to be able to run the analyzer and view reports without setup — but I think we can achieve this without sticking the UI logic in the compiler.
> 
>> Lastly, you didn't comment on the idea of feeding a yaml/json file containing the issue hashes back into the analyzer for suppression the output.  What are your thoughts on this solution?
> 
> I think the biggest benefit of using a yaml/json file is for issue baselining rather than suppression and I would support incorporating this functionality in clang for that purpose. It is important that the format be easily diffable and stored in version control — i.e. not sqlite database.
> 
> In my view, the key different between baselining and suppression is that suppressions typically have lifecycle comments associated when them. These comments often include:
> - A justification for the suppression. This is typically an explanation of why the diagnostic is a false positive.
> - A criterion for when the suppression can be removed. This is often in the form of a bug filed against the static analyzer to fix the false positive or a bug filed against the codebase under analysis to follow proper coding conventions (e.g., for memory management and ownership).
> 
> Suppression lifecycle comments are important so that when the code in question changes someone can evaluate whether the suppression is still needed and make sure it is not hiding a true positive in the changed codebase. These comments can either be represented in the code itself (in which case the external yaml/json suppressions file would presumably not be needed) or viewed and stored by some external tool like CodeChecker (in which case this suppression/filtering logic is probably better implemented there).
> 
> Devin
> 
> 
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20160829/ff30e518/attachment.html>