[cfe-dev] [analyzer][RFC] Attribute(s) to enhance/configure the analysis

Tue Oct 20 11:24:18 PDT 2020

Hi folks,

@Gabor thank you for opening the conversation, I think it is extremely important!

TBH, at the moment, I am more interested in the first item on the list, so it will be mostly about that item from me.

I want to start off with one common thing for all the items on the list.  It is actually something that we should always keep in mind as the top priority, - how easy and pleasant it is to use.  If we ask users to add something in their code, it should not look like a gimmick.  If it looks elegant and adds value to the code, the users will be happy to add those annotations.  In the perfect scenario, it makes code more clear even for users who don’t use the analyzer (one example here is Python and type annotations for Mypy).

So, what’s my take on suppressions (aka item #1).  I believe that Aaron’s suggestion is better for two reasons:
It is more clear in its intentions.  From the very first glance you see that this is a suppression.
We already have “suppress” attribute!  We just need to adopt it (and maybe add another possible spelling).

However, the hardest choice here is the string argument for the attribute.  Again, it would be perfect if we can avoid weird identifiers like “java.com.warnings.alpha.whatever.2020”, it would be good if that string will make it clear for people who don’t use the analyzer what’s going on here.
Checker names that we use in command line are a bit too verbose, error messages are prone to changes, bug categories are too broad and abstract (the intention is pretty vague if you suppress “Logic error”).  

My suggestion would be using so-called “Bug types” (we already show them in the summary table in index.html).  Each diagnostic has one, it’s unique and more-or-less descriptive.  If we revise all the existing bug types and try to make them concise and clear, I think it can work!

As for other items, summaries… Oof that's a tough one.  It looks like we need to parse raw strings anyway, right?  So, keeping in mind UX, maybe it should be something entirely different?  Something like a DSL specifically for summaries.  I don’t have one direct suggestion, but I believe that summaries of all things can be something that all developers can benefit from.

Here is another thought here though.  Summaries is not something magical, one summary can not and should not describe the entirety of a particular function.  Summaries usually model specific aspects interesting for one particular checker.  So one function, can have a good number of checker-specific summaries.  It can state that it frees the first argument, dereferences the second argument when the third argument is not equal to 42, and returns tainted data.  Expressing everything in one annotation is way too much.  Having a bunch of summaries directly in the code can be fine, but can be terrifying (so many lines before the actual function).  One way to deal with it is APINotes, but that is a completely different topic.

Honestly, I think the third item is essentially the same thing as summaries, but maybe I don’t see some sort of peculiarity here.

Phew, I got a bit wordy!

Cheers!

Valeriy

> On 20 Oct 2020, at 18:38, Gábor Márton <martongabesz at gmail.com> wrote:
> 
> My example is a bit flawed, so it could look like this:
> 
> 1) 
> [[clang::csa("supress.somecheck.somefunctionality")]] void foo();
> 
> 2)
> [[clang::csa("summary.BufferSize.Buffer(0).BufSize(1).BufSizeMultiplier(2)")]]
> std::size_t fread( void* buffer, std::size_t size, std::size_t count, std::FILE* stream );
> 
> 3) 
> namespace myNamespace {
> [[clang::csa("taint.sink")]]
> void mySink(int x);
> } // myNamespace
> 
> On Tue, Oct 20, 2020 at 5:34 PM Gábor Márton <martongabesz at gmail.com <mailto:martongabesz at gmail.com>> wrote:
> Hi,
> 
> There is an evolving need to configure the Clang Static Analyzer within the analyzed source code itself. We'd like to
> 1) suppress specific checkers (we already have an ongoing discussion at D89638 <https://reviews.llvm.org/D89638>)
> 2) express summaries (mainly argument constraints)
> 3) express taint propagation rules for functions (or for global variables like std::cin)
> 
> What if we had one attribute for CSA with a StringArgument?
> (Actually, we already have that with the `annotate` attribute.)
> 
> So we'd have something like this:
> 1) [[clang::csa("supress.somecheck.somefunctionality")]]
> 2) [[clang::csa("summary.std::fread.BufferSize.Buffer(0).BufSize(1).BufSizeMultiplier(2)")]]
> 3) [[clang::csa("taint.sink.myNamespace::mySink")]]
> 
> Disadvantages: we must process strings whenever a node has the 'csa' attr attached, we have to come up with a "DSL".
> Advantages: total flexibility.
> 
> I'd like to explore the possible approaches that we could have. For example, Aaron suggested alternatively for the suppression:
> [[clang::suppress("analyzer.somecheck.somefunctionality")]]
> [[clang::suppress("compiler.warning.12345")]]
> [[clang::suppress("tidy.check-name.whatever")]]
> 
> Thanks,
> Gabor

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20201020/56b41d3f/attachment.html>