[cfe-dev] [analyzer][RFC] Attribute(s) to enhance/configure the analysis

Tue Oct 20 13:12:28 PDT 2020

Hi!

Sorry for the duplicate emails, I just realized that there is one more
dimension of suppression mechanisms. There are separate tools such as
CodeChecker that already support suppressing individual reports (either
using comments or bug hashes). So adding an annotation could make it
unclear who is responsible for the suppression and which feature should the
user use.

Cheers,
Gabor

On Tue, 20 Oct 2020 at 22:06, Gábor Horváth <xazax.hun at gmail.com> wrote:

> Hi!
>
> Speaking of suppressions, in an ideal world the user would never need
> that. I.e. when the analyzer misunderstands the code, the user would be
> able to add an assert or rewrite the code slightly and the result would be
> easier to understand both for humans and the analyzer. Unfortunately, this
> is not the case. We do have false positives when there is no clear way of
> rewriting the code or adding asserts to make the warning disappear. While
> it is useful to have an annotation for those cases, there are also some
> risks involved. Such an annotation can bitrot and redundant annotations can
> linger even after the code is changed or the analyzer is improved. Those
> annotations can suppress true positive results. Moreover, users might start
> to rely on those annotations instead of trying to make the code clearer
> first, amplifying the effects I mentioned earlier.
>
> I think an ideal suppress annotation would:
> * be applicable to a wide range of scopes, not just function scope
> * have clear documentation indicating this is the last resort and refer to
> some guidelines how to suppress false positives by improving the code
> * come with an easy mechanism to check whether the annotation makes any
> difference, so we can easily get rid of redundant ones
> * come with an easy mechanism to check how many issues are suppressed by
> the same annotation (or even expect the user to specify this number as an
> argument and warn if that does not match the reality?)
>
> We could also warn for checks when we do believe rewriting the code in a
> cleaner way should always be possible. I think overall those problems that
> would require the use of suppression annotations should be treated as a
> high priority.
>
> Cheers,
> Gabor
>
> On Tue, 20 Oct 2020 at 20:24, Valeriy Savchenko via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> Hi folks,
>>
>> @Gabor thank you for opening the conversation, I think it is extremely
>> important!
>>
>> TBH, at the moment, I am more interested in the first item on the list,
>> so it will be mostly about that item from me.
>>
>> I want to start off with one common thing for all the items on the list.
>> It is actually something that we should always keep in mind as the top
>> priority, - how easy and pleasant it is to use.  If we ask users to add
>> something in their code, it should not look like a gimmick.  If it looks
>> elegant and adds value to the code, the users will be happy to add those
>> annotations.  In the perfect scenario, it makes code more clear even for
>> users who don’t use the analyzer (one example here is Python and type
>> annotations for Mypy).
>>
>> So, what’s my take on suppressions (aka item #1).  I believe that Aaron’s
>> suggestion is better for two reasons:
>>
>>    - It is more clear in its intentions.  From the very first glance you
>>    see that this is a suppression.
>>    - We already have “suppress” attribute!  We just need to adopt it
>>    (and maybe add another possible spelling).
>>
>>
>> However, the hardest choice here is the string argument for the
>> attribute.  Again, it would be perfect if we can avoid weird identifiers
>> like “java.com.warnings.alpha.whatever.2020”, it would be good if that
>> string will make it clear for people who don’t use the analyzer what’s
>> going on here.
>> Checker names that we use in command line are a bit too verbose, error
>> messages are prone to changes, bug categories are too broad and abstract
>> (the intention is pretty vague if you suppress “Logic error”).
>>
>> My suggestion would be using so-called “Bug types” (we already show them
>> in the summary table in index.html).  Each diagnostic has one, it’s unique
>> and more-or-less descriptive.  If we revise all the existing bug types and
>> try to make them concise and clear, I think it can work!
>>
>> As for other items, summaries… Oof that's a tough one.  It looks like we
>> need to parse raw strings anyway, right?  So, keeping in mind UX, maybe it
>> should be something entirely different?  Something like a DSL specifically
>> for summaries.  I don’t have one direct suggestion, but I believe that
>> summaries of all things can be something that all developers can benefit
>> from.
>>
>> Here is another thought here though.  Summaries is not something magical,
>> one summary can not and should not describe the entirety of a particular
>> function.  Summaries usually model specific aspects interesting for one
>> particular checker.  So one function, can have a good number of
>> checker-specific summaries.  It can state that it frees the first argument,
>> dereferences the second argument when the third argument is not equal to
>> 42, and returns tainted data.  Expressing everything in one annotation is
>> way too much.  Having a bunch of summaries directly in the code can be
>> fine, but can be terrifying (so many lines before the actual function).
>> One way to deal with it is APINotes, but that is a completely different
>> topic.
>>
>> Honestly, I think the third item is essentially the same thing as
>> summaries, but maybe I don’t see some sort of peculiarity here.
>>
>> Phew, I got a bit wordy!
>>
>> Cheers!
>>
>> Valeriy
>>
>> On 20 Oct 2020, at 18:38, Gábor Márton <martongabesz at gmail.com> wrote:
>>
>> My example is a bit flawed, so it could look like this:
>>
>> 1)
>> [[clang::csa("supress.somecheck.somefunctionality")]] void foo();
>>
>> 2)
>>
>> [[clang::csa("summary.BufferSize.Buffer(0).BufSize(1).BufSizeMultiplier(2)")]]
>> std::size_t fread( void* buffer, std::size_t size, std::size_t count,
>> std::FILE* stream );
>>
>> 3)
>> namespace myNamespace {
>> [[clang::csa("taint.sink")]]
>> void mySink(int x);
>> } // myNamespace
>>
>> On Tue, Oct 20, 2020 at 5:34 PM Gábor Márton <martongabesz at gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> There is an evolving need to configure the Clang Static Analyzer within
>>> the analyzed source code itself. We'd like to
>>> 1) suppress specific checkers (we already have an ongoing discussion at
>>> D89638 <https://reviews.llvm.org/D89638>)
>>> 2) express summaries (mainly argument constraints)
>>> 3) express taint propagation rules for functions (or for global
>>> variables like std::cin)
>>>
>>> What if we had one attribute for CSA with a StringArgument?
>>> (Actually, we already have that with the `annotate` attribute.)
>>>
>>> So we'd have something like this:
>>> 1) [[clang::csa("supress.somecheck.somefunctionality")]]
>>> 2)
>>> [[clang::csa("summary.std::fread.BufferSize.Buffer(0).BufSize(1).BufSizeMultiplier(2)")]]
>>> 3) [[clang::csa("taint.sink.myNamespace::mySink")]]
>>>
>>> Disadvantages: we must process strings whenever a node has the 'csa'
>>> attr attached, we have to come up with a "DSL".
>>> Advantages: total flexibility.
>>>
>>> I'd like to explore the possible approaches that we could have. For
>>> example, Aaron suggested alternatively for the suppression:
>>> [[clang::suppress("analyzer.somecheck.somefunctionality")]]
>>> [[clang::suppress("compiler.warning.12345")]]
>>> [[clang::suppress("tidy.check-name.whatever")]]
>>>
>>> Thanks,
>>> Gabor
>>>
>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20201020/46a6d87a/attachment.html>