[cfe-dev] [analyzer] Alpha checker statuses.

Kristóf Umann via cfe-dev cfe-dev at lists.llvm.org
Wed May 22 10:57:25 PDT 2019


I implemented what we discussed here in the following manner: Each checker
or checker option must either be in:

* Alpha: Unfinished, users are strongly discouraged from using it.
* Beta: Stable but might produce more false positives and the emitted
reports might lack proper explanation
* Released: Production ready, very few false positives with easy to
understand bug reports.

Any checker or checker option in addition can be marked as hidden, meaning
that it's a developer feature.

On Thu, 16 May 2019 at 01:24, Artem Dergachev <noqnoqneo at gmail.com> wrote:

> Just wanted to also point out that help is very welcome with implementing
> the missing functionality in the checkers that we've been discussing here.
>
> (more replies inline)
>
> On 5/15/19 5:26 AM, Kristóf Umann wrote:
>
> How about this:
>
> -analyzer-checker-help: Displays production ready (non-modeling)
> checkers, and beta checkers with a disclaimer for each description. Don't
> forget to patch clang-tidy!
> -analyzer-checker-help-developer: Displays only developer (modeling,
> debug) checkers (renamed from *-hidden).
> -analyzer-checker-help-alpha: Displays only alpha (non-modeling) checkers
> with very scary disclaimers both around the list and for each checker
> description.
>
> Note how each of these is mutually exclusive. I think we shouldn't make a
> flag that displays all checkers, but should allow the following invocation
> to do so:
>
> clang -cc1 -analyzer-checker-help -analyzer-checker-help-alpha
> -analyzer-checker-help-hidden
>
>
> This makes sense to me and i think this is how it should have always been
> done.
>
>
Note that hidden checkers may only show up in
-analyzer-checker-help-developer. Options for implicit checkers are NOT
implicitly hidden, but options for alpha checkers are implicitly marked
alpha.

> On Wed, 15 May 2019 at 12:39, Kristóf Umann <dkszelethus at gmail.com> wrote:
>
>> + György Orbán
>>
>> On Wed, 15 May 2019 at 00:41, Artem Dergachev <noqnoqneo at gmail.com>
>> wrote:
>>
>>> I wanted to give more visibility to the discussion on the status of
>>> different Static Analyzer "alpha" (unfinished) checkers that's going on
>>> on Phabricator (https://reviews.llvm.org/D57858). Story so far: We're
>>> trying to figure out how many of them can be finished with relatively
>>> little effort (and therefore we should probably ask around to gather
>>> feedback) and how many are clearly not ready for any sort of use and
>>> should be hidden until the most glaring bugs are fixed. For now we
>>> officially treat all alpha checkers as the latter - so don't use them!
>>>
>>> Just to add to the summary: despite our "disapproval" of using alpha
>> checkers, we always made them very visible, since if you wanted to list the
>> available checkers, you inevitable came across them (and they are also
>> lexicographically greater than the rest of them).
>>
>> Due to the lack of branches in our SVN repository, we used the alpha
>> package to make development incremental, which inevitable resulted in some
>> checkers in there being unfinished and unstable, while others merely need
>> some finishing touches, and could be used despite being rough around the
>> patches.
>>
> rough around the edges*
>
>>
>> The discussion came up in a patch that plans to expose checker options,
>> that always existed but were never listable, unless you read the source
>> code. However, just like alpha checkers, many of these also hide features
>> under development, while other would genuinely be useful to fine tune the
>> analyzer for a specific project.
>>
>> This discussion is important because different people's codebases are
>>> ridiculously different, so it's almost impossible to estimate the
>>> quality and usefulness of static analysis unless as many varied
>>> codebases as possible are involved.
>>>
>>>  >>! In D57858#1500668, @Szelethus wrote:
>>>  > `IteratorChecker` is a prime example that still suffers from not
>>> ideal FP/TP ratio, but users at Ericsson value it plenty enough not to
>>> be bothered by it. Many of the changes here and in private directly
>>> address user bug reports (That's just one, but I do remember having
>>> others around too!).
>>>
>>> Once it has visitor path notes about where did it get its iterators
>>> from, some of the iterator checks should definitely be considered for
>>> being turned on by default. Especially the mismatched iterator check
>>> that doesn't rely on hardcore constraint solving. The current upstream
>>> version is not in good shape though; i just tried it on LLVM and it
>>> currently crashes all over the place with "Symbol comparison must be a
>>> `SymIntExpr`" assertion (pls ask me for repros if they aren't obvious).
>>> Also it has false mismatched iterator positives on `A.insert(B.begin(),
>>> B.end())`.
>>>
>>>
>>>  >> In https://reviews.llvm.org/D57858#1501065, @dkrupp wrote:
>>>  > These are the alpha checkers that we are testing in Ericsson:
>>>
>>> Let me undig my last year's attempt to take a look at alpha checkers.
>>> The most common "limb" to "miss" in the alpha checkers is the "bug
>>> visitor" functionality that'd add checker-specific path notes to the
>>> report, which is almost inevitably necessary for any path-sensitive
>>> checker. Bug reports without path notes are hard to understand, but
>>> that's one thing that your users won't tell you: they often just don't
>>> have their good taste to realize that bug reports shouldn't be so hard
>>> to understand. The users often take it for granted that they have to
>>> figure out themselves where do these values come from, but it's still
>>> our job to not force them to.
>>>
>>>
>> The fundamental problem here is, in my opinion, that "alpha" doesn't
>> describe many of these checkers too well. I think once a checker is stable
>> and has a "reasonable" true positive/false positive ratio, we should move
>> them out of alpha status: not only would we be able to gather invaluable
>> feedback for these, but users might appreciate the feature even with a
>> couple shortcomings. However, just because they are not falling apart on
>> their own, these aren't always production ready -- how about we introduce a
>> "beta" package?
>>
>> Alpha checkers would be incomplete, incorrect and unstable by definition,
>> and would be hidden from non-developers. Beta checkers would receive a
>> disclaimer that they might emit too much false positives to be production
>> ready and don't emit ideal bug reports in terms of readability, but are
>> considered stable and usable.
>>
>>
>>>
>>>  >  alpha.core.BoolAssignment
>>>
>>> Yes, i agree that this one's pretty useful. It's currently missing a
>>> visitor that explains why does the analyzer think that the value is
>>> non-true/false, which is often necessary to understand the warning.
>>>
>>> This would be an ideal candidate to be moved to a beta package.
>>
>
> I'll be comfy to move it straight to core as long as someone stuffs a
> `bugreporter::trackExpressionValue()` into it and we both test it and see
> no obvious FP patterns specific to this checker. Unless we do this, i
> probably won't be comfy putting it into beta :)
>
>
>>>  >  alpha.core.CastSize
>>>
>>> This one's indeed relatively quiet, but i'm seeing ~50 false positives
>>> caused by stuffing metadata at the beginning of a dynamically allocated
>>> buffer. I.e., allocate a buffer that's 4 bytes larger than necessary,
>>> use these 4 bytes for our own bookkeeping and provide a pointer to the
>>> rest of the buffer to be used for the actual value. I don't see an easy
>>> way to fix these false positives, so i don't see how to move this out of
>>> alpha.
>>>
>>>
>>>  >  alpha.core.Conversion
>>>
>>> Interestingly, i haven't seen this one trigger on our codebases. So i
>>> don't have an opinion here. There's a chance it might be a good opt-in
>>> check. Do you have an open-source project in mind on which this check is
>>> actually useful?
>>>
>>>
>>>  >  alpha.core.DynamicTypeChecker
>>>
>>> I really root for enabling this checker (and generally improving our
>>> dynamic type-checking), but for now i'm seeing ~600 false positives in
>>> projects that use custom RTTI (something like `dyn_cast`) and ~300 more
>>> Objective-C-specific false positives. I can take a look and try to
>>> reduce some of the custom RTTI ones if you're interested in figuring out
>>> how to fix them; i don't remember if they are easy to fix.
>>>
>>>
>>>  >  alpha.core.SizeofPtr
>>>
>>> This checker does indeed find interesting bugs sometimes, but i'm
>>> overwhelmed by ~300 false positives in which the sizeof of a pointer is
>>> taken intentionally. This is especially annoying when the pointer is
>>> hidden behind a typedef and the user doesn't need to know whether it's a
>>> pointer or not.
>>>
>>>
>>>  >  alpha.core.TestAfterDivZero
>>>
>>> I don't see any positives of this checker, but this checker is crazy and
>>> shouldn't have been done this way. It's a "must-problem" check and we
>>> don't have any sort of infrastructure to even display this kind of bug
>>> report properly after we find it, let alone to properly find it. We need
>>> a more-or-less full-featured data flow analysis engine before we make an
>>> attempt on such checker.
>>>
>>>
>>>  >  alpha.cplusplus.DeleteWithNonVirtualDtor
>>>
>>> I don't see any positives of this checker. Is it any better than the
>>> compiler warning that we have for all polymorphic classes that have no
>>> virtual destructors?
>>>
>>> Maybe this one too.
>>
>
> Dunno, why duplicate a compiler warning that's already on by default and
> more aggressive than the checker?
>
>
>>>  >  alpha.security.MallocOverflow
>>>
>>> This one's extremely loud for me (~1500 false positives). It looks as if
>>> it warns on every `malloc(x * sizeof(int))` (due to potential integer
>>> overflow during multiplication) so i just don't see it working as an
>>> AST-based check. We should probably rewrite it on top of taint analysis
>>> and then it'll need a constraint solver that knows how to multiply
>>> things.
>>>
>>> Like, this is the point where i'd like to ask how does this happen that
>>> you don't see these false positives. Is this checker actually quiet for
>>> you? Or are your users addressing these warnings somehow?
>>>
>>>
>>>  >  alpha.security.MmapWriteExec
>>>
>>> I don't see any positives of this checker. It probably needs a visitor
>>> (which is trivial) and it definitely needs a solution for different
>>> values of macros on different platforms that'd be better than just
>>> setting them as a config flag.
>>>
>>>
>>>  >  alpha.security.ReturnPtrRange
>>>
>>> I don't see any positives of this checker. It definitely needs a visitor
>>> and we might as well join it with the generic array bound checker. Also
>>> need to figure out how to deal with, say, vector::end() which is
>>> supposed to return an out-of-bound pointer.
>>>
>>>
>>>  >  alpha.security.taint.TaintPropagation
>>>
>>> I believe that the generic taint engine is currently very solid, but
>>> specific checks that we have on top of it are very much unfinished: they
>>> simply don't react on any sort of validation that can be used to remove
>>> the taint. Normally that'll involve consulting the constraint manager
>>> (in case of tainted integers) or modeling validation routines (in case
>>> of more complicated objects).
>>>
>>>
>>>  >  alpha.unix.BlockInCriticalSection
>>>
>>> I have ~10 positives and i didn't have a look at them back then; might
>>> be good. This checker needs a visitor to explain why do we think we're
>>> in a critical section.
>>>
>>>
>>>  >  alpha.unix.Chroot
>>>
>>> I don't see any positives of this checker. This checker needs a visitor
>>> to highlight chroot.
>>>
>>>
>>>  >  alpha.unix.PthreadLock
>>>
>>> Uhm, i have a few patches on this checker that i never committed:
>>> - https://reviews.llvm.org/D37806
>>> - https://reviews.llvm.org/D37807
>>> - https://reviews.llvm.org/D37809
>>> - https://reviews.llvm.org/D37812
>>> - https://reviews.llvm.org/D37963
>>> And it still needs a visitor. And support for more APIs, 'cause there
>>> are still false positives caused by unobvious POSIX APIs that release
>>> the mutex (sometimes conditionally). And once it's done, i'll be seeing
>>> no positives of this checker; it sounds like a good checker to have, but
>>> it doesn't seem to find mistakes that are *easy to make*.
>>>
>>>
>>>  >  alpha.unix.SimpleStream
>>>  >  alpha.unix.Stream
>>>
>>> Those need, at least:
>>> - A bug visitor.
>>> - A suppress-on-sink behavior. Otherwise they warn on every assert
>>> between open and close (~200 false positives for me).
>>> - Pointer escape support.
>>> Also i vaguely remember that the non-simple checker is known to cause
>>> more state splits than necessary.
>>>
>>>
>>>  >  alpha.unix.cstring.NotNullTerminated
>>>
>>> Hmm, this check looks like a walking .addTransition() bug (unintended
>>> state split) when invoked from getCStringLength(). Also it doesn't seem
>>> to be disabled when the checker is disabled, so i guess we're kinda
>>> "using" it too. But it's also too quiet to matter, as it pretty much
>>> only warns when you're trying to compute a strlen() of a function
>>> pointer.
>>>
>>>
>>>  >  alpha.unix.cstring.OutOfBounds
>>>
>>> https://bugs.llvm.org/show_bug.cgi?id=41729 and it also needs a visitor
>>> for the index.
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190522/f96b5ee7/attachment.html>


More information about the cfe-dev mailing list