[cfe-dev] [analyzer] Alpha checker statuses.

Tue May 14 15:40:33 PDT 2019

I wanted to give more visibility to the discussion on the status of 
different Static Analyzer "alpha" (unfinished) checkers that's going on 
on Phabricator (https://reviews.llvm.org/D57858). Story so far: We're 
trying to figure out how many of them can be finished with relatively 
little effort (and therefore we should probably ask around to gather 
feedback) and how many are clearly not ready for any sort of use and 
should be hidden until the most glaring bugs are fixed. For now we 
officially treat all alpha checkers as the latter - so don't use them!

This discussion is important because different people's codebases are 
ridiculously different, so it's almost impossible to estimate the 
quality and usefulness of static analysis unless as many varied 
codebases as possible are involved.

 >>! In D57858#1500668, @Szelethus wrote:
 > `IteratorChecker` is a prime example that still suffers from not 
ideal FP/TP ratio, but users at Ericsson value it plenty enough not to 
be bothered by it. Many of the changes here and in private directly 
address user bug reports (That's just one, but I do remember having 
others around too!).

Once it has visitor path notes about where did it get its iterators 
from, some of the iterator checks should definitely be considered for 
being turned on by default. Especially the mismatched iterator check 
that doesn't rely on hardcore constraint solving. The current upstream 
version is not in good shape though; i just tried it on LLVM and it 
currently crashes all over the place with "Symbol comparison must be a 
`SymIntExpr`" assertion (pls ask me for repros if they aren't obvious). 
Also it has false mismatched iterator positives on `A.insert(B.begin(), 
B.end())`.

 >> In https://reviews.llvm.org/D57858#1501065, @dkrupp wrote:
 > These are the alpha checkers that we are testing in Ericsson:

Let me undig my last year's attempt to take a look at alpha checkers. 
The most common "limb" to "miss" in the alpha checkers is the "bug 
visitor" functionality that'd add checker-specific path notes to the 
report, which is almost inevitably necessary for any path-sensitive 
checker. Bug reports without path notes are hard to understand, but 
that's one thing that your users won't tell you: they often just don't 
have their good taste to realize that bug reports shouldn't be so hard 
to understand. The users often take it for granted that they have to 
figure out themselves where do these values come from, but it's still 
our job to not force them to.

 >  alpha.core.BoolAssignment

Yes, i agree that this one's pretty useful. It's currently missing a 
visitor that explains why does the analyzer think that the value is 
non-true/false, which is often necessary to understand the warning.

 >  alpha.core.CastSize

This one's indeed relatively quiet, but i'm seeing ~50 false positives 
caused by stuffing metadata at the beginning of a dynamically allocated 
buffer. I.e., allocate a buffer that's 4 bytes larger than necessary, 
use these 4 bytes for our own bookkeeping and provide a pointer to the 
rest of the buffer to be used for the actual value. I don't see an easy 
way to fix these false positives, so i don't see how to move this out of 
alpha.

 >  alpha.core.Conversion

Interestingly, i haven't seen this one trigger on our codebases. So i 
don't have an opinion here. There's a chance it might be a good opt-in 
check. Do you have an open-source project in mind on which this check is 
actually useful?

 >  alpha.core.DynamicTypeChecker

I really root for enabling this checker (and generally improving our 
dynamic type-checking), but for now i'm seeing ~600 false positives in 
projects that use custom RTTI (something like `dyn_cast`) and ~300 more 
Objective-C-specific false positives. I can take a look and try to 
reduce some of the custom RTTI ones if you're interested in figuring out 
how to fix them; i don't remember if they are easy to fix.

 >  alpha.core.SizeofPtr

This checker does indeed find interesting bugs sometimes, but i'm 
overwhelmed by ~300 false positives in which the sizeof of a pointer is 
taken intentionally. This is especially annoying when the pointer is 
hidden behind a typedef and the user doesn't need to know whether it's a 
pointer or not.

 >  alpha.core.TestAfterDivZero

I don't see any positives of this checker, but this checker is crazy and 
shouldn't have been done this way. It's a "must-problem" check and we 
don't have any sort of infrastructure to even display this kind of bug 
report properly after we find it, let alone to properly find it. We need 
a more-or-less full-featured data flow analysis engine before we make an 
attempt on such checker.

 >  alpha.cplusplus.DeleteWithNonVirtualDtor

I don't see any positives of this checker. Is it any better than the 
compiler warning that we have for all polymorphic classes that have no 
virtual destructors?

 >  alpha.security.MallocOverflow

This one's extremely loud for me (~1500 false positives). It looks as if 
it warns on every `malloc(x * sizeof(int))` (due to potential integer 
overflow during multiplication) so i just don't see it working as an 
AST-based check. We should probably rewrite it on top of taint analysis 
and then it'll need a constraint solver that knows how to multiply things.

Like, this is the point where i'd like to ask how does this happen that 
you don't see these false positives. Is this checker actually quiet for 
you? Or are your users addressing these warnings somehow?

 >  alpha.security.MmapWriteExec

I don't see any positives of this checker. It probably needs a visitor 
(which is trivial) and it definitely needs a solution for different 
values of macros on different platforms that'd be better than just 
setting them as a config flag.

 >  alpha.security.ReturnPtrRange

I don't see any positives of this checker. It definitely needs a visitor 
and we might as well join it with the generic array bound checker. Also 
need to figure out how to deal with, say, vector::end() which is 
supposed to return an out-of-bound pointer.

 >  alpha.security.taint.TaintPropagation

I believe that the generic taint engine is currently very solid, but 
specific checks that we have on top of it are very much unfinished: they 
simply don't react on any sort of validation that can be used to remove 
the taint. Normally that'll involve consulting the constraint manager 
(in case of tainted integers) or modeling validation routines (in case 
of more complicated objects).

 >  alpha.unix.BlockInCriticalSection

I have ~10 positives and i didn't have a look at them back then; might 
be good. This checker needs a visitor to explain why do we think we're 
in a critical section.

 >  alpha.unix.Chroot

I don't see any positives of this checker. This checker needs a visitor 
to highlight chroot.

 >  alpha.unix.PthreadLock

Uhm, i have a few patches on this checker that i never committed:
- https://reviews.llvm.org/D37806
- https://reviews.llvm.org/D37807
- https://reviews.llvm.org/D37809
- https://reviews.llvm.org/D37812
- https://reviews.llvm.org/D37963
And it still needs a visitor. And support for more APIs, 'cause there 
are still false positives caused by unobvious POSIX APIs that release 
the mutex (sometimes conditionally). And once it's done, i'll be seeing 
no positives of this checker; it sounds like a good checker to have, but 
it doesn't seem to find mistakes that are *easy to make*.

 >  alpha.unix.SimpleStream
 >  alpha.unix.Stream

Those need, at least:
- A bug visitor.
- A suppress-on-sink behavior. Otherwise they warn on every assert 
between open and close (~200 false positives for me).
- Pointer escape support.
Also i vaguely remember that the non-simple checker is known to cause 
more state splits than necessary.

 >  alpha.unix.cstring.NotNullTerminated

Hmm, this check looks like a walking .addTransition() bug (unintended 
state split) when invoked from getCStringLength(). Also it doesn't seem 
to be disabled when the checker is disabled, so i guess we're kinda 
"using" it too. But it's also too quiet to matter, as it pretty much 
only warns when you're trying to compute a strlen() of a function pointer.

 >  alpha.unix.cstring.OutOfBounds

https://bugs.llvm.org/show_bug.cgi?id=41729 and it also needs a visitor 
for the index.