[cfe-dev] [analyzer][RFC] Our stance on checker dependencies and disabling core checkers

Tue Aug 13 17:07:51 PDT 2019

On 8/13/19 4:15 PM, Kristóf Umann wrote:
> On Wed, 14 Aug 2019 at 00:12, Artem Dergachev <noqnoqneo at gmail.com 
> <mailto:noqnoqneo at gmail.com>> wrote:
>
>
>
>     On 8/12/19 7:20 AM, Kristóf Umann via cfe-dev wrote:
>>     Hi!
>>
>>     Our stance has long been that despite being visible to the user,
>>     core checkers shouldn't be enabled/disabled by hand, because they
>>     do important modeling that any pathsensitive checker may depend
>>     on, potentially causing incorrect reports and crashes. However,
>>     since then, a series of patches gave us the ability to express
>>     dependencies and hide modeling checkers.
>>
>>     The core doesn't only do modeling, however, it emits diagnostics
>>     as well, and these diagnostics may contain false positives,
>>     sometimes to the degree where getting rid of them is desirable,
>>     yet we explicitly state that it shouldn't be disabled. And this
>>     problem doesn't only affect the core itself: disabling any
>>     checker that emits diagnostics and also is a dependency of some
>>     another checker will disable dependent checkers, which isn't
>>     always the intent.
>>
>>     When I originally implemented the checker dependency system, my
>>     immediate goal was to fix a bug causing inconsistent checker
>>     names, but I firmly believe its time to make the it even more rigid:
>>
>>     * Don't allow dependency checkers to emit diagnostics. Since the
>>     list of these checkers can easily be retrieved through
>>     Checkers.td, assertions could be used to enforce this. This would
>>     solve the issue of forgetting to create CheckerProgramPointTags
>>     for subcheckers for good, and a helpful assertion message could
>>     guide checker developers about it.
>>     * Make all dependency checkers hidden. If no dependent checkers
>>     are enabled, let the analyzer disable them. Disabling a
>>     non-hidden checker should only mean that the diagnostics from it
>>     are undesired.
>
>     Back when we were just discussing checker dependencies, i wasn't
>     sure we need them. My point was that if we instead split all
>     checkers into checkers that emit actual warnings but don't do
>     modeling (and therefore don't need to depend on each other) and
>     checkers that only do modeling but never do checking (and
>     therefore never need to be turned off), our hierarchy of checkers
>     becomes fairly flat and there's no need to write down dependencies
>     at all.
>
>
> Sounds about right! In this case, just as the const keyword in a 
> disciplined codebase, dependencies wouldn't be needed. However, we 
> could use it to enforce this, among other things, like the use of 
> checker tags. And now that I think about it, we do really need some 
> sort of dependency system to keep the checker naming bug in the abyss, 
> though I can't confidently say this is the only solution. I'll keep 
> this in the back of my mind and either try to prove that we need it or 
> mention alternatives.
>
>     With these first two bullets we get closer than ever to that
>     solution, right?
>
>
> Yup. We would only be able to enforce it with asserts, but I suspect 
> we have at least a single testcase for each checker that emits a 
> report, so theoretically, its not even "getting closer", but rather 
> actually nailing it! Theoretically.
>
>     Even if we ever need to turn off modeling-checkers (eg., they have
>     horrible bugs in them and we suggest disabling them as a
>     workaround), in most cases it's fine to keep their respective
>     checking-checkers on (they'll simply not find any bugs: say,
>     MallocChecker will be unable to find bugs if the memory is not
>     ever marked as allocated or freed). If it's not fine to keep them
>     on - well, just turn them off manually, given that you are already
>     smart enough to turn off the modeling-checker.
>
>
> I vaguely remember reading in your workbook that checkers are 
> relatively lightweight compared to what the core is doing (I also 
> remember something like a double-digit percentage of time being spent 
> in SymbolReaper alone), but why keep it enabled when we know it'll do 
> nothing? Also, if we know it wouldn't do anything, 
> -analyzer-list-enabled-checkers would be more precise by not 
> displaying it there.
>
> That said, I agree that these are low level issues, and I wouldn't 
> worry much about them.
>
>     Situations when we truly need dependencies are currently fairly
>     exotic. We may run into more such situations when we have more
>     complicated checker interactions (say when we seriously try to
>     model the C++ standard library - i'm pretty sure that's gonna be
>     quite a mess), but for now our hierarchy of checkers is fairly
>     flat in practice.
>
>
> I agree, but I guess when the time comes, an already mature dependency 
> system would be one less thing to worry about, right? :)

Absolutely!!

> Also, we currently have 46 dependencies registered, and have a total 
> of 164 checkers, meaning that a good percentage of them is affected -- 
> would you say that a significant portion of these shouldn't depend on 
> one another? If its not too much trouble, do you have an example on 
> top of your head where you believe its appropriate, and one where its 
> unnecessary?
>
> (You can always see the list of dependencies in <build 
> directory>/tools/clang/include/clang/StaticAnalyzer/Checkers/Checkers.inc)

I mean, let's take, say, MallocChecker. It only does modeling within its 
own isolated GDM trait; it's invisible to other families of checkers. In 
this case if the model is turned off, the checker has no effect - it 
effectively gets silently disabled, we don't need an additional facility 
to get it disabled. Most of the time we also don't need an additional 
facility to make the model enabled when the checker is enabled - because 
it doesn't make much sense to disable the model in the first place. Such 
dependencies should still be there, just for the sake of discipline and 
sanity (and also for sorting out problems with checker names), but they 
aren't an indication that our dependency trees are actually complex 
enough to motivate a facility for dependency management.

On the contrary, the dependency between MoveChecker and SmartPtrModel is 
non-trivial: we made the latter specifically to fix common false 
positives of the former, so we recommend enabling the SmartPtrModel when 
the user wants MoveChecker. This is an example where things truly get 
complicated, and it will get even more complicated in the future. I 
think this is the actual motivation we have for checker dependencies and 
i'm very happy that it's in place.

>>     * Allow checkers to depend on packages.
>>     * Create the hidden(!) coreModeling package, separate all the
>>     modeling from core to it, leaving core as a set of highly
>>     recommended, but no longer mandatory checkers. coreModeling would
>>     be a dependency of all pathsensitive checkers.
>     This is equivalent to annotating path-sensitive checkers as such.
>     This would allow, say, clang-tidy enable path-insensitive checkers
>     without bringing in path-sensitive core checkers.
>
>     I'm worried that i need to remember to add a dependency every time
>     i make a path-sensitive check. Can we enforce it somehow, or even
>     do that automatically? 'Cause at run-time we already know whether
>     any path-sensitive checkers are enabled (by looking at how many
>     subscribers do path-sensitive callbacks have). Can we use that
>     information to automatically bring in path-sensitive core checkers
>     when path-sensitive analysis was requested, or is it too late at
>     this point?
>
>
> Aye, it sounds totally possible! Resolving dependencies has to be done 
> in order, but coreModeling would be a special case -- no checker 
> should directly depend on it, so we would totally be fine registering 
> them after every other checker, depending on 
> CheckerManager::hasPathSensitiveCheckers().
>
>>     I think this would allow users to opt out of any undesired
>>     diagnostics, without the fear of causing instabilities. It would
>>     simultaneously create a far more precise representation of
>>     dependencies, since no checker depends on another's diagnostics.
>>     Also note that while this API change is significant, its also
>>     totally backward-compatible, as nothing would change on the
>>     user-facing side.
>>
>>     Would love to hear your feedback on this!
>>
>>     Cheers,
>>     Kristóf
>>
>>     _______________________________________________
>>     cfe-dev mailing list
>>     cfe-dev at lists.llvm.org  <mailto:cfe-dev at lists.llvm.org>
>>     https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190813/de35fb38/attachment.html>