[cfe-dev] [analyzer][RFC] Our stance on checker dependencies and disabling core checkers

Tue Aug 13 16:15:01 PDT 2019

On Wed, 14 Aug 2019 at 00:12, Artem Dergachev <noqnoqneo at gmail.com> wrote:

>
>
> On 8/12/19 7:20 AM, Kristóf Umann via cfe-dev wrote:
>
> Hi!
>
> Our stance has long been that despite being visible to the user, core
> checkers shouldn't be enabled/disabled by hand, because they do important
> modeling that any path-sensitive checker may depend on, potentially causing
> incorrect reports and crashes. However, since then, a series of patches
> gave us the ability to express dependencies and hide modeling checkers.
>
> The core doesn't only do modeling, however, it emits diagnostics as well,
> and these diagnostics may contain false positives, sometimes to the degree
> where getting rid of them is desirable, yet we explicitly state that it
> shouldn't be disabled. And this problem doesn't only affect the core
> itself: disabling any checker that emits diagnostics and is also a
> dependency of another checker will disable the dependent checkers, which
> isn't always the intent.
>
> When I originally implemented the checker dependency system, my immediate
> goal was to fix a bug causing inconsistent checker names, but I firmly
> believe it's time to make it even more rigid:
>
> * Don't allow dependency checkers to emit diagnostics. Since the list of
> these checkers can easily be retrieved through Checkers.td, assertions
> could be used to enforce this. This would solve the issue of forgetting to
> create CheckerProgramPointTags for subcheckers for good, and a helpful
> assertion message could guide checker developers about it.
> * Make all dependency checkers hidden. If no dependent checkers are
> enabled, let the analyzer disable them. Disabling a non-hidden checker
> should only mean that the diagnostics from it are undesired.
>
>
> Back when we were just discussing checker dependencies, i wasn't sure we
> need them. My point was that if we instead split all checkers into checkers
> that emit actual warnings but don't do modeling (and therefore don't need
> to depend on each other) and checkers that only do modeling but never do
> checking (and therefore never need to be turned off), our hierarchy of
> checkers becomes fairly flat and there's no need to write down dependencies
> at all.
>

Sounds about right! In that case, just like the const keyword in a
disciplined codebase, dependencies wouldn't strictly be needed. However, we
could use them to enforce this split, among other things, like the use of
checker tags.
And now that I think about it, we do really need some sort of dependency
system to keep the checker naming bug in the abyss, though I can't
confidently say this is the only solution. I'll keep this in the back of my
mind and either try to prove that we need it or mention alternatives.

> With these first two bullets we get closer than ever to that solution,
> right?
>

Yup. We would only be able to enforce it with asserts, but I suspect we
have at least a single testcase for each checker that emits a report, so
theoretically, it's not even "getting closer", but rather actually nailing
it! Theoretically.
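
To make the assertion idea concrete, here is a minimal sketch of how such a
check could look. It is a toy model under stated assumptions, not the
analyzer's real API: ToyRegistry, ToyReporter, mayEmit and the toy.* checker
names are all hypothetical, and the dependency map merely stands in for the
information that Checkers.td provides.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only: none of these types exist in Clang.
struct ToyRegistry {
  // Checker name -> names of the checkers it depends on.
  std::map<std::string, std::vector<std::string>> Dependencies;

  // A checker counts as a "dependency checker" if any other checker lists it.
  std::set<std::string> dependencyCheckers() const {
    std::set<std::string> Result;
    for (const auto &Entry : Dependencies)
      Result.insert(Entry.second.begin(), Entry.second.end());
    return Result;
  }
};

struct ToyReporter {
  std::set<std::string> ModelingOnly; // checkers that must stay silent
  std::vector<std::string> Reports;   // diagnostics emitted so far

  explicit ToyReporter(const ToyRegistry &Registry)
      : ModelingOnly(Registry.dependencyCheckers()) {}

  // True if the checker is not a dependency of anything else.
  bool mayEmit(const std::string &Checker) const {
    return ModelingOnly.count(Checker) == 0;
  }

  void emitReport(const std::string &Checker, const std::string &Message) {
    assert(mayEmit(Checker) &&
           "Dependency checkers must not emit diagnostics; report through a "
           "dependent checker instead!");
    Reports.push_back(Checker + ": " + Message);
  }
};
```

With a test case per reporting checker, a debug build of this toy would trip
the assertion as soon as a modeling checker tried to emit a report itself.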

> Even if we ever need to turn off modeling-checkers (e.g., they have horrible
> bugs in them and we suggest disabling them as a workaround), in most cases
> it's fine to keep their respective checking-checkers on (they'll simply not
> find any bugs: say, MallocChecker will be unable to find bugs if the memory
> is not ever marked as allocated or freed). If it's not fine to keep them on
> - well, just turn them off manually, given that you are already smart
> enough to turn off the modeling-checker.
>

I vaguely remember reading in your workbook that checkers are relatively
lightweight compared to what the core is doing (I also remember something
like a double-digit percentage of time being spent in SymbolReaper alone),
but why keep it enabled when we know it'll do nothing? Also, if we know it
wouldn't do anything, -analyzer-list-enabled-checkers would be more precise
by not displaying it there.

That said, I agree that these are low level issues, and I wouldn't worry
much about them.
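
For illustration, the "only enable a hidden dependency when something needs
it" rule can be sketched as a small worklist over the dependency map. This is
a toy under stated assumptions: DependencyMap, resolveEnabled and the toy.*
checker names are hypothetical and do not mirror the analyzer's actual
registration code.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the dependency data: checker -> its dependencies.
using DependencyMap = std::map<std::string, std::vector<std::string>>;

// Returns the requested checkers plus every dependency they transitively
// require. A dependency that nobody requires never enters the result, so it
// would not show up in a listing of enabled checkers either.
std::set<std::string> resolveEnabled(const std::set<std::string> &Requested,
                                     const DependencyMap &Deps) {
  std::set<std::string> Enabled;
  std::vector<std::string> Worklist(Requested.begin(), Requested.end());
  while (!Worklist.empty()) {
    std::string Current = Worklist.back();
    Worklist.pop_back();
    assert(!Current.empty() && "Checker names are expected to be non-empty!");
    if (!Enabled.insert(Current).second)
      continue; // already handled, which also guards against cycles
    auto It = Deps.find(Current);
    if (It == Deps.end())
      continue; // no dependencies registered for this checker
    for (const std::string &Dep : It->second)
      Worklist.push_back(Dep);
  }
  return Enabled;
}
```

In this model, requesting only toy.MallocChecker pulls in its modeling
counterpart, while the modeling checker of an unrequested checker stays off.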

> Situations when we truly need dependencies are currently fairly exotic. We
> may run into more such situations when we have more complicated checker
> interactions (say when we seriously try to model the C++ standard library -
> i'm pretty sure that's gonna be quite a mess), but for now our hierarchy of
> checkers is fairly flat in practice.
>

I agree, but I guess when the time comes, an already mature dependency
system would be one less thing to worry about, right? :) Also, we currently
have 46 dependencies registered, and have a total of 164 checkers, meaning
that a good percentage of them is affected -- would you say that a
significant portion of these shouldn't depend on one another? If it's not
too much trouble, do you have an example off the top of your head where you
believe it's appropriate, and one where it's unnecessary?

(You can always see the list of dependencies in <build
directory>/tools/clang/include/clang/StaticAnalyzer/Checkers/Checkers.inc)

> * Allow checkers to depend on packages.
> * Create the hidden(!) coreModeling package, separate all the modeling
> from core to it, leaving core as a set of highly recommended, but no longer
> mandatory checkers. coreModeling would be a dependency of all path-sensitive
> checkers.
>
> This is equivalent to annotating path-sensitive checkers as such. This
> would allow, say, clang-tidy to enable path-insensitive checkers without
> bringing in path-sensitive core checkers.
>
> I'm worried that i need to remember to add a dependency every time i make
> a path-sensitive check. Can we enforce it somehow, or even do that
> automatically? 'Cause at run-time we already know whether any
> path-sensitive checkers are enabled (by looking at how many subscribers
> path-sensitive callbacks have). Can we use that information to
> automatically bring in path-sensitive core checkers when path-sensitive
> analysis was requested, or is it too late at this point?
>

Aye, it sounds totally possible! Resolving dependencies has to be done in
order, but coreModeling would be a special case -- no checker should
directly depend on it, so we would be totally fine registering its checkers
after every other checker, depending on
CheckerManager::hasPathSensitiveCheckers().
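
A rough sketch of that late registration step, as a toy model rather than the
real CheckerManager: ToyChecker, ToyCheckerManager and registerAll are
hypothetical names, and only the name hasPathSensitiveCheckers() is borrowed
from the text above. The query has to run before the modeling checkers are
added, because they are path-sensitive themselves.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: a checker is just a name plus a path-sensitivity flag.
struct ToyChecker {
  std::string Name;
  bool IsPathSensitive;
};

class ToyCheckerManager {
  std::vector<ToyChecker> Registered;

public:
  void registerChecker(const ToyChecker &C) {
    assert(!C.Name.empty() && "Checkers are expected to have a name!");
    Registered.push_back(C);
  }

  // True if at least one registered checker is path-sensitive.
  bool hasPathSensitiveCheckers() const {
    for (const ToyChecker &C : Registered)
      if (C.IsPathSensitive)
        return true;
    return false;
  }

  // Names of all registered checkers, in registration order.
  std::vector<std::string> names() const {
    std::vector<std::string> Result;
    for (const ToyChecker &C : Registered)
      Result.push_back(C.Name);
    return Result;
  }
};

// Registers the user-enabled checkers first, then brings in the hidden
// coreModeling checkers only if some path-sensitive checker is present.
void registerAll(ToyCheckerManager &Mgr,
                 const std::vector<ToyChecker> &UserEnabled,
                 const std::vector<ToyChecker> &CoreModeling) {
  for (const ToyChecker &C : UserEnabled)
    Mgr.registerChecker(C);
  if (!Mgr.hasPathSensitiveCheckers())
    return; // purely syntactic run: no modeling needed
  for (const ToyChecker &C : CoreModeling)
    Mgr.registerChecker(C);
}
```

Under this scheme nobody has to remember to write the dependency down: a run
with only path-insensitive checkers never sees the modeling checkers at all.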

> I think this would allow users to opt out of any undesired diagnostics,
> without the fear of causing instabilities. It would simultaneously create a
> far more precise representation of dependencies, since no checker depends
> on another's diagnostics. Also note that while this API change is
> significant, it's also totally backward-compatible, as nothing would change
> on the user-facing side.
>
> Would love to hear your feedback on this!
>
> Cheers,
> Kristóf
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>
>