[cfe-dev] [analyzer][RFC] Our stance on checker dependencies and disabling core checkers

Thu Aug 15 16:42:58 PDT 2019

We seem to agree on that an even more thought-out dependency system would
be good, and that separating checkers into modeling and diagnostics
portions would be ideal, so I'll detail my proposal further:

I googled it, apparently "modelling" is British, and "modeling" is
American, so for this mail I'm just gonna roll with American. :^)

1. Don't allow dependency checkers (and implicitly regard them as modeling
checkers) and modeling checkers to emit diagnostics
  a.) This implies that if we imagine checkers dependencies as a directed
graph where any given checker's parent is the checkers it depends on, any
given checker that emits diagnostics is a leaf and has either no parent, or
it's parent is a modeling checker, and the path leading to the root of the
directed graph contains only modeling checkers. Since Checkers.inc contains
every checker that is a dependency, assert()ing whether the ProgramPointTag
associated with the report is a dependency checker (given that we have at
least one testcase for each checker that emits a report) would enforce that
any report is emitted with the correct tag.
  b.) The problem that this wouldn't solve is modeling subcheckers, Making
addTransition take the tag as a mandatory argument would help on this, but
wouldn't *enforce* it, and woud result in a significant code duplication.
We define subcheckers as checkers that do not have their own checker
objects (they are basically glorified checker options), and I don't even
have any ideas how we could catch these cases. There are two things that
make this a tad bit better: For one, we don't have a checker like that just
yet. For another, this would only affect developers when they are
inspecting the ExplodedGraph, and with some experience, they may be able to
catch and fix this.
  c.) This would pretty much make it so that disabling a checker would mean
suppressing it's diagnostics, nothing else.

2. Making all modeling checkers hidden
  a.) Hidden checkers are defined as development only checkers. Riding on
this excuse, I envision no expectations from end users regarding this. We
mentioned that maybe if a dependency checker gets disabled but if later in
the invocation a dependent checker is enabled it should reenable the
dependency, but I think there is no need to implement this functionality if
we enforce this rule along with the first one. Anybody who disables a
development-only checker should know what that means, and any non-developer
advised to do so is probably doing it to get rid of a crash or similar bad
behavior.
  b.) Modeling checkers really are implementation details, and this move
would make sure that whatever the user enables/disables, the analyzer would
be responsible for any undesired side effects.

3. Allow checkers to depend on packages
Does this need any justification? We should be able to depend on whether
the sun is out or not, but definitely on packages.

4. Create the hidden(!) coreModeling package, separate all the modeling
from core to it, leaving core as a set of highly recommended, but no longer
mandatory checkers. coreModeling would be a dependency of all pathsensitive
checkers.
  a.) Remember how we need to remember to enable core in all test cases
where at least a single pathsensitive checker is enabled? Well, from now
on, **not** enabling core would be an even better idea, because we would
have all the necessary modeling without the obstruction of core
diagnostics. Also, one less thing to keep in mind!
  b.) "Make an interface easy to use correctly, and hard to use
incorrectly" (Scott Meyers): This would be a drastic improvement in that
department.
  c.) Don't allow any checker to directly depend on a coreModeling checker.
This would allow us to evaluate whether there are any pathsensitive
checkers enabled (which we can only do after we loaded everything else into
CheckerManager) and enable coreModeling according to that. This would of
course increase the pressure on codeModeling: if an error in this package
slips into a release, we're screwed, though this has pretty much been the
case ever since, my proposal only formalizes it a little better.

5. Regard some operation modifying the ProgramState as modeling operations,
and regard checkers doing modeling operations as modeling checkers.
  a.) This would both make sure that the ExplodedGraph would remain the
same aside from the diagnostic reports regardless of what non-developer,
non-alpha checkers are enabled. We could come up with other heuristics as
well, such as regarding checkers that add regular range constraints to a
value to be regarded as modeling checkers.
  b.) Regard the creation of sink nodes as modeling operations. I bet we
could somehow combine the function to create regular sink nodes and sink
error nodes to allow bug reports to be both sink nodes and not be regarded
as modeling checkers.
  c.) Modifications of regular range constraints are also modeling
operations.
  d.) Document this well, make adding new modeling operations simple, and
easy to remember for reviewers.

What do you think? This is more of a checklist of things I'd like to see
implemented accompanied with some of my design philosophies, rather than an
order in which I'm planning to implement it.

On Wed, 14 Aug 2019 at 02:27, Kristóf Umann <dkszelethus at gmail.com> wrote:

>
>> Also, we currently have 46 dependencies registered, and have a total of
>> 164 checkers, meaning that a good percentage of them is affected -- would
>> you say that a significant portion of these shouldn't depend on one
>> another? If its not too much trouble, do you have an example on top of your
>> head where you believe its appropriate, and one where its unnecessary?
>>
>> (You can always see the list of dependencies in <build
>> directory>/tools/clang/include/clang/StaticAnalyzer/Checkers/Checkers.inc)
>>
>>
>> I mean, let's take, say, MallocChecker. It only does modeling within its
>> own isolated GDM trait; it's invisible to other families of checkers. In
>> this case if the model is turned off, the checker has no effect - it
>> effectively gets silently disabled, we don't need an additional facility to
>> get it disabled. Most of the time we also don't need an additional facility
>> to make the model enabled when the checker is enabled - because it doesn't
>> make much sense to disable the model in the first place. Such dependencies
>> should still be there, just for the sake of discipline and sanity (and also
>> for sorting out problems with checker names), but they aren't an indication
>> that our dependency trees are actually complex enough to motivate a
>> facility for dependency management.
>>
>
>
>> On the contrary, the dependency between MoveChecker and SmartPtrModel is
>> non-trivial: we made the latter specifically to fix common false positives
>> of the former, so we recommend enabling the SmartPtrModel when the user
>> wants MoveChecker. This is an example where things truly get complicated,
>> and it will get even more complicated in the future. I think this is the
>> actual motivation we have for checker dependencies and i'm very happy that
>> it's in place.
>>
>
> Ah, very interesting. I actually think the contrary -- subcheckers are the
> ones that really drive this dependency development forward, because for
> them, its a feature, it it provides a safe, easy to use interface. Adding
> more functionalities to it could further reduce boilerplate code, improve
> stability and consistency.
>
> On the other hand, MoveChecker's dependence and SmartPtrModel is far
> simpler, as they don't interact at all directly. For them, the now gone
> inter-checker-api header would've been sufficient as long as it gets the
> job done.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190816/0fa3d50f/attachment.html>