r352470 - [analyzer] Added a checklist to help checker authors and reviewers
Gabor Horvath via cfe-commits
cfe-commits at lists.llvm.org
Tue Jan 29 02:21:49 PST 2019
Author: xazax
Date: Tue Jan 29 02:21:49 2019
New Revision: 352470
URL: http://llvm.org/viewvc/llvm-project?rev=352470&view=rev
Log:
[analyzer] Added a checklist to help checker authors and reviewers
Differential Revision: https://reviews.llvm.org/D52984
Modified:
cfe/trunk/www/analyzer/checker_dev_manual.html
Modified: cfe/trunk/www/analyzer/checker_dev_manual.html
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/www/analyzer/checker_dev_manual.html?rev=352470&r1=352469&r2=352470&view=diff
==============================================================================
--- cfe/trunk/www/analyzer/checker_dev_manual.html (original)
+++ cfe/trunk/www/analyzer/checker_dev_manual.html Tue Jan 29 02:21:49 2019
@@ -675,6 +675,111 @@ to:</p>
(gdb) <b>p C.getPredecessor()->getCodeDecl().getBody()->dump()</b>
</pre>
+<h2 id=links>Making Your Checker Better</h2>
+<ul>
+<li>User facing documentation is important for adoption! Make sure the <a href="/available_checks.html">checker list </a>is updated
+ at the homepage of the analyzer. Also ensure the description is clear to
+ non-analyzer-developers in <tt>Checkers.td</tt>.</li>
+<li>Warning and note messages should be clear and easy to understand, even if a bit long.</li>
+<ul>
+ <li>Messages should start with a capital letter (unlike Clang warnings!) and should not
+ end with <tt>.</tt>.</li>
+ <li>Articles are usually omitted, eg. <tt>Dereference of a null pointer</tt> ->
+ <tt>Dereference of null pointer</tt>.</li>
+ <li>Introduce <tt>BugReporterVisitor</tt>s to emit additional notes that explain the warning
+ to the user better. There are some existing visitors that might be useful for your check,
+ e.g. <tt>trackNullOrUndefValue</tt>. For example, SimpleStreamChecker should highlight
+ the event of opening the file when reporting a file descriptor leak.</li>
+</ul>
+<li>If the check tracks anything in the program state, it needs to implement the
+ <tt>checkDeadSymbols</tt>callback to clean the state up.</li>
+<li>The check should conservatively assume that the program is correct when a tracked symbol
+ is passed to a function that is unknown to the analyzer.
+ <tt>checkPointerEscape</tt> callback could help you handle that case.</li>
+<li>Use safe and convenient APIs!</li>
+<ul>
+ <li>Always use <tt>CheckerContext::generateErrorNode</tt> and
+ <tt>CheckerContext::generateNonFatalErrorNode</tt> for emitting bug reports.
+ Most importantly, never emit report against <tt>CheckerContext::getPredecessor</tt>.</li>
+ <li>Prefer <tt>checkPreCall</tt> and <tt>checkPostCall</tt> to
+ <tt>checkPreStmt<CallExpr></tt> and <tt>checkPostStmt<CallExpr></tt>.</li>
+ <li>Use <tt>CallDescription</tt> to detect hardcoded API calls in the program.</li>
+ <li>Simplify <tt>C.getState()->getSVal(E, C.getLocationContext())</tt> to <tt>C.getSVal(E)</tt>.</li>
+</ul>
+<li>Common sources of crashes:</li>
+<ul>
+ <li><tt>CallEvent::getOriginExpr</tt> is nullable - for example, it returns null for an
+ automatic destructor of a variable. The same applies to some values generated while the
+ call was modeled, eg. <tt>SymbolConjured::getStmt</tt> is nullable.</li>
+ <li><tt>CallEvent::getDecl</tt> is nullable - for example, it returns null for a
+ call of symbolic function pointer.</li>
+ <li><tt>addTransition</tt>, <tt>generateSink</tt>, <tt>generateNonFatalErrorNode</tt>,
+ <tt>generateErrorNode</tt> are nullable because you can transition to a node that you have already visited.</li>
+ <li>Methods of <tt>CallExpr</tt>/<tt>FunctionDecl</tt>/<tt>CallEvent</tt> that
+ return arguments crash when the argument is out-of-bounds. If you checked the function name,
+ it doesn't mean that the function has the expected number of arguments!
+ Which is why you should use <tt>CallDescription</tt>.</li>
+ <li>Nullability of different entities within different kinds of symbols and regions is usually
+ documented via assertions in their constructors.</li>
+ <li><tt>NamedDecl::getName</tt> will fail if the name of the declaration is not a single token,
+ e.g. for destructors. You could use <tt>NamedDecl::getNameAsString</tt> for those cases.
+ Note that this method is much slower and should be used sparringly, e.g. only when generating reports
+ but not during analysis.</li>
+ <li>Is <tt>-analyzer-checker=core</tt> included in all test <tt>RUN:</tt> lines? It was never supported
+ to run the analyzer with the core checks disabled. It might cause unexpected behavior and
+ crashes. You should do all your testing with the core checks enabled.</li>
+</ul>
+</ul>
+<li>Patterns that you should most likely avoid even if they're not technically wrong:</li>
+<ul>
+ <li><tt>BugReporterVisitor</tt> should most likely not match the AST of the current program point
+ to decide when to emit a note. It is much easier to determine that by observing changes in
+ the program state.</li>
+ <li>In <tt>State->getSVal(Region)</tt>, if <tt>Region</tt> is not known to be a <tt>TypedValueRegion</tt>
+ and the optional type argument is not specified, the checker may accidentally try to dereference a
+ void pointer.</li>
+ <li>Checker logic should not depend on whether a certain value is a <tt>Loc</tt> or <tt>NonLoc</tt>.
+ It should be immediately obvious whether the <tt>SVal</tt> is a <tt>Loc</tt> or a
+ <tt>NonLoc</tt> depending on the AST that is being checked. Checking whether a value
+ is <tt>Loc</tt> or <tt>Unknown</tt>/<tt>Undefined</tt> or whether the value is
+ <tt>NonLoc</tt> or <tt>Unknown</tt>/<tt>Undefined</tt> is totally fine.</li>
+ <li>New symbols should not be constructed in the checker via direct calls to <tt>SymbolManager</tt>,
+ unless they are of <tt>SymbolMetadata</tt> class tagged by the checker,
+ or they represent newly created values such as the return value in <tt>evalCall</tt>.
+ For modeling arithmetic/bitwise/comparison operations, <tt>SValBuilder</tt> should be used.</li>
+ <li>Custom <tt>ProgramPointTag</tt>s should not be created within the checker. There is usually
+ no good reason for a checker to chain multiple nodes together, because checkers aren't worklists.</li>
+</ul>
+<li>Checkers are encouraged to actively participate in the analysis by sharing
+ their knowledge about the program state with the rest of the analyzer,
+ but they should not be disrupting the analysis unnecessarily:</li>
+<ul>
+ <li>If a checker splits program state, this must be based on knowledge that
+ the newly appearing branches are definitely possible and worth exploring
+ from the user's perspective. Otherwise the state split should be delayed
+ until there's an indication that one of the paths is taken, or one of the
+ paths needs to be dropped entirely. For example, it is fine to eagerly split
+ paths while modeling <tt>isalpha(x)</tt> as long as <tt>x</tt> is constrained accordingly on
+ each path. At the same time, it is not a good idea to split paths over the
+ return value of <tt>printf()</tt> while modeling the call because nobody ever checks
+ for errors in <tt>printf</tt>; at best, it'd just double the remaining analysis time.
+ </li>
+ <li>Caution is advised when using <tt>CheckerContext::generateNonFatalErrorNode</tt>
+ because it generates an independent transition, much like <tt>addTransition</tt>.
+ It is easy to accidentally split paths while using it. Ideally, try to
+ structure the code so that it was obvious that every <tt>addTransition</tt> or
+ <tt>generateNonFatalErrorNode</tt> (or sequence of such if the split is intended) is
+ immediately followed by return from the checker callback.</li>
+ <li>Multiple implementations of <tt>evalCall</tt> in different checkers should not conflict.</li>
+ <li>When implementing <tt>evalAssume</tt>, the checker should always return a non-null state
+ for either the true assumption or the false assumption (or both).</li>
+ <li>Checkers shall not mutate values of expressions, i.e. use the <tt>ProgramState::BindExpr</tt> API,
+ unless they are fully responsible for computing the value.
+ Under no circumstances should they change non-<tt>Unknown</tt> values of expressions.
+ Currently the only valid use case for this API in checkers is to model the return value in the <tt>evalCall</tt> callback.
+ If expression values are incorrect, <tt>ExprEngine</tt> needs to be fixed instead.</li>
+</ul>
+
<h2 id=additioninformation>Additional Sources of Information</h2>
Here are some additional resources that are useful when working on the Clang
More information about the cfe-commits
mailing list