diff --git a/www/analyzer/checker_dev_manual.html b/www/analyzer/checker_dev_manual.html index a824953..24388d8 100644 --- a/www/analyzer/checker_dev_manual.html +++ b/www/analyzer/checker_dev_manual.html @@ -33,15 +33,20 @@ for developer guidelines and send your questions and proposals to
Once an idea for a checker has been chosen, there are two key decisions that +need to be made: +
-using namespace clang; -using namespace ento; +void ento::registerSimpleStreamChecker(CheckerManager &mgr) { + mgr.registerChecker<SimpleStreamChecker>(); +} ++
+let ParentPackage = UnixAlpha in { +... +def SimpleStreamChecker : Checker<"SimpleStream">, + HelpText<"Check for misuses of stream APIs">, + DescFile<"SimpleStreamChecker.cpp">; +... +} // end "alpha.unix" ++ +
All checkers inherit from the +Checker template class; the template parameter(s) describe the type of +events that the checker is interested in processing. The various types of events +that are available are described in the file +CheckerDocumentation.cpp -namespace { -class NewChecker: public Checker< check::PreStmt<CallExpr> > { +
For each event type requested, a corresponding callback function must be +defined in the checker class (CheckerDocumentation.cpp shows the +correct function name and signature for each event type). + +
As an example, consider SimpleStreamChecker. This checker needs to +take action at the following times: + +
The events that will be used for each of these actions are, respectively, PreCall, +PostCall, +DeadSymbols, +and PointerEscape. +The high-level structure of the checker's class is thus: + +
+class SimpleStreamChecker : public Checker<check::PreCall, + check::PostCall, + check::DeadSymbols, + check::PointerEscape> { public: - void checkPreStmt(const CallExpr *CE, CheckerContext &Ctx) const {} -} -} -void ento::registerNewChecker(CheckerManager &mgr) { - mgr.registerChecker<NewChecker>(); -} + + void checkPreCall(const CallEvent &Call, CheckerContext &C) const; + + void checkPostCall(const CallEvent &Call, CheckerContext &C) const; + + void checkDeadSymbols(SymbolReaper &SR, CheckerContext &C) const; + + ProgramStateRef checkPointerEscape(ProgramStateRef State, + const InvalidatedSymbols &Escaped, + const CallEvent *Call, + PointerEscapeKind Kind) const; +}; ++ +
Checkers often need to keep track of information specific to the checks they +perform. However, since checkers have no guarantee about the order in which the +program will be explored, or even that all possible paths will be explored, this +state information cannot be kept within individual checkers. Therefore, if +checkers need to store custom information, they need to add new categories of +data to the ProgramState. The preferred way to do so is to use one of +several macros designed for this purpose. They are: + +
All of these macros take as parameters the name to be used for the custom +category of state information and the data type(s) to be used for storage. The +data type(s) specified will become the parameter type and/or return type of the +methods that manipulate the new category of state information. Each of these +methods is templated with the name of the custom data type. + +
For example, a common case is the need to track data associated with a +symbolic expression; a map type is the most logical way to implement this. The +key for this map will be a pointer to a symbolic expression +(SymbolRef). If the data type to be associated with the symbolic +expression is an integer, then the custom category of state information would be +declared as + +
+REGISTER_MAP_WITH_PROGRAMSTATE(ExampleDataType, SymbolRef, int)-
-let ParentPackage = SecurityExperimental in { +ProgramStateRef state; +SymbolRef Sym; ... -def NewChecker : Checker<"NewChecker">, - HelpText<"This text should give a short description of the checks performed.">, - DescFile<"NewChecker.cpp">; +int currentValue = state->get<ExampleDataType>(Sym); ++ +and set with the function + +
+ProgramStateRef state; +SymbolRef Sym; +int newValue; ... -} // end "security.experimental" +ProgramStateRef newState = state->set<ExampleDataType>(Sym, newValue);-
In addition, the macros define a data type used for storing the data of the +new data category; the name of this type is the name of the data category with +"Ty" appended. For REGISTER_TRAIT_WITH_PROGRAMSTATE, this will simply +be the passed data type; for the other three macros, this will be a specialized +version of the llvm::ImmutableList, +llvm::ImmutableSet, +or llvm::ImmutableMap +templated class. For the ExampleDataType example above, the type +created would be equivalent to writing the declaration: -
+typedef llvm::ImmutableMap<SymbolRef, int> ExampleDataTypeTy; +-
These macros will cover a majority of use cases; however, they still have a +few limitations. They cannot be used inside namespaces (since they expand to +contain top-level namespace references), and the data types that they define +cannot be referenced from more than one file. +
Note that ProgramStates are immutable; instead of modifying an existing +one, functions that modify the state will return a copy of the previous state +with the change applied. This updated state must then be provided to the +analyzer core by calling the CheckerContext::addTransition function.
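The copy-on-update behaviour described above can be illustrated with a small, self-contained model. This is only a sketch: ToyState, ToyStateRef and ToyContext are hypothetical stand-ins invented for illustration, they use a plain std::map rather than the analyzer's immutable containers, and they are not part of Clang's API. The sketch shows only that set returns a new state and that the "core" sees the change only once the new state is handed over.

```cpp
#include <map>
#include <memory>
#include <utility>

// Toy stand-in for an immutable analyzer state; NOT Clang's ProgramState.
class ToyState;
using ToyStateRef = std::shared_ptr<const ToyState>;

class ToyState {
public:
  // Look up the value tracked for a symbol; returns nullptr if absent.
  const int *get(int Sym) const {
    auto It = Data.find(Sym);
    return It == Data.end() ? nullptr : &It->second;
  }

  // Return a new state with Sym bound to Value; *this is left untouched.
  ToyStateRef set(int Sym, int Value) const {
    auto Next = std::make_shared<ToyState>(*this);
    Next->Data[Sym] = Value;
    return Next;
  }

private:
  std::map<int, int> Data;
};

// Hypothetical stand-in for the analyzer core: it only sees states that are
// explicitly handed to it, mirroring the role the text gives addTransition.
struct ToyContext {
  ToyStateRef Current = std::make_shared<ToyState>();
  void addTransition(ToyStateRef NewState) { Current = std::move(NewState); }
};
```

In this model, calling set on a state and discarding the result has no visible effect; the update becomes part of the analysis only after the returned state is passed to addTransition.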
When a checker detects a mistake in the analyzed code, it needs a way to +report it to the analyzer core so that it can be displayed. The two classes used +to construct this report are BugType +and +BugReport. + +
+BugType, as the name would suggest, represents a type of bug. The +constructor for BugType takes two parameters: the name of the bug +type, and the name of the category of the bug. These are used, for example, in the +summary page generated by the scan-build tool. + +
+ The BugReport class represents a specific occurrence of a bug. In + the most common case, three parameters are used to form a BugReport: +
In order to obtain the correct ExplodedNode, a decision must be made +as to whether or not analysis can continue along the current path. This decision +is based on whether the detected bug is one that would prevent the program under +analysis from continuing. For example, leaking of a resource should not stop +analysis, as the program can continue to run after the leak. Dereferencing a +null pointer, on the other hand, should stop analysis, as there is no way for +the program to meaningfully continue after such an error. + +
If analysis can continue, then the most recent ExplodedNode can be +passed to the BugReport constructor without additional modification. +This ExplodedNode will be the one returned by the most recent call to CheckerContext::addTransition. +If no transition has been performed during the current callback, then the +ExplodedNode can be obtained from the CheckerContext::getPredecessor +function. + +
If analysis cannot continue, then the current state should be transitioned +into a so-called sink node, a node from which no further analysis will be +performed. This is done by calling the +CheckerContext::generateSink function; this function is the same as the +addTransition function, but marks the state as a sink node. Like +addTransition, this returns an ExplodedNode with the updated +state, which can then be passed to the BugReport constructor. + +
+After a BugReport is created, it should be passed to the +EmitReport function of the CheckerContext. +