[cfe-commits] r146048 - /cfe/trunk/www/analyzer/checker_dev_manual.html
Anna Zaks
ganna at apple.com
Wed Dec 7 11:04:25 PST 2011
Author: zaks
Date: Wed Dec 7 13:04:24 2011
New Revision: 146048
URL: http://llvm.org/viewvc/llvm-project?rev=146048&view=rev
Log:
[analyzer] Update the checker writer manual with explanation of SVals
and the link to checker callback documentation.
SVal, SymExpr, MemRegion description is a slightly edited version of
Ted's reply to a question on cfe-dev list.
Modified:
cfe/trunk/www/analyzer/checker_dev_manual.html
Modified: cfe/trunk/www/analyzer/checker_dev_manual.html
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/www/analyzer/checker_dev_manual.html?rev=146048&r1=146047&r2=146048&view=diff
==============================================================================
--- cfe/trunk/www/analyzer/checker_dev_manual.html (original)
+++ cfe/trunk/www/analyzer/checker_dev_manual.html Wed Dec 7 13:04:24 2011
@@ -103,7 +103,7 @@
<li><tt>GenericDataMap</tt> - constraints on symbolic values
</ul>
- <p>
+ <h3>Interaction with Checkers</h3>
Checkers are not merely passive receivers of the analyzer core changes - they
actively participate in the <tt>ProgramState</tt> construction through the
<tt>GenericDataMap</tt> which can be used to store the checker-defined part
@@ -113,6 +113,67 @@
the checker itself should be stateless.) The checkers are called one after another
in the predefined order; thus, calling all the checkers adds a chain to the
<tt>ExplodedGraph</tt>.
+
+ <h3>Representing Values</h3>
+ During symbolic execution, <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SVal.html">SVal</a>
+ objects are used to represent the semantic evaluation of expressions. They can
+ represent things like concrete integers, symbolic values, or memory locations
+ (which are memory regions). They are a discriminated union of "values",
+ symbolic and otherwise.
+ <p>
+ <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SymExpr.html">SymExpr</a> (symbol)
+ is meant to represent abstract, but named, symbolic value.
+ Symbolic values can have constraints associated with them. Symbols represent
+ an actual (immutable) value. We might not know what its specific value is, but
+ we can associate constraints with that value as we analyze a path.
+ <p>
+
+ <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1MemRegion.html">MemRegion</a> is similar to a symbol.
+ It is used to provide a lexicon of how to describe abstract memory. Regions can
+ layer on top of other regions, providing a layered approach to representing memory.
+ For example, a struct object on the stack might be represented by a <tt>VarRegion</tt>,
+ but a <tt>FieldRegion</tt> which is a subregion of the <tt>VarRegion</tt> could
+ be used to represent the memory associated with a specific field of that object.
+ So how do we represent symbolic memory regions? That's what <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SymbolicRegion.html">SymbolicRegion</a>
+ is for. It is a <tt>MemRegion</tt> that has an associated symbol. Since the
+ symbol is unique and has a unique name; that symbol names the region.
+ <p>
+ Let's see how the analyzer processes the expressions in the following example:
+ <p>
+ <pre class="code_example">
+ int foo(int x) {
+ int y = x * 2;
+ int z = x;
+ ...
+ }
+ </pre>
+ <p>
+Let's look at how <tt>x*2</tt> gets evaluated. When <tt>x</tt> is evaluated,
+we first construct an <tt>SVal</tt> that represents the lvalue of <tt>x</tt>, in
+this case it is an <tt>SVal</tt> that references the <tt>MemRegion</tt> for <tt>x</tt>.
+Afterwards, when we do the lvalue-to-rvalue conversion, we get a new <tt>SVal</tt>,
+which references the value <b>currently bound</b> to <tt>x</tt>. That value is
+symbolic; it's whatever <tt>x</tt> was bound to at the start of the function.
+Let's call that symbol <tt>$0</tt>. Similarly, we evaluate the expression for <tt>2</tt>,
+and get an <tt>SVal</tt> that references the concrete number <tt>2</tt>. When
+we evaluate <tt>x*2</tt>, we take the two <tt>SVals</tt> of the subexpressions,
+and create a new <tt>SVal</tt> that represents their multiplication (which in
+this case is a new symbolic expression, which we might call <tt>$1</tt>). When we
+evaluate the assignment to <tt>y</tt>, we again compute its lvalue (a <tt>MemRegion</tt>),
+and then bind the <tt>SVal</tt> for the RHS (which references the symbolic value <tt>$1</tt>)
+to the <tt>MemRegion</tt> in the symbolic store.
+<br>
+The second line is similar. When we evaluate <tt>x</tt> again, we do the same
+dance, and create an <tt>SVal</tt> that references the symbol <tt>$0</tt>. Note, two <tt>SVals</tt>
+might reference the same underlying values.
+
+<p>
+To summarize, MemRegions are unique names for blocks of memory. Symbols are
+unique names for abstract symbolic values. Some MemRegions represents abstract
+symbolic chunks of memory, and thus are also based on symbols. SVals are just
+references to values, and can reference either MemRegions, Symbols, or concrete
+values (e.g., the number 1).
+
<!--
TODO: Add a picture.
<br>
@@ -186,7 +247,9 @@
<h2 id=skeleton>Checker Skeleton</h2>
There are two main decisions you need to make:
<ul>
- <li> Which events the checker should be tracking.</li>
+ <li> Which events the checker should be tracking.
+ See <a href="http://clang.llvm.org/doxygen/classento_1_1CheckerDocumentation.html">CheckerDocumentation</a>
+ for the list of available checker callbacks.</li>
<li> What data you want to store as part of the checker-specific program
state. Try to minimize the checker state as much as possible. </li>
</ul>
More information about the cfe-commits
mailing list