diff options
author | Anna Zaks <ganna@apple.com> | 2011-12-07 19:04:24 +0000 |
---|---|---|
committer | Anna Zaks <ganna@apple.com> | 2011-12-07 19:04:24 +0000 |
commit | 22d4fb9d687ad38e7a0ee74f58abd65db84fa436 (patch) | |
tree | 0fb9c129c9d642bfe084c0de7b6313f0f419b6c4 /www | |
parent | 84fa05ef805ecfb8b264d4c1984d1d48eec821b5 (diff) |
[analyzer] Update the checker writer manual with explanation of SVals
and the link to checker callback documentation.
SVal, SymExpr, MemRegion description is a slightly edited version of
Ted's reply to a question on cfe-dev list.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@146048 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'www')
-rw-r--r-- | www/analyzer/checker_dev_manual.html | 67 |
1 files changed, 65 insertions, 2 deletions
diff --git a/www/analyzer/checker_dev_manual.html b/www/analyzer/checker_dev_manual.html index c079f33d6d..6fbc1683ae 100644 --- a/www/analyzer/checker_dev_manual.html +++ b/www/analyzer/checker_dev_manual.html @@ -103,7 +103,7 @@ for general developer guidelines and information. </p> <li><tt>GenericDataMap</tt> - constraints on symbolic values </ul> - <p> + <h3>Interaction with Checkers</h3> Checkers are not merely passive receivers of the analyzer core changes - they actively participate in the <tt>ProgramState</tt> construction through the <tt>GenericDataMap</tt> which can be used to store the checker-defined part @@ -113,6 +113,67 @@ for general developer guidelines and information. </p> the checker itself should be stateless.) The checkers are called one after another in the predefined order; thus, calling all the checkers adds a chain to the <tt>ExplodedGraph</tt>. + + <h3>Representing Values</h3> + During symbolic execution, <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SVal.html">SVal</a> + objects are used to represent the semantic evaluation of expressions. They can + represent things like concrete integers, symbolic values, or memory locations + (which are memory regions). They are a discriminated union of "values", + symbolic and otherwise. + <p> + <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SymExpr.html">SymExpr</a> (symbol) + is meant to represent abstract, but named, symbolic value. + Symbolic values can have constraints associated with them. Symbols represent + an actual (immutable) value. We might not know what its specific value is, but + we can associate constraints with that value as we analyze a path. + <p> + + <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1MemRegion.html">MemRegion</a> is similar to a symbol. + It is used to provide a lexicon of how to describe abstract memory. Regions can + layer on top of other regions, providing a layered approach to representing memory. + For example, a struct object on the stack might be represented by a <tt>VarRegion</tt>, + but a <tt>FieldRegion</tt> which is a subregion of the <tt>VarRegion</tt> could + be used to represent the memory associated with a specific field of that object. + So how do we represent symbolic memory regions? That's what <a href="http://clang.llvm.org/doxygen/classclang_1_1ento_1_1SymbolicRegion.html">SymbolicRegion</a> + is for. It is a <tt>MemRegion</tt> that has an associated symbol. Since the + symbol is unique and has a unique name; that symbol names the region. + <p> + Let's see how the analyzer processes the expressions in the following example: + <p> + <pre class="code_example"> + int foo(int x) { + int y = x * 2; + int z = x; + ... + } + </pre> + <p> +Let's look at how <tt>x*2</tt> gets evaluated. When <tt>x</tt> is evaluated, +we first construct an <tt>SVal</tt> that represents the lvalue of <tt>x</tt>, in +this case it is an <tt>SVal</tt> that references the <tt>MemRegion</tt> for <tt>x</tt>. +Afterwards, when we do the lvalue-to-rvalue conversion, we get a new <tt>SVal</tt>, +which references the value <b>currently bound</b> to <tt>x</tt>. That value is +symbolic; it's whatever <tt>x</tt> was bound to at the start of the function. +Let's call that symbol <tt>$0</tt>. Similarly, we evaluate the expression for <tt>2</tt>, +and get an <tt>SVal</tt> that references the concrete number <tt>2</tt>. When +we evaluate <tt>x*2</tt>, we take the two <tt>SVals</tt> of the subexpressions, +and create a new <tt>SVal</tt> that represents their multiplication (which in +this case is a new symbolic expression, which we might call <tt>$1</tt>). When we +evaluate the assignment to <tt>y</tt>, we again compute its lvalue (a <tt>MemRegion</tt>), +and then bind the <tt>SVal</tt> for the RHS (which references the symbolic value <tt>$1</tt>) +to the <tt>MemRegion</tt> in the symbolic store. +<br> +The second line is similar. When we evaluate <tt>x</tt> again, we do the same +dance, and create an <tt>SVal</tt> that references the symbol <tt>$0</tt>. Note, two <tt>SVals</tt> +might reference the same underlying values. + +<p> +To summarize, MemRegions are unique names for blocks of memory. Symbols are +unique names for abstract symbolic values. Some MemRegions represents abstract +symbolic chunks of memory, and thus are also based on symbols. SVals are just +references to values, and can reference either MemRegions, Symbols, or concrete +values (e.g., the number 1). + <!-- TODO: Add a picture. <br> @@ -186,7 +247,9 @@ def NewChecker : Checker<"NewChecker">, <h2 id=skeleton>Checker Skeleton</h2> There are two main decisions you need to make: <ul> - <li> Which events the checker should be tracking.</li> + <li> Which events the checker should be tracking. + See <a href="http://clang.llvm.org/doxygen/classento_1_1CheckerDocumentation.html">CheckerDocumentation</a> + for the list of available checker callbacks.</li> <li> What data you want to store as part of the checker-specific program state. Try to minimize the checker state as much as possible. </li> </ul> |