From: Anna Zaks Date: Wed, 7 Dec 2011 19:04:24 +0000 (+0000) Subject: [analyzer] Update the checker writer manual with explanation of SVals X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=22d4fb9d687ad38e7a0ee74f58abd65db84fa436;p=clang [analyzer] Update the checker writer manual with explanation of SVals and the link to checker callback documentation. SVal, SymExpr, MemRegion description is a slightly edited version of Ted's reply to a question on cfe-dev list. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@146048 91177308-0d34-0410-b5e6-96231b3b80d8 --- diff --git a/www/analyzer/checker_dev_manual.html b/www/analyzer/checker_dev_manual.html index c079f33d6d..6fbc1683ae 100644 --- a/www/analyzer/checker_dev_manual.html +++ b/www/analyzer/checker_dev_manual.html @@ -103,7 +103,7 @@ for general developer guidelines and information.

  • GenericDataMap - constraints on symbolic values -

    +

    Interaction with Checkers

    Checkers are not merely passive receivers of the analyzer core changes - they actively participate in the ProgramState construction through the GenericDataMap which can be used to store the checker-defined part @@ -113,6 +113,67 @@ for general developer guidelines and information.

    the checker itself should be stateless.) The checkers are called one after another in the predefined order; thus, calling all the checkers adds a chain to the ExplodedGraph. + +

    Representing Values

    + During symbolic execution, SVal + objects are used to represent the semantic evaluation of expressions. They can + represent things like concrete integers, symbolic values, or memory locations + (which are memory regions). They are a discriminated union of "values", + symbolic and otherwise. +

    + SymExpr (symbol) + is meant to represent abstract, but named, symbolic value. + Symbolic values can have constraints associated with them. Symbols represent + an actual (immutable) value. We might not know what its specific value is, but + we can associate constraints with that value as we analyze a path. +

    + + MemRegion is similar to a symbol. + It is used to provide a lexicon of how to describe abstract memory. Regions can + layer on top of other regions, providing a layered approach to representing memory. + For example, a struct object on the stack might be represented by a VarRegion, + but a FieldRegion which is a subregion of the VarRegion could + be used to represent the memory associated with a specific field of that object. + So how do we represent symbolic memory regions? That's what SymbolicRegion + is for. It is a MemRegion that has an associated symbol. Since the + symbol is unique and has a unique name; that symbol names the region. +

    + Let's see how the analyzer processes the expressions in the following example: +

    +

    +  int foo(int x) {
    +     int y = x * 2;
    +     int z = x;
    +     ...
    +  }
    +  
    +

    +Let's look at how x*2 gets evaluated. When x is evaluated, +we first construct an SVal that represents the lvalue of x, in +this case it is an SVal that references the MemRegion for x. +Afterwards, when we do the lvalue-to-rvalue conversion, we get a new SVal, +which references the value currently bound to x. That value is +symbolic; it's whatever x was bound to at the start of the function. +Let's call that symbol $0. Similarly, we evaluate the expression for 2, +and get an SVal that references the concrete number 2. When +we evaluate x*2, we take the two SVals of the subexpressions, +and create a new SVal that represents their multiplication (which in +this case is a new symbolic expression, which we might call $1). When we +evaluate the assignment to y, we again compute its lvalue (a MemRegion), +and then bind the SVal for the RHS (which references the symbolic value $1) +to the MemRegion in the symbolic store. +
    +The second line is similar. When we evaluate x again, we do the same +dance, and create an SVal that references the symbol $0. Note, two SVals +might reference the same underlying values. + +

    +To summarize, MemRegions are unique names for blocks of memory. Symbols are +unique names for abstract symbolic values. Some MemRegions represents abstract +symbolic chunks of memory, and thus are also based on symbols. SVals are just +references to values, and can reference either MemRegions, Symbols, or concrete +values (e.g., the number 1). +