From: Ted Kremenek
The CFG class is designed to represent a source-level +control-flow graph for a single statement (Stmt*). Typically, +instances of CFG are constructed for function bodies (usually +an instance of CompoundStmt), but a CFG can also be built to +represent the control-flow of any object whose class derives from Stmt, +which includes simple expressions. Control-flow graphs are especially +useful for performing flow-sensitive +or path-sensitive program analyses on a given function.
+ +Concretely, an instance of CFG is a collection of basic +blocks. Each basic block is an instance of CFGBlock, which +simply contains an ordered sequence of Stmt* (each referring +to a statement in the AST). The ordering of statements within a block +indicates unconditional flow of control from one statement to the +next. Conditional control-flow +is represented using edges between basic blocks. The statements +within a given CFGBlock can be traversed using +the CFGBlock::*iterator interface.
+ +A CFG object owns the instances of CFGBlock within +the control-flow graph it represents. Each CFGBlock within a +CFG is also uniquely numbered (accessible +via CFGBlock::getBlockID()). Currently the number is +based on the order in which the blocks were created, but no assumptions +should be made about how CFGBlocks are numbered other than that their +numbers are unique and that they range from 0 to N-1 (where N is +the number of basic blocks in the CFG).
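The ownership and numbering rules described above can be sketched with a small stand-alone model. This is only an illustration: ToyCFG, ToyBlock, createBlock, idsCoverZeroToNMinusOne and joinStmts are hypothetical names invented for this sketch, statements are plain strings rather than Stmt*, and none of it is the actual CFG or CFGBlock interface.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for CFGBlock (hypothetical): an ID plus an ordered statement list.
class ToyBlock {
public:
  typedef std::vector<std::string>::const_iterator const_iterator;

  explicit ToyBlock(unsigned ID) : BlockID(ID) {}
  unsigned getBlockID() const { return BlockID; }
  void appendStmt(const std::string &S) { Stmts.push_back(S); }
  const_iterator begin() const { return Stmts.begin(); }
  const_iterator end() const { return Stmts.end(); }

private:
  unsigned BlockID;
  std::vector<std::string> Stmts; // order = unconditional flow within the block
};

// Toy stand-in for CFG (hypothetical): owns its blocks and numbers them.
class ToyCFG {
public:
  ToyBlock &createBlock() {
    // In this sketch the ID is the creation index; callers should rely only
    // on uniqueness and on the 0..N-1 range.
    unsigned ID = static_cast<unsigned>(Blocks.size());
    Blocks.push_back(std::unique_ptr<ToyBlock>(new ToyBlock(ID)));
    return *Blocks.back();
  }
  std::size_t size() const { return Blocks.size(); }
  const ToyBlock &block(std::size_t I) const {
    assert(I < Blocks.size() && "block index out of range");
    return *Blocks[I];
  }

private:
  std::vector<std::unique_ptr<ToyBlock> > Blocks; // destroyed with the ToyCFG
};

// Checks the numbering guarantee: IDs are unique and cover 0..N-1.
bool idsCoverZeroToNMinusOne(const ToyCFG &G) {
  std::vector<bool> Seen(G.size(), false);
  for (std::size_t I = 0; I < G.size(); ++I) {
    unsigned ID = G.block(I).getBlockID();
    if (ID >= G.size() || Seen[ID])
      return false;
    Seen[ID] = true;
  }
  return true;
}

// Walks a block through its iterator interface, joining statements in order.
std::string joinStmts(const ToyBlock &B) {
  std::string Out;
  for (ToyBlock::const_iterator I = B.begin(), E = B.end(); I != E; ++I) {
    if (!Out.empty())
      Out += "; ";
    Out += *I;
  }
  return Out;
}
```

Because the blocks are held by the graph object, destroying the ToyCFG destroys every block with it, which mirrors the ownership relationship described above.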
+ +Conditional control-flow (such as that induced by if-statements +and loops) is represented as edges between CFGBlocks. +Because different C language constructs can induce control-flow, +each CFGBlock also records an extra Stmt* that +represents the terminator of the block. A terminator is simply +the statement that caused the control-flow, and is used to identify +the nature of the conditional control-flow between blocks. For +example, in the case of an if-statement, the terminator refers to +the IfStmt object in the AST that represents the given +branch.
+ +To illustrate, consider the following code example:
+ +
+int foo(int x) {
+ x = x + 1;
+
+ if (x > 2) x++;
+ else {
+ x += 2;
+ x *= 2;
+ }
+
+ return x;
+}
+
+
+After invoking the parser+semantic analyzer on this code fragment, +the AST of the body of foo is referenced by a +single Stmt*. We can then construct an instance +of CFG representing the control-flow graph of this function +body by a single call to a static class method:
+ +
+ Stmt* FooBody = ...
+ CFG* FooCFG = CFG::buildCFG(FooBody);
+
+
+It is the responsibility of the caller of CFG::buildCFG +to delete the returned CFG* when the CFG is no +longer needed.
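One way a caller might discharge this responsibility is to hand the returned pointer to a smart pointer immediately. The sketch below is an assumption-laden illustration, not part of the patch: the Stmt and CFG types are minimal stand-ins so that the example compiles on its own, LiveCount is a hypothetical counter added purely to make the deletion observable, and buildOwnedCFG and liveAfterScope are hypothetical helpers. Only the shape of CFG::buildCFG (a static method taking a Stmt* and returning a CFG*) is taken from the text above.

```cpp
#include <cassert>
#include <memory>

// Stand-in declarations (hypothetical) so the sketch is self-contained.
// In real code these types come from the Clang headers.
struct Stmt {};

struct CFG {
  static int LiveCount; // hypothetical: counts live CFG objects for the demo
  CFG() { ++LiveCount; }
  ~CFG() { --LiveCount; }
  // Same shape as in the text: the caller owns the returned pointer.
  static CFG *buildCFG(Stmt *) { return new CFG(); }
};
int CFG::LiveCount = 0;

// Hypothetical helper: takes ownership right away so the CFG is deleted
// when the smart pointer goes out of scope.
std::unique_ptr<CFG> buildOwnedCFG(Stmt *Body) {
  return std::unique_ptr<CFG>(CFG::buildCFG(Body));
}

// Builds a CFG in an inner scope and reports how many are still alive after.
int liveAfterScope() {
  Stmt FooBody;
  {
    std::unique_ptr<CFG> FooCFG = buildOwnedCFG(&FooBody);
    assert(CFG::LiveCount == 1 && "exactly one CFG should exist here");
  } // FooCFG is deleted here
  return CFG::LiveCount;
}
```

A plain delete at the end of the analysis works equally well; the wrapper just removes the chance of forgetting it on an early return.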
+ +Along with providing an interface to iterate over +its CFGBlocks, the CFG class also provides methods +that are useful for debugging and visualizing CFGs. For example, the +method +CFG::dump() dumps a pretty-printed version of the CFG to +standard error. This is especially useful when one is using a +debugger such as gdb. For example, here is the output +of FooCFG->dump():
+ +
+ [ B5 (ENTRY) ]
+ Predecessors (0):
+ Successors (1): B4
+
+ [ B4 ]
+ 1: x = x + 1
+ 2: (x > 2)
+ T: if [B4.2]
+ Predecessors (1): B5
+ Successors (2): B3 B2
+
+ [ B3 ]
+ 1: x++
+ Predecessors (1): B4
+ Successors (1): B1
+
+ [ B2 ]
+ 1: x += 2
+ 2: x *= 2
+ Predecessors (1): B4
+ Successors (1): B1
+
+ [ B1 ]
+ 1: return x;
+ Predecessors (2): B2 B3
+ Successors (1): B0
+
+ [ B0 (EXIT) ]
+ Predecessors (1): B1
+ Successors (0):
+
+
+For each block, the pretty-printed output displays +the number and names of its predecessor blocks (blocks that have outgoing +control-flow to the given block) and successor blocks (blocks +that have incoming control-flow from the given +block). We can also clearly see the special entry and exit blocks at +the beginning and end of the pretty-printed output. For the entry +block (block B5), the number of predecessor blocks is 0, while for the +exit block (block B0) the number of successor blocks is 0.
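The relationship between the two lists can be checked by hand: the predecessor lists are the successor edges read in reverse. The following sketch copies the successor lists from the dump of foo into plain vectors indexed by block number and derives the predecessors from them. EdgeLists, fooSuccessors and computePredecessors are hypothetical names used for illustration only, and nothing here uses the CFG class itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Edge lists indexed by block number (hypothetical representation).
typedef std::vector<std::vector<unsigned> > EdgeLists;

// Successor lists copied from the dump of foo: index 5 is ENTRY, 0 is EXIT.
EdgeLists fooSuccessors() {
  EdgeLists Succs(6);
  Succs[5].push_back(4); // B5 (ENTRY) -> B4
  Succs[4].push_back(3); // B4 -> B3 (then-branch)
  Succs[4].push_back(2); // B4 -> B2 (else-branch)
  Succs[3].push_back(1); // B3 -> B1
  Succs[2].push_back(1); // B2 -> B1
  Succs[1].push_back(0); // B1 -> B0 (EXIT)
  return Succs;          // B0 has no successors
}

// Reverses every edge: if From lists To as a successor,
// then To lists From as a predecessor.
EdgeLists computePredecessors(const EdgeLists &Succs) {
  EdgeLists Preds(Succs.size());
  for (std::size_t From = 0; From < Succs.size(); ++From) {
    for (std::size_t I = 0; I < Succs[From].size(); ++I) {
      unsigned To = Succs[From][I];
      assert(To < Succs.size() && "successor refers to an unknown block");
      Preds[To].push_back(static_cast<unsigned>(From));
    }
  }
  return Preds;
}
```

Applied to the lists from the dump, the entry block B5 ends up with no predecessors and the join block B1 ends up with B2 and B3, matching the printed output.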
+ +The most interesting block here is B4, whose outgoing control-flow +represents the branching caused by the sole if-statement +in foo. Of particular interest is the second statement in +the block, (x > 2), and the terminator, printed +as if [B4.2]. The second statement represents the +evaluation of the condition of the if-statement, which occurs before +the actual branching of control-flow. Within the CFGBlock +for B4, the Stmt* for the second statement refers to the +actual expression in the AST for (x > 2). Thus +pointers to subclasses of Expr can appear in the list of +statements in a block, and not just subclasses of Stmt that +refer to proper C statements.
+ +The terminator of block B4 is a pointer to the IfStmt +object in the AST. The pretty-printer outputs if +[B4.2] because the condition expression of the if-statement +has an actual place in the basic block, and thus the terminator is +essentially +referring to the expression that is the second statement of +block B4 (i.e., B4.2). In this manner, conditions for control-flow +(which also include conditions for loops and switch statements) are +hoisted into the actual basic block.
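To make the B4.2 notation concrete, here is a small stand-alone sketch of a block whose terminator points back at a hoisted condition. It is a toy model under stated assumptions: ToyBlock, ToyTerminator, makeB4, terminatorLabel and hoistedCondition are hypothetical names, statements are strings rather than Stmt*, and the label format is simply copied from the dump above rather than taken from the pretty-printer's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a terminator: which construct branched, and which
// statement of the block holds its hoisted condition.
struct ToyTerminator {
  std::string Keyword;   // e.g. "if"
  std::size_t CondIndex; // 1-based position of the condition in the block
};

// Hypothetical model of a basic block with a terminator.
struct ToyBlock {
  unsigned ID;
  std::vector<std::string> Stmts; // includes the hoisted condition
  ToyTerminator Term;
};

// Recreates block B4 from the dump of foo.
ToyBlock makeB4() {
  ToyBlock B;
  B.ID = 4;
  B.Stmts.push_back("x = x + 1");
  B.Stmts.push_back("(x > 2)"); // condition evaluated before branching
  B.Term.Keyword = "if";
  B.Term.CondIndex = 2;
  return B;
}

// Formats the terminator in the style of the dump, e.g. "if [B4.2]".
std::string terminatorLabel(const ToyBlock &B) {
  return B.Term.Keyword + " [B" + std::to_string(B.ID) + "." +
         std::to_string(B.Term.CondIndex) + "]";
}

// Returns the statement that the terminator's condition refers to.
std::string hoistedCondition(const ToyBlock &B) {
  assert(B.Term.CondIndex >= 1 && B.Term.CondIndex <= B.Stmts.size() &&
         "terminator must refer to a statement inside the block");
  return B.Stmts[B.Term.CondIndex - 1];
}
```

The point of the model is that the condition is an ordinary entry in the statement list, so an analysis that walks the block sees the evaluation of (x > 2) before it inspects the terminator to decide which successor edge applies.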
+ + +