From: Chris Lattner
There are several places where constants and constant folding matter a lot to +the Clang front-end. First, in general, we prefer the AST to retain the source +code as close to how the user wrote it as possible. This means that if they +wrote "5+4", we want to keep the addition and two constants in the AST, we don't +want to fold to "9". This means that constant folding in various ways turns +into a tree walk that needs to handle the various cases.
+ +However, there are places in both C and C++ that require constants to be +folded. For example, the C standard defines what an "integer constant +expression" (i-c-e) is with very precise and specific requirements. The +language then requires i-c-e's in a lot of places (for example, the size of a +bitfield, the value for a case statement, etc). For these, we have to be able +to constant fold the constants, to do semantic checks (e.g. verify bitfield size +is non-negative and that case statements aren't duplicated). We aim for Clang +to be very pedantic about this, diagnosing cases when the code does not use an +i-c-e where one is required, but accepting the code unless running with +-pedantic-errors.
+ +Things get a little bit more tricky when it comes to compatibility with +real-world source code. Specifically, GCC has historically accepted a huge +superset of expressions as i-c-e's, and a lot of real world code depends on this +unfortuate accident of history (including, e.g., the glibc system headers). GCC +accepts anything its "fold" optimizer is capable of reducing to an integer +constant, which means that the definition of what it accepts changes as its +optimizer does. One example is that GCC accepts things like "case X-X:" even +when X is a variable, because it can fold this to 0.
+ +Another issue are how constants interact with the extensions we support, such +as __builtin_constant_p, __builtin_inf, __extension__ and many others. C99 +obviously does not specify the semantics of any of these extensions, and the +definition of i-c-e does not include them. However, these extensions are often +used in real code, and we have to have a way to reason about them.
+ +Finally, this is not just a problem for semantic analysis. The code +generator and other clients have to be able to fold constants (e.g. to +initialize global variables) and has to handle a superset of what C99 allows. +Further, these clients can benefit from extended information. For example, we +know that "foo()||1" always evaluates to true, but we can't replace the +expression with true because it has side effects.
+ + +After trying several different approaches, we've finally converged on a +design (Note, at the time of this writing, not all of this has been implemented, +consider this a design goal!). Our basic approach is to define a single +recursive method evaluation method (Expr::Evaluate), which is +implemented in AST/ExprConstant.cpp. Given an expression with 'scalar' +type (integer, fp, complex, or pointer) this method returns the following +information:
+ +This information gives various clients the flexibility that they want, and we +will eventually have some helper methods for various extensions. For example, +Sema should have a Sema::VerifyIntegerConstantExpression method, which +calls Evaluate on the expression. If the expression is not foldable, the error +is emitted, and it would return true. If the expression is not an i-c-e, the +EXTENSION diagnostic is emitted. Finally it would return false to indicate that +the AST is ok.
+ +Other clients can use the information in other ways, for example, codegen can +just use expressions that are foldable in any way.
+ + +This section describes how some of the various extensions clang supports +interacts with constant evaluation:
+ +