From: Sean Silva
Date: Wed, 12 Dec 2012 23:44:55 +0000 (+0000)
Subject: docs: Convert some docs to reST.
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=3872b46ba9a5275ef0bf4fcefe2d7ef11ce75cc5;p=clang

docs: Convert some docs to reST.

Converts:
LanguageExtensions
LibASTMatchers
LibTooling
PCHInternals
ThreadSanitizer
Tooling

Patch by Mykhailo Pustovit!
(with minor edits by Dmitri Gribenko and Sean Silva)

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@170048 91177308-0d34-0410-b5e6-96231b3b80d8
---

diff --git a/docs/LanguageExtensions.html b/docs/LanguageExtensions.html
deleted file mode 100644
index 8c0e5b7ffc..0000000000
--- a/docs/LanguageExtensions.html
+++ /dev/null
@@ -1,2082 +0,0 @@
- - - - - - Clang Language Extensions - - - - - - - -
- -

Clang Language Extensions

- - - - -

Introduction

- - -

This document describes the language extensions provided by Clang. In addition to the language extensions listed here, Clang aims to support a broad range of GCC extensions. Please see the GCC manual for more information on these extensions.

- - -

Feature Checking Macros

- - -

Language extensions can be very useful, but only if you know you can depend on them. To allow fine-grained feature checks, we support three builtin function-like macros. These let you test for a feature directly in your code without having to resort to something like autoconf or fragile "compiler version checks".

- - -

__has_builtin

- - -

This function-like macro takes a single identifier argument that is the name -of a builtin function. It evaluates to 1 if the builtin is supported or 0 if -not. It can be used like this:

- -
-
-#ifndef __has_builtin         // Optional of course.
-  #define __has_builtin(x) 0  // Compatibility with non-clang compilers.
-#endif
-
-...
-#if __has_builtin(__builtin_trap)
-  __builtin_trap();
-#else
-  abort();
-#endif
-...
-
-
- - - -

__has_feature and __has_extension

- - -

These function-like macros take a single identifier argument that is the -name of a feature. __has_feature evaluates to 1 if the feature -is both supported by Clang and standardized in the current language standard -or 0 if not (but see below), while -__has_extension evaluates to 1 if the feature is supported by -Clang in the current language (either as a language extension or a standard -language feature) or 0 if not. They can be used like this:

- -
-
-#ifndef __has_feature         // Optional of course.
-  #define __has_feature(x) 0  // Compatibility with non-clang compilers.
-#endif
-#ifndef __has_extension
-  #define __has_extension __has_feature // Compatibility with pre-3.0 compilers.
-#endif
-
-...
-#if __has_feature(cxx_rvalue_references)
-// This code will only be compiled with the -std=c++11 and -std=gnu++11
-// options, because rvalue references are only standardized in C++11.
-#endif
-
-#if __has_extension(cxx_rvalue_references)
-// This code will be compiled with the -std=c++11, -std=gnu++11, -std=c++98
-// and -std=gnu++98 options, because rvalue references are supported as a
-// language extension in C++98.
-#endif
-
-
- -

For backwards compatibility reasons, -__has_feature can also be used to test for support for -non-standardized features, i.e. features not prefixed c_, -cxx_ or objc_.

- -

Another use of __has_feature is to check for compiler features not related to the language standard, such as AddressSanitizer.

If the -pedantic-errors option is given, -__has_extension is equivalent to __has_feature.

- -

The feature tag is described along with the language feature below.

- -

The feature name or extension name can also be specified with a preceding and -following __ (double underscore) to avoid interference from a macro -with the same name. For instance, __cxx_rvalue_references__ can be -used instead of cxx_rvalue_references.
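The double-underscore spelling can be sketched as follows. This is only an illustration: the colliding project macro c_static_assert, the HAVE_C_STATIC_ASSERT macro and the have_c_static_assert function are hypothetical names, and the fallback definitions repeat the compatibility pattern shown earlier so that the check also preprocesses on compilers without these macros.

```c
/* Fallbacks so the check below also preprocesses on compilers that lack
   these macros (same pattern as the compatibility snippet above). */
#ifndef __has_feature
  #define __has_feature(x) 0
#endif
#ifndef __has_extension
  #define __has_extension __has_feature
#endif

/* Hypothetical project macro whose name collides with a feature name. */
#define c_static_assert project_static_assert_shim

/* The double-underscore spelling avoids interference from the macro above. */
#if __has_extension(__c_static_assert__)
  #define HAVE_C_STATIC_ASSERT 1
#else
  #define HAVE_C_STATIC_ASSERT 0
#endif

/* Reports the result of the compile-time check: 1 if available, 0 if not. */
int have_c_static_assert(void) { return HAVE_C_STATIC_ASSERT; }
```

On a compiler without __has_extension the fallback makes the check evaluate to 0, so code guarded by HAVE_C_STATIC_ASSERT would simply take its emulation path.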

- - -

__has_attribute

- - -

This function-like macro takes a single identifier argument that is the name -of an attribute. It evaluates to 1 if the attribute is supported or 0 if not. It -can be used like this:

- -
-
-#ifndef __has_attribute         // Optional of course.
-  #define __has_attribute(x) 0  // Compatibility with non-clang compilers.
-#endif
-
-...
-#if __has_attribute(always_inline)
-#define ALWAYS_INLINE __attribute__((always_inline))
-#else
-#define ALWAYS_INLINE
-#endif
-...
-
-
- -

The attribute name can also be specified with a preceding and -following __ (double underscore) to avoid interference from a macro -with the same name. For instance, __always_inline__ can be used -instead of always_inline.
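As a small sketch of the double-underscore form in use, the following guards an always-inline helper. The square function is a hypothetical example, and the fallback definition of __has_attribute is the same compatibility shim shown above.

```c
/* Compatibility with compilers that do not provide __has_attribute. */
#ifndef __has_attribute
  #define __has_attribute(x) 0
#endif

/* Query the attribute by its __name__ spelling so that a user macro called
   always_inline cannot interfere with the check or the attribute itself. */
#if __has_attribute(__always_inline__)
  #define ALWAYS_INLINE __attribute__((__always_inline__))
#else
  #define ALWAYS_INLINE
#endif

/* Hypothetical helper: inlined at every call site when the attribute is
   supported, an ordinary static inline function otherwise. */
static inline ALWAYS_INLINE int square(int x) { return x * x; }
```

Calling square(7) yields 49 either way; the attribute only changes how the call is compiled, not its result.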

- - -

Include File Checking Macros

- - -

Not all development systems have the same include files. The __has_include and __has_include_next macros allow you to check for the existence of an include file before doing a possibly failing #include directive.

- - -

__has_include

- - -

This function-like macro takes a single file name string argument that -is the name of an include file. It evaluates to 1 if the file can -be found using the include paths, or 0 otherwise:

- -
-
-// Note the two possible file name string formats.
-#if __has_include("myinclude.h") && __has_include(<stdint.h>)
-# include "myinclude.h"
-#endif
-
-// To avoid problems with compilers that do not have this macro, test for it
-// in a separate, enclosing #if.
-#if defined(__has_include)
-# if __has_include("myinclude.h")
-#  include "myinclude.h"
-# endif
-#endif
-
-
- -

To test for this feature, use #if defined(__has_include).
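A minimal sketch of how this check might select between a standard header and a local fallback. The HAVE_STDINT_H macro, the u32 typedef and the rotate_left32 function are hypothetical names, and the fallback typedef assumes that unsigned int is 32 bits wide.

```c
/* Probe for <stdint.h> only when the compiler provides __has_include. The
   probe is nested so that other compilers never parse the __has_include(...)
   expression. */
#if defined(__has_include)
#  if __has_include(<stdint.h>)
#    include <stdint.h>
#    define HAVE_STDINT_H 1
#  endif
#endif

#ifdef HAVE_STDINT_H
typedef uint32_t u32;
#else
typedef unsigned int u32; /* assumption: unsigned int is 32 bits wide */
#endif

/* Rotate a 32-bit value left by n bits; n is reduced modulo 32. */
u32 rotate_left32(u32 value, unsigned n) {
  n &= 31u;
  return n ? (u32)((value << n) | (value >> (32u - n))) : value;
}
```

The rest of the program only refers to u32, so it does not need to know which branch was taken.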

- - -

__has_include_next

- - -

This function-like macro takes a single file name string argument that -is the name of an include file. It is like __has_include except that it -looks for the second instance of the given file found in the include -paths. It evaluates to 1 if the second instance of the file can -be found using the include paths, or 0 otherwise:

- -
-
-// Note the two possible file name string formats.
-#if __has_include_next("myinclude.h") && __has_include_next(<stdint.h>)
-# include_next "myinclude.h"
-#endif
-
-// To avoid problems with compilers that do not have this macro, test for it
-// in a separate, enclosing #if.
-#if defined(__has_include_next)
-# if __has_include_next("myinclude.h")
-#  include_next "myinclude.h"
-# endif
-#endif
-
-
- -

Note that __has_include_next, like the GNU extension -#include_next directive, is intended for use in headers only, -and will issue a warning if used in the top-level compilation -file. A warning will also be issued if an absolute path -is used in the file argument.

- - - -

__has_warning

- - -

This function-like macro takes a string literal that represents a command line option for a warning and returns true if that is a valid warning option.

- -
-
-#if __has_warning("-Wformat")
-...
-#endif
-
-
- - -

Builtin Macros

- - -
-
__BASE_FILE__
-
Defined to a string that contains the name of the main input file passed to Clang.
- -
__COUNTER__
-
Defined to an integer value that starts at zero and is incremented each time the __COUNTER__ macro is expanded.
- -
__INCLUDE_LEVEL__
-
Defined to an integral value that is the include depth of the file currently being translated. For the main file, this value is zero.
- -
__TIMESTAMP__
-
Defined to the date and time of the last modification of the current source file.
- -
__clang__
-
Defined when compiling with Clang.
- -
__clang_major__
-
Defined to the major marketing version number of Clang (e.g., the 2 in 2.0.1). Note that marketing version numbers should not be used to check for language features, as different vendors use different numbering schemes. Instead, use the feature checking macros.
- -
__clang_minor__
-
Defined to the minor version number of Clang (e.g., the 0 in 2.0.1). Note that marketing version numbers should not be used to check for language features, as different vendors use different numbering schemes. Instead, use the feature checking macros.
- -
__clang_patchlevel__
-
Defined to the marketing patch level of Clang (e.g., the 1 in 2.0.1).
- -
__clang_version__
-
Defined to a string that captures the Clang marketing version, including the Subversion tag or revision number, e.g., "1.5 (trunk 102332)".
-
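A short sketch of three of these macros in use. The enumerator and function names are hypothetical; the code relies only on the behavior described above and does not assume any particular starting value for __COUNTER__, since earlier expansions in the same translation unit would shift it.

```c
/* Three successive expansions of __COUNTER__ yield three consecutive
   integers, which makes the macro handy for generating distinct IDs. */
enum {
  ID_FIRST = __COUNTER__,
  ID_SECOND = __COUNTER__,
  ID_THIRD = __COUNTER__
};

/* Name of the main input file that was passed to the compiler. */
const char *main_input_file(void) { return __BASE_FILE__; }

/* Include depth at this point; zero when this code sits in the main file. */
int include_depth_here(void) { return __INCLUDE_LEVEL__; }
```

Because only the differences between the enumerators matter, the sketch keeps working if a header expands __COUNTER__ before this code is reached.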
- - -

Vectors and Extended Vectors

- - -

Supports the GCC, OpenCL, AltiVec and NEON vector extensions.

- -

OpenCL vector types are created using the ext_vector_type attribute. It supports the V.xyzw syntax and other tidbits as seen in OpenCL. An example is:

- -
-
-typedef float float4 __attribute__((ext_vector_type(4)));
-typedef float float2 __attribute__((ext_vector_type(2)));
-
-float4 foo(float2 a, float2 b) {
-  float4 c;
-  c.xz = a;
-  c.yw = b;
-  return c;
-}
-
-
- -

Query for this feature with -__has_extension(attribute_ext_vector_type).

- -

Giving the -faltivec option to clang enables support for AltiVec vector syntax and functions. For example:

- -
-
-vector float foo(vector int a) { 
-  vector int b;
-  b = vec_add(a, a) + a; 
-  return (vector float)b;
-}
-
-
- -

NEON vector types are created using the neon_vector_type and neon_polyvector_type attributes. For example:

- -
-
-typedef __attribute__((neon_vector_type(8))) int8_t int8x8_t;
-typedef __attribute__((neon_polyvector_type(16))) poly8_t poly8x16_t;
-
-int8x8_t foo(int8x8_t a) {
-  int8x8_t v;
-  v = a;
-  return v;
-}
-
-
- - -

Vector Literals

- - -

Vector literals can be used to create vectors from a set of scalars or vectors. Either the parentheses form or the braces form can be used. In the parentheses form, the number of literal values specified must be one (that is, a single scalar value) or must match the size of the vector type being created. If a single scalar literal value is specified, it is replicated to all the components of the vector type. In the braces form, any number of literals can be specified. For example:

- -
-
-typedef int v4si __attribute__((__vector_size__(16)));
-typedef float float4 __attribute__((ext_vector_type(4)));
-typedef float float2 __attribute__((ext_vector_type(2)));
-
-v4si vsi = (v4si){1, 2, 3, 4};
-float4 vf = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
-vector int vi1 = (vector int)(1);    // vi1 will be (1, 1, 1, 1).
-vector int vi2 = (vector int){1};    // vi2 will be (1, 0, 0, 0).
-vector int vi3 = (vector int)(1, 2); // error
-vector int vi4 = (vector int){1, 2}; // vi4 will be (1, 2, 0, 0).
-vector int vi5 = (vector int)(1, 2, 3, 4);
-float4 vf2 = (float4)((float2)(1.0f, 2.0f), (float2)(3.0f, 4.0f));
-
-
- - -

Vector Operations

- - -

The table below shows the support for each operation by vector extension. A dash indicates that an operation is not accepted according to a corresponding specification.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Operator                         OpenCL  AltiVec  GCC  NEON
[]                               yes     yes      yes  -
unary operators +, -             yes     yes      yes  -
++, --                           yes     yes      -    -
+, -, *, /, %                    yes     yes      yes  -
bitwise operators &, |, ^, ~     yes     yes      yes  -
>>, <<                           yes     yes      yes  -
!, &&, ||                        no      -        -    -
==, !=, >, <, >=, <=             yes     yes      -    -
=                                yes     yes      yes  yes
?:                               yes     -        -    -
sizeof                           yes     yes      yes  yes
- -

See also __builtin_shufflevector.
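As an illustration of the GCC column, here is a minimal sketch that uses only operations listed as supported for GCC vectors (the binary arithmetic operators and []). The v4si type is the one from the vector literal example; the function names are hypothetical, and the code assumes a 4-byte int so that a 16-byte vector holds four elements.

```c
/* GCC-style vector of four ints (assumes sizeof(int) == 4). */
typedef int v4si __attribute__((__vector_size__(16)));

/* Element-wise multiply-add: result[i] = a[i] * b[i] + c[i]. */
v4si multiply_add(v4si a, v4si b, v4si c) { return a * b + c; }

/* Sum of the four elements, read back with the subscript operator. */
int horizontal_sum(v4si v) { return v[0] + v[1] + v[2] + v[3]; }
```

With a = {1, 2, 3, 4}, b = {5, 6, 7, 8} and c = {1, 1, 1, 1}, multiply_add returns {6, 13, 22, 33}, and horizontal_sum of that result is 74.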

- - -

Messages on deprecated and unavailable Attributes

- - -

An optional string message can be added to the deprecated -and unavailable attributes. For example:

- -
-
void explode(void) __attribute__((deprecated("extremely unsafe, use 'combust' instead!!!")));
-
- -

If the deprecated or unavailable declaration is used, the message -will be incorporated into the appropriate diagnostic:

- -
-
harmless.c:4:3: warning: 'explode' is deprecated: extremely unsafe, use 'combust' instead!!!
-      [-Wdeprecated-declarations]
-  explode();
-  ^
-
- -

Query for this feature -with __has_extension(attribute_deprecated_with_message) -and __has_extension(attribute_unavailable_with_message).

- - -

Attributes on Enumerators

- - -

Clang allows attributes to be written on individual enumerators. -This allows enumerators to be deprecated, made unavailable, etc. The -attribute must appear after the enumerator name and before any -initializer, like so:

- -
-
enum OperationMode {
-  OM_Invalid,
-  OM_Normal,
-  OM_Terrified __attribute__((deprecated)),
-  OM_AbortOnError __attribute__((deprecated)) = 4
-};
-
- -

Attributes on the enum declaration do not apply to -individual enumerators.

- -

Query for this feature with __has_extension(enumerator_attributes).

- - -

'User-Specified' System Frameworks

- - -

Clang provides a mechanism by which frameworks can be built in such a way -that they will always be treated as being 'system frameworks', even if they are -not present in a system framework directory. This can be useful to system -framework developers who want to be able to test building other applications -with development builds of their framework, including the manner in which the -compiler changes warning behavior for system headers.

- -

Framework developers can opt in to this mechanism by creating a '.system_framework' file at the top level of their framework. That is, the framework should have contents like:

- -
- .../TestFramework.framework
- .../TestFramework.framework/.system_framework
- .../TestFramework.framework/Headers
- .../TestFramework.framework/Headers/TestFramework.h
- ...
-
- -

Clang will treat the presence of this file as an indicator that the framework -should be treated as a system framework, regardless of how it was found in the -framework search path. For consistency, we recommend that such files never be -included in installed versions of the framework.

- - -

Availability attribute

- - -

Clang introduces the availability attribute, which can -be placed on declarations to describe the lifecycle of that -declaration relative to operating system versions. Consider the function declaration for a hypothetical function f:

- -
-void f(void) __attribute__((availability(macosx,introduced=10.4,deprecated=10.6,obsoleted=10.7)));
-
- -

The availability attribute states that f was introduced in Mac OS X 10.4, deprecated in Mac OS X 10.6, and obsoleted in Mac OS X 10.7. This information is used by Clang to determine when it is safe to use f: for example, if Clang is instructed to compile code for Mac OS X 10.5, a call to f() succeeds. If Clang is instructed to compile code for Mac OS X 10.6, the call succeeds but Clang emits a warning specifying that the function is deprecated. Finally, if Clang is instructed to compile code for Mac OS X 10.7, the call fails because f() is no longer available.

- -

The availability attribute is a comma-separated list starting with the platform name and then including clauses specifying important milestones in the declaration's lifetime (in any order) along with additional information. Those clauses can be:

- -
-
introduced=version
-
The first version in which this declaration was introduced.
- -
deprecated=version
-
The first version in which this declaration was deprecated, meaning that users should migrate away from this API.
- -
obsoleted=version
-
The first version in which this declaration was obsoleted, meaning that it was removed completely and can no longer be used.
- -
unavailable
-
This declaration is never available on this platform.
- -
message=string-literal
-
Additional message text that Clang will provide when emitting a warning or error about use of a deprecated or obsoleted declaration. Useful to direct users to replacement APIs.
-
- -

Multiple availability attributes can be placed on a declaration, which may correspond to different platforms. Only the availability attribute with the platform corresponding to the target platform will be used; any others will be ignored. If no availability attribute specifies availability for the current target platform, the availability attributes are ignored. Supported platforms are:

- -
-
ios
-
Apple's iOS operating system. The minimum deployment target is specified by the -mios-version-min=version or -miphoneos-version-min=version command-line arguments.
- -
macosx
-
Apple's Mac OS X operating system. The minimum deployment target is specified by the -mmacosx-version-min=version command-line argument.
-
- -

A declaration can be used even when deploying back to a platform version prior to when the declaration was introduced. When this happens, the declaration is weakly linked, as if the weak_import attribute were added to the declaration. A weakly-linked declaration may or may not be present at run-time, and a program can determine whether the declaration is present by checking whether the address of that declaration is non-NULL.

- - -

Checks for Standard Language Features

- - -

The __has_feature macro can be used to query if certain standard -language features are enabled. The __has_extension macro can be used -to query if language features are available as an extension when compiling for -a standard which does not provide them. The features which can be tested are -listed here.

- -

C++98

- -

The features listed below are part of the C++98 standard. These features are -enabled by default when compiling C++ code.

- -

C++ exceptions

- -

Use __has_feature(cxx_exceptions) to determine if C++ exceptions have been enabled. For -example, compiling code with -fno-exceptions disables C++ exceptions.

- -

C++ RTTI

- -

Use __has_feature(cxx_rtti) to determine if C++ RTTI has been enabled. For example, -compiling code with -fno-rtti disables the use of RTTI.

- -

C++11

- -

The features listed below are part of the C++11 standard. As a result, all -these features are enabled with the -std=c++11 or -std=gnu++11 -option when compiling C++ code.

- -

C++11 SFINAE includes access control

- -

Use __has_feature(cxx_access_control_sfinae) or __has_extension(cxx_access_control_sfinae) to determine whether access-control errors (e.g., calling a private constructor) are considered to be template argument deduction errors (aka SFINAE errors), per C++ DR1170.

- -

C++11 alias templates

- -

Use __has_feature(cxx_alias_templates) or -__has_extension(cxx_alias_templates) to determine if support for -C++11's alias declarations and alias templates is enabled.

- -

C++11 alignment specifiers

- -

Use __has_feature(cxx_alignas) or -__has_extension(cxx_alignas) to determine if support for alignment -specifiers using alignas is enabled.

- -

C++11 attributes

- -

Use __has_feature(cxx_attributes) or -__has_extension(cxx_attributes) to determine if support for attribute -parsing with C++11's square bracket notation is enabled.

- -

C++11 generalized constant expressions

- -

Use __has_feature(cxx_constexpr) to determine if support -for generalized constant expressions (e.g., constexpr) is -enabled.

- -

C++11 decltype()

- -

Use __has_feature(cxx_decltype) or -__has_extension(cxx_decltype) to determine if support for the -decltype() specifier is enabled. C++11's decltype -does not require type-completeness of a function call expression. -Use __has_feature(cxx_decltype_incomplete_return_types) -or __has_extension(cxx_decltype_incomplete_return_types) -to determine if support for this feature is enabled.

- -

C++11 default template arguments in function templates

- -

Use __has_feature(cxx_default_function_template_args) or -__has_extension(cxx_default_function_template_args) to determine -if support for default template arguments in function templates is enabled.

- -

C++11 defaulted functions

- -

Use __has_feature(cxx_defaulted_functions) or -__has_extension(cxx_defaulted_functions) to determine if support for -defaulted function definitions (with = default) is enabled.

- -

C++11 delegating constructors

- -

Use __has_feature(cxx_delegating_constructors) to determine if -support for delegating constructors is enabled.

- -

C++11 deleted functions

- -

Use __has_feature(cxx_deleted_functions) or -__has_extension(cxx_deleted_functions) to determine if support for -deleted function definitions (with = delete) is enabled.

- -

C++11 explicit conversion functions

-

Use __has_feature(cxx_explicit_conversions) to determine if support for explicit conversion functions is enabled.

- -

C++11 generalized initializers

- -

Use __has_feature(cxx_generalized_initializers) to determine if -support for generalized initializers (using braced lists and -std::initializer_list) is enabled.

- -

C++11 implicit move constructors/assignment operators

- -

Use __has_feature(cxx_implicit_moves) to determine if Clang will -implicitly generate move constructors and move assignment operators where needed.

- -

C++11 inheriting constructors

- -

Use __has_feature(cxx_inheriting_constructors) to determine if support for inheriting constructors is enabled. Clang does not currently implement this feature.

- -

C++11 inline namespaces

- -

Use __has_feature(cxx_inline_namespaces) or -__has_extension(cxx_inline_namespaces) to determine if support for -inline namespaces is enabled.

- -

C++11 lambdas

- -

Use __has_feature(cxx_lambdas) or -__has_extension(cxx_lambdas) to determine if support for lambdas -is enabled.

- -

C++11 local and unnamed types as template arguments

- -

Use __has_feature(cxx_local_type_template_args) or -__has_extension(cxx_local_type_template_args) to determine if -support for local and unnamed types as template arguments is enabled.

- -

C++11 noexcept

- -

Use __has_feature(cxx_noexcept) or -__has_extension(cxx_noexcept) to determine if support for noexcept -exception specifications is enabled.

- -

C++11 in-class non-static data member initialization

- -

Use __has_feature(cxx_nonstatic_member_init) to determine whether in-class initialization of non-static data members is enabled.

- -

C++11 nullptr

- -

Use __has_feature(cxx_nullptr) or -__has_extension(cxx_nullptr) to determine if support for -nullptr is enabled.

- -

C++11 override control

- -

Use __has_feature(cxx_override_control) or -__has_extension(cxx_override_control) to determine if support for -the override control keywords is enabled.

- -

C++11 reference-qualified functions

-

Use __has_feature(cxx_reference_qualified_functions) or -__has_extension(cxx_reference_qualified_functions) to determine -if support for reference-qualified functions (e.g., member functions with -& or && applied to *this) -is enabled.

- -

C++11 range-based for loop

- -

Use __has_feature(cxx_range_for) or -__has_extension(cxx_range_for) to determine if support for the -range-based for loop is enabled.

- -

C++11 raw string literals

-

Use __has_feature(cxx_raw_string_literals) to determine if support -for raw string literals (e.g., R"x(foo\bar)x") is enabled.

- -

C++11 rvalue references

- -

Use __has_feature(cxx_rvalue_references) or -__has_extension(cxx_rvalue_references) to determine if support for -rvalue references is enabled.

- -

C++11 static_assert()

- -

Use __has_feature(cxx_static_assert) or -__has_extension(cxx_static_assert) to determine if support for -compile-time assertions using static_assert is enabled.

- -

C++11 type inference

- -

Use __has_feature(cxx_auto_type) or __has_extension(cxx_auto_type) to determine if C++11 type inference is supported using the auto specifier. If this is disabled, auto will instead be a storage class specifier, as in C or C++98.

- -

C++11 strongly typed enumerations

- -

Use __has_feature(cxx_strong_enums) or -__has_extension(cxx_strong_enums) to determine if support for -strongly typed, scoped enumerations is enabled.

- -

C++11 trailing return type

- -

Use __has_feature(cxx_trailing_return) or -__has_extension(cxx_trailing_return) to determine if support for the -alternate function declaration syntax with trailing return type is enabled.

- -

C++11 Unicode string literals

-

Use __has_feature(cxx_unicode_literals) to determine if -support for Unicode string literals is enabled.

- -

C++11 unrestricted unions

- -

Use __has_feature(cxx_unrestricted_unions) to determine if support for unrestricted unions is enabled.

- -

C++11 user-defined literals

- -

Use __has_feature(cxx_user_literals) to determine if support for user-defined literals is enabled.

- -

C++11 variadic templates

- -

Use __has_feature(cxx_variadic_templates) or -__has_extension(cxx_variadic_templates) to determine if support -for variadic templates is enabled.

- -

C11

- -

The features listed below are part of the C11 standard. As a result, all -these features are enabled with the -std=c11 or -std=gnu11 -option when compiling C code. Additionally, because these features are all -backward-compatible, they are available as extensions in all language modes.

- -

C11 alignment specifiers

- -

Use __has_feature(c_alignas) or __has_extension(c_alignas) -to determine if support for alignment specifiers using _Alignas -is enabled.
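A minimal sketch of guarding _Alignas with this check. The ALIGN_16 macro, the packet structure and the helper functions are hypothetical, and the fallback branch assumes a compiler that accepts the GNU-style aligned attribute.

```c
#include <stddef.h>
#include <stdint.h>

/* Compatibility with compilers that lack the feature checking macros. */
#ifndef __has_feature
  #define __has_feature(x) 0
#endif
#ifndef __has_extension
  #define __has_extension __has_feature
#endif

#if __has_extension(c_alignas) || \
    (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L)
  #define ALIGN_16 _Alignas(16)
#else
  #define ALIGN_16 __attribute__((aligned(16))) /* assumption: GNU-style attribute */
#endif

/* The payload is forced onto a 16-byte boundary inside the structure. */
struct packet {
  char tag;
  ALIGN_16 unsigned char payload[16];
};

/* Offset of the aligned member within the structure. */
size_t payload_offset(void) { return offsetof(struct packet, payload); }

/* Nonzero if the payload of *p starts on a 16-byte boundary. */
int payload_is_aligned(const struct packet *p) {
  return (uintptr_t)p->payload % 16 == 0;
}
```

Since tag occupies the first byte, the alignment request pushes payload to offset 16 and the whole structure to 32 bytes.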

- -

C11 atomic operations

- -

Use __has_feature(c_atomic) or __has_extension(c_atomic) -to determine if support for atomic types using _Atomic is enabled. -Clang also provides a set of builtins which can be -used to implement the <stdatomic.h> operations on -_Atomic types.

- -

C11 generic selections

- -

Use __has_feature(c_generic_selections) or -__has_extension(c_generic_selections) to determine if support for -generic selections is enabled.

- -

As an extension, the C11 generic selection expression is available in all -languages supported by Clang. The syntax is the same as that given in the -C11 standard.

- -

In C, type compatibility is decided according to the rules given in the -appropriate standard, but in C++, which lacks the type compatibility rules -used in C, types are considered compatible only if they are equivalent.
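A small sketch of a generic selection in C, guarded by the feature check. The kind enumeration, the KIND_OF macro and kind_of_sum are hypothetical names chosen for illustration.

```c
/* Compatibility with compilers that lack the feature checking macros. */
#ifndef __has_feature
  #define __has_feature(x) 0
#endif
#ifndef __has_extension
  #define __has_extension __has_feature
#endif

#if !(__has_extension(c_generic_selections) || \
      (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L))
  #error "this sketch needs C11 generic selections"
#endif

enum kind { KIND_INT, KIND_FLOAT, KIND_DOUBLE, KIND_OTHER };

/* Select an enumerator at compile time from the type of the expression.
   The controlling expression itself is not evaluated. */
#define KIND_OF(x) _Generic((x), \
    int: KIND_INT,               \
    float: KIND_FLOAT,           \
    double: KIND_DOUBLE,         \
    default: KIND_OTHER)

/* int + float is converted to float, so this selects KIND_FLOAT. */
enum kind kind_of_sum(void) { return KIND_OF(1 + 2.0f); }
```

Types with no matching association, such as a string literal, fall through to the default association and select KIND_OTHER.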

- -

C11 _Static_assert()

- -

Use __has_feature(c_static_assert) or -__has_extension(c_static_assert) to determine if support for -compile-time assertions using _Static_assert is enabled.
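One way this check might be used is to fall back to an emulation when _Static_assert is not available. The STATIC_ASSERT macro, the header structure and header_size are hypothetical, and the size assertion assumes a 2-byte unsigned short with no padding in the structure.

```c
/* Compatibility with compilers that lack the feature checking macros. */
#ifndef __has_feature
  #define __has_feature(x) 0
#endif
#ifndef __has_extension
  #define __has_extension __has_feature
#endif

#if __has_extension(c_static_assert) || \
    (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L)
  #define STATIC_ASSERT(cond, msg) _Static_assert(cond, msg)
#else
  /* Emulation (msg is ignored): a false condition declares an array of
     negative size, which is a compile-time error. */
  #define STATIC_ASSERT(cond, msg) \
    extern char static_assert_emulation[(cond) ? 1 : -1]
#endif

struct header {
  unsigned char version;
  unsigned char flags;
  unsigned short length;
};

/* Assumption: unsigned short is 2 bytes and the struct has no padding. */
STATIC_ASSERT(sizeof(struct header) == 4, "header must be exactly 4 bytes");

/* Size that the assertion above has already checked at compile time. */
unsigned header_size(void) { return (unsigned)sizeof(struct header); }
```

If the layout assumption does not hold on some target, the build fails at the STATIC_ASSERT line instead of the program misbehaving at run time.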

- - -

Checks for Type Traits

- - -

Clang supports the GNU C++ type traits and a subset of the Microsoft Visual C++ type traits. For each supported type trait __X, __has_extension(X) indicates the presence of the type trait. For example:

-
-#if __has_extension(is_convertible_to)
-template<typename From, typename To>
-struct is_convertible_to {
-  static const bool value = __is_convertible_to(From, To);
-};
-#else
-// Emulate type trait
-#endif
-
-
- -

The following type traits are supported by Clang:

- - - -

Blocks

- - -

The syntax and high-level language feature description is in BlockLanguageSpec.txt. Implementation and ABI details for the Clang implementation are in Block-ABI-Apple.txt.

- - -

Query for this feature with __has_extension(blocks).

- - -

Objective-C Features

- - -

Related result types

- -

According to Cocoa conventions, Objective-C methods with certain names ("init", "alloc", etc.) always return objects that are an instance of the receiving class's type. Such methods are said to have a "related result type", meaning that a message send to one of these methods will have the same static type as an instance of the receiver class. For example, given the following classes:

- -
-
-@interface NSObject
-+ (id)alloc;
-- (id)init;
-@end
-
-@interface NSArray : NSObject
-@end
-
-
- -

and this common initialization pattern

- -
-
-NSArray *array = [[NSArray alloc] init];
-
-
- -

the type of the expression [NSArray alloc] is -NSArray* because alloc implicitly has a -related result type. Similarly, the type of the expression -[[NSArray alloc] init] is NSArray*, since -init has a related result type and its receiver is known -to have the type NSArray *. If neither alloc nor init had a related result type, the expressions would have had type id, as declared in the method signature.

- -

A method with a related result type can be declared by using the -type instancetype as its result type. instancetype -is a contextual keyword that is only permitted in the result type of -an Objective-C method, e.g.

- -
-@interface A
-+ (instancetype)constructAnA;
-@end
-
- -

The related result type can also be inferred for some methods. -To determine whether a method has an inferred related result type, the first -word in the camel-case selector (e.g., "init" in "initWithObjects") is -considered, and the method will have a related result type if its return -type is compatible with the type of its class and if

- - - -

If a method with a related result type is overridden by a subclass -method, the subclass method must also return a type that is compatible -with the subclass type. For example:

- -
-
-@interface NSString : NSObject
-- (NSUnrelated *)init; // incorrect usage: NSUnrelated is not NSString or a superclass of NSString
-@end
-
-
- -

Related result types only affect the type of a message send or property access via the given method. In all other respects, a method with a related result type is treated the same way as a method that returns id.

- -

Use __has_feature(objc_instancetype) to determine whether -the instancetype contextual keyword is available.

- - -

Automatic reference counting

- - -

Clang provides support for automated reference counting in Objective-C, which eliminates the need for manual retain/release/autorelease message sends. There are two feature macros associated with automatic reference counting: __has_feature(objc_arc) indicates the availability of automated reference counting in general, while __has_feature(objc_arc_weak) indicates that automated reference counting also includes support for __weak pointers to Objective-C objects.

- - -

Enumerations with a fixed underlying type

- - -

Clang provides support for C++11 enumerations with a fixed -underlying type within Objective-C. For example, one can write an -enumeration type as:

- -
-typedef enum : unsigned char { Red, Green, Blue } Color;
-
- -

This specifies that the underlying type, which is used to store the -enumeration value, is unsigned char.

- -

Use __has_feature(objc_fixed_enum) to determine whether -support for fixed underlying types is available in Objective-C.

- - -

Interoperability with C++11 lambdas

- - -

Clang provides interoperability between C++11 lambdas and -blocks-based APIs, by permitting a lambda to be implicitly converted -to a block pointer with the corresponding signature. For example, -consider an API such as NSArray's array-sorting -method:

- -
 - (NSArray *)sortedArrayUsingComparator:(NSComparator)cmptr; 
- -

NSComparator is simply a typedef for the block pointer -NSComparisonResult (^)(id, id), and parameters of this -type are generally provided with block literals as arguments. However, -one can also use a C++11 lambda so long as it provides the same -signature (in this case, accepting two parameters of type -id and returning an NSComparisonResult):

- -
-  NSArray *array = @[@"string 1", @"string 21", @"string 12", @"String 11",
-                     @"String 02"];
-  const NSStringCompareOptions comparisonOptions
-    = NSCaseInsensitiveSearch | NSNumericSearch |
-      NSWidthInsensitiveSearch | NSForcedOrderingSearch;
-  NSLocale *currentLocale = [NSLocale currentLocale];
-  NSArray *sorted 
-    = [array sortedArrayUsingComparator:[=](id s1, id s2) -> NSComparisonResult {
-               NSRange string1Range = NSMakeRange(0, [s1 length]);
-               return [s1 compare:s2 options:comparisonOptions 
-                          range:string1Range locale:currentLocale];
-       }];
-  NSLog(@"sorted: %@", sorted);
-
- -

This code relies on an implicit conversion from the type of the -lambda expression (an unnamed, local class type called the closure -type) to the corresponding block pointer type. The conversion -itself is expressed by a conversion operator in that closure type -that produces a block pointer with the same signature as the lambda -itself, e.g.,

- -
-  operator NSComparisonResult (^)(id, id)() const;
-
- -

This conversion function returns a new block that simply forwards -the two parameters to the lambda object (which it captures by copy), -then returns the result. The returned block is first copied (with -Block_copy) and then autoreleased. As an optimization, if a -lambda expression is immediately converted to a block pointer (as in -the first example, above), then the block is not copied and -autoreleased: rather, it is given the same lifetime as a block literal -written at that point in the program, which avoids the overhead of -copying a block to the heap in the common case.

- -

The conversion from a lambda to a block pointer is only available -in Objective-C++, and not in C++ with blocks, due to its use of -Objective-C memory management (autorelease).

- - -

Object Literals and Subscripting

- - -

Clang provides support for Object Literals -and Subscripting in Objective-C, which simplifies common Objective-C -programming patterns, makes programs more concise, and improves the safety of -container creation. There are several feature macros associated with object -literals and subscripting: __has_feature(objc_array_literals) -tests the availability of array literals; -__has_feature(objc_dictionary_literals) tests the availability of -dictionary literals; __has_feature(objc_subscripting) tests the -availability of object subscripting.

- - -

Objective-C Autosynthesis of Properties

- - -

Clang provides support for autosynthesis of declared properties. Using this -feature, Clang provides default synthesis of those properties not declared @dynamic -and not having user-provided backing getter and setter methods. -__has_feature(objc_default_synthesize_properties) checks for availability -of this feature in the version of Clang being used.

- - -

Function Overloading in C

- - -

Clang provides support for C++ function overloading in C. Function -overloading in C is introduced using the overloadable attribute. For -example, one might provide several overloaded versions of a tgsin -function that invokes the appropriate standard function computing the sine of a -value with float, double, or long double -precision:

- -
-
-#include <math.h>
-float __attribute__((overloadable)) tgsin(float x) { return sinf(x); }
-double __attribute__((overloadable)) tgsin(double x) { return sin(x); }
-long double __attribute__((overloadable)) tgsin(long double x) { return sinl(x); }
-
-
- -

Given these declarations, one can call tgsin with a -float value to receive a float result, with a -double to receive a double result, etc. Function -overloading in C follows the rules of C++ function overloading to pick -the best overload given the call arguments, with a few C-specific -semantics:

- - -

The declaration of overloadable functions is restricted to -function declarations and definitions. Most importantly, if any -function with a given name is given the overloadable -attribute, then all function declarations and definitions with that -name (and in that scope) must have the overloadable -attribute. This rule even applies to redeclarations of functions whose original -declaration had the overloadable attribute, e.g.,

- -
-
-int f(int) __attribute__((overloadable));
-float f(float); // error: declaration of "f" must have the "overloadable" attribute
-
-int g(int) __attribute__((overloadable));
-int g(int) { } // error: redeclaration of "g" must also have the "overloadable" attribute
-
-
- -

Functions marked overloadable must have -prototypes. Therefore, the following code is ill-formed:

- -
-
-int h() __attribute__((overloadable)); // error: h does not have a prototype
-
-
- -

However, overloadable functions are allowed to use an -ellipsis even if there are no named parameters (as is permitted in C++). This feature is particularly useful when combined with the unavailable attribute:

- -
-
-void honeypot(...) __attribute__((overloadable, unavailable)); // calling me is an error
-
-
- -

Functions declared with the overloadable attribute have -their names mangled according to the same rules as C++ function -names. For example, the three tgsin functions in our -motivating example get the mangled names _Z5tgsinf, -_Z5tgsind, and _Z5tgsine, respectively. There are two -caveats to this use of name mangling:

- - - -

Query for this feature with __has_extension(attribute_overloadable).

- - -

Initializer lists for complex numbers in C

- - -

clang supports an extension which allows the following in C:

- -
-
-#include <math.h>
-#include <complex.h>
-complex float x = { 1.0f, INFINITY }; // Init to (1, Inf)
-
-
- -

This construct is useful because there is no way to separately -initialize the real and imaginary parts of a complex variable in -standard C, given that clang does not support _Imaginary. -(clang also supports the __real__ and __imag__ -extensions from gcc, which help in some cases, but are not usable in -static initializers.) - -

Note that this extension does not allow eliding the braces; the -meaning of the following two lines is different:

- -
-
-complex float x[] = { { 1.0f, 1.0f } }; // [0] = (1, 1)
-complex float x[] = { 1.0f, 1.0f }; // [0] = (1, 0), [1] = (1, 0)
-
-
- -

This extension also works in C++ mode, as far as that goes, but does not - apply to the C++ std::complex. (In C++11, list - initialization allows the same syntax to be used with - std::complex with the same meaning.) - - -
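
As a minimal sketch of the __real__ and __imag__ extensions mentioned above, the following C fragment sets the two parts of a complex value separately at run time. The helper names make_complex, real_part and imag_part are hypothetical and exist only for illustration. Because the parts are assigned by ordinary statements, this approach does not help with static initializers, which is where the initializer-list extension is needed.

```c
/* Sketch only: make_complex(), real_part() and imag_part() are hypothetical
   helpers, not part of Clang or the C library. */
static float _Complex make_complex(float re, float im) {
  float _Complex z;
  __real__ z = re; /* assign the real part directly */
  __imag__ z = im; /* assign the imaginary part directly */
  return z;
}

/* Read the parts back with the same extension keywords. */
static float real_part(float _Complex z) { return __real__ z; }
static float imag_part(float _Complex z) { return __imag__ z; }
```

Calling make_complex(1.0f, 2.0f) and reading the parts back with real_part and imag_part yields the two values that were stored.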

Builtin Functions

- - -

Clang supports a number of builtin library functions with the same syntax as -GCC, including things like __builtin_nan, -__builtin_constant_p, __builtin_choose_expr, -__builtin_types_compatible_p, __sync_fetch_and_add, etc. In -addition to the GCC builtins, Clang supports a number of builtins that GCC does -not, which are listed here.

- -

Please note that Clang does not and will not support all of the GCC builtins -for vector operations. Instead of using builtins, you should use the functions -defined in target-specific header files like <xmmintrin.h>, which -define portable wrappers for these. Many of the Clang versions of these -functions are implemented directly in terms of extended -vector support instead of builtins, in order to reduce the number of -builtins that we need to implement.

- - -

__builtin_readcyclecounter

- - -

__builtin_readcyclecounter is used to access the cycle counter -register (or a similar low-latency, high-accuracy clock) on those targets that -support it. -

- -

Syntax:

- -
-__builtin_readcyclecounter()
-
- -

Example of Use:

- -
-unsigned long long t0 = __builtin_readcyclecounter();
-do_something();
-unsigned long long t1 = __builtin_readcyclecounter();
-unsigned long long cycles_to_do_something = t1 - t0; // assuming no overflow
-
- -

Description:

- -

The __builtin_readcyclecounter() builtin returns the cycle counter value, -which may be either global or process/thread-specific depending on the target. -As the backing counters often overflow quickly (on the order of -seconds) this should only be used for timing small intervals. When not -supported by the target, the return value is always zero. This builtin -takes no arguments and produces an unsigned long long result. -

- -

Query for this feature with __has_builtin(__builtin_readcyclecounter).
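
The feature query and the builtin can be combined in a small wrapper. This is a sketch under stated assumptions: read_cycles and elapsed_cycles are hypothetical helper names, and the fallback simply returns zero, mirroring the result documented above for targets without a cycle counter.

```c
#ifndef __has_builtin
#  define __has_builtin(x) 0 /* compatibility with compilers lacking the macro */
#endif

/* read_cycles() is a hypothetical wrapper around the builtin. */
static unsigned long long read_cycles(void) {
#if __has_builtin(__builtin_readcyclecounter)
  return __builtin_readcyclecounter();
#else
  return 0; /* same value the builtin yields on unsupported targets */
#endif
}

/* Difference between two readings; like the example above, this assumes the
   counter did not overflow between t0 and t1. */
static unsigned long long elapsed_cycles(unsigned long long t0,
                                         unsigned long long t1) {
  return t1 - t0;
}
```

A caller would take one reading before and one after the code being timed and pass both to elapsed_cycles.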

- - -

__builtin_shufflevector

- - -

__builtin_shufflevector is used to express generic vector -permutation/shuffle/swizzle operations. This builtin is also very important for -the implementation of various target-specific header files like -<xmmintrin.h>. -

- -

Syntax:

- -
-__builtin_shufflevector(vec1, vec2, index1, index2, ...)
-
- -

Examples:

- -
-  // Identity operation - return 4-element vector V1.
-  __builtin_shufflevector(V1, V1, 0, 1, 2, 3)
-
-  // "Splat" element 0 of V1 into a 4-element result.
-  __builtin_shufflevector(V1, V1, 0, 0, 0, 0)
-
-  // Reverse 4-element vector V1.
-  __builtin_shufflevector(V1, V1, 3, 2, 1, 0)
-
-  // Concatenate every other element of 4-element vectors V1 and V2.
-  __builtin_shufflevector(V1, V2, 0, 2, 4, 6)
-
-  // Concatenate every other element of 8-element vectors V1 and V2.
-  __builtin_shufflevector(V1, V2, 0, 2, 4, 6, 8, 10, 12, 14)
-
- -

Description:

- -

The first two arguments to __builtin_shufflevector are vectors that have the -same element type. The remaining arguments are a list of integers that specify -the element indices of the first two vectors that should be extracted and -returned in a new vector. These element indices are numbered sequentially -starting with the first vector, continuing into the second vector. Thus, if -vec1 is a 4-element vector, index 5 would refer to the second element of vec2. -

- -

The result of __builtin_shufflevector is a vector -with the same element type as vec1/vec2 but that has an element count equal to -the number of indices specified. -

- -

Query for this feature with __has_builtin(__builtin_shufflevector).
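
The index numbering described above can be sketched as follows. The type v4si and the function blend_even_odd are assumptions made for this illustration: v4si is declared with the GCC-style vector_size attribute and assumes a 4-byte int, and the fallback branch uses plain element access for compilers that do not report the builtin.

```c
#ifndef __has_builtin
#  define __has_builtin(x) 0 /* compatibility with compilers lacking the macro */
#endif

/* Assumed 4 x int vector type (16 bytes, so int is taken to be 4 bytes). */
typedef int v4si __attribute__((vector_size(16)));

/* Hypothetical helper: lanes 0 and 2 come from a, lanes 1 and 3 from b. */
static v4si blend_even_odd(v4si a, v4si b) {
#if __has_builtin(__builtin_shufflevector)
  /* Indices 0..3 select from a; indices 4..7 select from b, so index 5 is
     the second element of b and index 7 is its fourth element. */
  return __builtin_shufflevector(a, b, 0, 5, 2, 7);
#else
  /* Fallback with the same result, built by element access. */
  v4si r = { a[0], b[1], a[2], b[3] };
  return r;
#endif
}
```

With a = {0, 1, 2, 3} and b = {10, 11, 12, 13}, both branches produce {0, 11, 2, 13}.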

- - -

__builtin_unreachable

- - -

__builtin_unreachable is used to indicate that a specific point in -the program cannot be reached, even if the compiler might otherwise think it -can. This is useful for improving optimization and eliminating certain warnings. -For example, without the __builtin_unreachable in the example below, -the compiler assumes that the inline asm can fall through and prints a "function -declared 'noreturn' should not return" warning. -

- -

Syntax:

- -
-__builtin_unreachable()
-
- -

Example of Use:

- -
-void myabort(void) __attribute__((noreturn));
-void myabort(void) {
-    asm("int3");
-    __builtin_unreachable();
-}
-
- -

Description:

- -

The __builtin_unreachable() builtin has completely undefined behavior. Because -reaching it is undefined, it acts as an assertion that it is never reached, and the -optimizer can take advantage of this to produce better code. This builtin takes -no arguments and produces a void result. -

- -

Query for this feature with __has_builtin(__builtin_unreachable).
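
Another common use, sketched here with hypothetical names (UNREACHABLE, enum color, color_name), is marking the point after a switch that handles every enumerator. The fallback to abort() is an assumption chosen for this example so that the code still builds when the builtin is not reported.

```c
#include <stdlib.h>

#ifndef __has_builtin
#  define __has_builtin(x) 0 /* compatibility with compilers lacking the macro */
#endif

#if __has_builtin(__builtin_unreachable)
#  define UNREACHABLE() __builtin_unreachable()
#else
#  define UNREACHABLE() abort() /* conservative fallback for this sketch */
#endif

enum color { RED, GREEN, BLUE };

/* Every enumerator returns from inside the switch, so control can only get
   past it if c holds an invalid value. */
static const char *color_name(enum color c) {
  switch (c) {
  case RED:   return "red";
  case GREEN: return "green";
  case BLUE:  return "blue";
  }
  UNREACHABLE();
}
```

Passing a value outside the enumeration is undefined behavior when the builtin is used, so this pattern is only appropriate when callers are trusted.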

- - -

__sync_swap

- - -

__sync_swap is used to atomically swap integers or pointers in -memory. -

- -

Syntax:

- -
-type __sync_swap(type *ptr, type value, ...)
-
- -

Example of Use:

- -
-int old_value = __sync_swap(&value, new_value);
-
- -

Description:

- -

The __sync_swap() builtin extends the existing __sync_*() family of atomic -intrinsics to allow code to atomically swap the current value with the new -value. More importantly, it helps developers write more efficient and correct -code by avoiding expensive loops around __sync_bool_compare_and_swap() or -relying on the platform specific implementation details of -__sync_lock_test_and_set(). The __sync_swap() builtin is a full barrier. -
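
The contrast with a compare-and-swap loop can be sketched as follows. The wrapper name atomic_exchange_int is hypothetical, and the sketch assumes that __has_builtin(__sync_swap) reports the builtin on compilers that provide it; the fallback branch is the kind of retry loop that __sync_swap is meant to avoid.

```c
#ifndef __has_builtin
#  define __has_builtin(x) 0 /* compatibility with compilers lacking the macro */
#endif

/* Hypothetical wrapper: store new_value in *ptr and return the old value. */
static int atomic_exchange_int(int *ptr, int new_value) {
#if __has_builtin(__sync_swap)
  return __sync_swap(ptr, new_value); /* one atomic exchange, full barrier */
#else
  /* Fallback: retry until no other thread changed *ptr in between. */
  int old_value;
  do {
    old_value = *ptr;
  } while (!__sync_bool_compare_and_swap(ptr, old_value, new_value));
  return old_value;
#endif
}
```

Both branches return the value that was in memory before the exchange and leave new_value stored at ptr.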

- - -

__c11_atomic builtins

- - -

Clang provides a set of builtins which are intended to be used to implement -C11's <stdatomic.h> header. These builtins provide the semantics -of the _explicit form of the corresponding C11 operation, and are named -with a __c11_ prefix. The supported operations are:

- - - - -

Non-standard C++11 Attributes

- - -

Clang supports one non-standard C++11 attribute. It resides in the -clang attribute namespace.

- - -

The clang::fallthrough attribute

- - -

The clang::fallthrough attribute is used along with the --Wimplicit-fallthrough argument to annotate intentional fall-through -between switch labels. It can only be applied to a null statement placed at a -point of execution between any statement and the next switch label. It is common -to mark these places with a specific comment, but this attribute is meant to -replace comments with a more strict annotation, which can be checked by the -compiler. This attribute doesn't change semantics of the code and can be used -wherever an intended fall-through occurs. It is designed to mimic -control-flow statements like break;, so it can be placed in most places -where break; can, but only if there are no statements on the execution -path between it and the next switch label.

-

Here is an example:

-
-// compile with -Wimplicit-fallthrough
-switch (n) {
-case 22:
-case 33:  // no warning: no statements between case labels
-  f();
-case 44:  // warning: unannotated fall-through
-  g();
-  [[clang::fallthrough]];
-case 55:  // no warning
-  if (x) {
-    h();
-    break;
-  }
-  else {
-    i();
-    [[clang::fallthrough]];
-  }
-case 66:  // no warning
-  p();
-  [[clang::fallthrough]];  // warning: fallthrough annotation does not directly precede case label
-  q();
-case 77:  // warning: unannotated fall-through
-  r();
-}
-
- - -

Target-Specific Extensions

- - -

Clang supports some language features conditionally on some targets.

- - -

X86/X86-64 Language Extensions

- - -

The X86 backend has these language extensions:

- - -

Memory references off the GS segment

- - -

Annotating a pointer with address space #256 causes it to be code generated -relative to the X86 GS segment register, and address space #257 causes it to be -relative to the X86 FS segment. Note that this is a very low-level -feature that should only be used if you know what you're doing (for example in -an OS kernel).

- -

Here is an example:

- -
-#define GS_RELATIVE __attribute__((address_space(256)))
-int foo(int GS_RELATIVE *P) {
-  return *P;
-}
-
- -

Which compiles to (on X86-32):

- -
-_foo:
-	movl	4(%esp), %eax
-	movl	%gs:(%eax), %eax
-	ret
-
- - -

Static Analysis-Specific Extensions

- - -

Clang supports additional attributes that are useful for documenting program -invariants and rules for static analysis tools. The extensions documented here -are used by the path-sensitive static analyzer -engine that is part of Clang's Analysis library.

- -

The analyzer_noreturn attribute

- -

Clang's static analysis engine understands the standard noreturn -attribute. This attribute, which is typically affixed to a function prototype, -indicates that a call to a given function never returns. Function prototypes for -common functions like exit are typically annotated with this attribute, -as well as a variety of common assertion handlers. Users can educate the static -analyzer about their own custom assertion handlers (thus cutting down on false -positives due to false paths) by marking their own "panic" functions -with this attribute.

- -

While useful, noreturn is not applicable in all cases. Sometimes -there are special functions that for all intents and purposes should be -considered panic functions (i.e., they are only called when an internal program -error occurs) but may actually return so that the program can fail gracefully. -The analyzer_noreturn attribute allows one to annotate such functions -so that the analyzer interprets them as "no return" functions (thus -pruning bogus paths) without affecting compilation (unlike -noreturn).

- -

Usage: The analyzer_noreturn attribute can be placed in the -same places where the noreturn attribute can be placed. It is commonly -placed at the end of function prototypes:

- -
-  void foo() __attribute__((analyzer_noreturn));
-
- -

Query for this feature with -__has_attribute(analyzer_noreturn).
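
A guarded use of the attribute might look like the sketch below. The macro ANALYZER_NORETURN and the functions report_failure and checked_div are hypothetical names introduced for this example; the point is that the annotated function really does return at run time, while the analyzer treats calls to it as the end of a path.

```c
#ifndef __has_attribute
#  define __has_attribute(x) 0 /* compatibility with compilers lacking the macro */
#endif

#if __has_attribute(analyzer_noreturn)
#  define ANALYZER_NORETURN __attribute__((analyzer_noreturn))
#else
#  define ANALYZER_NORETURN
#endif

static int failure_count = 0; /* bookkeeping for this sketch only */

/* The attribute goes on the prototype; the analyzer stops exploring a path at
   a call to this function, but the compiled code is unchanged. */
static void report_failure(const char *msg) ANALYZER_NORETURN;

static void report_failure(const char *msg) {
  (void)msg;
  ++failure_count; /* record the failure, then return so the caller can recover */
}

static int checked_div(int a, int b) {
  if (b == 0) {
    report_failure("division by zero");
    return 0; /* run-time recovery value; still executed by the program */
  }
  return a / b;
}
```

At run time checked_div(1, 0) records one failure and returns 0, which is the graceful-failure behavior that plain noreturn would not permit.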

- -

The objc_method_family attribute

- -

Many methods in Objective-C have conventional meanings determined -by their selectors. For the purposes of static analysis, it is -sometimes useful to be able to mark a method as having a particular -conventional meaning despite not having the right selector, or as not -having the conventional meaning that its selector would suggest. -For these use cases, we provide an attribute to specifically describe -the method family that a method belongs to.

- -

Usage: __attribute__((objc_method_family(X))), -where X is one of none, alloc, copy, -init, mutableCopy, or new. This attribute -can only be placed at the end of a method declaration:

- -
-  - (NSString*) initMyStringValue __attribute__((objc_method_family(none)));
-
- -

Users who do not wish to change the conventional meaning of a -method, and who merely want to document its non-standard retain and -release semantics, should use the -retaining behavior attributes -described below.

- -

Query for this feature with -__has_attribute(objc_method_family).

- -

Objective-C retaining behavior attributes

- -

In Objective-C, functions and methods are generally assumed to take -and return objects with +0 retain counts, with some exceptions for -special methods like +alloc and init. However, -there are exceptions, and so Clang provides attributes to allow these -exceptions to be documented, which helps the analyzer find leaks (and -ignore non-leaks). Some exceptions may be better described using -the objc_method_family -attribute instead.

- -

Usage: The ns_returns_retained, ns_returns_not_retained, -ns_returns_autoreleased, cf_returns_retained, -and cf_returns_not_retained attributes can be placed on -methods and functions that return Objective-C or CoreFoundation -objects. They are commonly placed at the end of a function prototype -or method declaration:

- -
-  id foo() __attribute__((ns_returns_retained));
-
-  - (NSString*) bar: (int) x __attribute__((ns_returns_retained));
-
- -

The *_returns_retained attributes specify that the -returned object has a +1 retain count. -The *_returns_not_retained attributes specify that the returned -object has a +0 retain count, even if the normal convention for its -selector would be +1. ns_returns_autoreleased specifies that the -returned object is +0, but is guaranteed to live at least as long as the -next flush of an autorelease pool.

- -

Usage: The ns_consumed and cf_consumed -attributes can be placed on a parameter declaration; they specify -that the argument is expected to have a +1 retain count, which will be -balanced in some way by the function or method. -The ns_consumes_self attribute can only be placed on an -Objective-C method; it specifies that the method expects -its self parameter to have a +1 retain count, which it will -balance in some way.

- -
-  void foo(__attribute__((ns_consumed)) NSString *string);
-
-  - (void) bar __attribute__((ns_consumes_self));
-  - (void) baz: (id) __attribute__((ns_consumed)) x;
-
- -

Query for these features with __has_attribute(ns_consumed), -__has_attribute(ns_returns_retained), etc.

- - -

Dynamic Analysis-Specific Extensions

- -

AddressSanitizer

-

Use __has_feature(address_sanitizer) -to check if the code is being built with AddressSanitizer. -

-

Use __attribute__((no_address_safety_analysis)) on a function -declaration to specify that address safety instrumentation (e.g. -AddressSanitizer) should not be applied to that function. -
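
The two checks can be combined as in the following sketch. The macro NO_ASAN and the functions built_with_asan and sum_bytes are hypothetical names used only for illustration; both feature-check macros are given fallbacks so that the code also builds with compilers that lack them.

```c
#ifndef __has_feature
#  define __has_feature(x) 0 /* compatibility with compilers lacking the macro */
#endif
#ifndef __has_attribute
#  define __has_attribute(x) 0
#endif

#if __has_attribute(no_address_safety_analysis)
#  define NO_ASAN __attribute__((no_address_safety_analysis))
#else
#  define NO_ASAN
#endif

/* Returns 1 when this file is compiled with AddressSanitizer, 0 otherwise. */
static int built_with_asan(void) {
#if __has_feature(address_sanitizer)
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical function that opts out of address safety instrumentation. */
static unsigned sum_bytes(const unsigned char *p, unsigned n) NO_ASAN;

static unsigned sum_bytes(const unsigned char *p, unsigned n) {
  unsigned total = 0;
  unsigned i;
  for (i = 0; i < n; ++i)
    total += p[i];
  return total;
}
```

The attribute is attached to the prototype, matching the "function declaration" placement described above.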

- - -

Thread-Safety Annotation Checking

- - -

Clang supports additional attributes for checking basic locking policies in -multithreaded programs. -Clang currently parses the following list of attributes, although -the implementation for these annotations is currently in development. -For more details, see the -GCC implementation. -

- -

no_thread_safety_analysis

- -

Use __attribute__((no_thread_safety_analysis)) on a function -declaration to specify that the thread safety analysis should not be run on that -function. This attribute provides an escape hatch (e.g. for situations when it -is difficult to annotate the locking policy).

- -

lockable

- -

Use __attribute__((lockable)) on a class definition to specify -that it has a lockable type (e.g. a Mutex class). This annotation is primarily -used to check consistency.

- -

scoped_lockable

- -

Use __attribute__((scoped_lockable)) on a class definition to -specify that it has a "scoped" lockable type. Objects of this type will acquire -the lock upon construction and release it upon going out of scope. - This annotation is primarily used to check -consistency.

- -

guarded_var

- -

Use __attribute__((guarded_var)) on a variable declaration to -specify that the variable must be accessed while holding some lock.

- -

pt_guarded_var

- -

Use __attribute__((pt_guarded_var)) on a pointer declaration to -specify that the pointer must be dereferenced while holding some lock.

- -

guarded_by(l)

- -

Use __attribute__((guarded_by(l))) on a variable declaration to -specify that the variable must be accessed while holding lock l.

- -

pt_guarded_by(l)

- -

Use __attribute__((pt_guarded_by(l))) on a pointer declaration to -specify that the pointer must be dereferenced while holding lock l.

- -

acquired_before(...)

- -

Use __attribute__((acquired_before(...))) on a declaration -of a lockable variable to specify that the lock must be acquired before all -attribute arguments. Arguments must be of lockable type, and there must be at -least one argument.

- -

acquired_after(...)

- -

Use __attribute__((acquired_after(...))) on a declaration -of a lockable variable to specify that the lock must be acquired after all -attribute arguments. Arguments must be of lockable type, and there must be at -least one argument.

- -

exclusive_lock_function(...)

- -

Use __attribute__((exclusive_lock_function(...))) on a function -declaration to specify that the function acquires all listed locks -exclusively. This attribute takes zero or more arguments: either of lockable -type or integers indexing into function parameters of lockable type. If no -arguments are given, the acquired lock is implicitly this of the -enclosing object.

- -

shared_lock_function(...)

- -

Use __attribute__((shared_lock_function(...))) on a function -declaration to specify that the function acquires all listed locks, although - the locks may be shared (e.g. read locks). This attribute takes zero or more -arguments: either of lockable type or integers indexing into function -parameters of lockable type. If no arguments are given, the acquired lock is -implicitly this of the enclosing object.

- -

exclusive_trylock_function(...)

- -

Use __attribute__((exclusive_trylock_function(...))) on a function -declaration to specify that the function will try (without blocking) to acquire -all listed locks exclusively. This attribute takes one or more arguments. The -first argument is an integer or boolean value specifying the return value of a -successful lock acquisition. The remaining arguments are either of lockable type -or integers indexing into function parameters of lockable type. If only one -argument is given, the acquired lock is implicitly this of the -enclosing object.

- -

shared_trylock_function(...)

- -

Use __attribute__((shared_trylock_function(...))) on a function -declaration to specify that the function will try (without blocking) to acquire -all listed locks, although the locks may be shared (e.g. read locks). This -attribute takes one or more arguments. The first argument is an integer or -boolean value specifying the return value of a successful lock acquisition. The -remaining arguments are either of lockable type or integers indexing into -function parameters of lockable type. If only one argument is given, the -acquired lock is implicitly this of the enclosing object.

- -

unlock_function(...)

- -

Use __attribute__((unlock_function(...))) on a function -declaration to specify that the function releases all listed locks. This -attribute takes zero or more arguments: either of lockable type or integers -indexing into function parameters of lockable type. If no arguments are given, -the released lock is implicitly this of the enclosing object.

- -

lock_returned(l)

- -

Use __attribute__((lock_returned(l))) on a function -declaration to specify that the function returns lock l (l -must be of lockable type). This annotation is used to aid in resolving lock -expressions.

- -

locks_excluded(...)

- -

Use __attribute__((locks_excluded(...))) on a function declaration -to specify that the function must not be called with the listed locks held. Arguments -must be of lockable type, and there must be at least one argument.

- -

exclusive_locks_required(...)

- -

Use __attribute__((exclusive_locks_required(...))) on a function -declaration to specify that the function must be called while holding the listed -exclusive locks. Arguments must be of lockable type, and there must be at -least one argument.

- -

shared_locks_required(...)

- -

Use __attribute__((shared_locks_required(...))) on a function -declaration to specify that the function must be called while holding the listed -shared locks. Arguments must be of lockable type, and there must be at -least one argument.
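
To show how several of these annotations fit together, here is a small sketch in C. Everything named in it (the TSA macro, struct toy_mutex, mu, balance, and the functions) is hypothetical, the "mutex" is a single-threaded stand-in rather than a real lock, and the sketch assumes a Clang that accepts these attributes on C structs and globals; the text above describes them in terms of classes. On compilers that do not report the attributes, the macro expands to nothing.

```c
#ifndef __has_attribute
#  define __has_attribute(x) 0 /* compatibility with compilers lacking the macro */
#endif

/* TSA(x) expands to the attribute only when all attributes used below are
   recognized; otherwise it expands to nothing. */
#if __has_attribute(lockable) && __has_attribute(guarded_by) && \
    __has_attribute(exclusive_lock_function) && \
    __has_attribute(unlock_function) && \
    __has_attribute(exclusive_locks_required) && \
    __has_attribute(no_thread_safety_analysis)
#  define TSA(x) __attribute__((x))
#else
#  define TSA(x)
#endif

/* Toy stand-in for a mutex; a real program would wrap a platform lock. */
struct TSA(lockable) toy_mutex { int held; };

static struct toy_mutex mu;
static int balance TSA(guarded_by(mu)); /* must be accessed while holding mu */

/* The lock and unlock primitives opt out of the analysis for their own
   bodies, using the escape hatch described above. */
static void lock_mu(void) TSA(exclusive_lock_function(mu))
    TSA(no_thread_safety_analysis);
static void unlock_mu(void) TSA(unlock_function(mu))
    TSA(no_thread_safety_analysis);

/* Callers must already hold mu exclusively. */
static void deposit_locked(int amount) TSA(exclusive_locks_required(mu));

static void lock_mu(void) { mu.held = 1; }
static void unlock_mu(void) { mu.held = 0; }
static void deposit_locked(int amount) { balance += amount; }

/* Acquires mu, updates the guarded variable, and releases mu again. */
static int deposit(int amount) {
  int result;
  lock_mu();
  deposit_locked(amount);
  result = balance;
  unlock_mu();
  return result;
}
```

The annotations do not change what the program does; they only describe the locking policy to a checker that understands them.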

- - -

Type Safety Checking

- - -

Clang supports additional attributes to enable checking type safety -properties that can't be enforced by the C type system. Use cases include:

- - -

You can detect support for these attributes with __has_attribute(). For -example:

- -
-
-#if defined(__has_attribute)
-#  if __has_attribute(argument_with_type_tag) && \
-      __has_attribute(pointer_with_type_tag) && \
-      __has_attribute(type_tag_for_datatype)
-#    define ATTR_MPI_PWT(buffer_idx, type_idx) __attribute__((pointer_with_type_tag(mpi,buffer_idx,type_idx)))
-/* ... other macros ... */
-#  endif
-#endif
-
-#if !defined(ATTR_MPI_PWT)
-#define ATTR_MPI_PWT(buffer_idx, type_idx)
-#endif
-
-int MPI_Send(void *buf, int count, MPI_Datatype datatype /*, other args omitted */)
-    ATTR_MPI_PWT(1,3);
-
-
- -

argument_with_type_tag(...)

- -

Use __attribute__((argument_with_type_tag(arg_kind, arg_idx, -type_tag_idx))) on a function declaration to specify that the function -accepts a type tag that determines the type of some other argument. -arg_kind is an identifier that should be used when annotating all -applicable type tags.

- -

This attribute is primarily useful for checking arguments of variadic -functions (pointer_with_type_tag can be used in most non-variadic -cases).

- -

For example:

-
-
-int fcntl(int fd, int cmd, ...)
-      __attribute__(( argument_with_type_tag(fcntl,3,2) ));
-
-
- -

pointer_with_type_tag(...)

- -

Use __attribute__((pointer_with_type_tag(ptr_kind, ptr_idx, -type_tag_idx))) on a function declaration to specify that the -function accepts a type tag that determines the pointee type of some other -pointer argument.

- -

For example:

-
-
-int MPI_Send(void *buf, int count, MPI_Datatype datatype /*, other args omitted */)
-    __attribute__(( pointer_with_type_tag(mpi,1,3) ));
-
-
- -

type_tag_for_datatype(...)

- -

Clang supports annotating type tags of two forms.

- - - -

The attribute also accepts an optional third argument that determines how -the expression is compared to the type tag. There are two supported flags:

- - - -
- - diff --git a/docs/LanguageExtensions.rst b/docs/LanguageExtensions.rst new file mode 100644 index 0000000000..850573a34d --- /dev/null +++ b/docs/LanguageExtensions.rst @@ -0,0 +1,1838 @@ +========================= +Clang Language Extensions +========================= + +.. contents:: + :local: + +Introduction +============ + +This document describes the language extensions provided by Clang. In addition +to the language extensions listed here, Clang aims to support a broad range of +GCC extensions. Please see the `GCC manual +`_ for more information on +these extensions. + +.. _langext-feature_check: + +Feature Checking Macros +======================= + +Language extensions can be very useful, but only if you know you can depend on +them. In order to allow fine-grain features checks, we support three builtin +function-like macros. This allows you to directly test for a feature in your +code without having to resort to something like autoconf or fragile "compiler +version checks". + +``__has_builtin`` +----------------- + +This function-like macro takes a single identifier argument that is the name of +a builtin function. It evaluates to 1 if the builtin is supported or 0 if not. +It can be used like this: + +.. code-block:: c++ + + #ifndef __has_builtin // Optional of course. + #define __has_builtin(x) 0 // Compatibility with non-clang compilers. + #endif + + ... + #if __has_builtin(__builtin_trap) + __builtin_trap(); + #else + abort(); + #endif + ... + +.. _langext-__has_feature-__has_extension: + +``__has_feature`` and ``__has_extension`` +----------------------------------------- + +These function-like macros take a single identifier argument that is the name +of a feature. 
``__has_feature`` evaluates to 1 if the feature is both +supported by Clang and standardized in the current language standard or 0 if +not (but see :ref:`below `), while +``__has_extension`` evaluates to 1 if the feature is supported by Clang in the +current language (either as a language extension or a standard language +feature) or 0 if not. They can be used like this: + +.. code-block:: c++ + + #ifndef __has_feature // Optional of course. + #define __has_feature(x) 0 // Compatibility with non-clang compilers. + #endif + #ifndef __has_extension + #define __has_extension __has_feature // Compatibility with pre-3.0 compilers. + #endif + + ... + #if __has_feature(cxx_rvalue_references) + // This code will only be compiled with the -std=c++11 and -std=gnu++11 + // options, because rvalue references are only standardized in C++11. + #endif + + #if __has_extension(cxx_rvalue_references) + // This code will be compiled with the -std=c++11, -std=gnu++11, -std=c++98 + // and -std=gnu++98 options, because rvalue references are supported as a + // language extension in C++98. + #endif + +.. _langext-has-feature-back-compat: + +For backwards compatibility reasons, ``__has_feature`` can also be used to test +for support for non-standardized features, i.e. features not prefixed ``c_``, +``cxx_`` or ``objc_``. + +Another use of ``__has_feature`` is to check for compiler features not related +to the language standard, such as e.g. `AddressSanitizer +`_. + +If the ``-pedantic-errors`` option is given, ``__has_extension`` is equivalent +to ``__has_feature``. + +The feature tag is described along with the language feature below. + +The feature name or extension name can also be specified with a preceding and +following ``__`` (double underscore) to avoid interference from a macro with +the same name. For instance, ``__cxx_rvalue_references__`` can be used instead +of ``cxx_rvalue_references``. 
+ +``__has_attribute`` +------------------- + +This function-like macro takes a single identifier argument that is the name of +an attribute. It evaluates to 1 if the attribute is supported or 0 if not. It +can be used like this: + +.. code-block:: c++ + + #ifndef __has_attribute // Optional of course. + #define __has_attribute(x) 0 // Compatibility with non-clang compilers. + #endif + + ... + #if __has_attribute(always_inline) + #define ALWAYS_INLINE __attribute__((always_inline)) + #else + #define ALWAYS_INLINE + #endif + ... + +The attribute name can also be specified with a preceding and following ``__`` +(double underscore) to avoid interference from a macro with the same name. For +instance, ``__always_inline__`` can be used instead of ``always_inline``. + +Include File Checking Macros +============================ + +Not all development systems have the same include files. The +:ref:`langext-__has_include` and :ref:`langext-__has_include_next` macros allow +you to check for the existence of an include file before doing a possibly +failing ``#include`` directive. + +.. _langext-__has_include: + +``__has_include`` +----------------- + +This function-like macro takes a single file name string argument that is the +name of an include file. It evaluates to 1 if the file can be found using the +include paths, or 0 otherwise: + +.. code-block:: c++ + + // Note the two possible file name string formats. + #if __has_include("myinclude.h") && __has_include() + # include "myinclude.h" + #endif + + // To avoid problem with non-clang compilers not having this macro. + #if defined(__has_include) && __has_include("myinclude.h") + # include "myinclude.h" + #endif + +To test for this feature, use ``#if defined(__has_include)``. + +.. _langext-__has_include_next: + +``__has_include_next`` +---------------------- + +This function-like macro takes a single file name string argument that is the +name of an include file. 
It is like ``__has_include`` except that it looks for +the second instance of the given file found in the include paths. It evaluates +to 1 if the second instance of the file can be found using the include paths, +or 0 otherwise: + +.. code-block:: c++ + + // Note the two possible file name string formats. + #if __has_include_next("myinclude.h") && __has_include_next() + # include_next "myinclude.h" + #endif + + // To avoid problem with non-clang compilers not having this macro. + #if defined(__has_include_next) && __has_include_next("myinclude.h") + # include_next "myinclude.h" + #endif + +Note that ``__has_include_next``, like the GNU extension ``#include_next`` +directive, is intended for use in headers only, and will issue a warning if +used in the top-level compilation file. A warning will also be issued if an +absolute path is used in the file argument. + +``__has_warning`` +----------------- + +This function-like macro takes a string literal that represents a command line +option for a warning and returns true if that is a valid warning option. + +.. code-block:: c++ + + #if __has_warning("-Wformat") + ... + #endif + +Builtin Macros +============== + +``__BASE_FILE__`` + Defined to a string that contains the name of the main input file passed to + Clang. + +``__COUNTER__`` + Defined to an integer value that starts at zero and is incremented each time + the ``__COUNTER__`` macro is expanded. + +``__INCLUDE_LEVEL__`` + Defined to an integral value that is the include depth of the file currently + being translated. For the main file, this value is zero. + +``__TIMESTAMP__`` + Defined to the date and time of the last modification of the current source + file. + +``__clang__`` + Defined when compiling with Clang + +``__clang_major__`` + Defined to the major marketing version number of Clang (e.g., the 2 in + 2.0.1). Note that marketing version numbers should not be used to check for + language features, as different vendors use different numbering schemes. 
+  Instead, use the :ref:`langext-feature_check`.
+
+``__clang_minor__``
+  Defined to the minor version number of Clang (e.g., the 0 in 2.0.1). Note
+  that marketing version numbers should not be used to check for language
+  features, as different vendors use different numbering schemes. Instead, use
+  the :ref:`langext-feature_check`.
+
+``__clang_patchlevel__``
+  Defined to the marketing patch level of Clang (e.g., the 1 in 2.0.1).
+
+``__clang_version__``
+  Defined to a string that captures the Clang marketing version, including the
+  Subversion tag or revision number, e.g., "``1.5 (trunk 102332)``".
+
+.. _langext-vectors:
+
+Vectors and Extended Vectors
+============================
+
+Clang supports the GCC, OpenCL, AltiVec and NEON vector extensions.
+
+OpenCL vector types are created using the ``ext_vector_type`` attribute. It
+supports the ``V.xyzw`` syntax and other tidbits as seen in OpenCL. An example
+is:
+
+.. code-block:: c++
+
+  typedef float float4 __attribute__((ext_vector_type(4)));
+  typedef float float2 __attribute__((ext_vector_type(2)));
+
+  float4 foo(float2 a, float2 b) {
+    float4 c;
+    c.xz = a;
+    c.yw = b;
+    return c;
+  }
+
+Query for this feature with ``__has_extension(attribute_ext_vector_type)``.
+
+Giving the ``-faltivec`` option to clang enables support for AltiVec vector
+syntax and functions. For example:
+
+.. code-block:: c++
+
+  vector float foo(vector int a) {
+    vector int b;
+    b = vec_add(a, a) + a;
+    return (vector float)b;
+  }
+
+NEON vector types are created using the ``neon_vector_type`` and
+``neon_polyvector_type`` attributes. For example:
+
+.. code-block:: c++
+
+  typedef __attribute__((neon_vector_type(8))) int8_t int8x8_t;
+  typedef __attribute__((neon_polyvector_type(16))) poly8_t poly8x16_t;
+
+  int8x8_t foo(int8x8_t a) {
+    int8x8_t v;
+    v = a;
+    return v;
+  }
+
+Vector Literals
+---------------
+
+Vector literals can be used to create vectors from a set of scalars, or
+vectors. Either the parentheses form or the braces form can be used.
+In the parentheses
+form, the number of literal values specified must be one, i.e. referring to a
+scalar value, or must match the size of the vector type being created. If a
+single scalar literal value is specified, the scalar literal value will be
+replicated to all the components of the vector type. In the braces form, any
+number of literals can be specified. For example:
+
+.. code-block:: c++
+
+  typedef int v4si __attribute__((__vector_size__(16)));
+  typedef float float4 __attribute__((ext_vector_type(4)));
+  typedef float float2 __attribute__((ext_vector_type(2)));
+
+  v4si vsi = (v4si){1, 2, 3, 4};
+  float4 vf = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
+  vector int vi1 = (vector int)(1);    // vi1 will be (1, 1, 1, 1).
+  vector int vi2 = (vector int){1};    // vi2 will be (1, 0, 0, 0).
+  vector int vi3 = (vector int)(1, 2); // error
+  vector int vi4 = (vector int){1, 2}; // vi4 will be (1, 2, 0, 0).
+  vector int vi5 = (vector int)(1, 2, 3, 4);
+  float4 vf2 = (float4)((float2)(1.0f, 2.0f), (float2)(3.0f, 4.0f));
+
+Vector Operations
+-----------------
+
+The table below shows the support for each operation by vector extension. A
+dash indicates that an operation is not accepted according to a corresponding
+specification.
+
+============================== ====== ======= === ====
+           Operator            OpenCL AltiVec GCC NEON
+============================== ====== ======= === ====
+[]                             yes    yes     yes --
+unary operators +, --          yes    yes     yes --
+++, -- --                      yes    yes     yes --
++,--,*,/,%                     yes    yes     yes --
+bitwise operators &,|,^,~      yes    yes     yes --
+>>,<<                          yes    yes     yes --
+!, &&, ||                      no     --      --  --
+==, !=, >, <, >=, <=           yes    yes     --  --
+=                              yes    yes     yes yes
+?:                             yes    --      --  --
+sizeof                         yes    yes     yes yes
+============================== ====== ======= === ====
+
+See also :ref:`langext-__builtin_shufflevector`.
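As a minimal sketch of the operations the table marks as supported for GCC-style vectors (subscripting with ``[]`` and element-wise ``+`` and ``*``), the following assumes a compiler that accepts ``__vector_size__`` and a 4-byte ``int``. The helper names ``add_lanes``, ``mul_lanes`` and ``sum_lanes`` are hypothetical and are not provided by Clang.

```cpp
// GCC-style vector of four ints (16 bytes, assuming a 4-byte int).
typedef int v4si __attribute__((__vector_size__(16)));

// Element-wise addition: lane i of the result is a[i] + b[i].
inline v4si add_lanes(v4si a, v4si b) { return a + b; }

// Element-wise multiplication: lane i of the result is a[i] * b[i].
inline v4si mul_lanes(v4si a, v4si b) { return a * b; }

// Horizontal sum, reading each lane with the [] operator.
inline int sum_lanes(v4si v) { return v[0] + v[1] + v[2] + v[3]; }
```

With ``a = {1, 2, 3, 4}`` and ``b = {10, 20, 30, 40}``, ``add_lanes(a, b)`` holds ``{11, 22, 33, 44}``, so ``sum_lanes`` of that result is 110.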
+
+Messages on ``deprecated`` and ``unavailable`` Attributes
+=========================================================
+
+An optional string message can be added to the ``deprecated`` and
+``unavailable`` attributes. For example:
+
+.. code-block:: c++
+
+  void explode(void) __attribute__((deprecated("extremely unsafe, use 'combust' instead!!!")));
+
+If the deprecated or unavailable declaration is used, the message will be
+incorporated into the appropriate diagnostic:
+
+.. code-block:: c++
+
+  harmless.c:4:3: warning: 'explode' is deprecated: extremely unsafe, use 'combust' instead!!!
+        [-Wdeprecated-declarations]
+    explode();
+    ^
+
+Query for this feature with
+``__has_extension(attribute_deprecated_with_message)`` and
+``__has_extension(attribute_unavailable_with_message)``.
+
+Attributes on Enumerators
+=========================
+
+Clang allows attributes to be written on individual enumerators. This allows
+enumerators to be deprecated, made unavailable, etc. The attribute must appear
+after the enumerator name and before any initializer, like so:
+
+.. code-block:: c++
+
+  enum OperationMode {
+    OM_Invalid,
+    OM_Normal,
+    OM_Terrified __attribute__((deprecated)),
+    OM_AbortOnError __attribute__((deprecated)) = 4
+  };
+
+Attributes on the ``enum`` declaration do not apply to individual enumerators.
+
+Query for this feature with ``__has_extension(enumerator_attributes)``.
+
+'User-Specified' System Frameworks
+==================================
+
+Clang provides a mechanism by which frameworks can be built in such a way that
+they will always be treated as being "system frameworks", even if they are not
+present in a system framework directory. This can be useful to system
+framework developers who want to be able to test building other applications
+with development builds of their framework, including the manner in which the
+compiler changes warning behavior for system headers.
+
+Framework developers can opt in to this mechanism by creating a
+"``.system_framework``" file at the top level of their framework. That is, the
+framework should have contents like:
+
+.. code-block:: none
+
+  .../TestFramework.framework
+  .../TestFramework.framework/.system_framework
+  .../TestFramework.framework/Headers
+  .../TestFramework.framework/Headers/TestFramework.h
+  ...
+
+Clang will treat the presence of this file as an indicator that the framework
+should be treated as a system framework, regardless of how it was found in the
+framework search path. For consistency, we recommend that such files never be
+included in installed versions of the framework.
+
+Availability attribute
+======================
+
+Clang introduces the ``availability`` attribute, which can be placed on
+declarations to describe the lifecycle of that declaration relative to
+operating system versions. Consider the function declaration for a
+hypothetical function ``f``:
+
+.. code-block:: c++
+
+  void f(void) __attribute__((availability(macosx,introduced=10.4,deprecated=10.6,obsoleted=10.7)));
+
+The availability attribute states that ``f`` was introduced in Mac OS X 10.4,
+deprecated in Mac OS X 10.6, and obsoleted in Mac OS X 10.7. This information
+is used by Clang to determine when it is safe to use ``f``: for example, if
+Clang is instructed to compile code for Mac OS X 10.5, a call to ``f()``
+succeeds. If Clang is instructed to compile code for Mac OS X 10.6, the call
+succeeds but Clang emits a warning specifying that the function is deprecated.
+Finally, if Clang is instructed to compile code for Mac OS X 10.7, the call
+fails because ``f()`` is no longer available.
+
+The availability attribute is a comma-separated list starting with the
+platform name and then including clauses specifying important milestones in the
+declaration's lifetime (in any order) along with additional information.
+Those
+clauses can be:
+
+introduced=\ *version*
+  The first version in which this declaration was introduced.
+
+deprecated=\ *version*
+  The first version in which this declaration was deprecated, meaning that
+  users should migrate away from this API.
+
+obsoleted=\ *version*
+  The first version in which this declaration was obsoleted, meaning that it
+  was removed completely and can no longer be used.
+
+unavailable
+  This declaration is never available on this platform.
+
+message=\ *string-literal*
+  Additional message text that Clang will provide when emitting a warning or
+  error about use of a deprecated or obsoleted declaration. Useful to direct
+  users to replacement APIs.
+
+Multiple availability attributes can be placed on a declaration, which may
+correspond to different platforms. Only the availability attribute with the
+platform corresponding to the target platform will be used; any others will be
+ignored. If no availability attribute specifies availability for the current
+target platform, the availability attributes are ignored. Supported platforms
+are:
+
+``ios``
+  Apple's iOS operating system. The minimum deployment target is specified by
+  the ``-mios-version-min=*version*`` or ``-miphoneos-version-min=*version*``
+  command-line arguments.
+
+``macosx``
+  Apple's Mac OS X operating system. The minimum deployment target is
+  specified by the ``-mmacosx-version-min=*version*`` command-line argument.
+
+A declaration can be used even when deploying back to a platform version prior
+to when the declaration was introduced. When this happens, the declaration is
+weakly linked,
+as if the ``weak_import`` attribute were added to the declaration. A
+weakly-linked declaration may or may not be present at run time, and a program
+can determine whether the declaration is present by checking whether the
+address of that declaration is non-NULL.
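The run-time check described above can be sketched as follows. This is only an illustration under stated assumptions: ``SKETCH_MACOSX_INTRODUCED``, ``new_api`` and ``call_new_api_if_present`` are hypothetical names, and ``new_api`` is defined locally so that the sketch links on any platform. In real use the declaration would come from a system header and the definition from a system library, and only then could the address actually be null.

```cpp
#ifndef __has_attribute
  #define __has_attribute(x) 0  // Compatibility with non-clang compilers.
#endif

// Expands to the availability attribute where the compiler knows it, and to
// nothing elsewhere. SKETCH_MACOSX_INTRODUCED is a hypothetical macro name.
#if __has_attribute(availability)
  #define SKETCH_MACOSX_INTRODUCED(v) \
    __attribute__((availability(macosx, introduced=v)))
#else
  #define SKETCH_MACOSX_INTRODUCED(v)
#endif

// Hypothetical API assumed to have first appeared in Mac OS X 10.7.
int new_api(void) SKETCH_MACOSX_INTRODUCED(10.7);

// Local stand-in definition so this sketch links everywhere (assumption).
int new_api(void) { return 42; }

// Calls new_api() only if its address is non-null, which is how a
// weakly-linked declaration is detected at run time; returns -1 otherwise.
inline int call_new_api_if_present(void) {
  int (*fn)(void) = &new_api;
  return fn ? fn() : -1;
}
```

Because the stand-in definition is always present here, ``call_new_api_if_present()`` takes the first branch; the ``-1`` path only matters when the symbol is weakly linked and missing on the running system.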
+
+Checks for Standard Language Features
+=====================================
+
+The ``__has_feature`` macro can be used to query if certain standard language
+features are enabled. The ``__has_extension`` macro can be used to query if
+language features are available as an extension when compiling for a standard
+which does not provide them. The features which can be tested are listed here.
+
+C++98
+-----
+
+The features listed below are part of the C++98 standard. These features are
+enabled by default when compiling C++ code.
+
+C++ exceptions
+^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_exceptions)`` to determine if C++ exceptions have been
+enabled. For example, compiling code with ``-fno-exceptions`` disables C++
+exceptions.
+
+C++ RTTI
+^^^^^^^^
+
+Use ``__has_feature(cxx_rtti)`` to determine if C++ RTTI has been enabled. For
+example, compiling code with ``-fno-rtti`` disables the use of RTTI.
+
+C++11
+-----
+
+The features listed below are part of the C++11 standard. As a result, all
+these features are enabled with the ``-std=c++11`` or ``-std=gnu++11`` option
+when compiling C++ code.
+
+C++11 SFINAE includes access control
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_access_control_sfinae)`` or
+``__has_extension(cxx_access_control_sfinae)`` to determine whether
+access-control errors (e.g., calling a private constructor) are considered to
+be template argument deduction errors (aka SFINAE errors), per C++ DR1170.
+
+C++11 alias templates
+^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_alias_templates)`` or
+``__has_extension(cxx_alias_templates)`` to determine if support for C++11's
+alias declarations and alias templates is enabled.
+
+C++11 alignment specifiers
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_alignas)`` or ``__has_extension(cxx_alignas)`` to
+determine if support for alignment specifiers using ``alignas`` is enabled.
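A feature check of this kind is typically wrapped in a macro so the rest of the code does not care which spelling is in use. The sketch below assumes a GNU-compatible compiler for the fallback branch; ``SKETCH_ALIGNAS`` and ``AlignedBuffer`` are hypothetical names chosen for illustration.

```cpp
#ifndef __has_feature
  #define __has_feature(x) 0  // Compatibility with non-clang compilers.
#endif
#ifndef __has_extension
  #define __has_extension __has_feature  // Compatibility with older compilers.
#endif

// SKETCH_ALIGNAS is a hypothetical macro: it uses the C++11 keyword when the
// compiler reports support, and the GNU aligned attribute otherwise.
#if __has_extension(cxx_alignas)
  #define SKETCH_ALIGNAS(n) alignas(n)
#else
  #define SKETCH_ALIGNAS(n) __attribute__((aligned(n)))
#endif

// A 16-byte buffer that is also 16-byte aligned under either spelling.
struct SKETCH_ALIGNAS(16) AlignedBuffer {
  char bytes[16];
};
```

The same pattern applies to the other feature names listed in this section: test the name once, define a macro for both outcomes, and use only the macro afterwards.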
+
+C++11 attributes
+^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_attributes)`` or ``__has_extension(cxx_attributes)`` to
+determine if support for attribute parsing with C++11's square bracket notation
+is enabled.
+
+C++11 generalized constant expressions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_constexpr)`` to determine if support for generalized
+constant expressions (e.g., ``constexpr``) is enabled.
+
+C++11 ``decltype()``
+^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_decltype)`` or ``__has_extension(cxx_decltype)`` to
+determine if support for the ``decltype()`` specifier is enabled. C++11's
+``decltype`` does not require type-completeness of a function call expression.
+Use ``__has_feature(cxx_decltype_incomplete_return_types)`` or
+``__has_extension(cxx_decltype_incomplete_return_types)`` to determine if
+support for this feature is enabled.
+
+C++11 default template arguments in function templates
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_default_function_template_args)`` or
+``__has_extension(cxx_default_function_template_args)`` to determine if support
+for default template arguments in function templates is enabled.
+
+C++11 ``default``\ ed functions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_defaulted_functions)`` or
+``__has_extension(cxx_defaulted_functions)`` to determine if support for
+defaulted function definitions (with ``= default``) is enabled.
+
+C++11 delegating constructors
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_delegating_constructors)`` to determine if support for
+delegating constructors is enabled.
+
+C++11 ``deleted`` functions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_deleted_functions)`` or
+``__has_extension(cxx_deleted_functions)`` to determine if support for deleted
+function definitions (with ``= delete``) is enabled.
+
+C++11 explicit conversion functions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_explicit_conversions)`` to determine if support for
+``explicit`` conversion functions is enabled.
+
+C++11 generalized initializers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_generalized_initializers)`` to determine if support for
+generalized initializers (using braced lists and ``std::initializer_list``) is
+enabled.
+
+C++11 implicit move constructors/assignment operators
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_implicit_moves)`` to determine if Clang will implicitly
+generate move constructors and move assignment operators where needed.
+
+C++11 inheriting constructors
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_inheriting_constructors)`` to determine if support for
+inheriting constructors is enabled. Clang does not currently implement this
+feature.
+
+C++11 inline namespaces
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_inline_namespaces)`` or
+``__has_extension(cxx_inline_namespaces)`` to determine if support for inline
+namespaces is enabled.
+
+C++11 lambdas
+^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_lambdas)`` or ``__has_extension(cxx_lambdas)`` to
+determine if support for lambdas is enabled.
+
+C++11 local and unnamed types as template arguments
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_local_type_template_args)`` or
+``__has_extension(cxx_local_type_template_args)`` to determine if support for
+local and unnamed types as template arguments is enabled.
+
+C++11 noexcept
+^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_noexcept)`` or ``__has_extension(cxx_noexcept)`` to
+determine if support for noexcept exception specifications is enabled.
+
+C++11 in-class non-static data member initialization
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_nonstatic_member_init)`` to determine whether in-class
+initialization of non-static data members is enabled.
+
+C++11 ``nullptr``
+^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_nullptr)`` or ``__has_extension(cxx_nullptr)`` to
+determine if support for ``nullptr`` is enabled.
+
+C++11 ``override control``
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_override_control)`` or
+``__has_extension(cxx_override_control)`` to determine if support for the
+override control keywords is enabled.
+
+C++11 reference-qualified functions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_reference_qualified_functions)`` or
+``__has_extension(cxx_reference_qualified_functions)`` to determine if support
+for reference-qualified functions (e.g., member functions with ``&`` or ``&&``
+applied to ``*this``) is enabled.
+
+C++11 range-based ``for`` loop
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_range_for)`` or ``__has_extension(cxx_range_for)`` to
+determine if support for the range-based for loop is enabled.
+
+C++11 raw string literals
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_raw_string_literals)`` to determine if support for raw
+string literals (e.g., ``R"x(foo\bar)x"``) is enabled.
+
+C++11 rvalue references
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_rvalue_references)`` or
+``__has_extension(cxx_rvalue_references)`` to determine if support for rvalue
+references is enabled.
+
+C++11 ``static_assert()``
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_static_assert)`` or
+``__has_extension(cxx_static_assert)`` to determine if support for compile-time
+assertions using ``static_assert`` is enabled.
+
+C++11 type inference
+^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_auto_type)`` or ``__has_extension(cxx_auto_type)`` to
+determine if C++11 type inference is supported using the ``auto`` specifier. If
+this is disabled, ``auto`` will instead be a storage class specifier, as in C
+or C++98.
+
+C++11 strongly typed enumerations
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_strong_enums)`` or
+``__has_extension(cxx_strong_enums)`` to determine if support for strongly
+typed, scoped enumerations is enabled.
+
+C++11 trailing return type
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_trailing_return)`` or
+``__has_extension(cxx_trailing_return)`` to determine if support for the
+alternate function declaration syntax with trailing return type is enabled.
+
+C++11 Unicode string literals
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_unicode_literals)`` to determine if support for Unicode
+string literals is enabled.
+
+C++11 unrestricted unions
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_unrestricted_unions)`` to determine if support for
+unrestricted unions is enabled.
+
+C++11 user-defined literals
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_user_literals)`` to determine if support for
+user-defined literals is enabled.
+
+C++11 variadic templates
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_variadic_templates)`` or
+``__has_extension(cxx_variadic_templates)`` to determine if support for
+variadic templates is enabled.
+
+C11
+---
+
+The features listed below are part of the C11 standard. As a result, all these
+features are enabled with the ``-std=c11`` or ``-std=gnu11`` option when
+compiling C code. Additionally, because these features are all
+backward-compatible, they are available as extensions in all language modes.
+
+C11 alignment specifiers
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(c_alignas)`` or ``__has_extension(c_alignas)`` to determine
+if support for alignment specifiers using ``_Alignas`` is enabled.
+
+C11 atomic operations
+^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(c_atomic)`` or ``__has_extension(c_atomic)`` to determine
+if support for atomic types using ``_Atomic`` is enabled. Clang also provides
+:ref:`a set of builtins <langext-__c11_atomic>` which can be used to implement
+the ``<stdatomic.h>`` operations on ``_Atomic`` types.
+
+C11 generic selections
+^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(c_generic_selections)`` or
+``__has_extension(c_generic_selections)`` to determine if support for generic
+selections is enabled.
+
+As an extension, the C11 generic selection expression is available in all
+languages supported by Clang. The syntax is the same as that given in the C11
+standard.
+
+In C, type compatibility is decided according to the rules given in the
+appropriate standard, but in C++, which lacks the type compatibility rules used
+in C, types are considered compatible only if they are equivalent.
+
+C11 ``_Static_assert()``
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(c_static_assert)`` or ``__has_extension(c_static_assert)``
+to determine if support for compile-time assertions using ``_Static_assert`` is
+enabled.
+
+Checks for Type Traits
+======================
+
+Clang supports the GNU C++ type traits and a subset of the Microsoft Visual C++
+type traits. For each
+supported type trait ``__X``, ``__has_extension(X)`` indicates the presence of
+the type trait. For example:
+
+.. code-block:: c++
+
+  #if __has_extension(is_convertible_to)
+  template<typename From, typename To>
+  struct is_convertible_to {
+    static const bool value = __is_convertible_to(From, To);
+  };
+  #else
+  // Emulate type trait
+  #endif
+
+The following type traits are supported by Clang:
+
+* ``__has_nothrow_assign`` (GNU, Microsoft)
+* ``__has_nothrow_copy`` (GNU, Microsoft)
+* ``__has_nothrow_constructor`` (GNU, Microsoft)
+* ``__has_trivial_assign`` (GNU, Microsoft)
+* ``__has_trivial_copy`` (GNU, Microsoft)
+* ``__has_trivial_constructor`` (GNU, Microsoft)
+* ``__has_trivial_destructor`` (GNU, Microsoft)
+* ``__has_virtual_destructor`` (GNU, Microsoft)
+* ``__is_abstract`` (GNU, Microsoft)
+* ``__is_base_of`` (GNU, Microsoft)
+* ``__is_class`` (GNU, Microsoft)
+* ``__is_convertible_to`` (Microsoft)
+* ``__is_empty`` (GNU, Microsoft)
+* ``__is_enum`` (GNU, Microsoft)
+* ``__is_interface_class`` (Microsoft)
+* ``__is_pod`` (GNU, Microsoft)
+* ``__is_polymorphic`` (GNU, Microsoft)
+* ``__is_union`` (GNU, Microsoft)
+* ``__is_literal(type)``: Determines whether the given type is a literal type
+* ``__is_final``: Determines whether the given type is declared with a
+  ``final`` class-virt-specifier.
+* ``__underlying_type(type)``: Retrieves the underlying type for a given
+  ``enum`` type. This trait is required to implement the C++11 standard
+  library.
+* ``__is_trivially_assignable(totype, fromtype)``: Determines whether a value
+  of type ``totype`` can be assigned to from a value of type ``fromtype`` such
+  that no non-trivial functions are called as part of that assignment. This
+  trait is required to implement the C++11 standard library.
+* ``__is_trivially_constructible(type, argtypes...)``: Determines whether a
+  value of type ``type`` can be direct-initialized with arguments of types
+  ``argtypes...`` such that no non-trivial functions are called as part of
+  that initialization. This trait is required to implement the C++11 standard
+  library.
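The "emulate type trait" branch of the ``is_convertible_to`` example above could be filled in along the following lines. This is a hedged sketch, not the emulation any particular library uses: it relies on the classic overload-resolution ``sizeof`` idiom, and it does not handle ``void``, array, function or abstract class types. ``sketch_is_convertible_to``, ``Shape`` and ``Circle`` are hypothetical names.

```cpp
#ifndef __has_extension
  #define __has_extension(x) 0  // Compatibility with non-clang compilers.
#endif

template <typename From, typename To>
struct sketch_is_convertible_to {
#if __has_extension(is_convertible_to)
  // Use the compiler-provided trait when it is reported as present.
  static const bool value = __is_convertible_to(From, To);
#else
  // Emulation: overload resolution prefers test(To) over test(...) exactly
  // when a From value converts implicitly to To. Nothing is ever called;
  // only the size of the selected overload's return type is inspected.
  typedef char yes;
  struct no { char pad[2]; };
  static yes test(To);
  static no test(...);
  static From make();
  static const bool value = sizeof(test(make())) == sizeof(yes);
#endif
};

// Small hierarchy used only to exercise the trait.
struct Shape {};
struct Circle : Shape {};
```

Under either branch, ``sketch_is_convertible_to<Circle, Shape>::value`` is true (derived-to-base conversion) while ``sketch_is_convertible_to<Shape, int>::value`` is false.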
+
+Blocks
+======
+
+The syntax and high level language feature description is in
+``BlockLanguageSpec.txt``. Implementation and ABI
+details for the clang implementation are in ``Block-ABI-Apple.txt``.
+
+Query for this feature with ``__has_extension(blocks)``.
+
+Objective-C Features
+====================
+
+Related result types
+--------------------
+
+According to Cocoa conventions, Objective-C methods with certain names
+("``init``", "``alloc``", etc.) always return objects that are an instance of
+the receiving class's type. Such methods are said to have a "related result
+type", meaning that a message send to one of these methods will have the same
+static type as an instance of the receiver class. For example, given the
+following classes:
+
+.. code-block:: objc
+
+  @interface NSObject
+  + (id)alloc;
+  - (id)init;
+  @end
+
+  @interface NSArray : NSObject
+  @end
+
+and this common initialization pattern
+
+.. code-block:: objc
+
+  NSArray *array = [[NSArray alloc] init];
+
+the type of the expression ``[NSArray alloc]`` is ``NSArray*`` because
+``alloc`` implicitly has a related result type. Similarly, the type of the
+expression ``[[NSArray alloc] init]`` is ``NSArray*``, since ``init`` has a
+related result type and its receiver is known to have the type ``NSArray *``.
+If neither ``alloc`` nor ``init`` had a related result type, the expressions
+would have had type ``id``, as declared in the method signature.
+
+A method with a related result type can be declared by using the type
+``instancetype`` as its result type. ``instancetype`` is a contextual keyword
+that is only permitted in the result type of an Objective-C method, e.g.
+
+.. code-block:: objc
+
+  @interface A
+  + (instancetype)constructAnA;
+  @end
+
+The related result type can also be inferred for some methods.
+To determine
+whether a method has an inferred related result type, the first word in the
+camel-case selector (e.g., "``init``" in "``initWithObjects``") is considered,
+and the method will have a related result type if its return type is compatible
+with the type of its class and if:
+
+* the first word is "``alloc``" or "``new``", and the method is a class method,
+  or
+
+* the first word is "``autorelease``", "``init``", "``retain``", or "``self``",
+  and the method is an instance method.
+
+If a method with a related result type is overridden by a subclass method, the
+subclass method must also return a type that is compatible with the subclass
+type. For example:
+
+.. code-block:: objc
+
+  @interface NSString : NSObject
+  - (NSUnrelated *)init; // incorrect usage: NSUnrelated is not NSString or a superclass of NSString
+  @end
+
+Related result types only affect the type of a message send or property access
+via the given method. In all other respects, a method with a related result
+type is treated the same way as a method that returns ``id``.
+
+Use ``__has_feature(objc_instancetype)`` to determine whether the
+``instancetype`` contextual keyword is available.
+
+Automatic reference counting
+----------------------------
+
+Clang provides support for automated reference counting
+in Objective-C, which eliminates the need
+for manual ``retain``/``release``/``autorelease`` message sends. There are two
+feature macros associated with automatic reference counting:
+``__has_feature(objc_arc)`` indicates the availability of automated reference
+counting in general, while ``__has_feature(objc_arc_weak)`` indicates that
+automated reference counting also includes support for ``__weak`` pointers to
+Objective-C objects.
+
+Enumerations with a fixed underlying type
+-----------------------------------------
+
+Clang provides support for C++11 enumerations with a fixed underlying type
+within Objective-C. For example, one can write an enumeration type as:
+
+.. code-block:: c++
+
+  typedef enum : unsigned char { Red, Green, Blue } Color;
+
+This specifies that the underlying type, which is used to store the enumeration
+value, is ``unsigned char``.
+
+Use ``__has_feature(objc_fixed_enum)`` to determine whether support for fixed
+underlying types is available in Objective-C.
+
+Interoperability with C++11 lambdas
+-----------------------------------
+
+Clang provides interoperability between C++11 lambdas and blocks-based APIs, by
+permitting a lambda to be implicitly converted to a block pointer with the
+corresponding signature. For example, consider an API such as ``NSArray``'s
+array-sorting method:
+
+.. code-block:: objc
+
+  - (NSArray *)sortedArrayUsingComparator:(NSComparator)cmptr;
+
+``NSComparator`` is simply a typedef for the block pointer ``NSComparisonResult
+(^)(id, id)``, and parameters of this type are generally provided with block
+literals as arguments. However, one can also use a C++11 lambda so long as it
+provides the same signature (in this case, accepting two parameters of type
+``id`` and returning an ``NSComparisonResult``):
+
+.. code-block:: objc
+
+  NSArray *array = @[@"string 1", @"string 21", @"string 12", @"String 11",
+                     @"String 02"];
+  const NSStringCompareOptions comparisonOptions
+    = NSCaseInsensitiveSearch | NSNumericSearch |
+      NSWidthInsensitiveSearch | NSForcedOrderingSearch;
+  NSLocale *currentLocale = [NSLocale currentLocale];
+  NSArray *sorted
+    = [array sortedArrayUsingComparator:[=](id s1, id s2) -> NSComparisonResult {
+               NSRange string1Range = NSMakeRange(0, [s1 length]);
+               return [s1 compare:s2 options:comparisonOptions
+               range:string1Range locale:currentLocale];
+       }];
+  NSLog(@"sorted: %@", sorted);
+
+This code relies on an implicit conversion from the type of the lambda
+expression (an unnamed, local class type called the *closure type*) to the
+corresponding block pointer type.
+The conversion itself is expressed by a
+conversion operator in that closure type that produces a block pointer with the
+same signature as the lambda itself, e.g.,
+
+.. code-block:: objc
+
+  operator NSComparisonResult (^)(id, id)() const;
+
+This conversion function returns a new block that simply forwards the two
+parameters to the lambda object (which it captures by copy), then returns the
+result. The returned block is first copied (with ``Block_copy``) and then
+autoreleased. As an optimization, if a lambda expression is immediately
+converted to a block pointer (as in the first example, above), then the block
+is not copied and autoreleased: rather, it is given the same lifetime as a
+block literal written at that point in the program, which avoids the overhead
+of copying a block to the heap in the common case.
+
+The conversion from a lambda to a block pointer is only available in
+Objective-C++, and not in C++ with blocks, due to its use of Objective-C memory
+management (autorelease).
+
+Object Literals and Subscripting
+--------------------------------
+
+Clang provides support for Object Literals and Subscripting
+in Objective-C, which simplifies common Objective-C
+programming patterns, makes programs more concise, and improves the safety of
+container creation. There are several feature macros associated with object
+literals and subscripting: ``__has_feature(objc_array_literals)`` tests the
+availability of array literals; ``__has_feature(objc_dictionary_literals)``
+tests the availability of dictionary literals;
+``__has_feature(objc_subscripting)`` tests the availability of object
+subscripting.
+
+Objective-C Autosynthesis of Properties
+---------------------------------------
+
+Clang provides support for autosynthesis of declared properties. Using this
+feature, clang provides default synthesis of those properties not declared
+``@dynamic`` and not having user-provided backing getter and setter methods.
+``__has_feature(objc_default_synthesize_properties)`` checks for availability
+of this feature in the version of clang being used.
+
+Function Overloading in C
+=========================
+
+Clang provides support for C++ function overloading in C. Function overloading
+in C is introduced using the ``overloadable`` attribute. For example, one
+might provide several overloaded versions of a ``tgsin`` function that invokes
+the appropriate standard function computing the sine of a value with ``float``,
+``double``, or ``long double`` precision:
+
+.. code-block:: c
+
+  #include <math.h>
+  float __attribute__((overloadable)) tgsin(float x) { return sinf(x); }
+  double __attribute__((overloadable)) tgsin(double x) { return sin(x); }
+  long double __attribute__((overloadable)) tgsin(long double x) { return sinl(x); }
+
+Given these declarations, one can call ``tgsin`` with a ``float`` value to
+receive a ``float`` result, with a ``double`` to receive a ``double`` result,
+etc. Function overloading in C follows the rules of C++ function overloading
+to pick the best overload given the call arguments, with a few C-specific
+semantics:
+
+* Conversion from ``float`` or ``double`` to ``long double`` is ranked as a
+  floating-point promotion (per C99) rather than as a floating-point conversion
+  (as in C++).
+
+* A conversion from a pointer of type ``T*`` to a pointer of type ``U*`` is
+  considered a pointer conversion (with conversion rank) if ``T`` and ``U`` are
+  compatible types.
+
+* A conversion from type ``T`` to a value of type ``U`` is permitted if ``T``
+  and ``U`` are compatible types. This conversion is given "conversion" rank.
+
+The declaration of ``overloadable`` functions is restricted to function
+declarations and definitions. Most importantly, if any function with a given
+name is given the ``overloadable`` attribute, then all function declarations
+and definitions with that name (and in that scope) must have the
+``overloadable`` attribute.
+This rule even applies to redeclarations of
+functions whose original declaration had the ``overloadable`` attribute, e.g.,
+
+.. code-block:: c
+
+  int f(int) __attribute__((overloadable));
+  float f(float); // error: declaration of "f" must have the "overloadable" attribute
+
+  int g(int) __attribute__((overloadable));
+  int g(int) { } // error: redeclaration of "g" must also have the "overloadable" attribute
+
+Functions marked ``overloadable`` must have prototypes. Therefore, the
+following code is ill-formed:
+
+.. code-block:: c
+
+  int h() __attribute__((overloadable)); // error: h does not have a prototype
+
+However, ``overloadable`` functions are allowed to use an ellipsis even if there
+are no named parameters (as is permitted in C++). This feature is particularly
+useful when combined with the ``unavailable`` attribute:
+
+.. code-block:: c++
+
+  void honeypot(...) __attribute__((overloadable, unavailable)); // calling me is an error
+
+Functions declared with the ``overloadable`` attribute have their names mangled
+according to the same rules as C++ function names. For example, the three
+``tgsin`` functions in our motivating example get the mangled names
+``_Z5tgsinf``, ``_Z5tgsind``, and ``_Z5tgsine``, respectively. There are two
+caveats to this use of name mangling:
+
+* Future versions of Clang may change the name mangling of functions overloaded
+  in C, so you should not depend on a specific mangling. To be completely
+  safe, we strongly urge the use of ``static inline`` with ``overloadable``
+  functions.
+
+* The ``overloadable`` attribute has almost no meaning when used in C++,
+  because names will already be mangled and functions are already overloadable.
+  However, when an ``overloadable`` function occurs within an ``extern "C"``
+  linkage specification, its name *will* be mangled in the same way as it
+  would in C.
+
+Query for this feature with ``__has_extension(attribute_overloadable)``.
+ +Initializer lists for complex numbers in C +========================================== + +clang supports an extension which allows the following in C: + +.. code-block:: c++ + + #include <math.h> + #include <complex.h> + complex float x = { 1.0f, INFINITY }; // Init to (1, Inf) + +This construct is useful because there is no way to separately initialize the +real and imaginary parts of a complex variable in standard C, given that clang +does not support ``_Imaginary``. (Clang also supports the ``__real__`` and +``__imag__`` extensions from gcc, which help in some cases, but are not usable +in static initializers.) + +Note that this extension does not allow eliding the braces; the meaning of the +following two lines is different: + +.. code-block:: c++ + + complex float x[] = { { 1.0f, 1.0f } }; // [0] = (1, 1) + complex float x[] = { 1.0f, 1.0f }; // [0] = (1, 0), [1] = (1, 0) + +This extension also works in C++ mode, as far as that goes, but does not apply +to the C++ ``std::complex``. (In C++11, list initialization allows the same +syntax to be used with ``std::complex`` with the same meaning.) + +Builtin Functions +================= + +Clang supports a number of builtin library functions with the same syntax as +GCC, including things like ``__builtin_nan``, ``__builtin_constant_p``, +``__builtin_choose_expr``, ``__builtin_types_compatible_p``, +``__sync_fetch_and_add``, etc. In addition to the GCC builtins, Clang supports +a number of builtins that GCC does not, which are listed here. + +Please note that Clang does not and will not support all of the GCC builtins +for vector operations. Instead of using builtins, you should use the functions +defined in target-specific header files like ````, which define +portable wrappers for these. Many of the Clang versions of these functions are +implemented directly in terms of :ref:`extended vector support +` instead of builtins, in order to reduce the number of +builtins that we need to implement. 
+ +``__builtin_readcyclecounter`` +------------------------------ + +``__builtin_readcyclecounter`` is used to access the cycle counter register (or +a similar low-latency, high-accuracy clock) on those targets that support it. + +**Syntax**: + +.. code-block:: c++ + + __builtin_readcyclecounter() + +**Example of Use**: + +.. code-block:: c++ + + unsigned long long t0 = __builtin_readcyclecounter(); + do_something(); + unsigned long long t1 = __builtin_readcyclecounter(); + unsigned long long cycles_to_do_something = t1 - t0; // assuming no overflow + +**Description**: + +The ``__builtin_readcyclecounter()`` builtin returns the cycle counter value, +which may be either global or process/thread-specific depending on the target. +As the backing counters often overflow quickly (on the order of seconds) this +should only be used for timing small intervals. When not supported by the +target, the return value is always zero. This builtin takes no arguments and +produces an unsigned long long result. + +Query for this feature with ``__has_builtin(__builtin_readcyclecounter)``. + +.. _langext-__builtin_shufflevector: + +``__builtin_shufflevector`` +--------------------------- + +``__builtin_shufflevector`` is used to express generic vector +permutation/shuffle/swizzle operations. This builtin is also very important +for the implementation of various target-specific header files like +````. + +**Syntax**: + +.. code-block:: c++ + + __builtin_shufflevector(vec1, vec2, index1, index2, ...) + +**Examples**: + +.. code-block:: c++ + + // Identity operation - return 4-element vector V1. + __builtin_shufflevector(V1, V1, 0, 1, 2, 3) + + // "Splat" element 0 of V1 into a 4-element result. + __builtin_shufflevector(V1, V1, 0, 0, 0, 0) + + // Reverse 4-element vector V1. + __builtin_shufflevector(V1, V1, 3, 2, 1, 0) + + // Concatenate every other element of 4-element vectors V1 and V2. 
+ __builtin_shufflevector(V1, V2, 0, 2, 4, 6) + + // Concatenate every other element of 8-element vectors V1 and V2. + __builtin_shufflevector(V1, V2, 0, 2, 4, 6, 8, 10, 12, 14) + +**Description**: + +The first two arguments to ``__builtin_shufflevector`` are vectors that have +the same element type. The remaining arguments are a list of integers that +specify the element indices of the first two vectors that should be extracted +and returned in a new vector. These element indices are numbered sequentially +starting with the first vector, continuing into the second vector. Thus, if +``vec1`` is a 4-element vector, index 5 would refer to the second element of +``vec2``. + +The result of ``__builtin_shufflevector`` is a vector with the same element +type as ``vec1``/``vec2`` but that has an element count equal to the number of +indices specified. + +Query for this feature with ``__has_builtin(__builtin_shufflevector)``. + +``__builtin_unreachable`` +------------------------- + +``__builtin_unreachable`` is used to indicate that a specific point in the +program cannot be reached, even if the compiler might otherwise think it can. +This helps optimization and eliminates certain warnings. For +example, without the ``__builtin_unreachable`` in the example below, the +compiler assumes that the inline asm can fall through and prints a "function +declared '``noreturn``' should not return" warning. + +**Syntax**: + +.. code-block:: c++ + + __builtin_unreachable() + +**Example of use**: + +.. code-block:: c++ + + void myabort(void) __attribute__((noreturn)); + void myabort(void) { + asm("int3"); + __builtin_unreachable(); + } + +**Description**: + +The ``__builtin_unreachable()`` builtin has completely undefined behavior. +Since it has undefined behavior, it acts as a promise that this point is never +reached, and the optimizer can take advantage of this to produce better code. +This builtin takes no arguments and produces a void result. 
+ +Query for this feature with ``__has_builtin(__builtin_unreachable)``. + +``__sync_swap`` +--------------- + +``__sync_swap`` is used to atomically swap integers or pointers in memory. + +**Syntax**: + +.. code-block:: c++ + + type __sync_swap(type *ptr, type value, ...) + +**Example of Use**: + +.. code-block:: c++ + + int old_value = __sync_swap(&value, new_value); + +**Description**: + +The ``__sync_swap()`` builtin extends the existing ``__sync_*()`` family of +atomic intrinsics to allow code to atomically swap the current value with the +new value. More importantly, it helps developers write more efficient and +correct code by avoiding expensive loops around +``__sync_bool_compare_and_swap()`` or relying on the platform-specific +implementation details of ``__sync_lock_test_and_set()``. The +``__sync_swap()`` builtin is a full barrier. + +.. _langext-__c11_atomic: + +__c11_atomic builtins +--------------------- + +Clang provides a set of builtins which are intended to be used to implement +C11's ``<stdatomic.h>`` header. These builtins provide the semantics of the +``_explicit`` form of the corresponding C11 operation, and are named with a +``__c11_`` prefix. The supported operations are: + +* ``__c11_atomic_init`` +* ``__c11_atomic_thread_fence`` +* ``__c11_atomic_signal_fence`` +* ``__c11_atomic_is_lock_free`` +* ``__c11_atomic_store`` +* ``__c11_atomic_load`` +* ``__c11_atomic_exchange`` +* ``__c11_atomic_compare_exchange_strong`` +* ``__c11_atomic_compare_exchange_weak`` +* ``__c11_atomic_fetch_add`` +* ``__c11_atomic_fetch_sub`` +* ``__c11_atomic_fetch_and`` +* ``__c11_atomic_fetch_or`` +* ``__c11_atomic_fetch_xor`` + +Non-standard C++11 Attributes +============================= + +Clang supports one non-standard C++11 attribute. It resides in the ``clang`` +attribute namespace. 
+ +The ``clang::fallthrough`` attribute +------------------------------------ + +The ``clang::fallthrough`` attribute is used along with the +``-Wimplicit-fallthrough`` argument to annotate intentional fall-through +between switch labels. It can only be applied to a null statement placed at a +point of execution between any statement and the next switch label. It is +common to mark these places with a specific comment, but this attribute is +meant to replace comments with a more strict annotation, which can be checked +by the compiler. This attribute doesn't change semantics of the code and can +be used wherever an intended fall-through occurs. It is designed to mimic +control-flow statements like ``break;``, so it can be placed in most places +where ``break;`` can, but only if there are no statements on the execution path +between it and the next switch label. + +Here is an example: + +.. code-block:: c++ + + // compile with -Wimplicit-fallthrough + switch (n) { + case 22: + case 33: // no warning: no statements between case labels + f(); + case 44: // warning: unannotated fall-through + g(); + [[clang::fallthrough]]; + case 55: // no warning + if (x) { + h(); + break; + } + else { + i(); + [[clang::fallthrough]]; + } + case 66: // no warning + p(); + [[clang::fallthrough]]; // warning: fallthrough annotation does not + // directly precede case label + q(); + case 77: // warning: unannotated fall-through + r(); + } + +Target-Specific Extensions +========================== + +Clang supports some language features conditionally on some targets. + +X86/X86-64 Language Extensions +------------------------------ + +The X86 backend has these language extensions: + +Memory references off the GS segment +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Annotating a pointer with address space #256 causes it to be code generated +relative to the X86 GS segment register, and address space #257 causes it to be +relative to the X86 FS segment. 
Note that this is a very low-level +feature that should only be used if you know what you're doing (for example in +an OS kernel). + +Here is an example: + +.. code-block:: c++ + + #define GS_RELATIVE __attribute__((address_space(256))) + int foo(int GS_RELATIVE *P) { + return *P; + } + +This compiles to (on X86-32): + +.. code-block:: gas + + _foo: + movl 4(%esp), %eax + movl %gs:(%eax), %eax + ret + +Static Analysis-Specific Extensions +=================================== + +Clang supports additional attributes that are useful for documenting program +invariants and rules for static analysis tools. The extensions documented here +are used by the `path-sensitive static analyzer engine +`_ that is part of Clang's Analysis +library. + +The ``analyzer_noreturn`` attribute +----------------------------------- + +Clang's static analysis engine understands the standard ``noreturn`` attribute. +This attribute, which is typically affixed to a function prototype, indicates +that a call to a given function never returns. Function prototypes for common +functions like ``exit`` are typically annotated with this attribute, as well as +a variety of common assertion handlers. Users can educate the static analyzer +about their own custom assertion handlers (thus cutting down on false positives +due to false paths) by marking their own "panic" functions with this attribute. + +While useful, ``noreturn`` is not applicable in all cases. Sometimes there are +special functions that for all intents and purposes should be considered panic +functions (i.e., they are only called when an internal program error occurs) +but may actually return so that the program can fail gracefully. The +``analyzer_noreturn`` attribute allows one to annotate such functions so that +the analyzer interprets them as "no return" functions (thus pruning bogus +paths) without affecting compilation (as ``noreturn`` would). 
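The distinction between ``noreturn`` and ``analyzer_noreturn`` can be sketched as follows. This is a minimal illustration under stated assumptions, not code from Clang: ``ANALYZER_NORETURN``, ``report_panic``, ``panic_count`` and ``first_char`` are hypothetical names, and the macro expands to nothing on compilers that do not report the attribute through ``__has_attribute``.

```cpp
#include <cstdio>

#ifndef __has_attribute
#  define __has_attribute(x) 0  // Compatibility with compilers lacking __has_attribute.
#endif

// Hypothetical wrapper macro; empty when the attribute is unavailable.
#if __has_attribute(analyzer_noreturn)
#  define ANALYZER_NORETURN __attribute__((analyzer_noreturn))
#else
#  define ANALYZER_NORETURN
#endif

static int panic_count = 0;  // hypothetical bookkeeping for this sketch

// A "panic" function that logs the problem but does return, so that the
// caller can fail gracefully. Only the analyzer treats it as not returning.
void report_panic(const char *msg) ANALYZER_NORETURN;

void report_panic(const char *msg) {
  std::fprintf(stderr, "internal error: %s\n", msg);
  ++panic_count;
}

int first_char(const char *s) {
  if (!s) {
    report_panic("null string");
    return -1;  // Still executed at run time; the analyzer prunes this path.
  }
  return s[0];
}
```

Compilation is unaffected by the annotation: ``first_char`` still returns ``-1`` for a null argument, which plain ``noreturn`` would not permit.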
+ +**Usage**: The ``analyzer_noreturn`` attribute can be placed in the same places +where the ``noreturn`` attribute can be placed. It is commonly placed at the +end of function prototypes: + +.. code-block:: c++ + + void foo() __attribute__((analyzer_noreturn)); + +Query for this feature with ``__has_attribute(analyzer_noreturn)``. + +.. _langext-objc_method_family: + +The ``objc_method_family`` attribute +------------------------------------ + +Many methods in Objective-C have conventional meanings determined by their +selectors. For the purposes of static analysis, it is sometimes useful to be +able to mark a method as having a particular conventional meaning despite not +having the right selector, or as not having the conventional meaning that its +selector would suggest. For these use cases, we provide an attribute to +specifically describe the "method family" that a method belongs to. + +**Usage**: ``__attribute__((objc_method_family(X)))``, where ``X`` is one of +``none``, ``alloc``, ``copy``, ``init``, ``mutableCopy``, or ``new``. This +attribute can only be placed at the end of a method declaration: + +.. code-block:: objc + + - (NSString*) initMyStringValue __attribute__((objc_method_family(none))); + +Users who do not wish to change the conventional meaning of a method, and who +merely want to document its non-standard retain and release semantics, should +use the :ref:`retaining behavior attributes ` +described below. + +Query for this feature with ``__has_attribute(objc_method_family)``. + +.. _langext-objc-retain-release: + +Objective-C retaining behavior attributes +----------------------------------------- + +In Objective-C, functions and methods are generally assumed to take and return +objects with +0 retain counts, with some exceptions for special methods like +``+alloc`` and ``init``. 
However, there are exceptions, and so Clang provides +attributes to allow these exceptions to be documented, which helps the analyzer +find leaks (and ignore non-leaks). Some exceptions may be better described +using the :ref:`objc_method_family ` attribute +instead. + +**Usage**: The ``ns_returns_retained``, ``ns_returns_not_retained``, + ``ns_returns_autoreleased``, ``cf_returns_retained``, and + ``cf_returns_not_retained`` attributes can be placed on methods and functions + that return Objective-C or CoreFoundation objects. They are commonly placed + at the end of a function prototype or method declaration: + +.. code-block:: objc + + id foo() __attribute__((ns_returns_retained)); + + - (NSString*) bar: (int) x __attribute__((ns_returns_retained)); + +The ``*_returns_retained`` attributes specify that the returned object has a +1 +retain count. The ``*_returns_not_retained`` attributes specify that the returned +object has a +0 retain count, even if the normal convention for its selector +would be +1. ``ns_returns_autoreleased`` specifies that the returned object is ++0, but is guaranteed to live at least as long as the next flush of an +autorelease pool. + +**Usage**: The ``ns_consumed`` and ``cf_consumed`` attributes can be placed on +a parameter declaration; they specify that the argument is expected to have a ++1 retain count, which will be balanced in some way by the function or method. +The ``ns_consumes_self`` attribute can only be placed on an Objective-C +method; it specifies that the method expects its ``self`` parameter to have a ++1 retain count, which it will balance in some way. + +.. code-block:: objc + + void foo(__attribute__((ns_consumed)) NSString *string); + + - (void) bar __attribute__((ns_consumes_self)); + - (void) baz: (id) __attribute__((ns_consumed)) x; + +Query for these features with ``__has_attribute(ns_consumed)``, +``__has_attribute(ns_returns_retained)``, etc. 
+ +Dynamic Analysis-Specific Extensions +==================================== + +.. _langext-address_sanitizer: + +AddressSanitizer +---------------- + +Use ``__has_feature(address_sanitizer)`` to check if the code is being built +with `AddressSanitizer `_. + +Use ``__attribute__((no_address_safety_analysis))`` on a function declaration +to specify that address safety instrumentation (e.g. AddressSanitizer) should +not be applied to that function. + +Thread-Safety Annotation Checking +================================= + +Clang supports additional attributes for checking basic locking policies in +multithreaded programs. Clang currently parses the following list of +attributes, although **the implementation for these annotations is currently in +development.** For more details, see the `GCC implementation +`_. + +``no_thread_safety_analysis`` +----------------------------- + +Use ``__attribute__((no_thread_safety_analysis))`` on a function declaration to +specify that the thread safety analysis should not be run on that function. +This attribute provides an escape hatch (e.g. for situations when it is +difficult to annotate the locking policy). + +``lockable`` +------------ + +Use ``__attribute__((lockable))`` on a class definition to specify that it has +a lockable type (e.g. a Mutex class). This annotation is primarily used to +check consistency. + +``scoped_lockable`` +------------------- + +Use ``__attribute__((scoped_lockable))`` on a class definition to specify that +it has a "scoped" lockable type. Objects of this type will acquire the lock +upon construction and release it upon going out of scope. This annotation is +primarily used to check consistency. + +``guarded_var`` +--------------- + +Use ``__attribute__((guarded_var))`` on a variable declaration to specify that +the variable must be accessed while holding some lock. 
+ +``pt_guarded_var`` +------------------ + +Use ``__attribute__((pt_guarded_var))`` on a pointer declaration to specify +that the pointer must be dereferenced while holding some lock. + +``guarded_by(l)`` +----------------- + +Use ``__attribute__((guarded_by(l)))`` on a variable declaration to specify +that the variable must be accessed while holding lock ``l``. + +``pt_guarded_by(l)`` +-------------------- + +Use ``__attribute__((pt_guarded_by(l)))`` on a pointer declaration to specify +that the pointer must be dereferenced while holding lock ``l``. + +``acquired_before(...)`` +------------------------ + +Use ``__attribute__((acquired_before(...)))`` on a declaration of a lockable +variable to specify that the lock must be acquired before all attribute +arguments. Arguments must be lockable type, and there must be at least one +argument. + +``acquired_after(...)`` +----------------------- + +Use ``__attribute__((acquired_after(...)))`` on a declaration of a lockable +variable to specify that the lock must be acquired after all attribute +arguments. Arguments must be lockable type, and there must be at least one +argument. + +``exclusive_lock_function(...)`` +-------------------------------- + +Use ``__attribute__((exclusive_lock_function(...)))`` on a function declaration +to specify that the function acquires all listed locks exclusively. This +attribute takes zero or more arguments: either of lockable type or integers +indexing into function parameters of lockable type. If no arguments are given, +the acquired lock is implicitly ``this`` of the enclosing object. + +``shared_lock_function(...)`` +----------------------------- + +Use ``__attribute__((shared_lock_function(...)))`` on a function declaration to +specify that the function acquires all listed locks, although the locks may be +shared (e.g. read locks). This attribute takes zero or more arguments: either +of lockable type or integers indexing into function parameters of lockable +type. 
If no arguments are given, the acquired lock is implicitly ``this`` of +the enclosing object. + +``exclusive_trylock_function(...)`` +----------------------------------- + +Use ``__attribute__((exclusive_trylock_function(...)))`` on a function declaration +to specify that the function will try (without blocking) to acquire all listed +locks exclusively. This attribute takes one or more arguments. The first +argument is an integer or boolean value specifying the return value of a +successful lock acquisition. The remaining arguments are either of lockable +type or integers indexing into function parameters of lockable type. If only +one argument is given, the acquired lock is implicitly ``this`` of the +enclosing object. + +``shared_trylock_function(...)`` +-------------------------------- + +Use ``__attribute__((shared_trylock_function(...)))`` on a function declaration to +specify that the function will try (without blocking) to acquire all listed +locks, although the locks may be shared (e.g. read locks). This attribute +takes one or more arguments. The first argument is an integer or boolean value +specifying the return value of a successful lock acquisition. The remaining +arguments are either of lockable type or integers indexing into function +parameters of lockable type. If only one argument is given, the acquired lock +is implicitly ``this`` of the enclosing object. + +``unlock_function(...)`` +------------------------ + +Use ``__attribute__((unlock_function(...)))`` on a function declaration to +specify that the function releases all listed locks. This attribute takes zero +or more arguments: either of lockable type or integers indexing into function +parameters of lockable type. If no arguments are given, the released lock is +implicitly ``this`` of the enclosing object. 
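To show how several of the attributes above fit together, here is a minimal sketch under stated assumptions. ``TS_ATTR``, ``ToyMutex`` and ``Counter`` are hypothetical names invented for this illustration, and ``ToyMutex`` is not a real mutex: it only records whether it is held, so the sketch is single-threaded. The macro expands to nothing when the compiler does not report the attributes through ``__has_attribute``.

```cpp
#ifndef __has_attribute
#  define __has_attribute(x) 0  // Compatibility with compilers lacking __has_attribute.
#endif

// Hypothetical wrapper macro: apply an attribute only where all of the
// attributes used below are reported as supported.
#if __has_attribute(lockable) && __has_attribute(guarded_by) && \
    __has_attribute(exclusive_lock_function) && __has_attribute(unlock_function)
#  define TS_ATTR(x) __attribute__((x))
#else
#  define TS_ATTR(x)
#endif

// Toy stand-in for a mutex class; a real one would wrap a platform lock.
class TS_ATTR(lockable) ToyMutex {
public:
  ToyMutex() : held_(false) {}
  // No arguments: the lock acquired or released is implicitly `this`.
  void Lock() TS_ATTR(exclusive_lock_function());
  void Unlock() TS_ATTR(unlock_function());
  bool IsHeld() const { return held_; }

private:
  bool held_;
};

void ToyMutex::Lock() { held_ = true; }
void ToyMutex::Unlock() { held_ = false; }

class Counter {
public:
  Counter() : value_(0) {}

  void Increment() {
    mu_.Lock();
    ++value_;  // value_ is accessed while holding mu_
    mu_.Unlock();
  }

  int Get() {
    mu_.Lock();
    int v = value_;
    mu_.Unlock();
    return v;
  }

  bool IsLocked() const { return mu_.IsHeld(); }

private:
  ToyMutex mu_;
  int value_ TS_ATTR(guarded_by(mu_));  // must be accessed while holding mu_
};
```

The attributes are placed on declarations only, with the member function definitions written separately, which matches the "on a function declaration" wording used throughout this section.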
+ +``lock_returned(l)`` +-------------------- + +Use ``__attribute__((lock_returned(l)))`` on a function declaration to specify +that the function returns lock ``l`` (``l`` must be of lockable type). This +annotation is used to aid in resolving lock expressions. + +``locks_excluded(...)`` +----------------------- + +Use ``__attribute__((locks_excluded(...)))`` on a function declaration to +specify that the function must not be called with the listed locks held. Arguments +must be of lockable type, and there must be at least one argument. + +``exclusive_locks_required(...)`` +--------------------------------- + +Use ``__attribute__((exclusive_locks_required(...)))`` on a function +declaration to specify that the function must be called while holding the +listed exclusive locks. Arguments must be of lockable type, and there must be at +least one argument. + +``shared_locks_required(...)`` +------------------------------ + +Use ``__attribute__((shared_locks_required(...)))`` on a function declaration +to specify that the function must be called while holding the listed shared +locks. Arguments must be of lockable type, and there must be at least one +argument. + +Type Safety Checking +==================== + +Clang supports additional attributes to enable checking type safety properties +that can't be enforced by the C type system. Use cases include: + +* MPI library implementations, where these attributes enable checking that + the buffer type matches the passed ``MPI_Datatype``; +* the HDF5 library, which has a use case similar to MPI's; +* checking types of variadic functions' arguments for functions like + ``fcntl()`` and ``ioctl()``. + +You can detect support for these attributes with ``__has_attribute()``. For +example: + +.. 
code-block:: c++ + + #if defined(__has_attribute) + # if __has_attribute(argument_with_type_tag) && \ + __has_attribute(pointer_with_type_tag) && \ + __has_attribute(type_tag_for_datatype) + # define ATTR_MPI_PWT(buffer_idx, type_idx) __attribute__((pointer_with_type_tag(mpi,buffer_idx,type_idx))) + /* ... other macros ... */ + # endif + #endif + + #if !defined(ATTR_MPI_PWT) + # define ATTR_MPI_PWT(buffer_idx, type_idx) + #endif + + int MPI_Send(void *buf, int count, MPI_Datatype datatype /*, other args omitted */) + ATTR_MPI_PWT(1,3); + +``argument_with_type_tag(...)`` +------------------------------- + +Use ``__attribute__((argument_with_type_tag(arg_kind, arg_idx, +type_tag_idx)))`` on a function declaration to specify that the function +accepts a type tag that determines the type of some other argument. +``arg_kind`` is an identifier that should be used when annotating all +applicable type tags. + +This attribute is primarily useful for checking arguments of variadic functions +(``pointer_with_type_tag`` can be used in most of non-variadic cases). + +For example: + +.. code-block:: c++ + + int fcntl(int fd, int cmd, ...) + __attribute__(( argument_with_type_tag(fcntl,3,2) )); + +``pointer_with_type_tag(...)`` +------------------------------ + +Use ``__attribute__((pointer_with_type_tag(ptr_kind, ptr_idx, type_tag_idx)))`` +on a function declaration to specify that the function accepts a type tag that +determines the pointee type of some other pointer argument. + +For example: + +.. code-block:: c++ + + int MPI_Send(void *buf, int count, MPI_Datatype datatype /*, other args omitted */) + __attribute__(( pointer_with_type_tag(mpi,1,3) )); + +``type_tag_for_datatype(...)`` +------------------------------ + +Clang supports annotating type tags of two forms. + +* **Type tag that is an expression containing a reference to some declared + identifier.** Use ``__attribute__((type_tag_for_datatype(kind, type)))`` on a + declaration with that identifier: + + .. 
code-block:: c++ + + extern struct mpi_datatype mpi_datatype_int + __attribute__(( type_tag_for_datatype(mpi,int) )); + #define MPI_INT ((MPI_Datatype) &mpi_datatype_int) + +* **Type tag that is an integral literal.** Introduce a ``static const`` + variable with a corresponding initializer value and attach + ``__attribute__((type_tag_for_datatype(kind, type)))`` to that declaration, + for example: + + .. code-block:: c++ + + #define MPI_INT ((MPI_Datatype) 42) + static const MPI_Datatype mpi_datatype_int + __attribute__(( type_tag_for_datatype(mpi,int) )) = 42; + +The attribute also accepts an optional third argument that determines how the +expression is compared to the type tag. There are two supported flags: + +* ``layout_compatible`` will cause types to be compared according to + layout-compatibility rules (C++11 [class.mem] p 17, 18). This is + implemented to support annotating types like ``MPI_DOUBLE_INT``. + + For example: + + .. code-block:: c++ + + /* In mpi.h */ + struct internal_mpi_double_int { double d; int i; }; + extern struct mpi_datatype mpi_datatype_double_int + __attribute__(( type_tag_for_datatype(mpi, struct internal_mpi_double_int, layout_compatible) )); + + #define MPI_DOUBLE_INT ((MPI_Datatype) &mpi_datatype_double_int) + + /* In user code */ + struct my_pair { double a; int b; }; + struct my_pair *buffer; + MPI_Send(buffer, 1, MPI_DOUBLE_INT /*, ... */); // no warning + + struct my_int_pair { int a; int b; }; + struct my_int_pair *buffer2; + MPI_Send(buffer2, 1, MPI_DOUBLE_INT /*, ... */); // warning: actual buffer element + // type 'struct my_int_pair' + // doesn't match specified MPI_Datatype + +* ``must_be_null`` specifies that the expression should be a null pointer + constant, for example: + + .. 
code-block:: c++ + + /* In mpi.h */ + extern struct mpi_datatype mpi_datatype_null + __attribute__(( type_tag_for_datatype(mpi, void, must_be_null) )); + + #define MPI_DATATYPE_NULL ((MPI_Datatype) &mpi_datatype_null) + + /* In user code */ + MPI_Send(buffer, 1, MPI_DATATYPE_NULL /*, ... */); // warning: MPI_DATATYPE_NULL + // was specified but buffer + // is not a null pointer + diff --git a/docs/LibASTMatchers.html b/docs/LibASTMatchers.html deleted file mode 100644 index 8142c191a3..0000000000 --- a/docs/LibASTMatchers.html +++ /dev/null @@ -1,130 +0,0 @@ - - - -Matching the Clang AST - - - - - - - -
- -

Matching the Clang AST

-

This document explains how to use Clang's LibASTMatchers to match interesting -nodes of the AST and execute code that uses the matched nodes. Combined with -LibTooling, LibASTMatchers helps to write -code-to-code transformation tools or query tools.

- -

We assume basic knowledge about the Clang AST. See the -Introduction to the Clang AST if -you want to learn more about how the AST is structured.

- - - - -

Introduction

- - -

LibASTMatchers provides a domain specific language to create predicates on Clang's -AST. This DSL is written in and can be used from C++, allowing users to write -a single program to both match AST nodes and access the node's C++ interface -to extract attributes, source locations, or any other information provided on -the AST level.

- -

AST matchers are predicates on nodes in the AST. Matchers are created -by calling creator functions that allow building up a tree of matchers, where -inner matchers are used to make the match more specific.

- -

For example, to create a matcher that matches all class or union declarations -in the AST of a translation unit, you can call -recordDecl(). -To narrow the match down, for example to find all class or union declarations with the name "Foo", -insert a hasName -matcher: the call recordDecl(hasName("Foo")) returns a matcher that matches classes -or unions that are named "Foo", in any namespace. By default, matchers that accept -multiple inner matchers use an implicit allOf(). -This allows further narrowing down the match, for example to match all classes -that are derived from "Bar": recordDecl(hasName("Foo"), isDerivedFrom("Bar")).

- - -

How to create a matcher

- - -

With more than a thousand classes in the Clang AST, one can quickly get lost -when trying to figure out how to create a matcher for a specific pattern. This -section will teach you how to use a rigorous step-by-step pattern to build the -matcher you are interested in. Note that there will always be matchers missing -for some part of the AST. See the section about how to write -your own AST matchers later in this document.

- -

The precondition to using the matchers is to understand what the AST -for what you want to match looks like. The Introduction to the Clang AST -teaches you how to dump a translation unit's AST into a human-readable format.

- - - - -

In general, the strategy to create the right matchers is:

-
    -
  1. Find the outermost class in Clang's AST you want to match.
  2. Look at the AST Matcher Reference for matchers that either match the -node you're interested in or narrow down attributes on the node.
  3. Create your outer match expression. Verify that it works as expected.
  4. Examine the matchers for what the next inner node you want to match is.
  5. Repeat until the matcher is finished.
- - -

Binding nodes in match expressions

- - -

Matcher expressions allow you to specify which parts of the AST are interesting -for a certain task. Often you will want to then do something with the nodes -that were matched, like building source code transformations.

- -

To that end, matchers that match specific AST nodes (so called node matchers) -are bindable; for example, recordDecl(hasName("MyClass")).bind("id") will bind -the matched recordDecl node to the string "id", to be later retrieved in the -match callback.

- - - - - -

Writing your own matchers

- - -

There are multiple different ways to define a matcher, depending on its -type and flexibility.

-
    -
  • VariadicDynCastAllOfMatcher<Base, Derived>

    Those match all nodes -of type Base if they can be dynamically cast to Derived. The -names of those matchers are nouns, which closely resemble Derived. -VariadicDynCastAllOfMatchers are the backbone of the matcher hierarchy. Most -often, your match expression will start with one of them, and you can -bind the node they represent to ids for later processing.

    -

    VariadicDynCastAllOfMatchers are callable classes that model variadic -template functions in C++03. They take an arbitrary number of Matcher<Derived> -and return a Matcher<Base>.

  • -
  • AST_MATCHER_P(Type, Name, ParamType, Param)

    Most matcher definitions -use the matcher creation macros. Those define both the matcher of type Matcher<Type> -itself, and a matcher-creation function named Name that takes a parameter -of type ParamType and returns the corresponding matcher.

    -

    There are multiple matcher definition macros that deal with polymorphic return -values and different parameter counts. See ASTMatchersMacros.h. -

  • -
  • Matcher creation functions

    Matchers are generated by nesting -calls to matcher creation functions. Most of the time those functions are either -created by using VariadicDynCastAllOfMatcher or the matcher creation macros -(see below). The free-standing functions are an indication that this matcher -is just a combination of other matchers, as is for example the case with -callee.

  • -
- -
- - -
diff --git a/docs/LibASTMatchers.rst b/docs/LibASTMatchers.rst
new file mode 100644
index 0000000000..56a1a7fa4b
--- /dev/null
+++ b/docs/LibASTMatchers.rst
@@ -0,0 +1,134 @@
+======================
+Matching the Clang AST
+======================
+
+This document explains how to use Clang's LibASTMatchers to match interesting
+nodes of the AST and execute code that uses the matched nodes. Combined with
+:doc:`LibTooling`, LibASTMatchers helps to write code-to-code transformation
+tools or query tools.
+
+We assume basic knowledge about the Clang AST. See the `Introduction to the
+Clang AST `_ if you want to learn more about
+how the AST is structured.
+
+.. FIXME: create tutorial and link to the tutorial
+
+Introduction
+------------
+
+LibASTMatchers provides a domain specific language to create predicates on
+Clang's AST. This DSL is written in and can be used from C++, allowing users
+to write a single program to both match AST nodes and access the node's C++
+interface to extract attributes, source locations, or any other information
+provided on the AST level.
+
+AST matchers are predicates on nodes in the AST. Matchers are created by
+calling creator functions that allow building up a tree of matchers, where
+inner matchers are used to make the match more specific.
+
+For example, to create a matcher that matches all class or union declarations
+in the AST of a translation unit, you can call `recordDecl()
+`_. To narrow the match down,
+for example to find all class or union declarations with the name "``Foo``",
+insert a `hasName `_ matcher: the
+call ``recordDecl(hasName("Foo"))`` returns a matcher that matches classes or
+unions that are named "``Foo``", in any namespace. By default, matchers that
+accept multiple inner matchers use an implicit `allOf()
+`_. This allows further narrowing
+down the match, for example to match all classes that are derived from
+"``Bar``": ``recordDecl(hasName("Foo"), isDerivedFrom("Bar"))``.
+
+How to create a matcher
+-----------------------
+
+With more than a thousand classes in the Clang AST, one can quickly get lost
+when trying to figure out how to create a matcher for a specific pattern. This
+section will teach you how to use a rigorous step-by-step pattern to build the
+matcher you are interested in. Note that there will always be matchers missing
+for some part of the AST. See the section about :ref:`how to write your own
+AST matchers <astmatchers-writing>` later in this document.
+
+.. FIXME: why is it linking back to the same section?!
+
+The precondition to using the matchers is to understand how the AST for what you
+want to match looks like. The
+`Introduction to the Clang AST `_ teaches you
+how to dump a translation unit's AST into a human readable format.
+
+.. FIXME: Introduce link to ASTMatchersTutorial.html
+.. FIXME: Introduce link to ASTMatchersCookbook.html
+
+In general, the strategy to create the right matchers is:
+
+#. Find the outermost class in Clang's AST you want to match.
+#. Look at the `AST Matcher Reference `_ for
+   matchers that either match the node you're interested in or narrow down
+   attributes on the node.
+#. Create your outer match expression. Verify that it works as expected.
+#. Examine the matchers for what the next inner node you want to match is.
+#. Repeat until the matcher is finished.
+
+.. _astmatchers-bind:
+
+Binding nodes in match expressions
+----------------------------------
+
+Matcher expressions allow you to specify which parts of the AST are interesting
+for a certain task. Often you will want to then do something with the nodes
+that were matched, like building source code transformations.
+
+To that end, matchers that match specific AST nodes (so called node matchers)
+are bindable; for example, ``recordDecl(hasName("MyClass")).bind("id")`` will
+bind the matched ``recordDecl`` node to the string "``id``", to be later
+retrieved in the `match callback
+`_.
+
+.. FIXME: Introduce link to ASTMatchersTutorial.html
+.. FIXME: Introduce link to ASTMatchersCookbook.html
+
+Writing your own matchers
+-------------------------
+
+There are multiple different ways to define a matcher, depending on its type
+and flexibility.
+
+``VariadicDynCastAllOfMatcher<Base, Derived>``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Those match all nodes of type *Base* if they can be dynamically cast to
+*Derived*. The names of those matchers are nouns, which closely resemble
+*Derived*. ``VariadicDynCastAllOfMatchers`` are the backbone of the matcher
+hierarchy. Most often, your match expression will start with one of them, and
+you can :ref:`bind <astmatchers-bind>` the node they represent to ids for later
+processing.
+
+``VariadicDynCastAllOfMatchers`` are callable classes that model variadic
+template functions in C++03. They take an arbitrary number of
+``Matcher<Derived>`` and return a ``Matcher<Base>``.
+
+``AST_MATCHER_P(Type, Name, ParamType, Param)``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Most matcher definitions use the matcher creation macros. Those define both
+the matcher of type ``Matcher<Type>`` itself, and a matcher-creation function
+named *Name* that takes a parameter of type *ParamType* and returns the
+corresponding matcher.
+
+There are multiple matcher definition macros that deal with polymorphic return
+values and different parameter counts. See `ASTMatchersMacros.h
+`_.
+
+.. _astmatchers-writing:
+
+Matcher creation functions
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Matchers are generated by nesting calls to matcher creation functions. Most of
+the time those functions are either created by using
+``VariadicDynCastAllOfMatcher`` or the matcher creation macros (see below).
+The free-standing functions are an indication that this matcher is just a
+combination of other matchers, as is for example the case with `callee
+`_.
+
+.. FIXME: "... macros (see below)" --- there isn't anything below
+
diff --git a/docs/LibTooling.html b/docs/LibTooling.html
deleted file mode 100644
index 163d24a7f1..0000000000
--- a/docs/LibTooling.html
+++ /dev/null
@@ -1,212 +0,0 @@
- - -
-LibTooling
- - - - - - - -
- -

LibTooling

-

LibTooling is a library to support writing standalone tools based on -Clang. This document will provide a basic walkthrough of how to write -a tool using LibTooling.

-

For the information on how to setup Clang Tooling for LLVM see -HowToSetupToolingForLLVM.html

- - -

Introduction

- - -

Tools built with LibTooling, like Clang Plugins, run -FrontendActions over code. - -In this tutorial, we'll demonstrate the different ways of running clang's -SyntaxOnlyAction, which runs a quick syntax check, over a bunch of -code.

- - -

Parsing a code snippet in memory.

- - -

If you ever wanted to run a FrontendAction over some sample -code, for example to unit test parts of the Clang AST, -runToolOnCode is what you looked for. Let me give you an example: -

-  #include "clang/Tooling/Tooling.h"
-
-  TEST(runToolOnCode, CanSyntaxCheckCode) {
-    // runToolOnCode returns whether the action was correctly run over the
-    // given code.
-    EXPECT_TRUE(runToolOnCode(new clang::SyntaxOnlyAction, "class X {};"));
-  }
-
- - -

Writing a standalone tool.

- - -

Once you unit tested your FrontendAction to the point where it -cannot possibly break, it's time to create a standalone tool. For a standalone -tool to run clang, it first needs to figure out what command line arguments to -use for a specified file. To that end we create a -CompilationDatabase. There are different ways to create a -compilation database, and we need to support all of them depending on -command-line options. There's the CommonOptionsParser class -that takes the responsibility to parse command-line parameters related to -compilation databases and inputs, so that all tools share the implementation. -

- -

Parsing common tools options.

-

CompilationDatabase can be read from a build directory or the -command line. Using CommonOptionsParser allows for explicit -specification of a compile command line, specification of build path using the --p command-line option, and automatic location of the compilation -database using source files paths. -

-#include "clang/Tooling/CommonOptionsParser.h"
-
-using namespace clang::tooling;
-
-int main(int argc, const char **argv) {
-  // CommonOptionsParser constructor will parse arguments and create a
-  // CompilationDatabase. In case of error it will terminate the program.
-  CommonOptionsParser OptionsParser(argc, argv);
-
-  // Use OptionsParser.GetCompilations() and OptionsParser.GetSourcePathList()
-  // to retrieve CompilationDatabase and the list of input file paths.
-}
-
-

- -

Creating and running a ClangTool.

-

Once we have a CompilationDatabase, we can create a -ClangTool and run our FrontendAction over some code. -For example, to run the SyntaxOnlyAction over the files "a.cc" and -"b.cc" one would write: -

-  // A clang tool can run over a number of sources in the same process...
-  std::vector<std::string> Sources;
-  Sources.push_back("a.cc");
-  Sources.push_back("b.cc");
-
-  // We hand the CompilationDatabase we created and the sources to run over into
-  // the tool constructor.
-  ClangTool Tool(OptionsParser.GetCompilations(), Sources);
-
-  // The ClangTool needs a new FrontendAction for each translation unit we run
-  // on. Thus, it takes a FrontendActionFactory as parameter. To create a
-  // FrontendActionFactory from a given FrontendAction type, we call
-  // newFrontendActionFactory<clang::SyntaxOnlyAction>().
-  int result = Tool.run(newFrontendActionFactory<clang::SyntaxOnlyAction>());
-
-

- -

Putting it together - the first tool.

-

Now we combine the two previous steps into our first real tool. This example -tool is also checked into the clang tree at tools/clang-check/ClangCheck.cpp. -

-// Declares clang::SyntaxOnlyAction.
-#include "clang/Frontend/FrontendActions.h"
-#include "clang/Tooling/CommonOptionsParser.h"
-#include "clang/Tooling/Tooling.h"
-// Declares llvm::cl::extrahelp.
-#include "llvm/Support/CommandLine.h"
-
-using namespace clang::tooling;
-using namespace llvm;
-
-// CommonOptionsParser declares HelpMessage with a description of the common
-// command-line options related to the compilation database and input files.
-// It's nice to have this help message in all tools.
-static cl::extrahelp CommonHelp(CommonOptionsParser::HelpMessage);
-
-// A help message for this specific tool can be added afterwards.
-static cl::extrahelp MoreHelp("\nMore help text...");
-
-int main(int argc, const char **argv) {
-  CommonOptionsParser OptionsParser(argc, argv);
-  ClangTool Tool(OptionsParser.GetCompilations(),
-                 OptionsParser.GetSourcePathList());
-  return Tool.run(newFrontendActionFactory<clang::SyntaxOnlyAction>());
-}
-
-

- -

Running the tool on some code.

-

When you check out and build clang, clang-check is already built and -available to you in bin/clang-check inside your build directory.

-

You can run clang-check on a file in the llvm repository by specifying -all the needed parameters after a "--" separator: -

-  $ cd /path/to/source/llvm
-  $ export BD=/path/to/build/llvm
-  $ $BD/bin/clang-check tools/clang/tools/clang-check/ClangCheck.cpp -- \
-    clang++ -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS \
-    -Itools/clang/include -I$BD/include -Iinclude -Itools/clang/lib/Headers -c
-
-

- -

As an alternative, you can also configure cmake to output a compile command -database into its build directory: -

-  # Alternatively to calling cmake, use ccmake, toggle to advanced mode and
-  # set the parameter CMAKE_EXPORT_COMPILE_COMMANDS from the UI.
-  $ cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .
-
-

-

-This creates a file called compile_commands.json in the build directory. Now -you can run clang-check over files in the project by specifying the build path -as first argument and some source files as further positional arguments: -

-  $ cd /path/to/source/llvm
-  $ export BD=/path/to/build/llvm
-  $ $BD/bin/clang-check -p $BD tools/clang/tools/clang-check/ClangCheck.cpp
-
-

- -

Builtin includes.

-

Clang tools need their builtin headers and search for them the same way clang -does. Thus, the default location to look for builtin headers is in a path -$(dirname /path/to/tool)/../lib/clang/3.2/include relative to the tool -binary. This works out-of-the-box for tools running from llvm's toplevel -binary directory after building clang-headers, or if the tool is running -from the binary directory of a clang install next to the clang binary.

- -

Tips: if your tool fails to find stddef.h or similar headers, call -the tool with -v and look at the search paths it looks through.
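As a rough illustration of the path arithmetic described in this section, the following sketch derives the default builtin header directory from the location of a tool binary. It assumes C++17 for std::filesystem. The helper name builtinIncludeDir and the install prefix used in the usage note are hypothetical and are not part of Clang; the real lookup happens inside Clang itself, and the version component (3.2 here) changes from release to release.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not a Clang API): mirrors the shell expression
//   $(dirname /path/to/tool)/../lib/clang/<version>/include
// and collapses the ".." component lexically, without touching the disk.
inline std::string builtinIncludeDir(const fs::path &ToolPath,
                                     const std::string &ClangVersion) {
  fs::path Dir = ToolPath.parent_path() / ".." / "lib" / "clang" /
                 ClangVersion / "include";
  return Dir.lexically_normal().generic_string();
}
```

For a tool installed at the made-up location /opt/llvm/bin/clang-check, calling builtinIncludeDir("/opt/llvm/bin/clang-check", "3.2") gives a directory under /opt/llvm/lib/clang/3.2. Comparing that result with the search paths printed by -v is one way to see why a header such as stddef.h was not found.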

- -

Linking.

-

Please note that this presents the linking requirements at the time of this -writing. For the most up-to-date information, look at one of the tools' -Makefiles (for example -clang-check/Makefile). -

- -

To link a binary using the tooling infrastructure, link in the following -libraries: -

    -
  • Tooling
  • -
  • Frontend
  • -
  • Driver
  • -
  • Serialization
  • -
  • Parse
  • -
  • Sema
  • -
  • Analysis
  • -
  • Edit
  • -
  • AST
  • -
  • Lex
  • -
  • Basic
  • -
-

- -
- - -
diff --git a/docs/LibTooling.rst b/docs/LibTooling.rst
new file mode 100644
index 0000000000..d9c17f681e
--- /dev/null
+++ b/docs/LibTooling.rst
@@ -0,0 +1,206 @@
+==========
+LibTooling
+==========
+
+LibTooling is a library to support writing standalone tools based on Clang.
+This document will provide a basic walkthrough of how to write a tool using
+LibTooling.
+
+For the information on how to setup Clang Tooling for LLVM see
+`HowToSetupToolingForLLVM.html `_
+
+Introduction
+------------
+
+Tools built with LibTooling, like Clang Plugins, run ``FrontendActions`` over
+code.
+
+.. See FIXME for a tutorial on how to write FrontendActions.
+
+In this tutorial, we'll demonstrate the different ways of running Clang's
+``SyntaxOnlyAction``, which runs a quick syntax check, over a bunch of code.
+
+Parsing a code snippet in memory
+--------------------------------
+
+If you ever wanted to run a ``FrontendAction`` over some sample code, for
+example to unit test parts of the Clang AST, ``runToolOnCode`` is what you
+looked for. Let me give you an example:
+
+.. code-block:: c++
+
+  #include "clang/Tooling/Tooling.h"
+
+  TEST(runToolOnCode, CanSyntaxCheckCode) {
+    // runToolOnCode returns whether the action was correctly run over the
+    // given code.
+    EXPECT_TRUE(runToolOnCode(new clang::SyntaxOnlyAction, "class X {};"));
+  }
+
+Writing a standalone tool
+-------------------------
+
+Once you unit tested your ``FrontendAction`` to the point where it cannot
+possibly break, it's time to create a standalone tool. For a standalone tool
+to run clang, it first needs to figure out what command line arguments to use
+for a specified file. To that end we create a ``CompilationDatabase``. There
+are different ways to create a compilation database, and we need to support all
+of them depending on command-line options. There's the ``CommonOptionsParser``
+class that takes the responsibility to parse command-line parameters related to
+compilation databases and inputs, so that all tools share the implementation.
+
+Parsing common tools options
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``CompilationDatabase`` can be read from a build directory or the command line.
+Using ``CommonOptionsParser`` allows for explicit specification of a compile
+command line, specification of build path using the ``-p`` command-line option,
+and automatic location of the compilation database using source files paths.
+
+.. code-block:: c++
+
+  #include "clang/Tooling/CommonOptionsParser.h"
+
+  using namespace clang::tooling;
+
+  int main(int argc, const char **argv) {
+    // CommonOptionsParser constructor will parse arguments and create a
+    // CompilationDatabase. In case of error it will terminate the program.
+    CommonOptionsParser OptionsParser(argc, argv);
+
+    // Use OptionsParser.GetCompilations() and OptionsParser.GetSourcePathList()
+    // to retrieve CompilationDatabase and the list of input file paths.
+  }
+
+Creating and running a ClangTool
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Once we have a ``CompilationDatabase``, we can create a ``ClangTool`` and run
+our ``FrontendAction`` over some code. For example, to run the
+``SyntaxOnlyAction`` over the files "a.cc" and "b.cc" one would write:
+
+.. code-block:: c++
+
+  // A clang tool can run over a number of sources in the same process...
+  std::vector<std::string> Sources;
+  Sources.push_back("a.cc");
+  Sources.push_back("b.cc");
+
+  // We hand the CompilationDatabase we created and the sources to run over into
+  // the tool constructor.
+  ClangTool Tool(OptionsParser.GetCompilations(), Sources);
+
+  // The ClangTool needs a new FrontendAction for each translation unit we run
+  // on. Thus, it takes a FrontendActionFactory as parameter. To create a
+  // FrontendActionFactory from a given FrontendAction type, we call
+  // newFrontendActionFactory<clang::SyntaxOnlyAction>().
+  int result = Tool.run(newFrontendActionFactory<clang::SyntaxOnlyAction>());
+
+Putting it together --- the first tool
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Now we combine the two previous steps into our first real tool. This example
+tool is also checked into the clang tree at
+``tools/clang-check/ClangCheck.cpp``.
+
+.. code-block:: c++
+
+  // Declares clang::SyntaxOnlyAction.
+  #include "clang/Frontend/FrontendActions.h"
+  #include "clang/Tooling/CommonOptionsParser.h"
+  #include "clang/Tooling/Tooling.h"
+  // Declares llvm::cl::extrahelp.
+  #include "llvm/Support/CommandLine.h"
+
+  using namespace clang::tooling;
+  using namespace llvm;
+
+  // CommonOptionsParser declares HelpMessage with a description of the common
+  // command-line options related to the compilation database and input files.
+  // It's nice to have this help message in all tools.
+  static cl::extrahelp CommonHelp(CommonOptionsParser::HelpMessage);
+
+  // A help message for this specific tool can be added afterwards.
+  static cl::extrahelp MoreHelp("\nMore help text...");
+
+  int main(int argc, const char **argv) {
+    CommonOptionsParser OptionsParser(argc, argv);
+    ClangTool Tool(OptionsParser.GetCompilations(),
+                   OptionsParser.GetSourcePathList());
+    return Tool.run(newFrontendActionFactory<clang::SyntaxOnlyAction>());
+  }
+
+Running the tool on some code
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When you check out and build clang, clang-check is already built and available
+to you in bin/clang-check inside your build directory.
+
+You can run clang-check on a file in the llvm repository by specifying all the
+needed parameters after a "``--``" separator:
+
+.. code-block:: bash
+
+  $ cd /path/to/source/llvm
+  $ export BD=/path/to/build/llvm
+  $ $BD/bin/clang-check tools/clang/tools/clang-check/ClangCheck.cpp -- \
+        clang++ -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS \
+        -Itools/clang/include -I$BD/include -Iinclude \
+        -Itools/clang/lib/Headers -c
+
+As an alternative, you can also configure cmake to output a compile command
+database into its build directory:
+
+.. code-block:: bash
+
+  # Alternatively to calling cmake, use ccmake, toggle to advanced mode and
+  # set the parameter CMAKE_EXPORT_COMPILE_COMMANDS from the UI.
+  $ cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .
+
+This creates a file called ``compile_commands.json`` in the build directory.
+Now you can run :program:`clang-check` over files in the project by specifying
+the build path as first argument and some source files as further positional
+arguments:
+
+.. code-block:: bash
+
+  $ cd /path/to/source/llvm
+  $ export BD=/path/to/build/llvm
+  $ $BD/bin/clang-check -p $BD tools/clang/tools/clang-check/ClangCheck.cpp
+
+Builtin includes
+^^^^^^^^^^^^^^^^
+
+Clang tools need their builtin headers and search for them the same way Clang
+does. Thus, the default location to look for builtin headers is in a path
+``$(dirname /path/to/tool)/../lib/clang/3.2/include`` relative to the tool
+binary. This works out-of-the-box for tools running from llvm's toplevel
+binary directory after building clang-headers, or if the tool is running from
+the binary directory of a clang install next to the clang binary.
+
+Tips: if your tool fails to find ``stddef.h`` or similar headers, call the tool
+with ``-v`` and look at the search paths it looks through.
+
+Linking
+^^^^^^^
+
+Please note that this presents the linking requirements at the time of this
+writing. For the most up-to-date information, look at one of the tools'
+Makefiles (for example `clang-check/Makefile
+`_).
+
+To link a binary using the tooling infrastructure, link in the following
+libraries:
+
+* Tooling
+* Frontend
+* Driver
+* Serialization
+* Parse
+* Sema
+* Analysis
+* Edit
+* AST
+* Lex
+* Basic
+
diff --git a/docs/PCHInternals.html b/docs/PCHInternals.html
deleted file mode 100644
index 7fed5bab84..0000000000
--- a/docs/PCHInternals.html
+++ /dev/null
@@ -1,658 +0,0 @@
- - -
-  Precompiled Header and Modules Internals
- - - - - - - - - -
- -

Precompiled Header and Modules Internals

- -

This document describes the design and implementation of Clang's - precompiled headers (PCH) and modules. If you are interested in the end-user - view, please see the User's Manual.

- -

Table of Contents

- - -

Using Precompiled Headers with clang

- -

The Clang compiler frontend, clang -cc1, supports two command line -options for generating and using PCH files.

- -

To generate PCH files using clang -cc1, use the option --emit-pch: - -

 $ clang -cc1 test.h -emit-pch -o test.h.pch 
- -

This option is transparently used by clang when generating -PCH files. The resulting PCH file contains the serialized form of the -compiler's internal representation after it has completed parsing and -semantic analysis. The PCH file can then be used as a prefix header -with the -include-pch option:

- -
-  $ clang -cc1 -include-pch test.h.pch test.c -o test.s
-
- -

Design Philosophy

- -

Precompiled headers are meant to improve overall compile times for - projects, so the design of precompiled headers is entirely driven by - performance concerns. The use case for precompiled headers is - relatively simple: when there is a common set of headers that is - included in nearly every source file in the project, we - precompile that bundle of headers into a single precompiled - header (PCH file). Then, when compiling the source files in the - project, we load the PCH file first (as a prefix header), which acts - as a stand-in for that bundle of headers.

- -

A precompiled header implementation improves performance when:

-
    -
  • Loading the PCH file is significantly faster than re-parsing the - bundle of headers stored within the PCH file. Thus, a precompiled - header design attempts to minimize the cost of reading the PCH - file. Ideally, this cost should not vary with the size of the - precompiled header file.
  • - -
  • The cost of generating the PCH file initially is not so large - that it counters the per-source-file performance improvement due to - eliminating the need to parse the bundled headers in the first - place. This is particularly important on multi-core systems, because - PCH file generation serializes the build when all compilations - require the PCH file to be up-to-date.
  • -
- -

Modules, as implemented in Clang, use the same mechanisms as -precompiled headers to save a serialized AST file (one per module) and -use those AST modules. From an implementation standpoint, modules are -a generalization of precompiled headers, lifting a number of -restrictions placed on precompiled headers. In particular, there can -only be one precompiled header and it must be included at the -beginning of the translation unit. The extensions to the AST file -format required for modules are discussed in the section on modules.

- -

Clang's AST files are designed with a compact on-disk -representation, which minimizes both creation time and the time -required to initially load the AST file. The AST file itself contains -a serialized representation of Clang's abstract syntax trees and -supporting data structures, stored using the same compressed bitstream -as LLVM's bitcode -file format.

- -

Clang's AST files are loaded "lazily" from disk. When an -AST file is initially loaded, Clang reads only a small amount of data -from the AST file to establish where certain important data structures -are stored. The amount of data read in this initial load is -independent of the size of the AST file, such that a larger AST file -does not lead to longer AST load times. The actual header data in the -AST file--macros, functions, variables, types, etc.--is loaded only -when it is referenced from the user's code, at which point only that -entity (and those entities it depends on) are deserialized from the -AST file. With this approach, the cost of using an AST file -for a translation unit is proportional to the amount of code actually -used from the AST file, rather than being proportional to the size of -the AST file itself.
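The lazy-loading idea in the paragraph above can be modelled in a few lines. This is only a toy sketch and not Clang's reader: the class name LazyEntityTable, the string-based "records", and the "decoded:" prefix are all invented for illustration. Opening the table builds one empty slot per entity and decodes nothing; an entity is decoded only on its first lookup, and a counter records how many were actually touched.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy model of lazy loading (not Clang's ASTReader). Constructing the table
// decodes nothing; it only prepares one empty slot per serialized record.
class LazyEntityTable {
public:
  explicit LazyEntityTable(std::vector<std::string> SerializedRecords)
      : Records(std::move(SerializedRecords)), Loaded(Records.size()) {}

  // Deserializes the entity the first time it is requested, then reuses it.
  const std::string &get(std::size_t ID) {
    if (!Loaded[ID]) {
      Loaded[ID] = "decoded:" + Records[ID]; // stand-in for real decoding
      ++NumDeserialized;
    }
    return *Loaded[ID];
  }

  std::size_t numDeserialized() const { return NumDeserialized; }
  std::size_t numEntities() const { return Records.size(); }

private:
  std::vector<std::string> Records;               // stands in for the file
  std::vector<std::optional<std::string>> Loaded; // filled on demand
  std::size_t NumDeserialized = 0;
};
```

In this model the work done by a client is governed by how many distinct IDs it asks for, not by numEntities(), which is the property the text attributes to AST files.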

- -

When given the -print-stats option, Clang produces -statistics describing how much of the AST file was actually -loaded from disk. For a simple "Hello, World!" program that includes -the Apple Cocoa.h header (which is built as a precompiled -header), this option illustrates how little of the actual precompiled -header is required:

- -
-*** PCH Statistics:
-  933 stat cache hits
-  4 stat cache misses
-  895/39981 source location entries read (2.238563%)
-  19/15315 types read (0.124061%)
-  20/82685 declarations read (0.024188%)
-  154/58070 identifiers read (0.265197%)
-  0/7260 selectors read (0.000000%)
-  0/30842 statements read (0.000000%)
-  4/8400 macros read (0.047619%)
-  1/4995 lexical declcontexts read (0.020020%)
-  0/4413 visible declcontexts read (0.000000%)
-  0/7230 method pool entries read (0.000000%)
-  0 method pool misses
-
- -

For this small program, only a tiny fraction of the source -locations, types, declarations, identifiers, and macros were actually -deserialized from the precompiled header. These statistics can be -useful to determine whether the AST file implementation can -be improved by making more of the implementation lazy.

- -

Precompiled headers can be chained. When you create a PCH while -including an existing PCH, Clang can create the new PCH by referencing -the original file and only writing the new data to the new file. For -example, you could create a PCH out of all the headers that are very -commonly used throughout your project, and then create a PCH for every -single source file in the project that includes the code that is -specific to that file, so that recompiling the file itself is very fast, -without duplicating the data from the common headers for every -file. The mechanisms behind chained precompiled headers are discussed -in a later section. - -

AST File Contents

- -Precompiled header layout - -

Clang's AST files are organized into several different -blocks, each of which contains the serialized representation of a part -of Clang's internal representation. Each of the blocks corresponds to -either a block or a record within LLVM's bitstream -format. The contents of each of these logical blocks are described -below.

- -

For a given AST file, the llvm-bcanalyzer -utility can be used to examine the actual structure of the bitstream -for the AST file. This information can be used both to help -understand the structure of the AST file and to isolate -areas where AST files can still be optimized, e.g., through -the introduction of abbreviations.

- -

Metadata Block

- -

The metadata block contains several records that provide -information about how the AST file was built. This metadata -is primarily used to validate the use of an AST file. For -example, a precompiled header built for a 32-bit x86 target cannot be used -when compiling for a 64-bit x86 target. The metadata block contains -information about:

- -
-
Language options
-
Describes the particular language dialect used to compile the -AST file, including major options (e.g., Objective-C support) and more -minor options (e.g., support for "//" comments). The contents of this -record correspond to the LangOptions class.
- -
Target architecture
-
The target triple that describes the architecture, platform, and -ABI for which the AST file was generated, e.g., -i386-apple-darwin9.
- -
AST version
-
The major and minor version numbers of the AST file -format. Changes in the minor version number should not affect backward -compatibility, while changes in the major version number imply that a -newer compiler cannot read an older precompiled header (and -vice-versa).
- -
Original file name
-
The full path of the header that was used to generate the -AST file.
- -
Predefines buffer
-
Although not explicitly stored as part of the metadata, the -predefines buffer is used in the validation of the AST file. -The predefines buffer itself contains code generated by the compiler -to initialize the preprocessor state according to the current target, -platform, and command-line options. For example, the predefines buffer -will contain "#define __STDC__ 1" when we are compiling C -without Microsoft extensions. The predefines buffer itself is stored -within the source manager block, but its -contents are verified along with the rest of the metadata.
- -
- -

A chained PCH file (that is, one that references another PCH) and a -module (which may import other modules) have additional metadata -containing the list of all AST files that this AST file depends -on. Each of those files will be loaded along with this AST file.

- -

For chained precompiled headers, the language options, target -architecture and predefines buffer data is taken from the end of the -chain, since they have to match anyway.
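To make the role of the metadata block concrete, here is a deliberately simplified sketch of a compatibility check. The struct ASTFileMetadata, its fields, and the function isCompatible are hypothetical and do not reflect Clang's data structures; a single flag stands in for the full set of language options, and the predefines buffer is left out. The one rule taken from the text is that a differing minor version is tolerated while a differing major version is not.

```cpp
#include <string>

// Hypothetical, simplified view of the metadata records described above.
struct ASTFileMetadata {
  unsigned VersionMajor;
  unsigned VersionMinor;
  std::string TargetTriple; // e.g. "i386-apple-darwin9"
  bool ObjC;                // one representative language option
};

// Returns true if an AST file described by File may be used in a compilation
// configured as Current. Minor version differences are accepted; the major
// version, the target triple and the language option must all match.
inline bool isCompatible(const ASTFileMetadata &File,
                         const ASTFileMetadata &Current) {
  return File.VersionMajor == Current.VersionMajor &&
         File.TargetTriple == Current.TargetTriple &&
         File.ObjC == Current.ObjC;
}
```

Under these assumptions, a header built for i386-apple-darwin9 is rejected when the current compilation targets a 64-bit triple, which corresponds to the 32-bit versus 64-bit x86 example given earlier.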

- -

Source Manager Block

- -

The source manager block contains the serialized representation of -Clang's SourceManager class, -which handles the mapping from source locations (as represented in -Clang's abstract syntax tree) into actual column/line positions within -a source file or macro instantiation. The AST file's -representation of the source manager also includes information about -all of the headers that were (transitively) included when building the -AST file.

- -

The bulk of the source manager block is dedicated to information -about the various files, buffers, and macro instantiations into which -a source location can refer. Each of these is referenced by a numeric -"file ID", which is a unique number (allocated starting at 1) stored -in the source location. Clang serializes the information for each kind -of file ID, along with an index that maps file IDs to the position -within the AST file where the information about that file ID is -stored. The data associated with a file ID is loaded only when -required by the front end, e.g., to emit a diagnostic that includes a -macro instantiation history inside the header itself.

- -

The source manager block also contains information about all of the -headers that were included when building the AST file. This -includes information about the controlling macro for the header (e.g., -when the preprocessor identified that the contents of the header -were dependent on a macro like LLVM_CLANG_SOURCEMANAGER_H) -along with a cached version of the results of the stat() -system calls performed when building the AST file. The -latter is particularly useful in reducing system time when searching -for include files.

- -

Preprocessor Block

- -

The preprocessor block contains the serialized representation of -the preprocessor. Specifically, it contains all of the macros that -have been defined by the end of the header used to build the -AST file, along with the token sequences that comprise each -macro. The macro definitions are only read from the AST file when the -name of the macro first occurs in the program. This lazy loading of -macro definitions is triggered by lookups into the identifier table.

- -

Types Block

- -

The types block contains the serialized representation of all of -the types referenced in the translation unit. Each Clang type node -(PointerType, FunctionProtoType, etc.) has a -corresponding record type in the AST file. When types are deserialized -from the AST file, the data within the record is used to -reconstruct the appropriate type node using the AST context.

- -

Each type has a unique type ID, which is an integer that uniquely -identifies that type. Type ID 0 represents the NULL type, type IDs -less than NUM_PREDEF_TYPE_IDS represent predefined types -(void, float, etc.), while other -"user-defined" type IDs are assigned consecutively from -NUM_PREDEF_TYPE_IDS upward as the types are encountered. -The AST file has an associated mapping from each user-defined type -ID to the location within the types block where the serialized -representation of that type resides, enabling lazy deserialization of -types. When a type is referenced from within the AST file, that -reference is encoded using the type ID shifted left by 3 bits. The -lower three bits are used to represent the const, -volatile, and restrict qualifiers, as in -Clang's QualType -class.
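The encoding of a type reference can be illustrated with a minimal sketch. The text only says that the type ID is shifted left by 3 bits and that the low three bits hold the qualifiers; which bit stands for which qualifier is an assumption made here for illustration, and the function names are hypothetical.

```python
# Assumed bit assignment for the three qualifier bits (illustrative only;
# the text does not say which bit represents which qualifier).
CONST, RESTRICT, VOLATILE = 0x1, 0x2, 0x4
QUALIFIER_BITS = 3
QUALIFIER_MASK = (1 << QUALIFIER_BITS) - 1


def encode_type_ref(type_id, qualifiers=0):
    """Pack a type ID and its qualifier bits into a single integer."""
    if qualifiers & ~QUALIFIER_MASK:
        raise ValueError("qualifiers must fit in three bits")
    return (type_id << QUALIFIER_BITS) | qualifiers


def decode_type_ref(encoded):
    """Split an encoded reference back into (type ID, qualifier bits)."""
    return encoded >> QUALIFIER_BITS, encoded & QUALIFIER_MASK


# Round trip for a hypothetical "const volatile" reference to type ID 42.
ref = encode_type_ref(42, CONST | VOLATILE)
type_id, quals = decode_type_ref(ref)
print(type_id, bool(quals & CONST), bool(quals & RESTRICT), bool(quals & VOLATILE))
```

Because the qualifiers travel with the reference rather than with the type record, a single serialized type can be shared by all of its qualified variants.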

- -

Declarations Block

- -

The declarations block contains the serialized representation of -all of the declarations referenced in the translation unit. Each Clang -declaration node (VarDecl, FunctionDecl, -etc.) has a corresponding record type in the AST file. When -declarations are deserialized from the AST file, the data -within the record is used to build and populate a new instance of the -corresponding Decl node. As with types, each declaration -node has a numeric ID that is used to refer to that declaration within -the AST file. In addition, a lookup table provides a mapping from that -numeric ID to the offset within the precompiled header where that -declaration is described.

- -

Declarations in Clang's abstract syntax trees are stored -hierarchically. At the top of the hierarchy is the translation unit -(TranslationUnitDecl), which contains all of the -declarations in the translation unit but is not actually written as a -specific declaration node. Its child declarations (such as -functions or struct types) may also contain other declarations inside -them, and so on. Within Clang, each declaration is stored within a declaration -context, as represented by the DeclContext class. -Declaration contexts provide the mechanism to perform name lookup -within a given declaration (e.g., find the member named x -in a structure) and iterate over the declarations stored within a -context (e.g., iterate over all of the fields of a structure for -structure layout).

- -

In Clang's AST file format, deserializing a declaration -that is a DeclContext is a separate operation from -deserializing all of the declarations stored within that declaration -context. Therefore, Clang will deserialize the translation unit -declaration without deserializing the declarations within that -translation unit. When required, the declarations stored within a -declaration context will be deserialized. There are two representations -of the declarations within a declaration context, which correspond to -the name-lookup and iteration behavior described above:

- -
    -
  • When the front end performs name lookup to find a name - x within a given declaration context (for example, - during semantic analysis of the expression p->x, - where p's type is defined in the precompiled header), - Clang refers to an on-disk hash table that maps from the names - within that declaration context to the declaration IDs that - represent each visible declaration with that name. The actual - declarations will then be deserialized to provide the results of - name lookup.
  • - -
  • When the front end performs iteration over all of the - declarations within a declaration context, all of those declarations - are immediately de-serialized. For large declaration contexts (e.g., - the translation unit), this operation is expensive; however, large - declaration contexts are not traversed in normal compilation, since - such a traversal is unnecessary. However, it is common for the code - generator and semantic analysis to traverse declaration contexts for - structs, classes, unions, and enumerations, although those contexts - contain relatively few declarations in the common case.
  • -
- -
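The difference between the two representations can be sketched as follows. This is a toy model under stated assumptions, not Clang's implementation: the class, its methods, and the in-memory dictionaries standing in for the on-disk hash table and the serialized records are all hypothetical.

```python
class LazyDeclContext:
    """Toy declaration context backed by serialized data (hypothetical names)."""

    def __init__(self, lookup_table, serialized_decls):
        self._lookup_table = lookup_table    # name -> list of declaration IDs
        self._serialized = serialized_decls  # declaration ID -> raw record
        self._loaded = {}                    # declaration ID -> deserialized decl

    def _deserialize(self, decl_id):
        # Each declaration is materialized at most once.
        if decl_id not in self._loaded:
            self._loaded[decl_id] = ("Decl", self._serialized[decl_id])
        return self._loaded[decl_id]

    def lookup(self, name):
        """Name lookup: deserialize only the declarations with this name."""
        return [self._deserialize(i) for i in self._lookup_table.get(name, [])]

    def decls(self):
        """Iteration: deserialize every declaration in the context."""
        return [self._deserialize(i) for i in sorted(self._serialized)]

    def loaded_count(self):
        return len(self._loaded)


ctx = LazyDeclContext(
    {"x": [2], "y": [3]},
    {1: "struct S", 2: "field x", 3: "field y"},
)
ctx.lookup("x")
print(ctx.loaded_count())  # only the declaration named x has been loaded
ctx.decls()
print(ctx.loaded_count())  # iteration loaded the rest
```

The sketch shows why name lookup stays cheap even for large contexts, while iteration pays for every declaration stored in the context.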

Statements and Expressions

- -

Statements and expressions are stored in the AST file in -both the types and the declarations blocks, because every statement or -expression will be associated with either a type or declaration. The -actual statement and expression records are stored immediately -following the declaration or type that owns the statement or -expression. For example, the statement representing the body of a -function will be stored directly following the declaration of the -function.

- -

As with types and declarations, each statement and expression kind -in Clang's abstract syntax tree (ForStmt, -CallExpr, etc.) has a corresponding record type in the -AST file, which contains the serialized representation of -that statement or expression. Each substatement or subexpression -within an expression is stored as a separate record (which keeps most -records to a fixed size). Within the AST file, the -subexpressions of an expression are stored, in reverse order, prior to the expression -that owns those expressions, using a form of Reverse -Polish Notation. For example, an expression 3 - 4 + 5 -would be represented as follows:

- - - - - - - - -
IntegerLiteral(5)
IntegerLiteral(4)
IntegerLiteral(3)
BinaryOperator(-)
BinaryOperator(+)
STOP
- -

When reading this representation, Clang evaluates each expression -record it encounters, builds the appropriate abstract syntax tree node, -and then pushes that expression on to a stack. When a record contains N -subexpressions--BinaryOperator has two of them--those -expressions are popped from the top of the stack. The special STOP -code indicates that we have reached the end of a serialized expression -or statement; other expression or statement records may follow, but -they are part of a different expression.
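The stack-based reading of the records listed above can be sketched in a few lines. This is a minimal illustration, assuming that because subexpressions are written in reverse order, the first operand popped is the left-hand side; the record tuples and function names are hypothetical.

```python
def read_expression(records):
    """Rebuild an expression tree from records in the stored order."""
    stack = []
    for kind, value in records:
        if kind == "STOP":
            break
        if kind == "IntegerLiteral":
            stack.append(value)
        elif kind == "BinaryOperator":
            # Operands were written in reverse order, so the first value
            # popped is the left-hand side and the second is the right.
            lhs = stack.pop()
            rhs = stack.pop()
            stack.append((value, lhs, rhs))
        else:
            raise ValueError("unknown record kind: " + kind)
    if len(stack) != 1:
        raise ValueError("malformed expression stream")
    return stack[0]


def evaluate(node):
    """Evaluate a tree built by read_expression (only + and - are handled)."""
    if isinstance(node, int):
        return node
    op, lhs, rhs = node
    left, right = evaluate(lhs), evaluate(rhs)
    return left - right if op == "-" else left + right


records = [
    ("IntegerLiteral", 5),
    ("IntegerLiteral", 4),
    ("IntegerLiteral", 3),
    ("BinaryOperator", "-"),
    ("BinaryOperator", "+"),
    ("STOP", None),
]
tree = read_expression(records)
print(tree)            # ('+', ('-', 3, 4), 5)
print(evaluate(tree))  # 4
```

The subtraction consumes 3 and 4 first, and the addition then combines that result with 5, which matches the left-to-right meaning of 3 - 4 + 5.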

- -

Identifier Table Block

- -

The identifier table block contains an on-disk hash table that maps -each identifier mentioned within the AST file to the -serialized representation of the identifier's information (e.g., the -IdentifierInfo structure). The serialized representation -contains:

- -
    -
  • The actual identifier string.
  • -
  • Flags that describe whether this identifier is the name of a - built-in, a poisoned identifier, an extension token, or a - macro.
  • -
  • If the identifier names a macro, the offset of the macro - definition within the preprocessor - block.
  • -
  • If the identifier names one or more declarations visible from - translation unit scope, the declaration IDs of these - declarations.
  • -
- -

When an AST file is loaded, the AST file reader -mechanism introduces itself into the identifier table as an external -lookup source. Thus, when the user program refers to an identifier -that has not yet been seen, Clang will perform a lookup into the -identifier table. If an identifier is found, its contents (macro -definitions, flags, top-level declarations, etc.) will be -deserialized, at which point the corresponding -IdentifierInfo structure will have the same contents it -would have after parsing the headers in the AST file.

- -

Within the AST file, the identifiers used to name declarations are represented with an integral value. A separate table provides a mapping from this integral value (the identifier ID) to the location within the on-disk -hash table where that identifier is stored. This mapping is used when -deserializing the name of a declaration, the identifier of a token, or -any other construct in the AST file that refers to a name.
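The two access paths into the identifier table, by name and by identifier ID, can be sketched as below. This is a toy model under stated assumptions: the class and method names are hypothetical, and a Python dictionary stands in for the on-disk hash table, so the name index here is built eagerly, which the real on-disk structure avoids.

```python
class ASTFileIdentifiers:
    """Toy external lookup source for identifiers (hypothetical names)."""

    def __init__(self, entries, id_to_offset):
        self._entries = entries            # offset -> (name, info dict)
        # Stand-in for the on-disk hash table: name -> offset.
        self._by_name = {name: off for off, (name, _) in entries.items()}
        self._id_to_offset = id_to_offset  # identifier ID -> offset
        self._cache = {}                   # offset -> deserialized info

    def _load(self, offset):
        # Deserialize an entry the first time it is needed.
        if offset not in self._cache:
            name, info = self._entries[offset]
            self._cache[offset] = dict(info, name=name)
        return self._cache[offset]

    def lookup_name(self, name):
        """Used when the program mentions an identifier not yet seen."""
        offset = self._by_name.get(name)
        return None if offset is None else self._load(offset)

    def lookup_id(self, ident_id):
        """Used when a serialized construct refers to an identifier ID."""
        return self._load(self._id_to_offset[ident_id])


table = ASTFileIdentifiers(
    {100: ("printf", {"is_macro": False}), 164: ("NULL", {"is_macro": True})},
    {1: 100, 2: 164},
)
print(table.lookup_name("printf"))
print(table.lookup_id(2))
print(table.lookup_name("not_in_pch"))
```

Both paths end at the same cached entry, so an identifier reached first by ID and later by name is only deserialized once.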

- -

Method Pool Block

- -

The method pool block is represented as an on-disk hash table that -serves two purposes: it provides a mapping from the names of -Objective-C selectors to the set of Objective-C instance and class -methods that have that particular selector (which is required for -semantic analysis in Objective-C) and also stores all of the selectors -used by entities within the AST file. The design of the -method pool is similar to that of the identifier -table: the first time a particular selector is formed during the -compilation of the program, Clang will search in the on-disk hash -table of selectors; if found, Clang will read the Objective-C methods -associated with that selector into the appropriate front-end data -structure (Sema::InstanceMethodPool and -Sema::FactoryMethodPool for instance and class methods, -respectively).

- -

As with identifiers, selectors are represented by numeric values -within the AST file. A separate index maps these numeric selector -values to the offset of the selector within the on-disk hash table, -and will be used when de-serializing an Objective-C method declaration -(or other Objective-C construct) that refers to the selector.

- -

AST Reader Integration Points

- -

The "lazy" deserialization behavior of AST files requires -their integration into several completely different submodules of -Clang. For example, lazily deserializing the declarations during name -lookup requires that the name-lookup routines be able to query the -AST file to find entities stored there.

- -

For each Clang data structure that requires direct interaction with -the AST reader logic, there is an abstract class that provides -the interface between the two modules. The ASTReader -class, which handles the loading of an AST file, inherits -from all of these abstract classes to provide lazy deserialization of -Clang's data structures. ASTReader implements the -following abstract classes:

- -
-
StatSysCallCache
-
This abstract interface is associated with the - FileManager class, and is used whenever the file - manager is going to perform a stat() system call.
- -
ExternalSLocEntrySource
-
This abstract interface is associated with the - SourceManager class, and is used whenever the - source manager needs to load the details - of a file, buffer, or macro instantiation.
- -
IdentifierInfoLookup
-
This abstract interface is associated with the - IdentifierTable class, and is used whenever the - program source refers to an identifier that has not yet been seen. - In this case, the AST reader searches for - this identifier within its identifier table - to load any top-level declarations or macros associated with that - identifier.
- -
ExternalASTSource
-
This abstract interface is associated with the - ASTContext class, and is used whenever the abstract - syntax tree nodes need to be loaded from the AST file. It - provides the ability to de-serialize declarations and types - identified by their numeric values, read the bodies of functions - when required, and read the declarations stored within a - declaration context (either for iteration or for name lookup).
- -
ExternalSemaSource
-
This abstract interface is associated with the Sema - class, and is used whenever semantic analysis needs to read - information from the global method - pool.
-
- -

Chained precompiled headers

- -

Chained precompiled headers were initially intended to improve the -performance of IDE-centric operations such as syntax highlighting and -code completion while a particular source file is being edited by the -user. To minimize the amount of reparsing required after a change to -the file, a form of precompiled header--called a precompiled -preamble--is automatically generated by parsing all of the -headers in the source file, up to and including the last -#include. When only the source file changes (and none of the headers -it depends on), reparsing of that source file can use the precompiled -preamble and start parsing after the #includes, so parsing time is -proportional to the size of the source file (rather than all of its -includes). However, the compilation of that translation unit -may already use a precompiled header: in this case, Clang will create -the precompiled preamble as a chained precompiled header that refers -to the original precompiled header. This drastically reduces the time -needed to serialize the precompiled preamble for use in reparsing.

- -

Chained precompiled headers get their name because each precompiled header -can depend on one other precompiled header, forming a chain of -dependencies. A translation unit will then include the precompiled -header that starts the chain (i.e., nothing depends on it). This -linearity of dependencies is important for the semantic model of -chained precompiled headers, because the most-recent precompiled -header can provide information that overrides the information provided -by the precompiled headers it depends on, just like a header file -B.h that includes another header A.h can -modify the state produced by parsing A.h, e.g., by -#undef'ing a macro defined in A.h.

- -

There are several ways in which chained precompiled headers -generalize the AST file model:

- -
-
Numbering of IDs
-
Many different kinds of entities--identifiers, declarations, - types, etc.---have ID numbers that start at 1 or some other - predefined constant and grow upward. Each precompiled header records - the maximum ID number it has assigned in each category. Then, when a - new precompiled header is generated that depends on (chains to) - another precompiled header, it will start counting at the next - available ID number. This way, one can determine, given an ID - number, which AST file actually contains the entity.
- -
Name lookup
-
When writing a chained precompiled header, Clang attempts to - write only information that has changed from the precompiled header - on which it is based. This changes the lookup algorithm for the - various tables, such as the identifier table: - the search starts at the most-recent precompiled header. If no entry - is found, lookup then proceeds to the identifier table in the - precompiled header it depends on, and so on. Once a lookup - succeeds, that result is considered definitive, overriding any - results from earlier precompiled headers.
- -
Update records
-
There are various ways in which a later precompiled header can - modify the entities described in an earlier precompiled header. For - example, later precompiled headers can add entries into the various - name-lookup tables for the translation unit or namespaces, or add - new categories to an Objective-C class. Each of these updates is - captured in an "update record" that is stored in the chained - precompiled header file and will be loaded along with the original - entity.
-
- -
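The ID numbering and name lookup rules for a chain can be sketched together. This is a minimal illustration, assuming each precompiled header is described only by the first ID it assigned and a table of identifier entries; the class, function names, and file names are hypothetical.

```python
import bisect


class ChainedPCH:
    """Toy precompiled header in a chain (hypothetical names)."""

    def __init__(self, name, first_id, identifiers, parent=None):
        self.name = name
        self.first_id = first_id        # first ID number assigned by this file
        self.identifiers = identifiers  # identifier -> serialized info
        self.parent = parent            # the PCH this one chains to, if any


def lookup(pch, identifier):
    """Search from the most recent PCH backward; the first hit is definitive."""
    while pch is not None:
        if identifier in pch.identifiers:
            return pch.identifiers[identifier]
        pch = pch.parent
    return None


def owner_of(chain, entity_id):
    """Return the name of the file whose ID range contains entity_id.

    chain is ordered oldest first, so the first IDs are increasing.
    """
    starts = [p.first_id for p in chain]
    index = bisect.bisect_right(starts, entity_id) - 1
    if index < 0:
        raise ValueError("ID precedes the first assigned ID")
    return chain[index].name


base = ChainedPCH("base.pch", 1, {"FOO": "macro FOO 1", "BAR": "macro BAR"})
preamble = ChainedPCH("preamble.pch", 101, {"FOO": "macro FOO 2"}, parent=base)

print(lookup(preamble, "FOO"))          # the newer entry overrides the older one
print(lookup(preamble, "BAR"))          # falls through to base.pch
print(owner_of([base, preamble], 57))   # base.pch
print(owner_of([base, preamble], 101))  # preamble.pch
```

Because each file starts counting where the previous one stopped, finding the owner of an ID is a search over the starting IDs rather than a probe of every file.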

Modules

- -

Modules generalize the chained precompiled header model yet -further, from a linear chain of precompiled headers to an arbitrary -directed acyclic graph (DAG) of AST files. All of the same techniques -used to make chained precompiled headers work---ID numbering, name -lookup, update records---are shared with modules. However, the DAG -nature of modules introduces a number of additional complications to -the model: - -

-
Numbering of IDs
-
The simple, linear numbering scheme used in chained precompiled - headers falls apart with the module DAG, because different modules - may end up with different numbering schemes for entities they - imported from common shared modules. To account for this, each - module file provides information about which modules it depends on - and which ID numbers it assigned to the entities in those modules, - as well as which ID numbers it took for its own new entities. The - AST reader then maps these "local" ID numbers into a "global" ID - number space for the current translation unit, providing a 1-1 - mapping between entities (in whatever AST file they inhabit) and - global ID numbers. If that translation unit is then serialized into - an AST file, this mapping will be stored for use when the AST file - is imported.
- -
Declaration merging
-
It is possible for a given entity (from the language's - perspective) to be declared multiple times in different places. For - example, two different headers can have the declaration of - printf or could forward-declare struct stat. If - each of those headers is included in a module, and some third party - imports both of those modules, there is a potentially serious - problem: name lookup for printf or struct stat will - find both declarations, but the AST nodes are unrelated. This would - result in a compilation error, due to an ambiguity in name - lookup. Therefore, the AST reader performs declaration merging - according to the appropriate language semantics, ensuring that the - two disjoint declarations are merged into a single redeclaration - chain (with a common canonical declaration), so that it is as if one - of the headers had been included before the other.
- -
Name Visibility
-
Modules allow certain names that occur during module creation to - be "hidden", so that they are not part of the public interface of - the module and are not visible to its clients. The AST reader - maintains a "visible" bit on various AST nodes (declarations, macros, - etc.) to indicate whether that particular AST node is currently - visible; the various name lookup mechanisms in Clang inspect the - visible bit to determine whether that entity, which is still in the - AST (because other, visible AST nodes may depend on it), can - actually be found by name lookup. When a new (sub)module is - imported, it may make existing, non-visible, already-deserialized - AST nodes visible; it is the responsibility of the AST reader to - find and update these AST nodes when it is notified of the import.
- -
- -
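The local-to-global ID mapping described under "Numbering of IDs" can be sketched as follows. This is a toy model under stated assumptions: each module file is reduced to a list of local ID ranges with the module that owns each range, and all class and module names are hypothetical.

```python
class ModuleFile:
    """Toy module file: records how its local ID space is laid out."""

    def __init__(self, name, ranges):
        self.name = name
        # Each range is (owner module name, first local ID, entity count).
        self.ranges = ranges


class GlobalIDSpace:
    """Assigns each module's own entities one block of global IDs."""

    def __init__(self):
        self._global_base = {}  # owner module name -> first global ID
        self._next_global = 1

    def load(self, module):
        # Reserve global IDs for any owner not seen before.
        for owner, _first, count in module.ranges:
            if owner not in self._global_base:
                self._global_base[owner] = self._next_global
                self._next_global += count

    def to_global(self, module, local_id):
        # Find the local range holding this ID, then rebase it.
        for owner, first, count in module.ranges:
            if first <= local_id < first + count:
                return self._global_base[owner] + (local_id - first)
        raise KeyError(local_id)


# Two modules import the same common module but number it differently.
a = ModuleFile("A", [("Common", 1, 10), ("A", 11, 5)])
b = ModuleFile("B", [("B", 1, 5), ("Common", 6, 10)])

ids = GlobalIDSpace()
ids.load(a)
ids.load(b)
print(ids.to_global(a, 3), ids.to_global(b, 8))   # same entity, same global ID
print(ids.to_global(a, 11), ids.to_global(b, 1))  # distinct own entities
```

Local ID 3 in A and local ID 8 in B both name the third entity of the common module, so they map to one global ID, which is the 1-1 property the text describes.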
- - - diff --git a/docs/PCHInternals.rst b/docs/PCHInternals.rst new file mode 100644 index 0000000000..a7174f3af0 --- /dev/null +++ b/docs/PCHInternals.rst @@ -0,0 +1,573 @@ +======================================== +Precompiled Header and Modules Internals +======================================== + +.. contents:: + :local: + +This document describes the design and implementation of Clang's precompiled +headers (PCH) and modules. If you are interested in the end-user view, please +see the `User's Manual `_. + +Using Precompiled Headers with ``clang`` +---------------------------------------- + +The Clang compiler frontend, ``clang -cc1``, supports two command line options +for generating and using PCH files. + +To generate PCH files using ``clang -cc1``, use the option :option:`-emit-pch`: + +.. code-block:: bash + + $ clang -cc1 test.h -emit-pch -o test.h.pch + +This option is transparently used by ``clang`` when generating PCH files. The +resulting PCH file contains the serialized form of the compiler's internal +representation after it has completed parsing and semantic analysis. The PCH +file can then be used as a prefix header with the :option:`-include-pch` +option: + +.. code-block:: bash + + $ clang -cc1 -include-pch test.h.pch test.c -o test.s + +Design Philosophy +----------------- + +Precompiled headers are meant to improve overall compile times for projects, so +the design of precompiled headers is entirely driven by performance concerns. +The use case for precompiled headers is relatively simple: when there is a +common set of headers that is included in nearly every source file in the +project, we *precompile* that bundle of headers into a single precompiled +header (PCH file). Then, when compiling the source files in the project, we +load the PCH file first (as a prefix header), which acts as a stand-in for that +bundle of headers. 
+ +A precompiled header implementation improves performance when: + +* Loading the PCH file is significantly faster than re-parsing the bundle of + headers stored within the PCH file. Thus, a precompiled header design + attempts to minimize the cost of reading the PCH file. Ideally, this cost + should not vary with the size of the precompiled header file. + +* The cost of generating the PCH file initially is not so large that it + counters the per-source-file performance improvement due to eliminating the + need to parse the bundled headers in the first place. This is particularly + important on multi-core systems, because PCH file generation serializes the + build when all compilations require the PCH file to be up-to-date. + +Modules, as implemented in Clang, use the same mechanisms as precompiled +headers to save a serialized AST file (one per module) and use those AST +modules. From an implementation standpoint, modules are a generalization of +precompiled headers, lifting a number of restrictions placed on precompiled +headers. In particular, there can only be one precompiled header and it must +be included at the beginning of the translation unit. The extensions to the +AST file format required for modules are discussed in the section on +:ref:`modules `. + +Clang's AST files are designed with a compact on-disk representation, which +minimizes both creation time and the time required to initially load the AST +file. The AST file itself contains a serialized representation of Clang's +abstract syntax trees and supporting data structures, stored using the same +compressed bitstream as `LLVM's bitcode file format +`_. + +Clang's AST files are loaded "lazily" from disk. When an AST file is initially +loaded, Clang reads only a small amount of data from the AST file to establish +where certain important data structures are stored. 
The amount of data read in +this initial load is independent of the size of the AST file, such that a +larger AST file does not lead to longer AST load times. The actual header data +in the AST file --- macros, functions, variables, types, etc. --- is loaded +only when it is referenced from the user's code, at which point only that +entity (and those entities it depends on) are deserialized from the AST file. +With this approach, the cost of using an AST file for a translation unit is +proportional to the amount of code actually used from the AST file, rather than +being proportional to the size of the AST file itself. + +When given the :option:`-print-stats` option, Clang produces statistics +describing how much of the AST file was actually loaded from disk. For a +simple "Hello, World!" program that includes the Apple ``Cocoa.h`` header +(which is built as a precompiled header), this option illustrates how little of +the actual precompiled header is required: + +.. code-block:: none + + *** PCH Statistics: + 933 stat cache hits + 4 stat cache misses + 895/39981 source location entries read (2.238563%) + 19/15315 types read (0.124061%) + 20/82685 declarations read (0.024188%) + 154/58070 identifiers read (0.265197%) + 0/7260 selectors read (0.000000%) + 0/30842 statements read (0.000000%) + 4/8400 macros read (0.047619%) + 1/4995 lexical declcontexts read (0.020020%) + 0/4413 visible declcontexts read (0.000000%) + 0/7230 method pool entries read (0.000000%) + 0 method pool misses + +For this small program, only a tiny fraction of the source locations, types, +declarations, identifiers, and macros were actually deserialized from the +precompiled header. These statistics can be useful to determine whether the +AST file implementation can be improved by making more of the implementation +lazy. + +Precompiled headers can be chained. 
When you create a PCH while including an +existing PCH, Clang can create the new PCH by referencing the original file and +only writing the new data to the new file. For example, you could create a PCH +out of all the headers that are very commonly used throughout your project, and +then create a PCH for every single source file in the project that includes the +code that is specific to that file, so that recompiling the file itself is very +fast, without duplicating the data from the common headers for every file. The +mechanisms behind chained precompiled headers are discussed in a :ref:`later +section `. + +AST File Contents +----------------- + +Clang's AST files are organized into several different blocks, each of which +contains the serialized representation of a part of Clang's internal +representation. Each of the blocks corresponds to either a block or a record +within `LLVM's bitstream format `_. +The contents of each of these logical blocks are described below. + +.. image:: PCHLayout.png + +For a given AST file, the `llvm-bcanalyzer +`_ utility can be used +to examine the actual structure of the bitstream for the AST file. This +information can be used both to help understand the structure of the AST file +and to isolate areas where AST files can still be optimized, e.g., through the +introduction of abbreviations. + +Metadata Block +^^^^^^^^^^^^^^ + +The metadata block contains several records that provide information about how +the AST file was built. This metadata is primarily used to validate the use of +an AST file. For example, a precompiled header built for a 32-bit x86 target +cannot be used when compiling for a 64-bit x86 target. The metadata block +contains information about: + +Language options + Describes the particular language dialect used to compile the AST file, + including major options (e.g., Objective-C support) and more minor options + (e.g., support for "``//``" comments). 
The contents of this record correspond to + the ``LangOptions`` class. + +Target architecture + The target triple that describes the architecture, platform, and ABI for + which the AST file was generated, e.g., ``i386-apple-darwin9``. + +AST version + The major and minor version numbers of the AST file format. Changes in the + minor version number should not affect backward compatibility, while changes + in the major version number imply that a newer compiler cannot read an older + precompiled header (and vice-versa). + +Original file name + The full path of the header that was used to generate the AST file. + +Predefines buffer + Although not explicitly stored as part of the metadata, the predefines buffer + is used in the validation of the AST file. The predefines buffer itself + contains code generated by the compiler to initialize the preprocessor state + according to the current target, platform, and command-line options. For + example, the predefines buffer will contain "``#define __STDC__ 1``" when we + are compiling C without Microsoft extensions. The predefines buffer itself + is stored within the :ref:`pchinternals-sourcemgr`, but its contents are + verified along with the rest of the metadata. + +A chained PCH file (that is, one that references another PCH) and a module +(which may import other modules) have additional metadata containing the list +of all AST files that this AST file depends on. Each of those files will be +loaded along with this AST file. + +For chained precompiled headers, the language options, target architecture and +predefines buffer data is taken from the end of the chain, since they have to +match anyway. + +.. 
_pchinternals-sourcemgr: + +Source Manager Block +^^^^^^^^^^^^^^^^^^^^ + +The source manager block contains the serialized representation of Clang's +`SourceManager `_ class, which handles the +mapping from source locations (as represented in Clang's abstract syntax tree) +into actual column/line positions within a source file or macro instantiation. +The AST file's representation of the source manager also includes information +about all of the headers that were (transitively) included when building the +AST file. + +The bulk of the source manager block is dedicated to information about the +various files, buffers, and macro instantiations into which a source location +can refer. Each of these is referenced by a numeric "file ID", which is a +unique number (allocated starting at 1) stored in the source location. Clang +serializes the information for each kind of file ID, along with an index that +maps file IDs to the position within the AST file where the information about +that file ID is stored. The data associated with a file ID is loaded only when +required by the front end, e.g., to emit a diagnostic that includes a macro +instantiation history inside the header itself. + +The source manager block also contains information about all of the headers +that were included when building the AST file. This includes information about +the controlling macro for the header (e.g., when the preprocessor identified +that the contents of the header depend on a macro like +``LLVM_CLANG_SOURCEMANAGER_H``) along with a cached version of the results of +the ``stat()`` system calls performed when building the AST file. The latter +is particularly useful in reducing system time when searching for include +files. + +.. _pchinternals-preprocessor: + +Preprocessor Block +^^^^^^^^^^^^^^^^^^ + +The preprocessor block contains the serialized representation of the +preprocessor. 
Specifically, it contains all of the macros that have been +defined by the end of the header used to build the AST file, along with the +token sequences that comprise each macro. The macro definitions are only read +from the AST file when the name of the macro first occurs in the program. This +lazy loading of macro definitions is triggered by lookups into the +:ref:`identifier table `. + +.. _pchinternals-types: + +Types Block +^^^^^^^^^^^ + +The types block contains the serialized representation of all of the types +referenced in the translation unit. Each Clang type node (``PointerType``, +``FunctionProtoType``, etc.) has a corresponding record type in the AST file. +When types are deserialized from the AST file, the data within the record is +used to reconstruct the appropriate type node using the AST context. + +Each type has a unique type ID, which is an integer that uniquely identifies +that type. Type ID 0 represents the NULL type, type IDs less than +``NUM_PREDEF_TYPE_IDS`` represent predefined types (``void``, ``float``, etc.), +while other "user-defined" type IDs are assigned consecutively from +``NUM_PREDEF_TYPE_IDS`` upward as the types are encountered. The AST file has +an associated mapping from the user-defined types block to the location within +the types block where the serialized representation of that type resides, +enabling lazy deserialization of types. When a type is referenced from within +the AST file, that reference is encoded using the type ID shifted left by 3 +bits. The lower three bits are used to represent the ``const``, ``volatile``, +and ``restrict`` qualifiers, as in Clang's +`QualType `_ class. + +.. _pchinternals-decls: + +Declarations Block +^^^^^^^^^^^^^^^^^^ + +The declarations block contains the serialized representation of all of the +declarations referenced in the translation unit. Each Clang declaration node +(``VarDecl``, ``FunctionDecl``, etc.) has a corresponding record type in the +AST file. 
When declarations are deserialized from the AST file, the data +within the record is used to build and populate a new instance of the +corresponding ``Decl`` node. As with types, each declaration node has a +numeric ID that is used to refer to that declaration within the AST file. In +addition, a lookup table provides a mapping from that numeric ID to the offset +within the precompiled header where that declaration is described. + +Declarations in Clang's abstract syntax trees are stored hierarchically. At +the top of the hierarchy is the translation unit (``TranslationUnitDecl``), +which contains all of the declarations in the translation unit but is not +actually written as a specific declaration node. Its child declarations (such +as functions or struct types) may also contain other declarations inside them, +and so on. Within Clang, each declaration is stored within a `declaration +context `_, as +represented by the ``DeclContext`` class. Declaration contexts provide the +mechanism to perform name lookup within a given declaration (e.g., find the +member named ``x`` in a structure) and iterate over the declarations stored +within a context (e.g., iterate over all of the fields of a structure for +structure layout). + +In Clang's AST file format, deserializing a declaration that is a +``DeclContext`` is a separate operation from deserializing all of the +declarations stored within that declaration context. Therefore, Clang will +deserialize the translation unit declaration without deserializing the +declarations within that translation unit. When required, the declarations +stored within a declaration context will be deserialized. 
There are two
+representations of the declarations within a declaration context, which
+correspond to the name-lookup and iteration behavior described above:
+
+* When the front end performs name lookup to find a name ``x`` within a given
+  declaration context (for example, during semantic analysis of the expression
+  ``p->x``, where ``p``'s type is defined in the precompiled header), Clang
+  refers to an on-disk hash table that maps from the names within that
+  declaration context to the declaration IDs that represent each visible
+  declaration with that name. The actual declarations will then be
+  deserialized to provide the results of name lookup.
+* When the front end performs iteration over all of the declarations within a
+  declaration context, all of those declarations are immediately
+  de-serialized. For large declaration contexts (e.g., the translation unit),
+  this operation is expensive; however, large declaration contexts are not
+  traversed in normal compilation, since such a traversal is unnecessary.
+  However, it is common for the code generator and semantic analysis to
+  traverse declaration contexts for structs, classes, unions, and
+  enumerations, although those contexts contain relatively few declarations in
+  the common case.
+
+Statements and Expressions
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Statements and expressions are stored in the AST file in both the :ref:`types
+` and the :ref:`declarations ` blocks,
+because every statement or expression will be associated with either a type or
+declaration. The actual statement and expression records are stored
+immediately following the declaration or type that owns the statement or
+expression. For example, the statement representing the body of a function
+will be stored directly following the declaration of the function.
+
+As with types and declarations, each statement and expression kind in Clang's
+abstract syntax tree (``ForStmt``, ``CallExpr``, etc.)
has a corresponding
+record type in the AST file, which contains the serialized representation of
+that statement or expression. Each substatement or subexpression within an
+expression is stored as a separate record (which keeps most records to a fixed
+size). Within the AST file, the subexpressions of an expression are stored, in
+reverse order, prior to the expression that owns those expressions, using a form
+of `Reverse Polish Notation
+`_. For example, an
+expression ``3 - 4 + 5`` would be represented as follows:
+
++-----------------------+
+| ``IntegerLiteral(5)`` |
++-----------------------+
+| ``IntegerLiteral(4)`` |
++-----------------------+
+| ``IntegerLiteral(3)`` |
++-----------------------+
+| ``BinaryOperator(-)`` |
++-----------------------+
+| ``BinaryOperator(+)`` |
++-----------------------+
+| ``STOP``              |
++-----------------------+
+
+When reading this representation, Clang evaluates each expression record it
+encounters, builds the appropriate abstract syntax tree node, and then pushes
+that expression onto a stack. When a record contains *N* subexpressions ---
+``BinaryOperator`` has two of them --- those expressions are popped from the
+top of the stack. The special STOP code indicates that we have reached the end
+of a serialized expression or statement; other expression or statement records
+may follow, but they are part of a different expression.
+
+.. _pchinternals-ident-table:
+
+Identifier Table Block
+^^^^^^^^^^^^^^^^^^^^^^
+
+The identifier table block contains an on-disk hash table that maps each
+identifier mentioned within the AST file to the serialized representation of
+the identifier's information (e.g., the ``IdentifierInfo`` structure). The
+serialized representation contains:
+
+* The actual identifier string.
+* Flags that describe whether this identifier is the name of a built-in, a
+  poisoned identifier, an extension token, or a macro.
+* If the identifier names a macro, the offset of the macro definition within
+  the :ref:`pchinternals-preprocessor`.
+* If the identifier names one or more declarations visible from translation
+  unit scope, the :ref:`declaration IDs ` of these
+  declarations.
+
+When an AST file is loaded, the AST file reader mechanism introduces itself
+into the identifier table as an external lookup source. Thus, when the user
+program refers to an identifier that has not yet been seen, Clang will perform
+a lookup into the identifier table. If an identifier is found, its contents
+(macro definitions, flags, top-level declarations, etc.) will be deserialized,
+at which point the corresponding ``IdentifierInfo`` structure will have the
+same contents it would have after parsing the headers in the AST file.
+
+Within the AST file, the identifiers used to name declarations are represented
+with an integral value. A separate table provides a mapping from this integral
+value (the identifier ID) to the location within the on-disk hash table where
+that identifier is stored. This mapping is used when deserializing the name of
+a declaration, the identifier of a token, or any other construct in the AST
+file that refers to a name.
+
+.. _pchinternals-method-pool:
+
+Method Pool Block
+^^^^^^^^^^^^^^^^^
+
+The method pool block is represented as an on-disk hash table that serves two
+purposes: it provides a mapping from the names of Objective-C selectors to the
+set of Objective-C instance and class methods that have that particular
+selector (which is required for semantic analysis in Objective-C) and also
+stores all of the selectors used by entities within the AST file.
The design
+of the method pool is similar to that of the :ref:`identifier table
+`: the first time a particular selector is formed
+during the compilation of the program, Clang will search in the on-disk hash
+table of selectors; if found, Clang will read the Objective-C methods
+associated with that selector into the appropriate front-end data structure
+(``Sema::InstanceMethodPool`` and ``Sema::FactoryMethodPool`` for instance and
+class methods, respectively).
+
+As with identifiers, selectors are represented by numeric values within the AST
+file. A separate index maps these numeric selector values to the offset of the
+selector within the on-disk hash table, and will be used when de-serializing an
+Objective-C method declaration (or other Objective-C construct) that refers to
+the selector.
+
+AST Reader Integration Points
+-----------------------------
+
+The "lazy" deserialization behavior of AST files requires their integration
+into several completely different submodules of Clang. For example, lazily
+deserializing the declarations during name lookup requires that the name-lookup
+routines be able to query the AST file to find entities stored there.
+
+For each Clang data structure that requires direct interaction with the AST
+reader logic, there is an abstract class that provides the interface between
+the two modules. The ``ASTReader`` class, which handles the loading of an AST
+file, inherits from all of these abstract classes to provide lazy
+deserialization of Clang's data structures. ``ASTReader`` implements the
+following abstract classes:
+
+``StatSysCallCache``
+  This abstract interface is associated with the ``FileManager`` class, and is
+  used whenever the file manager is going to perform a ``stat()`` system call.
+
+``ExternalSLocEntrySource``
+  This abstract interface is associated with the ``SourceManager`` class, and
+  is used whenever the :ref:`source manager ` needs to
+  load the details of a file, buffer, or macro instantiation.
+
+``IdentifierInfoLookup``
+  This abstract interface is associated with the ``IdentifierTable`` class, and
+  is used whenever the program source refers to an identifier that has not yet
+  been seen. In this case, the AST reader searches for this identifier within
+  its :ref:`identifier table ` to load any top-level
+  declarations or macros associated with that identifier.
+
+``ExternalASTSource``
+  This abstract interface is associated with the ``ASTContext`` class, and is
+  used whenever the abstract syntax tree nodes need to be loaded from the AST
+  file. It provides the ability to de-serialize declarations and types
+  identified by their numeric values, read the bodies of functions when
+  required, and read the declarations stored within a declaration context
+  (either for iteration or for name lookup).
+
+``ExternalSemaSource``
+  This abstract interface is associated with the ``Sema`` class, and is used
+  whenever semantic analysis needs to read information from the :ref:`global
+  method pool `.
+
+.. _pchinternals-chained:
+
+Chained precompiled headers
+---------------------------
+
+Chained precompiled headers were initially intended to improve the performance
+of IDE-centric operations such as syntax highlighting and code completion while
+a particular source file is being edited by the user. To minimize the amount
+of reparsing required after a change to the file, a form of precompiled header
+--- called a precompiled *preamble* --- is automatically generated by parsing
+all of the headers in the source file, up to and including the last
+``#include``. When only the source file changes (and none of the headers it
+depends on), reparsing of that source file can use the precompiled preamble and
+start parsing after the ``#include``\ s, so parsing time is proportional to the
+size of the source file (rather than all of its includes).
However, the
+compilation of that translation unit may already use a precompiled header: in
+this case, Clang will create the precompiled preamble as a chained precompiled
+header that refers to the original precompiled header. This drastically
+reduces the time needed to serialize the precompiled preamble for use in
+reparsing.
+
+Chained precompiled headers get their name because each precompiled header can
+depend on one other precompiled header, forming a chain of dependencies. A
+translation unit will then include the precompiled header that starts the chain
+(i.e., nothing depends on it). This linearity of dependencies is important for
+the semantic model of chained precompiled headers, because the most-recent
+precompiled header can provide information that overrides the information
+provided by the precompiled headers it depends on, just like a header file
+``B.h`` that includes another header ``A.h`` can modify the state produced by
+parsing ``A.h``, e.g., by ``#undef``'ing a macro defined in ``A.h``.
+
+There are several ways in which chained precompiled headers generalize the AST
+file model:
+
+Numbering of IDs
+  Many different kinds of entities --- identifiers, declarations, types, etc.
+  --- have ID numbers that start at 1 or some other predefined constant and
+  grow upward. Each precompiled header records the maximum ID number it has
+  assigned in each category. Then, when a new precompiled header is generated
+  that depends on (chains to) another precompiled header, it will start
+  counting at the next available ID number. This way, one can determine, given
+  an ID number, which AST file actually contains the entity.
+
+Name lookup
+  When writing a chained precompiled header, Clang attempts to write only
+  information that has changed from the precompiled header on which it is
+  based. This changes the lookup algorithm for the various tables, such as the
+  :ref:`identifier table `: the search starts at the
+  most-recent precompiled header.
If no entry is found, lookup then proceeds
+  to the identifier table in the precompiled header it depends on, and so on.
+  Once a lookup succeeds, that result is considered definitive, overriding any
+  results from earlier precompiled headers.
+
+Update records
+  There are various ways in which a later precompiled header can modify the
+  entities described in an earlier precompiled header. For example, later
+  precompiled headers can add entries into the various name-lookup tables for
+  the translation unit or namespaces, or add new categories to an Objective-C
+  class. Each of these updates is captured in an "update record" that is
+  stored in the chained precompiled header file and will be loaded along with
+  the original entity.
+
+.. _pchinternals-modules:
+
+Modules
+-------
+
+Modules generalize the chained precompiled header model yet further, from a
+linear chain of precompiled headers to an arbitrary directed acyclic graph
+(DAG) of AST files. All of the same techniques used to make chained
+precompiled headers work --- ID number, name lookup, update records --- are
+shared with modules. However, the DAG nature of modules introduces a number of
+additional complications to the model:
+
+Numbering of IDs
+  The simple, linear numbering scheme used in chained precompiled headers falls
+  apart with the module DAG, because different modules may end up with
+  different numbering schemes for entities they imported from common shared
+  modules. To account for this, each module file provides information about
+  which modules it depends on and which ID numbers it assigned to the entities
+  in those modules, as well as which ID numbers it took for its own new
+  entities. The AST reader then maps these "local" ID numbers into a "global"
+  ID number space for the current translation unit, providing a 1-1 mapping
+  between entities (in whatever AST file they inhabit) and global ID numbers.
+  If that translation unit is then serialized into an AST file, this mapping
+  will be stored for use when the AST file is imported.
+
+Declaration merging
+  It is possible for a given entity (from the language's perspective) to be
+  declared multiple times in different places. For example, two different
+  headers can have the declaration of ``printf`` or could forward-declare
+  ``struct stat``. If each of those headers is included in a module, and some
+  third party imports both of those modules, there is a potentially serious
+  problem: name lookup for ``printf`` or ``struct stat`` will find both
+  declarations, but the AST nodes are unrelated. This would result in a
+  compilation error, due to an ambiguity in name lookup. Therefore, the AST
+  reader performs declaration merging according to the appropriate language
+  semantics, ensuring that the two disjoint declarations are merged into a
+  single redeclaration chain (with a common canonical declaration), so that it
+  is as if one of the headers had been included before the other.
+
+Name Visibility
+  Modules allow certain names that occur during module creation to be "hidden",
+  so that they are not part of the public interface of the module and are not
+  visible to its clients. The AST reader maintains a "visible" bit on various
+  AST nodes (declarations, macros, etc.) to indicate whether that particular
+  AST node is currently visible; the various name lookup mechanisms in Clang
+  inspect the visible bit to determine whether that entity, which is still in
+  the AST (because other, visible AST nodes may depend on it), can actually be
+  found by name lookup. When a new (sub)module is imported, it may make
+  existing, non-visible, already-deserialized AST nodes visible; it is the
+  responsibility of the AST reader to find and update these AST nodes when it
+  is notified of the import.
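
The local-to-global ID mapping described under "Numbering of IDs" in the Modules section can be sketched as follows. This is a minimal illustration under stated assumptions, not Clang's implementation: ``GlobalIDSpace``, ``LoadedFile``, ``load``, ``toGlobal`` and ``owner`` are hypothetical names, local IDs are taken to be 0-based, and each loaded AST file is assumed to receive one contiguous block of global IDs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified model of one loaded AST file. These names are
// illustrative only and are not part of Clang.
struct LoadedFile {
  std::string name;
  std::uint32_t numLocalIDs;  // number of entities the file defines itself
  std::uint32_t globalBase;   // global ID given to the file's local ID 0
};

class GlobalIDSpace {
public:
  // Reserve a contiguous block of global IDs for a newly loaded file and
  // return the index used to refer to that file later.
  std::size_t load(const std::string &name, std::uint32_t numLocalIDs) {
    files_.push_back(LoadedFile{name, numLocalIDs, next_});
    next_ += numLocalIDs;
    return files_.size() - 1;
  }

  // Translate a file-local ID into the translation unit's global ID space.
  std::uint32_t toGlobal(std::size_t file, std::uint32_t localID) const {
    const LoadedFile &f = files_.at(file);
    assert(localID < f.numLocalIDs && "local ID out of range");
    return f.globalBase + localID;
  }

  // Find which loaded file owns a global ID; returns "" if none does.
  std::string owner(std::uint32_t globalID) const {
    for (const LoadedFile &f : files_)
      if (globalID >= f.globalBase && globalID - f.globalBase < f.numLocalIDs)
        return f.name;
    return std::string();
  }

private:
  std::vector<LoadedFile> files_;
  std::uint32_t next_ = 1;  // assumption: global ID 0 means "no entity"
};
```

Loading one file with three entities and then one with two gives the first file global IDs 1 to 3 and the second 4 to 5, so each entity has exactly one global ID regardless of which file it came from.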
+
diff --git a/docs/ThreadSanitizer.html b/docs/ThreadSanitizer.html
deleted file mode 100644
index aa251c1153..0000000000
--- a/docs/ThreadSanitizer.html
+++ /dev/null
@@ -1,126 +0,0 @@
-
-
-
-
-
- ThreadSanitizer, a race detector
-
-
-
-
-
-
-
-
-
- -

ThreadSanitizer

- - -

Introduction

-ThreadSanitizer is a tool that detects data races.
-It consists of a compiler instrumentation module and a run-time library.
-Typical slowdown introduced by ThreadSanitizer is 5x-15x (TODO: these numbers are -approximate so far). - -

How to build

-Follow the clang build instructions. -CMake build is supported.
- -

Supported Platforms

-ThreadSanitizer is supported on Linux x86_64 (tested on Ubuntu 10.04).
-Support for MacOS 10.7 (64-bit only) is planned for late 2012.
-Support for 32-bit platforms is problematic and not yet planned. - - - -

Usage

-Simply compile your program with -fsanitize=thread -fPIE and link it -with -fsanitize=thread -pie.
-To get a reasonable performance add -O1 or higher.
-Use -g to get file names and line numbers in the warning messages.
- -Example: -
-% cat projects/compiler-rt/lib/tsan/output_tests/tiny_race.c
-#include <pthread.h>
-int Global;
-void *Thread1(void *x) {
-  Global = 42;
-  return x;
-}
-int main() {
-  pthread_t t;
-  pthread_create(&t, NULL, Thread1, NULL);
-  Global = 43;
-  pthread_join(t, NULL);
-  return Global;
-}
-
- -
-% clang -fsanitize=thread -g -O1 tiny_race.c -fPIE -pie
-
- -If a bug is detected, the program will print an error message to stderr. -Currently, ThreadSanitizer symbolizes its output using an external -addr2line -process (this will be fixed in future). -
-% TSAN_OPTIONS=strip_path_prefix=`pwd`/  # Don't print full paths.
-% ./a.out 2> log
-% cat log
-WARNING: ThreadSanitizer: data race (pid=19219)
-  Write of size 4 at 0x7fcf47b21bc0 by thread 1:
-    #0 Thread1 tiny_race.c:4 (exe+0x00000000a360)
-  Previous write of size 4 at 0x7fcf47b21bc0 by main thread:
-    #0 main tiny_race.c:10 (exe+0x00000000a3b4)
-  Thread 1 (running) created at:
-    #0 pthread_create ??:0 (exe+0x00000000c790)
-    #1 main tiny_race.c:9 (exe+0x00000000a3a4)
-
- - -

Limitations

-
    -
  • ThreadSanitizer uses more real memory than a native run. -At the default settings the memory overhead is 9x plus 9Mb per each thread. -Settings with 5x and 3x overhead (but less accurate analysis) are also available. -
  • ThreadSanitizer maps (but does not reserve) a lot of virtual address space. -This means that tools like ulimit may not work as usually expected. -
  • Static linking is not supported. -
  • ThreadSanitizer requires -fPIE -pie -
- - -

Current Status

-ThreadSanitizer is in alpha stage. -It is known to work on large C++ programs using pthreads, but we do not promise -anything (yet).
-C++11 threading is not yet supported.
-The test suite is integrated into CMake build and can be run with -make check-tsan command.
- -We are actively working on enhancing the tool -- stay tuned. -Any help, especially in the form of minimized standalone tests is more than welcome. - -

More Information

-http://code.google.com/p/thread-sanitizer. - - -
-
-
diff --git a/docs/ThreadSanitizer.rst b/docs/ThreadSanitizer.rst
new file mode 100644
index 0000000000..5f04170da6
--- /dev/null
+++ b/docs/ThreadSanitizer.rst
@@ -0,0 +1,95 @@
+ThreadSanitizer
+===============
+
+Introduction
+------------
+
+ThreadSanitizer is a tool that detects data races. It consists of a compiler
+instrumentation module and a run-time library. Typical slowdown introduced by
+ThreadSanitizer is **5x-15x** (TODO: these numbers are approximate so far).
+
+How to build
+------------
+
+Follow the `Clang build instructions <../get_started.html>`_. CMake build is
+supported.
+
+Supported Platforms
+-------------------
+
+ThreadSanitizer is supported on Linux x86_64 (tested on Ubuntu 10.04). Support
+for MacOS 10.7 (64-bit only) is planned for late 2012. Support for 32-bit
+platforms is problematic and not yet planned.
+
+Usage
+-----
+
+Simply compile your program with ``-fsanitize=thread -fPIE`` and link it with
+``-fsanitize=thread -pie``. To get a reasonable performance add ``-O1`` or
+higher. Use ``-g`` to get file names and line numbers in the warning messages.
+
+Example:
+
+.. code-block:: c++
+
+  % cat projects/compiler-rt/lib/tsan/output_tests/tiny_race.c
+  #include <pthread.h>
+  int Global;
+  void *Thread1(void *x) {
+    Global = 42;
+    return x;
+  }
+  int main() {
+    pthread_t t;
+    pthread_create(&t, NULL, Thread1, NULL);
+    Global = 43;
+    pthread_join(t, NULL);
+    return Global;
+  }
+
+  $ clang -fsanitize=thread -g -O1 tiny_race.c -fPIE -pie
+
+If a bug is detected, the program will print an error message to stderr.
+Currently, ThreadSanitizer symbolizes its output using an external
+``addr2line`` process (this will be fixed in future).
+
+.. code-block:: bash
+
+  % TSAN_OPTIONS=strip_path_prefix=`pwd`/  # Don't print full paths.
+  % ./a.out 2> log
+  % cat log
+  WARNING: ThreadSanitizer: data race (pid=19219)
+    Write of size 4 at 0x7fcf47b21bc0 by thread 1:
+      #0 Thread1 tiny_race.c:4 (exe+0x00000000a360)
+    Previous write of size 4 at 0x7fcf47b21bc0 by main thread:
+      #0 main tiny_race.c:10 (exe+0x00000000a3b4)
+    Thread 1 (running) created at:
+      #0 pthread_create ??:0 (exe+0x00000000c790)
+      #1 main tiny_race.c:9 (exe+0x00000000a3a4)
+
+Limitations
+-----------
+
+* ThreadSanitizer uses more real memory than a native run. At the default
+  settings the memory overhead is 9x plus 9Mb per each thread. Settings with 5x
+  and 3x overhead (but less accurate analysis) are also available.
+* ThreadSanitizer maps (but does not reserve) a lot of virtual address space.
+  This means that tools like ``ulimit`` may not work as usually expected.
+* Static linking is not supported.
+* ThreadSanitizer requires ``-fPIE -pie``.
+
+Current Status
+--------------
+
+ThreadSanitizer is in alpha stage. It is known to work on large C++ programs
+using pthreads, but we do not promise anything (yet). C++11 threading is not
+yet supported. The test suite is integrated into CMake build and can be run
+with ``make check-tsan`` command.
+
+We are actively working on enhancing the tool --- stay tuned. Any help,
+especially in the form of minimized standalone tests is more than welcome.
+
+More Information
+----------------
+`http://code.google.com/p/thread-sanitizer `_.
+
diff --git a/docs/Tooling.html b/docs/Tooling.html
deleted file mode 100644
index 74837f4c99..0000000000
--- a/docs/Tooling.html
+++ /dev/null
@@ -1,120 +0,0 @@
-
-
-
-Writing Clang Tools
-
-
-
-
-
-
-
-
- -

Writing Clang Tools

-

Clang provides infrastructure to write tools that need syntactic and semantic -information about a program. This document will give a short introduction of the -different ways to write clang tools, and their pros and cons.

- - -

LibClang

- - -

LibClang is a stable high level C interface to clang. When in doubt LibClang -is probably the interface you want to use. Consider the other interfaces only -when you have a good reason not to use LibClang.

-

Canonical examples of when to use LibClang:

-
    -
  • Xcode
  • -
  • Clang Python Bindings
  • -
-

Use LibClang when you...

-
    -
  • want to interface with clang from other languages than C++
  • -
  • need a stable interface that takes care to be backwards compatible
  • -
  • want powerful high-level abstractions, like iterating through an AST -with a cursor, and don't want to learn all the nitty gritty details of Clang's -AST.
  • -
-

Do not use LibClang when you...

-
    -
  • want full control over the Clang AST
  • -
- - -

Clang Plugins

- - -

Clang Plugins allow you to run additional actions on the AST as part of -a compilation. Plugins are dynamic libraries that are loaded at runtime by -the compiler, and they're easy to integrate into your build environment.

-

Canonical examples of when to use Clang Plugins:

-
    -
  • special lint-style warnings or errors for your project
  • -
  • creating additional build artifacts from a single compile step
  • -
-

Use Clang Plugins when you...

-
    -
  • need your tool to rerun if any of the dependencies change
  • -
  • want your tool to make or break a build
  • -
  • need full control over the Clang AST
  • -
-

Do not use Clang Plugins when you...

-
    -
  • want to run tools outside of your build environment
  • -
  • want full control on how Clang is set up, including mapping of in-memory - virtual files
  • -
  • need to run over a specific subset of files in your project which is not - necessarily related to any changes which would trigger rebuilds
  • -
- - -

LibTooling

- - -

LibTooling is a C++ interface aimed at writing standalone tools, as well as -integrating into services that run clang tools.

-

Canonical examples of when to use LibTooling:

-
    -
  • a simple syntax checker
  • -
  • refactoring tools
  • -
-

Use LibTooling when you...

-
    -
  • want to run tools over a single file, or a specific subset of files, - independently of the build system
  • -
  • want full control over the Clang AST
  • -
  • want to share code with Clang Plugins
  • -
-

Do not use LibTooling when you...

-
    -
  • want to run as part of the build triggered by dependency changes
  • -
  • want a stable interface so you don't need to change your code when the - AST API changes
  • -
  • want high level abstractions like cursors and code completion out of the - box
  • -
  • do not want to write your tools in C++
  • -
- - -

Clang Tools

- - -

These are a collection of specific developer tools built on top of the -LibTooling infrastructure as part of the Clang project. They are targeted at -automating and improving core development activities of C/C++ developers.

-

Examples of tools we are building or planning as part of the Clang -project:

-
    -
  • Syntax checking (clang-check)
  • -
  • Automatic fixing of compile errors (clangc-fixit)
  • -
  • Automatic code formatting
  • -
  • Migration tools for new features in new language standards
  • -
  • Core refactoring tools
  • -
- -
-
-
-
diff --git a/docs/Tooling.rst b/docs/Tooling.rst
new file mode 100644
index 0000000000..78b92efa9f
--- /dev/null
+++ b/docs/Tooling.rst
@@ -0,0 +1,100 @@
+===================
+Writing Clang Tools
+===================
+
+Clang provides infrastructure to write tools that need syntactic and semantic
+information about a program. This document will give a short introduction of
+the different ways to write clang tools, and their pros and cons.
+
+LibClang
+--------
+
+`LibClang `_ is a stable high
+level C interface to clang. When in doubt LibClang is probably the interface
+you want to use. Consider the other interfaces only when you have a good
+reason not to use LibClang.
+
+Canonical examples of when to use LibClang:
+
+* Xcode
+* Clang Python Bindings
+
+Use LibClang when you...:
+
+* want to interface with clang from other languages than C++
+* need a stable interface that takes care to be backwards compatible
+* want powerful high-level abstractions, like iterating through an AST with a
+  cursor, and don't want to learn all the nitty gritty details of Clang's AST.
+
+Do not use LibClang when you...:
+
+* want full control over the Clang AST
+
+Clang Plugins
+-------------
+
+`Clang Plugins `_ allow you to run additional actions on the
+AST as part of a compilation. Plugins are dynamic libraries that are loaded at
+runtime by the compiler, and they're easy to integrate into your build
+environment.
+
+Canonical examples of when to use Clang Plugins:
+
+* special lint-style warnings or errors for your project
+* creating additional build artifacts from a single compile step
+
+Use Clang Plugins when you...:
+
+* need your tool to rerun if any of the dependencies change
+* want your tool to make or break a build
+* need full control over the Clang AST
+
+Do not use Clang Plugins when you...:
+
+* want to run tools outside of your build environment
+* want full control on how Clang is set up, including mapping of in-memory
+  virtual files
+* need to run over a specific subset of files in your project which is not
+  necessarily related to any changes which would trigger rebuilds
+
+LibTooling
+----------
+
+`LibTooling `_ is a C++ interface aimed at writing standalone
+tools, as well as integrating into services that run clang tools. Canonical
+examples of when to use LibTooling:
+
+* a simple syntax checker
+* refactoring tools
+
+Use LibTooling when you...:
+
+* want to run tools over a single file, or a specific subset of files,
+  independently of the build system
+* want full control over the Clang AST
+* want to share code with Clang Plugins
+
+Do not use LibTooling when you...:
+
+* want to run as part of the build triggered by dependency changes
+* want a stable interface so you don't need to change your code when the AST API
+  changes
+* want high level abstractions like cursors and code completion out of the box
+* do not want to write your tools in C++
+
+Clang Tools
+-----------
+
+`Clang tools `_ are a collection of specific developer tools
+built on top of the LibTooling infrastructure as part of the Clang project.
+They are targeted at automating and improving core development activities of
+C/C++ developers.
+
+Examples of tools we are building or planning as part of the Clang project:
+
+* Syntax checking (:program:`clang-check`)
+* Automatic fixing of compile errors (:program:`clang-fixit`)
+* Automatic code formatting
+* Migration tools for new features in new language standards
+* Core refactoring tools
+
diff --git a/docs/index.rst b/docs/index.rst
index e82d70472a..fab47916b4 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -12,6 +12,12 @@ progress. This page will get filled out with docs soon...
 .. toctree::
    :maxdepth: 2
 
+   LanguageExtensions
+   LibASTMatchers
+   LibTooling
+   PCHInternals
+   ThreadSanitizer
+   Tooling
 
 Indices and tables