From: John McCall
Date: Wed, 15 Jun 2011 21:21:53 +0000 (+0000)
Subject: The specification document for the new ObjC Automatic Reference Counting
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=8246702d0cbecc3fd5748b58614ffed7ad9e04a5;p=clang

The specification document for the new ObjC Automatic Reference Counting
feature. Implementation to follow. :)

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@133090 91177308-0d34-0410-b5e6-96231b3b80d8
---

diff --git a/docs/AutomaticReferenceCounting.html b/docs/AutomaticReferenceCounting.html
new file mode 100644
index 0000000000..3d34ddb310
--- /dev/null
+++ b/docs/AutomaticReferenceCounting.html
@@ -0,0 +1,1350 @@
+
+Objective-C Automatic Reference Counting (ARC)
+
+

Automatic Reference Counting

+ +
+
+ +
+

About this document

+ +
+

Purpose

+ +

The first and primary purpose of this document is to serve as a +complete technical specification of Automatic Reference Counting. +Given a core Objective-C compiler and runtime, it should be possible +to write a compiler and runtime which implements these new +semantics.

+ +

The secondary purpose is to act as a rationale for why ARC was +designed in this way. This should remain tightly focused on the +technical design and should not stray into marketing speculation.

+ +
+ +
+

Background

+ +

This document assumes a basic familiarity with C.

+ +

Blocks are a C language extension for +creating anonymous functions. Users interact with and transfer block +objects using block pointers, which are +represented like normal pointers. A block may capture values from +local variables; when this occurs, memory must be dynamically +allocated. The initial allocation is done on the stack, but the +runtime provides a Block_copy function which, given a block +pointer, either copies the underlying block object to the heap, +setting its reference count to 1 and returning the new block pointer, +or (if the block object is already on the heap) increases its +reference count by 1. The paired function is Block_release, +which decreases the reference count by 1 and, if the object is on the +heap and the count reaches zero, destroys it.
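The copy-or-retain behavior described above can be modeled in plain C. This is only a rough sketch, not the Blocks runtime: ToyBlock, toy_block_copy, and toy_block_release are hypothetical names, the layout is invented for illustration, and the boolean result of the release function exists only so the outcome can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a block object; not the real runtime layout. */
typedef struct ToyBlock {
    bool on_heap;   /* false while the object still lives on the stack */
    int refcount;   /* only meaningful for heap objects */
    int captured;   /* a value captured from a local variable */
} ToyBlock;

/* Models Block_copy: copy a stack block to the heap with count 1,
   or bump the count of a block that is already on the heap. */
static ToyBlock *toy_block_copy(ToyBlock *b) {
    assert(b != NULL);
    if (b->on_heap) {
        b->refcount += 1;
        return b;
    }
    ToyBlock *heap = malloc(sizeof *heap);
    assert(heap != NULL);
    *heap = *b;
    heap->on_heap = true;
    heap->refcount = 1;
    return heap;
}

/* Models Block_release: drop one reference and destroy a heap block
   when the count reaches zero. Returns true if the block was destroyed
   (the return value is an assumption added for observability). */
static bool toy_block_release(ToyBlock *b) {
    assert(b != NULL);
    if (!b->on_heap) return false;   /* stack blocks are never freed here */
    b->refcount -= 1;
    if (b->refcount == 0) {
        free(b);
        return true;
    }
    return false;
}
```

Copying a stack block yields a distinct heap object; copying that heap object again returns the same pointer with a higher count.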

+ +

Objective-C is a set of language extensions, significant enough to +be considered a different language. It is a strict superset of C. +The extensions can also be imposed on C++, producing a language called +Objective-C++. The primary feature is a single-inheritance object +system; we briefly describe the modern dialect.

+ +

Objective-C defines a new type kind, collectively called +the object pointer types. This kind has two +notable builtin members, id and Class; id +is the final supertype of all object pointers. The validity of +conversions between object pointer types is not checked at runtime. +Users may define classes; each class is a +type, and the pointer to that type is an object pointer type. A class +may have a superclass; its pointer type is a subtype of its +superclass's pointer type. A class has a set +of ivars, fields which appear on all +instances of that class. For every class T there's an +associated metaclass; it has no fields, its superclass is the +metaclass of T's superclass, and its metaclass is a global +class. Every class has a global object whose class is the +class's metaclass; metaclasses have no associated type, so pointers to +this object have type Class.

+ +

A class declaration (@interface) declares a set +of methods. A method has a return type, a +list of argument types, and a selector: a +name like foo:bar:baz:, where the number of colons +corresponds to the number of formal arguments. A method may be an +instance method, in which case it can be invoked on objects of the +class, or a class method, in which case it can be invoked on objects +of the metaclass. A method may be invoked by providing an object +(called the receiver) and a list of formal +arguments interspersed with the selector, like so:

+ +
[receiver foo: fooArg bar: barArg baz: bazArg]
+ +

This looks in the dynamic class of the receiver for a method with +this name, then in that class's superclass, etc., until it finds +something it can execute. The receiver expression may also be +the name of a class, in which case the actual receiver is the class +object for that class, or (within method definitions) it may +be super, in which case the lookup algorithm starts with the +static superclass instead of the dynamic class. The actual methods +dynamically found in a class are not those declared in the +@interface, but those defined in a separate +@implementation declaration; however, when compiling a +call, typechecking is done based on the methods declared in the +@interface.

+ +

Method declarations may also be grouped into +protocols, which are not inherently +associated with any class, but which classes may claim to follow. +Object pointer types may be qualified with additional protocols that +the object is known to support.

+ +

Class extensions are collections of ivars +and methods, designed to allow a class's @interface to be +split across multiple files; however, there is still a primary +implementation file which must see the @interfaces of all +class extensions. +Categories allow methods (but not ivars) to +be declared post hoc on an arbitrary class; the methods in the +category's @implementation will be dynamically added to that +class's method tables when the category is loaded at runtime, +replacing those methods in case of a collision.

+ +

In the standard environment, objects are allocated on the heap, and +their lifetime is manually managed using a reference count. This is +done using two instance methods which all classes are expected to +implement: retain increases the object's reference count by +1, whereas release decreases it by 1 and calls the instance +method dealloc if the count reaches 0. To simplify certain +operations, there is also an autorelease +pool, a thread-local list of objects to call release +on later; an object can be added to this pool by +calling autorelease on it.
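As an illustration of the manual protocol described above, the following plain-C sketch models retain, release, and an autorelease pool. It assumes a single fixed-size pool and ignores the thread-local aspect; ToyObject, ToyPool, and the toy_* functions are hypothetical names, and dealloc is represented by a counter rather than by real deallocation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object header; real Objective-C objects differ. */
typedef struct ToyObject {
    int refcount;
    int dealloc_calls;   /* counts how often the "dealloc" step ran */
} ToyObject;

/* retain: increase the count by 1 and return the object. */
static ToyObject *toy_retain(ToyObject *o) {
    if (o) o->refcount += 1;
    return o;
}

/* release: decrease the count by 1; "dealloc" runs when it reaches 0. */
static void toy_release(ToyObject *o) {
    if (!o) return;
    o->refcount -= 1;
    if (o->refcount == 0) o->dealloc_calls += 1;
}

/* A fixed-size autorelease pool: a list of objects to release later. */
enum { TOY_POOL_CAPACITY = 16 };
typedef struct ToyPool {
    ToyObject *items[TOY_POOL_CAPACITY];
    size_t count;
} ToyPool;

/* autorelease: remember the object; its count is unchanged for now. */
static ToyObject *toy_autorelease(ToyPool *pool, ToyObject *o) {
    assert(pool->count < TOY_POOL_CAPACITY);
    pool->items[pool->count++] = o;
    return o;
}

/* Popping the pool sends the deferred releases. */
static void toy_pool_pop(ToyPool *pool) {
    for (size_t i = 0; i < pool->count; ++i) toy_release(pool->items[i]);
    pool->count = 0;
}
```

An autoreleased object keeps its count until the pool is popped, which is what lets a callee hand back a value it no longer wants to own.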

+ +

Block pointers may be converted to type id; block objects +are laid out in a way that makes them compatible with Objective-C +objects. There is a builtin class that all block objects are +considered to be objects of; this class implements retain by +adjusting the reference count, not by calling Block_copy.

+ +
+ +
+ +
+

General

+ +

Automatic Reference Counting implements automatic memory management +for Objective-C objects and blocks, freeing the programmer from the +need to explicitly insert retains and releases. It does not provide a +cycle collector; users must explicitly manage lifetime instead.

+ +

ARC may be explicitly enabled with the compiler +flag -fobjc-arc. It may also be explicitly disabled with the +compiler flag -fno-objc-arc. The last of these two flags +appearing on the compile line wins.

+ +

If ARC is enabled, __has_feature(objc_arc) will expand to +1 in the preprocessor. For more information about __has_feature, +see the language +extensions document.
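A source file that must build both with and without ARC can test the feature in the preprocessor. The sketch below follows the usual fallback idiom for compilers that lack __has_feature; the macro name TOY_USING_ARC is a hypothetical choice made for this example.

```c
/* Compilers without __has_feature get a fallback that always reports 0. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(objc_arc)
#define TOY_USING_ARC 1   /* hypothetical macro: ARC is enabled */
#else
#define TOY_USING_ARC 0   /* manual retain/release, or not Objective-C */
#endif
```

Code that still sends retain and release by hand can then be guarded with TOY_USING_ARC so that it is compiled only when ARC is off.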

+ +
+ +
+

Retainable object pointers

+ +

This section describes retainable object pointers, their basic +operations, and the restrictions imposed on their use under ARC. Note +in particular that it covers the rules for pointer values +(patterns of bits indicating the location of a pointed-to object), not +pointer +objects (locations in memory which store pointer values). +The rules for objects are covered in the next section.

+ +

A retainable object pointer +(or retainable pointer) is a value of +a retainable object pointer type +(retainable type). There are three kinds of retainable object +pointer types:

+
    +
  • block pointers (formed by applying the caret (^) +declarator sigil to a function type)
  • +
  • Objective-C object pointers (id, Class, NSFoo*, etc.)
  • +
  • typedefs marked with __attribute__((NSObject))
  • +
+ +

Other pointer types, such as int* and CFStringRef, +are not subject to ARC's semantics and restrictions.

+ +
+ +

Rationale: We are not at liberty to require +all code to be recompiled with ARC; therefore, ARC must interoperate +with Objective-C code which manages retains and releases manually. In +general, there are three requirements in order for a +compiler-supported reference-count system to provide reliable +interoperation:

+ +
    +
  • The type system must reliably identify which objects are to be +managed. An int* might be a pointer to a malloc'ed +array, or it might be an interior pointer to such an array, or it might +point to some field or local variable. In contrast, values of the +retainable object pointer types are never interior.
  • +
  • The type system must reliably indicate how to +manage objects of a type. This usually means that the type must imply +a procedure for incrementing and decrementing retain counts. +Supporting single-ownership objects requires a lot more explicit +mediation in the language.
  • +
  • There must be reliable conventions for whether and +when ownership is passed between caller and callee, for both +arguments and return values. Objective-C methods follow such a +convention very reliably, at least for system libraries on Mac OS X, +and functions always pass objects at +0. The C-based APIs for Core +Foundation objects, on the other hand, have much more varied transfer +semantics.
  • +
+
+ +

The use of __attribute__((NSObject)) typedefs is not +recommended. If it's absolutely necessary to use this attribute, be +very explicit about using the typedef, and do not assume that it will +be preserved by language features like __typeof and C++ +template argument substitution.

+ +

Rationale: any compiler operation which +incidentally strips type sugar from a type will yield a type +without the attribute, which may result in unexpected +behavior.

+ +
+

Retain count semantics

+ +

A retainable object pointer is either a null +pointer or a pointer to a valid object. Furthermore, if it has +block pointer type and is not null then it must actually be a +pointer to a block object, and if it has Class type (possibly +protocol-qualified) then it must actually be a pointer to a class +object. Otherwise ARC does not enforce the Objective-C type system as +long as the implementing methods follow the signature of the static +type. It is undefined behavior if ARC is exposed to an invalid +pointer.

+ +

For ARC's purposes, a valid object is one with well-behaved +retaining operations. Specifically, the object must be laid out such +that the Objective-C message send machinery can successfully send it +the following messages:

+ +
    +
  • retain, taking no arguments and returning a pointer to +the object.
  • +
  • release, taking no arguments and returning void.
  • +
  • autorelease, taking no arguments and returning a pointer +to the object.
  • +
+ +

The behavior of these methods is constrained in the following ways. +The term high-level semantics is an +intentionally vague term; the intent is that programmers must +implement these methods in a way such that the compiler, modifying +code in ways it deems safe according to these constraints, will not +violate their requirements. For example, if the user puts logging +statements in retain, they should not be surprised if those +statements are executed more or less often depending on optimization +settings. These constraints are not exhaustive of the optimization +opportunities: values held in local variables are subject to +additional restrictions, described later in this document.

+ +

It is undefined behavior if a computation history featuring a send +of retain followed by a send of release to the same +object, with no intervening release on that object, is not +equivalent under the high-level semantics to a computation +history in which these sends are removed. Note that this implies that +these methods may not raise exceptions.

+ +

It is undefined behavior if a computation history features any use +whatsoever of an object following the completion of a send +of release that is not preceded by a send of retain +to the same object.

+ +

The behavior of autorelease must be equivalent to sending +release when one of the autorelease pools currently in scope +is popped. It may not throw an exception.

+ +

When the semantics call for performing one of these operations on a +retainable object pointer, if that pointer is null then the +effect is a no-op.

+ +

All of the semantics described in this document are subject to +additional optimization rules which permit +the removal or optimization of operations based on local knowledge of +data flow. The semantics describe the high-level behaviors that the +compiler implements, not an exact sequence of operations that a +program will be compiled into.

+ +
+ +
+

Retainable object pointers as operands and arguments

+ +

In general, ARC does not perform retain or release operations when +simply using a retainable object pointer as an operand within an +expression. This includes:

+
    +
  • loading a retainable pointer from an object with non-weak +ownership,
  • +
  • passing a retainable pointer as an argument to a function or +method, and
  • +
  • receiving a retainable pointer as the result of a function or +method call.
  • +
+ +

Rationale: while this might seem +uncontroversial, it is actually unsafe when multiple expressions are +evaluated in parallel, as with binary operators and calls, +because (for example) one expression might load from an object while +another writes to it. However, C and C++ already call this undefined +behavior because the evaluations are unsequenced, and ARC simply +exploits that here to avoid needing to retain arguments across a large +number of calls.

+ +

The remainder of this section describes exceptions to these rules, +how those exceptions are detected, and what those exceptions imply +semantically.

+ +
+

Consumed parameters

+ +

A function or method parameter of retainable object pointer type +may be marked as consumed, signifying that +the callee expects to take ownership of a +1 retain count. This is +done by adding the ns_consumed attribute to the parameter +declaration, like so:

+ +
void foo(__attribute__((ns_consumed)) id x);
+- (void) foo: (id) __attribute__((ns_consumed)) x;
+ +

This attribute is part of the type of the function or method, not +the type of the parameter. It controls only how the argument is +passed and received.

+ +

When passing such an argument, ARC retains the argument prior to +making the call.

+ +

When receiving such an argument, ARC releases the argument at the +end of the function, subject to the usual optimizations for local +values.
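The caller-side retain and callee-side release can be pictured with a plain-C model. This is a sketch of the bookkeeping only, written out by hand where ARC would insert the operations; ToyObject, toy_consume, toy_call_consume, and observed_inside are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyObject { int refcount; } ToyObject;   /* hypothetical */

static ToyObject *toy_retain(ToyObject *o) { if (o) o->refcount += 1; return o; }
static void toy_release(ToyObject *o) { if (o) o->refcount -= 1; }

static int observed_inside;   /* retain count seen by the callee */

/* Callee with a "consumed" parameter: it receives a +1 and must balance it. */
static void toy_consume(ToyObject *x /* consumed */) {
    observed_inside = x ? x->refcount : 0;
    toy_release(x);   /* the release that would be emitted at function exit */
}

/* Caller side: retain first, then hand the +1 over to the callee. */
static void toy_call_consume(ToyObject *x) {
    toy_consume(toy_retain(x));
}
```

After the call returns, the caller's own reference is untouched: the extra retain and the callee's release cancel out.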

+ +

Rationale: this formalizes direct transfers +of ownership from a caller to a callee. The most common scenario here +is passing the self parameter to init, but it is +useful to generalize. Typically, local optimization will remove any +extra retains and releases: on the caller side the retain will be +merged with a +1 source, and on the callee side the release will be +rolled into the initialization of the parameter.

+ +

The implicit self parameter of a method may be marked as +consumed by adding __attribute__((ns_consumes_self)) to the +method declaration. Methods in +the init family are implicitly +marked __attribute__((ns_consumes_self)).

+ +

It is undefined behavior if an Objective-C message send of a method +with ns_consumed parameters (other than self) is made to a +null pointer.

+ +

Rationale: in fact, it's probably a +guaranteed leak.

+ +
+ +
+

Retained return values

+ +

A function or method which returns a retainable object pointer type +may be marked as returning a retained value, signifying that the +caller expects to take ownership of a +1 retain count. This is done +by adding the ns_returns_retained attribute to the function or +method declaration, like so:

+ +
id foo(void) __attribute__((ns_returns_retained));
+- (id) foo __attribute__((ns_returns_retained));
+ +

This attribute is part of the type of the function or method.

+ +

When returning from such a function or method, ARC retains the +value at the point of evaluation of the return statement, before +leaving all local scopes.

+ +

When receiving a return result from such a function or method, ARC +releases the value at the end of the full-expression it is contained +within, subject to the usual optimizations for local values.

+ +

Rationale: this formalizes direct transfers of +ownership from a callee to a caller. The most common scenario this +models is the retained return from init, alloc, +new, and copy methods, but there are other cases in +the frameworks. After optimization there are typically no extra +retains and releases required.

+ +

Methods in +the alloc, copy, init, mutableCopy, +and new families are implicitly marked +__attribute__((ns_returns_retained)). This may be suppressed +by explicitly marking the +method __attribute__((ns_returns_not_retained)).

+
+ +
+

Unretained return values

+ +

A method or function which returns a retainable object type but +does not return a retained value must ensure that the object is +still valid across the return boundary.

+ +

When returning from such a function or method, ARC retains the +value at the point of evaluation of the return statement, then leaves +all local scopes, and then balances out the retain while ensuring that +the value lives across the call boundary. In the worst case, this may +involve an autorelease, but callers must not assume that the +value is actually in the autorelease pool.

+ +

ARC performs no extra mandatory work on the caller side, although +it may elect to do something to shorten the lifetime of the returned +value.

+ +

Rationale: it is common in non-ARC code to not +return an autoreleased value; therefore the convention does not force +either path. It is convenient to not be required to do unnecessary +retains and autoreleases; this permits optimizations such as eliding +retain/autoreleases when it can be shown that the original pointer +will still be valid at the point of return.

+ +

A method or function may be marked +with __attribute__((ns_returns_autoreleased)) to indicate +that it returns a pointer which is guaranteed to be valid at least as +long as the innermost autorelease pool. There are no additional +semantics enforced in the definition of such a method; it merely +enables optimizations in callers.

+
+ +
+

Bridged casts

+ +

A bridged cast is a C-style cast +annotated with one of three keywords:

+ +
    +
  • (__bridge T) op casts the operand to the destination +type T. If T is a retainable object pointer type, +then op must have a non-retainable pointer type. +If T is a non-retainable pointer type, then op must +have a retainable object pointer type. Otherwise the cast is +ill-formed. There is no transfer of ownership, and ARC inserts +no retain operations.
  • + +
  • (__bridge_retained T) op casts the operand, which must +have retainable object pointer type, to the destination type, which +must be a non-retainable pointer type. ARC retains the value, subject +to the usual optimizations on local values, and the recipient is +responsible for balancing that +1.
  • + +
  • (__bridge_transfer T) op casts the operand, which must +have non-retainable pointer type, to the destination type, which must +be a retainable object pointer type. ARC will release the value at +the end of the enclosing full-expression, subject to the usual +optimizations on local values.
  • +
+ +
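The ownership effect of the three bridged casts can be summarized with a plain-C model. Real bridged casts are compile-time annotations rather than function calls, so this is only an analogy; ToyObject and the toy_* functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyObject { int refcount; } ToyObject;   /* hypothetical */

static ToyObject *toy_retain(ToyObject *o) { if (o) o->refcount += 1; return o; }
static void toy_release(ToyObject *o) { if (o) o->refcount -= 1; }

/* Models (__bridge T) op: the pointer changes type, ownership does not move. */
static void *toy_bridge(ToyObject *o) {
    return o;
}

/* Models (__bridge_retained T) op: the result carries a +1 that the
   recipient is responsible for balancing. */
static void *toy_bridge_retained(ToyObject *o) {
    return toy_retain(o);
}

/* Models a full-expression that uses (__bridge_transfer T) op: the value is
   used, and the +1 it carried is released when the expression ends.
   Returns the retain count observed during the use. */
static int toy_use_transferred(void *p) {
    ToyObject *o = p;
    assert(o != NULL);
    int seen = o->refcount;
    toy_release(o);
    return seen;
}
```

A pointer produced by the retained form and later consumed by the transfer form leaves the retain count where it started.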

These casts are required in order to transfer objects in and out of +ARC control; see the rationale in the section +on conversion of retainable +object pointers.

+ +

Using a __bridge_retained or __bridge_transfer +cast purely to convince ARC to emit an unbalanced retain or release, +respectively, is poor form.

+ +
+ +
+ +
+

Restrictions

+ +
+

Conversion of retainable object pointers

+ +

In general, a program which attempts to implicitly or explicitly +convert a value of retainable object pointer type to any +non-retainable type, or vice-versa, is ill-formed. For example, an +Objective-C object pointer shall not be converted to intptr_t +or void*. The bridged +casts may be used to perform these conversions where +necessary.

+ +

Rationale: we cannot ensure the correct +management of the lifetime of objects if they may be freely passed +around as unmanaged types. The bridged casts are provided so that the +programmer may explicitly describe whether the cast transfers control +into or out of ARC.

+
+ +

An unbridged cast to a retainable object pointer type of the return +value of an Objective-C message send which yields a non-retainable +pointer is treated as a __bridge_transfer cast +if:

+ +
    +
  • the method has the cf_returns_retained attribute, or if +not that,
  • +
  • the method does not have the cf_returns_not_retained +attribute and
  • +
  • the method's selector family would imply +the ns_returns_retained attribute on a method which returned +a retainable object pointer type.
  • +
+ +

Otherwise the cast is treated as a __bridge cast.

+ +
+ +
+ +
+

Ownership qualification

+ +

This section describes the behavior of objects of +retainable object pointer type; that is, locations in memory which +store retainable object pointers.

+ +

A type is a retainable object owner type +if it is a retainable object pointer type or an array type whose +element type is a retainable object owner type.

+ +

An ownership qualifier is a type +qualifier which applies only to retainable object owner types. A +program is ill-formed if it attempts to apply an ownership qualifier +to a type which is already ownership-qualified, even if it is the same +qualifier. An array type is ownership-qualified according to its +element type, and adding an ownership qualifier to an array type so +qualifies its element type.

+ +

Except as described under +the inference rules, a program is +ill-formed if it attempts to form a pointer or reference type to a +retainable object owner type which lacks an ownership qualifier.

+ +

Rationale: these rules, together with the +inference rules, ensure that all objects and lvalues of retainable +object pointer type have an ownership qualifier.

+ +

There are four ownership qualifiers:

+ +
    +
  • __autoreleasing
  • +
  • __strong
  • +
  • __unsafe_unretained
  • +
  • __weak
  • +
+ +

A type is nontrivially ownership-qualified +if it is qualified with __autoreleasing, __strong, or +__weak.

+ +
+

Spelling

+ +

The names of the ownership qualifiers are reserved for the +implementation. A program may not assume that they are or are not +implemented with macros, or what those macros expand to.

+ +

An ownership qualifier may be written anywhere that any other type +qualifier may be written.

+ +

If an ownership qualifier appears in +the declaration-specifiers, the following rules apply:

+ +
    +
  • if the type specifier is a retainable object owner type, the +qualifier applies to that type;
  • +
  • if the outermost non-array part of the declarator is a pointer or +block pointer, the qualifier applies to that type;
  • +
  • otherwise the program is ill-formed.
  • +
+ +

If an ownership qualifier appears on the declarator name, or on the +declared object, it is applied to the outermost pointer or block-pointer +type.

+ +

If an ownership qualifier appears anywhere else in a declarator, it +applies to the type there.

+ +
+ +
+

Semantics

+ +

There are five managed operations which +may be performed on an object of retainable object pointer type. Each +qualifier specifies different semantics for each of these operations. +It is still undefined behavior to access an object outside of its +lifetime.

+ +

A load or store with primitive semantics has the same +semantics as the respective operation would have on a void* +lvalue with the same alignment and non-ownership qualification.

+ +

Reading occurs when performing an +lvalue-to-rvalue conversion on an object lvalue. + +

    +
  • For __weak objects, the current pointee is retained and +then released at the end of the current full-expression. This must +execute atomically with respect to assignments and to the final +release of the pointee.
  • +
  • For all other objects, the lvalue is loaded with primitive +semantics.
  • +
+

+ +

Assignment occurs when evaluating +an assignment operator. The semantics vary based on the qualification: +

    +
  • For __strong objects, the new pointee is first retained; +second, the lvalue is loaded with primitive semantics; third, the new +pointee is stored into the lvalue with primitive semantics; and +finally, the old pointee is released. This is not performed +atomically; external synchronization must be used to make this safe in +the face of concurrent loads and stores.
  • +
  • For __weak objects, the lvalue is updated to point to the +new pointee, unless that object is currently undergoing deallocation, +in which case the lvalue is updated to a null pointer. This must +execute atomically with respect to other assignments to the object, to +reads from the object, and to the final release of the new pointed-to +value.
  • +
  • For __unsafe_unretained objects, the new pointee is +stored into the lvalue using primitive semantics.
  • +
  • For __autoreleasing objects, the new pointee is retained, +autoreleased, and stored into the lvalue using primitive semantics.
  • +
+
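The four-step order for __strong assignment can be traced in a plain-C model. It is a sketch only: the operations are written out by hand, no atomicity is modeled, and ToyObject and the toy_* functions are hypothetical names. The __unsafe_unretained case is included for contrast.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyObject {   /* hypothetical */
    int refcount;
    int dealloc_calls;       /* counts how often the count dropped to 0 */
} ToyObject;

static ToyObject *toy_retain(ToyObject *o) { if (o) o->refcount += 1; return o; }
static void toy_release(ToyObject *o) {
    if (o && --o->refcount == 0) o->dealloc_calls += 1;
}

/* Models assignment to a __strong lvalue, in the order given above. */
static void toy_strong_assign(ToyObject **lvalue, ToyObject *new_value) {
    toy_retain(new_value);      /* 1. retain the new pointee         */
    ToyObject *old = *lvalue;   /* 2. primitive load of the old one  */
    *lvalue = new_value;        /* 3. primitive store of the new one */
    toy_release(old);           /* 4. release the old pointee        */
}

/* Models assignment to an __unsafe_unretained lvalue: a primitive store. */
static void toy_unretained_assign(ToyObject **lvalue, ToyObject *new_value) {
    *lvalue = new_value;
}
```

Because the retain comes first, assigning a variable to itself never lets the count touch zero, even when the variable holds the only reference.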

+ +

Initialization occurs when an object's +lifetime begins, which depends on its storage duration. +Initialization proceeds in two stages: +

    +
  1. First, a null pointer is stored into the lvalue using primitive +semantics. This step is skipped if the object +is __unsafe_unretained.
  2. +
  3. Second, if the object has an initializer, that expression is +evaluated and then assigned into the object using the usual assignment +semantics.
  4. +
+

+ +

Destruction occurs when an object's +lifetime ends. In all cases it is semantically equivalent to +assigning a null pointer to the object, with the proviso that of +course the object cannot be legally read after the object's lifetime +ends.

+ +

Moving occurs in specific situations +where an lvalue is moved from, meaning that its current pointee +will be used but the object may be left in a different (but still +valid) state. This arises with __block variables and rvalue +references in C++. For __strong lvalues, moving is equivalent +to loading the lvalue with primitive semantics, writing a null pointer +to it with primitive semantics, and then releasing the result of the +load at the end of the current full-expression. For all other +lvalues, moving is equivalent to reading the object.
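Moving from a __strong lvalue can be modeled in plain C as follows. This is a sketch under the assumption that the caller performs the deferred release itself, standing in for the end of the full-expression; ToyObject and toy_strong_move are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyObject { int refcount; } ToyObject;   /* hypothetical */

static void toy_release(ToyObject *o) { if (o) o->refcount -= 1; }

/* Models moving from a __strong lvalue: primitive load, then a primitive
   store of null. The loaded value still carries the lvalue's +1, which
   the caller must release once the value has been used. */
static ToyObject *toy_strong_move(ToyObject **lvalue) {
    ToyObject *value = *lvalue;
    *lvalue = NULL;
    return value;
}
```

No retain is needed: ownership of the existing reference passes from the variable to the moved-out value.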

+ +
+ +
+

Restrictions

+ +
+

Storage duration of __autoreleasing objects

+ +

A program is ill-formed if it declares an __autoreleasing +object of non-automatic storage duration.

+ +

Rationale: autorelease pools are tied to the +current thread and scope by their nature. While it is possible to +have temporary objects whose instance variables are filled with +autoreleased objects, there is no way that ARC can provide any sort of +safety guarantee there.

+ +

It is undefined behavior if a non-null pointer is assigned to +an __autoreleasing object while an autorelease pool is in +scope and then that object is read after the autorelease pool's scope +is left.

+ +
+ +
+

Conversion of pointers to ownership-qualified types

+ +

A program is ill-formed if an expression of type T* is +converted, explicitly or implicitly, to the type U*, +where T and U have different ownership +qualification, unless: +

    +
  • T is qualified with __strong, + __autoreleasing, or __unsafe_unretained, and + U is qualified with both const and + __unsafe_unretained; or
  • +
  • either T or U is cv void, where +cv is an optional sequence of non-ownership qualifiers; or
  • +
  • the conversion is requested with a reinterpret_cast in + Objective-C++; or
  • +
  • the conversion is a +well-formed pass-by-writeback.
  • +
+

+ +

The analogous rule applies to T& and U& in +Objective-C++.

+ +

Rationale: these rules provide a reasonable +level of type-safety for indirect pointers, as long as the underlying +memory is not deallocated. The conversion to const +__unsafe_unretained is permitted because the semantics of reads +are equivalent across all these ownership semantics, and that's a very +useful and common pattern. The interconversion with void* is +useful for allocating memory or otherwise escaping the type system, +but use it carefully. reinterpret_cast is considered to be +an obvious enough sign of taking responsibility for any +problems.

+ +

It is undefined behavior to access an ownership-qualified object +through an lvalue of a differently-qualified type, except that any +non-__weak object may be read through +an __unsafe_unretained lvalue.

+ +

It is undefined behavior if a managed operation is performed on +a __strong or __weak object without a guarantee that +it contains a primitive zero bit-pattern, or if the storage for such +an object is freed or reused without the object being first assigned a +null pointer.

+ +

Rationale: ARC cannot differentiate between +an assignment operator which is intended to initialize dynamic +memory and one which is intended to potentially replace a value. +Therefore the object's pointer must be valid before letting ARC at it. +Similarly, C and Objective-C do not provide any language hooks for +destroying objects held in dynamic memory, so it is the programmer's +responsibility to avoid leaks (__strong objects) and +consistency errors (__weak objects).
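A plain-C model of the requirement: storage for strong slots is zero-filled before the first managed assignment and nulled out before it is freed. This is a sketch, not ARC itself; ToyObject and the toy_* functions are hypothetical names, and calloc is used because it yields the all-zero bit pattern mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct ToyObject { int refcount; } ToyObject;   /* hypothetical */

static ToyObject *toy_retain(ToyObject *o) { if (o) o->refcount += 1; return o; }
static void toy_release(ToyObject *o) { if (o) o->refcount -= 1; }

/* Models assignment to a strong slot: retain new, swap, release old. */
static void toy_strong_assign(ToyObject **slot, ToyObject *new_value) {
    toy_retain(new_value);
    ToyObject *old = *slot;
    *slot = new_value;
    toy_release(old);   /* garbage here would be released as if it were valid */
}

/* Allocate n strong slots, zero-filled so that the first assignment
   releases a null pointer, which is a no-op. */
static ToyObject **toy_alloc_strong_slots(size_t n) {
    return calloc(n, sizeof(ToyObject *));
}

/* Assign null to every slot (releasing what it holds) before freeing. */
static void toy_free_strong_slots(ToyObject **slots, size_t n) {
    for (size_t i = 0; i < n; ++i) toy_strong_assign(&slots[i], NULL);
    free(slots);
}
```

Using malloc in place of calloc would leave indeterminate bits in each slot, so the first assignment would try to release whatever happened to be there.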

+ +

These requirements are followed automatically in Objective-C++ when +creating objects of retainable object owner type with new +or new[] and destroying them with delete, +delete[], or a pseudo-destructor expression. Note that +arrays of nontrivially-ownership-qualified type are not ABI compatible +with non-ARC code because the element type is non-POD: such arrays +that are new[]'d in ARC translation units cannot +be delete[]'d in non-ARC translation units and +vice-versa.

+ +
+ +
+

Passing to an out parameter by writeback

+ +

If the argument passed to a parameter of type +T __autoreleasing * has type U oq *, +where oq is an ownership qualifier, then the argument is a +candidate for pass-by-writeback if:

+ +
    +
  • oq is __strong or __weak, and +
  • it would be legal to initialize a T __strong * with +a U __strong *.
  • +
+ +

For purposes of overload resolution, an implicit conversion +sequence requiring a pass-by-writeback is always worse than an +implicit conversion sequence not requiring a pass-by-writeback.

+ +

The pass-by-writeback is ill-formed if the argument expression does +not have a legal form:

+ +
    +
  • &var, where var is a scalar variable of +automatic storage duration with retainable object pointer type
  • +
  • a conditional expression where the second and third operands are +both legal forms
  • +
  • a cast whose operand is a legal form
  • +
  • a null pointer constant
  • +
+ +

Rationale: the restriction in the form of +the argument serves two purposes. First, it makes it impossible to +pass the address of an array to the argument, which serves to protect +against an otherwise serious risk of mis-inferring an array +argument as an out-parameter. Second, it makes it much less likely +that the user will see confusing aliasing problems due to the +implementation, below, where their store to the writeback temporary is +not immediately seen in the original argument variable.

+ +

A pass-by-writeback is evaluated as follows: +

    +
  1. The argument is evaluated to yield a pointer p of + type U oq *.
  2. If p is a null pointer, then a null pointer is passed as + the argument, and no further work is required for the pass-by-writeback.
  3. Otherwise, a temporary of type T __autoreleasing is + created and initialized to a null pointer.
  4. If the argument is not an Objective-C method parameter marked + out, then *p is read, and the result is written + into the temporary with primitive semantics.
  5. The address of the temporary is passed as the argument to the + actual call.
  6. After the call completes, the temporary is loaded with primitive + semantics, and that value is assigned into *p.

+ +
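The evaluation steps above can be sketched in plain C. This is only a model under stated assumptions: object_ref stands in for a retainable object pointer, the two callees are hypothetical, and all ownership operations (retain, release, autorelease) are left out, so the final store into *p appears as an ordinary assignment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a retainable object pointer type. */
typedef int *object_ref;

/* The "object" that produce_object hands back through its out-parameter. */
int fresh_object;

/* Hypothetical callee that stores through its out-parameter. */
void produce_object(object_ref *out) {
    if (out != NULL) {
        *out = &fresh_object;
    }
}

/* Hypothetical callee that never touches its out-parameter. */
void produce_nothing(object_ref *out) {
    (void)out;
}

/* Models passing p (for example &var) to callee by writeback. */
void call_with_writeback(void (*callee)(object_ref *), object_ref *p) {
    assert(callee != NULL);
    if (p == NULL) {
        /* Null argument: pass null through, no writeback needed. */
        callee(NULL);
        return;
    }
    /* Create the temporary, initialized to a null pointer. */
    object_ref temporary = NULL;
    /* Parameter assumed not to be marked out: copy the current *p in. */
    temporary = *p;
    /* The callee sees the address of the temporary, not of *p. */
    callee(&temporary);
    /* After the call: load the temporary and assign it into *p. */
    *p = temporary;
}
```

Because the callee writes to the temporary, the original variable changes only after the call returns. This delayed store is the aliasing effect that the restriction on argument forms is meant to keep out of sight.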

Rationale: this is all admittedly +convoluted. In an ideal world, we would see that a local variable is +being passed to an out-parameter and retroactively modify its type to +be __autoreleasing rather than __strong. This would +be remarkably difficult and not always well-founded under the C type +system. However, it was judged unacceptably invasive to require +programmers to write __autoreleasing on all the variables +they intend to use for out-parameters. This was the least bad +solution.

+ +
+ +
+

Ownership-qualified fields of structs and unions

+ +

A program is ill-formed if it declares a member of a C struct or +union to have a nontrivially ownership-qualified type.

+ +

Rationale: the resulting type would be +non-POD in the C++ sense, but C does not give us very good language +tools for managing the lifetime of aggregates, so it is more +convenient to simply forbid them. It is still possible to manage this +with a void* or an __unsafe_unretained +object.

+ +

This restriction does not apply in Objective-C++. However, +nontrivially ownership-qualified types are considered non-POD: in C++0x +terms, they are not trivially default constructible, copy +constructible, move constructible, copy assignable, move assignable, +or destructible. It is a violation of the C++ One Definition Rule to use +a class outside of ARC that, under ARC, would have an +ownership-qualified member.

+ +

Rationale: unlike in C, we can express all +the necessary ARC semantics for ownership-qualified subobjects as +suboperations of the (default) special member functions for the class. +These functions then become non-trivial. This has the non-obvious +repercussion that the class will have a non-trivial copy constructor +and non-trivial destructor; if it wouldn't outside of ARC, this means +that objects of the type will be passed and returned in an +ABI-incompatible manner.

+ +
+ +
+ +
+

Ownership inference

+ +
+

Objects

+ +

If an object is declared with retainable object owner type, but +without an explicit ownership qualifier, its type is implicitly +adjusted to have __strong qualification.

+ +

As a special case, if the object's base type is Class +(possibly protocol-qualified), the type is adjusted to +have __unsafe_unretained qualification instead.

+ +
+ +
+

Indirect parameters

+ +

If a function or method parameter has type T*, where +T is an ownership-unqualified retainable object pointer type, +then:

+ +
    +
  • if T is const-qualified or Class, then +it is implicitly qualified with __unsafe_unretained;
  • +
  • otherwise, it is implicitly qualified +with __autoreleasing.
  • +
+

+ +

Rationale: __autoreleasing exists +mostly for this case, the Cocoa convention for out-parameters. Since +a pointer to const is obviously not an out-parameter, we +instead use a type more useful for passing arrays. If the user +instead intends to pass in a mutable array, inferring +__autoreleasing is the wrong thing to do; this directs some +of the caution in the following rules about writeback.

+ +

Such a type written anywhere else would be ill-formed by the +general rule requiring ownership qualifiers.

+ +

This rule does not apply in Objective-C++ if a parameter's type is +dependent in a template pattern and is only instantiated to +a type which would be a pointer to an unqualified retainable object +pointer type. Such code is still ill-formed.

+ +

Rationale: the convention is very unlikely +to be intentional in template code.

+ +
+
+
+ +
+

Method families

+ +

An Objective-C method may fall into a method +family, which is a conventional set of behaviors ascribed to it +by the Cocoa conventions.

+ +

A method is in a certain method family if: +

    +
  • it has an objc_method_family attribute placing it in that + family; or if not that,
  • +
  • it does not have an objc_method_family attribute placing + it in a different or no family, and
  • +
  • its selector falls into the corresponding selector family, and
  • +
  • its signature obeys the added restrictions of the method family.
  • +

+ +

A selector is in a certain selector family if, ignoring any leading +underscores, the first component of the selector either consists +entirely of the name of the method family or it begins with that name +followed by a character other than a lowercase letter. For +example, _perform:with: and performWith: would fall +into the perform family (if we recognized one), +but performing:with: would not.
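The selector rule can be expressed as a small C predicate. This is a minimal sketch, not the compiler's implementation: the function name selector_in_family is hypothetical, and "lowercase letter" is assumed to mean what islower reports in the default C locale.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if selector falls into the selector family named family. */
bool selector_in_family(const char *selector, const char *family) {
    /* Ignore any leading underscores. */
    while (*selector == '_') {
        selector++;
    }

    /* The first component must begin with the family name. */
    size_t family_length = strlen(family);
    if (strncmp(selector, family, family_length) != 0) {
        return false;
    }

    /* Either the first component ends here (end of selector or ':'),
       or the name is followed by something other than a lowercase letter. */
    unsigned char next = (unsigned char)selector[family_length];
    return next == '\0' || next == ':' || !islower(next);
}
```

Under this sketch, initWithFrame: matches the init family and initialize does not, since the character following init is a lowercase letter.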

+ +

The families and their added restrictions are:

+ +
    +
  • alloc methods must return a retainable object pointer type.
  • +
  • copy methods must return a retainable object pointer type.
  • +
  • mutableCopy methods must return a retainable object pointer type.
  • +
  • new methods must return a retainable object pointer type.
  • +
  • init methods must be instance methods and must return an +Objective-C pointer type. Additionally, a program is ill-formed if it +declares or contains a call to an init method whose return +type is neither id nor a pointer to a super-class or +sub-class of either the declaring class, if the method was declared on +a class, or the static receiver type of the call, if it was declared +on a protocol.

    + +

    Rationale: there are a fair number of existing +methods with init-like selectors which nonetheless don't +follow the init conventions. Typically these are either +accidental naming collisions or helper methods called during +initialization. Because of the peculiar retain/release behavior +of init methods, it's very important not to treat these +methods as init methods if they aren't meant to be. It was +felt that implicitly defining these methods out of the family based on +the exact relationship between the return type and the declaring class +would be much too subtle and fragile. Therefore we identify a small +number of legitimate-seeming return types and call everything else an +error. This serves the secondary purpose of encouraging programmers +not to accidentally give methods names in the init family.

    +
  • +
+ +

A program is ill-formed if a method's declarations, +implementations, and overrides do not all have the same method +family.

+ +
+

Explicit method family control

+ +

A method may be annotated with the objc_method_family +attribute to precisely control which method family it belongs to. If +a method in an @implementation does not have this attribute, +but there is a method declared in the corresponding @interface +that does, then the attribute is copied to the declaration in the +@implementation. The attribute is available outside of ARC, +and may be tested for with the preprocessor query +__has_attribute(objc_method_family).

+ +

The attribute is spelled +__attribute__((objc_method_family(family))). +If family is none, the method has no family, even if +it would otherwise be considered to have one based on its selector and +type. Otherwise, family must be one +of alloc, copy, init, +mutableCopy, or new, in which case the method is +considered to belong to the corresponding family regardless of its +selector. It is an error if a method that is explicitly added to a +family in this way does not meet the requirements of the family other +than the selector naming convention.

+ +

Rationale: the rules codified in this document +describe the standard conventions of Objective-C. However, as these +conventions have not heretofore been enforced by an unforgiving +mechanical system, they are only imperfectly kept, especially as they +haven't always even been precisely defined. While it is possible to +define low-level ownership semantics with attributes like +ns_returns_retained, this attribute allows the user to +communicate semantic intent, which is of use both to ARC (which, e.g., +treats calls to init specially) and the static analyzer.

+
+ +
+

Semantics of method families

+ +

A method's membership in a method family may imply non-standard +semantics for its parameters and return type.

+ +

Methods in the alloc, copy, mutableCopy, +and new families — that is, methods in all the +currently-defined families except init — transfer +ownership of a +1 retain count on their return value to the calling +function, as if they were implicitly annotated with +the ns_returns_retained attribute. However, this is not true +if the method has either of the ns_returns_autoreleased or +ns_returns_not_retained attributes.

+ +
+

Semantics of init

+ +

Methods in the init family must be transferred ownership +of a +1 retain count on their self parameter, exactly as if +the method had the ns_consumes_self attribute, and must +transfer ownership of a +1 retain count on their return value, exactly +as if the method had the ns_returns_retained attribute. +Neither of these may be altered through attributes.

+ +

A call to an init method with a receiver that is either +self (possibly parenthesized or cast) or super is +called a delegate init call. It is an error +for a delegate init call to be made except from an init +method, and excluding blocks within such methods.

+ +

The variable self is mutable in an init method +and is implicitly qualified as __strong. However, a program +is ill-formed, no diagnostic required, if it alters self +except to assign it the immediate result of a delegate init call. It +is an error to use the previous value of self after the +completion of a delegate init call.

+ +

A program is ill-formed, no diagnostic required, if it causes two +or more calls to init methods on the same object, except that +each init method invocation may perform at most one +delegate init call.

+ +
+ +
+

Related result types

+ +

Certain methods are candidates to have related +result types:

+
    +
  • class methods in the alloc and new method families
  • +
  • instance methods in the init family
  • +
  • the instance method self
  • +
  • outside of ARC, the instance methods retain and autorelease
  • +
+ +

If the formal result type of such a method is id or +protocol-qualified id, or a type equal to the declaring class +or a superclass, then it is said to have a related result type. In +this case, when invoked in an explicit message send, it is assumed to +return a type related to the type of the receiver:

+ +
    +
  • if it is a class method, and the receiver is a class +name T, the message send expression has type T*; +otherwise
  • +
  • if it is an instance method, and the receiver has type T, +the message send expression has type T; otherwise
  • +
  • the message send expression has the normal result type of the +method.
  • +
+ +

This is a new rule of the Objective-C language and applies outside +of ARC.

+ +

Rationale: ARC's automatic code emission is +more prone than most code to signature errors, i.e. errors where a +call was emitted against one method signature, but the implementing +method has an incompatible signature. Having more precise type +information helps drastically lower this risk, as well as catching +a number of latent bugs.

+ +
+
+
+ +
+

Optimization

+ +

ARC applies aggressive rules for the optimization of local +behavior. These rules are based around a core assumption of +local balancing: that other code will +perform retains and releases as necessary (and only as necessary) for +its own safety, and so the optimizer does not need to consider global +properties of the retain and release sequence. For example, if a +retain and release immediately bracket a call, the optimizer can +delete the retain and release on the assumption that the called +function will not do a constant number of unmotivated releases +followed by a constant number of balancing retains, such that +the local retain/release pair is the only thing preventing the called +function from ending up with a dangling reference.

+ +

The optimizer assumes that when a new value enters local control, +e.g. from a load of a non-local object or as the result of a function +call, it is instantaneously valid. Subsequently, a retain and release of +a value are necessary on a computation path only if there is a use of +that value before the release and after any operation which might +cause a release of the value (including indirectly or non-locally), +and only if the value is not demonstrably already retained.

+ +

The complete optimization rules are quite complicated, but it would +still be useful to document them here.

+ +
+ +
+

Miscellaneous

+ +
+

@autoreleasepool

+ +

To simplify the use of autorelease pools, and to bring them under +the control of the compiler, a new kind of statement is available in +Objective-C. It is written @autoreleasepool followed by +a compound-statement, i.e. by a new scope delimited by curly +braces. Upon entry to this block, the current state of the +autorelease pool is captured. When the block is exited normally, +whether by fallthrough or directed control flow (such +as return or break), the autorelease pool is +restored to the saved state, releasing all the objects in it. When +the block is exited with an exception, the pool is not drained.
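The capture-and-restore behavior described above can be modeled with a toy pool in C. This is a sketch under loud assumptions: toy_object, pool_enter, pool_autorelease, and pool_exit are hypothetical names, the pool is a fixed-size array, and a "release" is just a decrement of a counter. It illustrates only the scoping discipline, not the real runtime.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_CAPACITY 64

/* Toy object carrying a manual reference count. */
typedef struct {
    int retain_count;
} toy_object;

/* The toy autorelease pool: a stack of objects awaiting release. */
static toy_object *pool[POOL_CAPACITY];
static size_t pool_size = 0;

/* Entry to the scope: capture the current state of the pool. */
size_t pool_enter(void) {
    return pool_size;
}

/* Defer a release of object until the enclosing scope exits. */
toy_object *pool_autorelease(toy_object *object) {
    assert(pool_size < POOL_CAPACITY);
    pool[pool_size++] = object;
    return object;
}

/* Normal exit: restore the saved state, releasing every object
   added to the pool since the matching pool_enter. */
void pool_exit(size_t saved_state) {
    assert(saved_state <= pool_size);
    while (pool_size > saved_state) {
        pool[--pool_size]->retain_count--;
    }
}
```

Nested scopes then behave as expected: exiting an inner scope releases only the objects autoreleased inside it, and objects added by the outer scope stay pending until the outer scope exits.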

+ +

A program is ill-formed if it refers to the +NSAutoreleasePool class.

+ +

Rationale: autorelease pools are clearly +important for the compiler to reason about, but it is far too much to +expect the compiler to accurately reason about control dependencies +between two calls. It is also very easy to accidentally forget to +drain an autorelease pool when using the manual API, and this can +significantly inflate the process's high-water-mark. The introduction +of a new scope is unfortunate but basically required for sane +interaction with the rest of the language. Not draining the pool +during an unwind is apparently required by the Objective-C exceptions +implementation.

+ +
+ +
+

self

+ +

The self parameter variable of an Objective-C method is +never actually retained by the implementation. It is undefined +behavior, or at least dangerous, to cause an object to be deallocated +during a message send to that object. To make this +safe, self is implicitly const unless the method is +in the init family.

+ +

Rationale: the cost of +retaining self in all methods was found to be prohibitive, as +it tends to be live across calls, preventing the optimizer from +proving that the retain and release are unnecessary — for good +reason, as it's quite possible in theory to cause an object to be +deallocated during its execution without this retain and release. +Since it's extremely uncommon to actually do so, even unintentionally, +and since there's no natural way for the programmer to remove this +retain/release pair otherwise (as there is for other parameters by, +say, making the variable __unsafe_unretained), we chose to +make this optimizing assumption and shift some amount of risk to the +user.

+ +
+ +
+

Fast enumeration iteration variables

+ +

If a variable is declared in the condition of an Objective-C fast +enumeration loop, and the variable has no explicit ownership +qualifier, then it is qualified with const __strong and +objects encountered during the enumeration are not actually +retained.

+ +

Rationale: this is an optimization made +possible because fast enumeration loops promise to keep the objects +retained during enumeration, and the collection itself cannot be +synchronously modified. It can be overridden by explicitly qualifying +the variable with __strong, which will make the variable +mutable again and cause the loop to retain the objects it +encounters.

+ +
+ +
+

Blocks

+ +

The implicit const capture variables created when +evaluating a block literal expression have the same ownership +semantics as the local variables they capture. The capture is +performed by reading from the captured variable and initializing the +capture variable with that value; the capture variable is destroyed +when the block literal is, i.e. at the end of the enclosing scope.

+ +

The inference rules apply +equally to __block variables, which is a shift in semantics +from non-ARC, where __block variables did not implicitly +retain during capture.

+ +

__block variables of retainable object owner type are +moved off the stack by initializing the heap copy with the result of +moving from the stack copy.

+ +

With the exception of retains done as part of initializing +a __strong parameter variable or reading a __weak +variable, whenever these semantics call for retaining a value of +block-pointer type, it has the effect of a Block_copy. The +optimizer may remove such copies when it sees that the result is +used only as an argument to a call.

+ +
+ +
+

Exceptions

+ +

By default in Objective-C, ARC is not exception-safe for normal +releases: +

    +
  • It does not end the lifetime of __strong variables when +their scopes are abnormally terminated by an exception.
  • +
  • It does not perform releases which would occur at the end of +a full-expression if that full-expression throws an exception.
  • +
+ +

A program may be compiled with the option +-fobjc-arc-exceptions in order to enable these, or with the +option -fno-objc-arc-exceptions to explicitly disable them, +with the last such argument winning.

+ +

Rationale: the standard Cocoa convention is +that exceptions signal programmer error and are not intended to be +recovered from. Making code exception-safe by default would impose +severe runtime and code size penalties on code that typically does not +actually care about exception safety. Therefore, ARC-generated code +leaks by default on exceptions, which is just fine if the process is +going to be immediately terminated anyway. Programs which do care +about recovering from exceptions should enable the option.

+ +

In Objective-C++, -fobjc-arc-exceptions is enabled by +default.

+ +

Rationale: C++ already introduces pervasive +exceptions-cleanup code of the sort that ARC introduces. C++ +programmers who have not already disabled exceptions are much more +likely to actually require exception-safety.

+ +

ARC does end the lifetimes of __weak objects when an +exception terminates their scope unless exceptions are disabled in the +compiler.

+ +

Rationale: the consequence of a +local __weak object not being destroyed is very likely to be +corruption of the Objective-C runtime, so we want to be safer here. +Of course, potentially massive leaks are about as likely to take down +the process as this corruption is if the program does try to recover +from exceptions.

+ +
+ +
+
+ +