Original article: http://www.mikeash.com/pyblog/friday-qa-2011-09-30-automatic-reference-counting.html
By Mike Ash
Concept
The Clang static analyzer is a very useful tool for finding memory management errors in code. Looking at its output, I often think: "If you can spot the error, why can't you just fix it for me?"
That, in essence, is what ARC does. The memory management rules are built into the compiler, but instead of using them only to help the programmer find mistakes, the compiler inserts the necessary calls itself.
ARC sits between garbage collection (GC) and manual memory management. Like garbage collection, ARC frees the programmer from writing retain/release/autorelease calls. Unlike garbage collection, however, ARC does not handle retain cycles. Under ARC, two objects that hold strong references to each other are never destroyed, even if nothing else refers to them.
So although ARC takes care of most memory management problems, the programmer must still avoid retain cycles or break them manually. There is another important difference between ARC and Apple's garbage collection: ARC is not all-or-nothing. With Apple's garbage collection, either the whole program uses it or none of it does. That is, all Objective-C code in the app, including every Apple framework and every third-party library, must support garbage collection before the app can use it. ARC code and non-ARC code, by contrast, can coexist in the same app. This makes it possible to migrate a project piece by piece, without the compatibility and stability problems that garbage collection ran into when it was first introduced.
Xcode
ARC is available in Xcode 4.2, which is currently in beta, and only when compiling with Clang (that is, "Apple LLVM compiler"). The build setting is named "Objective-C Automatic Reference Counting": set it to YES to enable ARC and NO to disable it.
If you enable this setting on existing code, you will get a large number of errors. ARC not only manages memory for you, it also forbids you from doing so yourself. Sending retain, release, or autorelease manually is illegal under ARC, and since non-ARC code is full of such calls, the errors pile up.
Fortunately, Xcode provides a tool to convert existing code. Select "Edit -> Refactor... -> Convert to Objective-C ARC..." and Xcode will guide you through the conversion. It sometimes needs help deciding what to do, but most of the work is automatic.
Basic functions
The Cocoa memory management rules are simple:
- If you alloc, new, copy, or retain an object, you must balance that with a release or autorelease.
- If you obtain an object by any other means and need it to stay alive long-term, you must retain or copy it. Of course, this must be balanced with a release or autorelease later.
These rules are well suited to automation. Suppose you write the following code:
Foo *foo = [[Foo alloc] init];
[foo something];
return;
The compiler can see that the alloc has no matching release, so it transforms the code into this:
Foo *foo = [[Foo alloc] init];
[foo something];
[foo release];
return;
In reality, the compiler does not insert a release message send. It inserts a call to a runtime function instead:
Foo *foo = [[Foo alloc] init];
[foo something];
objc_release(foo);
return;
This allows for an optimization. In the common case where -release is not overridden, objc_release can bypass Objective-C message dispatch entirely, which is faster.
ARC also makes code safer. Most Cocoa programmers take "long-term" in the second rule to mean storing objects in instance variables or similar places, so local temporary objects are normally not retained and released:
Foo *foo = [self foo];
[foo bar];
[foo baz];
[foo quux];
However, this can be dangerous:
Foo *foo = [self foo];
[foo bar];
[foo baz];
[self setFoo: newFoo];
[foo quux]; // crash
The usual workaround is to have the -foo getter retain and autorelease the value before returning it. If the getter is called often, however, this builds up a lot of temporary objects and can lead to excessive memory usage. ARC takes the cautious route and inserts extra calls, roughly like this:
Foo *foo = objc_retainAutoreleasedReturnValue([self foo]);
[foo bar];
[foo baz];
[self setFoo: newFoo];
[foo quux]; // fine
objc_release(foo);
Similarly, if you write a plain getter, ARC makes it safe as well:
- (Foo *)foo
{
    return objc_retainAutoreleaseReturnValue(_foo);
}
But wait, this does not solve the problem of excessive temporary objects! The getter still performs a retain/autorelease, and the calling code now performs a retain/release on top of that. This looks even less efficient.
Not to worry. As mentioned above, ARC emits these special runtime calls rather than plain message sends so that they can be optimized. Besides making retain and release faster, the calls can eliminate some operations altogether when they are used as a pair.
When objc_retainAutoreleaseReturnValue runs, it looks at the stack and grabs the return address of its caller, which lets it see exactly what will happen after it returns. With compiler optimizations enabled, the call to objc_retainAutoreleaseReturnValue is subject to tail-call optimization [1], so the return address points at the call to objc_retainAutoreleasedReturnValue.
By examining the return address, the runtime can tell that it is about to do redundant work. It skips the autorelease and sets a flag that tells the caller to skip its retain. The whole sequence ends up performing a single retain in the getter and a single release in the calling code, which is both safe and efficient.
Note that this optimization is fully compatible with non-ARC code. If the getter does not use ARC, the flag is never set and the caller performs a full retain/release pair. If the getter uses ARC but the caller does not, the getter sees that it is not returning to code that immediately calls the special runtime function, and performs a full retain/autorelease pair. A little efficiency is lost, but the code remains correct.
In addition, ARC automatically creates or fills out a -dealloc method for every class to release its instance variables. You can still write -dealloc by hand, and classes that manage external resources need to, but it is no longer necessary (or possible) to release instance variables manually. ARC even adds the [super dealloc] call at the end, so you leave that out too. Previously, you might have written:
- (void)dealloc
{
    [ivar1 release];
    [ivar2 release];
    free(buffer);
    [super dealloc];
}
Now you only need to write:
- (void)dealloc
{
    free(buffer);
}
If your -dealloc method does nothing but release instance variables, you can remove it entirely.
Retain cycles and weak references
ARC still requires the programmer to resolve retain cycles, and the best way to do that is usually to use weak references.
ARC provides zeroing weak references. These are weak references that not only do not keep the referenced object alive, but also automatically become nil when that object is destroyed. Zeroing weak references avoid dangling pointers and the crashes and mysterious behavior that come with them.
To declare a zeroing weak variable, prefix its declaration with __weak. For example, as an instance variable:
@interface Foo : NSObject
{
    __weak Bar *_weakBar;
}
@end
Local variables work the same way:
__weak Foo *_weakFoo = [object foo];
You can use these like any other variable, and the value automatically becomes nil when appropriate:
[_weakBar doSomethingIfStillAlive];
Beware, however, that a __weak variable can become nil at almost any moment. Memory management is inherently multithreaded: the weakly referenced object can be destroyed on one thread while another thread is accessing it. Code like this is therefore not valid:
if(_weakBar)
    [self mustNotBeNil: _weakBar];
Instead, store the object in a local strong reference and test that:
Bar *bar = _weakBar;
if(bar)
    [self mustNotBeNil: bar];
Because bar is a strong reference, the object is guaranteed to stay alive (and the variable to stay non-nil) throughout this code.
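To make the cycle problem concrete, here is a minimal sketch rather than code from the original article. The Node class, its strongPeer and weakPeer properties, and the DeallocCountAfterLinking helper are hypothetical names chosen for illustration, and the file is assumed to be built with ARC enabled (clang -fobjc-arc). It links two objects to each other, once with a strong back-reference and once with a weak one, and counts how many of them are destroyed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical names throughout; build with ARC enabled (clang -fobjc-arc).
static NSInteger gNodeDeallocCount = 0;

@interface Node : NSObject
@property (nonatomic, strong) Node *strongPeer; // owning reference
@property (nonatomic, weak) Node *weakPeer;     // zeroing weak reference
@end

@implementation Node
@synthesize strongPeer = _strongPeer, weakPeer = _weakPeer;

- (void)dealloc
{
    // Only bookkeeping here: ARC releases the ivars and calls super for us.
    gNodeDeallocCount++;
}
@end

// Links two nodes to each other and reports how many were destroyed
// once the local variables went out of scope.
static NSInteger DeallocCountAfterLinking(BOOL useWeakBackReference)
{
    gNodeDeallocCount = 0;
    @autoreleasepool {
        Node *parent = [[Node alloc] init];
        Node *child = [[Node alloc] init];
        parent.strongPeer = child;
        if (useWeakBackReference)
            child.weakPeer = parent;   // no cycle: child does not own parent
        else
            child.strongPeer = parent; // cycle: each keeps the other alive
    }
    return gNodeDeallocCount;
}
```

Under these assumptions, the strong/strong arrangement should leave both nodes alive (a leak), while the strong/weak arrangement lets both be destroyed as soon as the locals go away.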
ARC's implementation of zeroing weak references requires close coordination between the Objective-C reference counting system and the zeroing weak reference system. This means that a class which overrides retain and release cannot be the target of a zeroing weak reference. This is uncommon, but some Cocoa classes, such as NSWindow, have this limitation. Fortunately, if you run into it you will know at once, because your program crashes immediately with a message like this:
objc[2478]: cannot form weak reference to instance (0x10360f000) of class NSWindow
If you really need a weak reference to such a class, you can use __unsafe_unretained in place of __weak. This creates a weak reference that does not zero. You must make sure you never use the pointer after the object it points to has been destroyed (preferably by setting it to nil manually). Be careful: non-zeroing weak references are playing with fire.
You can build ARC applications that run on Mac OS X 10.6 and iOS 4, but zeroing weak references are not available on those systems, so all weak references there must be __unsafe_unretained. In my opinion, non-zeroing weak references are dangerous enough that this noticeably reduces the attractiveness of ARC on those systems.
Properties
Properties are closely tied to memory management, so it makes sense that ARC introduces some new behavior for @property.
ARC introduces several new ownership modifiers. Declaring a property as strong makes it a strong reference. Declaring it as weak makes it a zeroing weak reference. The unsafe_unretained modifier gives a non-zeroing weak reference. When @synthesize is used, the compiler creates an instance variable with the same storage type.
The existing modifiers assign, copy, and retain are still valid and work as they did before.
Note that assign creates a non-zeroing weak reference, so avoid it for object pointers where you can.
Apart from the new modifiers, properties work the same as before.
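The modifiers described above might be declared as in the following sketch. The Document class, its property names, and the DelegateIsZeroedAfterRelease helper are hypothetical and exist only to illustrate the syntax; ARC is assumed to be enabled (clang -fobjc-arc).

```objc
#import <Foundation/Foundation.h>

// Hypothetical class; build with ARC enabled (clang -fobjc-arc).
@interface Document : NSObject
@property (nonatomic, strong) NSString *title;            // strong reference
@property (nonatomic, copy) NSString *summary;            // existing modifier, unchanged
@property (nonatomic, weak) id delegate;                  // zeroing weak reference
@property (nonatomic, unsafe_unretained) id legacyOwner;  // non-zeroing weak reference
@end

@implementation Document
// Each synthesized ivar gets the same ownership as its property.
@synthesize title = _title, summary = _summary;
@synthesize delegate = _delegate, legacyOwner = _legacyOwner;
@end

// Shows the zeroing behavior of a weak property: once the only strong
// reference to the delegate is gone, the property reads back as nil.
static BOOL DelegateIsZeroedAfterRelease(void)
{
    Document *document = [[Document alloc] init];
    @autoreleasepool {
        NSObject *delegate = [[NSObject alloc] init];
        document.delegate = delegate;
    }
    return document.delegate == nil;
}
```

Reading legacyOwner after its target had been destroyed would instead yield a dangling pointer, which is why the text advises caution with unsafe_unretained and assign.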
Blocks
Blocks are Objective-C objects, so ARC manages them too. Blocks have special memory management requirements, and ARC treats them specially. Block literals must be copied rather than retained, which in practice means it is best to copy a block everywhere you would otherwise retain it. ARC follows this practice.
In addition, ARC knows that a block literal must be copied if it is used after the current scope returns, for example as a return value. Non-ARC code has to copy and autorelease a returned block explicitly:
return [[^{ DoSomethingMagical(); } copy] autorelease];
With ARC, this becomes simply:
return ^{ DoSomethingMagical(); };
Beware, however: ARC currently does not automatically copy a block literal that is converted to id. So this code is fine:
dispatch_block_t function(void)
{
    return ^{ DoSomethingMagical(); };
}
But this code is not:
id function(void)
{
    return ^{ DoSomethingMagical(); };
}
The workaround is simply to copy the block, but it is something to watch out for:
return [^{ DoSomethingMagical(); } copy];
Likewise, if you pass a block as an id parameter, you need to copy it explicitly:
[myArray addObject: [^{ DoSomethingMagical(); } copy]];
Fortunately, this appears to be simply a bug and may be fixed soon.
If you are unsure, adding an extra copy is harmless.
Another significant change under ARC concerns __block variables. The __block qualifier allows a block to modify a captured variable:
id x;
__block id y;
void (^block)(void) = ^{
    x = [NSString string]; // error
    y = [NSString string]; // works
};
Without ARC, __block has a side effect: its contents are not retained when the variable is captured by a block. A block automatically retains and releases any object pointers it captures, but __block pointers are an exception and behave like weak pointers. Relying on this to avoid retain cycles has become a common pattern.
Under ARC, __block retains its contents just like any other captured object pointer, so using __block no longer avoids retain cycles. Use __weak instead, as described above.
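The difference can be sketched as follows. This is an illustration under assumptions, not code from the original article: the Task class, its handler and runCount properties, and the TaskDeallocCount helper are hypothetical, and ARC is assumed to be enabled (clang -fobjc-arc). An object stores a block that refers back to it, once through a __weak variable and once through a __block variable.

```objc
#import <Foundation/Foundation.h>

// Hypothetical names throughout; build with ARC enabled (clang -fobjc-arc).
static NSInteger gTaskDeallocCount = 0;

@interface Task : NSObject
@property (nonatomic, copy) void (^handler)(void);
@property (nonatomic) NSInteger runCount;
@end

@implementation Task
@synthesize handler = _handler, runCount = _runCount;

- (void)dealloc
{
    gTaskDeallocCount++;
}
@end

// Creates a task whose handler refers back to the task itself, runs the
// handler once, and reports how many tasks were destroyed afterwards.
static NSInteger TaskDeallocCount(BOOL captureWeakly)
{
    gTaskDeallocCount = 0;
    @autoreleasepool {
        Task *task = [[Task alloc] init];
        if (captureWeakly) {
            __weak Task *weakTask = task;
            task.handler = ^{
                // Promote to a strong local before use, as advised for weak reads.
                Task *strongTask = weakTask;
                if (strongTask)
                    strongTask.runCount += 1;
            };
        } else {
            // Under ARC a __block variable retains its contents, so the
            // task owns the block and the block owns the task: a cycle.
            __block Task *blockTask = task;
            task.handler = ^{
                blockTask.runCount += 1;
            };
        }
        task.handler();
    }
    return gTaskDeallocCount;
}
```

With the __weak capture the task should be destroyed when the local variable goes out of scope; with the __block capture it should stay alive, which is the leak the text warns about.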
Toll-free bridging
ARC only works on Objective-C types. CoreFoundation types must still be managed manually by the programmer. Because ownership would otherwise be ambiguous, ARC disallows ordinary casts between Objective-C object pointers and other pointer types, including pointers to CoreFoundation objects. The following code is common under manual memory management, but does not compile under ARC:
id obj = (id)CFDictionaryGetValue(cfDict, key);
To make it compile, you must tell ARC about the ownership semantics involved by using special cast annotations. These are:
__bridge, __bridge_retained, and __bridge_transfer.
The easiest to understand is __bridge. It is a direct conversion with no ownership consequences: ARC receives the value and then manages it normally. That is what we want in the case above:
id obj = (__bridge id)CFDictionaryGetValue(cfDict, key);
The other cast annotations transfer ownership into or out of ARC, which makes bridging code simpler.
For example, here is a case where the returned object needs to be released:
NSString *value = (NSString *)CFPreferencesCopyAppValue(CFSTR("someKey"), CFSTR("com.company.someapp"));
[self useValue: value];
[value release];
If we convert this to ARC by adding __bridge and removing the release, leaving everything else unchanged, we get a memory leak:
NSString *value = (__bridge NSString *)CFPreferencesCopyAppValue(CFSTR("someKey"), CFSTR("com.company.someapp"));
[self useValue: value];
The Copy in this code must be balanced with a release. ARC emits a retain when value is initialized and balances it with a release when value is no longer used. Nothing balances the original Copy, so the object is leaked.
We can work around this with a little extra code:
CFStringRef valueCF = CFPreferencesCopyAppValue(CFSTR("someKey"), CFSTR("com.company.someapp"));
NSString *value = (__bridge NSString *)valueCF;
CFRelease(valueCF);
[self useValue: value];
However, this is getting rather verbose. Since the point of toll-free bridging is to be as seamless as possible, and the point of ARC is to reduce memory management code, it would be nice if this could be simpler and more direct.
The __bridge_transfer annotation solves this problem. Rather than simply handing the pointer to ARC, it also transfers ownership. Using __bridge_transfer in a cast tells ARC that the object is already retained, so ARC does not need to retain it again. Because ARC now owns the object, it still releases it when the object is no longer needed. The final result is:
NSString *value = (__bridge_transfer NSString *)CFPreferencesCopyAppValue(CFSTR("someKey"), CFSTR("com.company.someapp"));
[self useValue: value];
Toll-free bridging works in both directions. As before, ARC does not allow an ordinary cast from an Objective-C object pointer to a CoreFoundation object pointer. This code does not compile under ARC:
CFStringRef value = (CFStringRef)[self someString];
UseCFStringValue(value);
Adding a __bridge cast makes it compile, but the resulting code is dangerous:
CFStringRef value = (__bridge CFStringRef)[self someString];
UseCFStringValue(value);
Because ARC does not manage the lifetime of value, it gives up ownership of the object immediately, before the object is passed to UseCFStringValue, which may cause a crash or other unpredictable behavior. With __bridge_retained, we tell ARC to transfer ownership out of its system and into our hands. Since ownership has been transferred, we are now responsible for releasing the object, just as in any other CF code:
CFStringRef value = (__bridge_retained CFStringRef)[self someString];
UseCFStringValue(value);
CFRelease(value);
These cast annotations are useful outside of toll-free bridging as well. They help any time you need to store an object pointer somewhere that is not managed as an Objective-C object. Cocoa has void * context pointers in various places, sheets being a prominent example. In non-ARC code:
NSDictionary *contextDict = [NSDictionary dictionary...];
[NSApp beginSheet: sheetWindow
   modalForWindow: mainWindow
    modalDelegate: self
   didEndSelector: @selector(sheetDidEnd:returnCode:contextInfo:)
      contextInfo: [contextDict retain]];
- (void)sheetDidEnd: (NSWindow *)sheet returnCode: (NSInteger)code contextInfo: (void *)contextInfo
{
    NSDictionary *contextDict = [(id)contextInfo autorelease];
    if(code == NSRunStoppedResponse)
        ...
}
As before, this does not compile under ARC, because ordinary casts between object and non-object pointers are not allowed. With the cast annotations, however, ARC not only allows the conversion but also performs the necessary memory management for us:
NSDictionary *contextDict = [NSDictionary dictionary...];
[NSApp beginSheet: sheetWindow
   modalForWindow: mainWindow
    modalDelegate: self
   didEndSelector: @selector(sheetDidEnd:returnCode:contextInfo:)
      contextInfo: (__bridge_retained void *)contextDict];
- (void)sheetDidEnd: (NSWindow *)sheet returnCode: (NSInteger)code contextInfo: (void *)contextInfo
{
    NSDictionary *contextDict = (__bridge_transfer NSDictionary *)contextInfo;
    if(code == NSRunStoppedResponse)
        ...
}
To sum up:
- __bridge passes a pointer between ARC and non-ARC code without transferring ownership.
- __bridge_transfer converts a non-Objective-C pointer to an Objective-C pointer and transfers ownership, so that ARC releases the value for you.
- __bridge_retained converts an Objective-C pointer to a non-Objective-C pointer and transfers ownership, so that you, the programmer, are responsible for calling CFRelease later or otherwise giving up ownership of the object.
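The three annotations can be seen side by side in a short sketch. The helper names UppercaseCopy and LengthViaRetainedBridge are hypothetical, chosen only for illustration; the CoreFoundation calls are standard CFString functions, and ARC is assumed to be enabled (clang -fobjc-arc).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helpers; build with ARC enabled (clang -fobjc-arc).

// Returns an uppercased copy of the input string using CoreFoundation.
static NSString *UppercaseCopy(NSString *input)
{
    // __bridge: borrow the pointer, ownership stays with ARC.
    CFStringRef borrowed = (__bridge CFStringRef)input;

    // A Create function returns an object that we own (+1).
    CFMutableStringRef upper =
        CFStringCreateMutableCopy(kCFAllocatorDefault, 0, borrowed);
    CFStringUppercase(upper, NULL);

    // __bridge_transfer: hand that +1 to ARC, which releases it later.
    return (__bridge_transfer NSString *)upper;
}

// Measures a string through a CoreFoundation reference that we own.
static CFIndex LengthViaRetainedBridge(NSString *input)
{
    // __bridge_retained: take ownership out of ARC (+1 for us).
    CFStringRef owned = (__bridge_retained CFStringRef)input;
    CFIndex length = CFStringGetLength(owned);
    CFRelease(owned); // balance the retained bridge ourselves
    return length;
}
```

In UppercaseCopy the plain __bridge cast is safe only because input is a parameter that the caller keeps alive for the duration of the call; a value with no other owner would need __bridge_retained, as the text explains.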
Structs
Under ARC, structs and Objective-C object pointers do not mix well. The problem is that the compiler has no good way of knowing when a struct is copied or destroyed, and therefore no good place to insert the necessary retain and release calls. Because the problem is hard, and because putting object pointers in structs is uncommon anyway, ARC gives up on this case entirely. If you want to put an Objective-C object pointer in a struct, you must qualify it with __unsafe_unretained and deal with all the problems that implies.
Because Objective-C pointers are rarely stored in structs, there is a good chance this will not affect your code. If it does, the best option is to turn the struct into a lightweight Objective-C class. ARC then manages the memory for you and the problem goes away.
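The two options might look like the sketch below. UnsafeEntry, Entry, and EntryFromUnsafeEntry are hypothetical names used only for illustration, and ARC is assumed to be enabled (clang -fobjc-arc).

```objc
#import <Foundation/Foundation.h>

// Hypothetical types; build with ARC enabled (clang -fobjc-arc).

// Option 1: keep the struct. The object pointer is qualified with
// __unsafe_unretained, so the struct does NOT keep the string alive.
typedef struct {
    __unsafe_unretained NSString *name;
    NSInteger score;
} UnsafeEntry;

// Option 2: a lightweight class. ARC manages the string's lifetime.
@interface Entry : NSObject
@property (nonatomic, strong) NSString *name;
@property (nonatomic) NSInteger score;
@end

@implementation Entry
@synthesize name = _name, score = _score;
@end

// Converts the unsafe struct into the managed class. The caller must
// guarantee that raw.name is still alive at the time of the call.
static Entry *EntryFromUnsafeEntry(UnsafeEntry raw)
{
    Entry *entry = [[Entry alloc] init];
    entry.name = raw.name;   // Entry now holds a strong reference
    entry.score = raw.score;
    return entry;
}
```

The class costs an allocation per value, but in exchange the name can no longer dangle once its original owner lets go of it.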
Documents and resources
Because Apple's official ARC documentation is not yet public while Xcode 4.2 is in beta, the best source of information is the Clang site:
http://clang.llvm.org/docs/AutomaticReferenceCounting.html
Conclusion
ARC substantially reduces the programmer's memory management burden. ARC is not a garbage collector: it cannot detect retain cycles, which the programmer must still find and break. Even so, it takes much of the grunt work out of writing Cocoa code, and zeroing weak references are a powerful tool for dealing with cycles.
CoreFoundation objects and toll-free bridging are more troublesome. ARC only handles Objective-C, so the programmer must still manage CoreFoundation objects manually. When converting between Objective-C pointers and CoreFoundation pointers, a __bridge cast annotation is needed to tell ARC the memory management semantics of the conversion. That wraps up today's look at Apple's latest language technology.
[1] Tail-call optimization: when the last thing a function does is call another function and return its result, the caller's stack frame can be reused instead of allocating a new one, so that the callee returns directly to the original caller. Consider two examples:
function foo1(data)
    return A(data) + 1;
function foo2(data)
    return A(data);
Only foo2 makes a tail call. foo1 still has to add 1 after A returns, so control must come back to foo1 and foo1's stack frame must be preserved; A's frame cannot simply replace it. foo1 can be rewritten as follows:
function foo1(data)
    return A(data, 1);
Here A adds 1 to its result before returning, so the call to A becomes the last operation in foo1 and is a tail call. Because nothing in foo1's frame can affect the final result any more, that frame can be discarded and its space handed to the callee. The optimization avoids creating an extra stack frame (and allocating extra stack space) for the call. When the tail call is a call to the function itself, it is tail recursion, and the benefit is greater still, because even "infinite" recursion will not overflow the stack.
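The same rewrite can be shown on a recursive function. The sketch below uses plain C functions, which are also valid Objective-C; the names SumTo and SumToAccumulating are hypothetical. Note that C and Objective-C compilers are not required to perform tail-call optimization, and typically do so only when optimizations are enabled, so the second form permits the optimization without guaranteeing it.

```objc
#include <stdint.h>

// Not a tail call: the addition happens after the recursive call
// returns, so every level of recursion needs its own stack frame.
static uint64_t SumTo(uint64_t n)
{
    if (n == 0)
        return 0;
    return n + SumTo(n - 1);
}

// Tail call: the running total travels in a parameter, so the
// recursive call is the last operation and its frame can be reused.
static uint64_t SumToAccumulating(uint64_t n, uint64_t total)
{
    if (n == 0)
        return total;
    return SumToAccumulating(n - 1, total + n);
}
```

Both functions compute the same sum; for example, SumTo(10) and SumToAccumulating(10, 0) each add up the integers from 1 to 10.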