Tuesday, December 28, 2010

Objective-C Memory Management & Garbage Collection

This article started out as a presentation I did for the Des Moines Cocoaheads.

Introduction

Objective-C memory management is something I've seen people new to Cocoa & Objective-C mess up in ways I could not have conceived of on my own. In reality, Objective-C memory management is not that hard. You simply need to be aware of some rules and follow a couple of good development practices. Good memory management practices are worth following on any platform. You don't want to be known as "that" app that hogs lots of memory; I have switched to alternative apps a couple of times because the apps I was using were otherwise fine but consumed far too much memory. On iOS especially you need to follow good memory practices, because otherwise you'll be struggling with low memory warnings and the possibility of your app being killed by iOS.

Objective-C Retain Count

Objective-C in retain count mode (not using garbage collection) is a simple idea. When you explicitly allocate an object it gets a retain count of 1. Each retain increments that count and each release decrements it (an autorelease performs the same decrement, just later), and when the count reaches zero the object is deallocated. It is the only mode available on iOS devices and has been in use on Mac OS X since the beginning of the OS.

NSObject *object = [[NSObject alloc] init]; //retain count 1
[object release]; //retain count drops to 0, object is deallocated

When an object's last owner releases it, -dealloc gets called on the object and its memory is reclaimed. It's important to note that you never call -dealloc on an object directly. In fact, in all my time as a Cocoa developer I've heard of only 1 legitimate use of calling -dealloc on an object directly. I won't say what it is, but suffice it to say you would have a lot of Cocoa/Objective-C experience behind you before you would even conceive of doing this. In the same area as -release there is also -autorelease. -autorelease is the same as -release except that it performs the release in the future. This works great because an object you know you need to release later can be taken care of at the beginning of a method or section of code.

-(void)doFoo:(BendingUnit *)bender
{
 if(flexo)
  [flexo autorelease]; //will be sent release later
 ...


Owning an Object
In all these cases when we explicitly perform an +alloc we "own" the object, and when we call -release or -autorelease we relinquish ownership. If you call a method whose name begins with 'alloc', 'new', 'copy' or 'mutableCopy', or if you send a retain message to an object, you now own the object and it is your responsibility to send it a -release or -autorelease at an appropriate time. An example of assuming ownership of an object...

-(NSInteger)numberOfInstancesOfString:(NSString *)pattrnString
{
 NSString *patternString = [[pattrnString retain] autorelease];
 NSInteger count = 0;
 // do some stuff here with the string
 return count;
}

In the above example we are doing something very wise. We are passed an NSString object whose lifetime we don't control. To ensure that the string lives throughout our method we call -retain on the string object, and -autorelease on it as well so that it is later restored to the state it was originally in. If this is a normal string object (retain count 1) then all we are doing is incrementing its retain count to 2, and once the current autorelease pool drains (some time after the method returns) its retain count will be back at 1. However, if we are in the middle of executing this method and somewhere else this object gets a release message, we are still guaranteed to be able to use the object for the lifetime of the method, and only afterwards will the passed object be released. Note that by calling retain on the string object we assumed ownership of the object, and then also did the responsible thing and called autorelease to relinquish that ownership. What happens when we don't follow the rules?

-(void)dooFoo
{
 NSString *myString = [[NSString alloc] initWithString:@"Good News Everyone!"];
 //... do some stuff
 //... oh noes we are at the end of the method and never sent release to myString!
}

Oh Noes! At the beginning of this method we create an NSString pointer, explicitly allocate memory for an NSString object and initialize it with a string literal. The method does some things and then it goes away. However, the memory that we allocated is still there, and we no longer have a reference to it, leaving it forever uncollectible (except under garbage collection). This is a memory leak: memory we asked the operating system to set aside for us that we can no longer reclaim.
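
For contrast, here is a minimal sketch of the same kind of method with the ownership balanced. It assumes manual retain/release (with a current clang that means compiling with -fno-objc-arc). The Professor class, the gDeallocCount counter and the function name are hypothetical, added only so the deallocation can be observed without ever touching -retainCount.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical counter, bumped each time a Professor is deallocated.
static NSInteger gDeallocCount = 0;

// Hypothetical class used only to make deallocation observable.
@interface Professor : NSObject
@end

@implementation Professor
- (void)dealloc
{
    gDeallocCount++;   // record that the memory is being reclaimed
    [super dealloc];   // always last under manual retain/release
}
@end

// We own what +alloc returns, so we must release it before returning.
static void doFooWithoutLeaking(void)
{
    Professor *farnsworth = [[Professor alloc] init]; // we own it
    // ... do some stuff
    [farnsworth release]; // relinquish ownership; -dealloc runs here
}
```

After one call to doFooWithoutLeaking() the counter reads 1. Delete the release line and it stays at 0, which is the leak described above.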


NSAutoreleasePool

Autorelease pools collect objects that have been sent an autorelease message; sending the NSAutoreleasePool a drain message sends each of those objects the release it is owed. When you are running an application based on AppKit, the framework automatically creates an NSAutoreleasePool at the start of each pass through the event loop and drains it at the end of that pass, so autoreleased objects normally live until the current event has been handled. You can also set up and drain an autorelease pool yourself easily.

NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
NSString *temp = [NSString stringWithString:@"Name:"];
NSNumber *tempNumber = [NSNumber numberWithInt:55];
[pool drain]; //collects temp & tempNumber

You can also nest autorelease pools within other autorelease pools.

int main(int argc, const char *argv[])
{
 NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
 NSArray *files = //fileArray...
 for(NSString *file in files) {
  NSAutoreleasePool *innerPool = [[NSAutoreleasePool alloc] init];

  NSError *fileError = nil;
  NSString *contents = [NSString stringWithContentsOfFile:file encoding:NSUTF8StringEncoding error:&fileError];
  //process contents...
  [innerPool drain];
 }
 [pool drain];
 return 0;
}

This can be convenient, especially on iOS, for allocating a bunch of objects and freeing them as soon as possible to keep memory pressure down. In some cases autorelease pools are strictly necessary: if you intend to work with Foundation in a command line tool you have to explicitly create an autorelease pool, and the same goes for an NSOperation's -main method (except in garbage collected apps, where this is done automatically for you).
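
To make the memory pressure argument concrete, here is a small sketch that counts how many temporary objects are alive at once with and without an inner pool. It assumes manual retain/release (compile with -fno-objc-arc on a current clang). TempObject, its +temp convenience constructor, the two counters and processItems() are all hypothetical names made up for this illustration.

```objective-c
#import <Foundation/Foundation.h>

static NSInteger gLiveTemps = 0;     // hypothetical: temporaries alive right now
static NSInteger gPeakLiveTemps = 0; // hypothetical: the most alive at one time

// Hypothetical class that tracks how many instances exist.
@interface TempObject : NSObject
+ (id)temp;
@end

@implementation TempObject
+ (id)temp
{
    // Convenience constructor: the caller does not own the result.
    return [[[self alloc] init] autorelease];
}
- (id)init
{
    if ((self = [super init])) {
        gLiveTemps++;
        if (gLiveTemps > gPeakLiveTemps) gPeakLiveTemps = gLiveTemps;
    }
    return self;
}
- (void)dealloc
{
    gLiveTemps--;
    [super dealloc];
}
@end

// Creates one autoreleased temporary per iteration, optionally wrapping
// each iteration in its own pool.
static void processItems(NSUInteger count, BOOL useInnerPool)
{
    NSAutoreleasePool *outerPool = [[NSAutoreleasePool alloc] init];
    for (NSUInteger i = 0; i < count; i++) {
        NSAutoreleasePool *innerPool = nil;
        if (useInnerPool) innerPool = [[NSAutoreleasePool alloc] init];

        [TempObject temp];   // autoreleased temporary
        //process it...

        [innerPool drain];   // sending drain to nil does nothing
    }
    [outerPool drain];       // releases whatever is still pending
}
```

With the inner pool only one temporary is ever alive at a time; without it, every temporary created in the loop stays alive until the outer pool drains.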


Coding Horrors

A while before I gave the presentation this article is based on, I asked on Twitter for some ways people had seen memory abused in Objective-C. From the responses I got back, and some things I have heard from various people I won't name here, a lot of people have abused Objective-C memory in odd ways. I want to feature some of them here in the hope that you will never commit these atrocities yourself.


Exhibit 1
-(void)dealloc
{
 while([myObject retainCount] != 0){
  [myObject release];
 }
 [super dealloc];
}

This is wrong on all sorts of levels. Putting this in your code is a surefire way to show someone how little you understand about Objective-C. First of all, let me say this:

morvo_retain_count.png

If you've been dealing with Objective-C for a while you know that you shouldn't be using the value of retain count: it never actually goes down to zero, and you should never be testing against zero anyway. In fact the only reason retain count is still in the Cocoa framework is for legacy reasons, and there are a lot of developers petitioning Apple to remove retain count from Cocoa entirely. In any case, the above code should only have 1 release per object in it. If you have objects that have a retain count of more than 1 by the time they've hit this method then you have not properly released or autoreleased the object somewhere else.
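
For comparison, here is a minimal sketch of what a well-behaved -dealloc looks like: one release for each ownership claim the object took, followed by [super dealloc]. It assumes manual retain/release (compile with -fno-objc-arc on a current clang). The Part and Robot classes, the gPartDeallocs counter and buildAndScrapRobot() are hypothetical names used only for this illustration.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical counter so we can observe Part being deallocated.
static NSInteger gPartDeallocs = 0;

@interface Part : NSObject
@end

@implementation Part
- (void)dealloc
{
    gPartDeallocs++;
    [super dealloc];
}
@end

@interface Robot : NSObject
{
    Part *part; // owned: retained in -initWithPart:
}
- (id)initWithPart:(Part *)aPart;
@end

@implementation Robot
- (id)initWithPart:(Part *)aPart
{
    if ((self = [super init])) {
        part = [aPart retain]; // take ownership exactly once
    }
    return self;
}
- (void)dealloc
{
    [part release];  // one release to balance the one retain
    [super dealloc]; // no loops, no -retainCount
}
@end

static void buildAndScrapRobot(void)
{
    Part *arm = [[Part alloc] init];                 // we own arm
    Robot *robot = [[Robot alloc] initWithPart:arm]; // robot retains arm
    [arm release];   // we are done with arm; robot still owns it
    [robot release]; // robot's -dealloc releases arm, which is then deallocated
}
```

Because every retain is balanced by exactly one release, the part is deallocated exactly once and nobody has to ask what the retain count is.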


Exhibit 2
MyObject *obj = [[MyObject alloc] init];
//do stuff with obj...
[obj dealloc];

Again, you shouldn't call dealloc on an object directly. In fact, in all of our Cocoaheads group only Eric Rocesecca (a longtime Mac veteran, formerly of startly (quickeys) fame) could come up with a use for sending dealloc directly to an object, and only for 1 specific case. I won't go into that case, because it's a niche case & beyond the scope of this article.


Exhibit 3
MyClass *obj = [[[[MyClass alloc] init] autorelease] retain];
//do stuff with obj...
[obj release]; 

In this specific example the extra autorelease and retain are just garbage tacked on the end, since you could have just the alloc and init, do the release, and get the exact same effect. This code was shown to me as being in some of Facebook's open source code.


Exhibit 4
MyClass *myObj = [[MyClass alloc] init];
//...do stuff
[myObj release];
[myObj release]; 

If you have multiple retains/releases in 1 place it's a very strong signal that you have not done a proper retain or release somewhere else. You should examine your code to work out where the matching retain and release really belong.


Exhibit 5
NSAutoreleasePool *pool = //...
//... do stuff
if(num != kMax)
{
 NSAutoreleasePool *pool2 = //...
 NSString *myStr = //...
 //do stuff with myStr
 [pool2 drain];
}
[pool drain];

This is something that I could not have conceived of on my own. If you are using a lot of autorelease pools as a fix for memory leaks, you're doing something wrong. There is a legitimate use for autorelease pools every so often as a means of relieving memory pressure, especially if you are allocating a bunch of objects at once on iOS, but you shouldn't be putting them everywhere because you detected a memory leak.


Exhibit 6
[self dealloc];
while([self retainCount])
 [self release]; 

Similar to Exhibit #1, except with 2x the stupidity. So how do we find what's wrong with our apps?

instruments_memory.png

There are several things you can do:
1. Turn on the Clang Static Analyzer. I basically duplicated my debug configuration, named it Analyze and then flipped on the "Run Clang Static Analyzer" checkbox so the analyzer runs on every build. The Clang Static Analyzer will point out many (but not all) places where it can see you are doing something wrong with memory.
2. Periodically profile the memory use of your app.
3. Run the Zombies instrument on your app to track down messages sent to deallocated objects.

Objective-C Garbage Collection with AutoZone (libAuto)

Starting in Mac OS X 10.5 we got automatic memory management on Mac OS X with AutoZone (aka libAuto), with 2 collection modes: generational and full. Starting in Mac OS X 10.6 we got an additional mode called local. Garbage collection works on an entirely different principle than retain count. Essentially what you are doing in garbage collection (gc) mode is trading a little bit of CPU time for the sake of letting libauto collect objects that are out of scope and no longer referenced. At this time garbage collection is only available on the Mac. There has been speculation on when it might be made available on iOS, but I suspect that it'll only become available once the low end of all supported iOS devices is running at least the Apple A4 chip, which means the iPhone 3G and iPhone 3GS would both need to be phased out before this is feasible.
Apple has described AutoZone as:
"libauto is a scanning, conservative, generational, multi-threaded garbage collector."
This means several things. The basic principle of libauto is that from time to time it will scan the memory in use by your garbage collected (gc) app and collect out of scope memory. It is conservative in that Apple has said that when big events happen (the user begins typing, heavy CPU use kicks in, etc.) libauto will just back out and stop collecting rather than possibly slow your application down during critical events. (In garbage collection terminology "conservative" more commonly means that the collector treats any value that might be a pointer as a live reference.) The generational part is how libauto scanned your app when version 1.0 came out in Leopard: it had 2 modes, generational (intended to run frequently) and full (slower, and run less frequently than generational), and now in 10.6 Snow Leopard we have local as well. I'll go into how these modes work later on. LibAuto is also language agnostic: it is not intended to work only with Objective-C & C, and it is in use in Ruby via MacRuby right now. Additionally libAuto is open source; you can browse the source code at http://opensource.apple.com/source/libauto/libauto-141.2/ and download a zip file from http://opensource.apple.com/release/mac-os-x-1065/.

Garbage collection is nothing new. Many other programming languages such as Ruby, JavaScript, Python & Java have had garbage collection for a long time now, but no 2 garbage collectors work exactly the same. When we move to garbage collection, having to remember your retain & release messages becomes a thing of the past; instead you become concerned with having references to all the objects you need and whether those references are strong or weak. Many people have asserted that the same apps under garbage collection are generally less crash prone. Many apps already use garbage collection, such as Xcode, Interface Builder, Mac OS X system apps, Rapidweaver, etc.

How libAuto works

libAuto works on a fairly easy concept: at various points during your application's run it will, on a background thread, scan your application's memory for out of scope memory and collect what it can. You can think of it on a simple level like so
gc-diagram-1.png
We have an application with many pieces of memory allocated, some of which are no longer referenced by anything. Additionally I put a weak reference in this diagram, which I will go into in more detail soon. What libAuto does is scan your application for root objects and then follow the strong references from those objects to all the other objects it can reach. When it's done it knows all the objects that are reachable from the root objects in the application and marks all the unreachable objects for collection. LibAuto then comes in and collects them. It has 3 modes it works in when scanning your memory and collecting objects.

Generational Mode
This is the mode that was used most often on Leopard. It works on the assumption that most objects are temporary and die young, and so the collector will try and focus on the young objects and not the long lived older objects in the app which have less likelihood of needing to be collected.

[NSDateFormatter setDefaultFormatterBehavior:NSDateFormatterBehavior10_4];
NSDateFormatter *formatter = [[NSDateFormatter alloc] init];
[formatter setDateFormat:dateFormat];
NSDate *returnedDate = [formatter dateFromString:dateString];

return returnedDate;

In this example the purpose of the code is to return an NSDate object. However, in the process we allocate an NSDateFormatter object. The date formatter is only there temporarily so that we can set an attribute on it and have it create another object (our NSDate object), and then we basically just ditch the NSDateFormatter. When the garbage collector comes along it will notice that the NSDateFormatter is no longer referenced and collect it. You probably do this a lot in your code, creating temporary objects in the process of accomplishing the task you are trying to achieve.


Full Mode
Full mode pretty much works as you would expect: it runs through all the memory in your application and does a thorough collection. It takes the longest amount of time and therefore doesn't run as often as the other modes.
Up until 10.6 these were the only 2 modes in libAuto. In 10.6 we gained a new mode called local.


Local
Local mode works on an entirely different principle than the other 2 modes shown so far. This mode only scans 1 thread's stack, looking at allocations between 1 and 96 bytes, which according to Apple accounts for a majority of the objects in use. Because it's not scanning your whole app and is only focusing on 1 thread it is much quicker than the other 2 modes. It will not scan Core Foundation objects which have a retain count of 1; you'll need to do a CFRelease on Core Foundation objects to make them eligible for collection.

Working with the Garbage Collector

Now we have a basic understanding of how the garbage collector works, but there are some things we still need to examine before we really have a full understanding of how to work with the garbage collector.


How to trigger the Garbage Collector
The garbage collector class is NSGarbageCollector. The 2 methods you will probably use most often are -collectIfNeeded and -collectExhaustively. The idea is that you, being the architect of the app, know the best points while your application is running to trigger the collector. For instance, you may kick off a bunch of code at startup in -awakeFromNib, and you know the 1 or few possible points at which that arc of startup events ends. At those points you may wish to signal to the garbage collector that nothing is going on and that it can go through your app and collect memory.

-collectIfNeeded - Use this to suggest to the garbage collector that now is a good time to go and collect memory in the application
-collectExhaustively - Use this to force the garbage collector to collect memory in the application

Again even though you may suggest to the garbage collector that it should collect memory it may at anytime stop collection because of things like the user beginning to interact with your application.


Foundation Tools
In Cocoa apps the garbage collector thread is automatically started for you. However, in Foundation tools you need to start the garbage collector manually by calling objc_startCollectorThread(). You can also use the function objc_collect(OBJC_COLLECT_IF_NEEDED) to trigger the collector. If you do this in a Foundation tool, Apple also suggests that you call objc_clear_stack() to make sure nothing is falsely rooted on the stack.


__weak
The __weak qualifier is needed to solve a very important problem. If you have 2 objects pointing to each other and at least 1 of those objects has a strong reference to it, then those 2 objects will never go away because as far as the garbage collector is concerned they are reachable through root objects & strong references and therefore not eligible for collection. By default all references are strong references so simply having a pointer to other objects makes them visible to the garbage collector and thus not eligible for collection.
Screen shot 2010-12-22 at 2.06.47 PM.png
These are both objects which have valid strong pointer references. The 2nd window may have had another object reference it at some point in time, but that object stopped referencing it, and now you have a pointer to it without necessarily needing to keep it alive. __weak solves this by allowing you to still have a pointer to the object, qualified as weak so that it is simply set to nil when the object goes away.
gc_windows2.png
This way, with weak references, the 2nd window becomes eligible for garbage collection. You still have a reference to that object, and once it has been collected any messages you send through that pointer just go to nil; if you check the pointer you'll see it is nil. This has other benefits: in Mac OS X 10.6 and later NSNotificationCenter holds its observers weakly, so you no longer need to do the following in your code

[[NSNotificationCenter defaultCenter] removeObserver:self
      name:kObservationName
      object:nil];

That's right: since notification observers are weakly referenced, you no longer need to remove yourself as an observer under garbage collection.


NSMapTable & NSHashTable
Under garbage collection all references to objects are considered to be strong by default. As a result you have to make references weak either explicitly, by using __weak, or by using a class specifically configured to use weak references. These classes include NSHashTable, which is modeled after NSSet, and NSMapTable, which is modeled after NSDictionary; both can be configured to hold weak references.


Say goodbye to -dealloc, Say hello to -finalize
Under garbage collection the garbage collector will send the -finalize message to objects instead of -dealloc when collecting them. Most of the time you won't even need to implement it, as you will mainly need -finalize for letting go of external resources, such as closing files. In much the same way that you call [super dealloc] at the end of -dealloc, you also need to call [super finalize] at the end of -finalize if you implement it. Additionally your finalize methods need to be thread-safe.
-(void)finalize
{
 if(myFileRef != NULL) {
  //close file
 }
 [super finalize]; 
}


Making malloc'd memory collectable
There are times when you need to manually allocate memory through malloc or other such functions. Unfortunately the garbage collector does not automatically pick up on memory allocated like this. In order to remedy this Apple has provided the function NSAllocateCollectable().


Core Foundation & Garbage Collection
There are times when you need to dig down into Core Foundation to get functionality not found in the Cocoa frameworks. As such, Core Foundation has to work with the garbage collector in gc apps. Luckily, when allocating Core Foundation objects, if you provide NULL, kCFAllocatorDefault or kCFAllocatorSystemDefault as the allocator, they all allocate memory from the garbage collection zone. In fact by default all Core Foundation objects allocate from the garbage collection zone.

With Core Foundation, any objects you allocate need to be balanced with either CFRelease or CFMakeCollectable. Personally I use CFMakeCollectable as I think it makes the intent of the code more clear. Technically speaking CFRelease and CFMakeCollectable are nearly identical; if you do another CFRetain on a Core Foundation object you will need to use a CFRelease or CFMakeCollectable to balance it out. CFMakeCollectable is just a no-op in a retain/release environment, so if you are writing code to run in both environments it's pretty easy to tell which one you need by checking whether [NSGarbageCollector defaultCollector] is nil.
if ([NSGarbageCollector defaultCollector] == nil) CFRelease(myCFString);


How do I turn this on in my projects?
Garbage collection is currently only available for Mac OS X projects and is not available on iOS yet. To turn on garbage collection in your application, open your project settings and search for "garbage collection".
gc_setting.png
There are 3 possible values you can have for your application.
(1) No GC Flag - Garbage Collection is not supported
(2) -fobjc-gc-only - When compiling with this set, Garbage Collection is required. If you have some bit of code in an external project that is gc only and try to use it in a project that does not support GC then you will get linker errors
(3) -fobjc-gc - When compiling with this both Garbage Collection code and traditional Retain/Release code is generated and the code can be loaded into any app. You probably will not compile apps with this setting and instead use this for Frameworks.
Typically you'll make your apps either require garbage collection or not use it at all, and your frameworks will be built in supported mode (generating both gc and retain/release code).


A note on Debugging GC
When using Heapshot in Instruments you should set the environment variable AUTO_USE_TLC = NO, because unfortunately the Heapshot feature and the GC thread local collector don't work well together. Otherwise Heapshot will think you have more memory allocated than you really do. This and other GC environment variables are mentioned in the famous Mac OS X Debugging Magic Tech Note.
Instruments also contains a great instrument to show you when, for how long and how much the collector collected at various points in your application. If you are writing any gc code it would be to your advantage to run this every so often and get a feel for when the collector is running and what modes it is running in.
gc_modes.png

Conclusion

Managing memory in Objective-C is not nearly as hard as some would make it out to be. It's merely a matter of knowing the few memory management rules and periodically running Instruments on your app, along with the Clang Static Analyzer in Xcode, to eliminate memory issues. Garbage collection eliminates many of the issues associated with manual Objective-C retain/release, but you still have to know a few rules and how the collector works.

Related Reading

Mac OS X Debugging Magic Tech Note
Memory Management Programming Guide
Garbage Collection Programming Guide
Instruments User Guide

12 comments:

Michelle said...

I've never really seen the need for explicit autorelease pools. Every time I make a lot of objects at once it is because I am putting them in a collection. After I add an object to the collection, I just release it immediately. Is there some other common use case I'm not familiar with?

Colin Wheeler said...

Michelle,

Usually when you are talking about explicit autorelease pools you may be doing it to relieve memory pressure on the system in order to make the memory go away now vs later on. This is especially useful to do on iOS where memory is severely limited and any bit you can do to keep memory pressure down is good. It may also be useful to do on Mac OS X as well just to control when memory is cleared out in retain/release apps and keep your memory consistently down.

Michelle said...

Colin: Doesn't an explicit "release" also happen immediately?

Colin Wheeler said...

Michelle,

Not for autoreleased objects, things like

NSString *myString = [NSString stringWithFormat:@"%i count",count];

this is for implicitly allocated objects that you don't explicitly allocate yourself vs say a NSString that you do allocate yourself through doing say

NSString *myString = [[NSString alloc] initWithFormat:@"%i count",count];

In the 1st example the NSString is implicitly allocated and autoreleased for me, so it will go away when the autorelease pool it was created in is drained. In the 2nd example I've explicitly called alloc myself. If I do a lot of things like the 1st example, then creating them inside an autorelease pool & explicitly draining that pool will clear out all the implicitly allocated objects. This is what I am referring to when I talk about using NSAutoreleasePools to keep memory pressure down: you may create a lot of temporary objects like this, and if you put them inside well designed autorelease pools it keeps memory use down.

Colin Wheeler said...

Or to put it another way: without an autorelease pool here you'd create a lot of temporary objects and they wouldn't get deallocated until after the whole loop had finished. Putting in an autorelease pool ensures that on every iteration of the loop the temporary objects get cleared out of memory right away vs at the end of the whole operation

for(id obj in array){
NSAutoreleasePool *pool = //alloc pool

NSString *string = [NSString stringWithFormat...
NSNumber *num = [NSNumber numberWith...
NSDictionary *dict = [NSDictionary dictWith...

MyObject *myObj = [[MyObject alloc] init...
//use temp objects
//do something with obj

[myObj release];
[pool drain];
}

it's not the myObj object we are worried about but the string, number & dictionary we implicitly allocated so that we could do something with myObj.

Michelle said...

Right. I was talking about objects created with alloc+init.

MyObject *obj = [[MyObject alloc] init];
[myDict setObject: obj forKey: obj.key];
[obj release];

The usual case is when I'm building custom objects from statements returned by a database query and caching them in a dictionary.

Colin Wheeler said...

Okay, I see what you're saying. Yes, for objects like that autorelease pools won't do much because you've explicitly allocated the object and are releasing it yourself. You'd only be concerned about objects sent the autorelease message within an autorelease pool (i.e. those implicitly allocated objects), and you'd create those pools to release the memory now vs later on.

An explicit autorelease pool for the bit of code you're showing wouldn't really do anything, assuming myDict has been sent the alloc message.

Michelle said...

I guess one thing to look out for would be autoreleasing objects buried deeper in the call stack -- even that alloc+init call could hide some autoreleases.

Thanks for the prompt responses.

cncool said...

What is the one case where calling dealloc on an object is appropriate?

lowell said...

> In all these cases when we explicitly perform an -alloc

Should be +alloc as it's a class method.

Anonymous said...

Why aren't you autoreleasing the AutoReleasePools you allocate?

Colin Wheeler said...

lowell
Yes that should be +alloc

Anonymous
From the documentation "In a reference-counted environment, releases and pops the receiver; in a garbage-collected environment, triggers garbage collection if the memory allocated since the last collection is greater than the current threshold." In other words for my example I don't need to.

 
...