Wednesday, September 16, 2009

A Guide to Blocks & Grand Central Dispatch (and the Cocoa APIs making use of them)

Intro

As you may or may not know, I recently gave a talk at the Des Moines Cocoaheads in which I reviewed Blocks and Grand Central Dispatch. I have tried to capture the content of that talk, and a lot more, in this article. The talk covered

  • Blocks
  • Grand Central Dispatch
  • GCD Design Patterns
  • Cocoa APIs using GCD and Blocks

All of the content of this article applies only to Mac OS X 10.6 Snow Leopard, as blocks support and Grand Central Dispatch are only available there. There are alternate ways to get blocks onto Mac OS X 10.5 and the iPhone OS via projects like Plausible Blocks, which provide blocks support but not Grand Central Dispatch (libdispatch).

Grand Central Dispatch is Open Source

I should mention that Apple has in fact open sourced libdispatch (Grand Central Dispatch) on Mac OS Forge. The other components, like the kernel support for GCD (which is not necessary if GCD is implemented on other OS's) and the blocks runtime support, are also freely available. If you want, you can even check out the libdispatch repository using Git with the command git clone git://


Blocks

Blocks are part of a new C language extension, and are available in C, Objective-C, C++ and Objective-C++. Right off the bat, I should say that while we will use blocks with Grand Central Dispatch a lot later on, they are not required when using Grand Central Dispatch. Blocks are very useful on their own even if you never use them with anything else. However, we gain a lot of benefit when we use blocks with Grand Central Dispatch, so pretty much all my examples here will use blocks.

What are blocks? Well let me show you a very basic example of one

^{ NSLog(@"Inside a block"); }

This is about as basic as a block gets: it accepts no arguments and contains a single NSLog() statement. Think of a block as a function or snippet of code that can accept arguments and captures its lexical scope. Other languages have had something like this concept for a while (since the 70's at least, if I remember correctly). Here are a couple of examples of this concept in one of my favorite languages, Python

>>> f = lambda x,y,z: x + y + z
>>> f(2,3,4)
9

Here we are defining a lambda in Python, which is basically a function that we can execute later on. After the lambda keyword you list the arguments to the left of the colon, and to the right is the expression that will be evaluated. So in the first line of code we've defined a lambda that accepts 3 arguments and, when invoked, adds them together; hence when we invoke f like f(2,3,4) we get 9 back. We can do more with Python lambdas. Python has functions that take lambdas as arguments, like in this example...

>>> reduce((lambda x,y: x+y),[1,2,3,4])
10
>>> reduce((lambda x,y: x*y),[1,2,3,4])
24

The reduce function takes a lambda that accepts 2 arguments and uses it to fold a list down to a single value (in Python 3, reduce lives in the functools module). In the first example the lambda simply adds its two arguments. Python begins by calling the lambda with the first 2 elements of the list, then calls the lambda again with that result and the next element, and keeps going until it has worked through the whole list. In other words, the first example is evaluated as (((1 + 2) + 3) + 4), which gives 10, and the second as (((1 * 2) * 3) * 4), which gives 24.

Blocks bring this concept to C and do a lot more. You might ask yourself "But haven't we already had this in C? I mean there are C Function Pointers." Well yeah, but while blocks are similar in concept, they do a lot more than C Function Pointers, and even better if you already know how to use C function pointers, Blocks should be fairly easy to pick up.

Here is a C Function Pointer Declaration...

 void (*func)(void); 

...and here is a Block Declaration...

 void (^block)(void); 

Both declare a variable that can refer to code which returns nothing (void) and takes no arguments. The only differences are the name and that we've swapped the "*" for a "^". So let's create a basic block

 int (^MyBlock)(int) = ^(int num) { return num * 3; }; 

The declaration is laid out like so. The first int signifies that the block returns an int. The (^MyBlock)(int) declares a block variable named MyBlock that accepts an int as an argument. Then the ^(int num) to the right of the assignment operator is the beginning of the block literal; it means this is a block that accepts an int as an argument (matching the declaration on the left). Then the { return num * 3; } is the actual body of the block that will be executed.

Once we've defined the block as shown above, we can call it just like a function (and also assign it to other variables or pass it in as an argument) like so...

int aNum = MyBlock(3);

printf("Num %i",aNum); //9

Blocks Capturing Scope

When I said earlier that blocks capture lexical scope, this is what I meant: blocks are not only useful as a replacement for C function pointers, they also capture the state of any variables you reference within the block itself. Let me show you...

int spec = 4;
int (^MyBlock)(int) = ^(int aNum){
 return aNum * spec;
};

spec = 0;
printf("Block value is %d",MyBlock(4));

Here we've done a few things. First I declared an integer and assigned it a value of 4. Then we declared the block variable and assigned it an actual block implementation, and finally we called the block in a printf statement. And it prints out "Block value is 16"? Wait, we changed spec to 0 just before we called it, didn't we? Well yes, we did. But what a block actually does is create a const copy, at the moment the block is created, of any local variable you reference inside it that is not passed in as an argument. So we can change the variable spec to anything we want after creating the block, but unless we pass the variable in as an argument the block will always return 16, assuming we call it as MyBlock(4). I should also note that we can use C's typedef to make referencing this type of block easier. So in other words...

int spec = 4;
typedef int (^MyBlock)(int);
MyBlock InBlock = ^(int aNum){
 return aNum * spec;
};

spec = 0;
printf("InBlock value is %d",InBlock(4));

is exactly equivalent to the previous code example. The only difference is that the latter is more readable.

__block

Blocks also come with a new storage attribute that you can affix to variables. Let's say that in the previous example we want the block to read our spec variable by reference, so that when we change spec our call to InBlock(4) returns what we expect it to return, which is 0. To do so, all we need to do is add __block to spec like so...

__block int spec = 4;
typedef int (^MyBlock)(int);
MyBlock InBlock = ^(int aNum){
 return aNum * spec;
};

spec = 0;
printf("InBlock value is %d",InBlock(4));

and now the printf statement finally spits out "InBlock value is 0", because now it's reading in the variable spec by reference instead of using the const copy it would otherwise use.

Blocks as Objective-C objects and more!

Going through this, you might be thinking that blocks are great but could have some problems with Objective-C. Not so: blocks are Objective-C objects! They have an isa pointer and respond to basic messages like -copy and -release, which means we can declare them as Objective-C properties like so...

@property(copy) void(^myCallback)(id obj);
@property(readwrite,copy) MyBlock inBlock;

and in your Objective-C code you can call your blocks through the property like so: self.inBlock(4);

Finally I should note that while debugging your code there is a new GDB command specifically for calling blocks like so

$gdb invoke-block MyBlock 12 //like MyBlock(12)
$gdb invoke-block StringBlock "\" String \""

These give you the ability to call your blocks and pass in arguments to them during your debug sessions.

Grand Central Dispatch

Now on to Grand Central Dispatch (which I may just refer to as GCD from here on out). Unlike past additions to Mac OS X such as NSOperation or NSThread subclasses, Grand Central Dispatch is not just a new abstraction around what we've already been using. It's an entirely new underlying mechanism that makes multithreading easier and lets your code be as concurrent as it can be, without you worrying about variables like how much work your CPU cores are doing, how many CPU cores you have and how many threads you should spawn in response. You just use the Grand Central Dispatch APIs and it takes care of scheduling the appropriate amount of work. This is also not just for Cocoa: anything running on Mac OS X 10.6 Snow Leopard can take advantage of Grand Central Dispatch (libdispatch) because it's included in libSystem.dylib. All you need to do is add #import <dispatch/dispatch.h> to your app and you'll be able to take advantage of Grand Central Dispatch.

Grand Central Dispatch also has some other nice benefits. I've mentioned this before in other talks, but in OS design there are 2 main memory spaces (kernel space and user land). When code you call executes a syscall and digs down into the kernel, you pay a time penalty for doing so. With some of its APIs, Grand Central Dispatch will do its best to bypass the kernel and return to your application without a syscall, which makes it very fast. However, if GCD needs to, it can go down into the kernel, execute the equivalent system call and return to your application.

Lastly, GCD does some things that threading solutions in Leopard and earlier did not do. For example, NSOperationQueue in Leopard took in NSOperation objects, created a thread, ran the NSOperation's -(void)main on that thread, killed the thread, and repeated the process for each NSOperation object it ran. Pretty much all we did on Leopard and earlier was create threads, run them and then kill them. Grand Central Dispatch, however, has a pool of threads. When you call into GCD it gives you a thread to run your code on, and when your code is done the thread goes back to GCD. Additionally, when a queue in GCD has multiple blocks to run, it will keep the same thread(s) running and run multiple blocks on them, and only when it has no more work to do will it hand the thread back to GCD. So with GCD on Snow Leopard we get a nice speed boost just by using it, because we reuse resources over and over again, and when we aren't using them we give them back to the system.

This makes GCD very nice to work with: it's very fast, efficient and light on your system. Even so, you should make sure that the blocks you give to GCD contain enough work to make a thread and concurrency worthwhile. You can also create as many queues as you want to match however many tasks you are doing; the only constraint is the memory available on the user's system.

GCD API

So if we have a basic block again like this

^{ NSLog(@"Doing something"); }

then to get this running on another thread all we need to do is use dispatch_async() like so...

dispatch_async(queue,^{
 NSLog(@"Doing something");
});

So where did that queue reference come from? Well, we just need to create or get a reference to a Grand Central Dispatch queue (dispatch_queue_t) like this

dispatch_queue_t queue = dispatch_get_global_queue(0,0);

which, in case you've seen this form elsewhere, is equivalent to

dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT,0);

In Grand Central Dispatch the two most basic things you'll deal with are queues (dispatch_queue_t) and the APIs that submit blocks to a queue, such as dispatch_async() and dispatch_sync(). I'll explain the difference between the two later on. For now let's look at the GCD queues.

The Main Queue

The Main Queue in GCD is analogous to the main app thread (aka the AppKit thread). The Main Queue cooperates with NSApplicationMain() to schedule the blocks you submit to it to run on the main thread. This will be very handy later on; for now, this is how you get a handle to the main queue

dispatch_queue_t main = dispatch_get_main_queue();

or you could just call dispatch_get_main_queue() inside of a dispatch call like so

dispatch_async(dispatch_get_main_queue(),^ {....

The Global Queues

The next type of queue in GCD is the global queue. There are 3 of them to which you can submit blocks. The only difference between them is the priority with which their blocks are dequeued. GCD defines the following priorities, which you use to get a reference to each of the queues...

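enum {
 DISPATCH_QUEUE_PRIORITY_HIGH = 2,
 DISPATCH_QUEUE_PRIORITY_DEFAULT = 0,
 DISPATCH_QUEUE_PRIORITY_LOW = -2,
};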

When you call dispatch_get_global_queue() with DISPATCH_QUEUE_PRIORITY_HIGH as the first argument you get a reference to the high priority global queue, and so on for the default and low. As I said earlier, the only difference is the order in which GCD empties the queues. By default it will dequeue the high priority queue's blocks first, then the default queue's blocks and then the low priority queue's. This priority doesn't really have anything to do with CPU time.

Private Queues

Finally there are the private queues. These are your own queues, and they dequeue blocks serially. You can create them like so

dispatch_queue_t queue = dispatch_queue_create("com.MyApp.AppTask",NULL);

The first argument to dispatch_queue_create() is essentially a C string which represents the label for the queue. This label is important for several reasons

  • You can see it when running Debug tools on your app such as Instruments
  • If your app crashes in a private queue the label will show up on the crash report
  • As there are going to be lots of queues on 10.6 it's a good idea to differentiate them

By default, the private queues you create all target the default global queue. You can also point these queues at other queues to make a queue hierarchy using dispatch_set_target_queue(). The only thing Apple discourages is making a loop, where one queue targets another and another until the chain eventually winds back to the first one, because that behavior is undefined. So you can create a queue and target the high priority queue, or any other queue, like so

dispatch_queue_t queue = dispatch_queue_create("com.App.AppTask",NULL);
dispatch_queue_t high = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH,0);

dispatch_set_target_queue(queue,high);

If you wanted to you could do the exact same with your own queues to create the queue hierarchies that I described earlier on.

Suspending Queues

Additionally you may need to suspend queues, which you can do with dispatch_suspend(queue). This works like suspending an NSOperationQueue in that it won't suspend execution of the block that is currently running, but it will stop the queue from dequeueing any more blocks. You should be careful how you do this, though. In the next example it's not clear at all what has actually run

dispatch_async(queue,^{ NSLog(@"First Block"); });
dispatch_async(queue,^{ NSLog(@"Second Block"); });
dispatch_async(queue,^{ NSLog(@"Third Block"); });

dispatch_suspend(queue);

In the above example it's not clear at all what has run, because by the time dispatch_suspend() is called any combination of the blocks may already have run.

Memory Management

It may seem a bit odd, but even in fully garbage collected code you still have to call dispatch_retain() and dispatch_release() on your Grand Central Dispatch objects, because as of right now they don't participate in garbage collection.

Recursive Decomposition

Calling dispatch_async() is fine for running code on a background thread, but we need to get the results of that work back onto the main thread, so how would one go about this? Well, we can use the main queue and just dispatch_async() back to the main thread from within the first dispatch_async() call and update the UI there. Apple has referred to this as recursive decomposition, and it works like this

dispatch_queue_t queue = dispatch_queue_create("",NULL);
dispatch_queue_t main = dispatch_get_main_queue();

dispatch_async(queue,^{
 CGFloat num = [self doSomeMassiveComputation];

 dispatch_async(main,^{
  [self updateUIWithNumber:num];
 });
});

In this bit of code the computation is offloaded onto a background thread with dispatch_async(), and then all we need to do is dispatch_async() back onto the main queue, which will schedule our block to run with the data we computed on the background thread. This is generally the preferred approach, because Grand Central Dispatch works best with this asynchronous design pattern. If you really need to use dispatch_sync() and absolutely make sure a block has run before going on for some reason, you could accomplish the same thing with this bit of code

dispatch_queue_t queue = dispatch_queue_create("",NULL);

__block CGFloat num = 0;

dispatch_sync(queue,^{
 num = [self doSomeMassiveComputation];
});

[self updateUIWithNumber:num];

dispatch_sync() works just like dispatch_async() in that it takes a queue as an argument and a block to submit to the queue, but dispatch_sync() does not return until the block you've submitted to the queue has finished executing. So in other words the [self updateUIWithNumber:num]; line is guaranteed not to execute before the code in the block has finished running. dispatch_sync() will work just fine, but remember that Grand Central Dispatch works best with asynchronous design patterns like the first bit of code, where we simply dispatch_async() back to the main queue to update the user interface as appropriate.

dispatch_apply()

dispatch_async() and dispatch_sync() are fine for dispatching bits of code one at a time, but if you need to dispatch many blocks at once this is inefficient. You could use a for loop to dispatch many blocks, but luckily GCD has a built in function for doing this that automatically waits until all the blocks have executed. dispatch_apply() is really aimed at going through an array of items and then continuing execution after all the blocks have executed, like so

dispatch_queue_t queue = dispatch_get_global_queue(0,0);

dispatch_apply(count, queue, ^(size_t idx){
 //do something with data
});

This is GCD's way of going through arrays; you'll see later on that Apple has added Cocoa APIs for accomplishing this with NSArray, NSSet, etc. dispatch_apply() will take your block and iterate over the array as concurrently as it can. I've run it sometimes where it takes the indexes 0,2,4,6,8 on core 1 and 1,3,5,7,9 on core 2, and sometimes it's done odd patterns where it does most of the items on core 1 and some on core 2. The point is that you don't know how concurrent it will be, but you do know GCD will iterate over your array, or dispatch all the blocks up to the count you give it, as concurrently as it can, and once it's done you just go on and work with your updated data.

Dispatch Groups

Dispatch groups were created to group several blocks together and then dispatch another block once all the blocks in the group have completed their execution. Groups are set up very easily and the syntax isn't very dissimilar from dispatch_async(). dispatch_group_notify() is the API that sets the final block to be executed once all the other blocks have finished.

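dispatch_queue_t queue = dispatch_get_global_queue(0,0);
dispatch_group_t group = dispatch_group_create();

dispatch_group_async(group,queue,^{
 NSLog(@"Block 1");
});

dispatch_group_async(group,queue,^{
 NSLog(@"Block 2");
});

dispatch_group_notify(group,queue,^{
 NSLog(@"Final block is executed last after 1 and 2");
});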

Other GCD API you may be Interested in

//Make sure GCD Dispatches a block only 1 time
dispatch_once()

//Dispatch a Block after a period of time
dispatch_after()

//Print Debugging Information
dispatch_debug()

//Create a new dispatch source to monitor low-level System objects
//and automatically submit a handler block to a dispatch queue in response to events.
dispatch_source_create()

Cocoa & Grand Central Dispatch/Blocks

The GCD APIs, for being low level APIs, are very easy to write, and quite frankly I love them and have no problem using them, but they are not appropriate for all situations. Apple has implemented many new APIs in Mac OS X 10.6 Snow Leopard that take advantage of Blocks and Grand Central Dispatch so that you can work with existing classes more easily, faster and, when possible, concurrently.

NSOperation and NSBlockOperation

NSOperation has been entirely rewritten on top of GCD to take advantage of it and provide some new functionality. In Leopard, NSOperation(Queue) created and killed a thread for every NSOperation object; in Mac OS X 10.6 it uses GCD and reuses threads to give you a nice performance boost. Additionally, Apple has added a new NSOperation subclass called NSBlockOperation, which you create with a block and to which you can add more blocks. Apple has also added a completion block method to NSOperation, where you can specify a block to be executed when an NSOperation object completes (goodbye KVO for many NSOperation objects).

NSBlockOperation can be a nice easy way to use everything that NSOperation offers and still use blocks with NSOperation.

NSOperationQueue *queue = [[NSOperationQueue alloc] init];
NSBlockOperation *operation = [NSBlockOperation blockOperationWithBlock:^{
 NSLog(@"Doing something...");
}];

//you can add more blocks
[operation addExecutionBlock:^{
 NSLog(@"Another block");
}];

[operation setCompletionBlock:^{
 NSLog(@"Doing something once the operation has finished...");
}];

[queue addOperation:operation];

In this way NSBlockOperation starts to look like a high level version of dispatch groups, in that you can add multiple blocks and set a completion block to be executed.

Concurrent Enumeration Methods

One of the biggest implications of Blocks and Grand Central Dispatch is the support added for them throughout the Cocoa APIs to make working with Cocoa/Objective-C easier and faster. Here are a couple of examples: enumerating over an NSDictionary using just a block, and enumerating over one concurrently.

//non concurrent dictionary enumeration with a block
[dict enumerateKeysAndObjectsUsingBlock:^(id key, id obj, BOOL *stop) {
  NSLog(@"Enumerating Key %@ and Value %@",key,obj);
}];

//concurrent dictionary enumeration with a block
[dict enumerateKeysAndObjectsWithOptions:NSEnumerationConcurrent
    usingBlock:^(id key, id obj, BOOL *stop) {
     NSLog(@"Enumerating Key %@ and Value %@",key,obj);
}];

The documentation is a little dry on what happens here, saying just "Applies a given block object to the entries of the receiver." What it doesn't mention is that because it is handed a block it can do this concurrently, and GCD will take care of all the details of how that concurrency is accomplished. You can also use the BOOL *stop pointer when searching for an object inside an NSDictionary: just set *stop = YES; inside the block to stop any further enumeration once you've found the key you are looking for.

High Level Cocoa API's vs Low Level GCD API's

Chris Hanson earlier wrote about why you should use NSOperation vs GCD API's. He does make some good points, however I will say that I haven't actually used NSOperation yet on Mac OS X 10.6 (although I definitely will be using it later on in the development of my app) because the Grand Central Dispatch API's are very easy to use and read and I really enjoy using them. Although he wants you to use NSOperation, I would say use what you like and is appropriate to the situation. I would say one reason I really haven't used NSOperation is because when GCD was introduced at WWDC, I heard over and over about the GCD API, and I saw how great it was and I can't really remember NSOperation or NSBlockOperation being talked about much.

To Chris's credit, he does make good points about NSOperation handling dependencies better, and you can use KVO with NSOperation objects if you need to. Just about everything you can do with the basic GCD APIs you can accomplish with NSOperation(Queue) using the same amount of code or at most a couple of lines more. There are also several Cocoa APIs that are specifically meant to be used with NSOperationQueue, so in those cases you really have no choice but to use NSOperationQueue anyway.

Overall I'd say think what you'll need to do and why you would need GCD or NSOperation(Queue) and pick appropriately. If you need to you can always write NSBlockOperation objects and then at some point later on convert those blocks to using the GCD API with a minimal amount of effort.

Further Reading on Grand Central Dispatch/Blocks

Because I only have so much time to write here, and have to split my time between work and multiple projects, I am linking to people I like who have written some great information about Grand Central Dispatch and/or Blocks, although this article will most definitely not be my last on either subject.

A couple projects making nice use of Blocks

Andy Matuschak's KVO with Blocks

Interesting Cocoa API's Making Use of Blocks

A list of some of the Cocoa APIs that make use of blocks (thanks to a certain someone for doing this, really appreciate it). I should note that Apple has tried not to use the word block everywhere in its API for a very good reason. If you came to a new API and saw something like -[NSArray block], you would probably think it had something to do with blocking on the NSArray or blocking execution. Although many APIs do have block in their name, it is by no means the only keyword you should use when looking for APIs dealing with blocks. For these links to work you must have the documentation installed on your HD.

NSUserInterfaceItemSearching Protocol

searchForItemsWithSearchString:resultLimit: matchedItemHandler:

enumeratorAtURL:includingPropertiesForKeys: options:errorHandler:

Anonymous said...

That's a load of exciting stuff to play with! Looking forward for a project where I'd try these things. :)

Anonymous said...

A Guide to Blocks & Grand Central Dispatch (and the Cocoa API's making use of them):

1. "So lets create a basic block" should be:
"So let's create a basic block".

2. "Lets say that in the previous example" should be:
"Let's say that in the previous example".

3. "Grand Central Dispatch will try and do it's best with some of it's" should be:
"Grand Central Dispatch will try and do its best with some of its".

4. "Apple has tried not to use the word block everywhere in it's API" should be:
"Apple has tried not to use the word block everywhere in its API".

5. "Additionally you may need to suspend queue's" should be:
"Additionally you may need to suspend queues".

麥克 said...

something strange in the positions of variable queue and count

dispatch_apply(queue, count, ^(size_t idx){

should be

dispatch_apply(count, queue, ^(size_t idx){

right ?

Anonymous said...

What an amazing post! I find it really interesting to read and the information is well organized. Never take this page offline.


dseal said...

Well written. Great post. Thanks for taking the time to explain GCD.

Studio Sutara said...

Thanks, this post was by far the most helpful guide for to understand blocks and GCD better!

sunnyboy_smiling said...

Good article.
One question, see, what's the best solution on your mind? Thank you.

Anonymous said...

Awesome tutorial. Thanks a ton !!

Unknown said...

This is good stuff to go with basics of GCD concepts for beginners..... :)