Cocoa Samurai: 2008

Thursday, December 11, 2008

Debugging Cocoa with DTrace Talk Tonight at Cocoaheads

Tonight at the Des Moines Cocoaheads I will be giving a talk on how DTrace can be the perfect complement to your debugging session and help you find bugs quicker, whether you use DTrace through the Terminal or Instruments. Topics that will be covered are:

* Brief review of DTrace & how it works
* What DTrace can do for you
* DTrace and the Objective-C Provider
* Tasks DTrace can make much easier for you that would otherwise be mundane and consume time
* Getting the most out of Custom DTrace Probes in Instruments
* Useful DTrace snippets & functions to know
* and more...

If you are in the central Iowa area come and see this and more at DM Cocoaheads tonight ( http://www.cocoaheads.org:80/us/DesMoinesIowa/index.html ). I look forward to seeing you there! After the talk, all presentation materials will be posted here along with an in-depth article with more information than I could present in a reasonable amount of time tonight.

Also, this is kind of an Intermediate/Advanced level talk: it assumes some basic knowledge of how DTrace works (though like I said there will be a very brief review of DTrace and how it works). If you don't know DTrace, check out my earlier article and the accompanying links from DTrace for Cocoa Developers ( http://is.gd/mIq ).

Friday, October 17, 2008

Getting Some XML Love with libXML2

A while back Marcus Zarra did a good article on working with libXML2 and xmlTextReader to parse XML data without loading the whole thing into memory. However the tutorial failed my needs for only one reason... he used a file on disk. In his call to xmlReaderForMemory() he passes a path to the resource on disk (yes I know it makes for an easy self-contained project that's easy to demonstrate, but bear with me here), and I made the assumption that this made the method unusable for objects residing only in memory. In fact just about every method in libxml2 seems to have an argument for a path to a resource on disk, and I initially thought I couldn't really do strictly in-memory XML parsing. How wrong I was, as you'll see later on.

What I am going to show you today is how to create a valid xmlTextReader object from libXML2 with only the assumption that you are downloading the XML data from a server somewhere. Marcus showed a great intro to XML parsing with libXML2 using a file on disk, but in my code I am getting data back in memory from calls to web services on the internet, so I went to find out what it'd take to do XML parsing with libXML2 without using a reference to a file on disk. Basically I wanted to assume that I just have a valid NSData object that has XML in it. When I got done I found out there are 2 ways to do this: 1 insanely easy way which isn't obvious from looking at the libXML2 method names alone, and 1 harder way that accomplishes the same thing.

The Hard Way

    1 NSData *xmlData; /* for this example assume xmlData has some valid xml data */
    2 
    3 xmlParserInputBufferPtr inpt = xmlAllocParserInputBuffer(XML_CHAR_ENCODING_UTF8);
    4 
    5 if(!inpt) {
    6     NSLog(@"Failed to create XML Input Buffer");
    7     return;
    8 }
    9 
   10 NSString *XML_STRING = [[NSString alloc] initWithData:xmlData encoding:NSUTF8StringEncoding];
   11 
   12 xmlBufferPtr xmlBuffr = xmlBufferCreateStatic((void *)[XML_STRING UTF8String] ,strlen([XML_STRING UTF8String]));
   13 
   14 inpt->buffer = xmlBuffr;
   15 
   16 xmlTextReaderPtr reader = xmlNewTextReader(inpt, NULL);
   17 
   18 if (!reader) {
   19     NSLog(@"Failed to create xmlTextReader");
   20     return;
   21 }

The point of this example is to create an xmlParserInputBufferPtr object that will be passed into xmlNewTextReader() so we don't read anything off of the disk. On line 3 we create this xmlParserInputBufferPtr by using the alloc function and passing in that we are going to use UTF8 encoding. And of course on line 5 we check for the existence of the xmlParserInputBufferPtr, and if it doesn't exist there is really no point in going any further. Then I create an NSString object from the NSData object (line 10), which allows us to get the UTF8String from the NSData object and (again) signify that we are using UTF8 encoding. Then the big thing we need to create (line 12) is the xmlBufferPtr object. This creates a static buffer with the entire contents of the XML from the NSData object, passing in a pointer to the string contents and the length of the string. Then (line 14) in the xmlParserInputBufferPtr we point the data buffer pointer to the xmlBufferPtr object we created on line 12. After that it's just a matter of creating an xmlTextReaderPtr (line 16) with xmlNewTextReader(), pointing the input buffer to the xmlParserInputBufferPtr which now has all the XML in it, and passing NULL for the path of the XML. Now you have an xmlTextReader which you can use to parse XML.

The Easy Way

Now I started going down this path because of Peter Hosey's suggestion of using xmlParserInputBufferPtr, which logically seemed like the best solution to my problem of wanting to use strictly in-memory objects and do no reading off of disk. If I could just pass NULL for the path in xmlNewTextReader(), could I do the same with xmlReaderForMemory() and have it still work? As it turns out... YES... yes it does work.

    NSData *xmlData; /* for our purposes here assume xmlData holds valid XML */

    /* Passing NULL for the URL (3rd) and encoding (4th) arguments keeps
       everything in memory; nothing is read from disk. */
    xmlTextReaderPtr reader = xmlReaderForMemory([xmlData bytes],
                                                 (int)[xmlData length],
                                                 NULL, NULL,
        (XML_PARSE_NOBLANKS | XML_PARSE_NOCDATA | XML_PARSE_NOERROR | XML_PARSE_NOWARNING));

    if (!reader) {
        NSLog(@"Failed to create xmlTextReader");
        return;
    }

In fact all you need to do is set the 3rd and 4th parameters (the URL and the encoding) to NULL in xmlReaderForMemory() and it works fine with in-memory objects, for example an NSData object you got back from an API like NSURLConnection's sendSynchronousRequest:returningResponse:error:. The only thing I wish is that there was a lot better documentation on libxml2. Maybe there is a great resource I just don't know about, but from my googling it was hard to find anything and I had to do a lot of trial and error.

Update: I'm aware of the documentation at xmlsoft.org; most of it, however, pretty much just shows you a list of method names and describes the arguments you pass in, with a couple of decent docs/tutorials. This isn't great documentation for me as it doesn't describe which arguments are necessary. If an argument is optional you should make that crystal clear, and the documentation doesn't make it clear that a lot of arguments are in fact optional in libxml2 methods. Specifically, one page I'm referring to is at http://xmlsoft.org/html/libxml-tree.html#xmlParserInputBufferPtr.

Wednesday, October 01, 2008

Thank Goodness the F'ing iPhone NDA is being lifted

Apple FINALLY did the right thing today and publicly recognized what pretty much all iPhone developers, and the public that has been paying attention to the news, have known for a long time now: that the iPhone NDA was doing much more harm than good. From their page ( http://developer.apple.com/iphone/program/ ):

"To Our Developers We have decided to drop the non-disclosure agreement (NDA) for released iPhone software. We put the NDA in place because the iPhone OS includes many Apple inventions and innovations that we would like to protect, so that others don’t steal our work. It has happened before. While we have filed for hundreds of patents on iPhone technology, the NDA added yet another level of protection. We put it in place as one more way to help protect the iPhone from being ripped off by others. However, the NDA has created too much of a burden on developers, authors and others interested in helping further the iPhone’s success, so we are dropping it for released software. Developers will receive a new agreement without an NDA covering released software within a week or so. Please note that unreleased software and features will remain under NDA until they are released. Thanks to everyone who provided us constructive feedback on this matter."

Personally I am of the opinion that it's far better for people to eventually recognize that they made mistakes and try to correct them than to live in blissful ignorance and just believe they did the right thing. So in this regard I am glad that Apple has finally publicly come around and recognized that the iPhone NDA, while it helped Apple "protect" some things on the iPhone, was really doing net damage to the platform and made developers afraid to even touch the iPhone SDK.

I was really looking forward to Bill Dudney's Core Animation book. I originally just wanted to use it on the Mac, but because it contained 1 chapter on the iPhone Core Animation differences the Pragmatic Programmers couldn't publish it. That's 3-4 months that I couldn't simply hold a book in my hands and learn something that contributes back to Apple's platform, because 1 small part contains something on the iPhone. (Yes I have the PDF, but I find it hard to sit down at a computer and read a whole book that way; I really only read book PDFs on my Mac as a reference, looking up one small piece of information I want.) Now I hope they rescind what they said in their recent email about publishing it without the chapter, and finally publish the book in its entirety. And it's not just the Core Animation book, which was probably the most publicly known book being held up by the NDA: Amazon shows many other books on iPhone development that are not yet available due to the NDA.

In the end I really hope Apple has learned a lot from this. Now I hope that iPhone app quality will go up as a result of developers and sites finally being able to share code and information with each other. I know I have a couple of iPhone SDK articles in development, but I haven't given them the time they deserve because I just didn't know if Apple was going to lift the NDA anytime soon.

Apple, we love you, we love the Mac and the iPhone, and we are even willing to put up with some things that developers of other platforms would scoff at because we like this platform that much. But when you get a ton of negative press and tons of developers are very publicly and loudly criticizing you, all with the same opinion, it's not to give you grief; it's because we care about this that much and you're royally screwing up on something. If we really wanted to hurt you we'd be silent and say nothing.
Also where is my iPhone Dev Key Apple?

Monday, September 22, 2008

Announcing the First NSCoder Night in Ames

I am starting NSCoder night in Ames, IA. The first meeting will be at The Stomping Grounds at 7pm on Tuesday the 23rd. If people want the location can change, just email me or tell me in person at the first NSCoder Night in Ames. Come and bring your Cocoa Projects and I'll see you then. Here is a link to the NSCoder Night site explaining what NSCoder Night is about: http://nscodernight.com/

Tuesday, August 19, 2008

Xcode Shortcuts: Original Documents now Creative Commons Licensed

I am hardly one to hold something back from the Mac Developer Community when I think I have something that will benefit everybody. As such I am finally doing something I've wanted to do for a while. My Xcode Shortcuts guide has a ton of downloads and is one of the most popular articles of all time on my site, and I think many people would like to use the content in various different formats to suit their needs so I don't want to hold you all back. As such I am releasing the Xcode Shortcuts guide under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 License. In essence you can share and modify the work as you please as long as you attribute me as the author somewhere in the work, don't use it for Commercial Purposes and share the modified works under a similar License. The Xcode Shortcuts guide was created with Pages '08 and thus you need it to modify the document, though I have included a copy of the document exported to Rich Text format for some compatibility though the document looks like one big mess when I open it in Text Edit. The only thing I've done from the Guide I released is to remove the Xcode Icon just for legal reasons. If you want it's easy to paste it back in there. Although it's not required, if you modify and use the Xcode Shortcuts Guide leave a comment here or send me a quick email letting me know. Thanks! Have Fun! Right Click on the link and select "Save Linked File to 'Downloads'" Xcode Shortcuts Guide CC Licensed

Sunday, August 03, 2008

I Agree with Gus: VMware Mac OS X Virtualization rocks!

Recently Gus Mueller posted about how VMware with a virtualized instance of Mac OS X is a dream come true for him. It's also a dream come true for me. The ability to run Mac OS X in a virtual machine has been on my wish-list for a while now. Backing up all my data and reinstalling all my apps so that I can install the current version of Mac OS X and then create another partition for the developer seed of Mac OS X is a pain in the butt, and overall wiping my HD and reinstalling Mac OS X is something I only like to do when a new major revision of Mac OS X comes out (10.5, 10.6, etc.) to avoid any problems and clear out the clutter. Also, if I have to reboot into a different version of OS X, I feel like it requires a lot of planning to decide when I should boot into the other version, how long I should spend in it and what exactly I am working on while I'm there (not to mention you don't have your regular apps, etc.). With virtualization, however, I can multitask and work on 2 things at once or switch between the 2 versions of Mac OS X more flexibly.

I'd like to use this as a means to install, say, Snow Leopard and develop on that. It can fit into my workflow very easily, since it could live in its own Space: I can keep developing on Leopard and then switch to a Space running Snow Leopard. I can only imagine the possibilities, if I became a Mac indie, of how VMware with virtualized instances of Mac OS X could fit into my workflow with features like snapshots. This also takes VMware from being that app you have to run in order to run Windows XP/Vista (so you can do your homework or your work because there's no Mac version of some app, or because your client/teacher has to run your work on XP/Vista) to being a true Mac app. It's no longer a "this thing runs Windows" app, it's a "this thing runs another instance of Mac OS X on Mac OS X!" app, which sounds 1000x more exciting.

I do also agree with Gus that Apple should allow the Mac OS X client versions to be virtualized as well and not just Mac OS X Server, but that is up to Apple legal. When I was flying back from WWDC, I got on my plane and a woman came to me frantically saying that she had a ticket for another, better seat and wanted to trade seats with me so she could be with her children. I agreed and got bumped up to Economy Plus with plenty of leg room and, more importantly, sat right next to a guy working for VMware. To my surprise he was running Mac OS X Leopard Server in VMware. I was tired, and at the time I think I remembered hearing something about how Apple was allowing this, so I thought "huh, that's nice." He asked me if I had my Snow Leopard DVD on me so he could try that out, but I believe I had put it in my checked luggage so I didn't get to see that, which is a shame. I think I'll try that soon and see if it works. After I got back and had time to rest I soon realized just how significant seeing Mac OS X virtualized was.

All in all the future for Mac developers working on 2 versions of Mac OS X is just looking better and better thanks to developments like this. By the way, a big thank you goes out to all the twitterers who suggested that I go with VMware; I've been very much enjoying it ever since I switched from Parallels.

Update: Here's a blog entry from VMware about Leopard Server virtualization and here is a YouTube video showing it in action.

Thursday, June 19, 2008

One GIT Build Script to Rule them all

There seem to be at least 2 camps of Git users on OS X: those who installed Git with the Mac OS X package installer and those who installed it from MacPorts. On my new project Gitty (a Git repo inspector/manager just beginning development), I didn't want to discriminate, but at the same time I didn't know Perl well enough to modify Marcus Zarra's build script beyond changing /opt/local/bin/git to /usr/local/git/bin/git. However, one of my followers on Twitter was kind enough to modify the Perl build script, which sticks part of the Git hash in the About box, so that it searches both locations and uses whichever one you have installed, making it Git-location agnostic. It looks for the standard install location first, and if it can't find that, it searches for the MacPorts install location and uses that. This works great for me and so I thought I'd share... enjoy. Thanks!

# Xcode auto-versioning script for Subversion by Axel Andersson
# Updated for git by Marcus S. Zarra and Matt Long
# Updated to use git in the Standard Install Location or the MacPorts
#   install location by Patrick Burleson

use strict;
use warnings;

# Get the current git commit hash and use it to set the CFBundleVersion value
my $REV = "";

if( -e "/usr/local/git/bin/git" )
{
    # Standard (package installer) location
    $REV = `/usr/local/git/bin/git show --abbrev-commit | grep "^commit"`;
}
elsif ( -e "/opt/local/bin/git" )
{
    # MacPorts location
    $REV = `/opt/local/bin/git show --abbrev-commit | grep "^commit"`;
}
else
{
    die "Git not found";
}

my $INFO = "$ENV{BUILT_PRODUCTS_DIR}/$ENV{WRAPPER_NAME}/Contents/Info.plist";

# The line looks like "commit 1a2b3c4..." or "commit 1a2b3c4" depending on
# the git version, so the trailing dots are treated as optional.
my $version = $REV;
if( $version =~ /^commit\s+([0-9a-f]+)(?:\.\.\.)?$/ )
{
    $version = $1;
}
else
{
    $version = undef;
}
die "$0: No Git revision found" unless $version;

# Read the built Info.plist into memory
open(my $in, "<", $INFO) or die "$0: $INFO: $!";
my $info = join("", <$in>);
close($in);

# Swap the CFBundleVersion string for the abbreviated commit hash
$info =~ s/([\t ]+<key>CFBundleVersion<\/key>\n[\t ]+<string>).*?(<\/string>)/$1$version$2/;

open(my $out, ">", $INFO) or die "$0: $INFO: $!";
print $out $info;
close($out);

Saturday, June 14, 2008

To all those who said hi during WWDC...

Thank you! I met up with a bunch of people who actively follow me on this blog and on Twitter and it was great to talk to every one of you from all over the world. You guys are what made WWDC great for me this year. I myself finally got to meet up with some people who I've been talking to for a while like Scotty, Scott Stevenson, Deric Horn and all the great people in Apple Developer Technical Support. I am amazed at how many people recognized me from such small pictures on my blog/Twitter. I'm normally a shy person (even though I try not to be), so it was a great experience to meet people left and right. Thanks to talking to some people I got some good ideas and requests for things to cover next here on Cocoa Samurai. As always, if you have any requests feel free to write in any time. I'll start work on them right after I get caught up on my homework post-WWDC :\

Tuesday, May 27, 2008

DTrace for Cocoa Developers

Update: Sorry people, I didn't know the default Viddler downloading permissions. If you logged in and tried to download the video and couldn't before, log in again and you should be able to download it now.

So in this screencast I'm going to show you how to use DTrace and how to ultimately turn that knowledge into a custom DTrace instrument for Instruments. Honestly, I would log in and download the full quality version of this from Viddler. I apologize for the hisses in the s's that I make; I was slightly tired and trying to control that. As always, if I said something wrong or made a mistake please let me know right away and I'll make a correction here.

DTrace Examples in the movie

Example #1

    colinw$ sudo dtrace -n 'syscall:::/execname == "Safari"/{@num[probefunc] = count();}' -q
    ^C

      lstat                 2
      sigaltstack           2
      sigprocmask           2
      stat                  2
      getuid               16
      mmap                 32
      munmap               36
      geteuid              94
      gettimeofday        128
      stat64              248

Example #2

    colinw$ sudo dtrace -n 'syscall::*open*:entry{printf("%s %s",execname,copyinstr(arg0));}'
    dtrace: description 'syscall::*open*:entry' matched 7 probes
    CPU     ID                    FUNCTION:NAME
      0  17604                       open:entry mds .
      1  17604                       open:entry nmbd /private/var/samba/browse.dat.
      0  17604                       open:entry Safari /.vol/234881026/1644952
      1  17604                       open:entry mdworker /private/var/samba/browse.dat
      1  17604                       open:entry mds .
      1  17604                       open:entry Safari /.vol/234881026/230745/QuickTime Preferences
    ^C

Example #3

    colinw$ sudo dtrace -n 'syscall:::/execname == "Safari"/{@num[ustack()] = count();}' -q
    Password:
    ^C

      libSystem.B.dylib`sendto$UNIX2003+0xa
      CoreFoundation`__CFSocketEnableCallBacks+0x255
      CoreFoundation`CFSocketEnableCallBacks+0x4f
      CFNetwork`_SocketStreamRead+0x5e5
      CoreFoundation`CFReadStreamRead+0x1dd
      CFNetwork`httpRdFilterRead+0x5ef
      CoreFoundation`CFReadStreamRead+0x1dd
      CFNetwork`httpStreamRead+0x2aa
      CoreFoundation`CFReadStreamRead+0x1dd
      CFNetwork`httpReadStreamCB+0x94
      CoreFoundation`_CFStreamSignalEventSynch+0x89
      CoreFoundation`CFRunLoopRunSpecific+0xca8
      CoreFoundation`CFRunLoopRunInMode+0x58
      Foundation`+[NSURLConnection(NSURLConnectionReallyInternal) _resourceLoadLoop:]+0x140
      Foundation`-[NSThread main]+0x2d
      Foundation`__NSThread__main__+0x134
      libSystem.B.dylib`_pthread_start+0x141
      libSystem.B.dylib`thread_start+0x22
        2

Example #4

    sh-3.2# dtrace -n 'objc16675:NSMutableArray::entry{@num[ustack()] = count();}'
    dtrace: description 'objc16675:NSMutableArray::entry' matched 24 probes
    ^C

      CoreFoundation`+[NSMutableArray arrayWithCapacity:]
      Keynote`0x201ab
      Keynote`0x35289
      Keynote`0x9d981
      AppKit`-[NSWindow makeFirstResponder:]+0x12e
      Keynote`0xa1b29
      AppKit`-[NSWindow sendEvent:]+0x1505
      Keynote`0x18c7e7
      AppKit`-[NSApplication sendEvent:]+0xadc
      SFApplication`-[SFAppApplication sendEvent:]+0x283
      Keynote`0x26d69
      AppKit`-[NSApplication run]+0x34f
      Keynote`0x352f
      Keynote`0x34a5
      Keynote`0x4cc36
      Keynote`0x4cb5d
      0x2
        1

Useful DTrace Links

Solaris DTrace Guide (PDF) from Sun
Solaris DTrace Guide (HTML) from Sun
DTrace Review (Google Tech Talk)
MacTech: Exploring Leopard with DTrace

Friday, May 09, 2008

I am on the Mac Developer Roundtable 007 Source Code Management

I was an invited guest on the Mac Developer Roundtable and came on to talk Source Code Management and to advocate for Git. It was my first time ever on a podcast and I was a bit nervous. I'm still listening to the first part of it right now, but it sounds pretty good so far. Please forgive all the "uhs" I do; I'm normally a lot more confident speaking publicly, but I didn't know what to expect on the podcast. Give it a listen and let me know what you think. Be gentle, it's my first time :P ...

Mac Developer Roundtable Episode 007 - Source Code Management

My Recommendations from the podcast

F-Script

If I remember correctly, I think I just glossed over F-Script and didn't give it much explanation. I'm a huge fan of creating experimental projects and playing around with APIs, and that's one reason I'm a fan of F-Script: I no longer have 30 small projects cluttering up my desktop or a temp directory, so now I create far fewer tiny projects in Xcode and use F-Script to see how the APIs work. You can start F-Script up in a console and create an app from scratch, or just create temp objects, see how they work and play around with Cocoa in a way that's not possible in Xcode without creating loads of small projects. You can download F-Script from http://www.fscript.org/ though if you are on Leopard you will need to download and install a special version of FScript Anywhere from http://osiris.laya.com/blog/?p=24

Mac OS X Internals

Mac OS X Internals is a great book explaining the guts of Mac OS X and how its individual components work, and it comes with many source code examples to demo what's going on. It's a very big book, but it's well worth it for the content it offers you. The book's website is at http://www.osxbook.com/

Tuesday, April 15, 2008

OSSpinLock :: Lock Showdown ( POSIX Locks vs OSSpinLock vs NSLock vs @synchronized )

Reader Simon chimed in on my last article about threading in Leopard:

Worth mentioning as well is the OSSpinLock{Lock, Unlock} combination; these busy-wait, but are by far the cheapest to acquire/release. In the case of many resources and lightweight access (very few threads, and/or little work being done per access), they may pay off significantly. Consider the set/get methods for 10000 objects. Creating an OSSpinLock for each object is cheap (it's just an int), and if the likelihood of access to an given object is small (ie, there's not 1 object constantly being accessed and 9999 very rarely), the OSSpinLock approach (implemented by memory barriers, IIRC) can really pay off, because the cross-section of thread conflict is minute.

Thanks for adding to the discussion Simon, I did forget to include OSSpinLock. And he is right: buried inside the libkern/OSAtomic.h header is the OSSpinLock API. OSSpinLock can be a particularly fast lock, however the Apple man page on it makes it clear that it is best used when you expect little lock contention. I started to develop a test to gauge this performance until I discovered that someone had already developed one for this purpose on CocoaDev. However it left out @synchronized(), so I took the test and extended it to add @synchronized and see how fast OSSpinLock was and how much we really pay for using @synchronized(). The test sets and gets 10,000,000 objects while using locks in the form of POSIX mutex locks, POSIX read/write locks, OSSpinLock, NSLock (which uses POSIX locking) and the @synchronized() lock. I ran the test several times over and grabbed a snapshot of the 3rd run, which was about average compared with the runs before and after it. The graph shows the time per request, which was derived by dividing the total time it took to complete all 10,000,000 set or get requests by the number of requests. This was run on my 2.33 GHz Core 2 Duo MacBook Pro with 2 GB RAM.

Results are in usec's. As you can see OSSpinLock is the fastest in this instance, but not by much when compared to the POSIX Mutex Lock. We can also see the pain that Google described with regard to performance when using the @synchronized directive. When you run the test it's painfully obvious, even from a user perspective, that @synchronized does cost you in terms of performance relative to the other locking methods. Where the other locks take about 2 or (at most) 3 seconds to complete the test, @synchronized takes 4 or 5 seconds to accomplish the same task. Here are the total runtimes for the tests:

POSIX Mutex Lock (Set): 2.382080 Seconds
POSIX RW Lock (Set): 2.881769 Seconds
OSSpinLock (Set): 2.278029 Seconds
NSLock (Set): 2.948313 Seconds
@Synchronized (Set): 4.310732 Seconds
POSIX Mutex Lock (Get): 2.953534 Seconds
POSIX RW Lock (Get): 3.390998 Seconds
OSSpinLock (Get): 2.880452 Seconds
NSLock (Get): 3.493390 Seconds
@Synchronized (Get): 5.098508 Seconds

Ouch! Things aren't looking good for @synchronized at this point. However, as a compromise between speed and efficiency I particularly like the POSIX Mutex Lock the best, because (as stated on CocoaDev) it already tries a spin lock first, which doesn't call into the kernel (that is why OSSpinLock is so fast here), and only goes to the kernel when the lock is already taken by another thread. That means in the best case it is basically acting exactly like OSSpinLock, without the disadvantages of OSSpinLock. OSSpinLock has to poll every so often to check whether its resource is free, which (as stated earlier) doesn't make it an attractive option when you expect some lock contention. Still, Simon is right that, depending on how your application works, OSSpinLock can be an attractive option and is something worth looking into. This isn't a "this one is the best!" type of thing; it's (as I've said before) something you have to test to see if it's right for you.
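To make the "time per request" measurement concrete, here is a minimal sketch in plain C of the same kind of uncontended timing loop. It is not the CocoaDev test: the spin lock is a hand-rolled one built on a C11 atomic_flag standing in for OSSpinLock, all function names are hypothetical, and clock() measures CPU time, which is good enough for a single-threaded loop like this one.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* A hand-rolled spin lock built on a C11 atomic flag. This is a stand-in
 * for OSSpinLock (an assumption made for portability), not Apple's code. */
static atomic_flag spin_flag = ATOMIC_FLAG_INIT;
static pthread_mutex_t bench_mutex = PTHREAD_MUTEX_INITIALIZER;

static void spin_acquire(void)
{
    /* Busy-wait until the flag was clear when we set it. */
    while (atomic_flag_test_and_set_explicit(&spin_flag, memory_order_acquire)) {
    }
}

static void spin_release(void)
{
    atomic_flag_clear_explicit(&spin_flag, memory_order_release);
}

static void mutex_acquire(void) { pthread_mutex_lock(&bench_mutex); }
static void mutex_release(void) { pthread_mutex_unlock(&bench_mutex); }

/* Runs `requests` uncontended lock/increment/unlock cycles, stores the
 * final counter in *total and returns microseconds per request
 * (total time divided by the number of requests). */
static double usec_per_request(void (*acquire)(void), void (*release)(void),
                               long requests, long *total)
{
    volatile long counter = 0;
    clock_t start = clock();
    for (long i = 0; i < requests; i++) {
        acquire();
        counter++;
        release();
    }
    clock_t end = clock();
    *total = counter;
    return ((double)(end - start) / CLOCKS_PER_SEC) * 1e6 / (double)requests;
}

double spin_usec_per_request(long requests, long *total)
{
    return usec_per_request(spin_acquire, spin_release, requests, total);
}

double mutex_usec_per_request(long requests, long *total)
{
    return usec_per_request(mutex_acquire, mutex_release, requests, total);
}
```

Calling both functions with the same request count and printing the two results gives a rough per-request comparison on your own machine. Because only one thread ever takes the lock, this measures the best case for both locks and says nothing about behaviour under contention.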

Sunday, April 13, 2008

A Guide to Threading on Leopard

Author's Note: This is a companion article to the talk I gave @ CocoaHeads on Thursday April 10, 2008. You can download a copy of the talk from http://www.1729.us/cocoasamurai/Leopard%20Threads.pdf.

Intro to Threading

It's becoming abundantly clear that one big way you can increase application performance is multithreading, this is because increasing processor speeds are no longer a viable route for increasing application performance, although it does help. Multithreading is splitting up your application into multiple threads that execute concurrently with some threads having access to the same data structures that your main thread has access to.

A benefit of this technique is that it's possible for your application to have a whole core to itself, or for your app's main thread to reside on one core while a thread you spawn off runs on your 2nd core, or 3rd, etc.

Why is this becoming an Issue?

This is becoming more and more of an issue these days because all the Macs Apple sells are at least dual core machines, with Intel Core 2 Duos or 2x Quad Core Intel Xeons on the Mac Pros. In other words, Apple doesn't sell any single core Macs anymore (leaving out the iPhone in this instance). Now that we have all this computational wealth available, it's foolish not to take advantage of multithreading if it can benefit your application.

Why & When you Should & Shouldn't Thread

Now with all this talk I suppose you may be tempted to get on this bandwagon and put a bunch of threads in your app right now. But I should give you some info on why you should and shouldn't thread. Multithreading is not a tool that should be used just because you can. One reason it's coming into use more and more is that when your app starts out it is given 1 thread and a run loop that receives events. If you perform an incredibly long calculation on that thread, it freezes your user interface, leaving your users incredibly frustrated that they cannot do anything while the calculation is going on. Because of this, multithreading became a topic of interest beyond just operating system designers and became an issue for us as app designers. The solution is to spawn these long operations onto their own threads, which run in the background while your user interface stays responsive to the user's interactions. So what are some good candidates for threading operations?

File / IO / Networking Operations
API's that explicitly state they will lock the calling thread and create their own thread
Any compartmental/modular task that's at least 10ms

10 milliseconds is a bit ambiguous though. Does 10ms mean 10ms on a Core 2 Duo MacBook Pro with 4GB RAM, or 10ms on an Intel Xeon with 16GB RAM, or what? This 10ms suggestion is repeated many times over in Apple's documentation. The only suggestion I can give you is that if you gauge that a task takes at least 10ms on your machine, it's a good candidate for threading, assuming your machine is a Mac that your target audience might be using. In the end you need to do testing to see if your methods are taking at least 10ms. If you need to, you can use Instruments or Shark to gauge your application's performance. I made a quick screencast below showing how to do so with Instruments.

If you want to download this, all you need is a free Viddler account: go to http://www.viddler.com/explore/Machx/videos/2/, open the download tab, and you can see and download the original upload file. This is my very first screencast, so if you have any comments/suggestions feel free to let me know.

When shouldn't you thread then?

If many threads are trying to access/modify shared data structures
If the task doesn't take at least 10ms
If the end result is that your app is taking unnecessary memory due to threading
If threads slow down your app more than they speed it up

Having many threads access a shared data structure is something you should ideally never do, but in the real world that isn't always possible. When you spawn threads that are all vying for access to a shared data structure without locks, you get inconsistent data across threads, which can leave your app acting oddly or even crashing if it depends on certain states in the data structure. However, even if you are a good programmer and put locks in all the proper places, you can run into lock contention: because many threads are trying to access the same thing, you end up with a backlog of threads waiting to get access to the same data structure.

Apple repeatedly quotes the 10ms rule (see the discussion above); their suggestion is that if a task takes under 10ms it's not worth spawning a thread for it.

It's also important to remember that every time you spawn a thread, memory is allocated for kernel data structures and for stack space, where variables on the thread are allocated. If you are spawning a lot of threads, they consume memory on your system and could end up slowing your app down more than they benefit it.
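To make the shared data structure problem a bit more concrete, here is a minimal sketch in plain C with POSIX threads (which compiles fine inside an Objective-C project). The names shared_counter, counter_lock and run_counter_threads are made up for this illustration and don't come from any Apple API. Every worker bumps one shared counter, and the mutex is what keeps the final count consistent; without the lock calls the total can come up short.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_COUNTER_THREADS 16   /* arbitrary cap chosen for this sketch */

/* Hypothetical shared state: one counter guarded by one mutex. */
static long shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each worker adds 1 to the shared counter `iterations` times. */
static void *counter_worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        int rc = pthread_mutex_lock(&counter_lock);   /* one thread at a time past here */
        assert(rc == 0);
        (void)rc;
        shared_counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs `thread_count` workers and returns the final counter value,
   or -1 if the count is out of range or a thread could not be created. */
long run_counter_threads(int thread_count, long iterations)
{
    pthread_t threads[MAX_COUNTER_THREADS];
    int started = 0;

    if (thread_count < 1 || thread_count > MAX_COUNTER_THREADS)
        return -1;

    shared_counter = 0;
    for (; started < thread_count; started++) {
        if (pthread_create(&threads[started], NULL, counter_worker, &iterations) != 0)
            break;
    }

    /* Wait for every worker that was started before reading the result. */
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return (started == thread_count) ? shared_counter : -1;
}
```

Calling run_counter_threads(4, 10000) should give 40000 every time. Note that all 4 threads spend most of their time queuing for the same lock, which is exactly the lock contention described above.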

Thread Costs

Every time you create a thread it's important to note that there is an inherent cost to that creation in terms of CPU time and memory allocation. According to Apple's documentation, thread creation has the following costs as measured on an iMac Core 2 Duo with 1GB RAM:

Kernel Data Structures	1 KB
Stack Space	512 KB (secondary threads*) / 8 MB (main thread)
Creation Time	90 microseconds
Mutex Acquisition Time	0.2 microseconds
Atomic Compare & Swap	0.05 microseconds

* = 16 KB stack minimum, and the stack size has to be a multiple of 4 KB.

Thus if you're spawning many threads in your application you are consuming a great deal of resources, and not only that: if you spawn way too many threads they could end up fighting each other for CPU resources.
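If the default stack size is more than a secondary thread needs, the POSIX layer lets you ask for a smaller one. Below is a rough sketch in C, assuming the 16 KB minimum and 4 KB multiple quoted above (other systems may enforce different limits). The helpers round_stack_size and spawn_with_stack are names I made up for illustration; pthread_attr_setstacksize is the standard POSIX call doing the real work.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define STACK_GRANULE   (4 * 1024)    /* stack sizes must be a multiple of 4 KB */
#define MIN_STACK_BYTES (16 * 1024)   /* 16 KB minimum quoted above */

/* Rounds a requested size up to the 16 KB minimum and to a 4 KB multiple. */
size_t round_stack_size(size_t requested)
{
    size_t rounded = MIN_STACK_BYTES;
    if (requested > MIN_STACK_BYTES)
        rounded = ((requested + STACK_GRANULE - 1) / STACK_GRANULE) * STACK_GRANULE;
    assert(rounded % STACK_GRANULE == 0);
    return rounded;
}

/* Placeholder thread body; a real task would do its work here. */
static void *small_task(void *arg)
{
    (void)arg;
    return NULL;
}

/* Creates and joins one thread with the requested stack size.
   Returns 0 on success or the pthread error code on failure. */
int spawn_with_stack(size_t requested)
{
    pthread_attr_t attr;
    pthread_t thread;

    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(&attr, round_stack_size(requested));
    if (rc == 0)
        rc = pthread_create(&thread, &attr, small_task, NULL);
    pthread_attr_destroy(&attr);

    if (rc == 0)
        rc = pthread_join(thread, NULL);
    return rc;
}
```

For example spawn_with_stack(512 * 1024) asks for the same 512 KB a secondary thread gets by default, while a smaller value trades stack headroom for memory when you have many threads.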

Warning!

It's also important to note that although threads are a tool just like anything else in the Cocoa/Foundation frameworks, if you misuse them you can really screw things up badly. My advice, if anything, is that you should never use threads simply because you can, but rather because you've gauged the performance of your application, looked at several methods, and decided that they could benefit from the concurrency that threads offer. That means going into Shark and/or Instruments, developing some performance metric for your application, and watching how it does threaded versus non-threaded. My last piece of advice is to pay attention to what is and isn't thread safe in Apple's documentation. In the Multithreaded Programming Guide Apple lists some of the classes that are and aren't thread safe. A general rule for thread safety is that mutable objects are generally not thread safe while immutable objects generally are. If Apple's documentation on the class you are using doesn't explicitly say that the class is thread safe, you should assume that it is not.
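Shark and Instruments are the right tools for real measurements, but a very rough performance metric can also be taken in code. Here is a small sketch in C that times one run of a task and compares it against the 10ms guideline discussed earlier. Everything named here (measure_task_ms, is_threading_candidate, sample_slow_task) is hypothetical, and a single wall-clock measurement is only a hint, not a benchmark.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for nanosleep() under strict compiler modes */

#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

#define THREAD_THRESHOLD_MS 10.0   /* the 10ms guideline */

typedef void (*task_fn)(void *context);

/* Wall-clock time in milliseconds, via gettimeofday(). */
static double now_ms(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000.0 + tv.tv_usec / 1000.0;
}

/* Runs the task once and returns how long it took in milliseconds. */
double measure_task_ms(task_fn task, void *context)
{
    assert(task != NULL);
    double start = now_ms();
    task(context);
    return now_ms() - start;
}

/* Returns 1 if one run of the task met the 10ms threshold, 0 otherwise. */
int is_threading_candidate(task_fn task, void *context)
{
    return measure_task_ms(task, context) >= THREAD_THRESHOLD_MS;
}

/* Hypothetical stand-in for a slow method: sleeps for about 20ms. */
void sample_slow_task(void *context)
{
    (void)context;
    struct timespec pause = { 0, 20 * 1000 * 1000 };   /* 0 s, 20,000,000 ns */
    nanosleep(&pause, NULL);
}
```

In a real app you would pass in a wrapper around the method you are curious about, run it several times on the kind of Mac your users have, and compare the threaded and non-threaded numbers before deciding anything.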

How Threads are Implemented on Mac OS X

Mach Threads
At the lowest level on Mac OS X, threads are implemented as Mach threads, however you probably won't ever be dealing with them directly. According to Technical Note TN2028, user space programs should not create Mach threads directly. If you want to know more about Mach threads I would suggest buying Amit Singh's Mac OS X Internals book, where he describes them in great detail.

POSIX Threads
POSIX threads are the next type of threads, and they are layered on top of Mach threads. POSIX stands for Portable Operating System Interface and is a common set of APIs for performing many tasks, amongst them creating and managing threads. You can use them in your application simply by including the pthread.h header file in your source file, which gives you access to the POSIX APIs. This is the level you will probably go down to if you need finer grained control over your threads than the higher level APIs offer.

High Level API's ( NSObject / NSThread / NSOperation )
Cocoa offers an array of threading APIs in Leopard that make it convenient and easy to thread.

NSObject
Starting with Mac OS X 10.5 Leopard, NSObject gained a new method: - (void)performSelectorInBackground:(SEL)aSelector withObject:(id)arg. This method makes it easy to dispatch a new thread with the selector and argument provided. It's interesting to note that this is essentially the same as NSThread's class method + (void)detachNewThreadSelector:(SEL)aSelector toTarget:(id)aTarget withObject:(id)anArgument, with the benefit that with the NSObject method you no longer have to specify a target; instead you call the method on the intended target. When you call performSelectorInBackground:withObject: you are in essence spawning off a new thread with the class of the object that immediately goes to execute the selector specified. So take this example:

[CWObject performSelectorInBackground:@selector(threadMethod:) withObject:nil];

is essentially the same as spawning a new thread with the class CWObject that immediately starts executing the method threadMethod:. This is a convenience method and puts your application into multithreaded mode.

NSThread
If you've ever dealt with threading pre-Leopard, chances are that you came in contact with NSThread's class method detachNewThreadSelector:toTarget:withObject:. However it's gotten a couple of new tricks. For starters you can now instantiate NSThread objects and start them when you want to, and you can create your own NSThread subclasses and override the - (void)main method, which makes it look similar to NSOperation, minus many of the benefits NSOperation brings to threading operations, including KVO compliance.

NSOperation / NSOperationQueue
NSOperation is the sexy new kid on the block and brings a lot to the table when it comes to threading. It's an abstract class that you subclass, overriding its - (void)main method just like with NSThread, and that is all that is required before you can instantiate your subclass and put it into action. Additionally NSOperation is fully KVO compliant. NSOperationQueue brings easy thread management to the table. It can poll your application performance and the system performance/load and automatically spawn as many threads as it thinks your system can handle, making it future proof with regard to new hardware coming down the line from Apple, though I should note that the documentation says it will most likely spawn off as many threads as your system has cores.

Thread Locks

I mentioned thread safety earlier on, and an important part of that is making sure that at critical points in your code only 1 thread has access to a portion of code at a time, so that the value a thread retrieves is still correct and 2 threads aren't changing the same value at the same time. There are several methods you can use to put locks in your application.

@synchronized Objective-C Directive
Objective-C has a built in mechanism for thread locking. The @synchronized directive takes any Objective-C object, including self. Whatever you pass to @synchronized should be a unique identifier; while one thread is inside the block of code between the braces, other threads synchronizing on the same object are blocked from entering it. In addition to locking, @synchronized also sets up Objective-C exception handling, which means exception handling has to be enabled in your application in order to use it. When an exception is thrown inside the block, the lock is released and the exception is passed on to the next exception handler. Some examples of how to use @synchronized are below:

- (void)veryCriticalThreadMethod
{
    @synchronized(self)
    {
        /* very critical code here */
    }
}

Apple's docs also mention a way to identify and lock down just your method:

- (void)veryCriticalThreadMethod
{
    @synchronized(NSStringFromSelector(_cmd))
    {
        /* very critical code here */
    }
}

In this case we are passing a string to the @synchronized directive, created by calling the NSStringFromSelector function with _cmd, which is just a reference to the method's own selector. Additionally, if you are trying to ensure that only 1 thread can change the value of an object at a time, you should simply pass in that object.

NSLocks
@synchronized might be the most convenient way to implement thread locking, however it is by far not the only one, and in some situations it is not the right one to apply. Here are some of the other methods for thread locking.

NSLock
NSLock is the most basic form of locking. It uses POSIX thread locking (link above) to implement its locking.

NSLock *myLock = [[NSLock alloc] init];

[myLock lock];

/* critical code here */

[myLock unlock];

To use it in its basic form all you need to do is create an instance of NSLock and call lock on it, put your critical code after the lock call, and call unlock when you are done. I should note that if the same thread calls lock on it again before unlocking, you will get a deadlock, because the lock is already locked and can't be locked again. This brings me to NSRecursiveLock.

NSRecursiveLock
NSRecursiveLock is just like NSLock in that it provides locking for your thread, but it allows the same thread to call lock on it again if you are using the lock in a recursive manner, like its name suggests.

/* assumed to be created once (for example in -init) and shared by the methods below */
NSRecursiveLock *myLock = [[NSRecursiveLock alloc] init];

-(void)veryCriticalThreadMethod
{
    [myLock lock];

    [self incrementCounterBy:2];

    [myLock unlock];
}

-(void)incrementCounterBy:(NSInteger)aNum
{
    [myLock lock];
    
    self.counter += aNum;
    
    [myLock unlock];
}
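The same nested locking idea can be sketched one level down with POSIX threads. This is my own rough C illustration, not how NSRecursiveLock is necessarily implemented; the function names are made up. It uses a mutex of type PTHREAD_MUTEX_RECURSIVE so the same thread can lock twice, and it uses pthread_mutex_trylock on a plain mutex to show the relock being refused rather than actually deadlocking.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for PTHREAD_MUTEX_RECURSIVE under strict modes */

#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t recursive_lock;
static int counter = 0;

/* Initializes recursive_lock as a recursive mutex. Returns 0 on success. */
int setup_recursive_lock(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(&recursive_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Takes the lock a second time on the same thread, then adds to the counter. */
static void increment_counter_by(int amount)
{
    int rc = pthread_mutex_lock(&recursive_lock);
    assert(rc == 0);
    (void)rc;
    counter += amount;
    pthread_mutex_unlock(&recursive_lock);
}

/* Mirrors veryCriticalThreadMethod: locks, then calls a function that locks again.
   Returns the counter value seen while the lock was still held. */
int very_critical_function(void)
{
    pthread_mutex_lock(&recursive_lock);
    increment_counter_by(2);
    int seen = counter;
    pthread_mutex_unlock(&recursive_lock);
    return seen;
}

/* Returns 1 if a second attempt on an already held plain (non-recursive)
   mutex is refused with EBUSY. A blocking lock call here would hang instead. */
int plain_lock_refuses_relock(void)
{
    pthread_mutex_t plain;
    if (pthread_mutex_init(&plain, NULL) != 0)
        return 0;
    pthread_mutex_lock(&plain);
    int rc = pthread_mutex_trylock(&plain);
    pthread_mutex_unlock(&plain);
    pthread_mutex_destroy(&plain);
    return rc == EBUSY;
}
```

Call setup_recursive_lock() once before anything else. Each lock call on a recursive mutex still needs its own matching unlock, just like the paired lock/unlock calls in the NSRecursiveLock example.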

NSConditionLock
NSConditionLock does as its name suggests and gives you a set of APIs for developing your own system as to when it should and shouldn't lock. Its conditions are all NSIntegers that you pass in, along with optional date references, to lock down your thread.

POSIX Thread Locking
It is possible to use the POSIX thread lock functions in your code, since your thread is residing on top of a POSIX thread anyway. To use a POSIX lock in your code you need to include the pthread.h header and use it like so:

#include <pthread.h>
#include <stdlib.h> /* for exit() */

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;

if(pthread_mutex_lock(&mtx))
    exit(-1); /* your lock has failed */

/* super vital secret code here */

if(pthread_mutex_unlock(&mtx))
    exit(-1); /* your unlock has failed */

It should be noted that the Google Mac team has blogged about this. In particular they found the @synchronized directive's setup to be bloated for what they needed: it not only locked the thread, but its exception handling setup kept their app from performing as well as they wanted. Their solution in the end was to use the pthread lock directly.

Proper Locking with KVO Notifications
As noted by Chris Kane on the Cocoa-Dev mailing list, KVO is thread safe, but the manner in which you use willChangeValueForKey: and didChangeValueForKey: can screw things up if you use them wrong. Thus if you use KVO notifications you should take your lock, send the willChangeValueForKey: notification, modify your object, send the didChangeValueForKey: notification, and then unlock, like this:

/* an NSLock created elsewhere and shared by every thread that changes someKey */
NSLock *myLock;

[myLock lock];

[self willChangeValueForKey:@"someKey"];

someKey = somethingElse;

[self didChangeValueForKey:@"someKey"];

[myLock unlock];

He also discusses a receptionist pattern that can be used to perform KVO updates on the main thread from a 2nd+ thread.

Locking with Threaded Drawing
Another topic of interest for people is how to get threads to draw to an NSView. Although I won't dive too deep into it here, the basic steps for threaded drawing are locking the NSView, performing your drawing, and then unlocking the view, like so:

/* important that we acquire a lock for the view first */
if([ourView lockFocusIfCanDraw]) {
    
    /* do drawing code here */
        
    /* flushes the contents of the offscreen buffer to the screen if flushing & buffered is enabled */
    [[ourView window] flushWindow];
    /* relinquish our lock on the view */
    [ourView unlockFocus];
}

NSObject
In Leopard NSObject gained a method called performSelectorInBackground:withObject:, which as I mentioned earlier is essentially the same as NSThread's class method for dispatching new threads, minus the argument for which target object should do the operation.

/* in AppController.m */

[self performSelectorInBackground:@selector(doThreadingMethod:)
                       withObject:nil];
                    
/* is roughly equivalent to */

[NSThread detachNewThreadSelector:@selector(doThreadingMethod:)
                         toTarget:self
                        withObject:nil];

NSThread
If you've ever done any threading, or looked at any Cocoa project that did threading prior to Leopard, you probably ran into NSThread and its class method + (void)detachNewThreadSelector:(SEL)aSelector toTarget:(id)aTarget withObject:(id)anArgument, which is shown in the source code example just before this. NSThread is an easy way to get a thread going. However it has a new ability in Leopard: you can create an instance of NSThread and later on, when you want to, call start on it, and only then will it create your thread.

NSThread *myThread;

myThread = [[NSThread alloc] initWithTarget:self
                 selector:@selector(doSomething:)
                   object:nil];

/* some code */

[myThread start]; //start thread

All of this can be more convenient to you depending on how you like to use NSThread. However an even better new ability, in my opinion, is that you can subclass NSThread and override its - (void)main method like so:

@interface MyThread : NSThread {
    NSString *aString;
}
@property(copy) NSString *aString;
@end

@implementation MyThread

@synthesize aString;

- (void)main
{
    self.aString = @"Super Big String that needed a thread";
}

@end

Just as with a generic NSThread object, you simply create an instance of your NSThread subclass and call start on it when you need it to dispatch a new thread. It's also very important to note that in non garbage collected applications, whatever method you call with the NSThread class method must set up an NSAutoreleasePool and release it at the end of the method. If your application is garbage collected you don't have to worry about this, as it is done for you automatically.

NSOperation
I've made it no secret that I have a particular fondness for NSOperation and have been evangelizing it for a while now, and for good reason. NSOperation brings the sexy to threading and makes it easy not only to create threaded operations, but with KVO and NSOperationQueue to take a great deal of control over them. In particular I like this trend Apple is evangelizing of breaking your code up into specialized chunks that, in combination with your custom initializer methods, can be an effective force inside your app. For example, in my app RedFlag (a Del.icio.us client in development) I use a subclass called RFNetworkThread. All I have to assume is that a credential has been stored in the user's keychain beforehand; I dispatch these operations with different URLs that call out to del.icio.us, and each can easily store and retrieve the data for a particular query.

NSOperation is implemented no differently than a subclassed NSThread object, with the exception that you subclass from NSOperation instead:

@interface MyThread : NSOperation {
    NSString *aString;
}
@property(copy) NSString *aString;
@end

@implementation MyThread

@synthesize aString;

- (void)main
{
    self.aString = @"Super Big String that needed a thread";
}

@end

However with NSOperation you gain some big benefits. Most notable among them is that you can use NSOperation with NSOperationQueue to have some sort of flow of control. NSOperation is also KVO compliant, so you can receive notifications about when it is executing and when it is done executing. Its bindable properties are:

isCancelled
isConcurrent
isExecuting
isFinished
isReady
the operation's dependencies array
queuePriority (the only writable property)

Apple recommends that if you override any of these properties you maintain KVO compliance in your operation objects. Additionally, if you add any further properties to your operation objects you should make them KVO compliant as well. If you want to observe when your operation objects have completed execution, you could do it in a manner like this:

    1 NSOperationQueue *que = [[NSOperationQueue alloc] init];
    2     
    3 loginThread = [[RFNetworkThread alloc] init];
    4     
    5 [loginThread addObserver:self
    6               forKeyPath:@"isFinished" 
    7                options:0
    8                   context:nil];
    9     
   10 [que addOperation:loginThread];

Here's what we did: (1: Line 1) created an NSOperationQueue to run our network operation on, (2: Line 3) created our NSOperation subclass object, (3: Line 5) added the controller class as an observer of the operation object's key path "isFinished" (because isFinished will always be NO until the operation actually completes and it changes to YES, we only need to be notified when that key path changes), and then (4: Line 10) added the operation object to the queue. It's important to note that if you want to modify anything about a particular NSOperation object you need to do it before placing it onto the queue, otherwise it's just like playing Russian roulette... you just don't know what will happen. Now to observe the change in the operation object you only need to implement the standard KVO notification method in whatever class added itself as an observer, like so:

    1 - (void)observeValueForKeyPath:(NSString *)keyPath 
    2                            ofObject:(id)object 
    3                               change:(NSDictionary *)change 
    4                             context:(void *)context
    5 {
    6 if([keyPath isEqual:@"isFinished"] && loginThread == object){
    7         NSLog(@"Our Thread Finished!");
    8         
    9         [loginThread removeObserver:self
   10                          forKeyPath:@"isFinished"];
   11                         
   12         [self processDeliciousData:loginThread.returnData];
   13     } else {
   14         [super observeValueForKeyPath:keyPath
   15                              ofObject:object 
   16                                change:change
   17                               context:context];
   18     }
   19 }
   20

The method observeValueForKeyPath:ofObject:change:context: allows us to be notified through KVO when our operation has completed. In this instance I am only concerned with making sure that the key path is "isFinished" and that the loginThread I created earlier is the object in question (line 6). Once I've gotten this notification I only need to remove myself as an observer and do what I intended to do with the data I got back from the operation. Beyond that, on line 14 I only need to make sure that if the notification isn't for our key path and our object, I pass it on up the chain of interested objects. The other alternative is to do as Marcus Zarra has demoed on Cocoa Is My Girlfriend and create a shared instance, so as to provide a reference that you can call performSelectorOnMainThread on to return control back to the main thread. Neither option is the wrong one; it just depends on which you like better and how many operation objects you are tracking. What's better in my opinion is when you create your own dependency tree and use KVO notifications to be notified when the root dependency has completed.

This brings me to my next point: NSOperation dependencies. NSThread has no built in mechanism for adding dependencies as of right now. However NSOperation has the - (void)addDependency:(NSOperation *)operation method, which provides an easy mechanism (when used with NSOperationQueue) for dependency management. So with that, let's get into NSOperationQueue...

NSOperationQueue
Unless you intend to manage NSOperation objects yourself, NSOperationQueue will be your path to managing threads. It has several APIs for dealing with how much concurrency you can tolerate in your app, blocking a thread until all operation objects have finished, etc. When you place NSOperation objects onto the queue it goes through them and looks for ones that have no dependencies or whose dependencies have already completed. In this way you can build your own custom dependency tree, toss all the objects onto the queue, and NSOperationQueue will follow it while obeying how much concurrency you can handle. As noted earlier, you only need to use - (void)addDependency:(NSOperation *)operation to indicate this to the operation queue.

By default NSOperationQueue will spawn off as many threads as your system has cores. However if you need more or fewer, or would just like to specify that programmatically, you can use - (void)setMaxConcurrentOperationCount:(NSInteger)count to set the concurrency count yourself, although Apple recommends that you use the value NSOperationQueueDefaultMaxConcurrentOperationCount, which adjusts the number of operation objects that execute dynamically as your system load goes up and down.

Another useful API lets you put a bunch of operation objects onto the queue and halt the method you are in until they all finish. The - (void)waitUntilAllOperationsAreFinished method was designed for this. I would recommend that you call it on a 2nd+ thread so as to not block the main thread while you wait for operation objects to complete.

Since NSOperation is KVO compliant, it only makes sense that NSOperationQueue be KVO compliant as well. The most useful binding NSOperationQueue has is its bindable operations property, which lets you bind to all the NSOperation objects being executed. This is somewhat analogous to Mail's Activity Viewer ( Command + 0 ) interface. It enables you to provide information to your users about what is going on in your application. Personally I would recommend that you do provide such an interface, but expose it as an optional one, in the same way that Mail doesn't show the Activity Viewer by default but lets you hit a shortcut to bring it up.
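If you ever drop down to POSIX threads and have to make the "one thread per core" and "wait until everything is finished" decisions yourself, a rough C sketch might look like the one below. To be clear, this is not how NSOperationQueue is implemented; it is only an illustration under my own assumptions. All the function names are made up, and sysconf(_SC_NPROCESSORS_ONLN) is a widely available call (though not strictly part of POSIX) for asking how many cores are online.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

#define MAX_WORKERS 64   /* arbitrary cap chosen for this sketch */

/* Returns the number of online processor cores, or 1 if that is unknown. */
long online_core_count(void)
{
    long cores = sysconf(_SC_NPROCESSORS_ONLN);
    return (cores > 0) ? cores : 1;
}

/* Picks a worker count: the explicit cap if one is given (> 0),
   otherwise one worker per core. */
long pick_worker_count(long max_concurrent)
{
    long count = (max_concurrent > 0) ? max_concurrent : online_core_count();
    assert(count >= 1);
    return count;
}

/* Placeholder work: each worker only marks its own slot as done. */
static void *mark_done(void *arg)
{
    *(int *)arg = 1;
    return NULL;
}

/* Starts the workers, blocks until every one of them has finished,
   and returns how many completed. */
int run_workers_and_wait(long worker_count)
{
    pthread_t threads[MAX_WORKERS];
    int done[MAX_WORKERS] = {0};
    long started = 0;

    if (worker_count > MAX_WORKERS)
        worker_count = MAX_WORKERS;

    for (; started < worker_count; started++) {
        if (pthread_create(&threads[started], NULL, mark_done, &done[started]) != 0)
            break;
    }

    int finished = 0;
    for (long i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);   /* blocks until that worker is finished */
        finished += done[i];
    }
    return finished;
}
```

A call like run_workers_and_wait(pick_worker_count(0)) runs one worker per core and halts the caller until they are all done, which is why you would not want to make that call from the main thread.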

Conclusion & References

I've touched on a lot of things in this article, but even so I am barely scratching the surface of multithreading. It's a very expansive topic indeed. But I hope I gave you a starting point here. In this article I've covered:

What threads are
When you should use threading
When you shouldn't use threading
What the Costs of Creating Threads are
A warning about Threading and its implications
How Threads are implemented on Mac OS X
Mach Threads
POSIX Threads
Thread Locks
NSThread
NSOperation
NSOperationQueue

And yet there is so much still left to talk about. Below you'll find a simple concise list of links to the APIs and classes mentioned throughout the article.

NSRunLoop
Technical Note TN2028: Threading Architectures (About Mach Threads)
Thread Costs
Thread Safety
Thread Safe and Unsafe Classes
Apple's Threading Programming Guide
Mac OS X Internals (Book)
POSIX Threads (Wikipedia)
POSIX Thread Header File ( pthread.h )
NSObject Threading Method
NSThread Class Reference
NSStringFromSelector Method
NSLock
NSRecursiveLock
NSConditionLock
POSIX Thread Lock
Google Mac Blog on @synchronized performance slowdown
Chris Kane @ Apple on KVO Thread Safety and KVO updates
NSOperation
NSOperationQueue
Marcus Zarra Tutorial on NSOperation with shared instance reference

Credits

Thread memory layout graph based off of graph from CocoaDevCentral licensed under a Creative Commons License @ http://cocoadevcentral.com/articles/000061.php by H. Lally Singh