Wednesday, August 11, 2010

Enumerating Lines Concurrently using a Block

Recently on Twitter I asked about enumerating over the lines of a string concurrently using a block. I couldn't see an API that deals with this directly. If you look through the Mac OS X 10.6 SDK you'll see this:

- (void)enumerateLinesUsingBlock:(void (^)(NSString *line, BOOL *stop))block
This is a synchronous operation that processes each line (the components separated by \n) one by one until it has gone through the whole string. That doesn't help if you want to process the lines concurrently while keeping the operation as a whole synchronous, that is, not proceeding until every line has been processed. Steve Streza came up with a solution that I should have thought of, but it didn't seem obvious until I saw it:
[[myString componentsSeparatedByString:@"\n"] enumerateObjectsWithOptions:NSEnumerationConcurrent
                                                                usingBlock:^(id obj, NSUInteger idx, BOOL *stop) {
    NSString *line = (NSString *)obj;
    // do what you're going to do with line...
}];
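To make the "concurrent per line, synchronous as a whole" behavior concrete, here is a minimal sketch of the same approach as a small command-line program. The sample text and the `lengths` buffer are illustrative assumptions, not part of the original snippet, and `@autoreleasepool` assumes a compiler newer than the one that shipped with the 10.6 SDK.

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Illustrative input; any multi-line NSString would do.
        NSString *text = @"alpha\nbeta\ngamma";
        NSArray *lines = [text componentsSeparatedByString:@"\n"];

        // One slot per line. Each block writes only to its own index,
        // so the concurrent blocks never touch the same memory.
        NSUInteger *lengths = calloc([lines count], sizeof(NSUInteger));

        [lines enumerateObjectsWithOptions:NSEnumerationConcurrent
                                usingBlock:^(id obj, NSUInteger idx, BOOL *stop) {
            NSString *line = (NSString *)obj;
            lengths[idx] = [line length];
        }];

        // The enumeration call has returned, so every line has been processed.
        for (NSUInteger i = 0; i < [lines count]; i++) {
            NSLog(@"line %lu has %lu characters",
                  (unsigned long)i, (unsigned long)lengths[i]);
        }
        free(lengths);
    }
    return 0;
}
```

One thing to keep in mind: splitting on @"\n" only recognizes that one separator, so input with \r\n line endings would leave a trailing \r on each line, whereas enumerateLinesUsingBlock: handles the other line terminators for you.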
A minor hack, but it works and isn't incredibly inconvenient. Still, I wish Apple would provide this API:
- (void)enumerateLinesWithOptions:(NSEnumerationOptions)options usingBlock:(void (^)(NSString *line, BOOL *stop))block;
[myString enumerateLinesWithOptions:NSEnumerationConcurrent usingBlock:^(NSString *line, BOOL *stop) {
    // do stuff with the line...
}];
I was also asked why I didn't do something like:
[myString enumerateLinesUsingBlock:^(NSString *line, BOOL *stop) { dispatch_async(myQueue, ^{ /* do stuff with the line... */ }); }];
The reason is simple: while that would technically work, I wouldn't know when the operation as a whole had completed. You would really want to use a dispatch_group so you could dispatch all the blocks and then get notified when they have all completed. However, the solution above works with a minimum of fuss, and the operation as a whole is synchronous while the lines are processed concurrently. We could do the same thing other ways, but overall we'd be writing a lot more code for really no benefit.
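For comparison, here is a rough sketch of what the dispatch_group route might look like. The helper name ProcessLines and its work and done parameters are hypothetical, made up for illustration; only the GCD and Foundation calls are real API. It assumes ARC on a recent SDK, where dispatch objects are memory-managed automatically.

```objc
#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>

// Hypothetical helper: runs `work` on each line concurrently and
// calls `done` once every line has been processed.
static void ProcessLines(NSString *text,
                         void (^work)(NSString *line),
                         void (^done)(void)) {
    dispatch_queue_t queue =
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_group_t group = dispatch_group_create();

    // The enumeration itself is still serial; only the per-line work
    // is handed off to the concurrent queue.
    [text enumerateLinesUsingBlock:^(NSString *line, BOOL *stop) {
        dispatch_group_async(group, queue, ^{
            work(line);
        });
    }];

    // Fires after every block submitted to the group has finished.
    dispatch_group_notify(group, queue, done);

    // Under manual retain/release (as on 10.6), the group would also
    // need a matching dispatch_release(group) here.
}
```

ProcessLines returns right away and reports completion through done. Swapping the dispatch_group_notify call for dispatch_group_wait(group, DISPATCH_TIME_FOREVER) would instead block the caller until all the lines are finished, which is the behavior the array-based version gives you for free.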

2 comments:

Jacob Gorban said...

That's a nice solution, but it might be problematic for large strings with a great many lines, because separating the string is not parallel and it also creates an NSArray from the original string, duplicating the content.

If your operation on one line is not very heavy, then perhaps all the performance that could be gained will be lost to creating this array of separate lines before the first line has even started to be processed.

Ideally you'd only want to pass through the multi-line string once, as Apple's synchronous solution probably does.

Perhaps this approach with dispatching is not so bad in fact.

Adam Preble said...

I wrote up a solution: NSString+ConcurrentEnumeration.m. It turns out to be pretty simple with the use of NSString's enumerateSubstringsInRange:options:usingBlock: and GCD groups.
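The linked file is not reproduced here, so the following is not Adam's code. It is only a guess at the general shape such a category could take, built from the two pieces he names. The category name and the sketch_ method prefix are invented for illustration, and the sketch assumes ARC.

```objc
#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>

// Hypothetical category, not the actual NSString+ConcurrentEnumeration.
@interface NSString (ConcurrentLinesSketch)
- (void)sketch_enumerateLinesConcurrentlyUsingBlock:(void (^)(NSString *line))block;
@end

@implementation NSString (ConcurrentLinesSketch)

- (void)sketch_enumerateLinesConcurrentlyUsingBlock:(void (^)(NSString *line))block {
    dispatch_queue_t queue =
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_group_t group = dispatch_group_create();

    // A single pass over the string: each line is dispatched as soon as
    // it is found, without first building an array of all the lines.
    [self enumerateSubstringsInRange:NSMakeRange(0, [self length])
                             options:NSStringEnumerationByLines
                          usingBlock:^(NSString *substring,
                                       NSRange substringRange,
                                       NSRange enclosingRange,
                                       BOOL *stop) {
        dispatch_group_async(group, queue, ^{
            block(substring);
        });
    }];

    // Block the caller until every dispatched line has been processed,
    // so the method as a whole stays synchronous.
    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
}

@end
```

This version leaves out the BOOL *stop parameter on purpose: once lines are running concurrently, honoring an early stop needs shared state and some decision about lines already in flight, which is more than a sketch should try to settle.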

 
...