Fast Folder Nuking on iOS

I got a strange bug report last week for iCatalog: deleting outdated catalogs takes too long, so couldn’t we show a HUD with a spinner while the deletion occurs? That was definitely one of these HUUUUU?! moments. I always thought that file deletion is instant on Unix since only an entry in a file table needs to be removed.

I grabbed an iPad 1 and deleted a 160 MB catalog, only to find that the whole deletion, a simple NSFileManager removeItemAtPath:error:, took 50 seconds. Uhm, no, blocking the main thread and the interface for that long is far from ideal.

I played around a bit and over the course of the day, with some great help from several GCD experts on Twitter, I pieced together a solution that might interest you if you ever have to delete large amounts of files in an instant. Before Cocoa, on Carbon, OS X offered a function called FSPathMoveObjectToTrashAsync; this is sort of the equivalent for iOS.

I didn’t debug or instrument NSFileManager much to find out why it performs so badly in this case. I suspect that this might be because NSFM supports setting a delegate, whose delegate methods get called multiple times during file operations. Possibly the check for which delegate methods are implemented (all are optional) is not done once when the delegate is set, but repeated on every call. If NSFM does tons of respondsToSelector: then that would explain the lag.

The goal here was to get rid of a folder (and its contents) as fast as possible and to have any longer operations go on in the background. So my first instinct was to create a category on NSFileManager.

Go Undercover

Removing the above mentioned 160 MB folder would take 50 seconds (on iPad 1), but it would only require 8 ms to rename it. So the strategy became: 1) move it away into a temp location 2) actually delete it.

// move it to a tmp name so that it appears gone
CFUUIDRef newUniqueId = CFUUIDCreate(kCFAllocatorDefault);
CFStringRef newUniqueIdString = CFUUIDCreateString(kCFAllocatorDefault, newUniqueId);
NSString *tmpPath = [NSTemporaryDirectory() stringByAppendingPathComponent:(__bridge NSString *)newUniqueIdString];
CFRelease(newUniqueId);
CFRelease(newUniqueIdString);
 
// make a file manager just for this thread/queue
NSFileManager *fileManager = [[NSFileManager alloc] init];
 
if (![fileManager moveItemAtPath:path toPath:tmpPath error:NULL])
{
	// looks like the file is no longer there
	return;
}
 
[fileManager removeItemAtPath:tmpPath error:NULL];

Now you see that I am creating a new fileManager. In the initial category I would simply use self here. But it was pointed out to me that it is unsafe to use the same NSFM instance from multiple threads. Also we don’t know “where it has been” – i.e. somebody might have set the delegate – so we make our own.

A Touch of GCD

The trusty performSelectorInBackground:withObject: is so 2010, so we’ll make full use of the facilities provided by Grand Central Dispatch (GCD). Trust me, it’s easier than it sounds. Don’t let yourself be intimidated by the C and Blocks. We covered blocks before, today we shall dispatch them asynchronously.

We will employ 3 GCD techniques in tandem:

  1. dispatch_async
  2. GCD groups
  3. dispatch_once

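In true GCD terminology everything happens on so-called queues. A queue is not quite the same thing as a thread: GCD takes the blocks off a queue and runs them on threads from a pool that it manages for you. For our purposes it is enough to think of a background queue as work that happens off the main thread. There are ways to get the main queue or global background queues with certain priorities. But the one attribute that we shall make use of is that a serial queue, which is what you get by default when you create your own, processes one block of code at a time.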

This means you can feed multiple operations (packed in blocks) onto a background queue and be certain that they will be worked off sequentially. Sounds like an NSOperationQueue, doesn’t it? Well, NSOQ was reimplemented on top of GCD as soon as that became available in the OS, because it is way more efficient than regular threading.
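To see the "worked off sequentially" part in action, here is a minimal sketch, assuming ARC and a queue created with a NULL attribute (which makes it serial). DemoSerialOrder and the queue label are made-up names for illustration only.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: queues five blocks on a serial queue and reports
// the order in which they actually ran.
static NSArray *DemoSerialOrder(void)
{
	// NULL attribute = serial queue, i.e. one block at a time, first in first out
	dispatch_queue_t queue = dispatch_queue_create("DemoSerialQueue", NULL);
	NSMutableArray *order = [NSMutableArray array];

	for (NSInteger i = 0; i < 5; i++)
	{
		// returns immediately, the block runs later on a background thread
		dispatch_async(queue, ^{
			[order addObject:@(i)];
		});
	}

	// an empty synchronous block only runs after everything queued before it
	dispatch_sync(queue, ^{});

	return [order copy];
}
```

The array is only ever touched from blocks on the one serial queue, and it is read after the final dispatch_sync, so no extra locking is needed in this sketch.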

Now you can submit blocks to a queue on their own, or you can additionally associate them with a dispatch group. The latter is advantageous if you want to be able to wait for the queued work to finish.

The DTAsyncFileDeleter class that we are creating here is supposed to work off the items we give it to remove in sequence so that we can be certain that we don’t have two instances try to remove the same file at once. I initially thought about a simple @synchronized but that would have blocked if you wanted to remove two items in rapid succession. As a matter of fact I found that just mentioning the possibility of @synchronized in informed circles will get you many frowns and invariably several people will step forward to lecture you on how much more efficient GCD is.

Let’s summarize the strategy: All instances of DTAsyncFileDeleter should use the same background queue. This queue should only be created once globally for all instances. And we need to somehow know if all operations are done. Oh, and as a bonus it would be nice if the deletion would continue even after the app is suspended …

“use the same background queue” and “create once globally” are the terms that should trigger two ideas: static global variable and dispatch_once.

dispatch_once is a way in GCD to have something occur exactly once. Before GCD we would possibly have a static variable, check that against nil and only instantiate the object if it is nil. The problem with this approach is that there would potentially be race conditions where two threads would call the sharedInstance method at the same time and both would create the shared instance, but one would leak.

With GCD you define a static token and you are guaranteed that the dispatch_once block using this token is only going to be executed once. Really.
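To make the difference concrete, here is a small sketch that puts the two styles side by side, assuming ARC. DemoCache and both method names are hypothetical and exist only for this comparison.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class, used only to contrast the two styles
@interface DemoCache : NSObject
+ (DemoCache *)racySharedCache;
+ (DemoCache *)sharedCache;
@end

@implementation DemoCache

// Pre-GCD style: the nil check and the assignment are two separate steps,
// so two threads arriving at the same time can both see nil and both create
// an instance.
+ (DemoCache *)racySharedCache
{
	static DemoCache *cache = nil;

	if (!cache)
	{
		cache = [[DemoCache alloc] init];
	}

	return cache;
}

// GCD style: the block runs exactly once per token, other callers wait
// until it has finished and then see the fully created instance.
+ (DemoCache *)sharedCache
{
	static DemoCache *cache = nil;
	static dispatch_once_t onceToken;

	dispatch_once(&onceToken, ^{
		cache = [[DemoCache alloc] init];
	});

	return cache;
}

@end
```

The token has to be static (or global) so that it is zero-initialized and outlives the call.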

So in code, we have the static global variables at the top of our class, before the @implementation:

static dispatch_queue_t _delQueue;
static dispatch_group_t _delGroup;
static dispatch_once_t onceToken;

And in the init we dispatch_once the creation of a queue and a group.

dispatch_once(&onceToken, ^{
	_delQueue = dispatch_queue_create("DTAsyncFileDeleterQueue", 0);
	_delGroup = dispatch_group_create();
});

In this instance we don’t care about releasing these resources because these items should stay available until the app is terminated. GCD objects are very lightweight anyway, so not to worry.

And with the GCD trimmings set up we can wrap the above rename and delete code in a GCD block:

- (void)removeItemAtPath:(NSString *)path
{
	dispatch_group_async(_delGroup, _delQueue, ^{
		// rename and delete
	});
}
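Filled in, the method could look roughly like the sketch below. It assumes ARC, and DemoAsyncDeleter is a made-up stand-in rather than the actual DTAsyncFileDeleter source. To keep it short the unique name comes from NSProcessInfo’s globallyUniqueString instead of the CFUUID calls shown earlier, which is my substitution.

```objc
#import <Foundation/Foundation.h>

static dispatch_queue_t _demoQueue;
static dispatch_group_t _demoGroup;
static dispatch_once_t _demoOnceToken;

// Hypothetical stand-in for the class built in this article
@interface DemoAsyncDeleter : NSObject
- (void)removeItemAtPath:(NSString *)path;
@end

@implementation DemoAsyncDeleter

- (id)init
{
	self = [super init];

	if (self)
	{
		// queue and group are shared by all instances and created only once
		dispatch_once(&_demoOnceToken, ^{
			_demoQueue = dispatch_queue_create("DemoAsyncDeleterQueue", NULL);
			_demoGroup = dispatch_group_create();
		});
	}

	return self;
}

- (void)removeItemAtPath:(NSString *)path
{
	dispatch_group_async(_demoGroup, _demoQueue, ^{
		// move it to a tmp name so that it appears gone
		NSString *uniqueName = [[NSProcessInfo processInfo] globallyUniqueString];
		NSString *tmpPath = [NSTemporaryDirectory() stringByAppendingPathComponent:uniqueName];

		// make a file manager just for this queue
		NSFileManager *fileManager = [[NSFileManager alloc] init];

		if (![fileManager moveItemAtPath:path toPath:tmpPath error:NULL])
		{
			// looks like the file is no longer there
			return;
		}

		// the slow part, nobody is waiting for it
		[fileManager removeItemAtPath:tmpPath error:NULL];
	});
}

@end
```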

Of course if you don’t need to wait on the queue then you can dispense with the group thing, a regular dispatch_async without group will work just the same. But we want to have a way for the outside world to wait, so we implement a method for that.

- (void)waitUntilFinished
{
	dispatch_group_wait(_delGroup, DISPATCH_TIME_FOREVER);
}

If all blocks in our _delGroup are done, this function returns immediately. Otherwise it blocks the calling thread until they are, with DISPATCH_TIME_FOREVER meaning that there is no timeout. It’s almost too easy.
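If waiting forever is too long for your taste, dispatch_group_wait also accepts a deadline and returns non-zero when that deadline passes first. A minimal sketch, assuming ARC; DemoWaitForGroup is a hypothetical helper and not part of DTAsyncFileDeleter.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if all blocks of the group finished within
// the given number of seconds, NO if the wait timed out.
static BOOL DemoWaitForGroup(dispatch_group_t group, NSTimeInterval seconds)
{
	dispatch_time_t deadline = dispatch_time(DISPATCH_TIME_NOW, (int64_t)(seconds * NSEC_PER_SEC));

	// 0 means the group became empty, anything else means we hit the deadline
	return dispatch_group_wait(group, deadline) == 0;
}
```

A timed-out wait does not cancel anything, the queued blocks keep running. The calling thread is still blocked for up to the given time, so this is no license to wait on the main thread.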

At this point our class is fully functional, but for convenience we also want to have a shared instance. So using what we now know about a token for dispatch_once we can construct our +sharedInstance method thusly:

static DTAsyncFileDeleter *_sharedInstance;
 
@implementation DTAsyncFileDeleter
 
+ (DTAsyncFileDeleter *)sharedInstance
{
	static dispatch_once_t instanceOnceToken;
	dispatch_once(&instanceOnceToken, ^{
		_sharedInstance = [[DTAsyncFileDeleter alloc] init];
	});
 
	return _sharedInstance;
}

Oh how convenient that is, anywhere we import the header we can do

[[DTAsyncFileDeleter sharedInstance] removeItemAtPath:path];

It’s fast, it’s convenient, it’s safe. It’s totally Apple.

Bonus: Automatic Background Task Completion

If we left it here then the deletion process would get suspended together with the app and possibly aborted if the app is killed. Of course it would resume normally the next time the app is brought to the foreground but such a deletion might take in the vicinity of one minute on an older iOS device, so that’s an ideal use case for iOS multitasking, aka background task completion.

An app is not suspended but left alive while it has one or more registered background tasks running. For this purpose you need to get a background task ID from the shared UIApplication instance of your app. You need to pass an expiration handler that gets called if the task runs into the 10 min timeout. And you need to end the background task in both cases, when timed out and when complete.

We just need to register a method for the UIApplicationDidEnterBackgroundNotification and do all of that there. Of course we also check if multitasking is available, just to be sure. (It is probably superfluous because it wouldn’t get the notification if it weren’t, right?)

#pragma mark Notifications
- (void)applicationDidEnterBackground:(NSNotification *)notification
{
	UIDevice *device = [UIDevice currentDevice];
 
	if ([device respondsToSelector:@selector(isMultitaskingSupported)])
	{
		if (!device.multitaskingSupported)
		{
			return;
		}
	}
 
	UIApplication *app = [UIApplication sharedApplication];
	__block UIBackgroundTaskIdentifier backgroundTaskID;
 
	void (^completionBlock)(void) = ^{
		[app endBackgroundTask:backgroundTaskID];
		backgroundTaskID = UIBackgroundTaskInvalid;
	};
 
	backgroundTaskID = [app beginBackgroundTaskWithExpirationHandler:completionBlock];
 
	// wait for all deletions to be done
	[self waitUntilFinished];
 
	// ... when the deletions are complete:
	if (backgroundTaskID != UIBackgroundTaskInvalid)
	{
		completionBlock();		
	}
}

Notice the __block which tags the backgroundTaskID variable such that it can be modified from inside the blocks. Without this attribute the value of the variable would be captured at the time of creation of the block. You can also see that I reuse the completionBlock variable because the same code is to be executed on the timeout as is on successful completion. The actual work is just to wait for the _delQueue to finish.
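The difference is easiest to see in isolation. A minimal sketch, assuming ARC; DemoCaptureSummary is a made-up function that changes two variables after the block has been created and then lets the block report what it sees.

```objc
#import <Foundation/Foundation.h>

// Hypothetical demo of captured versus __block variables
static NSString *DemoCaptureSummary(void)
{
	int captured = 1;        // copied into the block when the block is created
	__block int shared = 1;  // shared between the block and the enclosing scope

	NSString *(^summary)(void) = ^NSString *(void) {
		return [NSString stringWithFormat:@"captured=%d shared=%d", captured, shared];
	};

	// both variables change after the block was created ...
	captured = 2;
	shared = 2;

	// ... but the block only sees the change to the __block variable
	return summary(); // @"captured=1 shared=2"
}
```

This is why backgroundTaskID needs the qualifier: the block is created before beginBackgroundTaskWithExpirationHandler: has returned the ID that the block is supposed to end.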

Update: Gavin McKenzie correctly pointed out to me that the above mentioned approach causes subsequent renames to have to wait for the entire async block to finish. This is why the public version of DTAsyncFileDeleter now has a second queue just for renames and performs the renames synchronously on it. Gavin explains it thus:

In the original implementation, additional callers would wait until all queued renames were finished — which is exactly what your code comment said. Which, could (in theory) mean that if 20 renames were queued, even the first rename operation would have to wait for the 20th queued rename to finish. That potential worried me, of having basically a block-wait on all the renames.

With the dispatch_sync approach each rename only has to wait for the previous rename to finish.
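My reading of that two-queue arrangement, as a sketch and not the actual DTFoundation source: the rename runs synchronously on its own serial queue and only the slow deletion is handed to the background. It assumes ARC, and all Demo names and queue labels are made up.

```objc
#import <Foundation/Foundation.h>

static dispatch_queue_t _demoRenameQueue;
static dispatch_queue_t _demoDeleteQueue;
static dispatch_group_t _demoDeleteGroup;

// Hypothetical function version of a two-queue removeItemAtPath:
static void DemoRemoveItemAtPath(NSString *path)
{
	static dispatch_once_t onceToken;
	dispatch_once(&onceToken, ^{
		_demoRenameQueue = dispatch_queue_create("DemoRenameQueue", NULL);
		_demoDeleteQueue = dispatch_queue_create("DemoDeleteQueue", NULL);
		_demoDeleteGroup = dispatch_group_create();
	});

	NSString *uniqueName = [[NSProcessInfo processInfo] globallyUniqueString];
	NSString *tmpPath = [NSTemporaryDirectory() stringByAppendingPathComponent:uniqueName];

	// the rename only has to wait for renames queued before it,
	// never for a deletion that is still in progress
	__block BOOL renamed = NO;
	dispatch_sync(_demoRenameQueue, ^{
		NSFileManager *fileManager = [[NSFileManager alloc] init];
		renamed = [fileManager moveItemAtPath:path toPath:tmpPath error:NULL];
	});

	if (!renamed)
	{
		// looks like the file is no longer there
		return;
	}

	// the slow part happens in the background
	dispatch_group_async(_demoDeleteGroup, _demoDeleteQueue, ^{
		NSFileManager *fileManager = [[NSFileManager alloc] init];
		[fileManager removeItemAtPath:tmpPath error:NULL];
	});
}
```

When the function returns, the item is already gone from its original path, even though its contents may still be sitting in the tmp folder for a while.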

Update 2: Steve Weller was the only one to spot the major problem with my approach and I needed some lab testing to understand why he was right. The wait was occurring on the main thread of the application. So if that was blocking then the watchdog timer would see your app as not responding after 10 minutes and kill it. So the finally perfect approach was to do away with the notification and instead wrap each block in its own background task; according to the documentation, the methods for beginning and ending a background task can be called from background threads safely. Please look at the DTAsyncFileDeleter class in DTFoundation for the latest version.
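A rough sketch of what wrapping a single deletion block in its own background task can look like. This is my illustration under the assumption of ARC and UIKit, not a copy of the DTFoundation code, and DemoDeleteInBackgroundTask is a hypothetical name.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper: queues the deletion of an already renamed item and
// keeps the app alive in the background until it is done.
static void DemoDeleteInBackgroundTask(dispatch_group_t group, dispatch_queue_t queue, NSString *tmpPath)
{
	dispatch_group_async(group, queue, ^{
		UIApplication *app = [UIApplication sharedApplication];
		__block UIBackgroundTaskIdentifier taskID = UIBackgroundTaskInvalid;

		// begin the task right here on the background queue
		taskID = [app beginBackgroundTaskWithExpirationHandler:^{
			// we ran out of background time, so give the task back
			[app endBackgroundTask:taskID];
			taskID = UIBackgroundTaskInvalid;
		}];

		NSFileManager *fileManager = [[NSFileManager alloc] init];
		[fileManager removeItemAtPath:tmpPath error:NULL];

		// normal completion, unless the expiration handler already ended the task
		if (taskID != UIBackgroundTaskInvalid)
		{
			[app endBackgroundTask:taskID];
			taskID = UIBackgroundTaskInvalid;
		}
	});
}
```

Nothing waits on the main thread here, so the watchdog has no reason to complain. Note that the expiration handler and the end of the block can run on different threads, so a production version may want to guard taskID more carefully than this sketch does.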

Conclusion

GCD offers us a possibility to stick multiple items into a background queue that we don’t want to happen concurrently. The background queue will then happily work off these items and if we have it in a group then we can also wait for the completion in a safe way.

dispatch_once is a safe and convenient method of making sure that something only occurs a single time and thus ideal to create caches or shared instances.

Finally – according to an Apple engineer who was asked this question at a TechTalk – you can have as many background task completions as you wish, provided you clean up properly. Because if you don’t then your app will be killed.

One question that I couldn’t find an answer for at the time of this writing is whether you can do away with the notification and instead “wrap the operations in a background task” from the get go. My feeling is that this would be less elegant because you would have to dispatch onto the main thread to get a background task ID and to invalidate it in the end. But I am interested if you know any more elegant solutions…

The source code for DTAsyncFileDeleter can be found in my DTFoundation project on GitHub.


Categories: Recipes
