
Avoiding Image Decompression Sickness

When starting to work on our iCatalog.framework I stumbled upon an annoying problem, the same one you will face if you ever need to work with large images. “Large” here means a resolution sufficient to cover the entire screen of an iPad, or potentially double that (horizontally and vertically) when dealing with Retina resolution on a future iPad.

Imagine you have a UIScrollView that displays UIImageViews for the individual pages of a catalog or magazine style app. As soon as even one pixel of the following page comes on screen you instantiate (or reuse) a UIImageView and pop it into the scroll view’s content area. That works quite well in the Simulator, but when you test this on the device you find that every time you page to the next page, there is a noticeable delay. This delay results from the fact that images need to be decompressed from their file incarnation to be rendered on screen. Unfortunately UIImage does this decompression at the latest possible moment, i.e. when the image is about to be displayed.

Since adding a new view to the view hierarchy has to occur on the main thread, so does the decompression and subsequent rendering of the image on screen. This is where the annoying stutter or pause stems from. You can see the same in App Store apps where scrolling through something stutters whenever a new image appears on screen.

You basically have two main choices for an image format on disk: JPEG and PNG. Apple generally recommends that you use PNGs as graphics for your user interface because when building your app those get optimized by an open source tool named pngcrush. This changes the PNGs such that they can be decompressed and rendered faster on iOS devices by making them easier to digest for the specific hardware. The first iPad magazines, like Wired, were using PNGs as the image format in which they transported the individual magazine pages, causing one edition of Wired to take upwards of 500 MB.

Just because PNGs are automatically optimized for you, that does not mean they are the apex of wisdom for any kind of purpose. That’s all nice and dandy for images that you can bundle with your app, but what about images that have to be downloaded over the internet? There are distinct advantages and disadvantages to both formats. Let’s review…

PNGs can have an alpha channel, JPEGs cannot. PNGs have lossless compression, JPEGs allow you to choose a quality of anywhere between 0 and 100%. So if you need the alpha channel – how transparent each pixel is – you are stuck with PNG. But if you don’t need a pixel-perfect rendition of your image then you can go with the perceptually optimized JPEG which basically omits information you don’t see anyway. For most kinds of images you can go with around 60-70% of quality without any visual artifacts spoiling the visual fidelity. If you have “sharp pixels”, like for text you might want to go higher, for photos you can choose a lower setting.

Looking at memory, an image takes up several chunks of it:

  1. space on disk or being transferred over the internet
  2. uncompressed space, typically width*height*4 Bytes (RGBA)
  3. when displayed in a view the view itself also needs space for the layer backing store

There’s an optimization possible for 1) because instead of copying the compressed bits into memory they can also be mapped there. NSData has the ability to pretend that some space on disk is in memory. When it is accessed you actually read the bytes from disk and not from RAM. CGImage is rumored to know by itself which loading strategy is more efficient. UIImage is basically just a wrapper around CGImage.
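To put a number on item 2 of the list above, here is a minimal sketch in plain C (which compiles unchanged inside an Objective-C file). The function names are made up for illustration, and it assumes exactly 4 bytes per pixel as stated in the list; a real bitmap may add some row padding on top.

```c
#include <stddef.h>

/* Bytes needed for the decoded bitmap, assuming 4 bytes (RGBA) per pixel. */
size_t decoded_bytes(size_t width, size_t height)
{
    return width * height * 4;
}

/* The same figure in megabytes (1024 * 1024 bytes), for easier reading. */
double decoded_megabytes(size_t width, size_t height)
{
    return (double)decoded_bytes(width, height) / (1024.0 * 1024.0);
}
```

For a full screen iPad image, decoded_bytes(1024, 768) gives 3,145,728 bytes, i.e. 3 MB. A Retina-sized 2048*1536 image needs four times that, 12 MB, no matter how small the file is on disk.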

Then there’s the question of “How fast can I get these pixels on screen?”. The answer to this is made up of 3 main time intervals:

  1. time to alloc/init the UIImage with data on disk
  2. time to decompress the bits into an uncompressed format
  3. time to transfer the uncompressed bits to a CGContext, potentially resizing, blending, anti-aliasing it

To be able to give a definitive answer to a scientific question, we need to take measurements. We need to benchmark.

Basic Assumptions and Ingredients

I made a benchmark app that I ran on several iOS devices I had handy. Since I also want to compare crushed versus non-crushed PNGs I needed to start with a couple of source images that Xcode would crush for me. It would have been nice to also try out different sizes dynamically, but at present I have no way of running pngcrush on the device. So I started with 5 resolutions of the same photo of a flower. Of course a true geek would have come up with multiple different images representing highly compressible, medium and non-compressible source material. But I could not be bothered. And please don’t get me started on sample size. TOO MUCH INFORMATION.

Please forgive odd outliers and the lack of resolution due to 1 ms being the minimum unit. We are not after a doctorate in image processing, but some practical information.

We are mostly interested in a general comparison of compression schemes and not specific niche cases. The benchmark includes 128*96, 256*192, 512*384, 1024*768 and 2048*1536 resolutions and we clocked crushed versus non-crushed PNGs, and JPEGs ranging from 10% to 100% quality in 10% increments. This benchmark was run on iPad 1 and 2, iPhone 3G, 3GS and 4. This should give us multiple measurements to evaluate performance between formats as well as between devices.

First, let’s look at raw device performance. Hardware descriptions from Wikipedia:

  • iPhone 3G: “Most of the iPhone 3G’s internal hardware were based on the original iPhone. It still included a Samsung 32-bit RISC ARM11 620 MHz processor (underclocked to 412 MHz), a PowerVR MBX Lite 3D GPU, and 128 MB of eDRAM.”
  • iPhone 3GS: “The iPhone 3GS is powered by the Samsung APL0298C05 chip, which was designed and manufactured by Samsung. This system-on-a-chip is composed of an ARM Cortex-A8 CPU core underclocked to 600 MHz (from 833 MHz), integrated with a PowerVR SGX 535 GPU. It also has 256 MB of eDRAM. The additional eDRAM supports increased performance and multi-tasking in iOS 4.”
  • iPhone 4: “The iPhone 4 is powered by the Apple A4 chip, which was designed by Intrinsity and, like all previous iPhone models, manufactured by Samsung. This system-on-a-chip is composed of an ARM Cortex-A8 CPU integrated with a PowerVR SGX 535 GPU. The Apple A4 is also used in the iPad where it is clocked at its rated speed of 1 GHz. The clock speed in the iPhone 4 has not been disclosed. All previous models of the iPhone have underclocked the CPU, which typically extends battery life and lowers heat dissipation.”
  • iPad: 1 GHz Apple A4 system-on-a-chip with 256 MB DDR RAM in chip package
  • iPad 2: 1 GHz (dynamically clocked) dual-core Apple A5 system-on-a-chip with 512 MB DDR2 RAM in chip package

Two of these devices have the same processor, the iPad 1 and iPhone 4. We see the major shift in performance at the same time when Apple moved to their own silicon. The (underclocked) A4 in the iPhone 4 is still twice as fast as the iPhone 3GS’s Samsung chip with its ARM Cortex-A8 CPU and PowerVR SGX 535 GPU.

Crushed by the Evidence

Total time for loading, decompressing and rendering a 90% JPEG at 1024*768:

  • iPhone 3G: 527 ms
  • iPhone 3GS: 134 ms
  • iPad: 79 ms
  • iPhone 4: 70 ms
  • iPad 2: 51 ms

Comparing crushed versus non-crushed PNG, also at 1024*768 (each line reads: crushed time – non-crushed time = time saved by crushing):

  • iPhone 3G: 866 ms – 1032 ms = 16% faster
  • iPhone 3GS: 249 ms – 458 ms = 46% faster
  • iPad: 130 ms – 256 ms = 49% faster
  • iPhone 4: 179 ms – 309 ms = 42% faster
  • iPad 2: 105 ms – 208 ms = 49% faster

On iPhone 3GS and above it is a good rule of thumb to say that crushed PNGs are twice as fast as uncrushed ones.
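The percentages above are simply the share of the non-crushed time that crushing saves. As a small sketch in plain C (the function names are my own invention, nothing from the benchmark app):

```c
/* Share of the non-crushed time that crushing saves, in percent. */
double percent_faster(double crushed_ms, double uncrushed_ms)
{
    return (uncrushed_ms - crushed_ms) / uncrushed_ms * 100.0;
}

/* How many times faster the crushed file is than the non-crushed one. */
double speedup_factor(double crushed_ms, double uncrushed_ms)
{
    return uncrushed_ms / crushed_ms;
}
```

For the iPad row, percent_faster(130, 256) comes to roughly 49, and speedup_factor(130, 256) lands just under 2, which is where the “twice as fast” rule of thumb comes from.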

Following are the charts for all 5 test resolutions. I have plotted alloc/init, decompress and drawing time needed separately. The bars are ordered front to back by speed, smaller bars mean faster. Charts courtesy of Christian Pfandler.

At first glance you can see that the relationship between compression schemes is more or less the same within one chart, i.e. one resolution. But I still present them all to you here for the sake of completeness.

 

The first two charts represent resolutions that you would encounter when making images to supplement your user interface. Ignoring the fact that old hardware is slow you can draw a mental line at around 20 ms. This is roughly the limit below which images are small enough to be drawn directly in a timely manner. Anything above this line means you either have a crappy old iPhone or you have to get the decompression off the main thread if you don’t want your app to be all stuttery.

The next three charts are of resolutions that, for the sake of analysis, we can deem as representative of phone full screen, pad full screen and pad Retina full screen.

 

Up to this point you might still survive with lots of UIImages and UIImageViews especially if you don’t care about fluidity on older devices. Though you see that you take a severe performance hit by using PNGs at this stage. At full screen phone resolution you definitely want to prefer JPEG.

 

At these resolutions we are way beyond the comfort zone. Even the fastest device can only decompress between 2 (iPad Retina) and 10 (iPad full screen) huge images per second.

Even if you get the decompressing off the main thread the drawing itself still takes some significant time. You want to have this done with CATiledLayer, split your images into tiles and cache the hell out of it.

 

Uniformly we can see that the red portion (decompress) is always the part that takes the most time. Drawing time is only dependent on the resolution, not the complexity of the compression scheme, which makes sense because at that point the game is won by points … pixels that is.

Generally a 100% JPEG is about equivalent to a crushed PNG. I can think of two reasons why you would choose that: a) you cannot dynamically create a crushed PNG on an iOS device, b) file size does not matter and you want to make sure you store a pixel-perfect replica.

Parallel to the file sizes (see below) you see how processing time increases linearly from JPG 10% to JPG 90% and goes upwards steeply from there.

An interesting observation is buried in the last chart. For an iPad Retina display to feel as responsive as we are used to on iPad 1 and 2, image decoding would need to be 3-4 times faster than on the iPad 2. From iPad 1 to iPad 2 it only doubled. There you have the reason why the iPad 2 cannot drive a Retina display just yet. It might not even be the next generation of hardware, but the one after it, that will be the first to manage that.

File Sizes

Let’s review the file sizes. Crushing PNGs makes them faster to render, but does it decrease file size?


Crushing PNGs reduces file size only for large images, and only by a minimal amount. The maximum quality of JPEG somewhat compares to crushed PNGs, but setting it to 100% defeats the purpose of compression. You can see that choosing 90% instead of 100% more than halves the file size for any resolution. JPEG file size increases linearly with quality but rises sharply above 90%.

Real Time?

When looking at the numbers for full screen images you see a problem that we’ve been having ever since Apple introduced the tablet form factor. A 70% JPEG at full screen resolution takes a whopping 75 ms to display on an iPad 1 and 49 ms on an iPad 2, nowhere near real time. That works out to about 13 fps and 20 fps respectively, way slower than the 60 fps you should be aiming for wherever possible, and well below the 30 frames per second that you need for movement to be perceived as smooth. It turns my stomach to see smooth scrolling get stuck for a 20th of a second when a new image enters the visible area and then jump to match your finger again.

If we could take the decompression out of the equation then the numbers would look far more promising: 17 – 18 ms just for the drawing equate to about 55 fps, smooth as butter. Interestingly the iPads are not so much different in just blasting pixels onto layers, it’s the hardware accelerated decompression that makes the big difference.
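The frame rates quoted in this section follow directly from the measured milliseconds. Here is a tiny sketch of that conversion in plain C; the function names are purely illustrative:

```c
/* Frames per second you can reach if every frame takes frame_ms milliseconds. */
double frames_per_second(double frame_ms)
{
    return 1000.0 / frame_ms;
}

/* Time budget per frame, in milliseconds, for a given target frame rate. */
double frame_budget_ms(double target_fps)
{
    return 1000.0 / target_fps;
}
```

With the measurements from above, 75 ms per image gives about 13 fps and 49 ms about 20 fps, while 18 ms of pure drawing gives about 55 fps. For comparison, frame_budget_ms(60) is roughly 16.7 ms, so even drawing alone only just misses the 60 fps target.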

One easy way around this was for me to simply draw the catalog pages in iCatalog on a CATiledLayer with fading disabled. This causes the image display to occur on the background thread used for the tiling, with no impact on the scrolling performance. The only thing you might notice is that if you scroll quickly to the right, pages might pop into sight after a brief delay. One disadvantage of this approach is that it is hard to get transitions between portrait and landscape orientations to look nice.

The other, albeit more advanced, method would be to force decompression on images just in time before they are needed.

Forceful Decompression

The first time you need an image’s pixels iOS decompresses it. Usually this decompressed version sticks around, RAM permitting. So it makes very little sense to decompress an image by means of rendering it into a new one. This only has you end up with TWO uncompressed images, at least for a short time. It is sufficient to “just pretend”:

// Forces decompression by drawing the image into a throwaway 1x1 context.
- (void)decompressImage:(UIImage *)image
{
	UIGraphicsBeginImageContext(CGSizeMake(1, 1));
	[image drawAtPoint:CGPointZero];
	UIGraphicsEndImageContext();
}

This causes the image to decompress, even though the image context is just 1 pixel large.

Strangely I could not consistently get the UIImage to keep the decompressed version around if it was just created via initWithContentsOfFile. Instead I had to use the ImageIO framework (available as of iOS 4) which provides an option to explicitly specify that you want to keep the decompressed version:

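// Needs #import <ImageIO/ImageIO.h>; url is assumed to be the NSURL of the image file.
// Ask ImageIO to keep the decoded pixels cached once they have been created.
NSDictionary *dict = [NSDictionary dictionaryWithObject:[NSNumber numberWithBool:YES]
                      forKey:(id)kCGImageSourceShouldCache];
 
CGImageSourceRef source = CGImageSourceCreateWithURL((CFURLRef)url, NULL);
CGImageRef cgImage = CGImageSourceCreateImageAtIndex(source, 0, (CFDictionaryRef)dict);
 
// UIImage retains the CGImage, so our own references can be released afterwards.
UIImage *retImage = [UIImage imageWithCGImage:cgImage];
CGImageRelease(cgImage);
CFRelease(source);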

Initializing the images like this I could see the times drop, indicating that decompression only takes place once. The first call to decompress took a long time, the second no time at all. The magic code word is kCGImageSourceShouldCache, which you can specify either when creating the CGImageSource or when calling CGImageSourceCreateImageAtIndex. According to the header this means:

Specifies whether the image should be cached in a decoded form. The value of this key must be a CFBooleanRef; the default value is kCFBooleanFalse.

If I set it to NO then I could see the drawing time also grow by the decoding time. If I set it to YES the decoding would only be happening once.

Conclusion

If you absolutely need an alpha channel or have to go with PNGs then it is advisable to install the pngcrush tool on your web server and have it process all your PNGs. In almost all other cases high quality JPEGs combine smaller file sizes (i.e. faster transmission) with faster decompression and rendering.

It turns out that PNGs are great for small images that you would use for UI elements, but they are not reasonable to use for any full screen applications like catalogues or magazines. There you would want to choose a compression quality between 60 and 80% depending on your source material.

In terms of getting it all to display you will want to hang onto UIImage instances from which you have drawn once because those have a cached uncompressed version of the file in them. And where you don’t want the visual pause for a large image to appear on screen you will have to force decompression for a couple of images in advance. But bear in mind that these will take large amounts of RAM, and if you overdo it that might cause your app to be terminated. NSCache is a great place to keep frequently used images because it automatically takes care of evicting them when RAM becomes scarce.
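To get a feeling for how quickly preloading eats RAM, here is a rough sketch in plain C. Both function names and the 20 MB budget used in the example are assumptions picked for illustration, not measured limits, and the cost assumes 4 bytes per pixel for the decoded image.

```c
#include <stddef.h>

/* Decoded size of one image, assuming 4 bytes (RGBA) per pixel. */
size_t decoded_image_cost(size_t width, size_t height)
{
    return width * height * 4;
}

/* How many decoded images of the given size fit into budget_bytes. */
size_t images_within_budget(size_t budget_bytes, size_t width, size_t height)
{
    size_t cost = decoded_image_cost(width, height);
    return cost == 0 ? 0 : budget_bytes / cost;
}
```

With a hypothetical budget of 20 MB, images_within_budget(20 * 1024 * 1024, 1024, 768) is 6 full screen iPad pages, but only 1 page at 2048*1536. The same per-image figure could also serve as the cost you pass to NSCache’s setObject:forKey:cost: so that the cache has something to base its eviction on.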

It is unfortunate that we don’t have any way to know whether an image still needs decompressing. Also, an image might have evicted its uncompressed version without informing us. That might be a good Radar to raise at Apple’s bug reporting site. But fortunately accessing the image as shown above takes no time if the image is already decompressed. So you could just do that not only “just in time” but also “just in case”.


