
iPhone 5 Image Decompression Benchmarked

One of the first lucky owners of the iPhone 5, David Smith, kindly ran my Image Decompression Benchmark on the latest 3 generations of iPhone. These benchmarks measure the time it takes for an image to get from disk to screen and encompass 5 resolutions, PNG crushed and uncrushed, as well as 10 compression levels of JPEG. Christian Pfandler prepared the charts for us.

We like to repeat the same benchmarks on every new CPU that Apple solders into its devices. You can read the past analyses covering iPhone 3G through iPad 2 and the iPad 3. One note of caution if you want to compare those results with the ones in this article: we changed the method of taking the times from NSLog to CFAbsoluteTime. NSLog itself takes up to 50 ms per logged statement. The new method is more exact because the logging time is no longer included in the measurements.
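To illustrate the difference in methodology, here is a minimal sketch of timing a block of work with `CFAbsoluteTimeGetCurrent()` and logging only after the measurement is finished. It is not the benchmark's actual code: the helper name `measureMilliseconds` and the summing loop are hypothetical stand-ins for the real image-loading work.

```swift
import Foundation

/// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
/// Hypothetical helper, not taken from the benchmark project.
func measureMilliseconds(_ work: () -> Void) -> Double {
    let start = CFAbsoluteTimeGetCurrent()   // seconds as a Double
    work()
    let end = CFAbsoluteTimeGetCurrent()
    return (end - start) * 1000.0            // convert seconds to milliseconds
}

// Stand-in workload; the real benchmark would load and draw an image here.
var checksum = 0
let elapsed = measureMilliseconds {
    for i in 0..<100_000 {
        checksum += i
    }
}

// Logging happens after the timestamps are taken,
// so its cost is not part of the measured interval.
print("elapsed: \(elapsed) ms, checksum: \(checksum)")
```

The point of the design is that nothing slow sits between the two timestamps except the work being measured. Keep in mind that `CFAbsoluteTime` follows the system clock, so a clock adjustment during a run would distort a sample.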

Executive Summary: the iPhone 5 can indeed be claimed to be twice as fast as its predecessor.

Running this benchmark on earlier iOS devices had shown a pattern: increases in GPU power had no relevant impact on the numbers. At the same time the rendering speed improvements seemed to be entirely a function of the CPU. This theory was confirmed by an Apple engineer at WWDC who told me that all UIKit image decompression indeed happens on the CPU.

The only way to get images decoded on the GPU would be via CoreImage. The problem is that, while image decoding would be faster there, you still have the bottleneck of getting the decoded image transferred to a CALayer and back into the render tree. So decoding images via CoreImage probably only makes sense if you can keep them on the GPU, for example for video compositing or for use as 3D textures.

Even knowing that this benchmark only looks at the CPU, we still think that it is a valid method for evaluating overall CPU performance from one iOS device generation to the next.

iPhone 5 Results

Small image sizes do not show much of a difference for JPGs, though PNGs (both crushed and uncrushed) show a definite improvement there.

The blue area, which measures the alloc/init of a UIImage with the corresponding test image, seems to be about constant. The reason is probably that the SSD in the iPhone has not increased much in throughput.

The green parts measure the time it takes to draw the images into a bitmap context of identical size, which avoids skewing the numbers by adding the need to resize the image. There was a bit of an improvement there from the iPhone 4 to the 4S. For the two smallest image sizes the potentially increased memory bandwidth from CPU to GPU does not yet show here.

512×384 is the first time that we really see an effect of better throughput from the CPU to the GPU with the green parts being noticeably shorter. Uncrushed PNGs are way quicker to decompress. For the first time we see PNGs of this size clock in slightly faster than 100% JPGs.

The remarkable increase in PNG decoding performance again beats 100% JPGs. Across the board the time to decompress and render on the iPhone 5 is equal to the decompression time alone on the iPhone 4S and 4.

PNGs have always been the slowest at higher resolutions, which is why in my original benchmark article we concluded that PNGs are great for small UI elements and icons. For full screen catalogs, however, we went with 80% JPGs because of the overall speed benefit.

Out of all these numbers my personal favorite is to compare the highest resolution on all three devices:

iPhone 4: Flower_2048x1536.png (JPG 80%) init: 4 ms decompress: 168 ms draw: 76 ms total 248 ms

iPhone 4S: Flower_2048x1536.png (JPG 80%) init: 2 ms decompress: 160 ms draw: 72 ms total 234 ms

iPhone 5: Flower_2048x1536.png (JPG 80%) init: 3 ms decompress: 91 ms draw: 31 ms total 124 ms

At this level you can easily see the 2x improvement the CPU has over the previous generation. JPEGs from 10% to 90% all prove this point. In contrast to the iPad 3 numbers, both PNG and JPEG performance benefit. If you remember, when we benchmarked the iPad 3 there was almost no speed improvement for JPEGs and only some for PNGs.
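To put numbers on the 2x claim, the three log lines for the 2048x1536 JPG at 80% can be turned into speedup ratios. The sketch below only does arithmetic on the values quoted above; the `Sample` type and the `speedup` function are hypothetical names introduced for illustration.

```swift
import Foundation

/// One logged result, all times in milliseconds (values copied from the log lines).
struct Sample {
    let device: String
    let initMs: Double
    let decompressMs: Double
    let drawMs: Double
    let totalMs: Double   // total as logged; phase sums may differ by 1 ms due to rounding
}

let samples = [
    Sample(device: "iPhone 4",  initMs: 4, decompressMs: 168, drawMs: 76, totalMs: 248),
    Sample(device: "iPhone 4S", initMs: 2, decompressMs: 160, drawMs: 72, totalMs: 234),
    Sample(device: "iPhone 5",  initMs: 3, decompressMs: 91,  drawMs: 31, totalMs: 124),
]

/// Speedup factor: how many times faster the new time is than the old one.
func speedup(from old: Double, to new: Double) -> Double {
    return old / new
}

let iPhone5 = samples[2]
for older in samples.dropLast() {
    let total = speedup(from: older.totalMs, to: iPhone5.totalMs)
    let decompress = speedup(from: older.decompressMs, to: iPhone5.decompressMs)
    let draw = speedup(from: older.drawMs, to: iPhone5.drawMs)
    print(String(format: "%@ to iPhone 5: total %.2fx, decompress %.2fx, draw %.2fx",
                 older.device, total, decompress, draw))
}
```

On the logged totals this works out to exactly 2.0x over the iPhone 4 and roughly 1.89x over the 4S. The draw phase improves the most, at about 2.3x over the 4S, while decompression improves by about 1.76x.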

With the iPhone 5, the improvement for decoding PNGs is just as remarkable: crushed a little less, uncrushed a little more. On the iPhone 5 the difference between crushed and uncrushed PNGs is barely noticeable, but for earlier devices it still pays to have Xcode automatically crush the PNGs when building apps.

Conclusion

The A6 processor in the iPhone 5 supports VFPv4, a special set of instructions to highly parallelize floating point operations. VFP is short for “Vector Floating Point”. The presence of these instructions leads Anandtech to conclude that it must be the first SoC entirely designed by Apple in-house, albeit based on the ARMv7 architecture.

If I’m reading this right then the improved VFP performance comes from having 32 registers in there, twice as many as the VFPv3 previously had. This simply means that twice as many floating point numbers can be crunched in parallel, explaining the doubling of floating point performance on the CPU. This “going wider” yields its benefit at roughly the same battery usage.

The custom silicon that Apple invented for the iPhone 5 lets it easily win by a factor of 2 over previous generation devices.

