Our DNA is written in Swift

UITextView Caught With Trousers Down

I had begun development on DTRichTextEditor a few months before WWDC 2012. That was when Apple announced that UIKit would support attributed strings beginning with iOS 6. Three classes, to be exact, as a search of the documentation for the attributedText property shows: UILabel, UITextField and UITextView.

Back then I was hesitant to concede that Apple had sherlocked my open source work in DTCoreText. Yes, there were a few formatting styles now supported, but the initWithHTML initializer that exists on Mac was still nowhere to be seen on iOS. At the very least, people would still be able to bridge the gap from HTML to NSAttributedString with my classes.

As I dove more into making DTCoreText compatible with new attributes used by iOS 6 I found the approach that Apple chose to take quite limited and extremely incomplete.

New Attributes

The main difference between NSAttributedString on iOS 6 and CoreFoundation-level CoreText is that several classes that were previously only CF types are now proper NSObjects. CTParagraphStyle is matched by NSParagraphStyle. CTFont is replaced by a new hybrid UICFFont, which looks like UIFont to us. Colors that were CGColor references before are now UIColors.

Unfortunately some attributes use exactly the same key strings internally, which causes nasty crashes if you try to use those on the three UIKit classes. For example, an attributed string which uses a CTFont is incompatible because UIKit expects a UIFont as the value for the attribute with the same key string.
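To make the clash concrete, here is a minimal sketch of the two flavors of font attribute. It assumes ARC; the function name and its out parameters are made up for illustration, and the claim that both constants resolve to the same key string is taken from the observation above rather than from any documentation.

```objc
#import <UIKit/UIKit.h>
#import <CoreText/CoreText.h>

// Hypothetical helper: builds the same text once with a CoreText font
// and once with a UIKit font.
static void MakeFontExamples(NSAttributedString **coreTextOut, NSAttributedString **uiKitOut)
{
    NSString *text = @"Hello World";

    // Classic CoreText flavor: the value stored under the font key is a CTFontRef
    CTFontRef ctFont = CTFontCreateWithName(CFSTR("Helvetica"), 14.0, NULL);
    NSDictionary *coreTextAttributes = @{(__bridge NSString *)kCTFontAttributeName: (__bridge id)ctFont};
    *coreTextOut = [[NSAttributedString alloc] initWithString:text attributes:coreTextAttributes];
    CFRelease(ctFont); // the attributes dictionary keeps its own reference

    // iOS 6 flavor: the value is a UIFont
    NSDictionary *uiKitAttributes = @{NSFontAttributeName: [UIFont systemFontOfSize:14.0]};
    *uiKitOut = [[NSAttributedString alloc] initWithString:text attributes:uiKitAttributes];
}
```

Only the second string is safe to assign to the attributedText of a UILabel, UITextField or UITextView. The first one is meant for CoreText drawing.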

And the unfortunate discoveries don’t end here. Apple only brought over a part of the properties of NSParagraphStyle. On the Mac you have access to tab stops, lists and text blocks. Internally those are present even on iOS as evidenced by the description of NSParagraphStyle objects. But we have no public method for setting these. This poses a bit of a problem because we need access to the tab stops if we want to get the positioning of list bullets correct.

DTCoreText is iOS 6 Compatible

Nevertheless I upgraded DTCoreText by popular demand to be able to generate iOS 6 compatible attributed strings. If you pass the new DTUseiOS6Attributes option (with an NSNumber’d YES) to -initWithHTMLData:options:documentAttributes: then whenever the string builder has a choice between a traditional and an iOS 6 compatible attribute it will pick the latter.
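A minimal sketch of how this could look in practice. The option key and the initializer are the ones named above; the function name, the HTML snippet and the umbrella header import are assumptions for illustration.

```objc
#import <UIKit/UIKit.h>
#import "DTCoreText.h" // assumed umbrella header of DTCoreText

// Hypothetical helper: parses a bit of HTML and shows it in a stock UITextView.
static void ShowHTMLInTextView(UITextView *textView)
{
    NSData *data = [@"<p>Hello <b>HTML</b> World!</p>" dataUsingEncoding:NSUTF8StringEncoding];

    // ask the string builder to prefer iOS 6 compatible attributes
    NSDictionary *options = @{DTUseiOS6Attributes: @YES};

    NSAttributedString *string = [[NSAttributedString alloc] initWithHTMLData:data
                                                                      options:options
                                                           documentAttributes:NULL];
    textView.attributedText = string;
}
```

Without the option you get the classic CoreText attributes, which are what DTAttributedTextContentView expects.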

I updated the demo app with an additional “iOS 6” tab which shows the demo snippet in a UITextView. The “View” tab displays the text in a DTAttributedTextContentView as it always has, using the classic CoreText attributes (plus a few of my own).

[Screenshots: Screen Shot 2012-12-15 at 21.46.26, Screen Shot 2012-12-15 at 21.46.41]

As you can see from these two screenshots, we have all the basics working, even CSS shadows, although only one shadow is supported by UITextView at a time.

The list of missing functionality is long. Besides the missing tabs making lists slightly awkward, there is no support for NSTextAttachment. This is how images are embedded in attributed strings on Mac. In DTCoreText we are using run delegates as a workaround to reserve sufficient space to manually add a subview for images or videos. Run delegates are also sorely missing in UIKit.

It especially hurts to see hyperlinks missing as well. UITextView supports hyperlink detection. If you turn this on then you get hyperlinks for text that looks like URLs, but if the user taps on one of these they leave the app. Also you have no way to show a label that differs from the actual internet address behind the link. It would have been so simple … on Mac there is an attribute that holds an NSURL. Apple could have done the same on iOS and displayed a link for text ranges with this attribute. An additional UITextView delegate method to respond to the user tapping the link would have been nice as well.

What’s Really Going On

You might have heard this elsewhere before: internally UITextView is just a web view. When I got started working on the iOS 6 compatibility mode for DTCoreText I caught UITextView with its trousers down, peekaboo!

UITextView generating HTML

This is the stack trace you get if you try to set an attributed string with a CTFont on a UITextView. You can clearly see that the setAttributedText: is trying to convert the attributed string to HTML. The CTFont is missing a certain property that UIFont has, triggering an unrecognized selector exception. This crash was truly eye-opening.

Until this point I was assuming that UITextView worked similarly to DTAttributedTextContentView, in that it would draw NSAttributedStrings directly. But Apple took a shortcut: instead of adding drawing of attributed strings to UIKit, they brought the private NSHTMLWriter class to iOS. This is used to create HTML that the WebKit view inside of UITextView is able to display.

UITextView has a private setHTMLString: method, unfortunately off limits. If we look closer with recursiveDescription we see the beautifully simple structure:

Screen Shot 2012-12-15 at 22.20.55

UITextView itself is a subclass of UIScrollView, hence the two image views which represent the scroll indicators. The content view is a UIWebDocumentView, which is where the WebKit magic supposedly occurs. Finally there is a UITextSelectionView which, as the name suggests, probably has to do with displaying text selections.
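If you want to reproduce such a dump yourself, a sketch along these lines should do. Note that recursiveDescription is private API, so this is for debugging only, and the function name is made up for illustration.

```objc
#import <UIKit/UIKit.h>

// Hypothetical debugging helper: logs the subview hierarchy of a text view.
// recursiveDescription is undocumented, so never ship this in an app.
static void LogViewHierarchy(UITextView *textView)
{
    SEL selector = NSSelectorFromString(@"recursiveDescription");

    if ([textView respondsToSelector:selector])
    {
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Warc-performSelector-leaks"
        NSLog(@"%@", [textView performSelector:selector]);
#pragma clang diagnostic pop
    }
}
```

In the debugger, po [textView recursiveDescription] gives you the same output without any code changes.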

It might or might not be a Webkit derivative, but it is quite clear that Apple would not have named a view that displays attributed strings UIWebDocumentView. “Web Document” is a longer name for HTML.

Giving NSHTMLWriter a Spin

I googled for the class name and found quite a few GitHub repositories featuring headers of private classes. I used the one that Nicolas Seriot has on his GitHub space. Class dumping also outputs instance variables and private methods. Omitting these, this is what the “public” interface looks like:

@interface NSHTMLWriter : NSObject
- (id)initWithAttributedString:(id)arg1;
- (void)setDocumentAttributes:(id)arg1;
- (id)documentFragmentForDocument:(id)arg1;
- (void)readDocumentFragment:(id)arg1;
- (id)webArchiveData;
- (id)webArchive;
- (id)subresources;
- (id)HTMLFileWrapper;
- (id)HTMLData;
@end

The web archive stuff is probably used for copy/paste support. BTW: I have a project for working with web archives, too.

Using the obviously named initializer and assuming that HTMLData is the HTML data in default UTF-8 encoding, we can concoct a quick sample to see what the generated HTML looks like:

NSDictionary *fontAttr = @{NSFontAttributeName: [UIFont systemFontOfSize:14.0]};
NSMutableAttributedString *string = [[NSMutableAttributedString alloc] initWithString:@"123\nHello HTML World!\nNew Paragraph\nMultiple Spaces->      <-until here" attributes:fontAttr];
// make HTML red
[string addAttribute:NSForegroundColorAttributeName value:[UIColor redColor] range:NSMakeRange(10, 4)];
// make Hello bold
[string addAttribute:NSFontAttributeName value:[UIFont boldSystemFontOfSize:14.0] range:NSMakeRange(4, 5)];
// make World italic
[string addAttribute:NSFontAttributeName value:[UIFont italicSystemFontOfSize:14.0] range:NSMakeRange(15, 5)];
NSHTMLWriter *writer = [[NSHTMLWriter alloc] initWithAttributedString:string];
NSData *data = [writer HTMLData];
NSString *htmlString = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
NSLog(@"%@", htmlString);

By now I was bursting with anticipation because NSHTMLWriter does something that I’ve been meaning to add to DTCoreText for a long time: it has a dedicated class for assembling HTML. My current solution is much more crude: -htmlString works by appending to a mutable string.

This is what the above sample outputs:

Screen Shot 2012-12-15 at 22.49.09

We get an HTML4 Strict DTD and a UTF-8 content type, and the writer identifies itself as “Cocoa HTML Writer”. All the styles are grouped together in a style block as the last element of the HTML header.

The first occurrence of a combination of styles is responsible for generating a span style; subsequent occurrences of an identical style are then simply mapped to this first occurrence. Ingenious I must say, this can potentially eliminate a great deal of the redundancy that my -htmlString currently outputs.
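The idea of mapping identical styles to a single class can be sketched in a few lines. This is not how NSHTMLWriter is implemented, which we cannot see. The class StyleRegistry, its method names and the s1, s2, … naming scheme are all assumptions for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: hands out one CSS class name per distinct style string.
@interface StyleRegistry : NSObject
- (NSString *)classNameForStyle:(NSString *)css;
- (NSString *)styleBlock;
@end

@implementation StyleRegistry
{
    NSMutableDictionary *_classNamesByStyle; // style string -> class name
    NSMutableArray *_stylesInOrder;          // styles in order of first occurrence
}

- (id)init
{
    self = [super init];
    if (self)
    {
        _classNamesByStyle = [NSMutableDictionary dictionary];
        _stylesInOrder = [NSMutableArray array];
    }
    return self;
}

- (NSString *)classNameForStyle:(NSString *)css
{
    NSString *name = _classNamesByStyle[css];
    if (!name)
    {
        // first occurrence: register a new class for this style
        [_stylesInOrder addObject:css];
        name = [NSString stringWithFormat:@"s%lu", (unsigned long)[_stylesInOrder count]];
        _classNamesByStyle[css] = name;
    }
    return name; // later occurrences reuse the registered class
}

- (NSString *)styleBlock
{
    // emit all registered styles as one block for the HTML header
    NSMutableString *block = [NSMutableString stringWithString:@"<style type=\"text/css\">\n"];
    for (NSString *css in _stylesInOrder)
    {
        [block appendFormat:@"span.%@ {%@}\n", _classNamesByStyle[css], css];
    }
    [block appendString:@"</style>"];
    return block;
}
@end
```

While walking the attribute runs you would ask the registry for a class name per run and write only <span class="…"> into the body. The style block is emitted once at the end, when all styles are known.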

Each \n creates – of course – a new paragraph. There is one more technique that I find eye-opening here.

Converting between HTML and NSAttributedString also poses a problem when encountering multiple spaces next to each other. HTML generally ignores these and shows only a single space between words, whether there are several spaces or a new line. Unless it is inside a CDATA section or a PRE element, whitespace is generally compressed.

Apple has a cute technique to deal with such a scenario. Multiple spaces (code 32) are enclosed with a span of class “Apple-converted-space”. Then every other space is replaced with a code 160 space which is a Unicode NO-BREAK SPACE. This way it would still break, but only at every normal space. You don’t get a range that fully prevents line wrapping. Ingenious!
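Here is a rough sketch of that technique as I understand it from the output. The function name is hypothetical and which space of a run gets swapped first is an assumption. HTML entity escaping is left out to keep it short.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: wraps runs of two or more spaces in an
// "Apple-converted-space" span and replaces every other space in the run
// with U+00A0 NO-BREAK SPACE. Single spaces and other text pass through.
static NSString *HTMLByConvertingSpaceRuns(NSString *text)
{
    NSString *noBreakSpace = [NSString stringWithFormat:@"%C", (unichar)0x00A0];
    NSMutableString *html = [NSMutableString string];
    NSUInteger length = [text length];
    NSUInteger i = 0;

    while (i < length)
    {
        // find the end of the current stretch of spaces or of non-spaces
        NSUInteger start = i;
        BOOL isSpace = ([text characterAtIndex:i] == ' ');
        while (i < length && ([text characterAtIndex:i] == ' ') == isSpace)
        {
            i++;
        }
        NSUInteger runLength = i - start;

        if (!isSpace || runLength == 1)
        {
            // nothing to protect, copy verbatim
            [html appendString:[text substringWithRange:NSMakeRange(start, runLength)]];
            continue;
        }

        [html appendString:@"<span class=\"Apple-converted-space\">"];
        for (NSUInteger k = 0; k < runLength; k++)
        {
            // assumption: even positions become no-break spaces
            [html appendString:(k % 2 == 0) ? noBreakSpace : @" "];
        }
        [html appendString:@"</span>"];
    }

    return html;
}
```

Because no two normal spaces are ever adjacent in the result, the browser has nothing left to collapse, yet every second position remains a legal line break opportunity.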

I’m a bit torn over this technique of brute-force style categorizing. I was imagining something more elegant that would reduce repetitions to a minimum. Like if the entire document is only using Helvetica, then I would like to see the only reference for it to be in the outermost container tag, i.e. BODY. But then again, how many different styles (including italic and bold) would you really find on the kinds of text we would want to display in a UITextView? Half a dozen perhaps? A dozen?


At this time there are only very few scenarios where I can imagine somebody using attributed strings with UIKit views. Only if you have no images, no hyperlinks and no lists can I imagine this to be of value. Possibly if you have some sort of eBook reader style app. A rich text UITextView gives you text selection, copy and define context options, even basic editing capability.

The current solution leaves much to be desired; to use a modern expression, it is “half assed” to say the least. They left out the important features that would have let us work around some of the missing pieces, most painfully tab stops and run delegates.

Worst of all, there are tons of bugs resulting from the weird way of converting attributed strings to HTML data. If you want to set a line height, this is broken as soon as you have more than one font. Kerning is also supposed to be supported, but it is broken, too. With the sample code above you are now able to put your finger on the true culprit: NSHTMLWriter.

About time we file tons of Radars…



  1. What’s the performance of this compared to UILabel with NSAttributedString?

  2. What is “this” in this question? UITextView has always been a web view internally.