
An editor has to deal with the user hitting the tab key on an external keyboard, and it has to be able to persist these tabs. So the question arose of how best to represent tab characters (\t) in HTML. At first I tried to encode them as &#9; entities, but that caused a lot of trouble: on the parsing end it is difficult to know whether a tab came from this entity or from a literal \t.
I could have told the two apart with a very ugly hack of libxml2 (which powers my DTHTMLParser), but after wasting half a day on this I relented. I previously reported my findings about Apple-converted-space, which is the method NSHTMLWriter uses to preserve multiple spaces.
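To illustrate why the entity approach is ambiguous, here is a minimal sketch. It uses Python's standard html.parser rather than libxml2 or DTHTMLParser, so it only demonstrates the general behavior of an HTML parser and is not the code from my project. The names TextCollector and parsed_text are made up for this example.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: collects character data after entity decoding."""

    def __init__(self):
        # convert_charrefs=True (the default) decodes references such as &#9;
        # before the text is handed to handle_data.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parsed_text(markup):
    """Return the text content of the markup as the parser reports it."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


with_entity = parsed_text("<p>a&#9;b</p>")   # tab written as an entity
with_literal = parsed_text("<p>a\tb</p>")    # tab written literally

# Both inputs yield the same string, so the origin of the tab is lost.
print(with_entity == with_literal)  # prints True
```

Once the parser has decoded the character reference, the text handed to the application is identical in both cases, which is why the distinction would have to be recovered inside the parser itself.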
In this article I document my findings on how Apple preserves tabs in HTML output.