Daniel has started implementing a new text block object, which will offer a much richer text tool, based on the
Common.Text project that I developed this summer. Everything is moving very quickly: in about a week, Daniel has built a first version of the new text object, version II (single frame, limited features).
Common.Text supports advanced typographic features, such as ligatures (standard fi, ff, fl but also what OpenType calls
discretionary ligatures), kerning (adjusting the distance between character pairs, such as A and W, so that they look better spaced) and glyph variants (replacing one character glyph with another depending on the context). It also features a complex property system, which associates every character with a set of properties (font, size, leading, margins, kerning, color, underlines, etc.) and cascadable styles.
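To make the idea of cascadable styles concrete, here is a minimal Python sketch. It is not the Common.Text API: the `Style` class, the `properties_at` helper and the run representation are hypothetical names chosen for illustration, and the real engine most likely stores and resolves its properties differently.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Style:
    """Hypothetical style: a bag of properties plus an optional parent style."""
    properties: dict = field(default_factory=dict)
    parent: Optional["Style"] = None

    def resolve(self) -> dict:
        # Start from the parent's resolved properties, then let this
        # style's own values override them (the "cascade").
        resolved = self.parent.resolve() if self.parent else {}
        resolved.update(self.properties)
        return resolved


def properties_at(runs, index):
    """Return the resolved properties of the character at `index`.

    `runs` is a list of (start, end, style) tuples with half-open ranges;
    this representation is an assumption made for the example.
    """
    for start, end, style in runs:
        if start <= index < end:
            return style.resolve()
    raise IndexError(index)


base = Style({"font": "Arial", "size": 12.0, "color": "black"})
emphasis = Style({"color": "red", "underline": True}, parent=base)

# Characters 0..4 use the base style, characters 5..9 the derived one.
runs = [(0, 5, base), (5, 10, emphasis)]
```

Here `properties_at(runs, 7)` inherits the font and size from `base` while taking the color and underline from `emphasis`.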
The text engine knows how to lay out text into multiple, possibly non-rectangular frames. It operates in several passes.
- A first pass, done directly when the text gets inserted into the internal buffer, flags the positions in the text where there are possible breaks, based on the Unicode line break algorithm and on a language-based hyphenation algorithm (currently, we support only the French hyphenation rules).
- A second pass measures the width of every line and decides which break gives the best result. Currently, this decision is based on a single line, but ideally the badness should be computed on a paragraph or page basis, as Knuth's TeX algorithm does. This would give better results, since such an algorithm does not optimize one line at a time and therefore does not get stuck in local optima.
- The rendering pass is responsible for the justification. Depending on a line's disposition and justification parameters (both are defined in the 0..1 range), the excess space gets inserted at the end, at the beginning or at both ends of the line, possibly scaling existing spaces and even the glyphs (by a small amount).
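The second and third passes can be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the engine's code: the quadratic badness measure, the function names, and the exact meaning given to the two parameters (justification as the share of excess space absorbed by stretching spaces, disposition as the split of the remainder between the beginning and the end of the line) are guesses. Glyph scaling is left out.

```python
def line_badness(width, frame_width):
    """Assumed badness: squared leftover space, infinite if the line overflows."""
    if width > frame_width:
        return float("inf")
    return (frame_width - width) ** 2


def choose_break(candidates, frame_width):
    """Pick the best break for a single line.

    `candidates` is a list of (break_position, line_width) pairs, one per
    possible break flagged by the first pass. If nothing fits, the first
    candidate is returned and the line overflows.
    """
    return min(candidates, key=lambda c: line_badness(c[1], frame_width))


def distribute_excess(line_width, frame_width, space_count,
                      disposition, justification):
    """Split the excess space of a line (both parameters are in 0..1)."""
    excess = max(frame_width - line_width, 0.0)
    # Share of the excess absorbed by widening the existing spaces.
    stretched = excess * justification if space_count > 0 else 0.0
    remaining = excess - stretched
    return {
        "left_offset": remaining * disposition,
        "extra_per_space": stretched / space_count if space_count else 0.0,
        "right_gap": remaining * (1.0 - disposition),
    }
```

Under these assumptions, a disposition of 0.5 with a justification of 0 centers the line, while a justification of 1 spreads all the excess over the spaces and yields a fully justified line.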
To improve performance, the engine caches the possible break positions and associates advanced line break information with each paragraph, including the ideal width of every line, its position in the frame, etc. Redrawing a text block can be very fast thanks to this cached information.
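One way to picture this per-paragraph caching is the small Python sketch below. The `LineInfo` and `ParagraphLayoutCache` names, the cache key and the `toy_layout` function (which pretends every character is one unit wide) are all hypothetical; they only show why a redraw can skip the line-fitting work when neither the text nor the frame width has changed.

```python
from dataclasses import dataclass


@dataclass
class LineInfo:
    """Hypothetical cached record for one laid-out line."""
    start: int          # index of the first character of the line
    end: int            # index just past the last character
    ideal_width: float  # natural width before justification
    y: float            # vertical position in the frame


class ParagraphLayoutCache:
    def __init__(self, compute_layout):
        self._compute_layout = compute_layout
        self._entries = {}
        self.misses = 0  # counts how often the layout had to be recomputed

    def lines_for(self, paragraph_id, text, frame_width):
        entry = self._entries.get(paragraph_id)
        if entry is not None and entry[0] == (text, frame_width):
            return entry[1]  # cache hit: reuse the stored line information
        self.misses += 1
        lines = self._compute_layout(text, frame_width)
        self._entries[paragraph_id] = ((text, frame_width), lines)
        return lines


def toy_layout(text, frame_width, line_height=12.0):
    """Stand-in layout: one unit per character, fixed-size lines."""
    per_line = max(int(frame_width), 1)
    return [
        LineInfo(start, min(start + per_line, len(text)),
                 float(min(per_line, len(text) - start)), i * line_height)
        for i, start in enumerate(range(0, len(text), per_line))
    ]
```

Calling `lines_for` twice with the same paragraph, text and width returns the stored list without invoking the layout function again; changing the text forces a recomputation.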
In the Common.Text project (Text II, part 1), the text is stored internally as 64-bit character codes (ulong in C#). At first, using 64 bits to store a single character may seem foolish, as most western languages would be perfectly happy with an 8-bit enc