NRSI: Computers & Writing Systems
An Introduction to TrueType Fonts: A look inside the TTF format
The primary font technology used on Microsoft Windows and the Mac OS is based on the TrueType specification. TrueType fonts are scalable which means the glyphs can be displayed at any resolution and any point size (though the glyphs may not look good in extreme cases). A TrueType font is a binary file containing a number of tables. There is a directory of tables at the start of the file. The file may contain only one table of each type, and the type is indicated by a case-sensitive four letter tag. Each table and the whole font have checksums.
The TrueType specification was developed by Apple and adopted by Microsoft. Later Microsoft and Adobe expanded the specification to support smart rendering and PostScript glyphs. The new specification, which added more tables, was called OpenType. Apple also added tables to TrueType to support a different smart rendering system producing the Apple Advanced Typography (AAT) font specification. SIL’s Graphite smart rendering system works by adding tables, too. This section is only concerned with the tables which were part of the original TrueType specification and which are still used in OpenType, AAT, and Graphite to describe glyphs and provide general font data. For more details, see the reference documentation ( Microsoft, Apple, or Adobe).
1 Glyphs (‘glyf’)
All fonts contain glyphs. TrueType fonts describe each glyph as a set of paths. A path is simply a closed curve specified using points and particular mathematics. A lower case ‘i’ has two paths, one for the dot and one for the rest of it. The paths are filled with pixels to create the final letter form. This set of paths is called an outline. Drawing the glyphs is an art in its own right.
Another way that a glyph may be specified is in terms of other component glyphs. A glyph may consist of references to other glyphs which are combined to make the new compound glyph. An ‘acute e’ could be composed of the glyphs for ‘e’ and ‘acute’. In such compound glyphs, each component glyph has placement and optional transformation data associated with it.
The ‘glyf’ table contains the data describing each glyph in the font. A particular glyph is identified by a glyph id, which is used exclusively throughout the font to identify that glyph. See “Character to Glyph Mapping” below.
At low resolutions (on a computer screen), slight differences in how an outline aligns with the available pixels on a device can make a large visual difference. The ‘m’ on the left below is malformed because of bad alignment. The ‘m’ on the right has been adjusted to fit the available pixels better. There are two approaches to solving this problem of alignment.
In addition to the basic mathematical data that describes each glyph’s outlines, the font can store a set of ‘instructions’ (called hints) which are executed when the glyph is drawn on screen. These instructions move some of the points which define the glyph so that they are positioned well in relation to the grid onto which the glyph is to be drawn (shown above). The hinting code is very complicated and very few people interact directly with it. Most font design tools provide a feature to auto-hint the glyphs a user draws, but the result varies from passable to terrible depending on several factors. The best hints are created manually by highly skilled people using special tools.
If hinting is done well, the results are dramatic. For example, compare the on-screen display quality of Times New Roman (which has been very thoroughly manually hinted) against SILDoulos (which has been auto-hinted using a font design tool). Manual hinting can triple the time it takes to design a font, but it is sometimes worth the investment.
Hinting information is contained in three tables within the font—’cvt’, ‘fpgm’, and ‘prep’. These tables cannot be easily edited outside of specialized font hinting software.
Anti-aliasing is another approach which has become more popular recently. It consists of smudging the edges of the glyph outline slightly by using grey pixels on a screen. The eye averages out the smudging, and the results are clearer than without it. This approach is used in such programs as Adobe Acrobat Reader and for some fonts in Windows to produce more legible glyphs.
Microsoft has also recently patented a new technology called ClearType (primarily of interest to LCD display users). White on a computer screen actually consists of red, green, and blue pixels placed very close together. ClearType manipulates these color pixels separately to finely control the edges of glyphs, resulting in cleaner, more legible letters.
The best way to improve the legibility of a font is to increase the screen resolution. Screen displays are improving each year, so eventually hinting and anti-aliasing will not be needed, but many years probably will pass before these techniques can be discarded.
2 Character to Glyph Mapping (‘cmap’)
The ‘cmap’ table is used to convert from an outside encoding (such as Unicode) to internal glyph ids. The rendering system uses the ‘cmap’ to convert the Unicode code points in a string to glyph ids and then renders the appropriate glyph shapes at the proper positions on the screen or printer. Glyph ids are used exclusively to reference glyphs in all other font tables.
The ‘cmap’ table consists of a set of mapping subtables for different technologies and architectures. This allows the same font to work on multiple operating systems. For example, most TrueType fonts have three subtables within the ‘cmap’, two for Apple and one for Microsoft.
Each mapping subtable has two numbers associated with it: the platform id and the encoding id. The platform id indicates what architecture the subtable is designed for. For example, a platform id of 0 indicates Apple Unicode, 1 indicates Apple Script Manager, 2 indicates ISO, and 3 indicates Microsoft. The meaning of the encoding id depends on the platform. For example, some older fonts have only two subtables: Platform 1, encoding 0 (Mac Roman 8-bit simple 256 glyph encoding) and Platform 3, encoding 1 (Windows Unicode).
The mapping subtables use various formats to reduce their size, but all formats map from a character code to a glyph id. Multiple character codes can map to the same glyph id. For example, space (U+0020) and no-break space (U+00A0) often map to the same glyph id.
There are some special glyph ids which are reserved by convention as described in the below table. It is advisable to follow these conventions when modifying fonts. Glyph 0 is normally an open rectangle and is used by rendering systems to substitute for characters that are not present in the font.
3 Glyph Names (‘post’)
The ‘post’ table provides a name for each glyph based on PostScript conventions. These names are primarily of interest to font designers, engineers, and automatic tools which work with fonts at the glyph level. The table also stores some miscellaneous information such as the italic angle, the underline position, and whether the font uses proportional spacing. It is possible (common in large CJK fonts) to have a ‘post’ table which is empty (provides no names.)
Glyph PostScript names can contain any alphanumeric character, though there are some restrictions. Adobe’s naming convention is the industry-wide standard. This provides a predefined list of 1054 special names, like .notdef, space, and Aacute. Glyphs not in this list are given a name based on their unicode value (e.g. uni0E4D) with a possible modifier (uni0E48.left) or based on a sequence of codes (for ligatures—uni0E380E48). If a font contains a non-empty ‘post’ table and does not use these naming guidelines, it may not work correctly with some software (e.g. Adobe Acrobat). For detailed information about PostScript glyph names, see http://partners.adobe.com/asn/developer/type/unicodegn.html.
4 Metrics, Style, Weight, etc. (‘hmtx’, ‘hdmx’, ‘OS/2’, etc.)
The ‘hmtx’ (horizontal metrics) table specifies the advance width and left side bearing for each glyph. Glyphs are positioned relative to a given point on the screen or page. The horizontal distance from the current point to the left most point on the glyph is the left side bearing. The horizontal distance the current point moves after the glyph is drawn to be positioned for the next glyph is the advance width. For example, an overstriking glyph will have an advance width of 0 with possibly a negative left side bearing (if it is intended to follow the overstruck glyph), while a space glyph will have a large advance width, even though there is no actual glyph outline for this glyph.
Changing the advance width will alter the positioning of every glyph on a line after the changed glyph. Changing the left side bearing will only alter the placement of an individual glyph. In right to left scripts, glyphs still are described using a left to right coordinate system.
The left side bearing and advance width will scale along with the glyph outlines and also can be modified by hints. For performance reasons (e.g. calculating line breaks before actually rendering glyphs), a font may provide precalculated advance widths for each glyph at various sizes and resolutions in its ‘hdmx’ (horizontal device metrics) table.
To support glyphs drawn on a vertical baseline, the ‘vmtx’ and ‘VDMX’ tables may be present. They are analagous to the ‘hmtx’ and ‘hdmx’ tables.
The ‘OS/2’ table provides font wide metrics that control line spacing in Windows. It also indicates the style and weight of a font as well as a rough description of its overall look. On the Mac OS, the ‘head’ and ‘hhea’ tables are used instead of the ‘OS/2’ table to store this information. Information in all three tables should agree.
In addition, the ‘OS/2’ table provides information about what range of characters are in the font, both in terms of Windows’ codepages and Unicode ranges. This data can affect whether a given font will be used to display Unicode data. The ‘head’ table contains the checksum for the entire font and other summary information about the font as a whole. The ‘hhea’ table specifies summary information about the horizontal metrics.
5 Kerning (‘kern’)
The optional ‘kern’ table specifies how glyph combinations should be kerned. On the Windows, the only format which is supported provides simple pair kerning. Pairs of glyph ids are listed with the amount by which the origin of the second glyph should be moved in relation to the first glyph. This movement will result in a shift in all following glyphs on the same line. For example, if an ‘A’ follows a ‘W’ then the ‘A’ and all subsequent glyphs on the line should be moved left. OpenType and AAT tables provide a much more sophisticated mechanism that includes complex contextual kerning. For more information, see the relevant specification.
6 General Font Information (‘name’)
The ‘name’ table stores text strings which an application can use to provide information about the font. Each string in the ‘name’ table has a platform and encoding id corresponding to the platform and encoding ids in the ‘cmap’ table. Each string also has a language id which can be used to support strings in different languages. For Windows (platform id 3) the language id is the same as a Windows LCID. For the Mac OS (platform id 1), script manager codes are used instead. See the TrueType documentation for details.
Each string also has a name id which describes the meaning of this string and its use. The list below was taken from current documentation. This list may grow in the future. Other strings can be included but are not assigned a special meaning.
7 What’s Left?
Four other tables are usually present in each font. The ‘loca’ table simply provides an offset (based on a glyph id) for glyph data in the ‘glyf’ table. It is required if a ‘glyf’ table is present (see below). The required ‘maxp’ table provides information about the overall complexity of the glyphs and hints. The optional ‘LTSH’ table indicates when a given glyph’s size begins to scale without being affected by hinting. The optional ‘PCLT’ table stores fontwide information based on Hewlett-Packard’s Printer Control Language specification.
Several other optional tables can be present which are used to accommodate bitmaps. An OpenType font can use the ‘CFF’ table (instead of the ‘glyf’ and ‘loca’ tables) for PostScript glyphs. OpenType, AAT, and Graphite also define additional tables primarily for smart rendering. Periodically entirely new tables or new fields within existing tables are added to the specification to support new rendering features.
Two other related though older font specifications that the reader should be aware of are TrueType Open and TrueType GX. Roughly, TrueType Open is like OpenType without the support for PostScript glyphs. TrueType GX is like AAT before support for Unicode was added.
In this section the reader has obtained a basic understanding of bare TrueType fonts and has been equipped to explore further. The official specifications (referenced in the Introduction) can provide more details. It is often helpful to compare specifications from both Microsoft and Apple.
To further explore smart rendering, “Rendering technologies overview” should be a useful resource. A quick summary of all tables used in TrueType, OpenType, AAT, and Graphite fonts can be found in TrueType table listing. For a better understanding of how rendering relates to keyboarding and encoding, see Constable (2000).
Constable, Peter. 2000. Understanding Multilingual software on MS Windows: The answer to the ultimate question of fonts, keyboards and everything. ms. Available in CTC Resource Collection 2000 CD-ROM, by SIL International. Dallas: SIL International.
Note: If you want to add a response to this article, you need to enable cookies in your browser, and then restart your browser.
Note: the opinions expressed in submitted contributions below do not necessarily reflect the opinions of our website.
Just have to comment, since I work with (er, play with, really) fonts I sometimes have to go back to basic references. This article is one of the better written ones I've seen. Thanks //al
this document was a big help
Thank you for an excellent introduction to the TTF internals.
Amazing post that explains very well some fundamental characteristics of the fonts. Thanks!
Searching for information on font internals is next to impossible, mainly because no one is looking except those uncommon few working with them. Thanks so much SIL for providing this essential information! No one else is providing it. You're great. :)
Thanks, Ryan, much appreciated! —Peter
Note: If you want to add a response to this article, you need to enable cookies in your browser, and then restart your browser.