NRSI: Computers & Writing Systems
Display Issues – FAQ
Question: Why are some character shapes distorted in Windows XP?
Some character shapes may be distorted or garbled. Examples of fonts that are known to manifest these symptoms are SIL Ezra, Ramna Classique, and SIL Doulos NP.
Answer: The culprit appears to be the initial version of Microsoft’s font smoothing technology based on ClearType. To fix this problem, either switch to standard font smoothing, turn off font smoothing altogether, or apply Windows XP SP1.
More information: In Windows XP, there are three settings for font smoothing: none, standard, and ClearType. There are a number of methods to control the selection, but the most obvious is from the Control Panel.
If you want to turn off smoothing altogether, just uncheck the font smoothing checkbox. Otherwise, select Standard instead of ClearType if you are having trouble with font display.
This problem is fixed with Windows XP Service Pack 1 (SP1).
Question: How can I confirm that the required fonts are installed?
Answer: It may sound silly, but I've often been asked to help with a font-related problem only to find out that the user doesn't have the required font installed! To save yourself some embarrassment, be sure to look for this before going further.
A tip for Microsoft Word users: You can find out whether Word thinks it has all the fonts required for a given document quite easily. With the document open, click , then select the tab and press . If Word is missing any fonts needed for this document, it will list them and tell you what fonts it is using for the missing ones.
Question: Why does my text sometimes appear as boxes when the font is changed?
You open a Word document, select some text, and change the font. Some of the characters now show up as boxes. For example, with this text:
changing the font to Times New Roman results in:
Answer: Boxes show up when there is a mismatch between Unicode characters in the document and those supported by the font. Specifically, the boxes represent characters not supported by the selected font.
One common situation is when the original text is formatted with a symbol-encoded font such as Wingdings.
When text is formatted with a symbol-encoded font, Word stores PUA characters (typically U+F020 .. U+F0FF) in the document. If you then change fonts to a standard font such as Times New Roman, boxes appear because Times New Roman doesn't have any characters in the PUA range.
There is an anomaly that may mask this behavior: Word doesn't lock in the PUA character codes until a document is saved. This means that you can type text in Wingdings and then successfully change the font to Times New Roman (and back!) as long as you haven't saved the document in the interim.
If you have Peter Constable's Unicode Word Macros installed, you can select a block of text and click on the button and all the symbol-font characters will be folded to their codepage 1252 ANSI equivalents (in Times New Roman).
Symbol-encoded fonts are only one possible situation that can cause boxes. No single font covers all of Unicode, so you can have similar situations arise when trying to change fonts to one that doesn't support the Unicode characters needed by your document.
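The mismatch described above can be sketched in a few lines of Python. This is only an illustration: the `supported` set is a hypothetical stand-in for the codepoints a font's character map covers, and `would_show_as_boxes` is a name invented for this example, not part of Word or any font library.

```python
def would_show_as_boxes(text, supported):
    """Return the characters in text that the (hypothetical) font cannot display."""
    return [ch for ch in text if ord(ch) not in supported]

# Assumption: a toy "font" that covers only printable ASCII (U+0020..U+007E).
ascii_only = set(range(0x20, 0x7F))

# Text as Word might store it for a symbol-encoded font: PUA codes U+F041, U+F042,
# followed by ordinary ASCII.
stored = "\uf041\uf042 ok"
missing = would_show_as_boxes(stored, ascii_only)
print([f"U+{ord(ch):04X}" for ch in missing])
```

Any character reported here has no glyph in the toy font, which is the situation in which an application draws a box.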
Question: Why does my text sometimes appear as question marks?
For example, I open up a document with this text:
and I use the clipboard orto transfer text to another application, but I end up with something that looks like:
??? ????? Unicode? ? ?? ?? ?????? ??????? "???????" ?? ? ???Unicode(???/?????)?? ???Unicode(???)?? ?? ????? ?? Unicode? ? Cos'è Unicode? ? ??????????? Kas tai yra Unikodas? ? ???????? ?????? ? ??? ????? Unicode? ?
Answer: Unicode data is being converted to 8-bit (usually via the system codepage) and the target 8-bit character set doesn't include the characters needed. Any characters not representable in the 8-bit character set will come through as question marks.
A common situation is trying to send data to an 8-bit legacy application through the clipboard orwhen the original text is either
In both of these situations, the original text contains Unicode characters that are not representable in the target 8-bit character set, so the unrepresentable characters are changed to question marks.
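The effect is easy to reproduce. The sketch below assumes the target codepage is 1252 and uses Python's built-in codec with `errors="replace"`, which substitutes a question mark for every character the codepage lacks. The helper name `to_legacy` and the Russian sample string are made up for this illustration.

```python
def to_legacy(text, codepage="cp1252"):
    """Round-trip text through an 8-bit codepage, as a legacy application would see it."""
    as_8bit = text.encode(codepage, errors="replace")  # unrepresentable -> b"?"
    return as_8bit.decode(codepage)

print(to_legacy("Cos'è Unicode?"))        # every character exists in cp1252
print(to_legacy("Что такое Unicode?"))    # Cyrillic letters do not
```

The Italian line survives because all of its characters are in codepage 1252; the Cyrillic letters are lost, and only the spaces and the word "Unicode?" remain readable.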
Word 2000 (and later)
To fix this problem, at least for symbol-encoded text copied to the clipboard, Microsoft changed the way Word 2000 (and later) puts such text on the clipboard: it maps the symbol-font PUA characters back down to 8-bit (usually by subtracting U+F000 from the Unicode codepoint). While this change fixes one thing (copying symbol-font text to legacy applications via the clipboard), it temporarily broke something else: find/replace (see the Word 2000 Find/Replace question below). This was fixed in Word 2002.
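As a rough sketch of the fold-down just described, the following Python subtracts 0xF000 from every character in the range U+F020..U+F0FF and leaves everything else alone. The function name `fold_symbol_pua` is invented here, and Word's actual behavior may differ in details that this FAQ does not document.

```python
def fold_symbol_pua(text):
    """Map U+F020..U+F0FF down to U+0020..U+00FF; leave other characters unchanged."""
    return "".join(
        chr(ord(ch) - 0xF000) if 0xF020 <= ord(ch) <= 0xF0FF else ch
        for ch in text
    )

# Four symbol-font characters as stored in the PUA.
stored = "\uf041\uf062\uf020\uf0e9"
folded = fold_symbol_pua(stored)
print([f"U+{ord(ch):04X}" for ch in folded])
```

The folded values are simply the 8-bit codes (0x41, 0x62, 0x20, 0xE9 here), which is why the text looks like Latin characters once it leaves the symbol font.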
Question: Most characters in my font are OK, but some display as a box or display from some other font.
Answer: When a few, but not all, characters are wrong, the most important things to know are:
See subsequent questions for specific problems
Question: Characters 128 (x80), 142 (x8E), 158 (x9E), and/or 183 (xB7) display as a box or display from some other font.
Answer: The problem fonts can be fixed by the Eurofix utility.
More information: These characters have led dual lives on Windows. That is, the Unicode characters to which these Windows characters map have changed over time. Characters 128, 142 and 158 changed with the introduction of Windows 98. Difficulties for character 183 started much longer ago, with the introduction of Unicode APIs in Windows NT: the 8-bit APIs have, since Windows 3.1, mapped character 183 to U+2219. Unfortunately, the Unicode APIs (introduced with Windows NT) map 183 to U+00B7.
Depending on the date and source of the font in question, and the particular application being used, one or more of these characters may show up as a box (▯) or, if the application uses font linking, as the Windows default character for the character in question (€, Ž, ž, or ·, respectively) but from some other font.
In all these cases, the fix is simple: modify the font so the glyphs are double mapped. E.g., the glyph for character 183 should be accessed from both U+2219 and U+00B7. Before proceeding, be sure your font license permits making such changes.
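To make the idea of double mapping concrete, here is a minimal Python sketch. The `cmap` dictionary (codepoint to glyph name), the glyph name `custom183`, and the `double_map` helper are all hypothetical; a real font's character map has to be edited with a font tool. The second half of the block uses Python's standard cp1252 codec to show what the four problem characters map to in the current codepage definition.

```python
def double_map(cmap, a, b):
    """Make codepoints a and b share a glyph, copying only into an empty slot."""
    if a in cmap and b not in cmap:
        cmap[b] = cmap[a]
    elif b in cmap and a not in cmap:
        cmap[a] = cmap[b]
    return cmap

# Hypothetical character map of an older font: the glyph sits at U+2219 only.
cmap = {0x0041: "A", 0x2219: "custom183"}
double_map(cmap, 0x2219, 0x00B7)
print(cmap[0x00B7])

# Current cp1252 mapping of the four problem characters.
current = {byte: ord(bytes([byte]).decode("cp1252")) for byte in (128, 142, 158, 183)}
for byte, cp in current.items():
    print(byte, f"U+{cp:04X}")
```

After the call, the same glyph is reachable from both U+2219 and U+00B7, so it no longer matters which of the two codepoints an application asks for.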
For fonts generated by the SIL Encore Font System, the latest TypeCaster compiler already double-maps characters 128, 142, and 158, but not 183. Older versions of the SIL Encore Font System do not double-map any characters.
The font can be made to work by processing it with the Eurofix program. Eurofix is part of Martin Hosken's Font-TTF package, available as a Perl module from CPAN or as a Windows executable from FontUtils. Eurofix is a command-line program, and if you execute it without parameters you get this help message:
EUROFIX [-m num] infile outfile

Edits a font to account for the change in codepage 1252 definition in Win98, NT5 and all things new then. -m specifies that the Mac hack should also be done.
The following changes are made to ensure that the glyphs at the two positions are the same, if possible:
For the Mac table
(-m may be for 240 or 211 depending on Apple or MS)
Copies are only made if there is no glyph there already.
Question: In Microsoft Word 2000 or later, characters 157 (x9D), 253 (xFD) and 254 (xFE) don't show up at all, the font changes to something like Tahoma, and the typing cursor changes.
Answer: You have enabled a right-to-left language such as Hebrew or Arabic. Even though you are typing on an English keyboard, Word 2000 or later interprets these keystrokes via Codepage 1256 (the Arabic codepage):
There are three known workarounds:
Microsoft has been told of this problem and they insist that Word is "operating as intended".
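A quick check with Python's standard cp1256 codec shows what these three codes become when read through the Arabic codepage. This is an illustration of the codepage table only; it is an assumption, though a plausible one, that this is exactly what makes the characters seem to vanish in Word.

```python
import unicodedata

# Decode each problem code through codepage 1256 and look up its Unicode name.
decoded = {byte: bytes([byte]).decode("cp1256") for byte in (157, 253, 254)}
for byte, ch in decoded.items():
    print(byte, f"U+{ord(ch):04X}", unicodedata.name(ch))
```

All three results are invisible formatting characters (a zero-width non-joiner and the two directional marks), so nothing is drawn when they are typed.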
Question: Character 211 (xD3) displays as the pair of characters 237 (xED) + 210 (xD2)
Answer: This is a bug in Thai Windows 9x. The solution is to avoid this character or switch to an alternate OS, e.g. Windows 2000.
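The particular pair involved is not arbitrary. Assuming the Thai codepage in question is 874, the short Python check below shows that character 211 is THAI CHARACTER SARA AM and that characters 237 + 210 are its Unicode compatibility decomposition. Whether the Windows 9x bug actually arises from this decomposition is a guess on our part.

```python
import unicodedata

sara_am = bytes([211]).decode("cp874")     # character 211 in the Thai codepage
pair = bytes([237, 210]).decode("cp874")   # characters 237 + 210

print(unicodedata.name(sara_am))
print([unicodedata.name(ch) for ch in pair])

# NFKD applies compatibility decompositions.
decomposed = unicodedata.normalize("NFKD", sara_am)
print(decomposed == pair)
```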
Question: In Microsoft Publisher 2003, various characters from 131 (x83) to 159 (x9F) are displayed from some other font than the one selected. These display properly in Word 2003.
Answer: The specific list of characters is
As far as Unicode is concerned, these characters are actually in the Latin Extended-A block. Publisher looks into the font to see what Unicode blocks the font claims to support, and not seeing Latin Extended-A indicated, Publisher finds another font to use for these characters.
If the font is from the SIL Encore Font system, and if you have the CST, etc., and can re-build the font with TypeCaster, the fix is to add, using a text editor like Notepad, the line:
to the top of the CST file — up near the trans and encode commands, and then recompile your fonts. (Unfortunately the CST will no longer be editable with the Encore CST Editor).
If the font isn't Encore, or you don't have the CST, you can use various font editors to make the same change. Before proceeding, be sure your font license permits making such changes. What is needed is to set the bit corresponding to the Unicode "Latin Extended-A" block in the UnicodeRange bits of the OS/2 table. Of course, some font editors are not very good at preserving the font structure and quality, so be careful. If you are comfortable installing Perl modules, you can use Martin Hosken's HackOS/2 script from the Font::TTF package.
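For readers who want to see what "setting the bit" means, here is a small Python sketch of the arithmetic only. In the OpenType OS/2 table, Latin Extended-A is bit 2 of the first UnicodeRange field (bit 0 is Basic Latin, bit 1 is Latin-1 Supplement). The function names and the sample value are hypothetical, and the sketch does not read or write font files; that still requires a font editor or a script such as the one mentioned above.

```python
LATIN_EXTENDED_A_BIT = 2  # OS/2 ulUnicodeRange1: bit 2 = Latin Extended-A

def claims_latin_extended_a(unicode_range1):
    """True if the UnicodeRange value declares support for Latin Extended-A."""
    return bool(unicode_range1 & (1 << LATIN_EXTENDED_A_BIT))

def set_latin_extended_a(unicode_range1):
    """Return the UnicodeRange value with the Latin Extended-A bit turned on."""
    return unicode_range1 | (1 << LATIN_EXTENDED_A_BIT)

# Hypothetical font declaring only Basic Latin (bit 0) and Latin-1 Supplement (bit 1).
before = 0b011
after = set_latin_extended_a(before)
print(claims_latin_extended_a(before), claims_latin_extended_a(after), bin(after))
```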
Question: Using my custom font, miscellaneous (other than already mentioned) characters are not showing up properly when typed. They are OK once in the document, or if I use .
Answer: The most likely causes are your word processor's auto-correct and auto-format capabilities. By default, for example, Word will automatically change character code 243 (xF3) to 211 (xD3) at the beginning of sentences, and if you have hacked in two unrelated shapes at these character points then you might be surprised when 211 shows up.
More and more, word processors are taking advantage of semantic information provided by the Unicode standard. In this example, Word knows there is a case relationship between U+00D3 LATIN CAPITAL LETTER O WITH ACUTE and U+00F3 LATIN SMALL LETTER O WITH ACUTE, and thus Word is trying to be nice and correct your typing. But if you have hacked the font with shapes that don't have the same case relationships, you are just asking for trouble.
You can, of course, turn off most of Word's autocorrect features if you need to.
By the way: This is the main reason that most of SIL's legacy fonts (e.g., SIL Galatia, SIL Ezra, SIL IPA93) are created as symbol-encoded fonts.
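The case relationship, and the reason symbol encoding sidesteps it, can be seen with Python's own Unicode case mapping. This illustrates the Unicode character properties involved, not Word's autocorrect code.

```python
# Character 243 in codepage 1252 is U+00F3; Unicode says its uppercase is U+00D3 (211).
typed = bytes([243]).decode("cp1252")
capitalized = typed.upper()
print(f"U+{ord(typed):04X} -> U+{ord(capitalized):04X}")

# The same slot in a symbol-encoded font is stored as U+F0F3 in the Private Use Area,
# where characters have no case mapping, so nothing gets "corrected".
symbol = "\uf0f3"
print(symbol.upper() == symbol)
```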
Question: In Word 2000 only, non-Roman text appears as Latin in the Find/Replace dialog.
For example, I copy some SIL Ezra text from the document to the Find/Replace dialog, and it displays as Latin text. Furthermore, if I click the Find button Word cannot find the text, even though it is clearly in the document.
Answer: The problem occurs with symbol-encoded fonts such as SIL Galatia, SIL Ezra, and SIL IPA93. When characters in these fonts are stored in your document, Word stores Private Use Area codepoints (typically U+F020 .. U+F0FF). When copying such text to the clipboard, Word 2000 (and later) folds these character codes down to the Latin range. While this helps in some situations (such as copying data to 8-bit applications), the Find/Replace dialog wasn't upgraded to account for this change. This was fixed in Word 2002.
Question: Why does my application wrap lines in the middle of words rather than at spaces when using symbol-encoded fonts such as SIL IPA or SIL Galatia? How can I fix this?
Answer: The characters in a symbol-encoded font are assumed to be symbols, not letters, and the application makes no assumptions about what constitutes a word. In the case of Word 97 and later, the character U+F020 (the Unicode value that Word typically stores for the character in the symbol font that you think is a space) is no different than any other symbol-font character — it should no more be used as a linebreak location than any other.
The workaround for this is to replace all occurrences of the character U+F020 with the character U+0020 in some other font (perhaps Times New Roman). In Word this can be done using a search and replace or, if you have Peter Constable's Unicode Word Macros installed, you can select a block of text and click on the button.
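The difference between the two "spaces" shows up in their Unicode properties, as this Python sketch illustrates. The helper `restore_breakable_spaces` is a made-up name, and it only swaps the codepoint in a plain string; in Word the replacement space also needs a non-symbol font, as described above.

```python
def restore_breakable_spaces(text):
    """Replace the symbol-font 'space' U+F020 with a real space U+0020."""
    return text.replace("\uf020", "\u0020")

# To software, U+F020 is a private-use character, not whitespace.
print("\uf020".isspace(), "\u0020".isspace())

stored = "\uf041\uf020\uf042"
fixed = restore_breakable_spaces(stored)
print(fixed.split() == ["\uf041", "\uf042"])  # the text now splits into two "words"
```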
Question: Why does my application prevent me from formatting my text with certain fonts?
Answer: The most likely reason is that the font is incorrectly identifying what characters and scripts it supports.
More info: In the world of Unicode, no single font covers all the character ranges. Consider this situation:
I'm about to apply a font named SimSun to a bunch of text. The question is, what parts of the selected text ought to actually have the font applied? You may happen to know that SimSun is an East Asian font and it doesn't cover Arabic or Greek or Cyrillic, so it probably shouldn't be used with those scripts.
A user-friendly app will prevent the user from applying fonts that don't cover the selected text. The typical way this is implemented is font slotting.
The basic idea is that the application partitions the supported scripts into groups or slots. Through various heuristic tests, the application figures out:
Microsoft Word supports three slots. This becomes clear when you look at the complete Font Selection dialog:
Notice the slots for Latin text, Asian text, and Complex text.
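The general shape of font slotting can be sketched as follows. This is a toy model, not Word's algorithm: the three slot names come from the text above, but the Unicode ranges chosen for each slot (Hebrew and Arabic for "complex"; kana and CJK ideographs for "asian"; everything else "latin") are simplifying assumptions, and `slot_for` and `runs_by_slot` are invented names.

```python
from itertools import groupby

def slot_for(ch):
    """Assign a character to one of three slots using rough, assumed ranges."""
    cp = ord(ch)
    if 0x0590 <= cp <= 0x06FF:                         # Hebrew and Arabic blocks
        return "complex"
    if 0x3040 <= cp <= 0x30FF or 0x4E00 <= cp <= 0x9FFF:  # kana, CJK ideographs
        return "asian"
    return "latin"

def runs_by_slot(text):
    """Split text into consecutive runs that share a slot."""
    return [(slot, "".join(chars)) for slot, chars in groupby(text, key=slot_for)]

# "abc", two CJK ideographs, then four Hebrew letters.
sample = "abc\u6f22\u5b57\u05e9\u05dc\u05d5\u05dd"
runs = runs_by_slot(sample)
print([(slot, len(run)) for slot, run in runs])
```

With the text split this way, an application can apply a newly chosen font only to the runs whose slot the font claims to support and leave the other runs alone.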
Other applications use different strategies: Internet Explorer and Microsoft Publisher base their slots on Unicode ranges. You can easily see the nearly 40 slots for Internet Explorer by clickingand then clicking .
Every font has the opportunity to declare what scripts it supports, in terms of what Unicode ranges are supported and also what Windows codepages are supported. You can use Microsoft's Font Properties Extension to find out what a font declares about itself. In the case of SimSun, we see:
What can go wrong? One of the most common problems is that fonts, especially older ones, do not always accurately declare what scripts are supported. Older applications didn't care and so it didn't really matter, but as we move towards Unicode it becomes more and more important that fonts make appropriate claims. See the previous question regarding Microsoft Publisher for an example.
The other problem we've seen is bugs or magic in the application. For example, Microsoft Word appears to require fonts to declare support for the Latin 1 Unicode range and codepage 1252.
Question: Why does my text show up in some other font, even though the font in which the text is formatted is definitely installed?
Answer: In the answer to the previous question I suggested that a user-friendly app will prevent the user from applying fonts that don't cover the selected text. A corollary to this is: a user-friendly app will find some font to use if the font you ask for doesn't support the characters in the text.
The mechanisms that systems use to locate a suitable font are varied, but generally go by the name of font linking (or font fixup). If you can't figure out any other reason for the incorrect font being displayed, then suspect font linking.
Unfortunately, font linking
Microsoft Office supports some fixup control in the registry. See details.
Question: When opening RTF or plaintext file in Microsoft Word, some “upper ANSI” characters display incorrectly as Arabic, Cyrillic, Hebrew, or CJK.
For example, here is what happens when I open the identical RTF file, on the same machine, with slightly different configurations:
Answer: The RTF spec is evolving, and some RTF writers such as Shoebox, Toolbox and Paratext output what is now considered to be underspecified or incomplete RTF.
The reading application, e.g., Word, has to make some heuristic guesses about what was intended. Among other factors, Word takes into account the languages that you have enabled in your Control Panel and in your Microsoft Office Language Settings applet. Workarounds include:
Question: When opening plain text files, why are some paragraph-initial characters such as parenthesis, square bracket, wedge, chevron, etc, pointing the wrong way?
Answer: This will be most common when opening documents that contain right-to-left text. Certain Unicode characters such as those mentioned in the question have to be "mirrored" if they occur in right-to-left text. Applications can usually tell which characters need to be mirrored by their context, i.e., what characters are around them. However, paragraph-initial (or, in some applications, line-initial) characters are also affected by the overall paragraph (or line) direction, which may be different from that of the first bit of text in the paragraph.
Solutions for Word users: When opening a file, tell Word that document content is primarily right-to-left. In set . Then when you open the plaintext file, select . Or, if you have just a few paragraph-initial characters that are wrong, you can use the macros in RTL scripts in Microsoft Office to correct individual characters.
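Which characters are subject to mirroring is recorded in the Unicode character database, and Python's standard `unicodedata` module can report it. The sketch below only queries character properties; it does not implement the bidirectional algorithm that applications use to decide the final direction.

```python
import unicodedata

# mirrored() returns 1 for characters that must be flipped in right-to-left text.
mirrored = {ch: unicodedata.mirrored(ch) for ch in "([<a"}
print(mirrored)

# Brackets are direction-neutral ("ON"), so their direction comes from context;
# a Hebrew letter is strongly right-to-left ("R").
paren_class = unicodedata.bidirectional("(")
alef_class = unicodedata.bidirectional("\u05d0")
print(paren_class, alef_class)
```

Because a bracket has no direction of its own, one that starts a paragraph has no preceding text to take its direction from, which is where the paragraph direction comes into play.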
As our fonts and utilities are distributed at no cost, we are unable to provide a commercial level of personal technical support. We will, however, try to resolve problems that are reported to us.
We do hope that you will report problems so they can be addressed in future releases. Even if you are not having any specific problems, but have an idea on how this system could be improved, we want to hear your ideas and suggestions.
Please note that our software products are intended for use by experienced computer users. Installing and using them is not a trivial matter. The most effective technical support is usually provided by an experienced computer user who can personally sit down with you at your computer to troubleshoot the problem.
General troubleshooting information, including frequently asked questions, can be found in the documentation. Additional information is also available on the FAQ pages. If that fails to answer your question, please contact us by starting a topic on the Language Software Community site.
We also have an email contact form; however, we will respond to those messages only as staff resources allow. Please use the new community site before using the form.
Note: If you want to add a response to this article, you need to enable cookies in your browser, and then restart your browser.
Note: the opinions expressed in submitted contributions below do not necessarily reflect the opinions of our website.
My office computer runs Windows 7 and Office 2010.
I find an annoyance when typing: the double quote keyboard key misbehaves, in both Word and Excel at least. If I type the double quote key nothing appears on the screen until I type the next character. If that character is a space then the double quote appears but it's not followed by a space. If the second character is a, e, i, o, u or y then that letter appears with a dieresis but no double quote character. Any other letter appears preceded by the double quote character. When closing a quote, again, typing the character doesn't make it appear until I type another key.
Is there something wrong with my installation or with this particular combination of operating system and application package?
It sounds like your computer is set up to use the wrong keyboard.
To check and/or change this, go to Start -> Control panel.
Under 'Clock, Language and Region', click on 'Change keyboard or other input method'.
This will bring up a pop-up box; select the 'Keyboards and Languages' tab and click the 'Change keyboards' box.
Your default setting will be listed here. I believe in Australia this should be 'English (United States)' unless you have a personal preference for a UK-style keyboard. If that's not what's showing in the default language field, click the dropdown and select it from the list.
If it's not in that list, stay in the pop-up box and click the 'Add' button on the right, then select 'English (United States)' from the list (you may need to expand the options using the [+] in order to be able to select it).
This will add this keyboard setting to your language options, so that you can select it from the default input languages dropdown above.
Click 'Apply' to apply your changes, then 'OK'.
I hope that helps! If it doesn't, send us an email.
³nÕLt`_©äH|Ół²²Į„ĀŚaĒ=Ä”ģÕUõX×y_Žxu__–Øx___¤_K·æ.Ē£“É€äŖ¶+żŠÅU9Qh$¨åćm)b¦ė-_Q ćRė¸…˛@Ē¹ˇ_MWoWź Z~&KUuÖĆ_ō75_Å®r‘Ņ8g6üU_’UÓ_xJÅęµNÜT¸Ž4ņ¬0ļ~_I$ +ł±`,k k¾§Õ&°_µlßõ'(>‹6!]_µõŠ£QŪØ’Q“oM_˙‡ś˛bÄ_¢'ĀG.å~"_!@Ćł~»_w60a9&m_ Kö_S˙Ņüī_e‹‰mŽéś¹_O]Ę<ŲIĻ$#Ø°R2Ēäõž`l,>_ų; ·ē˛ŃH__WĄ#_¼ģ¯’*VĮ?Ī-æ¯Kb7Ó÷ōŁ?d —
As you can see, the text has appeared garbled in your comment. It may require a font that specifically supports that writing system, or the encoding has become garbled in the copy and paste process. It might be better if you contact us using the link at the bottom of the page and send us a screenshot of what you have.
My special characters really are special: they turn into boxes no matter what I do. But this only happens with the superscripts. What do I do?