When to Convert to Unicode

Albert Bickford, Jim Brase and Lorna Priest, 2007-05-11

Linguists and others who work with minority language data often still use older special character systems (fonts and keyboards) that work only for certain languages. These older (‘legacy’) systems are gradually being replaced by Unicode (What is Unicode?), which attempts to cover all languages in one unified system.

So, you are probably wondering: When is the right time to convert to Unicode? That’s what this article is about. The question actually has two parts:

  • When is it possible to convert?
  • When is it necessary to convert?

There is usually a window of opportunity between the time when conversion becomes possible and the time when it becomes necessary. Ideally, you should plan ahead so you can convert your data during that window, at a time that is convenient for you, before a conversion is forced on you at what may be a very inconvenient time. Hopefully your window of opportunity will be large enough that you can reasonably convert your data to Unicode at least a year before it becomes necessary.

This article will help you recognize whether you and your language data have entered that window of opportunity. It will also help you predict whether you are approaching the end of the window: the point where a conversion is necessary. It will help you decide whether now is the time to convert.

Likewise, if you are just starting to work with a language, you may need to decide which special character system to use. From now on, most people should plan to start with Unicode, but there may be a few situations where older fonts and special character systems are necessary.

This article is written mostly for a general audience, although some of the links get more detailed and technical. At some point, you may need the advice of someone with special technical knowledge to decide when to convert, and most people need help to actually do the conversion and get set up to use Unicode. This article will also help you decide when you need to get that help, and provide guidance to the technicians who are advising you. Because it is written for two audiences (ordinary users and technicians), the words “you” and “your” sometimes refer to the users/owners of the data and sometimes to the technicians assisting them. You decide whether they apply to you personally.

We start by listing the general questions you need to consider. You will need to click on the links to get to the sections that explore these questions in more detail.

Basic questions to consider

In order to answer the two basic questions above and define your window of opportunity, you need to ask yourself several more specific questions. These are listed in the following chart, with links to later sections that discuss them in more detail.

When are you ready for Unicode?

If you can answer “yes” to all of the questions in this column, then you have entered the window of opportunity: you are ready to convert your data.

  • Consider the different things that you want to do with your linguistic data; will your operating system and your application software allow you to do those tasks if you use Unicode? Or, to put it another way, does available software that handles Unicode do everything that you need it to? (see section Does available Unicode software meet your needs?)

When is Unicode going to be forced on you?

If you answer “yes” to any of the questions in this column, you have reached the end of the window of opportunity: Unicode has become a necessity for you.
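For technicians who like to see the rule spelled out, the logic of the chart can be sketched in a few lines of Python. This is only an illustration: the names `Window` and `window_status` are made up for this sketch, the answers are plain yes/no values that you supply yourself, and the choice to let a “yes” in the second column take precedence is an assumption based on the wording of the chart.

```python
from enum import Enum


class Window(Enum):
    """Hypothetical labels for where you stand relative to the window."""
    NOT_YET_OPEN = "not yet possible to convert"
    OPEN = "possible to convert, not yet necessary"
    CLOSED = "conversion has become necessary"


def window_status(ready_answers, forced_answers):
    """Classify your situation from yes/no answers (True means "yes").

    ready_answers:  answers to the "When are you ready for Unicode?" questions
    forced_answers: answers to the "When is Unicode going to be forced on
                    you?" questions
    """
    # A single "yes" in the second column ends the window
    # (assumed here to take precedence over the first column).
    if any(forced_answers):
        return Window.CLOSED
    # Every question in the first column must be "yes" to enter the window.
    # An empty list counts as "not ready", since nothing has been checked.
    if ready_answers and all(ready_answers):
        return Window.OPEN
    return Window.NOT_YET_OPEN


if __name__ == "__main__":
    # Example: all readiness questions answered "yes", nothing forcing a change.
    print(window_status([True, True, True], [False, False]).value)
```

The sketch only restates the “all” versus “any” rule; the real work is in answering each question honestly, which the linked sections discuss.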

This table may provide enough detail for some people to decide whether they are ready to convert to Unicode. If so, you can stop here and get on with converting to Unicode. But, if you need more details, read the linked pages.


You may download this (including linked pages) as one whole document.

"WhentoConverttoUnicode.pdf", Acrobat PDF document, 285KB

With the exception of the Table of Contents, most links in the PDF will take you to the website rather than to a point within the PDF itself.

