NRSI: Computers & Writing Systems
NRSI Update #17 – July 2002
In this issue:
Introducing Sharon Correll
by Sharon Correll
Strictly speaking I’m not really new to the NRSI, because I was “on loan” from the Language Software Development department for about three years (1998 - 2001) as part of the Graphite development effort. Non-Roman scripts proved addictive, and I am now officially a member of the NRSI.
My first job in computing was at my alma mater, the University of Delaware, developing computer-based instruction materials. Although this seems a far cry from working with non-Roman scripts, I had an opportunity to be part of a team that was creating an adventure game for foreign language instruction. During this project I learned the Prolog programming language and worked on parsing and correcting French grammar and spelling. The concepts involved in this sort of work would turn out to be similar in many ways to what I later discovered in Graphite. I also completed a master’s degree at the U of D with a focus on artificial intelligence, and those studies exposed me to a lot of the ideas of non-traditional programming and conceptual modelling that are relevant to the kind of applications that SIL is developing.
I joined SIL around the time I was doing my master’s and moved to Dallas in 1991 to start my first assignment in what was then the Academic Computing Department. I was a member of the CELLAR programming team, building the underpinnings of what would eventually become LinguaLinks. More recently I’ve been helping with the latest generation of tools, FieldWorks, and specifically I’ve done a lot of work on WorldPad, the text editor that can render complex scripts using Graphite.
Now that I’m full-time with the NRSI, I’m hoping to put some effort into improving the Graphite package (see related article). Eventually I’d like to develop smart fonts using Graphite and OpenType and develop more expertise in Unicode.
Not all my life revolves around programming. In my spare time I love doing music—leading worship at my church and singing with groups like the Dallas Symphony Chorus and the chorus of The Dallas Opera. My dream is to someday be involved in a music ministry using classical music. But until then, Graphite is a fun and rewarding project to be a part of!
Introducing Sheila Harrison
by Sheila Harrison
I was born in Illinois and lived in Hawaii and California before finally settling in Nashville, Tennessee. After living in Nashville for over 20 years, my husband, David, and I came to Dallas in 1991. David worked in the Academic Computing Department until his death in 1995. We came as guest helpers, and I became an STA/STM in 1996. Presently I am an MIT, waiting for membership status (when I’ll become real).
I served in the International Literacy Department for ten years as Office Manager and Secretary, and the decision to seek another assignment was a difficult one. I joined the NRSI team last November as Project Manager. I am excited about my new assignment and am keeping busy learning about complex scripts, growing in my new role, and helping the NRSI team to stay on schedule.
Currently I am in a Management program at Mountain View College, a local community college. When I’m not at work or at school I enjoy my small flower garden, cooking, and my six grandchildren.
International honors at bukva:raz! for Gentium, a new typeface by Victor Gaultney
by Peter Martin
We are pleased to announce that the NRSI’s Victor Gaultney has received recognition for his work on Gentium, an original typeface design he is developing as part of his MA in Type Design at the University of Reading, England. Encouraged by his professor, he entered the regular and italic faces in bukva:raz! (tr. “letter one!”), an international competition to identify the best 100 typefaces of the last five years. The panel of judges from around the world met in Moscow in December 2001 to select the winners. Gentium was placed in the top 100 (rankings were not given).
The design will get its own spread in the “Language, Culture, Type” book published next year by Graphis, and will be displayed in the exhibit at ATypI Roma this September, in exhibitions in Moscow and St. Petersburg, and at the General Assembly Building of the United Nations Headquarters in New York City, in early 2003.
Among Gentium’s distinctives are: original design; extensive set of Latin, Greek and Cyrillic glyphs with consistent design; excellent readability; economy of space suitable for long documents; capitals and ascenders designed to accommodate diacritic combinations. The sample below shows a selection of the extended Latin range:
From the ATypI website: “bukva:raz!, the international type design competition, is part of a special, tri-partite programme of the Association Typographique Internationale (ATypI) dedicated to the Year of Dialogue among Civilizations, 2001. The programme received the enthusiastic endorsement of Mr. Giandomenico Picco, Personal Representative of the Secretary-General for the United Nations Year of Dialogue among Civilizations, in May 2000, and is an official part of the global campaign for the Year of Dialogue coordinated by his office.”
Congratulations from the team, Victor!
Victor’s academic pages: http://www.sil.org/~gaultney/
l’Association Typographique Internationale (ATypI) website: http://www.atypi.org/
bukva:raz! information: http://www.atypi.org/bukvaraz/
ATypI Conference 2002 in Rome: http://www.atypi.org/rome2002/index.html
TECkit—new and improved
by Jonathan Kew
TECkit (the Text Encoding Conversion toolkit) is a system for defining and implementing the conversions or mappings between the custom 8-bit encodings used by our old “special character” solutions and the Unicode standard. The system consists of a simple language for writing these mappings, a shared library that implements the actual conversion process, and simple tools for applying conversions to plain-text and Standard Format files.
A preliminary version of TECkit was distributed on the CTC 2000 resource CD. Since that time, there has been considerable further development of the system, and TECkit version 2 is currently in testing. This version adds support for more complex encodings than could previously be implemented; in particular, it has better support for mappings where reordering is required, such as many existing systems for Indic/SE Asian scripts. It also removes the dependency on Perl to run the mapping compiler.
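To make the idea of a mapping with reordering concrete, here is a minimal Python sketch. It is not written in the TECkit mapping language and does not use the TECkit library; the byte values, table name and function name are all invented for illustration. It assumes a hypothetical legacy encoding that stores the Devanagari vowel sign I before its consonant (visual order), whereas Unicode stores it after the consonant (logical order).

```python
# Hypothetical legacy 8-bit code page: these byte values are invented for
# illustration and do not correspond to any real SIL or vendor encoding.
LEGACY_TO_UNICODE = {
    0x41: "\u0915",  # DEVANAGARI LETTER KA
    0x42: "\u0924",  # DEVANAGARI LETTER TA
    0x80: "\u093F",  # DEVANAGARI VOWEL SIGN I (drawn to the left of its consonant)
}

# Vowel signs that the assumed legacy data stores BEFORE the consonant.
PREPOSED_VOWELS = {"\u093F"}


def legacy_to_unicode(data: bytes) -> str:
    """Map each byte to Unicode, then move each pre-posed vowel sign
    so that it follows the character it was stored in front of."""
    chars = [LEGACY_TO_UNICODE[b] for b in data]  # KeyError on unmapped bytes
    out = []
    i = 0
    while i < len(chars):
        if chars[i] in PREPOSED_VOWELS and i + 1 < len(chars):
            out.append(chars[i + 1])  # consonant first (logical order)
            out.append(chars[i])      # then the vowel sign
            i += 2
        else:
            out.append(chars[i])
            i += 1
    return "".join(out)


converted = legacy_to_unicode(bytes([0x80, 0x41]))
print([f"U+{ord(c):04X}" for c in converted])
```

A real mapping for an Indic script would also have to deal with consonant clusters and other context, which this sketch deliberately ignores.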
The package includes the information needed for developers to integrate the TECkit conversion engine into other applications; products such as Paratext and FieldWorks expect to take advantage of the TECkit engine to support import and export of data in legacy encodings. Sample code showing how TECkit can be used from Visual Basic and VBA (e.g., from within MS Word) is also included.
At the time of writing, the latest TECkit test release is available here.
Corporate Strategy for Transition to Unicode—Summary of Recommendations to the Language Software Board
by Peter G. Constable, November 7, 2001
We are quickly reaching the point in the development of commercial and SIL software at which it will be both practical and advantageous to use Unicode. Therefore, we need to plan now for how we will make a transition to working with Unicode. This document aims to set forth a strategy for that transition.
The following recommendations are being made:
Roman Font Strategy—The Future of Encore Fonts
by J. Victor Gaultney
New technologies require revised tools and new strategies. The technical advancements in operating systems and software, such as Unicode, require us to rethink how we want to meet the need for fonts for the millions that use writing systems based on Roman and Cyrillic alphabets.
Planning for this next generation started over a year ago and began to take shape last fall. Since then, intensive work has been under way, with the hope of releasing at least one ‘next generation’ font by May 2003.
The technical requirements
A new generation of Encore Fonts will need to meet our needs for many years into the future. To be adequate for the wide range of needs (publishing, literacy, linguistics, translation, electronic publishing), a revised Encore package would need to:
A streamlined strategy
With all this in mind, a new strategy has been prepared to meet these needs as quickly and simply as possible. It centers around the development of ‘global’ fonts that, thanks to Unicode and smart font technologies, will meet 90-95% of Roman and Cyrillic needs. No longer will we need to have SIL Doulos Cameroon, SIL Doulos PNG, SIL Doulos Mexico, etc. Most of us will only need one font: SIL Doulos. Yes, a few special-purpose fonts will be necessary, but only for those with very unusual/complex needs.
The strategy also aims to release fonts as soon as possible in the development process, in order to make them available for use. So the strategy has three phases:
Phase I - Provide a single global Roman/Cyrillic font for Unicode transition
This phase will deliver a single smart font, based on SIL Doulos-Regular, that will meet 90-95% of Roman/Cyrillic needs around the world. This will require revisions to existing glyphs and a large number of new glyphs in order to support important Unicode ranges. It will also require development of tools related to design automation, information management, smart font code development and testing.
Begun: Jan 2002. Beta test: Nov 2002. Release: May 2003.
Phase II - Complete remaining font families and developer tools
This phase will provide a larger set of smart Roman/Cyrillic fonts, including other styles (italic, bold) and other typeface families (monospaced, sans-serif, publishing), similar to our current Encore 3 fonts. It will also complete development of first-generation font tools needed for making the fonts work for a wider variety of languages.
Start: Nov 2002. Beta tests: mid to late 2003. Release: July 2004.
Phase III - Further broadening and refinement of fonts and tools
This phase will increase the fonts available and make the font tools easier to use and understand.
Start: mid-2004. Release: unknown.
And what about TypeCaster?
TypeCaster, and other associated tools, have been needed because of the plethora of different font encodings in use, and to construct composite (base + diacritic) glyphs. With Unicode and smart font technologies, the need for a font compiler is nearly eliminated. Unicode specifies exactly how the glyphs ought to be encoded. Smart fonts use built-in rules to handle diacritic positioning, etc. The fonts will not require compilation or field modification!
So, there is no plan to revise TypeCaster, or replace it with a similar tool. In the plan above, there is talk of ‘font tools’, but these will initially be very rough, textual tools used by the Encore Fonts development team. Development of special-purpose fonts would also require these tools, but such development would be done in close coordination with the NRSI Roman Font Team.
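The point about composite (base + diacritic) glyphs can be illustrated with a short Python sketch using the standard unicodedata module. The helper names are invented for this example. It shows that some base + diacritic combinations have a single precomposed Unicode character, while others (common in extended Latin orthographies) exist only as a base character followed by a combining mark, which is the case where a smart font has to position the diacritic itself.

```python
import unicodedata


def compose(text: str) -> str:
    """Return the NFC (canonically composed) form of the text."""
    return unicodedata.normalize("NFC", text)


def describe(text: str) -> list:
    """List each code point in the text with its Unicode character name."""
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in text]


# 'e' + combining acute: Unicode has a precomposed character for this.
e_acute = compose("e\u0301")

# Open e + combining acute: no precomposed character exists, so the
# sequence stays as two code points and the font must place the accent.
open_e_acute = compose("\u025B\u0301")

print(len(e_acute), describe(e_acute))
print(len(open_e_acute), describe(open_e_acute))
```

In both cases the encoding is fixed by Unicode rather than by a per-region font layout, which is why a separately compiled font for each region is no longer needed.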
New Font tool—FontLab
by Martin Hosken
It’s not often that font designers get new toys to play with, but at last there is a tool that has the capability to really help font designers to become more productive. FontLab has been developed over a number of years and is now at version 4. It is a powerful font design package which allows you to do many things not available in other packages, or for which you would have to use a suite of tools. You can do hinting, TrueType point control, composite creation, and all kinds of things. The most exciting feature is that FontLab is now scriptable using the Python language, which opens up all sorts of possibilities for integrating the font design process into other activities necessary in font production. For example, we have a whole database of information used in deciding which glyphs are needed in the single global Roman/Cyrillic font (mentioned in Phase I above). The scripting capabilities allow information to flow back and forth with that database, saving a poor font designer’s fingers and wrists!
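As a rough idea of the kind of database-driven check that scripting makes possible, here is a small plain-Python sketch. It does not use FontLab’s scripting interface at all: the CSV layout, the column names and the set of code points standing in for the font’s contents are assumptions made up for this example.

```python
import csv
import io

# Hypothetical extract from a glyph-planning database (layout is invented).
DATABASE_CSV = """codepoint,name,needed
0041,A,yes
0253,bhook,yes
0254,oopen,yes
0416,Zhe,no
"""


def required_codepoints(csv_text: str) -> set:
    """Return the code points the database marks as needed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["codepoint"], 16) for row in reader if row["needed"] == "yes"}


def missing_glyphs(required: set, in_font: set) -> list:
    """Return the needed code points that the font does not yet cover."""
    return sorted(required - in_font)


# Stand-in for the code points already drawn in the font; in practice a
# script would read these from the font editor instead of hard-coding them.
font_codepoints = {0x0041, 0x0254}

missing = missing_glyphs(required_codepoints(DATABASE_CSV), font_codepoints)
print([f"U+{cp:04X}" for cp in missing])
```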
FontLab is certainly a high-end tool, at a high-end price. It also has that high-end feeling of having lots of power at your fingertips but complete confusion over how to get at it, and many have been reluctant to switch from good old Fontographer because of that. There is certainly no reason or expectation that anyone using such a tool should need to change. If you are using Fontographer and are happy with it, please don’t feel any pressure to change. Switching is not without a learning curve. Having said that, those who are using FontLab seem to be happy with it, bugs notwithstanding (here is Victor Gaultney’s perspective: http://www.sil.org/~gaultney/FogFL/).
New characters for Unicode
by Jonathan Kew
Two NRSI proposals for additions to the Arabic script block of Unicode have recently been accepted by the Unicode Technical Committee, one in November 2001 and one in February 2002. The first of these proposals was for three additional letters used in some EEG projects; the second was for a variety of diacritics and other marks used in some South Asian languages. The characters are listed (along with other characters from a number of other sources) in the Unicode “Proposed New Characters: Pipeline Table” at: http://www.unicode.org/unicode/alloc/Pipeline.html
The acceptance of these two proposals should be seen as an encouragement not only to the particular language projects where these characters will be used, but also to others. It confirms that, given appropriate supporting documentation, we can have characters needed for minority languages added to the Unicode standard. This means that as fonts and applications are updated to support newer versions of the standard, these characters will be increasingly widely supported, and the minority communities that need them will be able to use their own languages in standard, mainstream computing products for the first time.
Graphite Development News
by Sharon Correll
Graphite development has been fairly quiescent for the last year and a half, with the exception of a few bug fixes here and there. But we expect that to change in the near future, when we will be able to begin adding some new functionality to the system. Here are some ideas the Graphite development team is discussing:
There are also a few Graphite fonts that are under development.
If you are interested in the future of Graphite, please sign up on our e-mail discussion lists by visiting Graphite. There are three lists that may be of interest:
Circulation & Distribution Information
The purpose of this periodic e-mailing is to keep you in the picture about current NRSI research, development and application activities.