NRSI: Computers & Writing Systems
Implementing Writing Systems: An introduction
The NRSI Model for Implementing Writing Systems
Over its short history, the Non-Roman Script Initiative (NRSI) of SIL International has developed a model for using computers to implement the various writing systems that are needed for text input, storage, processing, and output. Because SIL works in so many and such varied locations, it has one of the greatest needs for flexibility in these functions. The model dates from the NRSI's early days and was designed to handle complex scripts across all of these functions.
The NRSI model has, at its heart, data which is encoded according to a standard. This data standardization can be split into two parts: a standard for the character-by-character storage and a separate standard for the structure of the text or overall document. Surrounding this data centre are four processes which act on or interact with the data: keyboarding, rendering, analysis, and conversion. The following material will provide an introduction to these five important areas.
The Text Encoding Model
Data: the Central Focus
As mentioned previously, data is at the centre of our model. To keep the data stable and reliable, one important requirement is that it be stored in a way that is uniform and conforms to a standard. This standard storage form is called an encoding.
Character encoding refers to the storage of individual pieces of data. During the past fifteen years a standard for character encoding has been developed which is increasingly inclusive of non-Roman scripts, both present and past. This standard is called Unicode.
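As a small illustration of character-by-character storage, the following Python sketch lists the Unicode code points of a short string and its UTF-8 byte form. The sample text (a Devanagari syllable) and the choice of UTF-8 are assumptions made for the example only; the model itself requires only that a standard encoding be used.

```python
import unicodedata

# Sample text chosen only for illustration: Devanagari KA + VOWEL SIGN I.
text = "\u0915\u093F"

# Each character is stored as one standard Unicode code point.
for ch in text:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
# U+0915 DEVANAGARI LETTER KA
# U+093F DEVANAGARI VOWEL SIGN I

# UTF-8 is one standard way of serialising those code points as bytes.
utf8_bytes = text.encode("utf-8")

# Decoding with the same standard recovers exactly the same text.
round_trip = utf8_bytes.decode("utf-8")
```

Note that the vowel sign is stored after the consonant even though it is displayed to its left; the stored order follows the standard, and the visual order is left to rendering.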
Structure encoding of a text describes the type of information that is stored and how individual pieces of information relate to the whole document. The encoding can be simple or "flat", where it is sometimes referred to as text markup. However, it can also have a hierarchical structure as in XML.
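The difference between flat and hierarchical structure can be sketched in Python as follows. The backslash markers (`\title`, `\p`) and the element names (`document`, `section`) are hypothetical, invented for this example rather than taken from any particular markup standard.

```python
import xml.etree.ElementTree as ET

# Flat markup: each line carries a marker saying what kind of text follows.
# The markers here are hypothetical examples.
flat = "\\title Greeting\n\\p Hello, world."

def parse_flat(source):
    """Split flat markup into (marker, content) pairs, one per line."""
    records = []
    for line in source.splitlines():
        marker, _, content = line.partition(" ")
        records.append((marker.lstrip("\\"), content))
    return records

records = parse_flat(flat)

# Hierarchical markup: the same pieces, nested inside parent elements
# so that each piece's relation to the whole document is explicit.
doc = ET.Element("document")
section = ET.SubElement(doc, "section")
for marker, content in records:
    ET.SubElement(section, marker).text = content

xml_text = ET.tostring(doc, encoding="unicode")
```

In the flat form the relationship between the title and the paragraph is only implied by their order; in the XML form both are explicitly children of the same section.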
Keyboarding refers to data input, whether by the computer keyboard or other input methods. In our data-centred model, keyboarding is the means by which data is provided for storage. In the non-Roman world keyboarding needs to be flexible, allowing the user to modify input quickly and easily while at the same time providing the data in the standard storage form or encoding. Other factors to consider when designing a keyboard are the consistency, efficiency, and uniformity of the input process, all of which make the keyboard easier to learn and use.
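One way to picture the requirement that keyboarding deliver data in the standard encoding is a table that maps key sequences to Unicode characters. The sketch below is a deliberately minimal illustration: the `KEYMAP` layout and the function name are hypothetical, and real input methods handle far more (editing, context, dead keys).

```python
# Hypothetical layout: Latin key sequences mapped to Devanagari code points.
KEYMAP = {
    "k": "\u0915",   # DEVANAGARI LETTER KA
    "kh": "\u0916",  # DEVANAGARI LETTER KHA
    "i": "\u093F",   # DEVANAGARI VOWEL SIGN I
}
LONGEST = max(len(keys) for keys in KEYMAP)

def keystrokes_to_text(keys):
    """Convert typed keys to Unicode text, preferring the longest match."""
    out = []
    i = 0
    while i < len(keys):
        # Try the longest possible key sequence first, then shorter ones.
        for n in range(LONGEST, 0, -1):
            chunk = keys[i:i + n]
            if chunk in KEYMAP:
                out.append(KEYMAP[chunk])
                i += len(chunk)
                break
        else:
            # Keys with no mapping are passed through unchanged.
            out.append(keys[i])
            i += 1
    return "".join(out)
```

Here typing `khi` yields KHA followed by the vowel sign, while `ki` yields KA followed by the vowel sign. Whatever keys the user presses, what reaches storage is standard Unicode.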
Rendering prepares data to be viewed or printed by taking encoded data and combining it with a visual display. Users want to view their data in an easy, trouble-free manner. Publishers want accuracy and high-quality output. These factors should be available to every user throughout the range of tasks performed with the data.
Analysis refers to processes whereby data is analysed and presented to the user in a useful manner. These include sorting and word demarcation as well as more complex systems. Users want all such systems to be integrated into their other word processing, data acquisition and publishing functions. This analysis should also be script-sensitive, that is, usable in any script and for any language.
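Sorting shows why analysis has to be sensitive to script and language. The Python sketch below compares plain code-point order with an order based on a language's own alphabet. Spanish is used only because it is easy to check, and the `ALPHABET` table and `sort_key` function are simplified assumptions; a production system would more likely rely on a full collation implementation.

```python
# Assumed alphabet for the example: Spanish, where ñ sorts after n.
ALPHABET = "abcdefghijklmn\u00f1opqrstuvwxyz"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}

def sort_key(word):
    """Rank each letter by its alphabet position; unknown letters go last."""
    return [RANK.get(ch, len(ALPHABET)) for ch in word.lower()]

words = ["oso", "\u00f1u", "nube"]

# Code-point order puts ñ (U+00F1) after every unaccented letter.
by_code_point = sorted(words)

# Alphabet-aware order puts ñ between n and o, as speakers expect.
by_alphabet = sorted(words, key=sort_key)
```

With code-point order the list comes out as nube, oso, ñu; with the alphabet-aware key it comes out as nube, ñu, oso.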
Conversion includes any process that changes the encoding of textual data, whether that is moving data between systems, changing the character encoding of a text string or altering a document's structure or markup. These transformational processes are "bridges" which prepare data for multiple uses or provide linkage between older and newer systems. Without reliable conversion, users will hesitate to migrate their data to newer systems. A lack of good conversion tools may hinder the uses that could be made of older data.
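A character-encoding conversion can be sketched in a few lines of Python. The example assumes the older data is in ISO 8859-5, a standard 8-bit Cyrillic encoding, chosen here only because Python ships a codec for it; data in a custom or font-specific encoding would need its own mapping table instead.

```python
# Three bytes of legacy text, assumed to be encoded in ISO 8859-5.
legacy_bytes = b"\xbc\xd8\xe0"

# Step 1: interpret the legacy bytes as Unicode characters.
text = legacy_bytes.decode("iso8859_5")

# Step 2: store the text in a Unicode encoding form (UTF-8 here).
utf8_bytes = text.encode("utf-8")

# A reliable conversion is reversible: going back yields the original bytes.
restored = utf8_bytes.decode("utf-8").encode("iso8859_5")
```

The round-trip check at the end is the kind of assurance users need before they will trust a migration: nothing is lost in either direction.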
The above information is only an introduction to the NRSI model and covers just a subset of its areas. We hope you will find it a useful springboard to learning more about implementing writing systems.