Short URL: http://scripts.sil.org/Unicode
Unicode
4931 reads
Symbols in Phonetic Symbol Guide 2nd edn. in relation to Unicode 5.1
Peter Constable, 2009-01-21; 39336 reads
Assesses the symbols listed in the Phonetic Symbol Guide, by Pullum and Ladusaw, giving mappings through
Unicode 5.1, and comments on those symbols now supported in Unicode 5.1.
Orthography development in relation to Unicode
Lorna A. Priest, 2004-11-18; 13196 reads
It is out of our scope to give complete guidelines for developing an orthography. However, we would like to
give you a process to work through from a Unicode perspective.
In designing a writing system, one must decide what symbols will be used and how. Here we list Unicode
factors that should be taken into account.
Understanding Unicode™ - I
Peter Constable, 2001-06-13; 56665 reads
Unicode is a hot topic these days among computer users that work with multilingual text. They know it is
important, and they hear it will solve problems, especially for dealing with text involving multiple scripts.
They may not know where to go to learn about it, though. Or they may have read a few things about it and
perhaps have seen some code charts, but they are at a point at which they need to gain a firmer understanding
so that they can start to develop implementations or create content. This introduction is intended to give
such people the basic grounding that they need.
How do I encode...?
Lorna A. Priest, 2009-03-31; 24538 reads
These are questions (and the answers!) people have asked about what the encoding is for various characters in
the Unicode standard. This page is also a helpful teaching tool for understanding Unicode.
Understanding Unicode™ - II
Peter Constable, 2001-06-13; 21864 reads
Unicode is a hot topic these days among computer users that work with multilingual text. They know it is
important, and they hear it will solve problems, especially for dealing with text involving multiple scripts.
They may not know where to go to learn about it, though. Or they may have read a few things about it and
perhaps have seen some code charts, but they are at a point at which they need to gain a firmer understanding
so that they can start to develop implementations or create content. This introduction is intended to give
such people the basic grounding that they need.
A review of characters with compatibility decompositions
Peter Constable, 2003-06-09; 14136 reads
This appendix provides an introduction to the set of characters with compatibility decompositions, which
constitute perhaps the least principled elements of the Standard.
Software requirements for different levels of Unicode Support
Lorna A. Priest, 2009-11-30; 37333 reads
This page provides information on levels of Unicode support provided by different software applications.
Unicode 5.1 Latin and Cyrillic characters – sorted
Lorna A. Priest, 2008-05-09; 21938 reads
PDF documents with tables of Latin and Cyrillic characters from Unicode 5.1 sorted in Unicode Collation
Algorithm default order. Useful for finding characters in Unicode.
Why are my quote marks backwards?
Bob Hallissy, 2009-03-02; 11684 reads
If you work with right-to-left text in Unicode and have certain quote marks in your text, then this article
is for you. More specifically: if you suddenly see your smart quotes reverse direction, this article will
explain why and what to do (or not do) about it.
Encoding the Vai Syllabary in Unicode
Lorna A. Priest, 2004-12-01; 13092 reads
This document is a reference for those who are interested in encoding the Vai Syllabary in Unicode. It
contains information compiled during the time SIL was working on the Vai fonts.
SIL Unicode proposals and other standards-related documents
NRSI staff, 2009-01-20; 57121 reads
Unicode proposals and other standards-related documents
Is Unicode ready for you?
Albert Bickford, Jim Brase and Lorna Priest, 2007-05-11; 8149 reads
This article will help you decide whether Unicode will meet your orthography or character needs.
Sequence Checking in Thai & Lao
Martin Hosken, 2008-04-25; 9680 reads
With the Unicode character set being so large, it is natural for system and application implementors to want
to provide some mechanism for indicating what are clearly illegal sequences of Unicode characters...
Meteg and Siluq in the BHS
Joan Wardell and Christopher Samuel, 2003-09-30; 13219 reads
This short discussion on Meteg in biblical Hebrew explains how to encode various placements with a single
codepoint.
Reversed Nun in the BHS
Joan Wardell, Peter Constable and Christopher Samuel, 2003-11-05; 13401 reads
This short discussion of reversed nun explains how it is used in the Ezra SIL fonts.
Puncta in the BHS
Joan Wardell and Christopher Samuel, 2003-09-30; 12916 reads
This short discussion on Puncta dots in biblical Hebrew and Unicode explains how they are used in the Ezra
SIL fonts.
Unicode Word Macros Template
Peter G. Constable, 2007-11-14; 23130 reads
This template provides some VBA macros designed to deal with various Unicode-related issues in Word 97 and
later versions. These include providing a means to display the Unicode value of any character, to enter any
Unicode character, and to search for any Unicode character. Each of the macros can be accessed from a toolbar
that is provided.
Mapping codepoints to Unicode encoding forms
Peter Constable, 2001-06-13; 24295 reads
This appendix describes in detail the mappings from Unicode codepoints to the code unit sequences used in
each encoding form.
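As a rough illustration of what such a mapping looks like, the following minimal sketch uses Python's built-in codecs to list the code units that UTF-8, UTF-16 and UTF-32 produce for a single codepoint. The function name `code_units` is made up for this example, and the sketch is not taken from the appendix itself.

```python
def code_units(codepoint):
    """Return the code unit sequence for one codepoint in each encoding form.

    Hypothetical helper for illustration; it relies on Python's built-in
    codecs rather than implementing the bit-level mappings by hand.
    """
    ch = chr(codepoint)
    # UTF-8 code units are single bytes.
    utf8 = list(ch.encode("utf-8"))
    # UTF-16 code units are 16-bit values; big-endian output avoids a BOM.
    b16 = ch.encode("utf-16-be")
    utf16 = [int.from_bytes(b16[i:i + 2], "big") for i in range(0, len(b16), 2)]
    # UTF-32 code units are 32-bit values, one per codepoint.
    b32 = ch.encode("utf-32-be")
    utf32 = [int.from_bytes(b32[i:i + 4], "big") for i in range(0, len(b32), 4)]
    return {"UTF-8": utf8, "UTF-16": utf16, "UTF-32": utf32}


if __name__ == "__main__":
    for cp in (0x0041, 0x00E9, 0x20AC, 0x10400):
        units = code_units(cp)
        print(f"U+{cp:04X}",
              {form: [f"{u:X}" for u in seq] for form, seq in units.items()})
```

A codepoint beyond U+FFFF, such as U+10400, needs four UTF-8 code units and a surrogate pair in UTF-16, but only one UTF-32 code unit.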
Unicode Web site pages of interest
Peter Constable, 2002-10-06; 9667 reads
Links to pages of interest on the Unicode web site
Character Stories: U+013F, U+0140 Latin Capital / Small L with Middle Dot
Peter Constable, 2004-04-16; 10399 reads
The Catalan l-middle dot is usually encoded as a sequence of two characters. U+0140 was added to Unicode for
compatibility with ISO 6937.
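The relationship between the single compatibility character and the two-character sequence can be inspected with Python's standard `unicodedata` module. This is a minimal sketch; the helper name `describe_decomposition` is hypothetical, and the values returned depend on the Unicode Character Database version bundled with the Python build.

```python
import unicodedata


def describe_decomposition(ch):
    """Return (name, raw decomposition field, NFKD result as U+XXXX strings).

    Hypothetical helper for illustration only.
    """
    nfkd = [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFKD", ch)]
    return unicodedata.name(ch), unicodedata.decomposition(ch), nfkd


if __name__ == "__main__":
    for ch in ("\u013F", "\u0140"):
        print(describe_decomposition(ch))
```

The `<compat>` tag in the decomposition field marks a compatibility mapping, so only the compatibility normalization forms (NFKC and NFKD) replace U+0140 with the sequence of l and U+00B7 MIDDLE DOT; NFC and NFD leave it unchanged.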
Unicode Character Stories
Peter Constable, 2003-06-19; 10492 reads
Character Stories: U+2024 ONE DOT LEADER
Peter Constable, 2003-06-02; 13248 reads
U+2024 ONE DOT LEADER is a graphic character, whose glyph consists of a small baseline dot, and whose General
Category is Po (Other Punctuation).
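Properties such as the General Category can be checked directly. The following is a small sketch using Python's standard `unicodedata` module; the helper name `character_summary` is invented for this example.

```python
import unicodedata


def character_summary(ch):
    """Return the name, General Category and NFKC form of one character.

    Hypothetical helper for illustration only.
    """
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "nfkc": unicodedata.normalize("NFKC", ch),
    }


if __name__ == "__main__":
    print(character_summary("\u2024"))
```

For U+2024 this reports the category Po. The NFKC form is U+002E FULL STOP, because U+2024 carries a compatibility decomposition to that character.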
Character Stories: U+02EA, U+02EB Yin / Yang Departing Tone Marks
Peter Constable, 2003-06-19; 9827 reads
U+02EB and U+02EA come from the TCA submissions regarding the Minnan and Hakka languages, for use with
extended Bopomofo.
Tutorials
Unicode Transition Tutorial Links
Lorna A. Priest, 2009-02-17; 17978 reads
Because of special character needs, SIL teams have long used custom encoded fonts. This was often the only
solution and worked fairly well until newer software began "breaking" our solutions. Unicode obviates the
need for custom encoded fonts. The tutorials in this section were developed for helping people in their
transition to Unicode. You will find tools for helping you figure out what the Unicode encoding should be,
tutorials and tools for actually converting legacy encoded documents to Unicode encoded documents and
tutorials to help you with keyboarding issues.
Helpful Utilities
Unicode Character Properties Excel Workbook
Peter Constable and Bob Hallissy, 2012-02-10; 50690 reads
Various files from the Unicode Character Database (6.1) compiled into an Excel workbook.
UnicodeChecker — UnicodeChecker for Mac OS X is an application that displays information for every
code point from the Unicode Standard.
Unibook — The Unibook Character browser is a small utility for offline viewing of the character
charts and character properties for The Unicode Standard.
http://www.ling.upenn.edu/unicode/ — Unicode Character Finder is a web page that allows you to
search for Unicode characters, view characters by Unicode block, and copy characters to the clipboard for use
in other applications.
© 2003-2013 SIL International, all rights
reserved, unless otherwise noted elsewhere on this page.
Provided by SIL's Non-Roman Script Initiative.