NRSI: Computers & Writing Systems

Short URL: http://scripts.sil.org/CatUnicodeCharacterStories

Unicode Character Stories

Peter Constable, 2003-06-19

The Unicode character set is very large, and there are a lot of details on individual characters that simply cannot be covered in the text of the standard. The following articles provide details for at least some characters in Unicode.

Mapping codepoints to Unicode encoding forms Peter Constable, 2001-06-13
This appendix describes in detail the mappings from Unicode codepoints to the code unit sequences used in each encoding form.

How do I encode...? Lorna A. Priest, 2009-03-31
These are questions (and the answers!) people have asked about what the encoding is for various characters in the Unicode standard. This page is also a helpful teaching tool for understanding Unicode.

Symbols in Phonetic Symbol Guide 2nd edn. in relation to Unicode 5.1 Peter Constable, 2009-01-21
Assesses the symbols listed in the Phonetic Symbol Guide, by Pullum and Ladusaw, giving mappings through Unicode 5.1, and comments on those symbols now supported in Unicode 5.1.

Orthography development in relation to Unicode Lorna A. Priest, 2004-11-18
It is out of our scope to give complete guidelines for developing an orthography. However, we would like to give you a process to work through from a Unicode perspective.
In designing a writing system, one must decide what symbols will be used and how. Here we list Unicode factors that should be taken into account.

Understanding Unicode™ - I Peter Constable, 2001-06-13
Unicode is a hot topic these days among computer users that work with multilingual text. They know it is important, and they hear it will solve problems, especially for dealing with text involving multiple scripts. They may not know where to go to learn about it, though. Or they may have read a few things about it and perhaps have seen some code charts, but they are at a point at which they need to gain a firmer understanding so that they can start to develop implementations or create content. This introduction is intended to give such people the basic grounding that they need.

Understanding Unicode™ - II Peter Constable, 2001-06-13
Unicode is a hot topic these days among computer users that work with multilingual text. They know it is important, and they hear it will solve problems, especially for dealing with text involving multiple scripts. They may not know where to go to learn about it, though. Or they may have read a few things about it and perhaps have seen some code charts, but they are at a point at which they need to gain a firmer understanding so that they can start to develop implementations or create content. This introduction is intended to give such people the basic grounding that they need.

A review of characters with compatibility decompositions Peter Constable, 2003-06-09
This appendix provides an introduction to the characters with compatibility decompositions, which constitute perhaps the least principled elements of the Standard.

Software requirements for different levels of Unicode Support Lorna A. Priest, 2009-11-30
This page provides information on levels of Unicode support provided by different software applications.

Unicode 5.1 Latin and Cyrillic characters – sorted Lorna A. Priest, 2008-05-09
PDF documents with tables of Latin and Cyrillic characters from Unicode 5.1 sorted in Unicode Collation Algorithm default order. Useful for finding characters in Unicode.

Why are my quote marks backwards? Bob Hallissy, 2009-03-02
If you work with right-to-left text in Unicode and have certain quote marks in your text, then this article is for you. More specifically: if you suddenly see your smart quotes reverse direction, this article will explain why and what to do (or not do) about it.

Encoding the Vai Syllabary in Unicode Lorna A. Priest, 2004-12-01
This document is a reference for those who are interested in encoding the Vai Syllabary in Unicode. It contains information compiled during the time SIL was working on the Vai fonts.

SIL Unicode proposals and other standards-related documents NRSI staff, 2009-01-20
Unicode proposals and other standards-related documents

Is Unicode ready for you? Albert Bickford, Jim Brase and Lorna Priest, 2007-05-11
This article will help you decide whether Unicode meets the needs of your orthography or character repertoire.

Sequence Checking in Thai & Lao Martin Hosken, 2008-04-25
With the Unicode character set being so large, it is natural for system and application implementors to want to provide some mechanism for indicating what are clearly illegal sequences of Unicode characters...

Meteg and Siluq in the BHS Joan Wardell and Christopher Samuel, 2003-09-30
This short discussion on Meteg in biblical Hebrew explains how to encode various placements with a single codepoint.

Reversed Nun in the BHS Joan Wardell, Peter Constable and Christopher Samuel, 2003-11-05
This short discussion of reversed nun explains how it is used in the Ezra SIL fonts.

Puncta in the BHS Joan Wardell and Christopher Samuel, 2003-09-30
This short discussion on Puncta dots in biblical Hebrew and Unicode explains how they are used in the Ezra SIL fonts.

Unicode Word Macros Template Peter G. Constable, 2007-11-14
This template provides some VBA macros designed to deal with various Unicode-related issues in Word 97 and later versions. These include providing a means to display the Unicode value of any character, to enter any Unicode character, and to search for any Unicode character. Each of the macros can be accessed from a toolbar that is provided.
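
As a rough illustration of the codepoint-to-code-unit mappings covered in "Mapping codepoints to Unicode encoding forms" above, the following Python sketch lists the code units for a single character in UTF-8, UTF-16 and UTF-32. It is not taken from any of the listed articles; the function name `code_units` is hypothetical, and the sketch simply relies on Python's built-in codecs.

```python
def code_units(ch: str) -> dict:
    """Return the code units of one character in each Unicode encoding form.

    Hypothetical helper for illustration only; it uses Python's built-in
    big-endian codecs so that no byte order mark is emitted.
    """
    if len(ch) != 1:
        raise ValueError("expected exactly one character")

    # UTF-8: code units are single bytes.
    utf8 = list(ch.encode("utf-8"))

    # UTF-16: code units are 16-bit values (two of them for a surrogate pair).
    b16 = ch.encode("utf-16-be")
    utf16 = [int.from_bytes(b16[i:i + 2], "big") for i in range(0, len(b16), 2)]

    # UTF-32: always one 32-bit code unit, equal to the codepoint itself.
    b32 = ch.encode("utf-32-be")
    utf32 = [int.from_bytes(b32[i:i + 4], "big") for i in range(0, len(b32), 4)]

    return {"utf-8": utf8, "utf-16": utf16, "utf-32": utf32}


if __name__ == "__main__":
    for ch in ("\u0140", "\U00010000"):
        units = code_units(ch)
        print(f"U+{ord(ch):04X}",
              {form: [hex(u) for u in seq] for form, seq in units.items()})
```

A character in the Basic Multilingual Plane such as U+0140 needs one UTF-16 code unit, while a supplementary-plane codepoint such as U+10000 needs a surrogate pair.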

Unicode

Unicode Web site pages of interest Peter Constable, 2002-10-06
Links to pages of interest on the Unicode web site

Character Stories: U+013F, U+0140 Latin Capital / Small L with Middle Dot Peter Constable, 2004-04-16
The Catalan l-middle dot is usually encoded as a sequence of two characters. U+0140 was added to Unicode for compatibility with ISO 6937.

Character Stories: U+2024 ONE DOT LEADER Peter Constable, 2003-06-02
U+2024 ONE DOT LEADER is a graphic character, whose glyph consists of a small baseline dot, and whose General Category is Po (Other Punctuation).

Character Stories: U+02EA, U+02EB Yin / Yang Departing Tone Marks Peter Constable, 2003-06-19
U+02EB and U+02EA come from the TCA submissions regarding the Minnan and Hakka languages, for use with extended Bopomofo.
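
The properties mentioned in the Character Stories entries above (the General Category of U+2024, and the compatibility status of U+0140) can be checked against the Unicode Character Database. The sketch below uses Python's standard `unicodedata` module; the helper name `describe` is hypothetical, and the values returned depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata


def describe(ch: str) -> tuple:
    """Return (codepoint label, name, General Category, decomposition mapping).

    Hypothetical helper for illustration; the data comes from the Unicode
    Character Database shipped with the running Python interpreter.
    """
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.decomposition(ch),  # empty string if there is none
    )


if __name__ == "__main__":
    for ch in ("\u013F", "\u0140", "\u2024", "\u02EA", "\u02EB"):
        print(describe(ch))

    # Compatibility normalization (NFKC) replaces U+0140 with the
    # two-character sequence <l, U+00B7 MIDDLE DOT>; canonical
    # normalization (NFC) leaves it unchanged.
    print(unicodedata.normalize("NFKC", "\u0140") == "l\u00B7")
    print(unicodedata.normalize("NFC", "\u0140") == "\u0140")
```

The NFKC result matches the two-character encoding of the Catalan l-middle dot that the U+013F / U+0140 story describes as the usual practice.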



Note: the opinions expressed in submitted contributions below do not necessarily reflect the opinions of our website.

Michael 'michka' Kaplan, Fri, Jun 20, 2003 09:21 (CDT) [modified by peterc on Fri, Jun 20, 2003 09:42 (CDT)]

U+262B FARSI SYMBOL

Neither Farsi, nor a symbol. In real life, it is the official emblem of the government of the Islamic Republic of Iran.

That makes it a logo, so technically it shouldn't have been considered a candidate for encoding, then!

But the funny fact is that it has been in Unicode since 1.0...

And in Unicode 1.0 it was called "SYMBOL OF IRAN", which was closer to the description of its use. It was WG2 that insisted on renaming it "FARSI SYMBOL" to get "IRAN" out of the name...

(Content provided by Kenneth Whistler, Peter Constable, and others)




© 2003-2014 SIL International, all rights reserved, unless otherwise noted elsewhere on this page.
Provided by SIL's Non-Roman Script Initiative. Contact us here.