NRSI: Computers & Writing Systems

Short URL: http://scripts.sil.org/CharStories_0140

Character Stories: U+013F, U+0140 Latin Capital / Small L with Middle Dot

Peter Constable, 2004-04-16

[Source: Eric Muller, Unicode list, 2002-8-6]

U+0140 LATIN SMALL LETTER L WITH MIDDLE DOT is intended to be used in Catalan (‘ŀl’ is pronounced as two separate ‘l’ while ‘ll’ is pronounced as in ‘million’). However... everybody seems to use U+006C U+00B7 U+006C.



[Source: Ken Whistler, Unicode list, 2002-8-6]

There is no particular reason to use the l· as a single character, when all the 8859-based and Windows 1252 implementations would be using U+00B7 for the middle dot.

Consider U+0140 as effectively a compatibility character for ISO 6937. It is mapped to 0xF7 in that standard. It is also mapped to 0xA9A8 in Code Page 949 (Korean) — which probably got it from ISO 6937 in the first place.
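The Code Page 949 mapping mentioned above can be checked with a short Python sketch. It assumes Python's built-in `cp949` codec follows the same mapping table; the Python standard library has no ISO 6937 codec, so the 0xF7 mapping is not checked here.

```python
# Encode U+0140 LATIN SMALL LETTER L WITH MIDDLE DOT using Python's
# built-in cp949 codec (assumed to match the Code Page 949 table).
small_l_middle_dot = "\u0140"
encoded = small_l_middle_dot.encode("cp949")

# Show the two-byte code as uppercase hex; the text above gives 0xA9A8.
print(encoded.hex().upper())

# Decoding the bytes again should round-trip to the same character.
decoded = encoded.decode("cp949")
print(decoded == small_l_middle_dot)
```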



[Source: Ken Whistler, Unicode list, 2004-4-15, edited]

It should be noted that there is a decomposition for U+0140 in the Unicode Character Database...

It is a compatibility decomposition for two reasons: first, the decomposition into the sequence <006C, 00B7> may result in rendering differences (both because of potentially different decisions about where to render the dot and because the introduction of the U+00B7 MIDDLE DOT might impact line break decisions, depending on the implementation); second, the properties of the characters in the sequence <006C, 00B7> are distinct from those for <0140> by itself, and may impact things such as identifier parsing, again depending on the implementation. And, as I indicated before, U+0140 is itself basically a compatibility character, introduced for mapping to ISO 6937, a preexisting standard that was among the list of character encoding standards intended to be covered by the initial Unicode repertoire.
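The decomposition described above can be inspected with Python's standard `unicodedata` module. This is a minimal sketch; the variable names are illustrative, and the values depend on the Unicode Character Database version bundled with the Python build in use.

```python
import unicodedata

ch = "\u0140"  # LATIN SMALL LETTER L WITH MIDDLE DOT

# The UCD decomposition field; the "<compat>" tag marks a
# compatibility (not canonical) decomposition.
decomp = unicodedata.decomposition(ch)
print(decomp)

# Canonical normalization (NFD) leaves the character alone, while
# compatibility normalization (NFKD) yields the sequence <006C, 00B7>.
nfd = unicodedata.normalize("NFD", ch)
nfkd = unicodedata.normalize("NFKD", ch)
print([f"U+{ord(c):04X}" for c in nfd])
print([f"U+{ord(c):04X}" for c in nfkd])

# The General_Category values differ: U+0140 is a lowercase letter,
# whereas the sequence ends in a punctuation character.
cat_single = unicodedata.category(ch)
cats_seq = [unicodedata.category(c) for c in nfkd]
print(cat_single, cats_seq)
```

The difference in General_Category (a letter versus a letter followed by punctuation) is one concrete instance of the property differences mentioned in the quoted post.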



[Source: António Martins-Tuválkin, Unicode list, 2004-3-26, edited]

Standard Catalan orthography uses a regular middle dot to separate two “L”s in those cases where they are pronounced as a single “L” and are doubled only for etymological reasons.

This dot is not attached to the preceding “L” in any way; it is not some kind of diacritic. It is a standalone character, akin to the hyphen in French or Portuguese...

I advise removing the note “Catalan” under U+0140 and U+013F, and perhaps replacing the whole note with «for Catalan use U+006C U+00B7» (and, for the capital, U+004C U+00B7).
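For text that already contains U+0140 or U+013F, the substitution suggested above could be sketched as follows. The helper name `use_plain_middle_dot` and the sample words are hypothetical, chosen only for illustration; this is one possible approach, not a method prescribed by any of the quoted authors.

```python
# Map each precomposed character to the letter + U+00B7 MIDDLE DOT sequence.
_TO_SEQUENCE = str.maketrans({
    "\u0140": "l\u00b7",  # small l with middle dot -> U+006C U+00B7
    "\u013f": "L\u00b7",  # capital L with middle dot -> U+004C U+00B7
})


def use_plain_middle_dot(text: str) -> str:
    """Replace U+0140/U+013F with the letter followed by U+00B7 (hypothetical helper)."""
    return text.translate(_TO_SEQUENCE)


# Illustrative sample: the Catalan word usually written "col·legi".
sample = "co\u0140legi"
converted = use_plain_middle_dot(sample)
print([f"U+{ord(c):04X}" for c in converted])
```

A targeted translation table is used here instead of NFKC or NFKD normalization because compatibility normalization would also rewrite every other compatibility character in the text, which may not be wanted.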






© 2003-2018 SIL International, all rights reserved, unless otherwise noted elsewhere on this page.
Provided by SIL's Non-Roman Script Initiative. Contact us here.