This is an archive of the original scripts.sil.org site, preserved as a historical reference. Some of the content is outdated. Please consult our other sites for more current information: software.sil.org, ScriptSource, FDBP, and silfontdev




Computers & Writing Systems

Short URL: https://scripts.sil.org/UTTCCountP

Perl Character Count

a Perl Utility for Counting Legacy Characters

Joan Wardell, 2005-10-24

Goals for this step

Here is an easy character count utility for those who have Perl installed.

This step is part of the procedure How to Write a Conversion Mapping for your Legacy Font.

The goal is to run a character count on a legacy text file.

Download the utility

If you didn't download the program during setup, you will need a copy of it: MakeCharList Perl Source (Joan Wardell, 2005-12-02), available as "MakeCharListPL.zip", a 1KB ZIP archive.

Prepare the files

Unzip the file and place a copy of MakeCharList.pl in your work folder. You may also wish to place a copy in a more permanent location for stand-alone programs of this type.

You should already have a copy of your legacy text file in your work folder.

Run the Program

From Windows Explorer, right-click your work folder name and select Open Command Window Here.

Type one of the following commands in the Command Window, replacing inputfile with the name of your legacy file and outputfile with a name of your choice for the results. Don't leave out the "<" and ">" signs; they redirect the program's input and output.

perl MakeCharList.pl inputfile > outputfile

OR

MakeCharList.pl < inputfile > outputfile

Beginning with version 3, you can add -f -r to either command to get the data sorted in reverse order of frequency.
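
For readers curious about what such a count involves, here is a minimal Python sketch. It is not the MakeCharList.pl source. It assumes the legacy file uses a single-byte encoding, so that each byte is one character, and it assumes "reverse order of frequency" means most frequent first. The names count_bytes and reverse_frequency are illustrative only.

```python
from collections import Counter


def count_bytes(data: bytes, reverse_frequency: bool = False) -> list[tuple[int, int]]:
    """Return (byte value, occurrences) pairs for every byte present in data.

    Assumption: the legacy text is single-byte encoded, so one byte is one character.
    """
    counts = Counter(data)  # iterating over bytes yields integers 0-255
    if reverse_frequency:
        # Most frequent first; ties are broken by byte value for a stable listing.
        return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    # Default: ordered by byte value.
    return sorted(counts.items())


if __name__ == "__main__":
    sample = b"banana"
    print(count_bytes(sample))
    print(count_bytes(sample, reverse_frequency=True))
```

A frequency-sorted listing puts the characters that matter most for a conversion mapping at the top, while the default ordering makes it easier to look up a particular code.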

Note:

We recommend using all lowercase characters for filenames. Put quotes around names that contain spaces.

Review the results

Open the outputfile in Word. Change the 4th column (Legacy) to your legacy font (if you have one). The 5th column shows the number of occurrences in the file for each character in your legacy font.
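
To illustrate why applying the legacy font to one column makes the counts readable, here is a hedged Python sketch that builds one tab-separated row per byte. The four-column layout (decimal code, hex code, character, count) is hypothetical and does not reproduce the five-column layout of the real output. Decoding with Latin-1 is also an assumption, chosen because it maps every byte value to exactly one character.

```python
from collections import Counter


def char_table(data: bytes) -> str:
    """Build a tab-separated table: decimal code, hex code, character, count."""
    rows = []
    for value, count in sorted(Counter(data).items()):
        # Latin-1 maps each byte 0-255 to one character, so nothing is lost;
        # the legacy font applied in a word processor decides how it looks.
        char = bytes([value]).decode("latin-1")
        rows.append(f"{value}\t{value:02X}\t{char}\t{count}")
    return "\n".join(rows)


if __name__ == "__main__":
    print(char_table(b"abca"))
```

The character column holds the raw character, so it only shows the intended glyph once the legacy font is applied to it. A real tool would also need to escape control characters such as tab and newline, which this sketch does not do.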

Sample character count



Here is a sample of results from a legacy file using the SIL IPA93 font.

Save this file as a Word document. Close the Command Window by typing "exit" and pressing Enter.

Page History

2008-02-22 JW: reviewed, minor updates

2005-10-24 JW: Page created


© 2003-2024 SIL International, all rights reserved, unless otherwise noted elsewhere on this page.
Provided by SIL's Writing Systems Technology team (formerly known as NRSI).