NRSI: Computers & Writing Systems
What's in Your Unicode File?
The complete list of routines in How to Write a Conversion Mapping for your Legacy Font is here.
Goals of this step
After converting your data to Unicode, it may be useful to find out what's in your file. By the end of these instructions you should have an accurate count of the codepoints which occur in your Unicode text file.
This step is part of the procedure How to Write a Conversion Mapping for your Legacy Font.
Running a character count for text files using Unicode fonts (in Unicode encoding).
Download the program
If you didn't download the program during setup, you will need a copy of the Unicode Character Count utility.
Choose the most recent Unicode Character Count Windows executable under Downloads. This is a Windows application that does not require any installation.
Prepare the files
Place a copy of UnicodeCcount.exe in your work folder. You may also wish to place a copy in a more permanent location for stand-alone programs of this type, such as in a new folder under C:\Program Files.
You should already have a copy of your Unicode text file in your work folder.
Run the program
From Windows Explorer, right-click your work folder name. Select
Type the following command in the Command Window, replacing inputfile with the name of your Unicode text file and outputfile with a name of your choice for the results.
We recommend using all lowercase characters for filenames. Put quotation marks around names that contain spaces. See the documentation for advanced commands.
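As an independent cross-check on the utility's output, the same kind of count can be sketched in a few lines of Python. This is a minimal sketch, not the Unicode Character Count utility itself: the function names are made up for illustration, the input file is assumed to be UTF-8, and the tab-separated layout is an assumption rather than the utility's actual report format.

```python
import unicodedata
from collections import Counter


def count_codepoints(text):
    """Return a Counter mapping each codepoint (as an int) to its number of occurrences."""
    return Counter(ord(ch) for ch in text)


def format_counts(counts):
    """Build one tab-separated line per codepoint: U+XXXX, count, Unicode name."""
    lines = []
    for codepoint in sorted(counts):
        # Control characters and unassigned codepoints have no name.
        name = unicodedata.name(chr(codepoint), "<no name>")
        lines.append(f"U+{codepoint:04X}\t{counts[codepoint]}\t{name}")
    return lines


def count_file(path, encoding="utf-8"):
    """Read a text file (UTF-8 is an assumption) and return its codepoint counts."""
    # newline="" keeps CR and LF as they are, so line endings are counted too.
    with open(path, encoding=encoding, newline="") as handle:
        return count_codepoints(handle.read())


if __name__ == "__main__":
    sample = "na\u00efve caf\u00e9"
    for line in format_counts(count_codepoints(sample)):
        print(line)
```

Note that this counts codepoints, not user-perceived characters: a base letter followed by a combining mark contributes to two separate counts.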
Review the results
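Compare this chart with the chart of your legacy file that you created in What's in Your File?
Sort both charts by number of occurrences, if needed. Depending on your mapping, these charts should be similar, but not necessarily identical. For example, a single legacy code that maps to a base letter plus a combining mark adds to two codepoint counts in the Unicode chart. Verify that your results make sense.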
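The comparison can also be sketched in Python. The example below is only an illustration under stated assumptions: the counts are held as dictionaries of code to number of occurrences, and the mapping is a hypothetical, context-free table from each legacy code to a Unicode string. Mappings with contextual rules or reordering will not follow this simple projection, so a reported difference is a prompt to look closer, not proof of an error.

```python
from collections import Counter


def sort_by_occurrences(counts):
    """Return (code, count) pairs, most frequent first; ties broken by code value."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def expected_unicode_counts(legacy_counts, mapping):
    """Project legacy counts through a mapping of legacy code -> Unicode string."""
    expected = Counter()
    for legacy_code, count in legacy_counts.items():
        # A legacy code may map to several codepoints; each one gets the full count.
        for ch in mapping[legacy_code]:
            expected[ord(ch)] += count
    return expected


def differences(expected, actual):
    """Return {codepoint: (expected, actual)} wherever the two counts disagree."""
    return {
        cp: (expected.get(cp, 0), actual.get(cp, 0))
        for cp in sorted(set(expected) | set(actual))
        if expected.get(cp, 0) != actual.get(cp, 0)
    }


if __name__ == "__main__":
    # Hypothetical data: legacy code 0x80 maps to "a" plus a combining acute accent.
    legacy_counts = {0x41: 3, 0x80: 2}
    mapping = {0x41: "A", 0x80: "a\u0301"}
    actual = {0x41: 3, 0x61: 2, 0x301: 1}

    expected = expected_unicode_counts(legacy_counts, mapping)
    for cp, (want, got) in differences(expected, actual).items():
        print(f"U+{cp:04X}: expected {want}, found {got}")
```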