Computers & Writing Systems
An introduction to keyboard layout design theory: What goes where?
Designing a keyboard layout is relatively easy: you just allocate codepoints to keystrokes. The difficulty comes in deciding which codepoints to assign to which keystrokes. Do you design around the characters printed on the keytops of a user's keyboard, or around the relative positions of the keys? What do you do if you want to be able to type more characters than there are keys on your keyboard? This chapter examines various design tradeoffs and looks at some of the approaches used in different keyboarding situations and technologies.

Note that, as yet, no technology supports all the approaches presented here. So in addition to making decisions based on the particular keyboard behaviour required, you will also need to take into account the limitations of your keyboarding software.

First we consider issues of keyboard layout, and then issues of large keyboards (those where there are more characters to be typed than keys to type them). That section also contains information relevant to those designing keyboards with very few characters. Then we look at sequence checking, and finally at how different types of keyboards are expressed as rule systems.

Surprisingly, much of what we need to consider can be brought out in the simple example of adding support for the character ÿ to a keyboard. All the examples are expressed in terms of the Keyman keyboard description language, and the reader is referred to the relevant documentation on that language.

1 Keyboard Layout

There are two ways of considering the question of adding 'ÿ' to a keyboard.

1.1 Mnemonic

The first way to consider the 'ÿ' is that it consists of two components: 'y' + umlaut. In this case we would have a special keystroke to add the umlaut on top of the 'y'; for example, the keystroke ¨ following a y might add the umlaut.
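As a rough illustration of such a rule, here is a minimal Python sketch. Python is used only so that the example can be run; it is not Keyman syntax. The names MNEMONIC_RULES and press are invented for this example, and the sketch assumes that the keyboard can see and replace the character before the cursor.

```python
# Hypothetical rule table: (character before the cursor, key pressed) -> replacement.
MNEMONIC_RULES = {
    ("y", "¨"): "ÿ",
    ("Y", "¨"): "Ÿ",
}


def press(text: str, key: str) -> str:
    """Return the text that results from pressing `key` at the end of `text`."""
    if text and (text[-1], key) in MNEMONIC_RULES:
        # Replace the base letter with its umlauted form.
        return text[:-1] + MNEMONIC_RULES[(text[-1], key)]
    # No rule applies: the key is simply appended.
    return text + key


result = ""
for key in ["y", "¨"]:
    result = press(result, key)
print(result)  # prints ÿ
```

The table is keyed on what is printed on the keytops, which is what makes this a mnemonic design.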
In other words, we are using existing information on the keys of a user's keyboard to help the user remember the keying of the character. We use the term mnemonic keyboard for this type of keyboard. The SILIPA93 keyboard is of this type. Mnemonic keyboards are commonly used with Latin-based scripts, since there is a close correspondence between what people want to type and what they see printed on the keyboard in front of them.

1.2 Positional

The keys on a qwerty keyboard can also be defined positionally, in relation to each other. In this view it does not matter what is printed on the keytops; what is important is which key is next to which key on which row. This approach is most commonly used when implementing a keyboard based on a typewriter layout or some other standard.

Thus we might consider 'ÿ' as a unit and want a single key to press to type this character. In addition, we would like it on the periphery of the keyboard, since it is a rarely typed letter, and it goes just as well on the right, where there are punctuation characters we can use. So we might specify that it is associated with the right-most key on the second row of the keyboard, which in mnemonic terms is the ']' key.

Defining keyboards in terms of the relative positions of keys is less common for adding a single character to an existing keyboard than it is for implementing a whole keyboard, particularly one that emulates an existing keyboard layout. For example, there is no mnemonic relationship between the keytops on my keyboard and the Thai letters I want to type. The Thai keyboard is designed in terms of the old typewriter layout, which is a good layout: the most commonly typed letters are positioned in easy-to-reach locations, and rarely typed letters are more difficult to reach.
When designing a completely new layout purely in terms of relative positioning, without considering the existing keytops, it is possible to do some analysis to allow typists to type quickly, for example by placing commonly typed letters on the keys in the middle of the keyboard. This is a radical departure from the QWERTY layout, which is commonly said to have been designed to slow typists down. The Dvorak layout is an earlier attempt to provide a keyboard layout that allows typists to type English faster.

2 Large Keyboards

In many cases, the requirement is to design a keyboard which supports more characters than can be accessed by a single press of one of the 101 or 102 keys on a normal keyboard. Numerous approaches are used to extend keyboards, and we present most of them here.

2.1 Modifier Keys

The most common approach is to extend a keyboard using modifier keys such as Shift (a traditional modifier key) or Ctrl, Alt, Alt-Gr, Command, Option, etc., depending upon what type of keyboard you have. The Macintosh, for example, allows access to all the 8-bit codes directly through combinations of the Option and Shift keys.

The problem is that modifier keys are often used by applications for speed keys or for controlling the application, and requiring their use for typing can preclude their use by the application. Worse, the application may take precedence over the modified key and not allow that combination to be used for typing a character. The Mac, at least, is reasonably tidy: combinations including the Command key are for 'speed keys', and those without are not (except in non-text-oriented applications or situations such as games). So combinations including the Shift and Option modifiers are safe for generating characters. On Windows, Alt is typically used for accessing menu items, whereas Alt-Gr may be used for entering extra characters.

2.2 Dead keys

Dead keys are a popular approach to extending the keyboard for Latin keyboards.
This approach allows the user to type a single character as a sequence of two or more keys. All but the last key display nothing, but change the state of the keyboard for subsequent keystrokes. For example, on the International English keyboard, pressing the ' key results in nothing being displayed. Following that with the a key results in 'á' being output. If a key such as b were to follow instead, then the two characters 'b would be output.

Dead keys work well where there is a very strong mnemonic relationship between the key being pressed and its function. Dead key sequences should be obvious, and short enough that the user does not forget where they are. The problem with dead keys is that they easily confuse users, since pressing a dead key gives no visual feedback.

2.3 Operator keys

A different approach from dead keys is to place the modifier after the key it modifies. Thus we might type a followed by ' to get á. On pressing a, an 'a' would be output. Then, when ' is pressed, the 'a' preceding the cursor is replaced by 'á'. This is a very powerful approach in that the user always has feedback on what they are typing.

The major difficulty with this approach is the implementation. First, you need a system which can go back and edit the document you are typing. If I were to click in a document following an 'a' and then press ', I would expect that 'a' to change. But that might well not be technically possible. Tools such as Keyman work hard to emulate this behaviour, but even then have limitations.

The second requirement is that all intermediate output characters need to be supported by the system. When implementing the IPA keyboard, it might have been nice to display an intermediate ';' as it was pressed. But the ';' does not exist in the IPA font, and so could not be displayed. This problem becomes more acute with the arrival of Unicode, since you do not know which font supports which Unicode subset.
Thus you can only rely on intermediate output for codes that your keyboard needs to generate anyway.

2.4 Candidate Window

One approach to the very large keyboard problem, for languages such as Chinese, is to pop up a special window as a key is pressed. The window contains a set of possible characters to be input, from which the user can select with the mouse, by pressing the initial key, or with the arrow keys. As the user types more keys, the selection list changes, homing in on the appropriate character to input. For example, a Chinese keyboard displays two windows: the left-hand window contains the keystrokes I have pressed and the right-hand window contains options for what I might be wanting to type. There are various other controls to allow me to page through the list, etc., and these are helpfully in Chinese, the language I am typing, whether I can read it or not!

The candidate window provides a powerful mechanism for selecting characters from a large list, or even from a shorter list if a user is liable to have difficulty remembering the keying for a particular character. Its weakness is the amount of screen space it takes up, although with increasingly large screens this is becoming less of a problem.

3 Sequence Checking

One problem, more prevalent in scripts which use diacritics, is ensuring that people do not, for example, type the same diacritic twice. This enforcement of valid keying sequences is known as sequence checking, and can be as simple as ensuring that people don't type the same diacritic key twice in succession.

In addition to ensuring that only valid base character and diacritic combinations are typed, there is the issue of ensuring that they are typed in the correct order. Unicode specifies a combining order for diacritics, and it helps if the keyboard can encourage data entry in this order.
The order can be enforced either by not allowing an earlier diacritic to be typed after a later one, or by allowing the keystroke and then re-ordering the data in the application. The problem is the same as for operator keys: it is not easy for the keyboarding utility to edit a document, especially if the cursor is positioned at an arbitrary point within it. For this reason, no application can assume that the keyboarding utility will ensure that data is entered in any particular canonical form or combining order.

The keyboard interacts not with rendered text but with the underlying stored text; even so, the two are related in that both are concerned with the visual representation of the text. Thus the rendering subsystem may handle 'illegal' sequences by, for example, displaying illegal diacritics over a dotted circle, which can take some work off the keyboard. But there is no harm in programming defensively, with the renderer able to display illegal sequences and the keyboard endeavouring not to let users type them. With dumb rendering systems, which are particularly unable to cope with illegal sequences (for example, duplicates of the same diacritic), more of the burden of ensuring valid data falls on the keyboarding subsystem.

4 Describing a Keyboard

The details of how a keyboarding utility interacts with the system are beyond the scope of this discussion, but keyboards are described in different ways, which are worth examining. There are two dimensions to the issue: what a keyboard is described in terms of, and how the rules that make up a description interact.

4.1 Types of Rules

A keyboard may be described in terms of keying sequences. For many keyboards, the sequences are of length 1. Thus one keystroke maps to one code input to the application.
More sophisticated systems, particularly those based around dead keys, allow a keystroke to change the state of this one-to-one mapping so that a following keystroke is mapped differently. This is the typical way of implementing a keyboard described in terms of keystrokes only. When the sequence becomes unambiguous, the utility outputs the corresponding string of codes to the application; while the keying is ambiguous, no output is generated.

For example, consider a rule-based language with the following rules. Each rule consists of a sequence of keystrokes (and not a keystroke in the context of previous output).

    y -> y
    y " -> ÿ
    y ' -> _
    y @ -> ( y )

If I were to type y, nothing would be output, because the system would not know whether a following keystroke is going to change the 'y' to be output. When I then type z, for example, I would see the output 'yz'. Or if I followed the y with ", I would see 'ÿ'. Either way, the y is treated as a dead key, with no output when it is pressed. This is not the most helpful user interface.

A keyboard described in terms of keystroke sequences alone also has difficulty when it comes to editing. Consider the case, in the above rule set, of my typing y @ and the keyboard entering '(y)'. Now, what happens when I press backspace? Should the system delete all 3 codes or just the last one?

The keying sequence approach is used in all dead key-based systems and for simple one-to-one keyboard mapping utilities.

An alternative approach is to describe rules in terms of keystrokes and previous output. This is the approach used in Keyman and SILKey. Thus we might have a rule:

    any(code) + @ -> '(' context ')'

which means that pressing the @ key following any code from the class code (which in this case is made to be near universal) in the output results in that code being parenthesised.
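A rule of this kind can be mimicked in a few lines of Python, purely as a sketch of the idea and not as Keyman syntax or behaviour. The class code is assumed here to be ASCII letters and digits, the backspace handling is a simplification, and the names CODE, BACKSPACE, press and type_keys are invented for the example.

```python
import string

# Assumption: the class "code" is taken to be ASCII letters and digits;
# the text only describes it as "near universal".
CODE = set(string.ascii_letters + string.digits)

BACKSPACE = "\b"  # hypothetical stand-in for the backspace key


def press(text: str, key: str) -> str:
    """Apply one keystroke to the visible text and return the new text."""
    if key == BACKSPACE:
        return text[:-1]
    # any(code) + @ -> '(' context ')'
    if key == "@" and text and text[-1] in CODE:
        return text[:-1] + "(" + text[-1] + ")"
    return text + key


def type_keys(keys: str) -> str:
    """Feed a sequence of keystrokes through press(), starting from empty text."""
    text = ""
    for key in keys:
        text = press(text, key)
    return text


print(type_keys("y@"))                    # (y)
print(type_keys("y@" + BACKSPACE))        # (y
print(type_keys("y@" + BACKSPACE + "@"))  # ((y)
```

The function keeps no state between calls: each keystroke is interpreted only against the text that is currently there, so a backspace removes exactly one visible character and the next keystroke is matched against whatever remains.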
Now it does not matter if I delete some character in the output (for example by pressing backspace), since all subsequent keystrokes are interpreted in terms of the context that the user is seeing. The memory that the keyboarding utility works with is the same as what the user sees. With keying sequences, by contrast, the memory is in terms of what the user typed, which may bear little resemblance to what is on the screen.

Using previous output as context therefore provides a very powerful approach to describing keyboards. In effect, each rule stands alone, considering only the keystroke and the textual context in which the keystroke occurs. There is no need for the keyboard to have any memory (as with dead keys or rules in terms of input keystrokes only); it just needs to be able to get the textual context and the keystroke, and then to edit that textual context.

4.2 Multiple Rules

Any keyboard description of any complexity will be made up of a series of rules, and any rule-based system needs a processing model. One aspect of such a model is how rules relate to each other: if two rules can both fire (meet their constraints) in some situation, which takes precedence?

The simplest approach is to say that rules take precedence in the order in which they occur in the source file. This provides maximum control for the author of the source file and minimal work for the software interpreting the rule set. But there are problems with this approach that an author must constantly be aware of. The main issue is that of masking. Consider the following two rules, expressed in terms of previous output:

    y + z -> w
    y y + z -> x

Now, if we type y y z, the output will be 'yw' and not 'x' as we might expect. The reason is that the first rule will always take precedence over the second rule.
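The masking effect can be reproduced with a small Python sketch of a first-match rule engine. This is an illustration under simple assumptions (rules are tried strictly in list order, and a rule's context must match the end of the text); the tuple layout and the names press and type_keys are invented for the example.

```python
# Each rule is (context, key, output): if the text ends with `context` and
# `key` is pressed, the context is replaced by `output`.
RULES_IN_FILE_ORDER = [
    ("y", "z", "w"),    # y + z -> w
    ("yy", "z", "x"),   # y y + z -> x
]

# The same rules with the longer, more constrained rule first.
RULES_LONGEST_FIRST = [
    ("yy", "z", "x"),
    ("y", "z", "w"),
]


def press(text, key, rules):
    """Apply the first rule that matches; otherwise just append the key."""
    for context, rule_key, output in rules:
        if key == rule_key and text.endswith(context):
            return text[:len(text) - len(context)] + output
    return text + key


def type_keys(keys, rules):
    text = ""
    for key in keys:
        text = press(text, key, rules)
    return text


print(type_keys("yyz", RULES_IN_FILE_ORDER))   # yw
print(type_keys("yyz", RULES_LONGEST_FIRST))   # x
```

With the rules in file order, the one-character context always matches before the two-character context is even considered; listing the longer rule first is enough to let it fire.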
The second rule will never fire, since the first rule's constraint is a less restrictive version of the second's: whenever the second rule could fire, the first can too. In effect, the second rule has been masked by the first. It is imperative, therefore, that rules are written in some sort of length order, with the longest and most constrained occurring before the less constrained. Thus, by simply re-ordering the above rules, we would get the result we expect.

For this reason, many rule systems automatically order the rules, with length taking priority over position in the file. It then does not matter in which order the rules are written in the source file: the system will consider the second rule to take priority over the first, so the second rule is not masked. This has the advantage that an author is free to group rules as they wish. In the case of a tie in length, the first rule in the file should take precedence over the later one, as before.

In the above case, where each rule is of a fixed length and there is only a single-sided context, the automatic ordering of rules never fails and is by far the best approach. Having said this, there are those who would stick to author-controlled ordering all the time, saying that it makes the ordering issue clearer to authors and aids in locating bugs.

5 Implementation

In this section we examine the technical details of taking a keystroke and providing character information to an application. We examine the processes for Mac OS and Windows, and how keyboarding utilities such as SILKey and Keyman fit within those processes.

5.1 Mac OS

What happens when I press the A key on my Mac keyboard? The keyboard generates a raw key code: essentially an arbitrary number that identifies the particular physical key. Note that raw key codes are hardware-dependent: different physical keyboards (e.g., a PowerBook keyboard compared to an Extended desktop keyboard) may generate different codes.
This code passes through several stages of processing before it leads to a particular character code appearing in the document.

5.1.1 The 8-bit world

Virtually all current Mac OS software receives keyboard input in the form of 8-bit character codes. These are interpreted as characters in a particular Mac OS 'script system', such as Roman, Cyrillic, Hebrew, etc. Even applications such as Microsoft Word that store text in Unicode generally rely on 8-bit keyboard input (at present). We will therefore focus first on the 8-bit mechanisms (see figure 1).

First, the keyboard driver (built into the OS) converts the raw key code generated by the hardware into a keyboard-independent virtual key code. The System file contains 'KMAP' resources that specify this conversion for all the supported physical keyboards. This is not generally of concern to users, application developers, or non-Roman script developers. We can work on the basis that all physical keyboards generate the same set of virtual key codes.

Next, the Event Manager converts the virtual key code to a character code. This is done using a 'KCHR' (keyboard layout) resource. These are the layouts that are listed in the Keyboard menu (if it is enabled) and the Keyboard control panel. The KCHR contains multiple tables mapping virtual key codes to characters; which mapping table is used depends on the state of the modifier keys (shift, command, option, etc.). It also contains a table that maps modifier key combinations to specific key-mapping tables; thus, different KCHRs may choose to distinguish different modifier states, or may use the same key-mapping table for many sets of modifiers.

Then the Event Manager posts an 'event' to the system event queue, specifying that a key has been pressed and what character code it represents. When the foremost application checks for user events, it will see the key-down event and insert the character code into its document.
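The three stages described above (raw key code to virtual key code, virtual key code to character code, then posting an event) can be sketched in Python. Every table entry and numeric code below is invented for illustration, and the dictionaries only stand in for the 'KMAP' and 'KCHR' resources; they do not reflect the real resource formats or any real Mac OS call.

```python
from collections import deque

# Stand-in for 'KMAP' resources: for each physical keyboard,
# raw key code -> virtual key code. All numbers are invented.
KMAP = {
    "desktop": {0x1C: 0x00},
    "laptop": {0x3A: 0x00},
}

# Stand-in for a 'KCHR' resource (hypothetical layout).
KCHR = {
    # modifier state -> index of the key-mapping table to use
    "modifier_to_table": {
        frozenset(): 0,
        frozenset({"shift"}): 1,
        frozenset({"option"}): 2,
        frozenset({"shift", "option"}): 2,  # several states may share a table
    },
    # key-mapping tables: virtual key code -> character
    "tables": [
        {0x00: "a"},
        {0x00: "A"},
        {0x00: "å"},
    ],
}

event_queue = deque()  # stand-in for the system event queue


def key_pressed(keyboard, raw_code, modifiers=()):
    """Run one key press through the sketched pipeline and post an event."""
    virtual_code = KMAP[keyboard][raw_code]                         # stage 1
    table_index = KCHR["modifier_to_table"][frozenset(modifiers)]
    char = KCHR["tables"][table_index][virtual_code]                # stage 2
    event_queue.append(("key-down", virtual_code, char))            # stage 3
    return char


print(key_pressed("desktop", 0x1C))            # a
print(key_pressed("laptop", 0x3A, ["shift"]))  # A
```

Two different raw codes arrive at the same virtual key code, which is why the later stages need not care which physical keyboard is attached.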
Figure 1: From keystroke to character in the 8-bit world

The KCHR can also support dead keys: a particular virtual key (in a given key-mapping table) may set a dead key state that modifies the mapping used for the next keystroke. This is used for accents on many Roman-script layouts, for example. When such a dead key is pressed, no key-down event is generated, and the application is unaware that the user has done anything; the key-down event occurs when the following ('completer') key is pressed.

Many simple keyboard layouts can be implemented entirely at the KCHR level. If a layout requires one-to-one mapping of keystrokes to character codes, perhaps with the addition of some dead keys, a new KCHR resource is all that is needed. ResEdit (a resource editor available from Apple) incorporates a graphical KCHR editor that allows character codes (displayed using your choice of font) to be dragged onto positions on a picture of the keyboard (see figure 2). It can even show the keyboard layout as it would appear on various different physical keyboards.

When more complex keyboard behavior is required, such as contextual selection of variant character codes without relying on lots of dead keys, or more extensive context than the single preceding keystroke, the SILKey utility can be used. SILKey runs in the background and monitors the stream of events being passed to the foreground application. When it sees key-down events, it compares them with the rules in its active keyboard definition file, and modifies the event stream accordingly. The 'input' to a SILKey keyboard definition, then, is the stream of character codes generated by applying the KCHR mapping to the user's keystrokes. This has several implications:
More information, including many details that I have skipped in this overview, is available in Apple's developer documentation; see http://developer.apple.com/techpubs/macos8/mac8.html, particularly the link for 'Keyboard and International Resources'.

5.1.2 The Unicode world

It is now possible (as of Mac OS 8.5) for applications to tell the OS that they want to receive keyboard input as Unicode rather than in traditional Mac OS script system encodings. Examples of such applications include WorldText and Key Caps 9 (both included with Mac OS 9.1), but the capability is not yet present in major commercial applications.

If an application requests Unicode input, the user may still choose a KCHR layout in the Keyboard menu. In this case, the Text Services Manager (TSM) calls the Text Encoding Converter (TEC) to convert the 8-bit data generated by the KCHR into Unicode, which it then passes to the application.

More interestingly, there is a new keyboard layout resource, the 'uchr'. This is similar to the KCHR in that it maps modifier combinations to key-mapping tables, and virtual keys to character codes. It is enhanced in that the character codes generated are 16-bit Unicode values. It is also considerably more flexible than the KCHR: it supports keystrokes that generate arbitrarily long sequences of character codes, not just single characters, and supports multiple dead key states, not just a single level. An example of a 'uchr' layout is 'Unicode Hex Input', available with Mac OS 9.1, which allows any Unicode value from U+0000 to U+FFFF to be entered by typing four hex digits with the Option key down.

All direct Unicode input is handled by TSM and passed to the application using Apple Events (the Mac OS inter-application communication protocol), not the old event queue. Therefore, there are no key events in the stream that SILKey watches, and so it is unable to interact with the typing process.
A SILKey-like utility for the Unicode input world would have to find a different way to hook in to the system.

Figure 3: From keystroke to character for a Unicode application

An additional twist in the story is that some 'uchr' resources can actually be used by applications that expect 8-bit input! If a 'uchr' generates only Unicode values that can be mapped to a specific Mac OS script, it can be associated with that script, and TEC is used to map the Unicode values back to 8-bit characters, which are then posted to the normal event queue. This means that the additional flexibility of the 'uchr' format can be used in the old 8-bit world. However, if the 'uchr' needs to generate Unicode characters that do not all correspond to a single Mac OS script, it must declare itself as a 'full Unicode' layout and can only be used with Unicode-enabled applications.

Further details are available in the document Supporting Unicode Input, available on Apple's developer documentation site (see above; follow the link for 'Unicode Utilities').

5.1.3 Input methods

The descriptions above have been concerned only with 'simple' keyboard layouts as implemented by keyboard layout resources. There is a second, quite different type of keyboard also available: input methods (IMs). Input methods are used for languages such as Chinese, Japanese, and Korean, where it is impractical to type all the required characters directly. An IM may allow the user to type phonetically, for example, and then present various matching ideographic characters for the user to choose among.

In the Mac OS, IMs are plug-in software components that are managed by the Text Services Manager. They can interact with the user as needed, either with the current application if it supports inline input, or otherwise through a floating ('bottom-line') input window, in order to determine the characters to be entered.
IMs may generate text either in legacy Mac OS encodings or in Unicode, and TEC is used if necessary to match the input method's output to the application's expected input. IMs are a specialized topic that will not be discussed further here; details are available in Apple's developer documentation.

5.2 Windows

In all versions of Windows since version 3, the keyboard processing has been, in principle, the same. The primary difference between operating systems is whether the final WM_CHAR message contains a Unicode character code or an 8-bit character code.

Windows supports different keyboards, both physically and virtually, through the use of a keyboard DLL which has one entry point. This entry point is a function which returns a complex data structure containing various tables that the keyboard driver or other routines use for converting things like scan codes to virtual key codes, or virtual key codes to character codes. Notice that the keyboard DLL is not passed the data to convert; it simply returns the necessary tables in an appropriate format for the calling utility to do the conversion itself.

When you press a key on your keyboard, the hardware initiates an interrupt, which is handled by the keyboard driver. The driver reads the scan code of the key pressed from the keyboard port and converts it to a virtual key code. It then sends a key message containing that virtual key code to the window that currently has the focus. (The message itself is WM_KEYDOWN or WM_KEYUP; it is called a VK_ message here after the VK_ names of the virtual key codes it carries.) The application can then handle the messages directly or, as is most common, pass them through to the TranslateMessage() function, which converts the VK_ message to a WM_CHAR message. It does this by getting a mapping table from the keyboard DLL, which supports primarily a 1:1 mapping between virtual key code and character code. In addition, the mapping can also support dead keys and some simple ligatures (implemented like dead keys).
The WM_CHAR message is then resent to the same window and either handled by the application or ignored (in which case the keystroke is considered to be ignored).

Figure 4: Windows keyboard handling

Since the entire conversion process is governed by the keyboard DLL, it is possible to have completely different mappings between scan codes and virtual key codes depending on which language someone thinks they are typing. This can cause interesting effects and accounts for some of the problems when using different types of European keyboards.

5.2.1 Keyman

There are a number of places where a keyboarding utility such as Keyman can insert itself into this process. Keyman 3.2 worked by intercepting the WM_CHAR messages and outputting its own instead. With Keyman 5 and the arrival of Unicode WM_CHAR messages, it is more appropriate, and in the long run easier, to intercept the VK_ messages and to output WM_CHAR messages. Thus, Keyman intercepts the VK_ keyboard message on its way from the keyboard driver to the application and either passes it on (i.e. no conversion of that key is being done and the application uses the conventional handling of this key) or consumes the message and outputs zero or more WM_CHAR messages directly to the application. Since these are WM_CHAR messages, they do not need to be further converted.

Since Keyman actually interprets keystrokes in terms of VK_ keyboard messages rather than WM_CHAR messages, the character used in the Keyman file for a keystroke is simply a mnemonic for a virtual key code. Keyman makes the necessary interpretation in the compiler.

6 Conclusion

Different keyboarding requirements require different keyboarding solutions. For some, a simple 1:1 mapping is sufficient. For others, complex in-place editing is needed. Others still require some means of picking from a very long list, and narrowing down that list with each key press.
As we have seen, each mechanism requires its own support, and different technologies provide different levels of that support.

One thing that can be noted is that as more of the rendering burden is taken away from the encoding and placed upon the rendering subsystem, the keyboarding description is made easier, since most keying relates more to characters than to glyphs. But not all the complexity is removed. There are some cases where things remain extremely complicated. An example of such complexity is a multi-script encoding, where one encoding may be rendered in different scripts. Since the keying is visually motivated, the relationship between keystroke and underlying encoding may not be so straightforward.

Thus, while the job of expressing keyboards will get easier, we still need all the different mechanisms we can lay our hands on to support any keyboarding needs that may arise.

Note: the opinions expressed in submitted contributions below do not necessarily reflect the opinions of our website.
"Bob Batzinger", Fri, Aug 17, 2007 14:56 (EDT) One useful technique is to map multiple characters onto the same key. While character overloading is not recommended for general use, it does serve to reduce the number of positions required on the key board. If similar characters are grouped together (especially characters that are used in lower frequency), this technique can also improve the speed at which the new keyboard can be learned. The following example works to combine the parenthesis, braces and brackets to common key position. The following lines of Keyman code would make it possible to free up four positions on the keyboard, i.e., '{','}','[',and ']', and yet provides support for all six of these characters. ( + ( -> { { + ( -> [ [ + ( - > ( ) + ) -> } } + ) -> ] ] + ) -> ) In short, this code has placed 3 characters on each of the left and right parenthesis positions. The last line of each set of rules allows one to toggle through to the correct character in case of a mistake. A second technique I have used a lot was patterned after that used on a standard court stenographer keyboard: place key phonetic features on the baseline keys of asdfghjkl;' When used with a language like Western Cree (which has a few basic consonants which change character encoded based on the vowel associated with it), typing speeds of up to 180 words per minute are easily attained because the number of characters on the keyboard has been reduced to a bare minimum that is easy to learn and remember.
"Hugh Paterson III", Mon, Nov 12, 2007 16:01 (EST) This article authoritatively handles Mac OS 8.5. I remember using that as a kid before high school. However, there is no mention in the article about OS X. We are currently at OS 10.5. I am sure that theoretical considerations may still be similar but what about OS X's native handling of Unicode? Would it be possible for the authors of Ukelele to update this article?
"Hugh Paterson III", Thu, May 17, 2012 11:31 (EDT) Is the Keyman mentioned in this article the same as the Keyman from tafultesoft? http://www.tavultesoft.com/keyman/
martinpk, Fri, May 18, 2012 10:58 (EDT) Hi Hugh! Yes, it is the same.
© 2003-2024 SIL International, all rights reserved, unless otherwise noted elsewhere on this page.