
The Keyboard Scripting Engine

Last updated: 26/04/2012 10:42:23 GMT

Keyboard scripting is the process of simulating (or synthesizing) keystrokes to make an application behave as if the keys on the keyboard were actually being pressed. Keyboard scripting can be a useful way to automate processes in an application when no other method is readily available, but it is not particularly easy to use and not always reliable. Keyboard scripts often need extensive testing to get them right and the method should generally be used only as a last resort to automate a task if the software in question does not provide its own scripting facility for the purpose. A keyboard script is often very effective and can be quite straightforward, however, especially when it is used in a particular application window, without the keyboard focus changing during the script. In addition, WintextCom's keyboard script engine offers some extra features that can come in handy without having to actually control the application with simulated keystrokes.

 

WintextCom provides a keyboard scripting facility that can simulate keystrokes both within WintextCom and in the context of Windows and its applications to automatically type text or operate the software. This section outlines how to use the keyboard script engine, and describes some of the problems that may arise and how WintextCom can solve them. A working knowledge of Windows programming and concepts is required to thoroughly understand all the information presented below.

 

You can initiate a keyboard script with a WintextCom directory entry by beginning the information field with a percent sign ('%'), followed by the keyboard script enclosed in braces. For example, the entry:

 

"+%{09}"

 

will advance to the next paragraph when selected. Note the use of the plus sign to keep the directory open.

 

The keyboard script syntax as a whole, the script in braces, constitutes a directory embed, and can be part of a sequence of other embeds, optionally followed by a standard entry. For example:

 

"%n{48,45,4c,4c,4f}+notepad.exe".

 

Note the plus sign to terminate the list of embeds before the standard syntax. The list of embeds is processed before the standard entry, which means that the keyboard script is initiated before Notepad is run. However, once a keyboard script is initiated, it runs in the background on a time-sharing mechanism that enables it to interact with the rest of the system; it does not necessarily run to completion before the standard entry is processed or any other event occurs. The completion of a keyboard script depends on how it is written.

 

The above script first switches to the WintextCom Notes directory list (the letter 'n' in the embed list), and then triggers a keyboard script that types "hello". After the embed list, the standard entry launches the Windows Notepad editor. The idea is to type "hello" into the new Notepad window. This may well work successfully, but it might not! The problem is that there is no guarantee the keystrokes will not be processed before Notepad is running. The success of the operation depends upon many unpredictable factors beyond the control of the user, related to Windows processing time and task sharing: the longer the new application takes to load, the less likely the script is to be successful. A common manifestation of this kind of problem is that the automatically typed characters appear in the underlying application window instead of the intended new window.
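The hexadecimal values in the script above can be worked out by hand, but for longer text a small helper is less error-prone. The following Python sketch is not part of WintextCom; the function names are hypothetical. It relies on the fact that the Windows virtual key codes for the letters A to Z and the digits 0 to 9 equal the character codes of the upper-case letter or digit, and it handles only those keys. It also omits other embeds such as the 'n' shown above.

```python
def text_to_vk_actions(text: str) -> str:
    """Return comma-separated 2-digit hex virtual key codes for letters and digits."""
    codes = []
    for ch in text:
        if not (ch.isascii() and ch.isalnum()):
            # Spaces, punctuation and so on need codes looked up separately.
            raise ValueError(f"no simple virtual key code for {ch!r}")
        codes.append(f"{ord(ch.upper()):02x}")
    return ",".join(codes)


def directory_entry(script: str, standard_entry: str = "") -> str:
    """Wrap a script body as '%{...}', optionally followed by '+' and a standard entry."""
    entry = "%{" + script + "}"
    if standard_entry:
        entry += "+" + standard_entry
    return entry


print(directory_entry(text_to_vk_actions("hello"), "notepad.exe"))
# prints: %{48,45,4c,4c,4f}+notepad.exe
```

Because a virtual key code identifies the key rather than the character, "hello" and "HELLO" produce the same codes; capitalisation has to be done by holding Shift within the script itself.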

 

WintextCom's keyboard script engine offers a range of features to combat the problems that can arise when synthesizing a sequence of keystrokes. The above script needs to wait for Notepad to be running. Before presenting a more comprehensive example to illustrate this, we outline the features of WintextCom keyboard scripting in general terms.

 

Keyboard Script Syntax

 

The most basic keyboard script consists of one or more comma-separated actions, where the main action is to synthesize a keystroke and other actions (control actions) allow you to set conditions that WintextCom waits for before synthesizing the next keystroke. Each action is performed as a discrete unit, after which WintextCom continues with its own processing and yields processor time to other applications. The next action is then executed, provided that there is no prohibiting condition in effect, and this sequence of events repeats until there are no more keyboard script actions. Each time an action is not carried out due to a prohibiting condition, the keyboard script continues without sending the currently pending keystroke, and the condition is checked again next time round, until the wait is satisfied and the key is synthesized. This allows WintextCom and other applications to respond to keystrokes before the next is synthesized or conditions are checked. The actions can include directives to suspend the script until the Windows focus changes to another window, for example.

 

However, sometimes you need to force a sequence of keystrokes to be synthesized in succession as a single unit, before processor time is yielded to WintextCom and other applications. For example, if you want to press Control+O then wait for the focus to change before pressing Shift+Tab, the first action of the keyboard script needs to suspend the script until the focus changes and then press Control+O. You cannot press Control+O first and then suspend and wait for the focus to change, because in that case the focus may change in response to Control+O before the condition is being monitored, so the change will never be detected and the script will hang. Thus, suspending the script and pressing Control+O need to be done as a single unit, because subsequent units will not execute until the focus changes, which in turn will not happen until Control+O is pressed. To cause comma-separated actions to be executed as a single unit, use a semicolon instead of a comma after the last action. Semicolons break the script down into sections of actions that are executed as single units, which may themselves be comma-separated lists. A semicolon right at the end of the script forces the last or only comma-separated list to be executed all in one go; without a terminating semicolon, the last or only comma-separated list is executed as a sequence of individual actions with processor yielding and condition monitoring between each. The above sequence of events would invoke the open file dialogue in Notepad, for example, and switch the focus to the list of files:

 

"%{SF,+11,4f,-11;+10,09,-10}".

 

When you enter the above as a WintextCom directory item and select it, the 'S' at the beginning ensures that WintextCom yields focus to the application underneath by going into the background, then the 'F' suspends the script until the focus changes. But because Control+O is sent in the same unit, the keystroke is still synthesized. However, Shift+Tab is not sent until the focus has changed, which happens when the open file dialogue pops up, because it is in the next unit after the semicolon. The Shift+Tab keystroke above is not sent as a single unit, but as 3 separate keystrokes with processor-yielding between each. It could be sent as a single unit by placing a semicolon before the closing brace, but because it is the last part of the script, there is usually no particular need to do so in this kind of situation, and it is good practice to allow as much processor-yielding as possible between actions that generate input to Windows or the current application.
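The grouping rule for commas and semicolons can be illustrated with a short Python sketch. It is only a model of the rule as described in this section, not WintextCom's own parser: the function name is hypothetical, it works on the text between the braces, and it ignores character embedding.

```python
def split_units(script: str) -> list[list[str]]:
    """Split a script body into execution units, each a list of actions."""
    units = []
    sections = script.split(";")
    # Whatever follows the last semicolon has no terminating semicolon.
    trailing = sections.pop()
    for section in sections:
        # A semicolon-terminated section runs as one unit.
        units.append(section.split(","))
    if trailing:
        # Unterminated actions run one at a time, with yielding between them.
        units.extend([action] for action in trailing.split(","))
    return units


for unit in split_units("SF,+11,4f,-11;+10,09,-10"):
    print(unit)
# prints:
# ['SF', '+11', '4f', '-11']
# ['+10']
# ['09']
# ['-10']
```

With a terminating semicolon, as in "SF,+11,4f,-11;+10,09,-10;", the same function returns two units, the second being the whole Shift+Tab sequence.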

 

The last example also illustrates "action groups". Many control actions are specified by just a single letter and can be grouped together without separation by a comma or semicolon, and are then executed sequentially as a unit, without processor-yielding. These actions do not involve system input or other interaction, and explicit processor-yielding is unnecessary. Some control actions require additional information, however, called a "parameter". Such control actions must be the last in an action group, followed immediately by the parameter, which may be empty, delimited by the comma or semicolon. In the above, the 'F' control action takes an optional parameter; in this case, it does not need a parameter, which is therefore empty, but it must still be the last action in the group.

 

If a comma or semicolon is required in a parameter, use character embedding. Each action group in a keyboard script is de-embedded before it is processed for execution.

 

Keystroke Actions

 

To send (synthesize) a keystroke, both the downstroke and the release, specify the value of the virtual key code of the key. You can obtain this information from the keyboard information utility on the directory dialling/setup menu. Each time you press a key, key statistics are displayed for the key going down and then coming back up. For keyboard scripts, we need the 2-digit hexadecimal value of the virtual key code, shown in brackets followed by letter 'H'. Although it is not essential to do so in a WintextCom keyboard script, it is usual to always specify 2 digits for a hexadecimal value in this kind of situation, by using a leading 0 if the number has only 1 significant digit (such as Tab above as "09" rather than just "9").

 

The keystroke specification must be at the end of the action group specification, immediately before the comma or semicolon. It can be immediately preceded by other action directives that do not also have to be the last in a group. Just specifying the 2-digit hexadecimal value of the virtual key code causes the downstroke to be synthesized, followed by the upstroke. If you prefix the value with a plus sign ('+'), only the downstroke is sent, enabling you to keep a key down whilst synthesizing other keystrokes; this is shown in the example above, where Control and Shift are "held down" while letter 'O' and Tab are pressed, respectively. Prefixing the virtual key code value with a minus sign ('-') synthesizes only the upstroke, enabling you to release a key that has previously been pressed, again shown in the above example with Control and Shift.
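As an illustration of the plus and minus prefixes, the following Python sketch builds the keystroke actions for a key combination. The helper names are hypothetical and not part of WintextCom; the constants are the standard Windows virtual key codes for Tab, Shift, Control and Alt.

```python
VK_TAB, VK_SHIFT, VK_CONTROL, VK_ALT = 0x09, 0x10, 0x11, 0x12


def press(vk: int) -> str:
    """Downstroke followed by upstroke: the bare 2-digit hex value."""
    return f"{vk:02x}"


def down(vk: int) -> str:
    """Downstroke only: the value prefixed with a plus sign."""
    return f"+{vk:02x}"


def up(vk: int) -> str:
    """Upstroke only: the value prefixed with a minus sign."""
    return f"-{vk:02x}"


def chord(modifier: int, key: int) -> str:
    """Hold the modifier, press the key, then release the modifier."""
    return ",".join([down(modifier), press(key), up(modifier)])


print(chord(VK_CONTROL, ord("O")))  # prints: +11,4f,-11
print(chord(VK_SHIFT, VK_TAB))      # prints: +10,09,-10
```

The two printed sequences are the Control+O and Shift+Tab parts of the earlier example. Whether they are separated by a comma or a semicolon in the finished script still has to be decided by hand, according to which actions must run as a single unit.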

 

Even when keystrokes are synthesized in the same action unit, as well as between the down- and upstroke of a key that is being pressed and released by just specifying its virtual key code, WintextCom by default forces Windows to allow other applications to process between synthesized strokes. This slows down the operation of the script slightly, but increases its reliability. The default delay between key simulations is 250 milliseconds. You can set it with the setup directive:

 

"dp250".

 

Setting the delay to 0 is not recommended, but under normal Windows circumstances, properly constructed keyboard scripts with appropriate control actions should work reliably. The delay can also be set for the current script with the 'P' control action, see below.

 

Control Actions

 

The control actions allow you to perform operations during the keyboard script to ensure that subsequent keystrokes are generated at the appropriate time and under appropriate conditions. Each control action consists of either a single character or a group of characters with a certain delimiter. Control actions can be specified successively, without comma or semicolon separators, preceding a keystroke specifier, except for those which themselves must be the last action in an action group. It is not necessary to have a keystroke at the end of a sequence of control actions, though often this is the case, as the control action sets up a condition to wait for and the following keystroke(s) cause that condition to be attained.

 

Some control actions do not set up conditional execution, but perform WintextCom functions such as capturing data from the clipboard. These actions greatly extend the scope of WintextCom keyboard scripts, enabling you to capture information made available by synthesizing application keystrokes, for example.

 

The supported control actions are  (specifiers case sensitive) --

 

Examples

 

The following example expands upon our earlier example with Notepad:

 

"%{fedit;m+12,46,-12;F,4f;+10,48,-10,45,4c,4c,4f}+notepad.exe".

 

The first action of the script is to suspend it until an edit window gains the focus. This condition is brought about by Notepad starting up as a result of the standard directory entry after the keyboard script syntax; the event is not generated by the script itself. Once the Notepad window is detected, the next script unit suspends the script until a menu becomes active, and in the same unit, presses Alt+F to drop the file menu. When this condition is detected, the next script unit suspends the script until the focus changes and then presses letter 'o', in the same unit, to activate the open file dialogue. Note that in this case we do not specify what the window class has to be; it is sufficient to just wait for a change in focus from the main edit window to the open file edit box. Once this condition is satisfied, the script types "Hello" into the open file box, using Shift down before 'h' and then Shift up to capitalise it.
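When reading or checking a script like the one above, it can help to translate the keystroke actions back into key names. The Python sketch below is a reading aid only and is not part of WintextCom. It decodes only actions that consist solely of a 2-digit keystroke specification, and it reports anything else, including control actions and mixed groups such as "m+12", without decoding it. That is a simplifying assumption: a control action whose text happens to be two hexadecimal digits would be misread as a keystroke.

```python
import re

NAMED_KEYS = {0x09: "Tab", 0x10: "Shift", 0x11: "Control", 0x12: "Alt"}
KEYSTROKE = re.compile(r"([+-]?)([0-9a-fA-F]{2})")
STROKES = {"+": "down", "-": "up", "": "down and up"}


def describe(action: str) -> str:
    """Describe a pure keystroke action; flag anything else as not decoded."""
    match = KEYSTROKE.fullmatch(action)
    if match is None:
        return f"{action}: control or mixed action (not decoded)"
    prefix, digits = match.groups()
    vk = int(digits, 16)
    if vk in NAMED_KEYS:
        name = NAMED_KEYS[vk]
    elif 0x30 <= vk <= 0x39 or 0x41 <= vk <= 0x5A:
        name = chr(vk)  # digit and letter keys share their character codes
    else:
        name = f"virtual key {digits}"
    return f"{action}: {name} {STROKES[prefix]}"


for action in re.split("[,;]", "fedit;m+12,46,-12;F,4f;+10,48,-10"):
    print(describe(action))
```

For the fragment shown, the pure keystroke actions are reported as "46: F down and up", "-12: Alt up", "4f: O down and up", "+10: Shift down", "48: H down and up" and "-10: Shift up", while "fedit", "m+12" and "F" are flagged as not decoded.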

 

The following example does the same thing with Microsoft Word, assuming it is your system word processor. Word menus are not standard Windows menus, so instead of going through the file menu, we press Control+O to activate the open file dialogue straight from the document window.

 

"%{f_wwg;frichedit20w,+11,4f,-11;+10,48,-10,45,4c,4c,4f}+.doc".

 

The first script action suspends it to wait for the Word document window class, "_WWG", to gain focus (the class specification is not case-sensitive). When this happens, the second script unit suspends it to wait for another window class, "RichEdit20W", that of the open file edit box. In the same unit, the script presses Control+O to activate the open file dialogue and trigger the condition being waited for. We could have just suspended to wait for a focus change, as with the Notepad example, without specifying the class of the new window. Waiting until the target window explicitly has focus ensures that the typing does not start before the window is active, if the focus is going through some intermediate windows not normally apparent to the user. Even then, there is no absolute guarantee that the window is ready to receive keyboard input, and in some cases using the "P" control action to pause the script for a number of milliseconds before starting to type may be necessary or advisable; only trial and experience can tell. Once the open file dialogue is active, the script types "Hello", as above.

 

Developing Keyboard Scripts

 

The development of a keyboard script can be quite tedious and require a lot of testing. Once a script is properly working, however, and written to take account of the circumstances under which it runs, it is usually reliable and rewarding.

 

WintextCom does not provide a built-in method of cancelling a script that is unable to complete. This can happen if the script waits for a condition that never arises, due to an error in a window class name, for example. In this situation, you may find that your computer in general behaves erratically or performs sluggishly. Only one keyboard script can be running at any one time. If you start a script while one is already running, the already-running script is cancelled before the new one is initiated. This does provide a means of shutting down a hung script, though it is not always practical. If all else fails, shut down WintextCom and restart the software, or reboot the PC. Instances of system instability can also arise through typing errors accidentally triggering unwanted actions or invalid keystrokes.

 

In order to develop a keyboard script, you may require technical information such as window classes. The System Explorer utility provides a way of obtaining such information, by presenting a report on an application's structure, including window class names and other details. To set System Explorer to quickly provide information for just the window with keyboard focus:

  1. Press Control+Alt+WindowsKey+F11 to activate the System Explorer configuration or request dialogue box.
  2. Type the letters "wv" into the edit box and press ENTER or click the OK button.
  3. Press Escape or click OK to close the popup window that appears listing technical information about the focussed window.
  4. From then on, whenever you want information about the window that has the keyboard focus, press Shift+WindowsKey+F11 to bring up the System Explorer information window immediately with the required details.
  5. To reset System Explorer to its default presentation, press Control+Alt+WindowsKey+F11 again to bring up the edit box, press Delete to clear it, and then ENTER.

 

 


Page url: http://wtcmanual.wintextware.com/index.html?m_keyboard_scripting_engine.htm