March/April 2004

Technology Trends

New User Devices

By Dr. James A. Larson

New technology will change the way people interact with computers. PCs let users work with a keyboard and screen rather than review printed reports. The Xerox Star and Apple Macintosh introduced graphical user interfaces (GUIs), which made the mouse and other pointing devices popular. Now we are on the verge of a revolution in technology that makes computing portable. Separating user interface devices from the computing device will dramatically change how people interact with computers.

The Computing Device

A personal server about the size of a deck of cards will contain computer memory and Bluetooth communication for exchanging information with other Bluetooth-enabled devices in the vicinity.

This portable device can be used to store:

User interface devices enable users to interact with the personal server. Ideally, these devices should connect wirelessly with the personal server. Example user interface devices include pens, microphones, speakers and displays.

A Special Pen

A special pen has a camera in its tip that captures small marks on specially printed paper. The marks indicate where on the paper the pen is writing. By tracking these positions, the pen captures the strokes written by the user. In Table 1, the “camera” column illustrates how users perform common tasks by using the pen to mark and write. The pen brings computing capabilities to paper while preserving the original use of paper, so existing office procedures are not disrupted except for the elimination of the labor-intensive data entry task. This special pen might also contain a microphone to capture the writer’s voice. The “microphone” column in Table 1 illustrates how users perform common tasks by speaking. Combining writing and speaking enables additional capabilities, as summarized in the “combined” column in Table 1 (page 8).
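As a rough illustration of how tracked positions might become pen strokes, here is a minimal Python sketch. The `PenSample` record and its `pen_down` flag are assumptions made for this example, not the format of any actual pen; the sketch simply groups consecutive position readings into one stroke until the tip is lifted.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class PenSample:
    """One hypothetical reading from the pen: a position on the paper
    and whether the tip was touching the paper at that moment."""
    x: float
    y: float
    pen_down: bool


def samples_to_strokes(samples: List[PenSample]) -> List[List[Point]]:
    """Group consecutive pen-down positions into strokes.

    A stroke ends whenever a sample reports that the tip was lifted.
    """
    strokes: List[List[Point]] = []
    current: List[Point] = []
    for sample in samples:
        if sample.pen_down:
            current.append((sample.x, sample.y))
        elif current:
            # Tip lifted: close the stroke in progress.
            strokes.append(current)
            current = []
    if current:
        # The recording ended while the tip was still on the paper.
        strokes.append(current)
    return strokes
```

Each stroke in the result is an ordered list of positions, which is the kind of input a handwriting or signature recognizer would need.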

Microphones and Speakers

Speech will be the primary means for interacting with software agents residing in the personal server. While complete natural language processing is still in the research stage, users will speak and listen using both command-and-control and conversational styles of dialog.

Microphones and speakers come in a variety of forms, including badges and headsets. Users will speak and listen to the personal server via the ubiquitous telephone and cell phone. Handheld computers with microphones and speakers will provide a multimodal user interface. With speech processing to provide a natural user interface, the software agents in the personal server become a “genie in a bottle.” In addition to speaking with software agents, users can call and speak with other users much as they do today with telephones and cell phones.


One more ingredient is needed to make a user-friendly computer-supported environment: a display for presenting graphical information. Candidates for displaying information to the user include:

The Opportunity

New technology allows computer users to escape from the office position (sitting in front of a computer with hands on the keyboard) and move from place to place in the world. No longer will users need to go to the computer; instead, the computer is always with the user, just as a wallet or watch is always with its owner.

Many new types of applications will be possible. Here are just a few examples:

Get ready for new ways to communicate with computers — many involve speech.

Table 1: Common tasks performed by the Special Pen

User Task: Capture Data
Camera: Capture written doodles, drawings, notes, and illustrations for later presentation; for example, write notes during a lecture to study before the final exam.
Microphone: Record spoken words and phrases for later replay; for example, capture a verbal reminder to add to a “to-do” list.
Combined: Record both speech and pen gestures for later replay; for example, draw a map while speaking the directions, or verbally describe how to solve a math equation while rewriting the equation.

User Task: User Identification
Camera: Register the user’s signature, verify that a signature belongs to a registered user, identify a user from among registered users based on his/her signature.
Microphone: Register the user’s voiceprint, verify that a voiceprint belongs to a registered user, identify a user from among registered users based on his/her voiceprint.
Combined: Increase security and reliability by using both handwriting and voiceprints.

User Task: Interpret Content
Camera: Handwriting recognition: convert words and phrases written on paper to electronic text.
Microphone: Dictation: convert speech to electronic text.
Combined: The spoken text, “The hidden treasure is at the spot I’m marking with an X,” is converted to the electronic text, “hidden treasure at coordinates (x,y).”

User Task: Interpret Requests
Camera: Convert pen strokes to commands; for example, save information when the user checks a box with the pen.
Microphone: Command and control by converting spoken commands to actions performed by a computer; for example, speak “save now” to save information to a file.
Combined: Synchronized multimodal input; for example, “reword this (point) paragraph,” “e-mail this form to that person (point to a person’s name or e-mail address).”
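One way to picture synchronized multimodal input is to pair each spoken “this” or “that” with the pen pointing gesture closest to it in time. The Python sketch below is only an illustration under stated assumptions: the `SpokenWord` and `PointGesture` records, the one-second `max_gap`, and the target labels are all hypothetical, and a real system would take word timings from a speech recognizer and targets from the pen.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Words that refer to whatever the user is pointing at (assumed list).
DEICTIC_WORDS = {"this", "that"}


@dataclass
class SpokenWord:
    text: str
    time: float  # seconds since the start of the utterance


@dataclass
class PointGesture:
    target: str  # hypothetical label for what the pen pointed at
    time: float  # seconds since the start of the utterance


def resolve_references(
    words: List[SpokenWord],
    gestures: List[PointGesture],
    max_gap: float = 1.0,
) -> List[Tuple[str, Optional[str]]]:
    """Attach to each deictic word the nearest-in-time unused gesture.

    Returns (word, target) pairs; target is None for ordinary words and
    for deictic words with no gesture within max_gap seconds.
    """
    unused = list(gestures)
    resolved: List[Tuple[str, Optional[str]]] = []
    for word in words:
        target: Optional[str] = None
        if word.text.lower() in DEICTIC_WORDS and unused:
            nearest = min(unused, key=lambda g: abs(g.time - word.time))
            if abs(nearest.time - word.time) <= max_gap:
                target = nearest.target
                unused.remove(nearest)  # each gesture is used only once
        resolved.append((word.text, target))
    return resolved
```

For the request “e-mail this form to that person,” two pointing gestures made near the words “this” and “that” would be attached to those words, giving the software agent both the command and its arguments.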

Dr. James A. Larson is Manager of Advanced Human Input/Output at Intel Corporation, and author of the book, VoiceXML — Introduction to Developing Speech Applications.