
Silence is indeed golden, especially when trying to maximize the efficiency of voice applications. Used correctly, silence minimizes errors, gives new customers adequate time to respond, lets repeat customers bypass redundant options, and gives all customers more time to correct misrecognitions. In this month’s VUI View, Angel.com’s Senior Product Manager, Ahmed Bouzid, illustrates the circumstances where silence is necessary and the steps to take to get the most out of your voice application.
The Six Golden Rules of Silence
Silence is to Voice User Interface (VUI) design what the number zero is to algebra. As a concept and a tool, it is at once essential, ubiquitous, and taken for granted. In this issue of the VUI View, I highlight the main cases where silences and pauses can contribute to a smoother, more usable VUI.
Take the following brief interaction between an automated IVR stock management application and a human caller.
SYSTEM:
What would you like to do next? You can say, "get quotes," "buy stock," or "sell stock." You can also say, "speak to a manager."
HUMAN:
Get quotes.
SYSTEM:
Getting quotes. As of 10:25 am, IBM is trading at eighty two dollars and thirty five cents, MicroStrategy at one hundred three dollars and twenty four cents, and Google at three hundred seventy four dollars and thirteen cents.
Let's pinpoint where silences can enhance the usability of the voice interface.
1. Prior to listing menu options
When the system is about to provide the caller with a list of options, a brief pause should be inserted between the announcement prompt and the first option that is played to the listener.
SYSTEM:
What would you like to do next? You can say,
[SILENCE]
"get quotes," "buy stock," or "sell stock."
You can also say,
[SILENCE]
"speak to a manager."
2. Between options in a menu list
When listing options for the user to choose from, separate consecutive options with silences. The pause will give the listener time to decide whether to select the option or wait for the next option.
SYSTEM:
What would you like to do next?
You can say,
[SILENCE]
"get quotes,"
[SILENCE]
"buy stock," or
[SILENCE]
"sell stock." You can also say,
[SILENCE]
"speak to a manager."
3. Between categories of options
In our example, the system offers the caller three stock-related commands and then one more option for transferring to a manager. Since the fourth option is not a stock-related command, a silence should be inserted between the last stock command and the announcement that introduces the next category, "You can also say..."
SYSTEM:
What would you like to do next? You can say,
[SILENCE]
"get quotes,"
[SILENCE]
"buy stock," or
[SILENCE]
"sell stock."
[SILENCE]
You can also say,
[SILENCE]
"speak to a manager."
4. When interacting with power-users
Most of the callers to the stock-management application in our example are going to be repeat callers, that is, power-users who will not want to listen to all the menu options every time they call. In applications with many power-users, insert a silence before the menu options are listed, so that experienced callers can speak their command without waiting for the menu. In this case, add a silence after, "What would you like to do next?"
SYSTEM:
What would you like to do next?
[SILENCE]
You can say,
[SILENCE]
"get quotes,"
[SILENCE]
"buy stock," or
[SILENCE]
"sell stock."
[SILENCE]
You can also say,
[SILENCE]
"speak to a manager."
5. After echoing
A brief echo from the system of the option selected by the user can serve as a reassuring confirmation that the system understood what the caller said, or, in the case of misrecognition, as a quick indication of error. In either case, insert a brief silence after the echo. If the recognition was correct, the silence will prepare the caller for the next prompt; if it was not, the silence will give the user an opportunity to barge in with a correction. (Of course, you will need to configure an error strategy that can elegantly recover from such an error.)
SYSTEM:
What would you like to do next?
[SILENCE]
You can say,
[SILENCE]
"get quotes,"
[SILENCE]
"buy stock," or
[SILENCE]
"sell stock."
[SILENCE]
You can also say,
[SILENCE]
"speak to a manager."
HUMAN:
Get quotes.
SYSTEM:
Getting quotes.
[SILENCE]
As of 10:25 am…
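The echo-then-pause pattern can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 700 ms pause, the function names, and every echo phrase other than "Getting quotes." are hypothetical, and the recovery logic is one simple error strategy among many. The pause is written as an SSML break element; whether your platform accepts SSML is also an assumption.

```python
from typing import Optional

# Assumed pause length after the echo; the article does not give a value.
ECHO_PAUSE_MS = 700

# Echo phrase per command. Only "Getting quotes." appears in the article;
# the other phrases are hypothetical placeholders.
ECHOES = {
    "get quotes": "Getting quotes.",
    "buy stock": "Buying stock.",
    "sell stock": "Selling stock.",
    "speak to a manager": "Transferring you to a manager.",
}


def echo_prompt(command: str) -> str:
    """Return the echo for a recognized command, followed by an SSML pause."""
    return f'{ECHOES[command]}<break time="{ECHO_PAUSE_MS}ms"/>'


def resolve_after_echo(recognized: str, barge_in: Optional[str] = None) -> Optional[str]:
    """Decide which command to run once the echo and its pause have played.

    barge_in is whatever the caller said during the pause, or None if the
    caller stayed silent. Returning None means "re-prompt the caller".
    """
    if barge_in is None:
        return recognized   # silence: the caller accepted the echo
    if barge_in in ECHOES:
        return barge_in     # the caller corrected a misrecognition
    return None             # unknown correction: fall back to a re-prompt


if __name__ == "__main__":
    print(echo_prompt("get quotes"))
    print(resolve_after_echo("get quotes"))
    print(resolve_after_echo("get quotes", barge_in="buy stock"))
```

The point of the sketch is that the silence after the echo is the only window in which a correction can arrive, so the pause needs to be long enough for a caller to notice the error and react.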
6. Before and after Text To Speech (TTS) prompts
As we mentioned in a previous newsletter, avoid mixing recorded prompts with computer-generated TTS prompts. Mixed prompts make for an unpleasant audio experience and should be avoided whenever possible. In cases where you have no choice but to mix human-recorded and computer-generated prompts, insert a pause between the recorded prompts and the TTS prompts. The silence will soften the jarring transition and improve listener comprehension.
SYSTEM:
Getting quotes.
[SILENCE]
As of 10:25 am
[SILENCE]
IBM is trading at
[SILENCE]
eighty two dollars and thirty five cents
[SILENCE]
MicroStrategy at
[SILENCE]
one hundred three dollars and twenty four cents
[SILENCE]
and Google at
[SILENCE]
three hundred seventy four dollars and thirteen cents
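One way to apply this rule mechanically is to tag each prompt segment by its source and insert a silence wherever the source changes. The Python sketch below assumes that the carrier phrases ("Getting quotes.", "IBM is trading at") are recorded and that the time and prices are rendered by TTS; that split, the 400 ms pause, and all names in the code are assumptions made for illustration.

```python
RECORDED = "recorded"
TTS = "tts"
SILENCE = "silence"

# Assumed pause length at each recorded/TTS boundary; not from the article.
TRANSITION_PAUSE_MS = 400


def insert_transition_pauses(segments, pause_ms=TRANSITION_PAUSE_MS):
    """Insert a (SILENCE, pause_ms) entry wherever the prompt source changes.

    segments is a list of (kind, text) pairs, where kind is RECORDED or TTS.
    Consecutive segments of the same kind are left adjacent.
    """
    result = []
    previous_kind = None
    for kind, text in segments:
        if previous_kind is not None and kind != previous_kind:
            result.append((SILENCE, pause_ms))
        result.append((kind, text))
        previous_kind = kind
    return result


# The quote readout, with an assumed recorded/TTS split.
QUOTE_PROMPT = [
    (RECORDED, "Getting quotes."),
    (TTS, "As of 10:25 am"),
    (RECORDED, "IBM is trading at"),
    (TTS, "eighty two dollars and thirty five cents"),
    (RECORDED, "MicroStrategy at"),
    (TTS, "one hundred three dollars and twenty four cents"),
    (RECORDED, "and Google at"),
    (TTS, "three hundred seventy four dollars and thirteen cents"),
]

if __name__ == "__main__":
    for kind, value in insert_transition_pauses(QUOTE_PROMPT):
        print(kind, value)
```

Because the pause is tied to the change of source rather than to fixed positions, the same function still works if a later revision records more of the prompt or hands more of it to TTS.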
Here is the entire interaction, with all silences inserted:
SYSTEM:
What would you like to do next?
[SILENCE]
You can say,
[SILENCE]
"get quotes,"
[SILENCE]
"buy stock," or
[SILENCE]
"sell stock."
[SILENCE]
You can also say,
[SILENCE]
"speak to a manager."
HUMAN:
Get quotes.
SYSTEM:
Getting quotes.
[SILENCE]
As of 10:25 am
[SILENCE]
IBM is trading at
[SILENCE]
eighty two dollars and thirty five cents
[SILENCE]
MicroStrategy at
[SILENCE]
one hundred three dollars and twenty four cents
[SILENCE]
and Google at
[SILENCE]
three hundred seventy four dollars and thirteen cents
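For readers who build prompts programmatically, the menu portion of this interaction can be sketched in Python by following rules 1 through 4. The sketch emits an SSML fragment using the break element for each silence. The pause durations and function names are assumptions chosen for illustration (the right values depend on your callers and should be tuned by testing), and whether your platform accepts SSML is also an assumption.

```python
from xml.sax.saxutils import escape

# All durations are illustrative assumptions; tune them with real callers.
QUESTION_PAUSE_MS = 1000  # rule 4: lets repeat callers answer before the menu
INTRO_PAUSE_MS = 300      # rule 1: after "You can say,"
OPTION_PAUSE_MS = 500     # rule 2: between options
CATEGORY_PAUSE_MS = 800   # rule 3: between categories of options


def pause(ms):
    """Return an SSML break element of the given length in milliseconds."""
    return f'<break time="{ms}ms"/>'


def render_category(intro, options):
    """Render an intro phrase, a pause, then the options separated by pauses."""
    last = len(options) - 1
    spoken = []
    for i, option in enumerate(options):
        if i == last:
            suffix = "."
        elif i == last - 1:
            suffix = ", or"
        else:
            suffix = ","
        spoken.append(escape(option) + suffix)
    return escape(intro) + pause(INTRO_PAUSE_MS) + pause(OPTION_PAUSE_MS).join(spoken)


def render_menu(question, categories):
    """Render the question, a pause, then each category separated by pauses."""
    rendered = [render_category(intro, options) for intro, options in categories]
    return escape(question) + pause(QUESTION_PAUSE_MS) + pause(CATEGORY_PAUSE_MS).join(rendered)


if __name__ == "__main__":
    print(render_menu(
        "What would you like to do next?",
        [
            ("You can say,", ["get quotes", "buy stock", "sell stock"]),
            ("You can also say,", ["speak to a manager"]),
        ],
    ))
```

Keeping each kind of pause in its own named constant makes it easy to lengthen the pause after the question for power-users without touching the pauses between options.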
If you are interested in professional assistance with optimizing your voice applications, feel free to contact me at bouzid@angel.com or call 1-888-MYANGEL (1-888-692-6435) and ask for "technical support".
