Voice User Interface Interaction: A Usability, Accessibility, and Inclusive Design Review

Cara Tracey
Feb 26, 2021

Voice user interfaces (VUIs)

In recent years, voice user interfaces (VUIs) have become increasingly common in consumer technology. They let users interact with a system and execute actions through voice or speech commands. The advantage of this technology is that it reduces or eliminates hands-on interaction, allowing users to use an application or device while cooking dinner, driving, taking a shower, and more.

Left to right: Google Home, Amazon Echo, Apple HomePod

Three of the most prominent voice user interface devices are pictured above: the Google Home, Amazon Echo, and Apple HomePod. I was completely amazed at the capability of the Amazon Echo the first time I used one. A few years ago, my family and I took a trip to Ireland to visit our relatives there. My dad comes from a large Irish family, so as you can imagine, family dinners were quite packed and rambunctious. My uncle Tom was showing off his new Amazon Echo and commanded the device to begin a round of a song-guessing game. The user could choose a decade (the 60s, 70s, 80s, 90s, etc.) and the device would play a few seconds of a song from that decade. The user then had a few seconds to confer with others before answering. At this point, the whole family was in on the game and we must have played for hours. Here I was thinking that this device could only turn on lights and tell you the weather!

Don’t get me wrong, these devices can turn on the lights and tell you the weather, but they can also play games, speak different languages, find you a new recipe, play audiobooks, book an Uber, control your thermostat, and get you the latest news within seconds.

Let’s look at the research…

For my research, I looked at the voice capabilities of Apple’s iPhone as well as two applications, Domino’s and SideChef.

Apple voice command: “show names” (right), “show numbers” (left)

I wanted to look at Apple’s voice commands and the extent of their abilities. I went to the iPhone’s settings and searched “voice”, which suggested the “Voice Control” page. This page allows the user to do the following:

  1. Turn voice control on and off
  2. Change the accent of the voice
  3. Customize commands
  4. Teach ‘Voice Control’ new vocabulary
  5. Show Confirmation
  6. Play Sound
  7. Show Hints
  8. Overlay item names and a numbered grid

When the user turns on “Voice Control”, they are greeted with a message that introduces the general language for saying commands.

General language for “Voice Control” commands

Next, I decided to look at Domino’s…and maybe order a pizza while I was at it!

Domino’s has a virtual voice ordering assistant called Dom, who helps the user go through the process of ordering a pizza online using voice commands. Dom communicates with the user through both voice and a transcribed message. Here is how I ordered my cheese pizza with Dom:

Dom: “Hi, welcome to Domino’s! Would you like to place an order or track an order?”

Me: “New order”

Dom: “Great, let’s get started! Would you like carryout or delivery?”

Me: “Carryout”

Dom: “Would you like to login to your Pizza Profile for a faster checkout experience?”

Me: “Continue as Guest”

*gives information — phone number, address, zip code*

Dom: “Now what would you like to order?”

Me: “Build your own pizza”

Dom: “What type of crust do you want for your pizza?”

Me: “Hand Tossed”

Dom: “What size would you like for your Hand Tossed pizza?”

Me: “Small”

Dom: “What toppings would you like?”

Me: “None”

Dom: “Okay, I’ve added a small (10") Hand Tossed Cheese Pizza to your order. What would you like next?”

Me: “Checkout”

Voila! I just ordered a pizza!
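
The back-and-forth above follows a pattern often called slot filling: the assistant asks one question per missing detail until the order is complete. Here is a minimal Python sketch of that idea. It is not Domino’s actual implementation; the slot names, the `SLOTS` list, and the `next_prompt` function are all hypothetical, and only the question wording is taken from my conversation with Dom.

```python
from typing import Dict, Optional

# Hypothetical slots, asked in order. The question text mirrors Dom's prompts.
SLOTS = [
    ("crust", "What type of crust do you want for your pizza?"),
    ("size", "What size would you like for your {crust} pizza?"),
    ("toppings", "What toppings would you like?"),
]


def next_prompt(order: Dict[str, str]) -> Optional[str]:
    """Return the question for the first unfilled slot, or None when done."""
    for name, question in SLOTS:
        if name not in order:
            # Earlier answers (e.g. the crust) can be echoed back in later questions.
            return question.format(**order)
    return None
```

Each answer is stored under its slot name, and the assistant keeps calling `next_prompt` until it returns `None`, at which point the pizza can be added to the order.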

Lastly, I decided to look at SideChef, an app that features thousands of recipes with step-by-step photos as well as a voice to guide you through them. You choose a recipe (I chose the lime and chili chicken skewer), select “step by step mode”, and a voice guides you through the steps, with videos and written instructions available if needed.

Heuristic Analysis

#1: Visibility of system status

The design should always keep users informed about what is going on, through appropriate feedback within a reasonable amount of time.

It is important for the voice user interface to indicate that it is listening to the user’s command. Visual cues can update the user on the system status and indicate when the system is ready for the user’s command. Domino’s voice ordering assistant, Dom, ensures that the user knows when to give their intended command. After Dom asks the user a question, the microphone at the bottom of the screen slowly flashes a blue light, indicating that it is the user’s turn to respond.
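
One way to think about this is as a small set of states, each with its own visible cue. The Python sketch below is only an illustration under my own assumptions: the state names, the turn order, and every cue except the flashing blue microphone are hypothetical and not taken from the Domino’s app.

```python
from enum import Enum


class Status(Enum):
    IDLE = "idle"
    SPEAKING = "speaking"      # the assistant is asking a question
    LISTENING = "listening"    # it is the user's turn to respond
    PROCESSING = "processing"  # the system is working on the answer


# Hypothetical cue for each state; only the LISTENING cue comes from Dom.
CUES = {
    Status.IDLE: "microphone dimmed",
    Status.SPEAKING: "question shown as a transcript",
    Status.LISTENING: "microphone flashing blue",
    Status.PROCESSING: "loading indicator shown",
}

# Assumed turn order for one question-and-answer exchange.
NEXT = {
    Status.IDLE: Status.SPEAKING,
    Status.SPEAKING: Status.LISTENING,
    Status.LISTENING: Status.PROCESSING,
    Status.PROCESSING: Status.SPEAKING,
}


def cue_for(status: Status) -> str:
    """Return the visible cue that tells the user what the system is doing."""
    return CUES[status]
```

The point of the table is that no state is left without a cue, so the user is never guessing whether the system heard them.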

#2: Match between system and the real world

The design should speak the users’ language. Use words, phrases, and concepts familiar to the user, rather than internal jargon. Follow real-world conventions, making information appear in a natural and logical order.

The voice user interface needs to be able to understand common phrases and words spoken in the user’s language. Conversations often include slang and abbreviations, which the interface needs to be able to recognize, analyze, and understand. Apple’s “Voice Control” has a setting that allows the user to teach the system new words, which can then be incorporated into commands.

#3: User control and freedom

Users often perform actions by mistake. They need a clearly marked “emergency exit” to leave the unwanted action without having to go through an extended process.

Voice user interfaces should give users an easy exit when they execute a command they did not mean to make. A common exit seen across devices is the command “stop”, which terminates the command previously given by the user.
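
To make the “emergency exit” idea concrete, here is a minimal Python sketch in which the stop check runs before anything else, so it works no matter what the device is currently doing. The phrase list, the function name, and the reply wording are my own assumptions, not any particular device’s behavior.

```python
from typing import Optional, Tuple

# Assumed exit phrases; real devices may accept others.
STOP_PHRASES = {"stop", "cancel", "never mind"}


def handle_utterance(utterance: str, active_task: Optional[str]) -> Tuple[Optional[str], str]:
    """Return (task now running, spoken reply). The stop check always comes first."""
    text = utterance.strip().lower()
    if text in STOP_PHRASES:
        if active_task is None:
            return None, "Nothing to stop."
        return None, f"Stopped: {active_task}."
    # Anything else is treated as a new command in this simplified sketch.
    return text, f"Started: {text}."
```

For example, saying “Stop” while music is playing ends the task and confirms it out loud, rather than forcing the user through an extended process.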

#4: Consistency and standards

Users should not have to wonder whether different words, situations, or actions mean the same thing. Follow platform and industry conventions.

Across voice user interface devices and applications, it is important that commands remain consistent, because users do not want to learn multiple ways to say one command.

#5: Error prevention

Good error messages are important, but the best designs carefully prevent problems from occurring in the first place. Either eliminate error-prone conditions, or check for them and present users with a confirmation option before they commit to the action.

To reduce the chance of errors when executing a command, it is important for the voice user interface to confirm the command that was given:

User’s Command: “Order an uber”

Device Confirmation: “Are you sure you would like to order an Uber?”

User’s Confirmation: “Yes!”
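
The confirmation exchange above can be sketched in a few lines of Python. This is an illustration only: `confirm_and_run`, the `ask` and `run` callbacks, and the list of accepted answers are hypothetical names I chose, standing in for whatever speech input and action a real device would use.

```python
from typing import Callable

# Assumed affirmative answers; anything else is treated as "no".
AFFIRMATIVE = {"yes", "yeah", "sure"}


def confirm_and_run(command: str,
                    ask: Callable[[str], str],
                    run: Callable[[str], None]) -> bool:
    """Ask the user to confirm, and only run the command on a clear yes."""
    answer = ask(f"Are you sure you would like to {command}?")
    # Normalize "Yes!" or "yes." to "yes" before comparing.
    if answer.strip().lower().rstrip("!.") in AFFIRMATIVE:
        run(command)
        return True
    return False
```

Treating every unclear answer as a “no” is a deliberate choice here: for an action that costs money, such as booking a ride, doing nothing is the safer mistake.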

#6: Recognition rather than recall

Minimize the user’s memory load by making elements, actions, and options visible. The user should not have to remember information from one part of the interface to another. Information required to use the design (e.g. field labels or menu items) should be visible or easily retrievable when needed.

Voice user interfaces should be able to use voice recognition to remember the user as well as the user’s language. If a user customizes a device’s settings with their own vocabulary and specific commands, the device should recognize these changes and apply them.

#7: Flexibility and efficiency of use

Shortcuts — hidden from novice users — may speed up the interaction for the expert user such that the design can cater to both inexperienced and experienced users. Allow users to tailor frequent actions.

Voice user interface devices and applications should offer shortcuts that allow the user to skip beginner tutorials and messages. A simple command that saves the user a few minutes shortens the interaction and increases the user’s satisfaction.

#8: Aesthetic and minimalist design

Interfaces should not contain information which is irrelevant or rarely needed. Every extra unit of information in an interface competes with the relevant units of information and diminishes their relative visibility.

No one wants a complicated interface, especially with a new, emerging technology. The user wants a quality interaction with the voice user interface, and that comes from a simple, easy-to-use interface that is understandable and executes the user’s commands with minimal complications.

#9: Help users recognize, diagnose, and recover from errors

Error messages should be expressed in plain language (no error codes), precisely indicate the problem, and constructively suggest a solution.

A voice user interface needs a system in place to help the user when they run into errors. This system needs to communicate the error in understandable language and help the user find a solution.

#10: Help and documentation

It’s best if the system doesn’t need any additional explanation. However, it may be necessary to provide documentation to help users understand how to complete their tasks.

A voice user interface should be able to take questions from the user when they are having an issue or are confused about the device or application. The user should be able to receive answers to their questions or be “pointed” in the right direction to a source that could help.

Accessibility Review

Accessible design focuses on the end result of a design, aiming to ensure everyone will be able to access and execute the interaction. Although optimizing an interface can be difficult, the voice user interfaces that I used in my research included the following:

Transcripts

When using voice user interfaces, the user typically has an interaction consisting of multiple back-and-forths. Typically the user begins the interaction, but they may then be faced with a confirmation or a follow-up question that they need to answer. A transcript allows users of all hearing abilities to read the words and better understand what is being asked of them before answering.

Ensures options to stop commands

Voice user interfaces don’t rely heavily on hands-on interaction, meaning that there may not be a back button. It is essential for these interfaces to include a command that allows any user to put a stop to a command. These stop commands provide a safety net for all users, ensuring a positive interaction and simple operation.

Provides alternatives to voice-only interactions

As stated above, voice user interfaces don’t rely heavily on hands-on interaction, but that doesn’t mean they shouldn’t include it. It is about quality, not quantity: voice user interfaces should offer an alternative to voice-only interaction, typically buttons, that allows the user to opt out of the voice interaction and still have an enjoyable exchange.

Inclusivity Analysis

Voice user interface devices and applications have allowed millions of people to complete simple tasks that they may not have been able to complete before.

An estimated 62 million people in the U.S. have motor or mobility impairments. VUIs are now available on hundreds of millions of devices, including smart speakers, headphones, and watches, and are already being used to assist people with managing communication such as emails and texting, and controlling home devices. They are even helping people with muscular dystrophy use their voice to adjust their bed throughout the night to avoid bed sores.

These devices have been able to aid people with motor or mobility impairments as well as people with visual impairments or blindness. They have allowed people to complete, without using their hands, both simple and more complex tasks that they could not accomplish on their own. Checking mail, texting a friend, and even adjusting a bed have all become possible with one simple command, which in turn has helped millions. Not only that, but these devices can also be a great tool for people with medical conditions who need hourly, daily, or weekly reminders about medications and doctor appointments.

Conclusion

No technology is perfect; there are still errors to be found and advances to be made. However, it is clear that voice user interfaces are here to stay and here to better the future. It amazes me how far technology has come and how inclusive and accessible it is becoming. My research has allowed me to see that the system in our pockets can support a successful voice-controlled interaction that gets its user from point A to point B solely with their voice. This system in our pockets also holds applications that offer these hands-free interactions. Although not every interaction is flawless, technology is working towards inclusive, accessible voice user interaction.

This article was written in partial fulfillment of the requirements of the Spring 2021 section of DMD 3998 — Emerging Topics in Interaction Design at the University of Connecticut, Digital Media & Design Department.
