As the IoT world continues to expand, researchers need to understand what to take into account when evaluating design in this space. In the course of evaluating several smart speakers, I have thought a lot about the design of voice interfaces along with the accompanying physical devices. Setting up these devices has many similarities to standard digital interfaces (e.g., clear instructions and steps, easy access to support, consistent wording, enough contrast between text and background, and clear error symbols). However, there are also additional aspects to take into consideration, such as the industrial design of the speaker, audio cues, light cues, and how an accompanying app interacts with all of these things. Smart speakers that use another company’s voice service (e.g., Sonos using Alexa as its voice service) add further complexity.
Figure 1. Sonos Smart Speaker.
In this article, I highlight several different aspects to consider when evaluating the setup and user experience of smart speakers, including the following:
- accompanying app location
- light and audio cues to indicate progress/processing
- clear images of the device in setup material
- clear instructions, especially when multiple interfaces are involved
- examples of how to use the voice aspect of smart speakers
- access to support
I give examples from existing smart speakers to illustrate these aspects of smart speaker design. This emerging multi-dimensional technology is exciting, and we must take into account more aspects than we would in traditional digital interfaces.
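For evaluators who like to track findings in a structured way, the aspects listed above could be captured as a simple checklist. The following is a minimal sketch, not a prescribed method: the `SpeakerEvaluation` class, its method names, and the "Example Speaker" device are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field

# The aspects come from the list in this article; everything else
# (class name, method names, example device) is a hypothetical illustration.
ASPECTS = [
    "accompanying app location",
    "light and audio cues to indicate progress/processing",
    "clear images of the device in setup material",
    "clear instructions, especially when multiple interfaces are involved",
    "examples of how to use the voice aspect of smart speakers",
    "access to support",
]


@dataclass
class SpeakerEvaluation:
    device: str
    # Maps an aspect to a (passed, evaluator note) pair.
    findings: dict = field(default_factory=dict)

    def record(self, aspect: str, passed: bool, note: str = "") -> None:
        """Store the evaluator's finding for one aspect."""
        if aspect not in ASPECTS:
            raise ValueError(f"Unknown aspect: {aspect}")
        self.findings[aspect] = (passed, note)

    def unreviewed(self) -> list:
        """Aspects that have not been evaluated yet, in list order."""
        return [a for a in ASPECTS if a not in self.findings]

    def issues(self) -> list:
        """(aspect, note) pairs for every aspect that did not pass."""
        return [(a, note) for a, (passed, note) in self.findings.items() if not passed]


# Usage with a made-up device and made-up findings.
evaluation = SpeakerEvaluation(device="Example Speaker")
evaluation.record("access to support", passed=True)
evaluation.record(
    "accompanying app location",
    passed=False,
    note="App requirement only mentioned inside the box",
)
```

Calling `evaluation.unreviewed()` then shows which aspects still need attention, and `evaluation.issues()` gathers the problems found so far for a report.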
Opening the Packaging
Because this is likely a new, and potentially complicated, type of device for many users, packaging that is inviting and helpful is important. This includes making the device easy to remove from the box, using sleek-looking and sleek-feeling cardboard that matches the device, and making sure that anything printed on the outside of the box is simple, short, and direct. Because setup should be as intuitive as possible, there should be minimal paper instructions in the box, and any instructions should be easy to understand and provide visual help. For example, if a brief getting started guide is included, there should be an image of the device where buttons are clearly labeled (e.g., volume, microphone, and reset).
Getting the Device on the WiFi Network—Initial Setup
In an ideal world, you would just plug in a smart speaker and it would just work, right? Unfortunately, we are not yet there technologically. Many smart speakers, if not all, require an accompanying mobile app to set up the device(s). It should be made clear to the user that this is necessary, whether that information is on the device itself, on the packaging, or inside the packaging. This critical information should not be obscured by marketing material. Some smart speakers, like Google Home Mini, emit verbal instructions that tell the user to download the app as the next step in the setup process, obviating the need to read anything to start use.
Figure 2. Google Home Mini speaker.
The setup app should be easy to use. One way of ensuring this is to provide a step-by-step progression of what needs to be done to complete setup. Simple yet effective animations also help.
The app should reference what is happening with the device. One good way to do this is to provide visual and aural cues that indicate something is happening, such as processing, connecting, or activating, or that something has happened (e.g., confirming that Bluetooth is connected). This additional confirmation, which goes beyond blinking or solid lights, reassures users that the system is doing what is expected and that they are doing the right thing. Along with this, the app instructions should include a visual and/or aural portrayal of what is occurring with the device. For example, if the user is supposed to see the device flashing a green light, then that is what the app should display; all buttons and lights should be represented so they can be easily identified on the device. This is exactly what Sonos does (see Figure 3). The same concept applies to audio cues: if the device plays audio cues during setup, the setup app should indicate them as well.
Figure 3. Sonos app.
Setting Up Music and Voice Control
Once the smart speaker is set up initially on the wireless network, the “smart” capability must be set up if it is not inherently part of the setup steps. This typically entails connecting music services and voice control. Both of these should be as simple as possible. For devices that use another company’s voice control (e.g., Sonos, which uses Amazon’s Alexa voice control service), this can be tricky. Clear instructions about switching between Alexa and the speaker must be provided. If the two are separate processes, there should also be a clear visual indicator that the setup of the speaker is still taking place during the voice control part of setup. In the case of a Harman Kardon Allure speaker, this is done by keeping a black bar consistently on the top part of the app. Because the black banner remains at the top of the screen even while logging into Amazon, it seems like you are still within the Harman Kardon experience, and switching between the Harman Kardon app and Alexa feels seamless (see Figure 4). The voice control part of the setup is especially important because full use of the speaker relies on it.
Figure 4. Harman Kardon app showing the integration of Alexa installation.
For music services, it should be clear how to do the following:
- Add the service.
- See which music services have already been subscribed to and added.
- Select and change the default music service.
The order of the services should also be clear, as should the cost of each music subscription (i.e., whether it is free or paid and, if paid, how much it costs).
Connecting Mobile Devices to the Speaker Via Bluetooth
Another feature some smart speakers provide is the ability to play music or other audio content from a smartphone or tablet over Bluetooth. This requires a process called pairing the mobile device to the speaker. There is typically a Bluetooth button on the speaker that must be pressed, after which the speaker name shows up under the Bluetooth section of the smartphone or tablet. Harman Kardon Allure provides audio information confirming that Bluetooth is connected and gives instructions about how to pair the speaker more easily next time (say, “pair to phone”), which is reassuring and helpful.
Onboarding and Using the Speaker
Once a speaker is set up, it should be made clear to the user that the system is ready for use. At some point along the way, whether during setup or during a post-setup onboarding flow, examples of how the speaker can be used should be given. This is important for getting users started with something that they may have never done before, such as using a voice service. While users might know some things that can be done with a voice-controllable speaker, there may be additional uses they had not considered. Google Home Mini not only gives specific examples of voice commands and questions (e.g., “Where is the nearest flower store”), but also groups them into categories: Music, Get Answers, Get Stuff Done, and Fun & More. So even if the specific questions are not applicable to the users, the categories give them ideas of how to use the device in ways they may not have previously considered (see Figure 5).
Figure 5. Google Home Mini app displaying categories and examples of voice commands.
It should also be clear how to use the voice service. For example, does a user have to start the interaction with a “wake” word or phrase, such as “Alexa” or “Hey Google”? Can the device interpret follow-up questions correctly? If one asks, “What is the weather in Boston today?” and the speaker responds “partly cloudy with a high of 78 degrees,” can the user then ask “What about Seattle?” Or do they have to ask the full question, “What is the weather in Seattle today?” Google Home Mini provides an app setting to turn the continuous conversation option on or off. When it is on, interaction is closer to a human conversational norm, and users can get information more efficiently by using follow-up commands.
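To make the follow-up question example concrete for evaluators, here is a minimal sketch of one way a system could carry context from one question to the next. It is a toy illustration only and is not how Google Home Mini or any real voice service works; the `Query` type, its fields, and the `resolve_follow_up` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Query:
    """A hypothetical, already-parsed voice request."""
    intent: Optional[str] = None  # what the user wants, e.g. "weather"
    city: Optional[str] = None
    day: Optional[str] = None


def resolve_follow_up(previous: Query, follow_up: Query) -> Query:
    """Fill any detail the follow-up leaves out with the value from the previous query."""
    return Query(
        intent=follow_up.intent or previous.intent,
        city=follow_up.city or previous.city,
        day=follow_up.day or previous.day,
    )


# "What is the weather in Boston today?"
first = Query(intent="weather", city="Boston", day="today")

# "What about Seattle?" names only a new city.
follow_up = Query(city="Seattle")

# The resolved request is the weather in Seattle today.
resolved = resolve_follow_up(first, follow_up)
```

When evaluating a speaker, it helps to test which details survive a follow-up: here the question type and the day carry over, while the city is replaced. A speaker that drops any of them forces the user to repeat the full question.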
It should be clear how to raise and lower volume on the speaker manually and through voice interaction, and it should be made clear that the volume was raised or lowered after the user made the request. Google Home Mini emits a beep after setting the volume that lets the user know that the command was “heard” and implemented; the lights also change from all lights being lit to a subset of them being lit when the user changes the volume from 100% to a lower number.
Providing Support for Re-Setup
While most users should be able to use the speaker easily without additional help once it is set up, any good system should provide access to support, especially one that includes new technology and physical devices. A support section within the app should bring to the forefront issues that users might have that are critical to the use of the device. Harman Kardon does this by providing a short list of critical questions and answers. “How to” questions have answers that are never more than three steps. Extensive support is organized and includes imagery along with text.
It should be difficult to accidentally factory reset the speaker, but it should also be made clear on the device itself and/or within the app how to factory reset the device. This is typically done with a small button (e.g., Google Home Mini) or pin hole (e.g., Harman Kardon Allure). When resetting the Harman Kardon Allure, there is a voice confirmation that factory reset was successful and that the system is ready to be set up, which adds value to the user experience.
The Future of Smart Speaker UX Evaluation
This article only begins to touch on smart speaker UX evaluation guidelines. As smart speakers and voice technology continue to evolve, the considerations we take into account when evaluating these interfaces will also need to evolve. In addition to the topics mentioned above, evaluators will need to consider the tension between privacy and easy information access, the integration of smart speakers with other home devices and functions, and the effectiveness of artificial intelligence and machine learning in providing more accurate responses. Effectively evaluating the UX of these technologies is key to moving them forward in ways that work for users.