Attention system
Attention systems provide your users with an easy, integrated way to understand what’s happening with their voice interactions within an app. They can indicate what users can interact with by voice within a game, or provide feedback on their voice input. Creating an effective attention system for your app is essential to creating a good experience for your users, and it’s the best way to address the concern that the app is always listening.
What is an attention system?
Creating an attention system is the first step in designing an efficient voice system for your app. It makes things easier for your users by providing them with audio and visual cues that let them know the microphone is active. It also creates a way to provide feedback about how your user’s voice commands are received, including responding to errors, reducing the perception of latency, and even just indicating that a voice command has been received. It’s a way for you, as the developer, to deepen the immersive experience that lets your users enjoy themselves more.
Additionally, attention systems are critical to helping your users know when the microphone is “active” and “listening.”
Some basic attention system components are available for you to use in the Voice SDK Toolkit on GitHub. These can be valuable starting points while you customize the look and feel of your attention system to fit the style and experience of the app you’re creating.
Mic status cues (required)
Status cues, mapped to the attention system states, visually guide your user and let them know when their mic is on and the system is ready to receive voice command input. This helps prevent frustration and misunderstanding about when to speak.
Some basic techniques can be used to show when the mic is on, so your user knows when their audio input is being collected.
Important: The Mic on status should be accurately represented on screen as soon as your app calls for audio input and maintained for the full duration that audio collection is enabled, until the mic is turned off.
A basic mic status attention system can be added to the user’s headset view or to individual objects or characters. More elaborate systems can be used, but the following cover the basic states that should be communicated:
- Mic on: This status covers the Listening (On), Inactive, Processing, and Response mic states.
- Mic off: This status covers the Not Listening mic state.
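The two-cue mapping above can be sketched as a simple lookup. This is a minimal illustration, not the Voice SDK API; the enum and function names are assumptions chosen to mirror the states listed here.

```python
from enum import Enum, auto

class MicState(Enum):
    """Fine-grained mic states named in the list above."""
    LISTENING = auto()      # mic open, actively capturing speech
    INACTIVE = auto()       # mic open, no speech detected yet
    PROCESSING = auto()     # utterance captured, awaiting a result
    RESPONSE = auto()       # result received, app is responding
    NOT_LISTENING = auto()  # mic closed

# The visual "Mic on" cue covers every state in which audio may still be
# collected; "Mic off" covers only NOT_LISTENING.
MIC_ON_STATES = {
    MicState.LISTENING,
    MicState.INACTIVE,
    MicState.PROCESSING,
    MicState.RESPONSE,
}

def mic_status_cue(state: MicState) -> str:
    """Return which on-screen status cue to show for a given mic state."""
    return "mic_on" if state in MIC_ON_STATES else "mic_off"
```

Collapsing the fine-grained states into two visible cues keeps the rule on L11 easy to satisfy: the on-screen status tracks whether audio collection is enabled, not the internal processing stage.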
Using a visual cue or icon to reflect audio input levels is an easy way to help your user gauge and calibrate their mic volume so voice commands can be heard and processed properly.
Before using a visual cue like this, test it to make sure the animation syncs correctly with audio input devices. You should also test it in the type of environment where you expect your user to use your app.
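One common way to drive such a cue is to compute a running level (for example, RMS amplitude) from each incoming audio buffer and map it to the fill of a volume-meter visual. The sketch below is illustrative only; the function names and the `quiet`/`loud` calibration thresholds are assumptions, not part of the Voice SDK.

```python
import math

def rms_level(samples):
    """Root-mean-square amplitude of one audio buffer (floats in [-1, 1])."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_to_bar(level, quiet=0.02, loud=0.5):
    """Map an RMS level to a 0.0-1.0 fill for a volume-meter visual.

    `quiet` and `loud` are illustrative thresholds; tune them by testing
    in the environments where you expect your app to be used.
    """
    if level <= quiet:
        return 0.0
    return min(1.0, (level - quiet) / (loud - quiet))
```

Testing in a realistic environment matters here because background noise raises the idle RMS level, which is why the `quiet` floor exists and needs per-app tuning.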
You can also use audio feedback to reinforce a visual attention system that shows mic states or as an alternate method that can be used when the mic is open for increased accessibility and immersion.
A selection of custom-crafted earcon sounds is available to get you started. When using them, remember that activation earcons play frequently in an app, so choose one that is short and easy to listen to repeatedly.
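Because activation earcons fire often, it can also help to rate-limit playback so rapid state changes don’t spam the sound. A minimal cooldown wrapper might look like the following; `EarconPlayer` and `play_sound` are hypothetical names, and the audio call itself would come from your engine.

```python
import time

class EarconPlayer:
    """Plays an activation earcon, but no more often than `cooldown` seconds.

    A hypothetical helper: `play_sound` stands in for your engine's audio
    call, and `clock` is injectable so the cooldown is easy to test.
    """

    def __init__(self, play_sound, cooldown=0.25, clock=time.monotonic):
        self._play = play_sound
        self._cooldown = cooldown
        self._clock = clock
        self._last_played = None

    def on_mic_activated(self):
        """Play the earcon if the cooldown has elapsed; report whether it played."""
        now = self._clock()
        if self._last_played is None or now - self._last_played >= self._cooldown:
            self._play()
            self._last_played = now
            return True
        return False
```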
The following status cues and icons, combined with animation and earcons, provide an example of how you can use these tools in combination to effectively show voice interaction states.
These elements can be seen working in combination in the following video:
As you continue to explore ways to bring voice interaction into your app experiences, you may consider integrating attention systems directly into character and environment design. For example:
- Character animations: A character has specific expressions and gestures to show active listening or comprehension.
- Environment design: An object glows and moves, indicating interaction or response.
- Dialogue action prompts: The user is prompted and issues voice commands by way of conversational dialogue with an NPC.
- Transcription: Displaying a text transcription of audio input provides an additional indication of when the mic is open and receiving input. It can also serve as light guidance about how the system hears voice commands, so your user can adjust their input.
For additional information on using Attention Systems with Voice SDK, see the Voice SDK Toolkit.