This page presents several best practices to create an immersive and engaging experience using the Voice SDK.
General Best Practices
Use Wit.ai to manage your app versioning so you can work on the next version while keeping a stable production version. Wit.ai lets you control app versions through the API or through the versioning panel on the settings page of the app. Versions are represented as tags on a timeline, and you can target a specific version by passing its tag as a request parameter in your API calls. For more information, see the Recipes section of the Wit.ai documentation.
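If you call the Wit.ai HTTP API directly, a request can be pinned to a tagged app version with a query parameter. The following is a minimal sketch in Python; it assumes the /message endpoint’s q, v, and tag query parameters, a server access token read from a WIT_SERVER_TOKEN environment variable, and a placeholder tag name of v1.0-production.

```python
import os
import requests

WIT_API_URL = "https://api.wit.ai/message"
# Server access token from the app's settings page (placeholder source).
WIT_TOKEN = os.environ["WIT_SERVER_TOKEN"]

def query_tagged_version(utterance: str, tag: str) -> dict:
    """Send an utterance to Wit.ai, pinned to a specific app version tag."""
    response = requests.get(
        WIT_API_URL,
        headers={"Authorization": f"Bearer {WIT_TOKEN}"},
        params={
            "v": "20240101",   # API version date
            "q": utterance,    # the text to interpret
            "tag": tag,        # target a tagged app version, e.g. "v1.0-production"
        },
        timeout=10,
    )
    response.raise_for_status()
    return response.json()

# Example: keep production traffic on a stable tag while you train the next version.
result = query_tagged_version("turn the lights blue", tag="v1.0-production")
print(result.get("intents"), result.get("entities"))
```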
Use built-in intents, entities, and traits to help bootstrap the voice experience in your app. These built-in NLP (Natural Language Processing) options come prebuilt and already trained, so they can make your app development faster and easier. For more information, see Built-In NLP.
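As an illustration of why built-ins save work, here is roughly how a built-in entity can show up in a /message response, already resolved without any training on your part. The response layout below assumes the current Wit.ai response format, the values are made up for the example, and the wit$ prefix is how built-in items are namespaced in responses.

```python
# Illustrative response only: the field layout assumes the current Wit.ai
# /message response format, and all values here are made up for the example.
wit_response = {
    "text": "set a timer for five minutes",
    "intents": [{"name": "set_timer", "confidence": 0.97}],
    "entities": {
        # Built-in entities carry the "wit$" prefix.
        "wit$duration:duration": [
            {"body": "five minutes", "confidence": 0.99, "value": 5, "unit": "minute"}
        ],
    },
    "traits": {},
}

def best_intent(response: dict, threshold: float = 0.8):
    """Return the top intent name if its confidence clears the threshold."""
    intents = response.get("intents", [])
    if intents and intents[0]["confidence"] >= threshold:
        return intents[0]["name"]
    return None

durations = wit_response["entities"].get("wit$duration:duration", [])
print(best_intent(wit_response), [d["body"] for d in durations])
```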
If you’re using some form of on-device ASR (Automatic Speech Recognition) for transcriptions, use the Activate(string) method to activate your App Voice Experience. This method sends the contents of the provided string to Wit.ai for NLU (Natural Language Understanding) processing. For more information, see Activation.
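For reference, activating with a transcription means only text reaches Wit.ai; no audio is streamed. The Unity-side call is Activate(string), and the underlying request is roughly equivalent to the /message call sketched below in Python, assuming the same WIT_SERVER_TOKEN placeholder as in the earlier sketch.

```python
import os
import requests

def send_transcription(transcript: str) -> dict:
    """Send an on-device ASR transcript to Wit.ai so only the NLU runs on it."""
    response = requests.get(
        "https://api.wit.ai/message",
        headers={"Authorization": f"Bearer {os.environ['WIT_SERVER_TOKEN']}"},
        params={"v": "20240101", "q": transcript},  # q carries the transcript text
        timeout=10,
    )
    response.raise_for_status()
    return response.json()

# The transcript comes from your on-device ASR engine, not from Wit.ai.
nlu_result = send_transcription("open the inventory")
```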
Improving Your Results
When you first create an app, your initial results may not be very accurate, and you may have to try several times before Wit.ai recognizes what you say. Try the following suggestions to help improve your results.
Use a higher-quality microphone and reduce the ambient noise in your room.
Listen to the log of attempted utterances on the Understanding tab of Wit.ai, and enter the correct transcriptions to help train Wit.ai to better recognize your voice commands. (You can also submit annotated utterances programmatically; see the first sketch after this list.)
Manually enter additional synonyms (words or phrases) for Wit.ai to choose from into the training portion of the Understanding tab. For example, add common color names to bias results toward the colors your app supports. (The second sketch after this list shows one way to add keywords and synonyms through the API.)
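If you would rather push corrected transcriptions and annotations from a script than through the Understanding tab, one option is the training endpoint. The sketch below is a best-effort illustration that assumes the POST /utterances request format (text, intent, entities, traits); the change_color intent and color entity are placeholders that would need to exist in your Wit.ai app.

```python
import os
import requests

# Hypothetical annotated utterance; intent and entity names are placeholders.
annotated = [
    {
        "text": "make the walls green",
        "intent": "change_color",
        "entities": [
            {
                "entity": "color:color",
                "start": 15,
                "end": 20,
                "body": "green",
                "entities": [],
            }
        ],
        "traits": [],
    }
]

response = requests.post(
    "https://api.wit.ai/utterances",
    headers={
        "Authorization": f"Bearer {os.environ['WIT_SERVER_TOKEN']}",
        "Content-Type": "application/json",
    },
    params={"v": "20240101"},
    json=annotated,
    timeout=10,
)
response.raise_for_status()
```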
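Similarly, keyword synonyms can be added programmatically instead of by hand. This sketch assumes a keyword entity named color already exists in the app and uses the POST /entities/:entity/keywords endpoint; the color names and synonyms are only examples.

```python
import os
import requests

TOKEN = os.environ["WIT_SERVER_TOKEN"]

# Each keyword plus its synonyms biases resolution toward colors the app supports.
# The "color" entity name is a placeholder for a keyword entity in your app.
color_keywords = [
    {"keyword": "red", "synonyms": ["red", "crimson", "scarlet"]},
    {"keyword": "blue", "synonyms": ["blue", "navy", "azure"]},
    {"keyword": "green", "synonyms": ["green", "emerald"]},
]

for entry in color_keywords:
    resp = requests.post(
        "https://api.wit.ai/entities/color/keywords",
        headers={"Authorization": f"Bearer {TOKEN}"},
        params={"v": "20240101"},
        json=entry,
        timeout=10,
    )
    resp.raise_for_status()
```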
Design Best Practices
When designing your app, provide a simple and clear way to trigger voice interactions. This can include:
A clear, visible UI element the user can select to invoke voice interactions, such as a microphone icon or an exclamation point over a character’s head.
A clear in-game action the user can perform to trigger the interaction, such as rubbing a magic lamp or talking to an NPC.
A common action that combines eye gaze with a gesture to trigger the interaction, such as looking at an NPC and waving.
A hand gesture, either on its own or combined with eye tracking, to start the interaction, such as waving a magic wand to say a spell.
Always indicate to the user when the microphone is active. This is a very important part of creating a user-friendly app voice experience, and can be something simple, like a sound or graphic indicating microphone status.
Improve discoverability and usability of voice interactions in your app by building in-app user education that teaches users what they can say and do with their voice.