Voice Recognition is one of the hottest trends in the era of Natural User Interfaces. When used wisely, speech recognition is an effective and intuitive means of communication. As Virtual and Augmented Reality emerge, voice recognition is becoming a vital communication method between the human and the computer. Microsoft is now bringing Cortana, the high-tech virtual assistant, to every platform, including PCs, phones, XBOX, and, of course, HoloLens.
Today, I’ll show you how to develop a simple and open-source voice recognition game using C# and Unity3D.
To run the project, you’ll need the following software components:
- Unity3D 5.5 or later
- Visual Studio 2015 or later
- Windows 10
In my setup, I am using Unity 2019 with Visual Studio Community 2019.
Did you know?…
LightBuzz has been helping Fortune-500 companies and innovative startups create amazing Unity3D applications and games. If you are looking to hire developers for your project, get in touch with us.
Source code
The source code of the project is available in our LightBuzz GitHub account. Feel free to download, fork, and even extend it!
Step-by-step tutorial
To demonstrate the speech recognition capabilities, we’ll create a very simple demo. In a nutshell, here’s what we are going to do:
The user will be prompted to move a visual element on the screen. To move the element, she’ll need to say the direction: up, down, left, or right. The app will be listening for voice commands and will change the direction of the object when a voice command is recognized.
Step #1 – Create the User Interface
Launch Unity, create a new project, and add the following elements inside a Unity Canvas:
- A background image
- A movable object (in our case, a bee)
- Some text prompts
Step #2 – Specify the desired keywords
A speech recognition engine works best when you feed it with as few words as possible. Instead of searching the whole spectrum of the English language, we shall limit it to a small range. For our game, we need only four keywords: up, down, left, and right.
We can also specify the confidence level of the speech recognition engine. The confidence level is a value that indicates how ambiguous words should be treated. Use Medium or High for native English speakers, and Low for non-native speakers.
The third variable is the speed of the object.
All of these members are declared public, so you can edit them directly in the Unity Editor.
public string[] keywords = new string[] { "up", "down", "left", "right" };
public ConfidenceLevel confidence = ConfidenceLevel.Medium;
public float speed = 1.0f;
Finally, we declare a variable for the word that was recognized:
protected string word = "right";
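The snippets in this tutorial also use three members that are never declared in the text: recognizer, results, and target. Here is a minimal sketch of how the script skeleton might look. The types of results (a UI Text) and target (a GameObject) are assumptions, as is the class name, which simply mirrors a script file called SpeechRecognitionEngine.cs.

```csharp
using UnityEngine;
using UnityEngine.UI;
using UnityEngine.Windows.Speech;

// Assumed class name; attach the script to any GameObject in the scene.
public class SpeechRecognitionEngine : MonoBehaviour
{
    // Editable in the Unity Editor.
    public string[] keywords = new string[] { "up", "down", "left", "right" };
    public ConfidenceLevel confidence = ConfidenceLevel.Medium;
    public float speed = 1.0f;

    // Assumed types: a UI Text that shows feedback and the object to move.
    public Text results;
    public GameObject target;

    // Created in Start(); stays null if no keywords are provided.
    private KeywordRecognizer recognizer;

    // The last recognized keyword. The object starts by moving right.
    protected string word = "right";
}
```

Drag the text element and the movable object onto the results and target slots in the Inspector before pressing Play.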
Step #3 – Use the Keyword Recognizer
Now, I would like to introduce you to the KeywordRecognizer class. The Keyword Recognizer encapsulates the voice recognition engine.
When the application starts, we initialize an instance of the recognizer by providing the keywords and confidence level to its constructor. Whenever a word is recognized, the OnPhraseRecognized event handler shall be called. To enable voice recognition, you need to call the Start() method:
private void Start()
{
    if (keywords != null)
    {
        recognizer = new KeywordRecognizer(keywords, confidence);
        recognizer.OnPhraseRecognized += Recognizer_OnPhraseRecognized;
        recognizer.Start();
    }
}
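On some machines, creating the recognizer throws "Speech recognition is not supported on this machine". One possible safeguard is to check PhraseRecognitionSystem.isSupported before constructing it. The sketch below assumes that this check is enough to skip the failing call on such machines. The class name GuardedRecognizerStart is hypothetical, and cleanup is omitted to keep the example short.

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

// Hypothetical class name, used for illustration only.
public class GuardedRecognizerStart : MonoBehaviour
{
    public string[] keywords = new string[] { "up", "down", "left", "right" };
    public ConfidenceLevel confidence = ConfidenceLevel.Medium;

    private KeywordRecognizer recognizer;

    private void Start()
    {
        // Skip initialization when the machine reports no speech support.
        if (!PhraseRecognitionSystem.isSupported)
        {
            Debug.LogWarning("Speech recognition is not supported on this machine.");
            return;
        }

        // Nothing to listen for without keywords.
        if (keywords == null || keywords.Length == 0)
        {
            return;
        }

        recognizer = new KeywordRecognizer(keywords, confidence);
        recognizer.OnPhraseRecognized += OnPhraseRecognized;
        recognizer.Start();
    }

    private void OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        Debug.Log("Recognized: " + args.text);
    }
}
```

With this guard, the game can still run without voice control, for example by falling back to keyboard input.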
Remember to call the Stop() method when the application closes:
private void OnApplicationQuit()
{
    if (recognizer != null && recognizer.IsRunning)
    {
        recognizer.OnPhraseRecognized -= Recognizer_OnPhraseRecognized;
        recognizer.Stop();
    }
}
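If your game loads more than one scene, you may also want to release the recognizer whenever its owner is destroyed, not only when the application quits. The following is a sketch of one way to do that using OnDestroy and Dispose. The class name RecognizerCleanup is hypothetical, and this is not necessarily how the original project handles cleanup.

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

// Hypothetical class name, used for illustration only.
public class RecognizerCleanup : MonoBehaviour
{
    private KeywordRecognizer recognizer;

    private void Start()
    {
        recognizer = new KeywordRecognizer(new string[] { "up", "down", "left", "right" });
        recognizer.OnPhraseRecognized += Recognizer_OnPhraseRecognized;
        recognizer.Start();
    }

    // Called when the object is destroyed, which includes scene unloads.
    private void OnDestroy()
    {
        if (recognizer == null)
        {
            return;
        }

        recognizer.OnPhraseRecognized -= Recognizer_OnPhraseRecognized;

        if (recognizer.IsRunning)
        {
            recognizer.Stop();
        }

        // Release the underlying speech resources.
        recognizer.Dispose();
        recognizer = null;
    }

    private void Recognizer_OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        Debug.Log("Recognized: " + args.text);
    }
}
```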
Here is the OnPhraseRecognized event handler. The PhraseRecognizedEventArgs parameter provides us with the exact text that was recognized:
private void Recognizer_OnPhraseRecognized(PhraseRecognizedEventArgs args)
{
    word = args.text;
    results.text = "You said: <b>" + word + "</b>";
}
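Besides the text, PhraseRecognizedEventArgs also exposes the confidence of the match and the duration of the phrase, which can help when tuning the confidence level. Here is a small sketch that logs these values. The class name RecognitionLogger and the log format are made up for illustration.

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

// Hypothetical class name, used for illustration only.
public class RecognitionLogger : MonoBehaviour
{
    private KeywordRecognizer recognizer;

    private void Start()
    {
        // Low confidence lets more candidates through, so there is more to inspect.
        recognizer = new KeywordRecognizer(
            new string[] { "up", "down", "left", "right" },
            ConfidenceLevel.Low);
        recognizer.OnPhraseRecognized += LogPhrase;
        recognizer.Start();
    }

    private void LogPhrase(PhraseRecognizedEventArgs args)
    {
        // args.confidence is a ConfidenceLevel; args.phraseDuration is a TimeSpan.
        Debug.Log("Heard \"" + args.text + "\""
            + " | confidence: " + args.confidence
            + " | duration: " + args.phraseDuration.TotalSeconds.ToString("F2") + "s");
    }

    private void OnDestroy()
    {
        if (recognizer != null)
        {
            recognizer.OnPhraseRecognized -= LogPhrase;
            recognizer.Dispose();
        }
    }
}
```

If most of your commands arrive with Low confidence, that is a hint to lower the threshold or pick keywords that sound less alike.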
Step #4 – Add some game logic
Since we know the recognized word, we can now move the visual element on our Canvas. All we need to do is change the position of the game object:
private void Update()
{
    var x = target.transform.position.x;
    var y = target.transform.position.y;

    switch (word)
    {
        case "up":
            y += speed;
            break;
        case "down":
            y -= speed;
            break;
        case "left":
            x -= speed;
            break;
        case "right":
            x += speed;
            break;
    }

    target.transform.position = new Vector3(x, y, 0);
}
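Note that Update() runs once per frame, so the code above moves the object by speed units every frame and the bee will travel faster on faster machines. A possible variant, sketched below, scales the movement by Time.deltaTime so that speed is expressed in units per second. The class name, the default speed value, and the public word field are assumptions for illustration, and unlike the original this version leaves the z coordinate untouched.

```csharp
using UnityEngine;

// Hypothetical class name, used for illustration only.
public class FrameRateIndependentMover : MonoBehaviour
{
    public GameObject target;      // Assumed: the object to move.
    public float speed = 100.0f;   // Units per second (illustrative value).
    public string word = "right";  // Last recognized keyword.

    private void Update()
    {
        if (target == null)
        {
            return;
        }

        Vector3 direction;

        switch (word)
        {
            case "up":
                direction = Vector3.up;
                break;
            case "down":
                direction = Vector3.down;
                break;
            case "left":
                direction = Vector3.left;
                break;
            case "right":
                direction = Vector3.right;
                break;
            default:
                // Unknown word: stay in place.
                direction = Vector3.zero;
                break;
        }

        // Scale by the frame time so the speed does not depend on the frame rate.
        target.transform.position += direction * speed * Time.deltaTime;
    }
}
```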
Step #5 – Player Settings
This is a tricky one and it can be easily missed. To enable voice commands in your Windows app, you need to add the Microphone capability under the Player Settings.
- Go to File → Build Settings
- Select Player Settings
- Click the green Windows Store icon
- Find the Capabilities tab
- Check Microphone
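If no commands are recognized even with the capability enabled, it can help to confirm which microphones Unity detects. The sketch below lists them at startup using Microphone.devices. It is only a diagnostic aid and does not replace the capability setting. The class name MicrophoneCheck is hypothetical.

```csharp
using UnityEngine;

// Hypothetical class name, used for illustration only.
public class MicrophoneCheck : MonoBehaviour
{
    private void Start()
    {
        // Names of the audio input devices that Unity can see.
        string[] devices = Microphone.devices;

        if (devices.Length == 0)
        {
            Debug.LogWarning("No microphone detected.");
            return;
        }

        foreach (string device in devices)
        {
            Debug.Log("Microphone found: " + device);
        }
    }
}
```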
Step #6 – Export
KeywordRecognizer is available for Windows Standalone and Windows Store Universal (Windows 10). The exported project will also run as a Windows Store 8.1 app; however, the speech recognition features will not be available on Windows 8.1.
Of course, the API is compatible with Universal Windows Platform, so the exported app can run on:
- PC
- Phone
- XBOX
- HoloLens
The app can run offline, too, using Cortana’s speech infrastructure. Unlike Siri, no active Internet connection is required.
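Since the speech API is Windows-only, a project that also targets other platforms may want to isolate the recognizer behind Unity's platform defines. Here is a rough sketch of that pattern, assuming the standard UNITY_STANDALONE_WIN, UNITY_WSA, and UNITY_EDITOR_WIN symbols. The class name PlatformGuardedSpeech is hypothetical.

```csharp
using UnityEngine;
#if UNITY_STANDALONE_WIN || UNITY_WSA || UNITY_EDITOR_WIN
using UnityEngine.Windows.Speech;
#endif

// Hypothetical class name, used for illustration only.
public class PlatformGuardedSpeech : MonoBehaviour
{
#if UNITY_STANDALONE_WIN || UNITY_WSA || UNITY_EDITOR_WIN
    private KeywordRecognizer recognizer;

    private void Start()
    {
        recognizer = new KeywordRecognizer(new string[] { "up", "down", "left", "right" });
        recognizer.OnPhraseRecognized += OnPhraseRecognized;
        recognizer.Start();
    }

    private void OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        Debug.Log("Recognized: " + args.text);
    }

    private void OnDestroy()
    {
        if (recognizer != null)
        {
            recognizer.OnPhraseRecognized -= OnPhraseRecognized;
            recognizer.Dispose();
        }
    }
#else
    private void Start()
    {
        // Other targets need a different speech solution or an input fallback.
        Debug.LogWarning("KeywordRecognizer is only available on Windows targets.");
    }
#endif
}
```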
Summary
In this tutorial, we learned how to use Unity’s KeywordRecognizer to understand voice commands and integrate them into a Windows application.
Source code
You made it to this point? Awesome! Here is the source code for your convenience.
Sharing is caring!
If you liked this article, remember to share it on social media, so you can help other developers, too! Also, let me know your thoughts in the comments below. ‘Til the next time… keep coding!
This looks amazing! Thanks for sharing. Do you know if this would be compatible with a Vive application?
Hi Blair! Thanks for your kind words and support.
Speech recognition is part of Unity and it runs on the application side, while VR is only a “projection on a headset monitor”. I strongly believe that running on Vive won’t be an issue. After all it’s a PC game and all you need is a microphone!
How to make it work on the Android system?
Ahh. Ok. I do remember seeing something about speech recognition a while back in Unity. I’m just surprised we haven’t seen more of it turn up inside of VR yet. Maybe the processing is heavy? I’ll have to try an experiment or two. Thanks.
Hello i seem to have an issue, im running unity 5.6 and windows 10, trying to run it in the editor and getting this error
UnityException: Speech recognition is not supported on this machine.
UnityEngine.Windows.Speech.PhraseRecognizer.CreateFromKeywords (System.String[] keywords, ConfidenceLevel minimumConfidence) (at C:/buildslave/unity/build/artifacts/generated/common/runtime/SpeechBindings.gen.cs:43)
UnityEngine.Windows.Speech.KeywordRecognizer..ctor (System.String[] keywords, ConfidenceLevel minimumConfidence) (at C:/buildslave/unity/build/Runtime/Export/Windows/Speech.cs:221)
UnityEngine.Windows.Speech.KeywordRecognizer..ctor (System.String[] keywords) (at C:/buildslave/unity/build/Runtime/Export/Windows/Speech.cs:201)
Hi Jake. Have you enabled the Microphone capability in your Player Settings?
Hello Jake. Seems your computer is not supporting Speech Recognition. Consider enabling Cortana or updating Windows to a newer version.
Hello, I have an issue when running the source code on Unity 5.6.0f3 and windows 10. The error is
UnityException: Speech recognition is not supported on this machine.
UnityEngine.Windows.Speech.PhraseRecognizer.CreateFromKeywords (System.String[] keywords, ConfidenceLevel minimumConfidence) (at C:/buildslave/unity/build/artifacts/generated/common/runtime/SpeechBindings.gen.cs:47)
UnityEngine.Windows.Speech.KeywordRecognizer..ctor (System.String[] keywords, ConfidenceLevel minimumConfidence) (at C:/buildslave/unity/build/Runtime/Export/Windows/Speech.cs:221)
SpeechRecognitionEngine.Start () (at Assets/Scripts/SpeechRecognitionEngine.cs:23)
And I had enabled the Microphone capability in Player Settings like this (image:http://imgur.com/a/X0mjC). But the error still occurred.
Hi Brian. Have you enabled Developer Mode in Windows 10?
Yes, I had enabled it. But the error still the same one.
What version of Windows are you running? Voice Recognition works with Windows 10 Anniversary Update or later. For example, the version of Windows on my computer is 1607 (Settings → System → About).
I’m using Win10 Business and version is 1607.
Couldn’t find any information related to your issue. Does the version of Unity you are using include Windows Store 10 support?
how one can set language? is inherited from the language of the unity project?
Hi Massimo. KeywordRecognizer does not have an API to specify the language. You could use the native UWP speech API to specify the language at runtime.
Hi Massimo,
Did you manage to add more languages?
@Vangos thanks for this great example!
It can be applied to Android phones?
Hi Amjad. This works on Windows only. It’s using the latest Windows 10 speech capabilities.
For Android and other mobile platforms check this plugin.
I tried to run this project and it seems that Unity does not recognize my microphone ( or something like that) because it does not get any command. Any idea why and how I can fix that?
Hi Andrew. You can check the microphone devices list to ensure Unity is detecting the right microphone.
Hi. I’m developing a fully immersive VR game using speech to text, a conversational agent, and then text-to speech. The object is to have an NPC that can have full conversations with you. Originally, I jumped on the IBM Watson bandwagon, but the lag with trying to integrate 3 APIs the serializing and deserializing the data was too much. Not to mention the trouble I had with the very spare documentation from IBM. I ran across your page and am now going to this method. Thank you kind sir!
Dear Veronica Furukawa,
I’m trying to develop a NPC with the same features listed by you. How did your project go? Do you have any link to share it (github, etc.). I’m dealing with a lot of compatibility issues. Please, if this message reaches you, get in touch.
Best Regards,
Tiago
Hi,
Great work!
Do you know other plugins that may work on Android? The one you mention above is no longer available (Unfortunately, Speech-to-Text is no longer available.).
Thanks 🙂
Teresa.
Hi Teresa. You can try the Android Speech TTS Unity plugin, which also added offline support for the English language.
hi can you make a tutorial for android? We need that for our capstone project, i can’t find any tutorial for that, plss i’m begging you.
Hello. You can use the Android Speech TTS plugin.
hi i am trying something like button press voice search and
using voice i need to keyword search
Eg : when i say animation, what are details regarding animation inside that project, it should display by points at the bottom and i need to able click and read the text
kindly guide me how to create that i am new to this voice recognition in unity
Thanks and it is great tutorial and it works fine
Hello Ajay and thanks for your comment. The voice recognition code would remain almost the same. To display your results, I suggest you get started with the new Unity UI system and animations.
Fantastic tutorial, thanks!
I have a question: Is it possible to use audio files as input instead of the microphone? I’ve been looking at DictationRecognizer and it seems to only support speech input from microphone only.
Any advise will be great, thank you!
Thank you for your comment. As of now, Unity supports input from the microphone only.
Very helpful!
Thanks much!
I am developing a voice enabled media player for unity,I am stuck in this part i.e adding voice commands to control the video controls of media player,how can I make it work like when I say play it should play and it should pause when I say pause
I am very new to this
It will be of great help for me if you reply me with helpful information
I am having an idea of creating a play button for the video player and dumping the script of c#code which contains keyword recognition to the player and adding a button to the script and when I say play it should play!
I am in very much need of this code if anyone can help me with this code I will be so thankful to you
Please anyone help me
Reply to Prasanna011124@gmail.com
Hey,
Thank you very much for the good work!
Could you tell me if it is possible to learn the system new keywords it doesn’t know (e.g. some special names)?
It would be pretty cool!
Thank you 🙂
I would like to learn how it can give feedback when I say a wrong word.
Hi,
I’m a student working on a game project, can I use your code tweak a little and sell it? Would i need to buy any license from you or? Is mentioning in the credits is enough?
Thank you
Hi Anna. The code is part of the Unity3D engine, which has commercial licenses. You can use the code in any way you like. No credit is required, even though we would definitely appreciate it.
Thanks this has been super helpful.
Hi, is it just me or does it only use the system default microphone on Windows 10?
Hello. It’s using the default microphone.
Can it be deployed to real Hololens ?
Yes!
hello, sorry for bad English… 🙁
I try to this project into VR project.
I using samsung oddessey which based on Window Mixed Reality.
without put on Headset, voice recognition works good as I expect.
but, when I put on Headset on my head, voice recognition doesn’t work at all.
I checked window’s voice and mic option several times and there is no clue…
is there any way to use voice recognition during VR situation?
thanks a lot share this project. It helps a lot!!!!:)
I want to work on virtual reality. I want to talk to Unity about converting speech to text and working on conversations. please guide me.
please send my email. thanks.
my email : shaho1763@gmail.com
thx for tutorial!!
hi. I am a Korean who is studying Unity
I am follow the video and working it.
but “up” is low in recognition and if I put in another word, it won’t recognize at all.
Of course, my pronunciation is not good, but I don’t recognize simple words such as apples, bananas, one or two.
Is it correct that it works well even if other words are added?
I was not good at English, so I used a translator. I’m sorry.
Pronunciation is quite important. However, you can change the ConfidenceLevel to Low. This will allow for easier recognition.
Hi, I’m trying to make a VR game with voice commands. Is there a way to set special words or fake words, e.g. magic spells and names?
Hello, very nice project. I have been able to implement it into my game. I have two issues. The first is that once the project is built, the voice recognition only works for a few sessions and then stops working completely; the game is still playable, but the voice-to-text part is not. The second issue is that I can’t use profanity. Is there a way around this?
Thanks for your comment. Ensure you are opening and closing the speech session properly, especially when navigating from/to a new scene.
Hello Vangos, I have everything working only thing is I can’t figure out how to remove the profanity filter, can it be done and if yes can you point me in the right direction please.