Voice Recognition is one of the hottest trends in the era of Natural User Interfaces. When used wisely, speech recognition is an effective and intuitive means of communication. As Virtual and Augmented Reality emerge, voice recognition is becoming a vital communication method between humans and computers. Microsoft is now bringing Cortana, the high-tech virtual assistant, to every platform, including PCs, phones, Xbox, and, of course, HoloLens.

Today, I’ll show you how to develop a simple and open-source voice recognition game using C# and Unity3D. Here’s the end result:

To run the project, you’ll need the following software components:

In my setup, I am using Unity 2019 with Visual Studio Community 2019.

Did you know?…

LightBuzz has been helping Fortune-500 companies and innovative startups create amazing Unity3D applications and games. If you are looking to hire developers for your project, get in touch with us.

Source code

The source code of the project is available in our LightBuzz GitHub account. Feel free to download, fork, and even extend it!

Download on GitHub

Step-by-step tutorial

To demonstrate the speech recognition capabilities, we’ll create a very simple demo. In a nutshell, here’s what we are going to do:

The user will be prompted to move a visual element on the screen. To move the element, she’ll need to say the direction: up, down, left, or right. The app will be listening for voice commands and will change the direction of the object when a voice command is recognized.

Step #1 – Create the User Interface

Launch Unity, create a new project, and add the following elements inside a Unity Canvas:

  • A background image
  • A movable object (in our case, a bee)
  • Some text prompts

Speech Recognition Game UI

Step #2 – Specify the desired keywords

A speech recognition engine works best when you feed it with as few words as possible. Instead of searching the whole spectrum of the English language, we shall limit it to a small range. For our game, we need only four keywords: up, down, left, and right.

We can also specify the confidence level of the speech recognition engine. The confidence level is a value that indicates how ambiguous words should be treated. Use Medium or High for native English speakers, and Low for non-native speakers.

The third variable is the speed of the object.

All of these members are declared public, so you can edit them right in the Unity Editor, where they show up in the Inspector.

public string[] keywords = new string[] { "up", "down", "left", "right" };
public ConfidenceLevel confidence = ConfidenceLevel.Medium;
public float speed = 1.0f;

Finally, we declare a variable for the word that was recognized:

protected string word = "right";
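
The snippets in the next steps also refer to three members that are not declared above: recognizer, results, and target. Here is a minimal sketch of how the whole component could be laid out. The class name and the types of results and target are assumptions inferred from how the snippets use them; they are not copied from the project.

```csharp
using UnityEngine;
using UnityEngine.UI;
using UnityEngine.Windows.Speech;

// Sketch only: the class name and the types of `results` and `target`
// are assumptions inferred from how the tutorial's snippets use them.
public class SpeechRecognitionEngine : MonoBehaviour
{
    // Editable in the Inspector.
    public string[] keywords = new string[] { "up", "down", "left", "right" };
    public ConfidenceLevel confidence = ConfidenceLevel.Medium;
    public float speed = 1.0f;

    // UI references, assigned by dragging objects onto the component.
    public Text results;   // label that displays the recognized word
    public Image target;   // the movable object (the bee)

    // The recognizer instance that is created in Start().
    protected KeywordRecognizer recognizer;

    // The last recognized word; the object starts out moving right.
    protected string word = "right";
}
```

With a layout like this, the Start(), OnApplicationQuit(), event handler, and Update() methods shown below would all go inside the same class.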

Step #3 – Use the Keyword Recognizer

Now, I would like to introduce you to the KeywordRecognizer class. The Keyword Recognizer encapsulates the voice recognition engine.

When the application starts, we initialize an instance of the recognizer by providing the keywords and confidence level to its constructor. Whenever a word is recognized, the recognizer raises its OnPhraseRecognized event, which calls our Recognizer_OnPhraseRecognized handler. To enable voice recognition, you need to call the recognizer’s Start() method (not to be confused with Unity’s own Start() message, where we do the setup):

private void Start()
{
    if (keywords != null)
    {
        recognizer = new KeywordRecognizer(keywords, confidence);
        recognizer.OnPhraseRecognized += Recognizer_OnPhraseRecognized;
        recognizer.Start();
    }
}
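
On machines where Windows speech recognition is unavailable, the KeywordRecognizer constructor throws a UnityException ("Speech recognition is not supported on this machine"). One way to fail more gracefully is to check PhraseRecognitionSystem.isSupported before creating the recognizer. The following is a hedged sketch of that idea, not the project's code; the class name, the handler name, and the log messages are illustrative.

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

// Hypothetical component name; the point of the sketch is the guard in Start().
public class GuardedSpeechRecognizer : MonoBehaviour
{
    public string[] keywords = new string[] { "up", "down", "left", "right" };
    public ConfidenceLevel confidence = ConfidenceLevel.Medium;

    private KeywordRecognizer recognizer;

    private void Start()
    {
        // Bail out early instead of letting the constructor throw.
        if (!PhraseRecognitionSystem.isSupported)
        {
            Debug.LogWarning("Speech recognition is not supported on this machine.");
            return;
        }

        if (keywords != null && keywords.Length > 0)
        {
            recognizer = new KeywordRecognizer(keywords, confidence);
            recognizer.OnPhraseRecognized += OnPhraseRecognized;
            recognizer.Start();
        }
    }

    private void OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        Debug.Log("Recognized: " + args.text);
    }

    private void OnDestroy()
    {
        if (recognizer != null)
        {
            recognizer.OnPhraseRecognized -= OnPhraseRecognized;
            if (recognizer.IsRunning)
            {
                recognizer.Stop();
            }
            recognizer.Dispose();
        }
    }
}
```

When the check fails, the game can still run; it simply won't react to voice, so you may want to offer keyboard input as a fallback.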

Remember to call the Stop() method when the application closes:

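private void OnApplicationQuit()
{
    if (recognizer != null)
    {
        // Unsubscribe first, so the handler is not called during shutdown
        recognizer.OnPhraseRecognized -= Recognizer_OnPhraseRecognized;
        if (recognizer.IsRunning)
        {
            recognizer.Stop();
        }
        // Release the native speech resources held by the recognizer
        recognizer.Dispose();
    }
}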

Here is the OnPhraseRecognized event handler. The PhraseRecognizedEventArgs parameter provides us with the exact text that was recognized:

private void Recognizer_OnPhraseRecognized(PhraseRecognizedEventArgs args)
{
    word = args.text;
    results.text = "You said: <b>" + word + "</b>";
}

Step #4 – Add some game logic

Since we know the recognized word, we can now move the visual element on our Canvas. All we need to do is change the position of the game object:

private void Update()
{
    var x = target.transform.position.x;
    var y = target.transform.position.y;
    switch (word)
    {
        case "up":
            y += speed;
            break;
        case "down":
            y -= speed;
            break;
        case "left":
            x -= speed;
            break;
        case "right":
            x += speed;
            break;
    }
    target.transform.position = new Vector3(x, y, 0);
}
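
Keep in mind that Update() runs once per frame, so the code above moves the object by speed units every frame, and the on-screen velocity depends on the frame rate. A minimal sketch of a frame-rate-independent variant is to scale each step by the elapsed time, which Unity exposes as Time.deltaTime. The Movement class and the Step method below are hypothetical names introduced for illustration; the function is plain C#, so you can try it outside Unity as well.

```csharp
// Hypothetical helper, not part of the original project.
public static class Movement
{
    // Returns the (dx, dy) offset for a single frame.
    // `speed` is interpreted as units per second, `deltaTime` as seconds.
    public static (float dx, float dy) Step(string word, float speed, float deltaTime)
    {
        float distance = speed * deltaTime;

        switch (word)
        {
            case "up":
                return (0f, distance);
            case "down":
                return (0f, -distance);
            case "left":
                return (-distance, 0f);
            case "right":
                return (distance, 0f);
            default:
                // Unknown word: do not move.
                return (0f, 0f);
        }
    }
}
```

Inside Update(), you would call Movement.Step(word, speed, Time.deltaTime) and add the returned offsets to the current x and y. Since speed now means units per second rather than units per frame, you will probably need a noticeably larger value than 1.0 to get a similar pace.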

Step #5 – Player Settings

This is a tricky one and it can be easily missed. To enable voice commands in your Windows app, you need to add the Microphone capability under the Player Settings.

  • Go to File → Build Settings
  • Select Player Settings
  • Click the green Windows Store icon
  • Find the Capabilities tab
  • Check Microphone

Step #6 – Export

KeywordRecognizer is available for Windows Standalone and Windows Store Universal (Windows 10). The exported project will also run on Windows Store 8.1; however, the speech recognition features will not be available there.

Speech Recognition Unity Export

Of course, the API is compatible with Universal Windows Platform, so the exported app can run on:

  • PC
  • Phone
  • Xbox
  • HoloLens

The app can run offline, too, using Cortana’s speech infrastructure. Unlike Siri, no active Internet connection is required.

Speech Recognition - Unity Game

Summary

In this tutorial, we learned how to use Unity’s KeywordRecognizer to understand voice commands and integrate them into a Windows application.

Source code

You made it to this point? Awesome! Here is the source code for your convenience.

Download the source code on GitHub

Sharing is caring!

If you liked this article, remember to share it on social media, so you can help other developers, too! Also, let me know your thoughts in the comments below. ‘Til the next time… keep coding!

Michail Moiropoulos

Michail is a Unity Specialist with significant experience in game development and motion technology. His professional experience includes Kinect, HoloLens, Oculus, Leap Motion, and RealSense.

51 Comments

  • Blair Adams says:

    This looks amazing! Thanks for sharing. Do you know if this would be compatible with a Vive application?

    • Hi Blair! Thanks for your kind words and support.
      Speech recognition is part of Unity and it runs on the application side, while VR is only a “projection on a headset monitor”. I strongly believe that running on Vive won’t be an issue. After all it’s a PC game and all you need is a microphone!

    • Omar says:

      How to make it work on the Android system?

  • Blair Adams says:

    Ahh. Ok. I do remember seeing something about speech recognition a while back in Unity. I’m just surprised we haven’t seen more of it turn up inside of VR yet. Maybe the processing is heavy? I’ll have to try an experiment or two. Thanks.

  • Jake Aquilina says:

    Hello i seem to have an issue, im running unity 5.6 and windows 10, trying to run it in the editor and getting this error

    UnityException: Speech recognition is not supported on this machine.
    UnityEngine.Windows.Speech.PhraseRecognizer.CreateFromKeywords (System.String[] keywords, ConfidenceLevel minimumConfidence) (at C:/buildslave/unity/build/artifacts/generated/common/runtime/SpeechBindings.gen.cs:43)
    UnityEngine.Windows.Speech.KeywordRecognizer..ctor (System.String[] keywords, ConfidenceLevel minimumConfidence) (at C:/buildslave/unity/build/Runtime/Export/Windows/Speech.cs:221)
    UnityEngine.Windows.Speech.KeywordRecognizer..ctor (System.String[] keywords) (at C:/buildslave/unity/build/Runtime/Export/Windows/Speech.cs:201)

  • Brian says:

    Hello, I have an issue when running the source code on Unity 5.6.0f3 and windows 10. The error is

    UnityException: Speech recognition is not supported on this machine.
    UnityEngine.Windows.Speech.PhraseRecognizer.CreateFromKeywords (System.String[] keywords, ConfidenceLevel minimumConfidence) (at C:/buildslave/unity/build/artifacts/generated/common/runtime/SpeechBindings.gen.cs:47)
    UnityEngine.Windows.Speech.KeywordRecognizer..ctor (System.String[] keywords, ConfidenceLevel minimumConfidence) (at C:/buildslave/unity/build/Runtime/Export/Windows/Speech.cs:221)
    SpeechRecognitionEngine.Start () (at Assets/Scripts/SpeechRecognitionEngine.cs:23)

    And I had enabled the Microphone capability in Player Settings like this (image:http://imgur.com/a/X0mjC). But the error still occurred.

  • Massimo Fattorusso says:

    how one can set language? is inherited from the language of the unity project?

  • Amjad says:

    It can be applied to Android phones?

  • Andrew says:

    I tried to run this project and it seems that Unity does not recognize my microphone ( or something like that) because it does not get any command. Any idea why and how I can fix that?

  • Veronica Furukawa says:

    Hi. I’m developing a fully immersive VR game using speech-to-text, a conversational agent, and then text-to-speech. The object is to have an NPC that can have full conversations with you. Originally, I jumped on the IBM Watson bandwagon, but the lag from integrating 3 APIs, then serializing and deserializing the data, was too much. Not to mention the trouble I had with the very sparse documentation from IBM. I ran across your page and am now going to this method. Thank you kind sir!

    • Tiago says:

      Dear Veronica Furukawa,

      I’m trying to develop a NPC with the same features listed by you. How did your project go? Do you have any link to share it (github, etc.). I’m dealing with a lot of compatibility issues. Please, if this message reaches you, get in touch.

      Best Regards,
      Tiago

  • Teresa says:

    Hi,

    Great work!

    Do you know other plugins that may work on Android? The one you mention above is no longer available (Unfortunately, Speech-to-Text is no longer available.).

    Thanks 🙂
    Teresa.

  • emil delacruz says:

    hi can you make a tutorial for android? We need that for our capstone project, i can’t find any tutorial for that, plss i’m begging you.

  • ajay says:

    hi i am trying something like button press voice search and
    using voice i need to keyword search

    Eg : when i say animation, what are details regarding animation inside that project, it should display by points at the bottom and i need to able click and read the text

    kindly guide me how to create that i am new to this voice recognition in unity

    Thanks and it is great tutorial and it works fine

  • Brian H says:

    Fantastic tutorial, thanks!

    I have a question: Is it possible to use audio files as input instead of the microphone? I’ve been looking at DictationRecognizer and it seems to only support speech input from microphone only.

    Any advise will be great, thank you!

  • Prasanna N says:

    Very helpful!
    Thanks much!

    I am developing a voice enabled media player for unity,I am stuck in this part i.e adding voice commands to control the video controls of media player,how can I make it work like when I say play it should play and it should pause when I say pause

    I am very new to this
    It will be of great help for me if you reply me with helpful information

    I am having an idea of creating a play button for the video player and dumping the script of c#code which contains keyword recognition to the player and adding a button to the script and when I say play it should play!

    I am in very much need of this code if anyone can help me with this code I will be so thankful to you

    Please anyone help me

    Reply to Prasanna011124@gmail.com

  • Stefan says:

    Hey,
    Thank you very much for the good work!
    Could you tell me if it is possible to learn the system new keywords it doesn’t know (e.g. some special names)?
    It would be pretty cool!
    Thank you 🙂

  • cemile says:

    I want to learn how it can give feedback when I say a wrong word.

  • Anna says:

    Hi,

    I’m a student working on a game project, can I use your code tweak a little and sell it? Would i need to buy any license from you or? Is mentioning in the credits is enough?

    Thank you

    • Hi Anna. The code is part of the Unity3D engine, which has commercial licenses. You can use the code in any way you like. No credit is required, even though we would definitely appreciate it.

  • byrontik says:

    Thanks this has been super helpful.

  • Hello World! says:

    Hi, is it just me or does it only use the system default microphone on Windows 10?

  • Heromoga2000 says:

    Can it be deployed to real Hololens ?

  • HanJaeHwan says:

    hello, sorry for bad English… 🙁
    I try to this project into VR project.
    I using samsung oddessey which based on Window Mixed Reality.

    without put on Headset, voice recognition works good as I expect.
    but, when I put on Headset on my head, voice recognition doesn’t work at all.

    I checked window’s voice and mic option several times and there is no clue…
    is there any way to use voice recognition during VR situation?

    thanks a lot share this project. It helps a lot!!!!:)

  • shaho says:

    I want to work on virtual reality. I want to talk to Unity about converting speech to text and working on conversations. please guide me.
    please send my email. thanks.
    my email : shaho1763@gmail.com

  • melon says:

    thx for tutorial!!
    hi. I am a Korean who is studying Unity
    I am follow the video and working it.
    but “up” is low in recognition and if I put in another word, it won’t recognize at all.
    Of course, my pronunciation is not good, but I don’t recognize simple words such as apples, bananas, one or two.
    Is it correct that it works well even if other words are added?
    I was not good at English, so I used a translator. I’m sorry.

  • Brandy Wong says:

    Hi, I’m trying to make a VR game with voice commands. Is there a way to set special words or fake words, i.e. magic spells and names?

  • KZ says:

    Hello, very nice project. I have been able to implement it into my game. I have two issues. The first is that once the project is built, the voice recognition only works for a few sessions and then stops working completely; the game is still playable, but the voice-to-text part is not. The second issue is that I can’t use profanity. Is there a way around this?

  • KZ says:

    Hello Vangos, I have everything working only thing is I can’t figure out how to remove the profanity filter, can it be done and if yes can you point me in the right direction please.
