I know… this is the part you’ve been waiting for: the reason you purchased an Azure Kinect camera in the first place. Throughout my Kinect Masterclass articles, I have been covering the various aspects of the Kinect device, helping you understand how this magnificent sensor works. Body tracking is what made the original Kinect popular back in 2010. Without further ado, I am going to show you how to track a human body in 3D.
The video below shows exactly what we’ll develop: a body-tracking application that allows you to view the skeleton from multiple angles in three-dimensional space.
Prerequisites
To run the demos, you need a computer with the following specifications:
- 7th Gen Intel® Core™ i5 Processor (Quad Core 2.4 GHz or faster)
- 4 GB Memory
- NVIDIA GeForce GTX 1070 or better
- Dedicated USB3 port
- Windows 10
To write and execute code, you need to install the following software:
- Unity3D
- Visual Studio
- Azure Kinect SDK
- Azure Kinect Body Tracking SDK
How Body Tracking works
Before diving into the code, it’s worth understanding how exactly body-tracking works and what kind of skeleton data Kinect can provide.
What is Body Tracking?
Body tracking is the ability to detect skeleton joints using depth or color image data. The Kinect technology can identify the coordinates of the points which belong to a specific person and output their positions in 3D. That kind of information can be used in a variety of fields. In healthcare and fitness, developers can measure the range of motion and provide smart rehabilitation. In manufacturing, Kinect systems can analyze worker behavior, performance, and safety. When used in Robotics, autonomous systems can map their surroundings and imitate human movement.
Artificial Intelligence and Machine Learning at your service!
The original Kinect for XBOX 360 had an exceptionally memorable pitch to describe its functionality: “you are the controller.” Microsoft envisioned a future beyond keyboards and mice, a future of natural interaction with computers. Even though that vision ultimately came true with the HoloLens device, it was Kinect that paved the way for natural user interaction, thanks to its remarkable skeleton-tracking functionality.
The body-tracking software relies heavily on Machine Learning and starts with a 2D approach:
- First, the Azure Kinect SDK acquires the depth and infrared images.
- Then, it feeds the infrared image to a Neural Network, which extracts the 2D joint coordinates and the silhouettes of the users.
- Each 2D pixel is assigned the corresponding depth value from the depth frame, thus giving its position in the 3D space.
- Finally, the results are post-processed to produce accurate human body skeletons.
You can check the official Microsoft presentation here.
Thankfully, all of the heavy lifting is done internally by the Azure Kinect Body Tracking SDK. There is no need to mess with the internals of the AI algorithms. All we need to do is call the proper SDK methods and, boom, we have access to the data!
Structure of a human body
So, what kind of data do we have available? Well, each body instance has a unique identifier (ID) and a collection of joints. The ID of the body is simply a numeric value that distinguishes one body from another. The joint collection holds a list of joint structures with their corresponding properties. Let’s explore the members of the joint structure further.
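To make that concrete, here is a rough sketch of reading those members, assuming you already hold the list of tracked Body objects (we’ll see how to get it in the “Working with Body data” section). Note that the ID property name is an assumption based on the description above, not a confirmed API name:

foreach (Body body in bodies)
{
    // The numeric identifier that distinguishes one body from another.
    // (The exact property name is an assumption.)
    Debug.Log($"Tracked body ID: {body.ID}");

    // The joint collection, indexed by the JointType enumeration.
    Joint head = body.Joints[JointType.Head];
}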
Joint Type (ID)
The ID is the unique name or type of each joint. In C#, the IDs of the joints are exposed in the JointType enumeration. The image below illustrates all of the tracked joints.
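Since JointType is a plain C# enumeration, you can list every joint the SDK knows about with a couple of lines. A minimal sketch (only JointType.Head appears verbatim later in this article; the remaining values are whatever the enumeration defines):

// Print every joint type exposed by the JointType enumeration.
foreach (JointType jointType in System.Enum.GetValues(typeof(JointType)))
{
    Debug.Log($"Supported joint: {jointType}");
}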
Tracking State (Confidence)
People may stand in front of the camera in a lot of different ways. There are inconvenient cases where not every single joint is visible. Some joints may be outside the field of view or hidden behind physical objects. Other joints may move too quickly. In either case, developers need to know whether a joint is tracked reliably before using it. That’s why the Joint structure includes a property named TrackingState. The TrackingState lets us know just how well Kinect is monitoring each joint. There are four levels of confidence:
- High – Kinect is tracking this joint reliably.
- Medium – Kinect is tracking the joint with average confidence.
- Low – The joint is probably occluded, so Kinect is predicting its position. A joint with low confidence is not directly visible; instead, the SDK internally estimates its coordinates based on the neighboring joints.
- None – The joint is completely out of the field of view.
As a software developer, you need to take the confidence levels seriously. Imagine you are working on a healthcare application, and you are trying to measure the range of motion of the spine. If the spine joints have a confidence level of Low or None, the measurements will be unreliable. Before accessing vital information, always check the confidence level of the joints that matter!
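As a rough illustration, a tiny guard like the following can save you from bogus measurements (IsReliable is a hypothetical helper name; the TrackingState values are the ones listed above):

// Hypothetical helper: trust a joint only when Kinect reports at least
// Medium tracking confidence.
private bool IsReliable(Joint joint)
{
    return joint.TrackingState == TrackingState.High ||
           joint.TrackingState == TrackingState.Medium;
}

You would then call this helper on the spine joints, for example, before feeding their positions into any range-of-motion calculation.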
Position
The Azure Kinect SDK provides the coordinates of the joints in the 3D space. What are those coordinates, exactly? The position of a joint is a set of three values: X, Y, and Z. The X, Y, and Z values are measured relative to the 3D Cartesian System. More specifically:
- X – The horizontal coordinate
- Y – The vertical coordinate
- Z – The depth coordinate
If you don’t remember the Cartesian System from your high school Math class, don’t worry. The Unity3D Editor Scene view is built around the Cartesian System!
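For example, because positions are expressed in meters with the sensor at the origin of the coordinate system, measuring how far a joint is from the camera takes one line. A sketch, reusing the head joint from the code later in this article:

// The distance of a joint from the camera is the magnitude of its
// position vector, since the sensor sits at the origin (0, 0, 0).
Joint head = body.Joints[JointType.Head];
float metersFromSensor = head.Position.magnitude;
Debug.Log($"The head is {metersFromSensor:F2} meters away from the sensor.");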
Orientation
Lastly, the joint structure includes the Orientation property. The orientation describes the rotation of a joint in the 3D space and, as the code below shows, is exposed as a quaternion.
Working with Body data
It’s time to launch Unity3D and Visual Studio and start writing a few lines of code. As we’ve seen in all of my Masterclass articles, we first need to instantiate a KinectSensor object. Here’s how to open and close the device in Unity’s Start() and OnDestroy() methods, respectively.
private KinectSensor _sensor;

private void Start()
{
    // Acquire a reference to the default Kinect device and open the connection.
    _sensor = KinectSensor.GetDefault();
    _sensor.Open();
}

private void OnDestroy()
{
    // Close the connection to the device when the scene is destroyed.
    _sensor?.Close();
}
Now, in Unity’s Update() method, we are going to grab the latest Kinect frame and extract its data. The code below shows you how to:
- Acquire a Kinect frame.
- Get the skeleton data.
- Loop through the available skeleton objects.
- Display the position of the head joint.
private void Update()
{
    // Grab the latest frame; it's null when no new data is available yet.
    Frame frame = _sensor.Update();
    if (frame == null) return;

    // Extract the skeleton data from the frame.
    List<Body> bodies = frame.BodyFrameSource?.Bodies;
    if (bodies == null) return;

    foreach (Body body in bodies)
    {
        Joint head = body.Joints[JointType.Head];

        TrackingState confidence = head.TrackingState;
        Vector3 position = head.Position;
        Quaternion orientation = head.Orientation;

        Debug.Log(
            $"Head joint - " +
            $"Confidence: {confidence}, " +
            $"Position: {position}, " +
            $"Orientation: {orientation}.");
    }
}
Piece of cake, huh?
Displaying Body data in Unity3D
Unity3D is a very handy engine when it comes to visualizing data in the 3D world. Unity’s coordinate system matches Kinect’s, and the physical units are measured in meters. You can place objects in your 3D scene, safely assuming it’s the real world. There is one caveat: Kinect does not produce negative depth (Z) values. A negative value would mean that an object is located behind the sensor. Something like that is not possible, since Kinect cannot “see” behind itself.
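If you want to assert that invariant in your own code, a trivial check does it (a sketch, reusing the head joint from the earlier snippet):

// A tracked joint should always lie in front of the camera, i.e. have
// a positive depth (Z) value.
if (head.Position.z <= 0f)
{
    Debug.LogWarning("Unexpected depth: the joint appears to be behind the sensor.");
}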
The Azure Kinect SDK for Unity3D comes with a handy visualization element called Stickman. The Stickman prefab is nothing but a set of spheres and lines that connect them. Each sphere represents one particular joint, while each line represents the connections between them (you can think of them as bones). Here is the simple structure of the Stickman prefab:
To use the Stickman in our code, all we need to do is declare a StickmanManager element and call its Load() method, providing the list of skeletons as a parameter. Internally, the Stickman Manager will assign a Stickman to each tracked body and position the joints accordingly.
[SerializeField] private StickmanManager _stickmanManager;

private void Update()
{
    Frame frame = _sensor.Update();
    if (frame == null) return;

    // Hand the tracked bodies over to the Stickman Manager for visualization.
    List<Body> bodies = frame.BodyFrameSource?.Bodies;
    if (bodies == null) return;

    _stickmanManager.Load(bodies);
}
Then, all you need to do is hit the Play button and stand in front of your Kinect camera!
Amazingly easy, right? Viewing the skeleton in 3D is particularly useful in applications that analyze human motion. For example, a client of ours wanted to evaluate the posture of patients with kinesiology issues, so it was important to see the body from multiple angles without having the patient move. How can we do that? Simple: we’ll rotate the camera around the skeleton! Here’s the C# code that achieves the results demonstrated in the video (the _speed field controls the zoom sensitivity of the mouse wheel).
// Controls how fast the mouse wheel zooms the camera in and out.
[SerializeField] private float _speed = 1.0f;

private void LateUpdate()
{
    Vector3 cameraPosition = Camera.main.transform.localPosition;
    Vector3 originPosition = Vector3.zero;

    // Rotate at 50 degrees per second, scaled by the frame time.
    float angle = 50.0f * Time.deltaTime;

    if (Input.GetKey(KeyCode.RightArrow))
    {
        Camera.main.transform.RotateAround(originPosition, Vector3.up, angle);
    }
    if (Input.GetKey(KeyCode.LeftArrow))
    {
        Camera.main.transform.RotateAround(originPosition, Vector3.down, angle);
    }
    if (Input.GetKey(KeyCode.UpArrow))
    {
        Camera.main.transform.RotateAround(originPosition, Vector3.right, angle);
    }
    if (Input.GetKey(KeyCode.DownArrow))
    {
        Camera.main.transform.RotateAround(originPosition, Vector3.left, angle);
    }

    // Zoom by moving the camera along its Z axis with the mouse wheel.
    if (Input.mouseScrollDelta != Vector2.zero)
    {
        Camera.main.transform.localPosition = new Vector3(
            cameraPosition.x,
            cameraPosition.y,
            cameraPosition.z + Input.mouseScrollDelta.y * _speed);
    }
}
Try it out yourself. Here’s what each key is doing:
| Key | Input | Action |
| --- | --- | --- |
| ↑ | Up arrow | Rotates the view upwards (overhead). |
| ↓ | Down arrow | Rotates the view downwards. |
| ← | Left arrow | Rotates the view sideways to the left. |
| → | Right arrow | Rotates the view sideways to the right. |
| ⇕ | Mouse wheel | Zooms in or out. |
Summary
In this Masterclass, you’ve learned how to acquire and visualize the 3D positions of the human body joints using the Azure Kinect Body Tracking SDK and Unity3D.
Source code
You’ve made it to this point? Awesome! Here is the source code for your convenience.
One more thing…
LightBuzz has been helping Fortune-500 companies and innovative startups create amazing body-tracking applications and games. If you are looking to get your business to the next level, get in touch with us.
Sharing is caring!
If you liked this article, remember to share it on social media, so you can help other developers, too! Also, let me know your thoughts in the comments below. ‘Til the next time… keep coding!