In the previous chapters, we’ve seen that the video and depth cameras provide two-dimensional screen-friendly outputs, while the joint data are referenced in the three-dimensional physical world. A video frame is an image with X and Y data measured in pixels, while a skeleton frame includes X, Y, and Z values measured in meters. How is that possible, and what does it mean for the way you should approach your Kinect development process? In this chapter, we will demystify Kinect’s coordinate spaces and transform between 2D and 3D.
It’s natural to feel overwhelmed by all those ...