
What the Kinect sensor actually does…

Interesting post on how the MS Kinect may actually work.

Some (unofficial and still to be confirmed) specs summarized from the post linked above (and comments):

  • The Kinect appears to be a 640×480, 30 fps video camera that knows the *depth* of every single pixel in the frame. It does this by projecting a pattern of dots over the scene with a near-infrared laser and using a detector that measures the parallax shift of the dot pattern at each pixel. Parallax seems to be more robust than intensity: some sources said that certain materials (hair in particular) caused large fluctuations in intensity, so intensity does not look like a useful channel to probe for depth data.
  • The depth buffer is only 320×480 (unconfirmed). It seems that the hardware will happily give a 640×480 version (this is Xbox 360 API memory, so the upscaling may actually occur on the Xbox 360), but the hardware itself only gets enough data to fill 320×480.
  • Alongside this there is a regular RGB video camera that captures a standard video frame. This RGBZ (or ‘D’) data is then packaged up and sent to the host over USB.
  • It seems that the Kinect framerate (for RGB image and depth buffer) is 30Hz.
  • The Kinect does not identify shapes within the field of view and does not attempt to map skeletal outlines of any shapes it sees. For that, you would need to take each of the 640×480 frames and copy it into a framebuffer so it can be processed by a vision library such as OpenCV. Typical operations would be to threshold the depth image to get the “closest” pixels, then run a blob analysis on that region of interest (ROI) to group these pixels into identifiable features, and then track those blobs over their lifetime.
  • The Kinect uses a pattern of laser dots to detect depth, as can be seen in this video (and another one, and another one, and another one ;-)) and in these images. There seems to be a 3×3 checkerboard effect in that dot pattern (no clue why yet… any suggestions?).
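
To make the "threshold, then blob analysis" step above a bit more concrete, here is a minimal sketch in plain Python. It is not what the Kinect or the Xbox 360 actually runs. The depth units (millimetres), the convention that 0 means "no reading", the tiny `frame`, and the helper names `threshold_nearest`, `find_blobs` and `centroid` are all assumptions made up for illustration. In a real project a vision library like OpenCV would do this work much faster.

```python
from collections import deque

def threshold_nearest(depth, max_depth):
    """Binary mask: True where a pixel has a reading (assumed non-zero)
    and is no farther away than max_depth."""
    return [[0 < d <= max_depth for d in row] for row in depth]

def find_blobs(mask):
    """Group 4-connected True pixels into blobs (breadth-first flood fill).
    Returns a list of blobs, each a list of (row, col) pixels."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(pixels)
    return blobs

def centroid(pixels):
    """Mean (row, col) of a blob; a simple feature to follow from frame to frame."""
    n = len(pixels)
    return (sum(p[0] for p in pixels) / n, sum(p[1] for p in pixels) / n)

# Hypothetical 4x5 depth frame in millimetres (0 = no reading, an assumption).
frame = [
    [0,    900,  900,  3000, 3000],
    [0,    900,  900,  3000, 3000],
    [3000, 3000, 3000, 3000, 800],
    [3000, 3000, 3000, 3000, 800],
]

mask = threshold_nearest(frame, 1000)   # keep only pixels closer than ~1 m
blobs = find_blobs(mask)
for blob in blobs:
    print(len(blob), centroid(blob))
```

Tracking the blobs "over their lifetime" could then be as simple as matching each centroid to the nearest centroid in the previous frame, which is left out here to keep the sketch short.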

So, processing all this data seems to be quite heavy (especially if you try to do it on an embedded board, like the guy from the post above). A full-fledged PC/Mac running OpenCV and/or OpenCL on a multicore machine will give you the required juice for advanced image processing.
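
To get a rough feel for "quite heavy", here is a back-of-the-envelope estimate of the raw data rate. The resolution and framerate come from the unconfirmed specs above. The pixel sizes (3 bytes per RGB pixel, 2 bytes per depth pixel) and the lack of any compression are pure assumptions, since the post does not say how the data is packed on the wire.

```python
# Figures taken from the (unconfirmed) specs listed above.
WIDTH, HEIGHT, FPS = 640, 480, 30

# Assumed pixel sizes, not confirmed anywhere in the post.
RGB_BYTES_PER_PIXEL = 3
DEPTH_BYTES_PER_PIXEL = 2

def bytes_per_second(bytes_per_pixel):
    """Raw, uncompressed stream rate for one image channel."""
    return WIDTH * HEIGHT * bytes_per_pixel * FPS

rgb_rate = bytes_per_second(RGB_BYTES_PER_PIXEL)
depth_rate = bytes_per_second(DEPTH_BYTES_PER_PIXEL)

print(f"pixels per second: {WIDTH * HEIGHT * FPS}")
print(f"RGB:   {rgb_rate / 1e6:.3f} MB/s")
print(f"depth: {depth_rate / 1e6:.3f} MB/s")
print(f"total: {(rgb_rate + depth_rate) / 1e6:.3f} MB/s")
```

Under those assumptions that is roughly 46 MB/s of raw pixels, and every one of the 9.2 million pixels per second has to go through thresholding and blob analysis, which is a lot to ask of a small embedded board.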

Finally, some quite interesting resources for Kinect related stuff:
