A really nice project developed by Jorge Coutinho, a Master student at the School of Arts of the Portuguese Catholic University, Porto, Portugal.
Some pointers related to this:
Interesting post on how the MS Kinect may actually work.
Some (unofficial and still to be confirmed) specs summarized from the post linked above (and comments):
- the Kinect appears to be a 640×480, 30 fps video camera that knows the *depth* of every single pixel in the frame. It does this by projecting a pattern of dots over the scene with a near-infrared laser and using a detector that establishes the parallax shift of the dot pattern for each pixel. Parallax seems to be more robust than intensity: some sources said that certain materials (hair in particular) caused large fluctuations in intensity, so intensity doesn’t look like a useful channel to probe for depth data.
- The depth buffer is only 320×480 (unconfirmed). The hardware will apparently happily hand back a 640×480 version (this is Xbox 360 API memory, so the upscaling may actually happen on the Xbox 360), but the sensor itself only captures enough data to fill 320×480.
- Alongside this there is a regular RGB video camera that captures a standard video frame. This RGBZ (or RGB-D) data is then packaged up and sent to the host over USB.
- The Kinect frame rate (for both the RGB image and the depth buffer) seems to be 30 Hz.
- The Kinect does not identify shapes within its field of view, nor does it attempt to map skeletal outlines onto the shapes it sees. For that, you would need to copy each 640×480 frame into a framebuffer and process it with a vision library like OpenCV. Typical operations would be to threshold the depth image to isolate the “closest” pixels, run a blob analysis over that region of interest to group those pixels into identifiable features, and then track the blobs over their lifetime.
- The Kinect uses a pattern of laser dots to detect depth, as can be seen in this video (and another one, and another one, and another one ;-)) and in these images. There seems to be a 3×3 checkerboard effect in that dot pattern (no clue why yet… any suggestions?).
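To make the parallax idea above concrete, here is a toy depth-from-disparity calculation. It uses the standard structured-light triangulation relation Z = f·B/d; the focal length and baseline values below are invented for illustration, since the Kinect’s real calibration constants are not public.

```python
# Toy structured-light depth sketch: a projected dot observed with a
# parallax shift of d pixels, with focal length f (in pixels) and a
# projector-camera baseline B (in metres), lies at depth Z = f * B / d.
# f and B below are made-up illustrative values, NOT real Kinect specs.

def depth_from_disparity(d_px: float, f_px: float = 580.0,
                         baseline_m: float = 0.075) -> float:
    """Depth in metres from the parallax shift of one dot."""
    if d_px <= 0:
        raise ValueError("disparity must be positive")
    return f_px * baseline_m / d_px

# A dot shifted 20 px is nearer than one shifted only 10 px:
near = depth_from_disparity(20.0)  # 2.175 m
far = depth_from_disparity(10.0)   # 4.35 m
```

Note the inverse relationship: halving the shift doubles the estimated depth, which is why depth resolution degrades with distance.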
So, processing all this data seems to be quite heavy (especially if you try to do it on an embedded board like the guy from the post linked above). A full-fledged PC/Mac running OpenCV and/or OpenCL on a multicore machine will give you the juice required for advanced image processing.
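The threshold-then-blob pipeline described above can be sketched in miniature. This toy version keeps only the “closest” pixels of a tiny invented depth map and groups them with a flood fill — the same grouping that OpenCV’s connected-components / blob-analysis routines would perform on real 640×480 frames:

```python
# Toy "closest pixels -> blobs" pipeline on an invented 6x6 depth map
# (smaller values = nearer).  Real code would run OpenCV's threshold()
# and connectedComponents() on each incoming frame instead.

from collections import deque

DEPTH = [
    [9, 9, 9, 9, 9, 9],
    [9, 2, 2, 9, 9, 9],
    [9, 2, 2, 9, 1, 9],
    [9, 9, 9, 9, 1, 9],
    [9, 9, 9, 9, 9, 9],
    [9, 9, 9, 9, 9, 9],
]

def find_blobs(depth, near_thresh):
    """Return one set of (row, col) pixels per connected 'near' blob."""
    h, w = len(depth), len(depth[0])
    # Step 1: threshold the depth image to keep only the closest pixels.
    mask = [[depth[r][c] <= near_thresh for c in range(w)] for r in range(h)]
    seen, blobs = set(), []
    # Step 2: group neighbouring "near" pixels into blobs (flood fill).
    for r in range(h):
        for c in range(w):
            if mask[r][c] and (r, c) not in seen:
                blob, queue = set(), deque([(r, c)])
                seen.add((r, c))
                while queue:
                    y, x = queue.popleft()
                    blob.add((y, x))
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                blobs.append(blob)
    return blobs

blobs = find_blobs(DEPTH, near_thresh=3)
print(len(blobs))  # 2: a 2x2 patch and a separate 2-pixel strip
```

Step 3 in a real tracker would then match blobs between frames (e.g. by nearest centroid) so each one keeps a stable identity over its lifetime.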
Finally, some quite interesting Kinect-related resources:
“This is the curved (!) screen in our reality center of the University of Groningen. We just finished building our own touch detection for it.
We used six OptiTrack V120 Slim cameras, which have good sensitivity to infrared light. We used 16 cheap infrared emitters (the kind used for security systems) with a total of 1000 LEDs.
The touch detection software runs on three old computers, each with two cameras connected. One extra computer combines the output from the detection computers and sends event data to our main visualization system.
This way we have (even using the old computers) enough processing power to run the detection software at 60 Hz, with a latency between 30 ms and 50 ms. It can detect 100 simultaneous touches without any problem (more is possible, but it becomes slower).
We used a modified version of Community Core Vision (CCV) 1.4 (nuigroup.com), modified so it can handle two cameras on one computer.
The communication protocol is preferably TUIO (tuio.org), and we installed Multi-touch Vista (multitouchvista.codeplex.com), which translates TUIO events into WM_TOUCH events for Windows 7.
The demos you see in the video are built with Multitouch for Java(tm), MT4J (mt4j.org/mediawiki/index.php/Main_Page). The part where the wizards are throwing fireballs at each other uses msafluid (project home is at msavisuals.com/msafluid).
The curved screen itself consists of a 3 mm dark acrylic layer coated with a diffuser on the front. Illumination is from behind, using six full-HD Barco projectors.
The cameras and the IR LEDs are also located behind the screen.”
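The TUIO protocol mentioned in the quote carries, for each frame, an “alive” message listing the session IDs of all cursors currently touching the surface (alongside “set” position updates and an “fseq” frame counter). A consumer such as Multi-touch Vista can synthesise touch-down and touch-up events by diffing consecutive “alive” lists. The sketch below illustrates that diffing step only; it is a toy, not the Multi-touch Vista implementation:

```python
# Toy illustration of turning TUIO /tuio/2Dcur "alive" lists into
# touch-down / touch-up events by comparing consecutive frames.
# (Real code would receive these lists over OSC/UDP first.)

def diff_alive(prev_ids, curr_ids):
    """Return (touch_down_ids, touch_up_ids) between two 'alive' lists."""
    prev, curr = set(prev_ids), set(curr_ids)
    return sorted(curr - prev), sorted(prev - curr)

# Frame 1: cursors 1 and 2 are down.  Frame 2: cursor 1 lifted,
# cursor 3 arrived.
added, removed = diff_alive([1, 2], [2, 3])
print(added, removed)  # [3] [1]
```

Because presence is re-stated every frame, a lost UDP packet only delays an event by one frame instead of leaving a phantom touch stuck on screen — one reason TUIO uses this stateless “alive” scheme.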