Kinect Development--Kinect for Windows SDK

Last Update:2014-07-05 Source: Internet

Author: User


Development--Basic SDK and Windows programming tips (color image stream, depth image stream, skeleton tracking, audio processing, speech recognition API)

Depth data is the heart of Kinect: many problems reduce to pattern recognition on depth images.

AForge.NET is a framework written in C# that provides computer vision and machine learning components (www.aforgenet.com).

Image processing is computationally expensive, so a managed language such as C# is a poor fit for the heavy lifting; OpenCV should be used more for that work.

Application-Layer APIs in Detail

NUI API

Kinect Audio DMO: provides beamforming and sound source localization

Windows Speech SDK: provides audio, speech, and multimedia API sets, along with Microsoft's speech recognition capabilities

Kinect's core NUI API

Up to four Kinect devices can be connected to the same computer, but an application can enable skeleton tracking on only one of them. Multiple applications cannot share a single Kinect sensor at the same time.

1. Get the Kinect instance

Using LINQ:

 KinectSensor sensor = (from sensorToCheck in KinectSensor.KinectSensors
                        where sensorToCheck.Status == KinectStatus.Connected
                        select sensorToCheck).FirstOrDefault();

Or with an equivalent loop:

 KinectSensor sensor = null;
 foreach (KinectSensor candidate in KinectSensor.KinectSensors)
 {
     if (candidate.Status == KinectStatus.Connected)
     {
         sensor = candidate; // take the first connected sensor
         break;
     }
 }

2. Call the KinectSensor.Start method to initialize and start the Kinect sensor

3. Register the relevant events (for example the color or depth frame arrival events and the skeleton tracking event) and call the SDK-provided APIs from their handlers

KinectSensor.ColorFrameReady

KinectSensor.DepthFrameReady

KinectSensor.SkeletonFrameReady

KinectSensor.AllFramesReady

4. Call the KinectSensor.Stop method to turn off the Kinect sensor

The Kinect NUI API handles data from the Kinect sensor in a "pipeline" manner. At initialization time, the application specifies the sensor data it needs (color, depth, depth with user index, skeleton tracking).

These options must be set during initialization; otherwise the data will not be available.

Kinect Audio DMO

Improves audio quality through:

Echo cancellation

Echo suppression

Automatic gain control (keeps the sound amplitude consistent as the user moves toward or away from the microphone)

Beamforming

The key object of the Microsoft.Speech class library is SpeechRecognitionEngine. It takes the noise-preprocessed audio stream from the Kinect sensor, then analyzes and interprets it to find the best-matching voice command.

SpeechRecognitionEngine recognizes voice commands against a grammar. A Grammar object consists of a series of individual words or phrases and is built with the GrammarBuilder class; alternatives can be expressed with the Choices class, and wildcards are also supported.

Data Stream Overview

1. Color image data

Image quality affects the transfer rate between the Kinect sensor and the computer

Applications can set the encoding format for color images; two encodings are available, RGB and YUV

A transfer rate of 30 frames per second at a resolution of 320*240

2. User segmentation data

Each pixel of the depth image consists of 2 bytes, 16 bits in total

The high 13 bits of each pixel represent the distance from the Kinect infrared camera to the nearest object, in millimeters

The low 3 bits represent the index of the tracked user; this value is read as an integer, not as a set of flag bits

Do not hard-code a specific user index: the user index returned by Kinect skeleton tracking may change even for the same person

User segmentation data makes it possible to separate a user's depth image from the original depth image and then, through coordinate mapping, to separate the user's color image from the original color image, which achieves an "augmented reality" effect
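The 13-bit/3-bit layout described above can be unpacked with one shift and one mask. The following is a minimal sketch in plain C# that does not reference the Kinect SDK; the class `DepthPixel`, its constants, and its method names are hypothetical helpers introduced for illustration, and the bit widths are taken from the description in the text.

```csharp
using System;

// Hypothetical helper (not part of the Kinect SDK): unpacks the 16-bit
// depth-plus-user-index layout described in the text.
public static class DepthPixel
{
    // Assumed layout from the text: low 3 bits = user index, high 13 bits = distance.
    public const int PlayerIndexBitWidth = 3;
    public const int PlayerIndexMask = (1 << PlayerIndexBitWidth) - 1; // binary 111

    // Distance in millimeters, taken from the high 13 bits.
    public static int DistanceMm(short raw)
    {
        // Reinterpret as ushort first so the shift never sign-extends.
        return unchecked((ushort)raw) >> PlayerIndexBitWidth;
    }

    // User index read as an integer from the low 3 bits.
    public static int PlayerIndex(short raw)
    {
        return raw & PlayerIndexMask;
    }

    // Packs a distance and a user index into one 16-bit value (handy for experiments).
    public static short Pack(int distanceMm, int playerIndex)
    {
        int packed = (distanceMm << PlayerIndexBitWidth) | (playerIndex & PlayerIndexMask);
        return unchecked((short)packed);
    }
}
```

For instance, `DepthPixel.Pack(1500, 2)` produces a value from which `DistanceMm` recovers 1500 and `PlayerIndex` recovers 2. Going through `ushort` matters once the distance reaches 4096 mm, because the top bit of the `short` is then set.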

3. Depth image data

Each pixel contains specific distance information

For the Kinect infrared camera, the current operating range is available through the DepthImageStream.Range property (of the DepthRange enumeration type). The stream also defines special values for pixels whose depth cannot be reported:

TooFarDepth

TooNearDepth

UnknownDepth

Each depth image pixel is 16 bits, so an array of short is used to store the depth image:
 short[] depthPixelData = new short[depthFrame.PixelDataLength];
 depthFrame.CopyPixelDataTo(depthPixelData);

For each point p(x, y) in the depth image, with depthFrame.Width as the depth image width, the pixel index is pixelIndex = x + y * depthFrame.Width, and the distance from the target object to the Kinect is obtained with a bitwise shift.

Int32 depth = depthPixelData[pixelIndex] >> DepthImageFrame.PlayerIndexBitmaskWidth;
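The lookup for a point p(x, y) can be sketched as a small function over a row-major buffer. This is an illustrative sketch in plain C# with no SDK dependency: `DepthLookup`, `DistanceAt`, and the `frameWidth` parameter are hypothetical names, and the shift width of 3 is an assumption that matches the 3-bit user index described earlier.

```csharp
using System;

// Hypothetical helper (not part of the Kinect SDK).
public static class DepthLookup
{
    // Assumed width of the user-index field (3 bits, per the text).
    private const int PlayerIndexBitWidth = 3;

    // Returns the distance in millimeters at pixel (x, y) of a row-major depth buffer.
    public static int DistanceAt(short[] depthPixelData, int frameWidth, int x, int y)
    {
        if (depthPixelData == null) throw new ArgumentNullException(nameof(depthPixelData));
        if (frameWidth <= 0) throw new ArgumentOutOfRangeException(nameof(frameWidth));
        if (x < 0 || x >= frameWidth) throw new ArgumentOutOfRangeException(nameof(x));
        if (y < 0) throw new ArgumentOutOfRangeException(nameof(y));

        // Row-major layout: each row holds frameWidth pixels.
        int pixelIndex = x + y * frameWidth;
        if (pixelIndex >= depthPixelData.Length) throw new ArgumentOutOfRangeException(nameof(y));

        // Reinterpret as ushort so distances of 4096 mm and above are not sign-extended.
        return unchecked((ushort)depthPixelData[pixelIndex]) >> PlayerIndexBitWidth;
    }
}
```

With a buffer filled by `CopyPixelDataTo`, a call such as `DepthLookup.DistanceAt(depthPixelData, depthFrame.Width, x, y)` would return the distance at that pixel. The bounds checks are a design choice of this sketch, so that an out-of-frame coordinate fails loudly instead of silently reading a pixel from another row.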

How to obtain the data streams

1. Polling model (pull)

First the image data stream is enabled; then the application requests a frame with a wait time of T milliseconds. If the frame is not ready, the call waits up to T before returning. Once a frame has been returned successfully, the application can request the next frame and perform other work on the same thread

OpenNextFrame(T), where T is the maximum time to wait for new data, in milliseconds

2. Event model

The application registers for the stream's FrameReady event; when the event fires, the handler uses the event arguments (the FrameReadyEventArgs object) to obtain the data frame

The two models cannot be used together on the same data stream

The AllFramesReady event covers all three data streams. If an application registers for AllFramesReady, any attempt to pull data from one of those streams in polling mode raises an InvalidOperationException

In applications that need the depth image and color image to stay as closely synchronized as possible, use the polling model and compare the frames' Timestamp property
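One way to use timestamps for synchronization is to pair a depth frame with the color frame whose timestamp is nearest. The sketch below is plain C# with no SDK dependency and only illustrates the matching step; `FrameSync`, `FindClosest`, and the tolerance parameter are hypothetical, and it assumes both timestamps are expressed in milliseconds on the same clock.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the Kinect SDK).
public static class FrameSync
{
    // Returns the index of the candidate timestamp closest to target,
    // or -1 if no candidate lies within toleranceMs milliseconds.
    public static int FindClosest(long target, IReadOnlyList<long> candidates, long toleranceMs)
    {
        if (candidates == null) throw new ArgumentNullException(nameof(candidates));

        int best = -1;
        long bestDelta = long.MaxValue;
        for (int i = 0; i < candidates.Count; i++)
        {
            long delta = Math.Abs(candidates[i] - target);
            if (delta <= toleranceMs && delta < bestDelta)
            {
                best = i;          // remember the nearest candidate so far
                bestDelta = delta;
            }
        }
        return best;
    }
}
```

A caller would poll both streams, keep the timestamps of the most recent color frames, and look up the match for each depth frame. A tolerance of about half a frame interval (roughly 16 ms at 30 frames per second) is one reasonable starting point, not a value taken from the SDK.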

Skeleton Tracking

Retrieving Skeleton Information

1. Polling model: SkeletonStream.OpenNextFrame

2. Event model: the KinectSensor.SkeletonFrameReady event (or KinectSensor.AllFramesReady) fires once new skeleton data is ready; call OpenSkeletonFrame on the event arguments, for example SkeletonFrameReadyEventArgs.OpenSkeletonFrame, to get the frame

Selecting the Skeletons to Track

To choose the tracked users manually, set the AppChoosesSkeletons property and call the ChooseSkeletons method.

NUI Coordinate Conversion

MapDepthToColorImagePoint--depth image coordinate system to color image coordinate system

MapDepthToSkeletonPoint--depth image coordinate system to skeleton tracking coordinate system

MapSkeletonPointToColor--skeleton tracking coordinate system to color image coordinate system

MapSkeletonPointToDepth--skeleton tracking coordinate system to depth image coordinate system

Even at the same resolution, the pixels of a depth image frame do not map one-to-one onto the pixels of a color image frame, because the two cameras sit at different positions on the Kinect

The DepthImageFrame class provides 3 coordinate conversion methods

MapFromSkeletonPoint: maps skeleton joint coordinates to depth image point coordinates

MapToColorImagePoint: maps a point in the depth image to the corresponding point in the synchronized color image frame

MapToSkeletonPoint: maps a point in the depth image to the corresponding point in the skeleton data frame

The z-axis is the optical axis of the infrared camera, perpendicular to the image plane. The intersection of the optical axis and the image plane is the origin of the image coordinate system

Both the depth image coordinate system and the skeleton tracking coordinate system are Kinect camera coordinate systems: the origin is the center of the infrared camera, the x-axis and y-axis are parallel to the x-axis and y-axis of the image, and the z-axis is the optical axis of the infrared camera, perpendicular to the image plane

Screen coordinate system--the upper-left corner is the origin, the x-axis is positive to the right and the y-axis is positive downward

Depth image spatial coordinates--in millimeters

Skeleton space coordinates--in meters

Sensor array and tilt compensation

Each skeleton frame includes a value that describes gravity. The value is computed from the internal three-axis accelerometer together with the sensor's image measurements. While the sensor is moving, the accelerometer measures the direction of gravity and the remaining horizontal and vertical vector

Skeleton Mirroring

Non-mirrored skeleton tracking is not available in the SDK

Mirroring a skeleton is simple: negating the x-coordinate of each skeleton joint achieves the effect
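Negating the x-coordinate can be sketched in a few lines. The example below is plain C# with no SDK dependency: `JointPosition` is a hypothetical stand-in for a joint position in skeleton space (meters), and `SkeletonMirror.Mirror` is an illustrative helper, not an SDK method.

```csharp
using System;

// Hypothetical stand-in for a joint position in skeleton space (meters).
public struct JointPosition
{
    public float X, Y, Z;

    public JointPosition(float x, float y, float z)
    {
        X = x;
        Y = y;
        Z = z;
    }
}

// Hypothetical helper (not part of the Kinect SDK).
public static class SkeletonMirror
{
    // Returns a mirrored copy of the joints by negating each x-coordinate.
    // The input array is left unchanged.
    public static JointPosition[] Mirror(JointPosition[] joints)
    {
        if (joints == null) throw new ArgumentNullException(nameof(joints));

        var mirrored = new JointPosition[joints.Length];
        for (int i = 0; i < joints.Length; i++)
        {
            mirrored[i] = new JointPosition(-joints[i].X, joints[i].Y, joints[i].Z);
        }
        return mirrored;
    }
}
```

Returning a copy rather than modifying the array in place is a choice made for this sketch, so the original joint data stays available for other consumers of the frame.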

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
