Microsoft HoloLens Technology Puzzle (Part 1)

Last Update: 2015-05-02 Source: Internet

Author: User

What is HoloLens?

HoloLens is a wearable augmented reality computing device released by Microsoft. It has two key characteristics:

It is an augmented reality (AR) product: it overlays computer-generated images on the real world. Similar products include Google Glass, which projects images onto the retina, and mobile phone AR applications that superimpose images on the phone's camera view.
It has a self-contained compute unit with a CPU, a GPU, and an HPU, so no external computer is required. Its CPU and GPU are based on Intel's 14 nm Cherry Trail chip. HPU is an abbreviation coined by Microsoft; it stands for Holographic Processing Unit. According to an anonymous user's answer, the HPU is an ASIC (application-specific integrated circuit) that Microsoft custom-built for HoloLens. All I can say is: when you have money, you can do as you please.

What is HoloLens not?

If your reaction after watching Microsoft's lifelike promotional video is

The Matrix is coming.

then you should read this section carefully, because the Matrix is virtual reality (VR), which immerses participants in a computer-generated three-dimensional world and shuts out the real one. A recent representative VR product is the Oculus Rift: once you put on the Rift, you cannot see the real world. In my opinion, the biggest problem with VR is this: the virtual world may be very realistic and wonderful, but what is it for? VR can only deliver an ever more realistic three-dimensional world; it cannot help people understand the real world better.

HoloLens is not Google Glass (GG) either. Compared with GG, it offers more:

Three-dimensional perception: it can model the surrounding scene in three dimensions, whereas GG can only see RGB pixel values.
Three-dimensional rendering capability.
Human-computer interaction: it can be controlled with gestures.

HoloLens is also not the common AR found on the market. Common camera-based AR applications come in two kinds:

AR based on an ugly black-and-white marker image

and AR based on an arbitrary image.

These are cool, but they can only detect the plane that the image lies on. HoloLens is far more capable: it can detect the three-dimensional scene from every angle!

How does HoloLens AR get the depth information of a three-dimensional scene?

Let us go back to the definition of AR. To augment reality, we must first understand reality. And what is reality for HoloLens? It is the sensor data.

And what are the sensors? They are cameras.

Ordinary cameras are cameras too, so why can HoloLens perceive depth? Microsoft's Kinect has been a success in this area, so does HoloLens simply embed a Kinect?

The answer is in the prototype image below

HoloLens has four cameras, two on each side. By analyzing the real-time images from these four cameras, HoloLens covers a field of view of 120 degrees both horizontally and vertically.

This means that it uses stereo vision to obtain depth maps similar to those of Kinect.

Stereo vision is a sub-discipline of computer vision that focuses on recovering the distance from the camera to objects in a real scene, using the image data of two cameras.
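
To make the idea of recovering distance from two cameras concrete, here is a minimal sketch of the standard triangulation relation for an ideal, rectified stereo pair: depth Z = f * B / d, where f is the focal length in pixels, B is the distance between the two cameras (the baseline), and d is the disparity, i.e. how many pixels a point shifts between the left and right images. The numeric values below are hypothetical and are not HoloLens specifications.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters for an ideal rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera parameters, chosen only for illustration.
FOCAL_LENGTH_PX = 700.0
BASELINE_M = 0.1

# A larger disparity means the object is closer to the cameras.
depths = {d: depth_from_disparity(d, FOCAL_LENGTH_PX, BASELINE_M)
          for d in (70.0, 35.0, 7.0)}
for disparity, depth in depths.items():
    print(f"disparity {disparity:5.1f} px -> depth {depth:.2f} m")
```

The inverse relation explains why stereo depth becomes less precise for distant objects: far away, a one-pixel error in disparity corresponds to a large change in depth.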

Here are the basic steps. Consult the OpenCV documentation to learn more about how the functions are used: http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html

Camera calibration and undistortion. Because camera lenses come from the factory with some distortion, the camera must be calibrated before use in order to obtain accurate data. The usual method is to photograph a chessboard several times in various poses and then compute the camera's matrix parameters. OpenCV provides a common calibration interface for this.
Image alignment (rectification). Because the two cameras are in different positions, each sees the scene with an offset: the left camera sees more of the left edge of the scene, and the right camera sees more of the right edge. The goal of image alignment is to obtain the part of the scene common to both.
Left-right image matching (correspondence). The OpenCV functions documented at http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html can be used to obtain a disparity map.
Reprojection. A depth map is obtained through a remapping function such as cv::reprojectImageTo3D in OpenCV.
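
The matching and reprojection steps can be illustrated with a toy sketch in plain Python. It is not the OpenCV implementation: it assumes the two images are already calibrated and rectified, works on a single row of pixels, and uses a simple sum-of-absolute-differences (SAD) window search. The function name, the pixel rows, and the camera parameters are all made up for illustration.

```python
def scanline_disparity(left, right, window=1, max_disp=4):
    """For each pixel x in the left row, find the shift d in [0, max_disp]
    whose window around right[x - d] best matches the window around left[x]."""
    n = len(left)
    result = []
    for x in range(n):
        best_cost, best_d = None, 0
        for d in range(max_disp + 1):
            cost = 0
            for k in range(-window, window + 1):
                xl = min(max(x + k, 0), n - 1)        # clamp at the row borders
                xr = min(max(x + k - d, 0), n - 1)
                cost += abs(left[xl] - right[xr])     # sum of absolute differences
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        result.append(best_d)
    return result


# Hypothetical rectified rows: the textured patch (9, 5, 7) appears two pixels
# further to the right in the left image than in the right image.
right_row = [0, 0, 9, 5, 7, 0, 0, 0, 0, 0]
left_row = [0, 0, 0, 0, 9, 5, 7, 0, 0, 0]

disparity = scanline_disparity(left_row, right_row)

# Reprojection: convert disparity to depth with Z = f * B / d.
# Focal length and baseline are assumed values, not HoloLens specifications.
FOCAL_LENGTH_PX, BASELINE_M = 700.0, 0.1
depth_m = [FOCAL_LENGTH_PX * BASELINE_M / d if d > 0 else None for d in disparity]

print("disparity on the textured patch:", disparity[4:7])
print("depth on the textured patch (m):", depth_m[4:7])
```

On the flat, textureless parts of the row every shift matches equally well, so the result there is arbitrary. Real matchers face the same ambiguity, which is one reason depth maps from stereo tend to have holes on blank walls.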

A single depth map is not enough: it only reflects the real scene as seen by one camera at one moment. To get a complete three-dimensional scene, we need to analyze a series of depth maps.

How does HoloLens reconstruct a three-dimensional scene from multiple depth maps?

The answer is SLAM, Simultaneous Localization and Mapping. This technology is used in the positioning and pathfinding systems of robots, driverless cars, and unmanned aerial vehicles. It addresses two very philosophical questions:

Where am I now?
Where can I go?

There are many ways to implement SLAM. One open-source library, http://pointclouds.org/, implements many depth map processing and matching algorithms and can be regarded as a three-dimensional version of OpenCV.

Microsoft, for its part, invented the KinectFusion algorithm around Kinect's depth map data and published two papers:

KinectFusion: Real-time 3D Reconstruction and Interaction Using a Moving Depth Camera
KinectFusion: Real-Time Dense Surface Mapping and Tracking

Why do I think HoloLens is related to KinectFusion? The answer is on this page: http://research.microsoft.com/en-us/people/shahrami/. Shahram Izadi is a principal researcher and manager at Microsoft Research. He leads the Interactive 3D Technologies group, which provides research capabilities for many Microsoft products, including Kinect for Windows, Kinect Fusion, and HoloLens. By the way, their group is hiring :)

KinectFusion works by moving a Kinect device around indoors to capture depth maps from different angles. It iterates in real time, accumulating the different depth maps to compute an accurate three-dimensional model of the room and the objects in it.
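
The accumulation idea can be sketched in two dimensions. The sketch below is a simplification and not the KinectFusion algorithm: it assumes the camera pose of every view is already known (the real system has to estimate it), treats each depth measurement as a 2D point in the camera's own frame, and merges views by transforming the points into a shared world frame and snapping them to a grid. All names and values are hypothetical.

```python
import math


def camera_to_world(pose, point_cam):
    """Transform a 2D point from the camera frame to the world frame.
    pose = (tx, ty, theta): camera position and heading in the world frame."""
    tx, ty, theta = pose
    x, y = point_cam
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y + tx, s * x + c * y + ty)


def accumulate(views, cell=0.05):
    """Merge the points of several views into one set of occupied grid cells."""
    occupied = set()
    for pose, points in views:
        for p in points:
            wx, wy = camera_to_world(pose, p)
            occupied.add((round(wx / cell), round(wy / cell)))
    return occupied


# Two hypothetical views. The world point (2, 1) is seen by both cameras,
# but it has different coordinates in each camera's own frame.
views = [
    ((0.0, 0.0, 0.0), [(2.0, 1.0), (2.0, 0.0)]),             # camera at the origin
    ((1.0, 0.0, math.pi / 2), [(1.0, -1.0), (2.0, -1.0)]),   # moved and turned 90 degrees
]

model = accumulate(views)
print(sorted(model))
```

Four measurements produce only three occupied cells, because the point observed from both poses lands in the same cell once it is expressed in world coordinates. This is why knowing the camera pose precisely matters so much: a wrong pose would smear one surface into several.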

It is divided into four stages:

Depth map conversion. The depth is converted to meters and stored as floating-point numbers, and the vertex coordinates and surface normal vectors are computed.
Camera pose estimation. The camera's pose (position and orientation) in the world coordinate system is computed and tracked with an iterative alignment algorithm, so that the system always knows how far the camera has moved from its initial pose.
Fusion. The depth data, whose pose is now known, is fused into a single three-dimensional Lego space. You could also call it a Minecraft space, because its basic element is not a triangle but a cubic cell (a voxel). The Minecraft scene that appears in the demo video is presumably related to this stage.
Rendering by raycasting. Raycasting emits rays from the current camera position and intersects them with the three-dimensional space. The Lego space is particularly well suited to raycasting, and octrees can be used to accelerate the ray intersection tests. Raycasting, raytracing, and rasterization are three common rendering methods; they are not covered in detail here.
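
The third and fourth stages can be sketched in one dimension. The KinectFusion papers store a truncated signed distance to the nearest surface in each cell of the grid; the toy version below does the same along a single line of cells in front of the camera, averaging several noisy depth readings and then "casting a ray" by walking the cells until the stored value changes sign. It is a sketch under simplifying assumptions (one ray, a fixed camera, made-up measurements), not Microsoft's implementation.

```python
def fuse(tsdf, weights, cell_z, depth, trunc):
    """Fuse one depth reading into the cells along the ray (running average)."""
    for i, z in enumerate(cell_z):
        sdf = depth - z                 # positive in front of the surface
        if sdf < -trunc:
            continue                    # far behind the surface: not observed
        value = min(1.0, sdf / trunc)   # truncate and normalise to [-1, 1]
        w = weights[i]
        tsdf[i] = (tsdf[i] * w + value) / (w + 1)
        weights[i] = w + 1


def raycast(tsdf, weights, cell_z):
    """Walk along the ray and return the interpolated zero crossing, if any."""
    for i in range(len(cell_z) - 1):
        if weights[i] == 0 or weights[i + 1] == 0:
            continue                    # skip cells that were never observed
        a, b = tsdf[i], tsdf[i + 1]
        if a > 0 >= b:                  # sign change: the surface lies in between
            t = a / (a - b)
            return cell_z[i] + t * (cell_z[i + 1] - cell_z[i])
    return None


# Hypothetical setup: 21 cells spaced 0.1 m apart, truncation distance 0.3 m.
cell_z = [i * 0.1 for i in range(21)]
tsdf = [0.0] * len(cell_z)
weights = [0] * len(cell_z)

# Four noisy readings of a wall that is about 1 m away.
for reading in (1.02, 0.98, 1.01, 0.99):
    fuse(tsdf, weights, cell_z, reading, trunc=0.3)

surface = raycast(tsdf, weights, cell_z)
print("estimated surface distance:", surface)
```

Averaging in the grid is what lets many noisy depth maps converge to a smooth surface, and the sign change gives the renderer a cheap way to find that surface along any ray.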

In the HoloLens application, we only need to run through the third stage, that is, to obtain the three-dimensional Lego model; the fourth stage is not required. Because the HoloLens display is transparent, there is no need to render the model of the room again: our own eyes have already rendered it :)

How are the cool HoloLens demos made?

Three difficult points remain, which are left for subsequent articles:

How does gesture recognition work?
How does eye tracking work?
How does three-dimensional rendering that fits the real scene so closely work?

This article is an English version of an article originally published in Chinese on aliyun.com.
