Some thoughts on SLAM+AR technology and its applications

Source: Internet
Author: User



I. Overview

From an image-processing perspective, AR technology can be abstracted as follows:

*Compute the camera's location and the three-dimensional structure of the environment from images and other sensor data, and combine this with 3D rendering to provide more natural human-machine interaction.*

As shown in Figure 1, the position and structure information generally comprises the camera pose and a point cloud or 3D model (point cloud/mesh); different technical problems emphasize different parts of it.

Fig. 1 AR-related technologies diagram

Marker/markerless tracking focuses only on camera tracking. This is the first class of AR technology; its main requirements are fast computation, to guarantee real-time performance, and stable tracking that is not easily lost. A marker is generally a 2D picture, so processing is relatively simple: the world coordinate system usually takes the marker plane as the XY plane, with the plane's normal vector as the Z axis. By matching feature points in the camera image against feature points in the marker, the pose of the camera relative to the marker can be computed, as shown in Figure 2.
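
To make the marker coordinate convention concrete, here is a minimal sketch of the pinhole projection that marker pose estimation inverts. The function names, the intrinsics (`fx`, `fy`, `cx`, `cy`) and the 10 cm marker size are illustrative assumptions, not values from this article. A pose solver searches for the rotation `R` and translation `t` that make the reprojection error small.

```python
import math

def project_point(point_w, R, t, fx, fy, cx, cy):
    """Project a point given in marker (world) coordinates into pixel coordinates."""
    # Marker frame -> camera frame: x_cam = R * x_world + t
    xc = [sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        raise ValueError("point is behind the camera")
    # Pinhole model: perspective division followed by the intrinsics
    return (fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy)

def reprojection_error(marker_points, observed_pixels, R, t, fx, fy, cx, cy):
    """Mean pixel distance between observed and predicted feature positions."""
    total = 0.0
    for pw, (u_obs, v_obs) in zip(marker_points, observed_pixels):
        u, v = project_point(pw, R, t, fx, fy, cx, cy)
        total += math.hypot(u - u_obs, v - v_obs)
    return total / len(marker_points)

# Hypothetical 10 cm square marker lying in the Z = 0 plane of the world frame.
MARKER_CORNERS = [(-0.05, -0.05, 0.0), (0.05, -0.05, 0.0),
                  (0.05, 0.05, 0.0), (-0.05, 0.05, 0.0)]
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
INTRINSICS = dict(fx=500.0, fy=500.0, cx=320.0, cy=240.0)  # assumed values

# Simulate observations with the camera 0.5 m in front of the marker.
true_t = (0.0, 0.0, 0.5)
observed = [project_point(p, IDENTITY, true_t, **INTRINSICS) for p in MARKER_CORNERS]

# A candidate pose that is 2 cm off sideways reprojects noticeably worse.
print(reprojection_error(MARKER_CORNERS, observed, IDENTITY, true_t, **INTRINSICS))
print(reprojection_error(MARKER_CORNERS, observed, IDENTITY, (0.02, 0.0, 0.5), **INTRINSICS))
```

Because every marker point has Z = 0, a real implementation can estimate the pose from a plane-to-image homography or a PnP solver; the sketch only shows the error those solvers minimize.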

Fig. 2 Marker AR schematic diagram

The IMU (Inertial Measurement Unit) is a component found in almost every modern smartphone. An IMU can measure the rotation of the phone (and hence the camera) accurately, but spatial displacement is difficult to measure accurately with it, so the IMU is often used as a supplement to other techniques such as SLAM.
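
One common explanation for this asymmetry is how sensor bias accumulates: rotation comes from integrating the gyroscope once, while displacement requires integrating the accelerometer twice, so a constant bias grows linearly in the angle but quadratically in the position. The sketch below illustrates this for a stationary device; the bias values and sample rate are assumptions chosen only for illustration.

```python
def integrate_angle(gyro_samples, dt):
    """Integrate angular rate (rad/s) once to get an angle (rad)."""
    angle = 0.0
    for w in gyro_samples:
        angle += w * dt
    return angle

def integrate_position(accel_samples, dt):
    """Integrate acceleration (m/s^2) twice (simple Euler) to get a position (m)."""
    velocity = 0.0
    position = 0.0
    for a in accel_samples:
        velocity += a * dt
        position += velocity * dt
    return position

# Assumed scenario: the phone lies still for 10 s, sampled at 100 Hz,
# and each sensor reports only a small constant bias.
DT = 0.01
N = 1000
GYRO_BIAS = 0.001   # rad/s, illustrative
ACCEL_BIAS = 0.05   # m/s^2, illustrative

angle_error = integrate_angle([GYRO_BIAS] * N, DT)
position_error = integrate_position([ACCEL_BIAS] * N, DT)
print("angle error (rad):", angle_error)
print("position error (m):", position_error)
```

With these assumed numbers the angle drifts by about 0.01 rad while the position drifts by roughly 2.5 m, which is why visual tracking is typically used to correct the translation.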
SLAM (Simultaneous Localization and Mapping), as the name suggests, aims both to compute the camera pose and to scan the three-dimensional structure of the environment. It is very similar to SFM (Structure from Motion). The biggest difference lies in the use case: SFM is commonly used for 3D reconstruction, where the demands on structural detail and precision are very high and the result is usually shown to users, so it must also look good; it is therefore generally processed offline. SLAM is applied in scenarios such as AR and robot control, which generally require real-time operation, so the scanned structure is relatively rough and serves mainly to support the system's own localization. The biggest difference between SLAM/SFM and marker tracking is that no three-dimensional information is known in advance: the 3D structure must first be recovered from two 2D images (as shown in Figure 3), and then the map is continuously tracked and extended. The coordinate system of the 3D map built by SLAM is arbitrary; an interesting improvement discussed later is to use a marker as SLAM's initial map so that the SLAM coordinate system becomes well defined.


Figure 3 Restoring three-dimensional structures from two-dimensional images
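
As a rough illustration of recovering 3D structure from two views, the sketch below triangulates one point with the midpoint method: each matched feature defines a viewing ray from its camera center, and the 3D point is taken as the midpoint of the shortest segment between the two rays. This is a simplified stand-in, not necessarily the method used by any particular SLAM system, and it assumes the two camera poses are already known; the function name and example values are hypothetical.

```python
def triangulate_midpoint(c1, d1, c2, d2, eps=1e-12):
    """Triangulate a 3D point from two rays c1 + s*d1 and c2 + t*d2."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        # Parallel rays: no baseline/parallax, so depth cannot be recovered.
        raise ValueError("rays are parallel; the two views lack parallax")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]  # closest point on ray 1
    p2 = [ci + t * di for ci, di in zip(c2, d2)]  # closest point on ray 2
    return [(x + y) / 2.0 for x, y in zip(p1, p2)]

# Illustrative setup: two camera centers 0.5 m apart observing the same point.
point = (0.2, 0.1, 2.0)
cam1 = (0.0, 0.0, 0.0)
cam2 = (0.5, 0.0, 0.0)
ray1 = [p - c for p, c in zip(point, cam1)]
ray2 = [p - c for p, c in zip(point, cam2)]
print(triangulate_midpoint(cam1, ray1, cam2, ray2))
```

The parallel-ray check reflects a practical constraint on initialization: the two images need enough translation between them for depth to be observable.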
3D object tracking is similar to both marker tracking and SLAM. Marker tracking computes the position of a 2D picture relative to the camera. One of the problems SLAM solves is to localize accurately within an already-built 3D point cloud and keep tracking. The problem 3D object tracking solves is to pinpoint the pose of a 3D object relative to the camera. Compared with marker tracking, it upgrades tracking from 2D images to 3D objects; the difference is that a 2D picture always lies in a plane in three-dimensional space, which makes it easy to construct the point cloud and the world coordinate system. Compared with SLAM, the difference is that the environment SLAM maps and localizes in must be relatively stable, whereas when a 3D object is recognized the background often changes. The prior knowledge required by 3D object tracking is generally a 3D model, which can sometimes be obtained through SLAM or SFM.
Object segmentation separates a specified object, or the foreground, from the background of the camera image. There are many ways to do this; one is to use the object's 3D model to find and separate the object in the image while accurately computing the pose of the 3D object relative to the camera. This technology is complementary to 3D object tracking: if the object can be segmented first, tracking becomes simpler; conversely, if the object's pose can be tracked, segmenting it is relatively easy.
In AR technology based on a monocular camera, feature points are the most widely used technique: by matching feature points between different frames, the motion of the camera can be computed accurately and the three-dimensional structure of the environment recovered. There are many feature point algorithms. On mobile devices, ORB and FREAK tend to be used for computational efficiency; for example, we use ORB features for SLAM and FREAK features for marker tracking. Features such as SIFT offer high matching precision but low computational efficiency, and are generally used in techniques such as SFM. The advantage of feature points is that they have good invariance properties, such as scale invariance and rotation invariance, so the matching results are more robust. Optical flow is another way to track camera motion. Unlike feature-point methods, it does not compute feature points and descriptors but matches pixel blocks directly, which improves efficiency. Its drawbacks are that it is only suited to computing motion between two adjacent frames and that it requires brightness constancy (conservation of light intensity), so it cannot be used to implement relocalization and similar functions.
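
ORB and FREAK produce binary descriptors, which are compared with the Hamming distance (the number of differing bits); this is a large part of why they are cheap to match on mobile hardware. The sketch below is a minimal brute-force matcher with a ratio test, using toy 8-bit descriptors stored as Python integers (real ORB descriptors are 256 bits). The function names, thresholds, and descriptor values are assumptions for illustration only.

```python
def hamming(a: int, b: int) -> int:
    """Number of bits that differ between two binary descriptors."""
    return bin(a ^ b).count("1")

def match_descriptors(query, train, max_distance=64, ratio=0.8):
    """Brute-force matching; returns a list of (query_idx, train_idx, distance)."""
    matches = []
    for qi, q in enumerate(query):
        ranked = sorted((hamming(q, t), ti) for ti, t in enumerate(train))
        if not ranked:
            continue
        best_dist, best_idx = ranked[0]
        if best_dist > max_distance:
            continue  # even the best candidate is too different
        # Ratio test: reject matches whose runner-up is almost as good.
        if len(ranked) > 1 and best_dist >= ratio * ranked[1][0]:
            continue
        matches.append((qi, best_idx, best_dist))
    return matches

# Toy 8-bit descriptors from two hypothetical frames.
frame_a = [0b10110010, 0b00001111]
frame_b = [0b00001110, 0b10110011, 0b11111111]
print(match_descriptors(frame_a, frame_b, max_distance=2))
```

Because descriptors are matched by appearance rather than by position in the previous frame, the same routine can match against a stored map, which is what makes relocalization possible with feature points but not with frame-to-frame optical flow.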
Behind each AR technology lie a number of new business forms. AR business is currently expanding mainly in two directions, marketing interaction and practical tools, as shown in Figure 5. The earliest technology, marker tracking, gave rise to AR interactive marketing, AR education and other businesses; the Mobile Taobao Armagic interactive platform is built on this technology. SLAM technology supports AR virtual products, allowing 3D goods to enter the user's field of view more naturally. AR Catch the Cat likewise relies on the IMU, which allowed the AR application to reach a wide audience. More importantly, the potential of these technologies has not yet been fully released, and within the existing technology system we have planned some future business blueprints:

AR instruction manuals: using AR to show how equipment used in daily life and production is operated or how it works. Examples include 3D instruction manuals for home appliances such as washing machines and rice cookers (as shown in Figure 4) and virtual operating instructions for factory equipment. This business is based mainly on 3D object tracking and SLAM technology.

Figure 4 AR Instruction sample diagram

AR scene interaction: in offline venues (shopping malls, leisure venues, museums, etc.), AR enables more natural and richer interaction. For example, the Starbucks new-concept flagship store currently planned for Shanghai intends to use AR to present its traditional machines and the history of its traditional craftsmanship, so that users can explore Starbucks' history and culture through AR. Combined with payment, logistics and other steps, this forms an integrated, fun shopping experience.

AR live streaming/AR video: by placing a number of special "tags" in the scene, richer interactive effects can be overlaid on live streams and videos.

The advantage of AR as a business form is not only its novelty. Because it connects the virtual and the real world through the camera, it also naturally bridges online and offline, which makes it a powerful tool for exploring "new retail".

Fig. 5 AR technology system and business relationship diagram

II. SLAM technology

When open-source SLAM projects are mentioned, the famous ORB-SLAM naturally comes to mind first. We begin with a brief introduction to the algorithm structure of ORB-SLAM2. As shown in Figure 6, ORB-SLAM2 is divided into three main threads: Tracking, which tracks the camera pose; Local Mapping, which builds the point cloud map; and Loop Closing, which detects loop closures and optimizes the point cloud positions. Place Recognition, that is, relocalization, uses a BoW (Bag of Words) model to locate the camera within a map that has already been built. Although ORB-SLAM2 performs well overall among the many open-source projects, it is still just a laboratory product, and a great deal of work is needed before it is truly practical. After porting the project to mobile devices, its performance was as follows:
1. iPhone 7 Plus: 15 FPS; Android (Samsung Note7): 1-2 FPS
2. BoW vocabulary size 140 MB, load time 8 seconds, memory usage 400+ MB
3. Position tracking is unstable and jitters
4. The code contains many bugs; memory leaks in particular are very serious
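
To give a feel for BoW-based relocalization, here is a toy sketch: each descriptor is assigned to its nearest "visual word", an image becomes a normalized word histogram, and the stored keyframe with the most similar histogram is the relocalization candidate. This is a deliberate simplification, not ORB-SLAM2's actual code; production vocabularies are typically hierarchical trees with weighted words, which is part of why they are large. The three-word vocabulary, the function names, and the descriptors below are all hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")

def bow_vector(descriptors, vocabulary):
    """L1-normalized histogram of nearest visual words (flat vocabulary)."""
    counts = [0.0] * len(vocabulary)
    for d in descriptors:
        word = min(range(len(vocabulary)), key=lambda i: hamming(d, vocabulary[i]))
        counts[word] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts

def l1_score(v1, v2):
    """Similarity of two normalized BoW vectors: 1 if identical, 0 if disjoint."""
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(v1, v2))

def relocalize(query_vec, keyframe_vecs):
    """Index of the stored keyframe whose BoW vector best matches the query."""
    return max(range(len(keyframe_vecs)),
               key=lambda i: l1_score(query_vec, keyframe_vecs[i]))

# Toy 8-bit vocabulary and descriptors, for illustration only.
VOCAB = [0b00000000, 0b11111111, 0b00001111]
keyframes = [
    bow_vector([0b00000001, 0b00000010, 0b11111110], VOCAB),
    bow_vector([0b00001111, 0b00001110, 0b00000111], VOCAB),
]
query = bow_vector([0b00000100, 0b00000000, 0b11110111], VOCAB)
print("best keyframe:", relocalize(query, keyframes))
```

In a full system the best-scoring keyframe is only a candidate; the camera pose is then computed from feature matches against that keyframe's map points.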


Figure 6 ORB-SLAM2

Therefore, to apply SLAM technology on mobile devices, we must break through the obstacle of "limited hardware resources".
