A Ramble on Visual SLAM (III): An Introduction to Research Points

Last Update:2018-08-02 Source: Internet

Author: User

This article is reproduced without modification. The "I" and the contact email refer to the original author.

1. Preface

Hello, readers and friends. A long time ago we introduced the basic concepts and methods of SLAM, and I trust everyone now has a basic understanding of it. After writing a pile of papers and a PhD thesis, I am ready to come back and fill in the holes I dug: to introduce every aspect of SLAM research. If the first two articles were a "first acquaintance", the next few are about "going deeper". In this third article we talk about the various research points in SLAM and give a rough overview for graduate students (who are probably most of this blog's readers). After that, we will go through the individual sub-problems, their classical algorithms, and how they are classified. I have the patience to tell it; do you have the patience to listen?

In "SLAM for Dummies" there is a good saying: "SLAM is more like a concept than a single algorithm." So you can tell your advisor and your senior labmates (and junior ones, if you have any) that you are studying SLAM, but as a peer, I care more about which SLAM problem you are studying. Some researchers focus on building a specific SLAM system, while more are improving particular methods within SLAM. People doing applications and people doing theory often look down on each other, but both contribute to research. As a graduate student, I suggest you pick one small problem in SLAM and see whether you can improve or compare the existing algorithms. Do not think this kind of work is superficial; it is of real help and value to research. I also have friends who implemented a filter-based or graph-optimization-based SLAM system. The program runs, but they cannot say what their contribution is or which problem they dug into, so writing a paper becomes a headache. So, as a graduate student, choose a problem in SLAM and improve an algorithm, rather than collecting a bunch of programs and running them once more.

So the question is: what can be studied in SLAM? I have drawn a mind map for everyone.

This picture is taken from my notebook (please do not complain about how it looks or about the focus). As you can see, with SLAM at the center, five circles are connected to it. I call them Basic Theory, Sensor, Mapping, Loop Detection (loop closure detection), and Advanced Topic. Together they can be seen as the research directions of SLAM. Below we take them up one branch at a time.

2. Basic theory

The basic theory of SLAM refers to its mathematical modeling, that is, how you express the problem with a mathematical model. Why "basic"? Because the mathematical model affects the performance of the whole system and determines how the other problems are handled. In early studies (from 1986 [1] to the early 21st century [2]), the Kalman filter was used as the mathematical model. There, the robot is a time series of poses, and the map is a set of landmark points. What is a set of landmark points? Each landmark is represented by (x, y, z), and as the filter is updated, these three numbers slowly converge.
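To make "the three numbers slowly converge" concrete, here is a minimal sketch of a Kalman update applied to one static landmark. It is a simplification and not the full EKF SLAM formulation: it assumes the landmark position is observed directly in the world frame, treats the three axes as independent, and ignores the robot pose entirely. The function names and the noise values are illustrative assumptions.

```python
import random


def kalman_update(mean, var, measurement, meas_var):
    """One scalar Kalman update for a quantity that does not move."""
    gain = var / (var + meas_var)
    new_mean = mean + gain * (measurement - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var


def estimate_landmark(measurements, meas_var=0.25, init_var=100.0):
    """Fuse noisy (x, y, z) observations of one static landmark, axis by axis.

    Assumption: direct position observations with independent axes,
    so a full covariance matrix is not needed in this toy case.
    """
    mean = [0.0, 0.0, 0.0]
    var = [init_var] * 3
    for obs in measurements:
        for axis in range(3):
            mean[axis], var[axis] = kalman_update(
                mean[axis], var[axis], obs[axis], meas_var
            )
    return mean, var


# Hypothetical landmark and noise level, chosen only for illustration.
random.seed(0)
true_landmark = (2.0, -1.0, 0.5)
observations = [
    tuple(c + random.gauss(0.0, 0.5) for c in true_landmark)
    for _ in range(200)
]
estimate, variance = estimate_landmark(observations)
print("estimate:", estimate)
print("variance:", variance)
```

With every update the variance shrinks, so later observations move the estimate less and less. That is also why the model struggles once a landmark physically moves: the filter has become confident in the old position.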

Well, how good is this model?

The advantage is that the solution methods of the filter can be applied directly. The Kalman filter is a very mature theory and is quite reliable.

As for the shortcomings: first, whatever shortcomings the filter has, SLAM built on it inherits. So the EKF's linearization assumption and the resource cost of having to store the covariance matrix are both disadvantages (discussed in a later article). Then, most intuitively, a landmark is represented by a fixed (x, y, z). What if the landmark moves? Tables and chairs at home get moved often, and then the filter fails. So the model does not apply to dynamic scenes either.

This limitation comes from the mathematical model itself and is independent of the other algorithms. If you want to run SLAM in a dynamic environment, you need a different model or an improvement to the existing one.

The basic theory of SLAM has always fallen into two families: filtering and optimization. Filtering methods include the extended Kalman filter (EKF), the particle filter (PF), FastSLAM, and so on, which appeared earlier. On the optimization side, the idea of the pose graph was introduced in the previous article. In recent years optimization has gradually gained ground, while on the filtering side the method based on random finite sets [3], which appeared in 2013, is also a new wave [4]. We will discuss these methods in more detail in a future article.

As SLAM researchers, we should have a general understanding of the basic theories and their pros and cons, even though their implementations can be very complex.

3. Sensor

Sensors are how a robot perceives the world. The choice and mounting of the sensor determine the specific form of the observation equation and also, to a great extent, the difficulty of the SLAM problem. Early SLAM mostly used laser range finders; now visual cameras, depth cameras, sonar (underwater), and sensor fusion are used. I think there are several research points in this direction:

How to use emerging sensors for SLAM. Sensors keep evolving and new ones keep coming out, so this line of research will never dry up.

The effect of different mounting on SLAM. For example, SLAM with a camera looking up (at the ceiling) or down (at the floor) is much easier than with a forward-looking camera. Why? Because top-view and bottom-view data are very stable, whereas a forward-looking camera is disturbed by all sorts of things. Of course, you can also study other mounting options.

Improved data processing for traditional sensors. This one is somewhat difficult: conventional sensors already have many users, so your improvement may not beat the existing mature methods.

4. Mapping

Mapping, as the name implies, is about how to draw the map. In fact, if you know the robot's true trajectory, drawing the map is very simple. However, the specific form of the map is itself a research point. The following forms are common:

Landmark map

The map consists of a set of landmark points. This is the map used in the EKF. But some people ask: is this really a map (what are these scattered bits and pieces)? So although a landmark map is very convenient, most people are not satisfied with it; at the very least it does not look like a map. Hence the dense map.

Metric map

This usually refers to a 2D/3D grid map, the familiar black-and-white or point-cloud-like map. A point cloud map looks cool and has a high-tech feel. Its advantage is high accuracy; for example, a 2D map can use 0 and 1 to indicate whether a point is passable, which is useful for navigation. The disadvantage is that it eats a lot of storage, especially in large scenes: every point in space is saved, yet most of the points in the corners serve no purpose other than looking good...
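The 0/1 grid described above can be sketched in a few lines. This is only an illustration under assumed conventions: the class name, the metre-based interface, and the 5 cm resolution are all hypothetical, and a real system would usually store occupancy probabilities rather than hard 0/1 values.

```python
import math


class OccupancyGrid:
    """2D metric grid map: 0 means free (passable), 1 means occupied."""

    def __init__(self, width_m, height_m, resolution_m):
        self.resolution = resolution_m
        self.cols = int(round(width_m / resolution_m))
        self.rows = int(round(height_m / resolution_m))
        # Every cell is stored, whether or not anything interesting is there.
        self.cells = [[0] * self.cols for _ in range(self.rows)]

    def _index(self, x_m, y_m):
        """Convert metric coordinates to a (row, col) cell index."""
        col = math.floor(x_m / self.resolution)
        row = math.floor(y_m / self.resolution)
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            raise ValueError("point lies outside the map")
        return row, col

    def mark_occupied(self, x_m, y_m):
        row, col = self._index(x_m, y_m)
        self.cells[row][col] = 1

    def is_passable(self, x_m, y_m):
        row, col = self._index(x_m, y_m)
        return self.cells[row][col] == 0

    def cell_count(self):
        return self.rows * self.cols


# A 10 m x 10 m room at 5 cm resolution (illustrative numbers).
grid = OccupancyGrid(10.0, 10.0, 0.05)
grid.mark_occupied(1.02, 2.51)
print(grid.is_passable(1.02, 2.51), grid.is_passable(7.3, 4.1))
print("cells stored:", grid.cell_count())
```

Even this small room needs 200 x 200 cells, and halving the cell size quadruples that number, which is the storage cost the paragraph above complains about.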

Topological map

A topological map is more compact than a metric map. It abstracts the map into the "nodes" and "edges" of graph theory, which fits human thinking better. For example, suppose I want to go to Wudaokou, do not know the way, and ask someone. That person will not say: first walk 621 meters, turn left 94.2 degrees, then walk 1035 meters ... (that would be crazy). A normal person will say: go straight to the second intersection, turn left, continue to the next traffic light, and so on. That is a topological map.

Hybrid map

Where some people like to classify, others are sure to want to blend the benefits of all the types together. There is not much more to say about that.

5. Loop closure detection

Loop closure detection refers to the robot's ability to recognize a scene it has visited before. If the detection succeeds, the accumulated error can be significantly reduced.

Loop closure detection today mostly uses the bag-of-words model, which will not be unfamiliar to anyone who studies computer vision. It is essentially the problem of measuring the similarity of observations. In the bag-of-words model, we extract features from each image, cluster their feature vectors (descriptors), and build a database of classes, for example eyes, nose, ears, and mouth (in practice nothing so high-level; mostly edges and corners). Suppose there are 10,000 classes. Then for each image we can determine which classes in the database it contains, marking 1 for present and 0 for absent. The image can thus be expressed as a 10,000-dimensional vector, and to compare different images we just compare their vectors.
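The comparison step can be sketched as follows. This is a toy under explicit assumptions: feature extraction and clustering are skipped, so each image arrives as a list of class (word) ids; cosine similarity and the 0.8 threshold are choices made for this example, since the text only says the vectors are compared; and the function names and sample ids are hypothetical.

```python
import math

VOCABULARY_SIZE = 10000  # number of classes in the database, as in the text


def to_bow_vector(word_ids, vocabulary_size=VOCABULARY_SIZE):
    """Binary bag-of-words vector: 1 if the class appears in the image, else 0."""
    vector = [0] * vocabulary_size
    for word_id in word_ids:
        vector[word_id] = 1
    return vector


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect_loop(current, keyframes, threshold=0.8):
    """Return the index of the most similar earlier keyframe, or None.

    A keyframe only counts as a match if its similarity reaches the
    (assumed) threshold.
    """
    best_index, best_score = None, threshold
    for index, past in enumerate(keyframes):
        score = cosine_similarity(current, past)
        if score >= best_score:
            best_index, best_score = index, score
    return best_index


# Hypothetical class ids for three images.
kitchen = to_bow_vector([3, 17, 256, 4021])
corridor = to_bow_vector([8, 99, 1500, 7777])
kitchen_again = to_bow_vector([3, 17, 256, 4021, 9000])
print(detect_loop(kitchen_again, [kitchen, corridor]))
```

Because the vectors are mostly zeros, a practical implementation would store only the ids that are present instead of all 10,000 entries.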

Loop closure detection can also be treated as a pattern recognition problem, so you can apply all kinds of machine learning methods, such as decision trees or SVMs, or try deep learning. In practice, however, detection must run in real time, and there is not much time to train a classifier. So SLAM puts more emphasis on online learning methods.

6. Advanced Topics

Everything above is the foundation of SLAM, which does just two things: "localization" and "mapping". Both are fairly mature today. In recent years RGB-D SLAM [5], SVO [6], KinectFusion [7], and others have achieved dazzling results. But SLAM has not yet entered people's everyday lives. Why?

Because real environments are often very complex. The lighting changes, the sun rises in the east and sets in the west, people keep walking in and out of the door; it is not a quiet, empty room where a robot strolls slowly at 2 cm/s. Algorithms that look cool in a paper are often stretched thin in a real environment and fail all over the place. Meeting the challenge of real environments is the main development direction of SLAM technology, and it is what we call the Advanced Topic. It mainly includes dynamic scenes, semantic maps, multi-robot collaboration, and so on.

7. Summary

In this article we introduced the various research points in SLAM. I did not want to write it as a survey, because not everyone is willing to read through a pile of references; I would rather write it as a little story.

Finally, let's imagine what future SLAM will look like:

One day, Little Radish was brought into a new laboratory building. After a brief introduction, he took a quick stroll around the building, remembering where the corridors and the rooms were. He deliberately looked at the unique items in each room in order to tell apart rooms that looked similar. He then returned to the scientists to assist them in their research. Sometimes a scientist would ask him to go to a room to find someone or fetch some materials, and sometimes would take him to get to know newly installed instruments and equipment. In his spare time, Little Radish also wanders the building to see what has changed in the rooms. Whenever a new visitor arrives, Little Radish shows them the floor plan, introduces the location and status of each floor, and guides them around. Everyone likes Little Radish very much. And Little Radish understands that all this is the result of SLAM researchers' continuous exploration over the last few decades.

References:

[1]. Smith, R.C. and P. Cheeseman, On the representation and estimation of spatial uncertainty. International Journal of Robotics Research, 1986. 5(4): p. 56--68.

[2]. Se, S., D. Lowe and J. Little, Mobile robot localization and mapping with uncertainty using scale-invariant visual landmarks. The International Journal of Robotics Research, 2002. (8): p. 735--758.

[3]. Mullane, J., et al., A random-finite-set approach to Bayesian SLAM. IEEE Transactions on Robotics, 2011.

[4]. Adams, M., et al., SLAM gets a PHD: New concepts in map estimation. IEEE Robotics & Automation Magazine, 2014. (2): p. 26--37.

[5]. Endres, F., et al., 3-D mapping with an RGB-D camera. IEEE Transactions on Robotics, 2014. (1): p. 177--187.

[6]. Forster, C., M. Pizzoli and D. Scaramuzza, SVO: Fast semi-direct monocular visual odometry. IEEE. p. 15--22.

[7]. Newcombe, R.A., et al., KinectFusion: Real-time dense surface mapping and tracking. IEEE. p. 127--136.

This article is an English version of an article originally written in Chinese on aliyun.com.
