Doing RGB-D SLAM Together (2)


Lecture 2: From Images to Point Clouds

In this lecture, we will walk the reader through writing a program that converts an image into a point cloud. This program is the basis for later map processing: the simplest point-cloud map is just a stitching together of point clouds taken at different locations.

When using an RGB-D camera, we read two kinds of data from it: color images and depth images. If you have a Kinect and ROS installed, you can run:

roslaunch openni_launch openni.launch

to bring the Kinect up. Then, as long as the PC is connected to the Kinect, the color image and the depth image are published on /camera/rgb/image_color and /camera/depth_registered/image_raw respectively. You can run:

rosrun image_view image_view image:=/camera/rgb/image_color

to display the color image. Alternatively, you can visualize the images and the point cloud in RViz.

Turnip: But Brother! I don't have a Kinect on hand right now. What should I do?

Brother: No problem! You can download the data we provide, which is actually just the following two images!

Turnip: Why is the depth map all black?!

Brother: Open your eyes and look carefully! How could it be all black?

Turnip: Uh... but it really is black!

Brother: Fine, yes! That's because the objects in the picture are close to us, so they look dark. But if you actually read the image, there is data in it!

Important notes:

    1. These two images come from the NYUv2 dataset: http://cs.nyu.edu/~silberman/datasets/ . The original formats are PPM and PGM; I converted them to PNG (otherwise the blog platform would not accept them).
    2. You can save these two images directly, or get them from my git repository.
    3. The RGB and depth images grabbed directly from an actual Kinect (or other RGB-D camera) may have a few minor problems:
      • There is a small time offset between them (from a few to more than ten milliseconds). Because of this offset you may notice that "the RGB image has already turned to the right, but the depth image has not turned yet".
      • The optical centers are misaligned. Since depth is acquired by a separate camera, the intrinsics of the depth sensor and the color sensor may not match.
      • There are many "holes" in the depth map. An RGB-D camera is not omnipotent; it has a limited detection range! Things that are too far away or too close are invisible to it. For now we will turn a blind eye to these holes and simply ignore them. Later we can also fill them with a bilateral Bayes filter. But this is a limitation of the RGB-D camera itself: a software algorithm can patch it up at best and cannot completely compensate for the flaw.

Rest assured, though: the two images we provide have been pre-processed. You can simply treat the depth map as "the distance of each pixel in the color image from the sensor".

Brother: Now we are going to turn these two images into a point cloud. Computing the spatial point corresponding to each pixel is the foundation of a whole series of later tasks such as registration and map stitching. For example, during registration we must know the 3D positions of the feature points, and that is exactly where the knowledge from this lecture is used!

Turnip: It sounds important!

Brother: Yes! So, dear readers, please make sure you master this part!

From 2D to 3D (Math section)

The two images above give part of the information the robot has about the outside world. Suppose the world is described by a point cloud: $X = \{x_1, \ldots, x_n\}$. Each point has six components, $r, g, b, x, y, z$, representing its color and its spatial position. The color comes mainly from the color image, while the spatial position can be computed from the image together with the camera model and pose.

For a conventional camera, SLAM uses the pinhole camera model (figure from http://www.comp.nus.edu.sg/~cs4243/lecture/camera.pdf):

In short, a spatial point $[x, y, z]$ and its pixel coordinates $[u, v, d]$ in the image ($d$ refers to the depth value) are related as follows:

$$ u = \frac{x \cdot f_x}{z} + c_x $$

$$ v = \frac{y \cdot f_y}{z} + c_y $$

$$ d = z \cdot s $$

Here $f_x, f_y$ are the camera's focal lengths along the $x$ and $y$ axes, $c_x, c_y$ is the camera's optical center, and $s$ is the scaling factor of the depth map.
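To make these symbols concrete, here is a minimal sketch (not part of this lecture's program; the intrinsic values are simply the ones used in the code later in this post, and the 3D point is made up) that projects a spatial point onto the image plane:

#include <cstdio>

// Example intrinsics, the same values used in generatePointCloud.cpp below.
const double fx = 518.0, fy = 519.0, cx = 325.5, cy = 253.5, s = 1000.0;

int main()
{
    // A made-up point roughly 2 m in front of the camera.
    double x = -0.3, y = -0.2, z = 2.0;
    double u = x * fx / z + cx;   // horizontal pixel coordinate
    double v = y * fy / z + cy;   // vertical pixel coordinate
    double d = z * s;             // raw depth reading stored in the depth map
    printf("u = %.1f, v = %.1f, d = %.0f\n", u, v, d);
    return 0;
}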

Turnip: I'm dizzy! So many variables popping up all of a sudden!

Brother: Don't worry, this is already a very simple model; once you are familiar with it, it won't feel complicated.

This formula goes from $(x, y, z)$ to $(u, v, d)$. Conversely, we can also write it the other way around: given $(u, v, d)$, derive $(x, y, z)$. We leave the derivation to the reader.

Just kidding, let's derive it ourselves... The formula is:

$$ z = d/s $$

$$ x = (u-c_x) \cdot z/f_x $$

$$ y = (v-c_y) \cdot z/f_y $$

See, isn't that easy? In fact, with this formula we can already build the point cloud.
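For example, with the intrinsics used later in this post ($f_x = 518.0$, $f_y = 519.0$, $c_x = 325.5$, $c_y = 253.5$, $s = 1000$), a made-up pixel $(u, v, d) = (400, 300, 2000)$ back-projects to

$$ z = 2000 / 1000 = 2, \quad x = (400 - 325.5) \cdot 2 / 518.0 \approx 0.288, \quad y = (300 - 253.5) \cdot 2 / 519.0 \approx 0.179, $$

all in meters. The pixel values here are invented purely to illustrate the arithmetic.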

In general, we collect the four parameters $f_x, f_y, c_x, c_y$ into the camera's intrinsic matrix $C$, i.e. parameters that do not change once the camera is assembled. The camera intrinsics can be calibrated in many ways; the detailed procedure is rather tedious, so we will not cover it here. Given the intrinsics, the relationship between each point's spatial position and its pixel coordinates can be described with a simple matrix model:

$$ s \cdot \left[ \begin{array}{c} u \\ v \\ 1 \end{array} \right] = C \cdot \left( R \cdot \left[ \begin{array}{c} x \\ y \\ z \end{array} \right] + t \right) $$

Here, $R$ and $t$ describe the camera's pose: $R$ is the rotation matrix and $t$ is the translation vector. Since we are building a single point cloud, we assume the camera has neither rotated nor translated. Therefore we set $R$ to the identity matrix $I$ and $t$ to zero.

Turnip: So that's where the $(u, v, d)$ to $(x, y, z)$ formula above comes from?

Brother: Yes! How clever! If the camera has translated and rotated, we simply apply the same translation and rotation to these points.
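As a side note, here is a minimal sketch of what "applying the translation and rotation to these points" looks like, using Eigen (which is not needed anywhere in this lecture's code; the pose and point values are made up):

#include <Eigen/Core>
#include <Eigen/Geometry>
#include <iostream>

int main()
{
    // A back-projected point in camera coordinates (made-up values).
    Eigen::Vector3d p(0.3, 0.2, 2.0);

    // Camera pose: here the identity rotation and zero translation,
    // which is exactly the assumption made in this lecture.
    Eigen::Matrix3d R = Eigen::Matrix3d::Identity();
    Eigen::Vector3d t = Eigen::Vector3d::Zero();

    // Apply the pose; with R = I and t = 0 the point is unchanged.
    Eigen::Vector3d p_transformed = R * p + t;
    std::cout << p_transformed.transpose() << std::endl;
    return 0;
}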

From 2D to 3D (Programming section)

Now let's implement a program that converts the images into a point cloud. Create a new generatePointCloud.cpp file in the src/ folder of the code root directory mentioned in the previous lecture:

touch src/generatePointCloud.cpp

Turnip: Brother! Can a project have several main functions?

Brother: Yes, CMake lets you define the build process yourself. We will compile this cpp file into another executable binary; we just need to make the corresponding changes in CMakeLists.txt.

Next, enter the following code in the file you just created. To keep the text coherent, we first give the complete code and then explain the important parts. Beginners are encouraged to type it out themselves, word by word; you will get a firmer grasp of it that way.

// C++ standard library
#include <iostream>
#include <string>
using namespace std;

// OpenCV library
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>

// PCL library
#include <pcl/io/pcd_io.h>
#include <pcl/point_types.h>

// Define the point cloud types
typedef pcl::PointXYZRGBA PointT;
typedef pcl::PointCloud<PointT> PointCloud;

// Camera intrinsics
const double camera_factor = 1000;
const double camera_cx = 325.5;
const double camera_cy = 253.5;
const double camera_fx = 518.0;
const double camera_fy = 519.0;

// Main function
int main( int argc, char** argv )
{
    // Read ./data/rgb.png and ./data/depth.png and convert them into a point cloud

    // Image matrices
    cv::Mat rgb, depth;
    // Use cv::imread() to read the images
    // API: http://docs.opencv.org/modules/highgui/doc/reading_and_writing_images_and_video.html?highlight=imread#cv2.imread
    rgb = cv::imread( "./data/rgb.png" );
    // rgb is an 8UC3 color image
    // depth is a 16UC1 single-channel image; note that flags is set to -1,
    // meaning the raw data is read without any modification
    depth = cv::imread( "./data/depth.png", -1 );

    // Point cloud variable
    // Use a smart pointer to create an empty point cloud. It is released automatically when no longer used.
    PointCloud::Ptr cloud ( new PointCloud );
    // Traverse the depth map
    for (int m = 0; m < depth.rows; m++)
        for (int n = 0; n < depth.cols; n++)
        {
            // Get the value at (m,n) in the depth map
            ushort d = depth.ptr<ushort>(m)[n];
            // d may have no value; if so, skip this point
            if (d == 0)
                continue;
            // d has a value, so add a point to the point cloud
            PointT p;

            // Compute the spatial coordinates of this point
            p.z = double(d) / camera_factor;
            p.x = (n - camera_cx) * p.z / camera_fx;
            p.y = (m - camera_cy) * p.z / camera_fy;

            // Get its color from the rgb image
            // rgb is a three-channel BGR image, so the colors are fetched in this order
            p.b = rgb.ptr<uchar>(m)[n*3];
            p.g = rgb.ptr<uchar>(m)[n*3+1];
            p.r = rgb.ptr<uchar>(m)[n*3+2];

            // Add p to the point cloud
            cloud->points.push_back( p );
        }
    // Set up and save the point cloud
    cloud->height = 1;
    cloud->width = cloud->points.size();
    cout<<"point cloud size = "<<cloud->points.size()<<endl;
    cloud->is_dense = false;
    pcl::io::savePCDFile( "./data/pointcloud.pcd", *cloud );
    // Clear the data and exit
    cloud->points.clear();
    cout<<"Point cloud saved."<<endl;
    return 0;
}

The program needs data to run. Please put the two images above in data/ under the code root directory (create the folder if it does not exist).

We use OpenCV's imread function to read the images. In OpenCV2, images are stored in a matrix (cv::Mat) as the basic data structure. The Mat structure manages memory and pixel information for you and also supports common matrix operations; it is a very convenient structure. A color image has the three channels r, g, b, each occupying 8 bits (i.e. an unsigned char), so it is called an 8UC3 (8-bit unsigned char, 3-channel) structure. The depth map is a single-channel image in which each pixel occupies 16 bits (i.e. an unsigned short in C++), and the pixel value is the distance from the sensor to that point. Usually a value of 1000 corresponds to 1 meter, so we set camera_factor to 1000. Dividing each reading in the depth map by 1000 thus gives the true distance in meters.

Next, we traverse the whole depth map row by row. In this double loop:

for (int m = 0; m < depth.rows; m++)
    for (int n = 0; n < depth.cols; n++)

Here m is the row index of the image and n is the column index. The relationship between this image coordinate system and the spatial point is as follows:

In the depth map, the value at row m, column n can be obtained with depth.ptr<ushort>(m)[n]. Here cv::Mat's ptr function returns a pointer to the beginning of row m of the image data; adding the offset n, the value the pointer then refers to is the one we want to read.
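As a quick sanity check (a standalone sketch, not part of the tutorial program), the row-pointer access gives the same value as the easier-to-read cv::Mat::at, which is a little slower inside tight loops:

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>

int main()
{
    // Assumes ./data/depth.png exists, just like the main program.
    cv::Mat depth = cv::imread( "./data/depth.png", -1 );
    int m = 240, n = 320;                   // an arbitrary pixel, purely for illustration
    ushort d1 = depth.ptr<ushort>(m)[n];    // row pointer plus column offset
    ushort d2 = depth.at<ushort>(m, n);     // equivalent element access via at<>
    std::cout << d1 << " " << d2 << std::endl;  // the two values are identical
    return 0;
}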

We have already given the formula for computing the 3D point coordinates, and the code implements it exactly as written. Based on this formula, we build a spatial point and add it to the point cloud. Finally, the whole point cloud is saved as the file ./data/pointcloud.pcd.
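If you would like to check the saved file from code rather than with pcl_viewer, a minimal sketch (not part of the tutorial program) that loads it back looks like this:

#include <pcl/io/pcd_io.h>
#include <pcl/point_types.h>
#include <iostream>

int main()
{
    pcl::PointCloud<pcl::PointXYZRGBA>::Ptr cloud( new pcl::PointCloud<pcl::PointXYZRGBA> );
    // Load the file written by generate_pointcloud; loadPCDFile returns a negative value on failure.
    if ( pcl::io::loadPCDFile( "./data/pointcloud.pcd", *cloud ) < 0 )
    {
        std::cerr << "could not read ./data/pointcloud.pcd" << std::endl;
        return 1;
    }
    std::cout << "loaded " << cloud->points.size() << " points" << std::endl;
    return 0;
}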

Compile and run

Finally, we add a few lines to src/CMakeLists.txt to tell the build system that we want to compile this program. Please add the following lines to that file:

# Add the PCL library dependency
find_package( PCL REQUIRED COMPONENTS common io )
# Add the OpenCV dependency
find_package( OpenCV REQUIRED )
# Add header and library paths
add_definitions( ${PCL_DEFINITIONS} )
include_directories( ${PCL_INCLUDE_DIRS} )
link_libraries( ${PCL_LIBRARY_DIRS} )

add_executable( generate_pointcloud generatePointCloud.cpp )
target_link_libraries( generate_pointcloud ${OpenCV_LIBS}
    ${PCL_LIBRARIES} )

Then, compile the new project:

cd build
cmake ..
make
cd ..

If the compilation succeeds, you will find the newly built binary generate_pointcloud in the bin directory. Run it:

bin/generate_pointcloud

This generates a point cloud file in the data directory. Now you will surely want to look at the newly generated point cloud. If you have PCL installed, you can run:

pcl_viewer data/pointcloud.pcd

to view the newly generated point cloud.

Homework after class

In this lecture, we implemented a program that converts a 2D image into a 3D point cloud. Next, we will look at feature point extraction and image registration. During registration, we will need to compute the spatial positions of 2D feature points in the image. So here is your homework: write a header file and a source file implementing a point2dTo3d function. Put the function's declaration in the header file, give its implementation in the source file, and compile it into a library called slam_base in CMake. (You will need to think about how to define a good interface.) Then, whenever we need this computation in the future, we just call this function.
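For readers who get stuck, here is one possible shape for such an interface. The struct and the exact signature are only an illustration of what the homework is asking for, not the official answer, and the implementation is still left to you:

// slamBase.h -- a hypothetical interface sketch; names and types are illustrative only
#pragma once
#include <opencv2/core/core.hpp>

// Camera intrinsics bundled together so the function signature stays short.
struct CAMERA_INTRINSIC_PARAMETERS
{
    double cx, cy, fx, fy, scale;
};

// Convert a pixel (u, v) plus its raw depth d, packed into a cv::Point3f
// as (u, v, d), into a 3D point in camera coordinates using the inverse
// projection formulas given earlier in this lecture.
cv::Point3f point2dTo3d( const cv::Point3f& point, const CAMERA_INTRINSIC_PARAMETERS& camera );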

Turnip: Brother! This homework looks a bit difficult!

Brother: Yes, if it were too easy, readers would find it dull.

Finally, the source code is, as always, available for download from my git repository. If you have any questions about this article, please email me at [email protected]. The readers' encouragement is the best support for me!
