Computer vision is an important field of artificial intelligence. To use an analogy (perhaps not a perfect one), I think of computer vision as the eyes of the AI age, which shows how important it is. Computer vision is actually a very broad concept. Someone has summarized the skills it involves as a skill tree.
If you are not familiar with computer vision, do not be scared by this skill tree. No one can master all of these skills at once. The tree is only meant to give you a rough picture of the field.
Let's look at computer vision in action. The following video shows a practical application of computer vision in autonomous driving. It involves several key technologies, such as stereo vision, optical flow estimation, visual odometry, 3D object detection and recognition, and 3D object tracking.
Here is a relatively easy way to get started with computer vision.
Macro understanding
Beginners are often overwhelmed when they see so many subfields. Should I learn face recognition, object tracking, or computational photography and 3D reconstruction? It is hard to know where to start. In fact, these subfields share a lot of common knowledge. My suggestion is not to rush (as the saying goes, you cannot eat hot tofu in a hurry). Only by first gaining a preliminary, overall understanding of computer vision can you find the research direction that interests you based on your actual problems. And interest is what carries a self-taught beginner through the difficulties.
1. Introductory books
Since this is an introduction, I do not recommend starting with a book like Multiple View Geometry in Computer Vision, which is a classic but makes it easy to give up.
Pixel-level image processing is the foundation of computer vision. No matter which subfield you end up working in, you must understand these basics. Even if you are eager to get started, you need to be patient here. Some people on the Internet say you can learn directly from a project and that this is the fastest way. I only partially agree, because this approach ignores the importance of basic knowledge. Without the basic terms and concepts in your head, you do not know how to state many problems properly, how to think about them, or how to search for answers. This seriously slows your progress and makes deeper research impossible.
Learning the basics of image processing does not mean simply grinding through books. If you do that, a handful of formulas and terms may be enough to leave you completely lost. Two approaches are recommended here, both of which start from practice and bring in theory along the way: OpenCV and MATLAB.
OpenCV is based on C++ and requires some programming background. It is highly portable, runs fast, and is well suited to real engineering projects, so it is widely used in companies. MATLAB can be picked up quickly with only very basic programming skills. It is easy to implement things in, the code is concise, and reference material is abundant. It is convenient for quickly trying out an algorithm and is well suited to academic research. Of course, it is better to use the two together. Each is briefly introduced below.
Learning image processing with MATLAB
I recommend Digital Image Processing (MATLAB version), published in April 2001 and translated in April 2005. You do not need to read it cover to cover. Use MATLAB to work through the basic principles, image transforms, morphological processing, and image segmentation. I strongly recommend typing in and running the code from the book yourself (the effect is completely different from just reading it). You can skim the other chapters. Note that this book emphasizes practice and does not explain much theory. For theory you do not understand, consult the companion book "Digital Image Processing (second edition)", which is best used as a reference work. It is enough to know where to look up a term when you need it later.
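To give a feel for two of the topics named above, here is a minimal sketch of threshold-based segmentation followed by morphological erosion. It is written in plain Python rather than MATLAB so that it runs without any toolbox. The 5x5 image, the threshold of 128, and the function names `threshold` and `erode` are all made up for illustration and are not taken from the book.

```python
# A tiny grayscale image as nested lists (values 0-255); illustrative data only.
image = [
    [10,  12,  11,  10, 12],
    [11, 200, 210, 205, 10],
    [12, 198, 220, 202, 11],
    [10, 201, 207, 199, 12],
    [11,  10,  12,  11, 10],
]

def threshold(img, t):
    """Segment the image: 1 where the pixel is brighter than t, else 0."""
    return [[1 if p > t else 0 for p in row] for row in img]

def erode(binary):
    """Binary erosion with a 3x3 square structuring element.

    A pixel stays 1 only if every pixel in its 3x3 neighbourhood is 1.
    Positions outside the image are treated as 0 (an assumption of this sketch).
    """
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            keep = 1
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < h and 0 <= nx < w
                    if not inside or binary[ny][nx] == 0:
                        keep = 0
            out[y][x] = keep
    return out

mask = threshold(image, 128)   # the bright 3x3 square becomes foreground
eroded = erode(mask)           # erosion shrinks it to its centre pixel

for row in eroded:
    print(row)
```

The bright square in the middle is separated from the dark background by the threshold, and erosion then peels off its one-pixel border. MATLAB and OpenCV provide built-in functions for both steps, but writing them out once by hand makes the definitions concrete.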
Learning image processing with OpenCV
OpenCV (Open Source Computer Vision Library) is an open-source, cross-platform computer vision library written mainly in C++. It contains more than 500 commonly used algorithms for image and video processing and computer vision.
To learn OpenCV, refer to Learning OpenCV or the OpenCV 2 Computer Vision Programming Manual. Both are mostly practical books with little theory. By following the steps in the books you can quickly get a sense of how powerful OpenCV is. When you want to implement some feature, you only need to know how to look up functions (check the documentation for your version at https://www.docs.opencv.org/) and you can call them easily. Since every example produces an intuitive visual output, learning this way is easier and more fun.
2. Advanced books
After this first stage, the beginner has learned the basics of image processing and can use OpenCV or MATLAB to implement simple functions. However, this knowledge is too thin and somewhat dated. There is still a great deal of newer knowledge in computer vision waiting for you.
Again there are two options, and of course reading both is better. One is Computer Vision: Algorithms and Applications, written by Richard Szeliski of the University of Washington, USA, in 2010. The other is Computer Vision: Models, Learning, and Inference, written by Simon J. D. Prince of the University of Toronto, Canada, and published in 2012. The two books have different emphases: the former focuses on vision and geometry, and the latter focuses on machine learning models. Of course, the two books also overlap in places. Although both have Chinese editions, if you have basic English reading ability I recommend reading the English originals (see the end of the article for how to obtain them). The figures and examples in these books are rich and easy to follow.
Computer Vision: Algorithms and Applications
This book covers many of the major directions in computer vision, with plenty of figures. With a foundation in digital image processing, you will already be familiar with some of its content, so it will not feel so intimidating. Compared with the earlier image processing books, it adds a lot of new material, such as feature detection and matching, structure from motion, dense motion estimation, image stitching, computational photography, stereo matching, and 3D reconstruction, all of which are very useful today. If you have time, read through the whole book. If not, pick the chapters that match your interests. The Chinese translation of this book is not very good, so read it alongside the English original.
Computer Vision: Models, Learning, and Inference
The book starts from basic probability models and covers the probabilistic models, regression and classification models, graphical models, and optimization methods commonly used in computer vision, as well as low-level image processing and multi-view geometry. It is richly illustrated and supplemented by many examples and applications, which makes it very suitable for beginners. On its homepage:
http://www.computervisionmodels.com/
you can download the e-book for free. The site also offers a wealth of learning resources, including slides for instructors and links to open-source projects, code, and datasets for each chapter.
In-depth practices
Once you have a reasonably broad understanding of computer vision, the next step is to pick a specific area that interests you and explore it in depth. This stage is about hands-on programming practice. When you run into questions, look up the relevant terms in the books and search on Google. That will solve most of your problems.
So which direction should you choose?
If your lab or company has real projects, it is best to pick a current project and go deeper into it. If you have no specific direction yet, read on.
I personally think computer vision can be divided into two broad areas: learning-based methods and geometry-based methods. The most popular learning-based method today is deep learning, and the most popular geometric method is visual SLAM. Below is a relatively easy way into each of the two directions.
1. Deep Learning
The concept of deep learning was proposed by Hinton and others in 2006. Its earliest and most successful application field was computer vision, and the classic convolutional neural network was developed for processing image data. Deep learning is now widely used in computer vision, speech recognition, natural language processing, intelligent recommendation, and other fields.
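To make the idea of a convolutional layer a little more concrete, here is a minimal sketch of the sliding-window operation at its core, in plain Python. The function name `conv2d_valid`, the small test image, and the hand-picked edge kernel are assumptions made for illustration. In a real network the kernel values are learned from data, and a framework would do this far more efficiently.

```python
def conv2d_valid(img, kernel):
    """Slide the kernel over the image without padding ("valid" mode) and
    sum the element-wise products at each position.

    Like most deep learning libraries, this does not flip the kernel, so
    strictly speaking it computes a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            out[y][x] = sum(
                img[y + i][x + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
    return out

# Illustrative image with a vertical edge: dark on the left, bright on the right.
img = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

# Hand-picked kernel that responds where brightness increases from left to right.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

response = conv2d_valid(img, kernel)
for row in response:
    print(row)
```

The response is zero over the flat dark region and positive wherever the window overlaps the edge, which is the sense in which a convolution kernel acts as a feature detector for image data.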
Deep learning requires some mathematical background, including calculus and linear algebra. Hearing these course names may bring back nightmares from college, but in fact only very basic concepts are used, so there is no need to worry. However, if you start with a textbook straight away, it can feel intimidating and it is easy to give up early.
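As a rough indication of how basic the calculus is, here is a minimal sketch of gradient descent on a one-parameter function, which is the core idea behind training a network. The function (w - 3)^2, the learning rate of 0.1, and the 100 steps are arbitrary choices for illustration and do not come from any particular course or book.

```python
def loss(w):
    """A toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def grad(w):
    """Derivative of the loss with respect to w: d/dw (w - 3)^2 = 2 (w - 3)."""
    return 2.0 * (w - 3.0)

w = 0.0    # initial guess
lr = 0.1   # learning rate (step size), chosen arbitrarily

for step in range(100):
    # Move a small step against the derivative, i.e. downhill on the loss.
    w -= lr * grad(w)

print(w, loss(w))
```

After the loop, w ends up very close to 3, where the loss is smallest. Training a deep network repeats this same update for millions of parameters at once, with the derivatives computed automatically by the framework.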
I think Andrew Ng's deep learning video course is very good introductory material. He is a professor at Stanford University, so he understands students well and can explain things clearly and accessibly, starting from the basic derivative. This is really rare.
The course can be watched for free on NetEase Cloud Classroom with Chinese subtitles, but without the accompanying exercises. You can also take it on Coursera, the online education platform founded by Andrew Ng, where it comes with exercises and is free for a limited time. After passing the course you receive a certificate.
This course is very popular, so you do not have to worry about getting stuck: there are countless study notes on the Internet for reference. It is a must for beginners.
2. Visual SLAM
SLAM stands for simultaneous localization and mapping (for details, see SLAM). Visual SLAM uses a camera as the main sensor and a video stream as input to perform SLAM. It is widely used in cutting-edge fields such as VR/AR, self-driving, intelligent robots, and drones.
The best introduction to visual SLAM is 14 Lectures on Visual SLAM: From Theory to Practice by Gao Xiang (PhD from Tsinghua University, postdoc at the Technical University of Munich). Every chapter of the book covers both basic theory and code examples. It explains things clearly and accessibly and emphasizes combining theory with practice, which greatly lowers the barrier to entry for beginners.
Now, you can start your computer vision learning journey!
Tip: some of the books mentioned in this article can be obtained by replying "Getting started" in the menu bar at the bottom of the "Computer Vision life" account.