Object recognition based on color features: extracting and recognizing objects of different colors
(The two pictures above were kindly provided by the director of a university robotics laboratory and Robot Sky editor Liu Weichao.)
With the development of computer science and automatic control technology, more and more kinds of intelligent robots are appearing in production and daily life. The vision system, as an important subsystem of an intelligent robot, is attracting growing attention.
A vision system is a very complex system: it must acquire images accurately, respond to external changes in real time, and track moving targets in real time. It therefore places high demands on both hardware and software. The currently popular soccer robot is a typical example of a vision system built for rapid recognition and response.
Machine vision means using a computer to realize the function of human vision, that is, using a computer to recognize the objective three-dimensional world. In the human visual system, perception begins at the retina, which is a three-dimensional sampling system: the visible part of a three-dimensional object is projected onto the retina, and a person understands the object from that projected image (its shape, size, distance from the observation point, texture, and motion characteristics).
The input device of a machine vision system can be a camera, a drum scanner, and so on; they all take a three-dimensional scene as the input source, so what enters the computer is a two-dimensional projection of the three-dimensional world. If the mapping from the three-dimensional world to the two-dimensional projected image is regarded as a forward transformation, then what the machine vision system must perform is the inverse transformation: reconstructing the three-dimensional world from its two-dimensional projections.
A machine vision system consists of three main parts: image acquisition, image processing and analysis, and output or display. Image acquisition converts the visual image and intrinsic characteristics of the measured object into a stream of data that the computer can process; it involves illumination, image formation through focusing, and conversion of the image into the camera's output signal. Visual information is then handled mainly by image-processing methods, including image enhancement, data encoding and transmission, smoothing, edge sharpening, segmentation, feature extraction, and image recognition and understanding. After this processing the quality of the output image is greatly improved, which both enhances its visual appearance and makes it easier for the computer to analyze, process, and recognize.
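Two of the processing steps mentioned above, smoothing and edge sharpening, can be sketched in a few lines of plain Python. This is an illustrative toy (grayscale image as a list of lists; function names are my own), not a production implementation:

```python
# Minimal sketch of two pipeline steps: 3x3 box smoothing and
# Laplacian-style edge sharpening on a grayscale image stored as
# a list of lists of 0-255 integers. Illustrative names only.

def smooth(img):
    """3x3 box filter: each interior pixel becomes the mean of its neighborhood."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(img[y + dy][x + dx]
                            for dy in (-1, 0, 1)
                            for dx in (-1, 0, 1)) // 9
    return out

def sharpen(img):
    """Add the (negative) Laplacian to each interior pixel to emphasize edges."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (4 * img[y][x] - img[y - 1][x] - img[y + 1][x]
                   - img[y][x - 1] - img[y][x + 1])
            out[y][x] = max(0, min(255, img[y][x] + lap))
    return out
```

On a uniform region both operations leave pixels unchanged; near an intensity step, `smooth` reduces noise while `sharpen` exaggerates the step.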
A robot vision system mainly uses color, shape, and other information to identify targets in the environment. Take color recognition as an example: when the camera captures a color image, the embedded computer on the robot digitizes the analog video signal and divides the pixels into two groups by color: pixels of interest (the target color being searched for) and pixels of no interest (background colors). The pixels of interest are then matched against their RGB color components. To reduce the influence of ambient light intensity, the RGB color space can be transformed into the HSV color space.
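The RGB-to-HSV transformation mentioned above is available in Python's standard library. The sketch below classifies a pixel as "target" mainly by hue, so that brightness changes matter less; the hue window and saturation/value floors are illustrative values, not from the original system:

```python
import colorsys

# Sketch: convert an RGB pixel to HSV so a target color can be matched
# by hue, largely independent of brightness. The hue window below
# (~green) and the s/v floors are illustrative assumptions.

def is_target_color(r, g, b, hue_center=0.33, hue_tol=0.05,
                    s_min=0.4, v_min=0.2):
    """True if the pixel's hue lies near hue_center and the pixel is
    saturated and bright enough to classify reliably."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return abs(h - hue_center) < hue_tol and s >= s_min and v >= v_min
```

Note that a bright green and a dark green share the same hue, which is exactly why hue-based matching is less sensitive to lighting than raw RGB thresholds.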
In the color vision system of a soccer robot, the program identifies which team a robot belongs to and its player number from the color tags on top of the robot car: each car carries two colors, a team mark and a player mark. The first step of recognition is therefore to classify every pixel in the image into one of a set of discrete color classes.
Common color-classification methods include the linear color threshold method, the nearest-neighbor method, and the threshold vector method.
In the linear color threshold method, the color space is partitioned by linear planes; the thresholds can be set directly or obtained by automatic training on the target color range, and self-learning methods such as neural networks or multi-parameter decision trees can be used to find suitable thresholds. When segmenting an image with the nearest-neighbor method, a membership function is used: a pixel is assigned to the class for which its degree of membership is greatest. The threshold vector method uses a predetermined set of threshold vectors to delimit each color's region of the color space.
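A minimal sketch of the nearest-neighbor idea: each discrete color class is represented by a prototype point in (U, V) space, and a pixel goes to the class whose prototype is closest. The prototype coordinates and class names below are invented for illustration:

```python
import math

# Nearest-neighbor color classification sketch. Each class has one
# prototype in (U, V) space; a pixel is assigned to the closest one.
# Prototype values and class names are illustrative assumptions.

PROTOTYPES = {
    "blue_team":   (200, 110),
    "yellow_team": (50, 150),
    "orange_ball": (90, 190),
    "background":  (128, 128),
}

def classify(u, v):
    """Return the name of the prototype nearest in Euclidean distance."""
    return min(PROTOTYPES, key=lambda name: math.dist((u, v), PROTOTYPES[name]))
```

In practice one prototype per class is often too coarse; trained systems keep several samples per class or fit the membership function mentioned above.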
After color classification, the points of each color class must be processed to determine the position and direction angle of each player and of the ball on the pitch. A common practice is to scan the classified pixels once, connecting adjacent pixels of the same color into color blocks.
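Once same-color pixels have been merged into blocks, position and direction angle can be computed from the two marks on each car: position as the centroid of the team mark, heading as the direction from the team mark's centroid to the player mark's centroid. The blob representation (a list of (x, y) pixel coordinates) is an assumption for illustration:

```python
import math

# Sketch: derive a robot's position and heading from two color blobs.
# A blob is assumed to be a list of (x, y) pixel coordinates.

def centroid(pixels):
    """Mean position of a blob's pixels."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)

def heading_deg(team_blob, player_blob):
    """Angle in degrees from the team mark's centroid to the player mark's."""
    (tx, ty), (px, py) = centroid(team_blob), centroid(player_blob)
    return math.degrees(math.atan2(py - ty, px - tx))
```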
Color recognition based on threshold vector
1. Color space selection
To recognize targets by color image segmentation, an appropriate color space must first be chosen; commonly used ones include RGB, YUV, HSV, and CMY. The choice of color space directly affects the quality of image segmentation and target recognition.
RGB -- the most commonly used color space, in which brightness is embedded in all three of the R, G, and B components. RGB is a non-uniform color space: the perceptual difference between two colors is not linearly proportional to the Euclidean distance between the corresponding points, and the R, G, and B values are highly correlated. For the same color attribute, the RGB values are widely scattered under different conditions (light-source type, intensity, and object reflectance), so for a particular color it is difficult to determine a threshold and its distribution within the color space. Color spaces in which the luminance component can be separated out are therefore usually preferred, the most common being YUV and HSV.
HSV -- close to the way the human eye perceives color. H is hue, S is saturation, and V is value (brightness). Hue reflects the color type accurately and is relatively insensitive to external illumination, but H and S are nonlinear transformations of R, G, and B with a singular point; near that point, even a small change in the R, G, B values causes a large jump in the transformed values.
YUV -- a luminance-chrominance color space obtained by a linear transformation of RGB, originally devised to make color television compatible with monochrome sets. Y denotes luminance, and U and V denote chrominance (color difference). The importance of the YUV representation is that the luminance signal (Y) and the chrominance signals (U, V) are mutually independent. The color-difference signals are the differences between the three base color signals (R, G, B) and the luminance signal.
The YUV format is related to RGB as follows (the standard BT.601 coefficients; the original formula image is missing and is reconstructed here):
Y = 0.299 R + 0.587 G + 0.114 B
U = 0.492 (B - Y)
V = 0.877 (R - Y)
2. Threshold determination and color judgment
To determine the thresholds, training is first performed on acquired samples, yielding upper and lower thresholds for each predetermined color's components in YUV space, as shown in Figure 2.
When a pixel's position in the color space falls inside the box defined by these thresholds, the pixel is considered to belong to the color being searched for, completing the color recognition. Because the Y value represents brightness and varies greatly, only the U and V values are considered; for color judgment, threshold vectors are established separately for U and V.
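The threshold-vector judgment described above reduces to a box test in the (U, V) plane. In this sketch the per-color bounds are invented stand-ins for values that would come from the training step:

```python
# Sketch of threshold-vector color judgment: each color class has
# lower/upper bounds on U and V (Y is ignored because brightness
# varies), and a pixel belongs to a class when its (U, V) falls
# inside that box. Bound values are illustrative assumptions,
# as if obtained from training samples.

THRESHOLDS = {
    "orange_ball": {"u": (60, 110), "v": (170, 220)},
    "blue_mark":   {"u": (170, 230), "v": (90, 130)},
}

def match_color(u, v):
    """Return the first color class whose U/V box contains the pixel, else None."""
    for name, box in THRESHOLDS.items():
        u_lo, u_hi = box["u"]
        v_lo, v_hi = box["v"]
        if u_lo <= u <= u_hi and v_lo <= v <= v_hi:
            return name
    return None
```

Because each test is just four comparisons, this check is cheap enough to run on every pixel at frame rate, which is why the threshold vector method suits real-time systems like soccer robots.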
After color recognition, image segmentation is performed using a seed-filling algorithm, which checks pixel colors as it fills. Rather than processing every pixel from the start, the image is first divided into blocks; when the center point of a block has the color to be recognized, that point is used as a seed from which the fill spreads outward, checking the color of surrounding pixels until the whole block is filled.
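The seed-filling step can be sketched as a breadth-first flood fill over a grid of color-class labels (the grid-of-labels representation is an assumption for illustration):

```python
from collections import deque

# Sketch of the seed-filling step: starting from a seed pixel,
# spread to 4-connected neighbors carrying the same color-class
# label until the connected block is exhausted. `grid` is a 2-D
# list of labels, indexed as grid[y][x] (an assumed representation).

def seed_fill(grid, seed_x, seed_y):
    """Return the set of (x, y) positions connected to the seed
    and sharing its color-class label."""
    target = grid[seed_y][seed_x]
    h, w = len(grid), len(grid[0])
    blob, queue = {(seed_x, seed_y)}, deque([(seed_x, seed_y)])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < w and 0 <= ny < h
                    and (nx, ny) not in blob and grid[ny][nx] == target):
                blob.add((nx, ny))
                queue.append((nx, ny))
    return blob
```

The returned pixel set is exactly the kind of blob from which a centroid and direction angle can then be computed.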