Video Data Preprocessing



Video data preprocessing can be divided into three steps: shot segmentation, key frame extraction, and feature extraction.

1. Shot Segmentation (Shot Boundary Detection)

Shot segmentation is the first step in video processing and the basis for all subsequent video processing and analysis. Within a single shot, the variation of video features is mainly caused by two factors: the movement of objects or the camera, and changes in lighting. There are two main types of transition between shots: cut transitions and gradual transitions.

(1) Pixel Difference Method

First, define a pixel difference measure; then compute the inter-frame difference between two consecutive frames and compare it with a preset threshold. If the difference exceeds the threshold, a shot change is considered to have occurred.
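As a minimal sketch in Python (assuming OpenCV; the threshold of 30 gray levels is an illustrative value that must be tuned per video):

```python
import cv2

# Minimal sketch of pixel-difference shot boundary detection.
# The threshold (mean gray-level difference) is illustrative.
def detect_cuts_pixel_diff(video_path, threshold=30.0):
    cap = cv2.VideoCapture(video_path)
    boundaries, idx = [], 0
    ok, prev = cap.read()
    while ok:
        ok, frame = cap.read()
        if not ok:
            break
        idx += 1
        diff = cv2.absdiff(
            cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY),
            cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)).mean()
        if diff > threshold:
            boundaries.append(idx)  # cut between frames idx-1 and idx
        prev = frame
    cap.release()
    return boundaries
```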

(2) Histogram-Based Method

The histogram-based algorithm is the most common shot segmentation method; it is simple to compute and achieves good results on most videos. It quantizes the gray level, brightness, or color of each pixel in adjacent frames into N levels, counts the number of pixels at each level, and compares the resulting histograms. Because it only gathers statistics on the overall gray-level or color distribution, it tolerates object motion and slow camera motion; it tends to produce missed or false detections only when the shot content changes rapidly or during gradual camera transitions.
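A sketch of the same detector using histogram comparison, again assuming OpenCV; the 0.5 correlation threshold is an assumed value:

```python
import cv2

# Sketch of histogram-based shot boundary detection: a drop in
# histogram correlation between consecutive frames signals a cut.
def detect_cuts_histogram(video_path, bins=64, threshold=0.5):
    cap = cv2.VideoCapture(video_path)
    boundaries, idx, prev_hist = [], 0, None
    ok, frame = cap.read()
    while ok:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        hist = cv2.calcHist([gray], [0], None, [bins], [0, 256])
        cv2.normalize(hist, hist)
        if prev_hist is not None:
            # Correlation near 1 means similar frames.
            sim = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL)
            if sim < threshold:
                boundaries.append(idx)
        prev_hist = hist
        ok, frame = cap.read()
        idx += 1
    cap.release()
    return boundaries
```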

(3) Block Matching Method

The block matching method first divides each frame into small blocks and judges the similarity between consecutive frames by comparing the corresponding blocks. Because it uses local image features, this method suppresses noise and the influence of camera and object motion.
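A sketch of per-block comparison; the block size and both thresholds are assumed values:

```python
import cv2
import numpy as np

# Sketch: compare frames block by block and report the fraction of
# blocks whose mean gray-level difference exceeds a threshold.
def block_difference(frame_a, frame_b, block=16, block_thresh=20.0):
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY).astype(np.float32)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY).astype(np.float32)
    h, w = gray_a.shape
    changed, total = 0, 0
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            a = gray_a[y:y + block, x:x + block]
            b = gray_b[y:y + block, x:x + block]
            if np.abs(a - b).mean() > block_thresh:
                changed += 1
            total += 1
    return changed / total  # near 1.0 suggests a shot change
```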

(4) Motion-Based Method

The motion-based algorithm explicitly accounts for the motion of objects and the camera within a shot, using motion compensation and related methods to reduce the intra-shot frame differences that such motion causes.

(5) Contour-Based Method

The contour-based algorithm works well on simple videos and is especially effective for detecting gradual transitions. However, the main objects or backgrounds in most videos have many complicated, subtle, or constantly changing outlines, which interfere with the judgment of shot boundaries and cause false detections. Conversely, when the light is dim and contours are indistinct (for example at dusk or in fog), contours are hard to detect and boundaries may be missed.
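One widely used contour-based cue (not necessarily the exact variant the text describes) is the edge change ratio between consecutive frames; a sketch assuming OpenCV, with illustrative Canny thresholds and dilation radius:

```python
import cv2
import numpy as np

# Sketch of the edge change ratio: the fraction of edge pixels that
# enter or exit between two frames. High values suggest a boundary.
def edge_change_ratio(frame_a, frame_b, dilate_iter=2):
    e_a = cv2.Canny(cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY), 100, 200)
    e_b = cv2.Canny(cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY), 100, 200)
    kernel = np.ones((3, 3), np.uint8)
    d_a = cv2.dilate(e_a, kernel, iterations=dilate_iter)
    d_b = cv2.dilate(e_b, kernel, iterations=dilate_iter)
    # Edge pixels of one frame not covered by (dilated) edges of the other.
    entering = np.logical_and(e_b > 0, d_a == 0).sum()
    exiting = np.logical_and(e_a > 0, d_b == 0).sum()
    n_a, n_b = max((e_a > 0).sum(), 1), max((e_b > 0).sum(), 1)
    return max(entering / n_b, exiting / n_a)
```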


2. Key Frame Extraction

2.1 Meaning of Key Frames

A key frame is one or more of the most important and representative images in a shot. Depending on the complexity of the shot content, one or more key frames can be extracted per shot. Selected key frames should contain the main information of the shot while remaining simple enough to process.

2.2 Typical Key Frame Extraction Techniques

2.2.1 First/Last Frame Method and Middle Frame Method

The first/last frame method uses the first and last images of a shot as key frames, and the middle frame method selects the temporally central image as the key frame. The disadvantage of both is that they fix the number of key frames per shot and therefore cannot always represent the shot content accurately.
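Both rules are trivial to express in code; a minimal sketch, assuming shot boundaries are given as frame indices:

```python
# Key frame indices for a shot given its boundary frame indices
# (start inclusive, end exclusive).
def first_last_keyframes(start, end):
    return [start, end - 1]          # first/last frame method

def middle_keyframe(start, end):
    return [(start + end - 1) // 2]  # middle frame method
```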

2.2.2 Color, Texture, and Shape Features

(1) Color Feature Extraction

Color is a major physical feature of an image, and few distinct objects share the same color features. Color features include color histograms, dominant colors, and average brightness. Color-based retrieval relies mainly on the color histogram, which represents the frequency distribution of colors in an image; it is essentially a statistical description of color distribution. To describe the key frame changes of a shot, two content descriptors can be introduced: the dominant color histogram and the spatial structure histogram. A dominant color is one that occupies a relatively large proportion of an image; the dominant color histogram captures the colors that persist longest, which are typically the colors of the objects or backgrounds of interest in the clip. The spatial structure histogram is a set of features describing the spatial information of an image; it reflects the average brightness of the image on each axis of the color space. Built on motion detection, this adaptive key frame structure comprehensively represents the content changes of a shot.

Simply put, the current frame is compared with the most recent key frame; if enough features have changed, the current frame becomes a new key frame. Different shots therefore yield different numbers of key frames.
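A sketch of this sequential selection using a hue/saturation histogram as the feature, assuming OpenCV; the 0.7 correlation threshold is illustrative:

```python
import cv2

# Sketch: a frame becomes a new key frame when its color histogram
# differs enough from that of the last key frame.
def select_keyframes(frames, threshold=0.7):
    keyframes, last_hist = [], None
    for i, frame in enumerate(frames):
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1], None, [32, 32],
                            [0, 180, 0, 256])
        cv2.normalize(hist, hist)
        if last_hist is None or \
           cv2.compareHist(last_hist, hist, cv2.HISTCMP_CORREL) < threshold:
            keyframes.append(i)
            last_hist = hist
    return keyframes
```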

(2) Texture Feature Extraction

Texture is an irregular but macroscopically regular pattern in an image. Texture features include granularity, directionality, and contrast. They are commonly extracted with a gray-level co-occurrence matrix: if the gray levels of an image are quantized to N levels, the co-occurrence matrix is N x N and can be written M_delta(i, j). From it, four statistics indicative of texture are selected as a feature vector: contrast, uniformity (texture consistency), correlation (pixel gray-level correlation), and entropy. Extracting these four features in four directions (0°, 45°, 90°, and 135°) yields a 16-dimensional feature vector.
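A sketch of these 16-dimensional GLCM features, assuming scikit-image; entropy is not a built-in graycoprops property, so it is computed by hand, and the quantization to 8 levels is an assumed choice:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

# Sketch of GLCM texture features over four directions.
def glcm_features(gray_image, levels=8):
    # Quantize to `levels` gray levels to keep the matrix small.
    img = (gray_image.astype(np.float64) / 256 * levels).astype(np.uint8)
    angles = [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]  # 0,45,90,135 deg
    glcm = graycomatrix(img, distances=[1], angles=angles,
                        levels=levels, symmetric=True, normed=True)
    feats = []
    for a in range(len(angles)):
        p = glcm[:, :, 0, a]
        contrast = graycoprops(glcm, 'contrast')[0, a]
        uniformity = graycoprops(glcm, 'ASM')[0, a]
        correlation = graycoprops(glcm, 'correlation')[0, a]
        entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
        feats.extend([contrast, uniformity, correlation, entropy])
    return np.array(feats)  # 16-dimensional vector
```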

(3) Shape Feature Extraction

Contour shape is a major feature of an image, and shape feature extraction relies on edge detection. Edge-based shape description mainly uses moments, which describe shape compactly and are fast to compute. In this shape feature extraction algorithm, the moments of each quantized color's region are calculated; compared with image-segmentation-based methods, this is both more robust and simpler. Among the moments, the zeroth- and first-order moments are selected as the spatial features of the image.
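A sketch of the zeroth- and first-order moments of one quantized color's region, assuming OpenCV; producing the binary mask for each quantized color is left to a separate, hypothetical step:

```python
import cv2
import numpy as np

# Sketch: zeroth- and first-order moments of the region covered by one
# quantized color (given as a binary mask): its area and centroid.
def color_region_moments(color_mask):
    m = cv2.moments(color_mask, binaryImage=True)
    area = m['m00']                            # zeroth-order moment
    if area == 0:
        return np.zeros(3)
    cx, cy = m['m10'] / area, m['m01'] / area  # first-order: centroid
    return np.array([area, cx, cy])
```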

2.2.3 Motion Analysis

Significant motion information produced by camera motion is also an important cue for key frame extraction. For example, if the focal length of the camera changes (a zoom), the first and last frames are taken as key frames; if the camera angle changes, the current frame becomes a key frame once its overlap with the previous key frame falls below 30%.

2.2.4 Clustering-Based Method

For a large set of frames, a clustering algorithm is used to group similar frames, and key frames are then extracted per cluster, which greatly reduces the computational workload. This method is computationally efficient and effectively captures the salient visual content of shots with significant changes: a small number of key frames is extracted for low-activity shots, and a larger number for high-activity ones.
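A sketch of the clustering approach, assuming OpenCV and scikit-learn; the number of clusters and the histogram resolution are assumed parameters:

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

# Sketch: cluster frame histograms with k-means and take the frame
# nearest each cluster center as that cluster's key frame.
def keyframes_by_clustering(frames, n_clusters=4):
    hists = []
    for frame in frames:
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        h = cv2.calcHist([hsv], [0, 1], None, [16, 16], [0, 180, 0, 256])
        cv2.normalize(h, h)
        hists.append(h.flatten())
    X = np.array(hists)
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(X)
    keyframes = []
    for c in range(n_clusters):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(X[members] - km.cluster_centers_[c], axis=1)
        keyframes.append(int(members[dists.argmin()]))
    return sorted(keyframes)
```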

 

3. Video Feature Extraction

The basic features of a video can be classified into static and dynamic features.

3.1 Static Features

Static features are mainly the image features of key frames, and the feature extraction methods for key frames are the same as those for ordinary still images. Static features include color features, texture features, and shape features.

3.1.1 Color Features

(1) Advantages of color features: color features have many advantages, including simple computation and stable properties; they are insensitive to changes such as rotation, translation, and scale transformation, and are therefore very robust.

(2) Color space: colors are usually defined in a three-dimensional color space, such as RGB (red, green, blue) or HSV (hue, saturation, value), also called HSB (hue, saturation, brightness). The most commonly used color spaces are RGB, HSV, LUV, and YCrCb. The structure of RGB space does not match human subjective judgment of color similarity, whereas HSV space is closer to people's subjective understanding of color. RGB can be converted to HSV (in Matlab, rgb2hsv can be used directly).
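In Python, the OpenCV counterpart of Matlab's rgb2hsv is cv2.cvtColor; note that OpenCV reads images in BGR channel order and scales hue to [0, 179] for 8-bit images. The file name below is a placeholder:

```python
import cv2

# Convert a BGR image (OpenCV's default channel order) to HSV.
bgr = cv2.imread('frame.png')  # placeholder input file
hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(hsv)       # hue in [0, 179] for 8-bit images
```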

 

(3) Color histogram: the main representation of image color information is the color histogram, which is invariant to scale and rotation. In a color histogram, the x axis indexes the colors present in the image, and the y axis gives the number of pixels of each color. A color histogram describes the proportions of different colors in the entire image, regardless of the spatial distribution of those colors.

Distance measurement: the distance between two colors can be measured in different ways. In HSV space, for example, treating (h, s, v) as cylindrical coordinates gives, for colors c1 = (h1, s1, v1) and c2 = (h2, s2, v2), a measure of the form

d(c1, c2) = sqrt( (v1 - v2)^2 + (s1 cos h1 - s2 cos h2)^2 + (s1 sin h1 - s2 sin h2)^2 )

This similarity measure is equivalent to the Euclidean distance in a cylindrical color space.

(4) Color moments: any color distribution in an image can be expressed by its moments, and the distribution information is concentrated mainly in the lower-order moments. Compared with color histograms, this method does not require quantizing the feature space.
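A sketch of color moments as commonly defined (mean, standard deviation, and skewness per channel); the choice of exactly three moments is an assumption, since the text only says "lower-order moments":

```python
import numpy as np

# Sketch of color moments: the first three moments of each channel's
# pixel distribution, giving a 9-dimensional vector for 3 channels.
def color_moments(image):
    feats = []
    for c in range(image.shape[2]):
        ch = image[:, :, c].astype(np.float64).ravel()
        mean = ch.mean()
        std = ch.std()
        skew = np.cbrt(((ch - mean) ** 3).mean())  # signed cube root
        feats.extend([mean, std, skew])
    return np.array(feats)
```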

(5) Color set: an approximation of the color histogram. This method expresses the image as a binary color index set in order to support fast search in large-scale image libraries.

(6) Color aggregation vector (color coherence vector): an evolution of the color histogram. Its core idea is to divide the pixels belonging to each histogram bin into two parts, coherent (aggregated) pixels and incoherent (non-aggregated) pixels, so that the descriptor also carries spatial information about the color distribution.

3.1.2 Texture Features

Texture carries important information about the structure of object surfaces and their relationship to the surrounding environment. Texture features include coarseness, contrast, directionality, line-likeness, regularity, and roughness.

Common texture analysis and classification methods:

(1) Wavelet transform: a wavelet transform decomposes a signal over a series of basis functions ψ_mn(x). Each level of decomposition produces four sub-bands, referred to by frequency content as LL, LH, HL, and HH. Two kinds of wavelet transform are commonly used for texture analysis: the pyramid-structured wavelet transform (PWT) and the tree-structured wavelet transform (TWT).
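A PWT-style sketch, assuming PyWavelets; the wavelet family and decomposition depth are illustrative choices, and sub-band energy is one common texture signature:

```python
import numpy as np
import pywt

# Sketch of pyramid-structured wavelet texture features: repeatedly
# decompose the LL band and collect the energy of each sub-band.
def pwt_texture_features(gray_image, wavelet='db1', levels=3):
    feats = []
    ll = gray_image.astype(np.float64)
    for _ in range(levels):
        ll, (lh, hl, hh) = pywt.dwt2(ll, wavelet)
        # Mean energy of each detail sub-band at this level.
        for band in (lh, hl, hh):
            feats.append(np.mean(band ** 2))
    feats.append(np.mean(ll ** 2))  # final approximation (LL) band
    return np.array(feats)
```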

(2) Co-occurrence matrix: first build a co-occurrence matrix based on the direction and distance between pixels, then extract meaningful statistics from the matrix as texture features (see the GLCM sketch in Section 2.2.2).

3.1.3 Shape Features

A shape can be defined as the surface-structure profile of an object; it is what distinguishes an object region from its surroundings. Shape features can be expressed in two ways: contour features, which use only the outer boundary of the object, and region features, which use the entire shape area. The most typical descriptors of the two kinds are the Fourier descriptor and shape-invariant moments.
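For the moment-based branch, a sketch using Hu's seven moment invariants via OpenCV; this is one standard choice of shape-invariant moments, since the text does not name a specific set:

```python
import cv2
import numpy as np

# Sketch of region-based shape features: Hu's seven moment invariants,
# which are invariant to translation, scale, and rotation.
def hu_shape_features(binary_mask):
    m = cv2.moments(binary_mask, binaryImage=True)
    hu = cv2.HuMoments(m).flatten()
    # Log-scale the values, which span many orders of magnitude.
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)
```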

3.1.4 Spatial Relationship Features

The locations of objects in an image and the spatial relationships between them are also very important features for image retrieval. Spatial relationship features fall into two categories: one approach automatically segments the image into the objects or color regions it contains and then indexes the image by those regions; the other divides the image evenly into sub-blocks and extracts features from, and indexes, each sub-block. A sketch of the second approach follows.
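The grid size and the per-block feature (mean color) below are assumed stand-ins:

```python
import numpy as np

# Sketch of the sub-block approach: split the image into an n x n grid
# and compute a simple feature (mean color) for each block.
def grid_block_features(image, grid=4):
    h, w = image.shape[:2]
    feats = []
    for i in range(grid):
        for j in range(grid):
            block = image[i * h // grid:(i + 1) * h // grid,
                          j * w // grid:(j + 1) * w // grid]
            feats.append(block.reshape(-1, image.shape[2]).mean(axis=0))
    return np.array(feats)  # grid*grid rows of per-block mean color
```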

3.2 Dynamic Features

Dynamic features are unique to video data. They include global motion (camera motion, such as panning, zooming, and tracking) and local motion (the movement of objects in the scene: motion trajectories, relative speeds, changes in relative position between objects, and so on). Dynamic features are important because the image features of representative frames alone cannot describe the motion changes of a video sequence.

(1) Global Motion

Global motion mainly comprises camera translation, rotation, and zooming, and can be characterized by a parametric model of camera motion. To estimate the model parameters, first select enough observation points in adjacent frames, then use a matching algorithm to find the observed motion vectors of those points, and finally fit the model parameters to those vectors.
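A sketch of this estimate-by-matching pipeline, assuming OpenCV: track corner points with pyramidal Lucas-Kanade, then fit a similarity model (translation, rotation, uniform scale) with RANSAC. The corner detector settings are illustrative:

```python
import cv2

# Sketch of global (camera) motion estimation between two frames.
def estimate_global_motion(frame_a, frame_b):
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
    pts_a = cv2.goodFeaturesToTrack(gray_a, maxCorners=200,
                                    qualityLevel=0.01, minDistance=8)
    pts_b, status, _ = cv2.calcOpticalFlowPyrLK(gray_a, gray_b,
                                                pts_a, None)
    good = status.ravel() == 1
    # 2x3 matrix encoding translation, rotation, and uniform scale.
    model, _ = cv2.estimateAffinePartial2D(pts_a[good], pts_b[good],
                                           method=cv2.RANSAC)
    return model
```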

(2) Local Motion

The key local motion feature extraction technique is motion vector extraction based on the optical flow field (using the temporal variation and correlation of gray-level data in a motion image sequence to determine the motion of image pixels). Note that optical flow is the apparent motion inferred from pixel gray levels and is not necessarily equal to the true motion vector.

Optical flow constraint equation: the basic idea is to treat image intensity as a function of position and time and, from the principle of gray-level conservation, establish the optical flow constraint equation I_x u + I_y v + I_t = 0 (where I_x, I_y, I_t are the partial derivatives of intensity and (u, v) is the flow vector); motion vectors are then computed by solving this equation.

Horn-Schunck optical flow computation: because each pixel has two unknowns (u, v) but the constraint supplies only one equation, the optical flow equation alone is an ill-posed problem. Horn and Schunck argued that the optical flow field produced by a single moving object should be continuous and smooth, that is, adjacent points on the same object have similar velocities, so the flow projected onto the image should also vary smoothly. They therefore added a global smoothness constraint to the flow field, converting the computation of the optical flow field into an optimization problem.
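OpenCV ships no built-in Horn-Schunck implementation; as a stand-in, the sketch below computes a dense optical flow field with Farneback's method, a different dense algorithm that likewise yields a per-pixel (u, v) vector. The parameter values are OpenCV's commonly used defaults:

```python
import cv2
import numpy as np

# Sketch of dense optical flow between two frames (Farneback method).
def dense_optical_flow(frame_a, frame_b):
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(
        gray_a, gray_b, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    magnitude = np.linalg.norm(flow, axis=2)
    return flow, magnitude  # per-pixel (u, v) and speed
```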
