Some time ago we needed to log in to a third-party website to collect data. The login was protected by a verification code (CAPTCHA), so we studied how verification codes can be recognized.
The main steps are as follows:
1. Image binarization
Binarization can be implemented in several ways:
1.1 On a grayscale image: apply median filtering, then remove the background
1.2 On a grayscale image: threshold on the grayscale value
1.3 On a color image: binarize based on the color range of the image
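As a rough illustration of 1.1 to 1.3, the sketch below assumes the image is a nested Python list (rows of gray values from 0 to 255, or rows of RGB tuples). The function names, the 3x3 window, and the threshold of 128 are our own assumptions for the example, not part of any library; a result of 1 marks a valid (character) point and 0 marks background.

```python
def median_filter(gray):
    """Replace each pixel with the median of its 3x3 neighborhood (clipped at edges)."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = sorted(
                gray[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            )
            row.append(window[len(window) // 2])
        out.append(row)
    return out


def binarize_gray(gray, threshold=128):
    """Return a 0/1 image: 1 where the pixel is darker than the threshold."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def binarize_color(rgb, low, high):
    """Return a 0/1 image: 1 where every channel lies inside [low, high]."""
    return [
        [1 if all(lo <= c <= hi for c, lo, hi in zip(pixel, low, high)) else 0
         for pixel in row]
        for row in rgb
    ]


gray = [[250, 30, 240],
        [20, 245, 25]]
print(binarize_gray(gray))  # [[0, 1, 0], [1, 0, 1]]
```

Which variant works best depends on the verification code: a fixed threshold is enough for dark text on a light background, while the color range variant helps when the characters are drawn in one color and the clutter in another.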
2. Noise removal
Noise can be removed in several ways:
2.1 Delete a point based on how many of its 8 surrounding points are valid
2.2 Delete a point based on its 4 orthogonal neighbors (up, down, left, right)
2.3 Delete a group of connected points based on the number of valid points it contains
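The three approaches can be sketched as follows, assuming a binarized image stored as a nested list where 1 is a valid point and 0 is background. The names `remove_noise` and `remove_small_components` and the default limits are hypothetical choices for this example.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]           # orthogonal neighbors
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]    # plus diagonals


def count_neighbors(img, y, x, offsets):
    """Count valid points among the given neighbor offsets, ignoring out-of-range ones."""
    h, w = len(img), len(img[0])
    return sum(
        img[y + dy][x + dx]
        for dy, dx in offsets
        if 0 <= y + dy < h and 0 <= x + dx < w
    )


def remove_noise(img, offsets=N8, min_neighbors=1):
    """Clear every valid point that has fewer than min_neighbors valid neighbors."""
    return [
        [1 if img[y][x] and count_neighbors(img, y, x, offsets) >= min_neighbors else 0
         for x in range(len(img[0]))]
        for y in range(len(img))
    ]


def remove_small_components(img, min_size=3):
    """Clear every 8-connected group that contains fewer than min_size valid points."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            stack, component = [(y, x)], []
            seen[y][x] = True
            while stack:  # flood fill from (y, x)
                cy, cx = stack.pop()
                component.append((cy, cx))
                for dy, dx in N8:
                    ny, nx = cy + dy, cx + dx
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(component) < min_size:
                for cy, cx in component:
                    out[cy][cx] = 0
    return out


img = [[1, 0, 0, 0],
       [0, 0, 1, 1],
       [0, 0, 1, 0]]
print(remove_noise(img))  # [[0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 0]]
```

Passing `offsets=N4` gives the 4-neighbor variant. The neighbor counts are taken from the original image rather than updated in place, so the result does not depend on the scan order.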
3. Interference line removal
Interference lines are generally 1 pixel wide and either vertical or horizontal.
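One possible way to exploit that property, shown here only as a sketch: look for long horizontal or vertical runs of valid points and clear the points of the run that have nothing directly beside them across the line. The function name `remove_thin_lines` and the minimum run length of 5 are assumptions for this example; the image is a nested list with 1 for valid points.

```python
def get(img, y, x):
    """Return the pixel at (y, x), or 0 when the position is outside the image."""
    if 0 <= y < len(img) and 0 <= x < len(img[0]):
        return img[y][x]
    return 0


def remove_thin_lines(img, min_len=5):
    """Clear 1-pixel-wide horizontal and vertical runs of at least min_len points."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    # Horizontal runs: a point is cleared only if nothing is set above or below it.
    for y in range(h):
        x = 0
        while x < w:
            if not img[y][x]:
                x += 1
                continue
            start = x
            while x < w and img[y][x]:
                x += 1
            if x - start >= min_len:
                for i in range(start, x):
                    if not get(img, y - 1, i) and not get(img, y + 1, i):
                        out[y][i] = 0
    # Vertical runs: a point is cleared only if nothing is set left or right of it.
    for x in range(w):
        y = 0
        while y < h:
            if not img[y][x]:
                y += 1
                continue
            start = y
            while y < h and img[y][x]:
                y += 1
            if y - start >= min_len:
                for j in range(start, y):
                    if not get(img, j, x - 1) and not get(img, j, x + 1):
                        out[j][x] = 0
    return out


img = [[0, 0, 1, 0, 0, 0],
       [1, 1, 1, 1, 1, 1],
       [0, 0, 1, 0, 0, 0]]
print(remove_thin_lines(img))
# [[0, 0, 1, 0, 0, 0], [0, 0, 1, 0, 0, 0], [0, 0, 1, 0, 0, 0]]
```

Points where the line crosses a character stroke are kept because they have a valid point above or below, which limits the damage to the characters themselves.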
4. Image repair: fill in points that were deleted by mistake, mainly in the following ways:
4.1 Fill a point based on the number of valid orthogonal neighbors
4.2 Fill a point based on whether it is part of a line
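Both repair rules can be sketched as follows, again on a nested list with 1 for valid points. The function names and the limit of 3 neighbors are assumptions; "part of a line" is interpreted here as a background point that sits between two valid points horizontally or vertically.

```python
def get(img, y, x):
    """Return the pixel at (y, x), or 0 when the position is outside the image."""
    if 0 <= y < len(img) and 0 <= x < len(img[0]):
        return img[y][x]
    return 0


def fill_by_neighbors(img, min_neighbors=3):
    """Set a background point when at least min_neighbors orthogonal neighbors are valid."""
    out = [row[:] for row in img]
    for y in range(len(img)):
        for x in range(len(img[0])):
            if img[y][x]:
                continue
            total = (get(img, y - 1, x) + get(img, y + 1, x)
                     + get(img, y, x - 1) + get(img, y, x + 1))
            if total >= min_neighbors:
                out[y][x] = 1
    return out


def fill_line_gaps(img):
    """Set a background point that lies between two valid points on the same row or column."""
    out = [row[:] for row in img]
    for y in range(len(img)):
        for x in range(len(img[0])):
            if img[y][x]:
                continue
            horizontal = get(img, y, x - 1) and get(img, y, x + 1)
            vertical = get(img, y - 1, x) and get(img, y + 1, x)
            if horizontal or vertical:
                out[y][x] = 1
    return out


print(fill_line_gaps([[1, 1, 0, 1, 1]]))  # [[1, 1, 1, 1, 1]]
```

The line rule is the more aggressive of the two, so it is worth checking on real samples that it does not close the gaps that separate neighboring characters.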
5. Image segmentation
Search for runs of consecutive points, group them into arrays, and then merge or split characters based on the segment length.
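A minimal sketch of this idea, assuming the characters are laid out left to right: find runs of consecutive columns that contain valid points, then split any run that is too wide to be a single character. The names `column_runs` and `split_wide_runs` and the `max_width` parameter are hypothetical; merging runs that are too narrow would follow the same pattern and is not shown.

```python
def column_runs(img):
    """Return (start, end) column ranges, end exclusive, that contain valid points."""
    w = len(img[0])
    has_ink = [any(row[x] for row in img) for x in range(w)]
    runs, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                    # a new run begins
        elif not ink and start is not None:
            runs.append((start, x))      # the run ended at the previous column
            start = None
    if start is not None:
        runs.append((start, w))          # run touching the right edge
    return runs


def split_wide_runs(runs, max_width):
    """Split every run wider than max_width into equal-sized parts."""
    result = []
    for start, end in runs:
        width = end - start
        parts = -(-width // max_width)   # ceiling division
        for k in range(parts):
            result.append((start + k * width // parts,
                           start + (k + 1) * width // parts))
    return result


img = [[0, 1, 1, 0, 1, 1, 1, 1, 0]]
runs = column_runs(img)
print(runs)                      # [(1, 3), (4, 8)]
print(split_wide_runs(runs, 2))  # [(1, 3), (4, 6), (6, 8)]
```

Each resulting range can then be cut out of the image as one character, for example with `[row[start:end] for row in img]`.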
6. Character image size normalization
Scale every character image to the same size.
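For example, a nearest-neighbor resize is enough for binary character images. This is only a sketch on nested lists; the function name `resize_nearest` is our own, and the target size is whatever the later comparison step expects.

```python
def resize_nearest(img, new_h, new_w):
    """Scale a nested-list image to new_h x new_w by nearest-neighbor sampling."""
    h, w = len(img), len(img[0])
    return [
        [img[y * h // new_h][x * w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


small = [[1, 0],
         [0, 1]]
print(resize_nearest(small, 4, 4))
# [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
```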
7. Other adjustments
Skeleton extraction (thinning), gradient computation, etc.
8. Feature extraction and comparison: search for feature points (for example, the coordinates of the valid points in an image, chosen according to the actual situation) and establish an image comparison scheme. You can use a neural network or Euclidean distance.
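The Euclidean distance variant can be sketched as follows. It assumes every character image has already been normalized to the same size, uses the raw pixel values as the feature vector, and compares against a small dictionary of labeled templates; the names `recognize` and `templates` are hypothetical, and the two 3x3 templates are toy data for the example.

```python
import math


def flatten(img):
    """Turn a nested-list image into a flat feature vector."""
    return [p for row in img for p in row]


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def recognize(img, templates):
    """Return the label of the template closest to img."""
    features = flatten(img)
    return min(templates,
               key=lambda label: euclidean(features, flatten(templates[label])))


templates = {
    "I": [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]],
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
}
sample = [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 1]]   # an "I" with one extra noise point
print(recognize(sample, templates))  # I
```

In practice the template dictionary would hold several samples per character, and the smallest distance can also be compared against a limit to reject images that match nothing well.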
9. Image learning
Build a font library.
10. Image recognition