Machine Learning in Practice: A License Plate Recognition System

Source: Internet
Author: User
Tags: virtual environment, virtualenv

In this tutorial, I'll walk you through developing a license plate recognition (LPR) system in Python using machine learning.

What we're going to do

A license plate recognition system uses optical character recognition (OCR) technology to read the characters on a license plate. In other words, it takes a vehicle image as input and outputs the characters on the plate. If you are an undercover agent or a detective, you can imagine how valuable this would be for your work: from a photo of a vehicle you could extract almost all the information you need about the car.

What is the relationship between machine learning and license plate recognition?

In fact, developing a license plate recognition system does not necessarily require machine learning. You could, for example, use non-machine-learning techniques such as template matching and feature extraction. However, machine learning lets us improve the accuracy of the recognition system through training. We will use machine learning for character recognition, that is, mapping the image of a character to the actual character it represents, such as A, B, and so on.

Does this tutorial suit me?

If you want to build your own Jarvis that speaks with Morgan Freeman's voice like in the movies, this tutorial is for you. Well, that is a big exaggeration. In fact, this tutorial simply shows you how to use image processing and machine learning to solve a real-life problem. You will pick up some concepts from Python, image processing, and machine learning along the way. I will explain these concepts as clearly as possible, and you can study them further to understand them better. If you want to practice right away, I recommend the online Python machine learning environment on the Huitong network.

Let's start now.

LPR is sometimes referred to as automatic license plate recognition (ALPR). It consists of three processing stages:

License plate detection: This is the first and probably the most important stage. Its task is to determine the location of the license plate; the input is the vehicle image, and the output is an image of the license plate region.
Character segmentation: The task of this stage is to split the characters in the license plate region image into separate images.
Character recognition: The task of this stage is to recognize each previously segmented character image as a specific character. This is the stage where we will use machine learning.

Enough theory, can we start coding now?

Of course. Let's set up the working environment first. You need to create a virtual environment, which simplifies dependency and package management for the project. You can use the virtualenv package to create it:

# Install virtualenv if you don't have the package already
pip install virtualenv
mkdir license-plate-recognition
cd license-plate-recognition
virtualenv lpr
source lpr/bin/activate

Now there should be a folder called lpr in your project directory.

Then we'll install the first package scikit-image. This is a Python package for image processing. To install it, simply run the following command:

pip install scikit-image

Key dependencies of this package include scipy (scientific computing), numpy (multidimensional array operations), and matplotlib (plotting graphics and displaying images). Another important package is Pillow, a Python imaging library.

License Plate Detection (plate localization)

This is the first stage; its goal is to determine the location of the license plate in the vehicle image. To do this, you first need to read the image file and convert it to a grayscale image, in which the value of each pixel is between 0 and 255. Then we convert it to a binary image, in which each pixel is either black or white, with only two possible values.

Run the following code to display two images: one grayscale, one black and white:

from skimage.io import imread
from skimage.filters import threshold_otsu
import matplotlib.pyplot as plt

car_image = imread("car.jpg", as_grey=True)
# it should be a 2 dimensional array
print(car_image.shape)

# the next line is not compulsory, however a grey scale pixel
# in skimage ranges between 0 & 1. multiplying it with 255
# will make it range between 0 & 255 (something we can relate better with)
gray_car_image = car_image * 255
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.imshow(gray_car_image, cmap="gray")
threshold_value = threshold_otsu(gray_car_image)
binary_car_image = gray_car_image > threshold_value
ax2.imshow(binary_car_image, cmap="gray")
plt.show()

Run result: the grayscale image and the binary image are displayed side by side.

We use the connected component analysis (CCA) algorithm to identify all connected regions in the image. You can also try other methods such as edge detection and morphological processing. CCA helps us group and label connected regions in the foreground: two pixels are considered connected if they have the same value and are adjacent to each other.

from skimage import measure
from skimage.measure import regionprops
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import localization

# this gets all the connected regions and groups them together
label_image = measure.label(localization.binary_car_image)
fig, (ax1) = plt.subplots(1)
ax1.imshow(localization.gray_car_image, cmap="gray")

# regionprops creates a list of properties of all the labelled regions
for region in regionprops(label_image):
    if region.area < 50:
        # if the region is so small then it's likely not a license plate
        # (the exact area threshold was garbled in this copy; 50 is an assumed value)
        continue

    # the bounding box coordinates
    min_row, min_col, max_row, max_col = region.bbox
    rect_border = patches.Rectangle((min_col, min_row), max_col - min_col, max_row - min_row,
                                    edgecolor="red", linewidth=2, fill=False)
    ax1.add_patch(rect_border)
    # let's draw a red rectangle over those regions

plt.show()

We need to import the previous module (localization) to access the values computed in it. The measure.label method labels all connected regions in the binary image. Calling regionprops on the labelled image returns a list of all connected regions along with their properties, such as area, bounding box, and label. We then use the patches.Rectangle method to draw a rectangle around each labelled region.

From the resulting image, we can see that some connected regions that do not contain the license plate are also circled. To eliminate these regions, we filter using some typical characteristics of a license plate: the license plate is rectangular; its width is greater than its height; the width of the license plate region is between 15% and 40% of the width of the whole image; and the height of the license plate region is between 8% and 20% of the height of the whole image.

If these characteristics do not match the license plate you are dealing with, you should adjust these features without hesitation or mercy.

The code is as follows:

from skimage import measure
from skimage.measure import regionprops
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import localization

# this gets all the connected regions and groups them together
label_image = measure.label(localization.binary_car_image)

# getting the maximum width, height and minimum width and height that a license plate can be
plate_dimensions = (0.08*label_image.shape[0], 0.2*label_image.shape[0],
                    0.15*label_image.shape[1], 0.4*label_image.shape[1])
min_height, max_height, min_width, max_width = plate_dimensions
plate_objects_cordinates = []
plate_like_objects = []
fig, (ax1) = plt.subplots(1)
ax1.imshow(localization.gray_car_image, cmap="gray")

# regionprops creates a list of properties of all the labelled regions
for region in regionprops(label_image):
    if region.area < 50:
        # if the region is so small then it's likely not a license plate
        # (the exact area threshold was garbled in this copy; 50 is an assumed value)
        continue

    # the bounding box coordinates
    min_row, min_col, max_row, max_col = region.bbox
    region_height = max_row - min_row
    region_width = max_col - min_col

    # ensuring that the region identified satisfies the condition of a typical license plate
    if (region_height >= min_height and region_height <= max_height
            and region_width >= min_width and region_width <= max_width
            and region_width > region_height):
        plate_like_objects.append(localization.binary_car_image[min_row:max_row, min_col:max_col])
        plate_objects_cordinates.append((min_row, min_col, max_row, max_col))

        rect_border = patches.Rectangle((min_col, min_row), max_col - min_col, max_row - min_row,
                                        edgecolor="red", linewidth=2, fill=False)
        ax1.add_patch(rect_border)
        # let's draw a red rectangle over those regions

plt.show()

In the above code, regions that are unlikely to be a license plate are removed according to the given plate characteristics. However, there may still be regions (such as headlamps) that look very much like a license plate and get labelled as one. To eliminate these, we can use a vertical projection: accumulate all the pixel values in each column. Because of the character images in the license plate region, we expect the plate to produce high column sums; a sketch of the idea follows below.
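The final code shared by the author performs this validation automatically; the snippet below is only a minimal sketch of the idea. It assumes the plate-filtering script above is saved as cca2.py (the module name used later in this tutorial), and the helper name column_sums is mine, introduced just for illustration.

import numpy as np
import cca2

def column_sums(candidate):
    # accumulate the foreground (white) pixels in every column of a binary region
    return np.sum(candidate, axis=0)

# a region whose column sums show several tall peaks separated by near-zero
# valleys (the gaps between characters) is more likely to be the actual plate
# than, say, a headlamp, whose projection is comparatively flat
for candidate in cca2.plate_like_objects:
    sums = column_sums(candidate)
    print(sums.max(), sums.mean())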

Character Segmentation

At this stage, we will extract all the character images from the license plate. We continue to use connected component analysis (CCA).

import numpy as np
from skimage.transform import resize
from skimage import measure
from skimage.measure import regionprops
import matplotlib.patches as patches
import matplotlib.pyplot as plt
import cca2

# on the image I'm using, the headlamps were categorized as a license plate
# because their shapes were similar
# for now I'll just use plate_like_objects[2] since I know that's the
# license plate. we'll fix this later

# the invert is done so as to convert the black pixels to white pixels and vice versa
license_plate = np.invert(cca2.plate_like_objects[2])

labelled_plate = measure.label(license_plate)

fig, ax1 = plt.subplots(1)
ax1.imshow(license_plate, cmap="gray")

# the next two lines are based on the assumption that the width of
# a license plate character should be between 5% and 15% of the license plate,
# and the height should be between 35% and 60%
# this will eliminate some non-character regions
character_dimensions = (0.35*license_plate.shape[0], 0.60*license_plate.shape[0],
                        0.05*license_plate.shape[1], 0.15*license_plate.shape[1])
min_height, max_height, min_width, max_width = character_dimensions

characters = []
counter = 0
column_list = []
for regions in regionprops(labelled_plate):
    y0, x0, y1, x1 = regions.bbox
    region_height = y1 - y0
    region_width = x1 - x0

    if region_height > min_height and region_height < max_height \
            and region_width > min_width and region_width < max_width:
        roi = license_plate[y0:y1, x0:x1]

        # draw a red bordered rectangle over the character
        rect_border = patches.Rectangle((x0, y0), x1 - x0, y1 - y0, edgecolor="red",
                                        linewidth=2, fill=False)
        ax1.add_patch(rect_border)

        # resize the characters to 20x20 and then append each character into the characters list
        resized_char = resize(roi, (20, 20))
        characters.append(resized_char)

        # this is just to keep track of the arrangement of the characters
        column_list.append(x0)

plt.show()

The list plate_like_objects contains all the candidate plate regions found in the vehicle image. In the sample image I used, three regions were selected as license plate candidates. To save time in this tutorial, I manually picked the candidate that really contains the license plate. The final code shared below includes a plate validation step that automatically excludes regions that do not actually contain a license plate.

Next we run connected component analysis on the license plate and resize each character image to 20px by 20px. We do this because the character size matters for the recognition stage that follows.

To keep track of the order of the characters on the plate, the column_list variable records the x-coordinate of each character region. Sorting by these coordinates later restores the left-to-right order of the characters, as the toy example below shows.
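Here is a toy illustration of how column_list restores the order; the characters and coordinates below are made up, not taken from the sample image:

characters_found = ['B', 'A', 'C']   # hypothetical order in which CCA returned the characters
column_list = [80, 40, 120]          # left x-coordinate of each character region

# sorting by the x-coordinate recovers the left-to-right order on the plate
ordered = [char for _, char in sorted(zip(column_list, characters_found))]
print(''.join(ordered))              # prints ABC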

Character Recognition

This is the last stage of license plate recognition, and it is where we introduce machine learning. Machine learning can be simply defined as a branch of artificial intelligence (AI) that processes data in order to find patterns that can be used for prediction. Machine learning can be divided into supervised learning, unsupervised learning, and reinforcement learning. Supervised learning uses labelled datasets (called training datasets) to make predictions. We will use supervised learning because we already know what letters such as A and B look like. Supervised learning can be divided into two categories: classification and regression. Character recognition is a classification problem.

What we need to do now is: get a training dataset, choose a supervised learning classifier, train a model, test the accuracy of the model, and use the model to make predictions.

Let's start training the model. I have two different datasets, one with 10px by 20px characters and the other with 20px by 20px characters. We will use the 20px by 20px dataset, because that is the size each character was resized to in the previous stage. Each letter except O and I has 10 different images (Nigerian license plates do not use these two letters because they resemble 0 and 1).
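Based on how the training script below reads the images, the dataset is expected to be laid out roughly as follows; the folder name train and the file naming pattern are taken from that script, and the tree shown here is only an assumed sketch:

train/
    0/
        0_0.jpg
        0_1.jpg
        ...
        0_9.jpg
    A/
        A_0.jpg
        ...
    Z/
        Z_0.jpg
        ...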

You can try different classifiers; each has its advantages and disadvantages. For this task we will use a support vector classifier (SVC), because it performed best here. That is not to say that SVC is the best classifier in general.

First we need to install the scikit-learn package:

pip install scikit-learn

The code for this section is as follows:

import os
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.externals import joblib
from skimage.io import imread
from skimage.filters import threshold_otsu

letters = [
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K',
    'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V',
    'W', 'X', 'Y', 'Z'
]

def read_training_data(training_directory):
    image_data = []
    target_data = []
    for each_letter in letters:
        for each in range(10):
            image_path = os.path.join(training_directory, each_letter, each_letter + '_' + str(each) + '.jpg')
            # read each image of each character
            img_details = imread(image_path, as_grey=True)
            # converts each character image to a binary image
            binary_image = img_details < threshold_otsu(img_details)
            # the 2D array of each image is flattened because the machine learning
            # classifier requires that each sample is a 1D array
            # therefore the 20*20 image becomes 1*400
            # in machine learning terms that's 400 features with each pixel
            # representing a feature
            flat_bin_image = binary_image.reshape(-1)
            image_data.append(flat_bin_image)
            target_data.append(each_letter)

    return (np.array(image_data), np.array(target_data))

def cross_validation(model, num_of_fold, train_data, train_label):
    # this uses the concept of cross validation to measure the accuracy
    # of a model, the num_of_fold determines the type of validation
    # e.g if num_of_fold is 4, then we are performing a 4-fold cross validation
    # it will divide the dataset into 4 and use 1/4 of it for testing
    # and the remaining 3/4 for the training
    accuracy_result = cross_val_score(model, train_data, train_label, cv=num_of_fold)
    print("Cross Validation Result for ", str(num_of_fold), " -fold")
    print(accuracy_result * 100)

current_dir = os.path.dirname(os.path.realpath(__file__))
training_dataset_dir = os.path.join(current_dir, 'train')

image_data, target_data = read_training_data(training_dataset_dir)

# the kernel can be 'linear', 'poly' or 'rbf'
# the probability was set to True so as to show
# how sure the model is of its prediction
svc_model = SVC(kernel='linear', probability=True)

cross_validation(svc_model, 4, image_data, target_data)

# let's train the model with all the input data
svc_model.fit(image_data, target_data)

# we will use the joblib module to persist the model
# into files. this means that the next time we need to
# predict, we don't need to train the model again
save_directory = os.path.join(current_dir, 'models/svc/')
if not os.path.exists(save_directory):
    os.makedirs(save_directory)
joblib.dump(svc_model, save_directory + '/svc.pkl')

In the above code, we train the SVC model using every character image in the training dataset. We use 4-fold cross-validation to estimate the accuracy of the model and then save the model to a file for later predictions.
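If you want to compare the SVC with another classifier, as suggested earlier, a minimal sketch using scikit-learn's cross_val_score might look like the following. It assumes the image_data and target_data arrays produced by read_training_data above; the choice of a random forest here is just an example, not a recommendation from the original author:

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# assuming image_data and target_data come from read_training_data above
rf_model = RandomForestClassifier(n_estimators=100)
rf_scores = cross_val_score(rf_model, image_data, target_data, cv=4)
print("Random forest 4-fold accuracy:", rf_scores.mean() * 100)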

Now that we have a good model, we can try to predict the character images we split before:

import os
import segmentation
from sklearn.externals import joblib

# load the model
current_dir = os.path.dirname(os.path.realpath(__file__))
model_dir = os.path.join(current_dir, 'models/svc/svc.pkl')
model = joblib.load(model_dir)

classification_result = []
for each_character in segmentation.characters:
    # converts it to a 1D array
    each_character = each_character.reshape(1, -1)
    result = model.predict(each_character)
    classification_result.append(result)

print(classification_result)

plate_string = ''
for each_predict in classification_result:
    plate_string += each_predict[0]

print(plate_string)

# it's possible the characters are wrongly arranged
# since that's a possibility, the column_list will be
# used to sort the letters in the right order
column_list_copy = segmentation.column_list[:]
segmentation.column_list.sort()
rightplate_string = ''
for each in segmentation.column_list:
    rightplate_string += plate_string[column_list_copy.index(each)]

print(rightplate_string)
Attention

The most important thing for this system to work properly is to ensure that the input vehicle image is clear. In addition, the image should not be too large; a width of around 600px is enough.
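If your input photos are much larger than that, you could downscale them before the detection stage. The following is a minimal sketch using skimage's resize; the 600px cap is just the rule of thumb mentioned above:

from skimage.io import imread
from skimage.transform import resize

car_image = imread("car.jpg", as_grey=True)
height, width = car_image.shape
if width > 600:
    # keep the aspect ratio while capping the width at roughly 600px
    new_height = int(height * 600 / width)
    car_image = resize(car_image, (new_height, 600))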

If you like this article, remember to follow me: the brain in the new vat.

For the complete code, please see the original article linked below.

Original: Developing a License Plate Recognition System with Machine Learning in Python
