Detecting Text in Natural Scenes with the Stroke Width Transform (SWT)


Introduction:

Application background: Detecting text in natural scenes is an important step in many computer vision applications, such as assistive systems for the blind and machine navigation in urban environments. Recovered text provides contextual clues for many vision tasks, and the performance of image retrieval algorithms depends heavily on the underlying text detection module.

Motivation: Traditional OCR is designed for scanned text, so it relies on cleanly separating the text pixels from the background. That is easy for scanned pages, but in natural images color noise, blur, and occlusion make the separation difficult.

The paper's method: Text has a nearly constant stroke width, and this property can be used to recover it from the background. First the stroke width transform of the image is computed, that is, each pixel is assigned a stroke width; then pixels with similar widths are grouped into characters and words using flexible geometric reasoning. The similarity requirement is not strict: the stroke width may vary within a certain range.

Pros:

1. It does not extract a separate feature for each pixel (color, gradient, and so on); it uses a feature of a group of pixels instead.
2. Instead of sliding a window over a multiscale pyramid, it merges pixels with similar stroke widths into connected components in a bottom-up way.
3. It uses no language-specific filtering mechanism, so it can be used for multi-language text detection.

Previous work:

1. Texture-based

These methods scan the image at multiple scales and classify pixels using text features such as high edge density, low gradients above and below the text, high variance of intensity, and the distribution of wavelet or discrete cosine transform coefficients. Their disadvantages are high computational cost and a lack of precision.

2. Region-based

Pixels with similar properties, such as the same color, are grouped into connected components. Geometric or texture information is then used to exclude components that cannot be text. The advantage is that text is detected at all scales at the same time and detection is not restricted to horizontal text.

Flowchart of the algorithm:

The process first computes the Canny edges of the image, then computes the SWT of the image from the gradient direction at the edges, groups pixels into connected components according to their stroke width, filters the components using geometric reasoning (aspect ratio of the component, variance, mean, and median of its stroke widths, and so on), chains the remaining components into text lines, and finally splits the text lines into words. The core of the process is the SWT and the filtering of connected components. The steps are described in detail below, followed by MATLAB and C++ code.

1. Stroke Width Transform

The input is the original color image, and the output is an image with a stroke width assigned to each pixel.

First Pass:

Two points in this procedure need attention:

1. If a pixel is crossed by several rays, it is assigned the minimum ray length, which is taken as its stroke width.
2. There is a problem at the corners of a stroke, as on the right side of the figure above. A corner pixel is crossed by one ray running top to bottom and another running left to right. If the top-to-bottom ray has length 50 and the left-to-right ray has length 40, the pixel is assigned a stroke width of 40, which clearly does not match the real stroke. A second pass is needed to correct this.
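The first pass can be sketched in C++ roughly as follows. This is only a minimal illustration under stated assumptions, not the code from the repository: `EdgeImage`, `swtFirstPass`, and `maxSteps` are names made up here, the edge map and per-pixel ray directions are assumed to be computed already (with each direction pointing into the stroke), and the tolerance of pi/2 for "roughly opposite" simply mirrors the MATLAB listing later in this article.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical container for this sketch: row-major maps of one image.
struct EdgeImage {
    int rows = 0, cols = 0;
    std::vector<unsigned char> edge;  // 1 where an edge pixel was detected
    std::vector<double> theta;        // ray direction (radians) at edge pixels
    int at(int r, int c) const { return r * cols + c; }
};

// First pass: cast a ray from every edge pixel, stop at the next edge pixel,
// and accept the ray only if that pixel's direction is roughly opposite.
// Every pixel on an accepted ray keeps the minimum ray length seen so far.
std::vector<double> swtFirstPass(const EdgeImage& img, int maxSteps) {
    assert(img.edge.size() == static_cast<std::size_t>(img.rows) * img.cols);
    assert(img.theta.size() == img.edge.size());
    const double kPi = std::acos(-1.0);
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> swt(img.edge.size(), inf);

    for (int r0 = 0; r0 < img.rows; ++r0) {
        for (int c0 = 0; c0 < img.cols; ++c0) {
            if (!img.edge[img.at(r0, c0)]) continue;
            const double t0 = img.theta[img.at(r0, c0)];
            std::vector<int> ray{img.at(r0, c0)};
            for (int step = 1; step < maxSteps; ++step) {
                const int c = static_cast<int>(std::lround(c0 + std::cos(t0) * step));
                const int r = static_cast<int>(std::lround(r0 + std::sin(t0) * step));
                if (r < 0 || c < 0 || r >= img.rows || c >= img.cols) break;
                ray.push_back(img.at(r, c));
                if (!img.edge[img.at(r, c)]) continue;
                // Reached another edge pixel: compare the two directions.
                const double diff =
                    std::fabs(std::fabs(t0 - img.theta[img.at(r, c)]) - kPi);
                if (diff < kPi / 2) {
                    const double width = std::hypot(double(r - r0), double(c - c0));
                    for (int idx : ray) swt[idx] = std::min(swt[idx], width);
                }
                break;  // the ray ends at the first edge pixel either way
            }
        }
    }
    return swt;
}
```

Because each pixel keeps the minimum over all rays that cross it, a corner pixel crossed by rays of length 50 and 40 ends up with 40, which is exactly the case the second pass is meant to repair.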

Second Pass:
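For a single ray, the correction made in this pass can be sketched as below: take the median of the values along the ray and lower every larger value to it. The helper names `medianOf` and `clampRayToMedian` are invented for this illustration, and the ray is assumed to be given as a list of indices into a row-major SWT map.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Median of a non-empty list (mean of the two middle values for even sizes).
double medianOf(std::vector<double> values) {
    assert(!values.empty());
    std::sort(values.begin(), values.end());
    const std::size_t mid = values.size() / 2;
    return values.size() % 2 ? values[mid]
                             : 0.5 * (values[mid - 1] + values[mid]);
}

// `ray` holds the indices of the pixels one ray crosses in the map `swt`.
// Values above the ray median are lowered to the median; smaller values
// are left untouched.
void clampRayToMedian(std::vector<double>& swt, const std::vector<int>& ray) {
    if (ray.empty()) return;
    std::vector<double> values;
    values.reserve(ray.size());
    for (int idx : ray) values.push_back(swt[idx]);
    const double median = medianOf(values);
    for (int idx : ray) swt[idx] = std::min(swt[idx], median);
}
```

With ray values 40, 40, 50, 40, 40 the median is 40, so the single outlier of 50 is pulled down to 40 while the other pixels are unchanged.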

In the second pass, for every ray found in the first pass, the median stroke width along the ray is computed, and every pixel on the ray whose value is above the median is assigned the median.

2. Grouping pixels into connected components
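The grouping rule of this step, connecting neighboring pixels whose stroke width ratio does not exceed 3.0, might be sketched as a breadth-first labeling like the one below. The function name `labelBySwtRatio`, the use of 8-connectivity, and the convention that a non-finite or non-positive value means "no stroke" are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <queue>
#include <vector>

// Labels connected components of a row-major SWT map. Two 8-connected
// neighbours join the same component when both have a valid stroke width
// and the larger width is at most `maxRatio` times the smaller one.
// Returns one label per pixel: 0 for "no stroke", 1..N for components.
std::vector<int> labelBySwtRatio(const std::vector<double>& swt, int rows,
                                 int cols, double maxRatio = 3.0) {
    assert(swt.size() == static_cast<std::size_t>(rows) * cols);
    auto valid = [&](int i) { return std::isfinite(swt[i]) && swt[i] > 0.0; };
    std::vector<int> labels(swt.size(), 0);
    int next = 0;
    for (int start = 0; start < rows * cols; ++start) {
        if (!valid(start) || labels[start] != 0) continue;
        labels[start] = ++next;
        std::queue<int> frontier;
        frontier.push(start);
        while (!frontier.empty()) {
            const int cur = frontier.front();
            frontier.pop();
            const int r = cur / cols, c = cur % cols;
            for (int dr = -1; dr <= 1; ++dr) {
                for (int dc = -1; dc <= 1; ++dc) {
                    const int nr = r + dr, nc = c + dc;
                    if ((dr == 0 && dc == 0) || nr < 0 || nc < 0 ||
                        nr >= rows || nc >= cols) continue;
                    const int nb = nr * cols + nc;
                    if (!valid(nb) || labels[nb] != 0) continue;
                    const double hi = std::max(swt[cur], swt[nb]);
                    const double lo = std::min(swt[cur], swt[nb]);
                    if (hi <= maxRatio * lo) {  // stroke widths are similar
                        labels[nb] = next;
                        frontier.push(nb);
                    }
                }
            }
        }
    }
    return labels;
}
```

On a single row with stroke widths 2, 3, 10, 0, 4, 5 this gives three components: {2, 3}, {10} (since 10 / 3 exceeds 3.0), and {4, 5}, with the 0 left unlabeled.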

The first step gives a stroke width for each pixel, but these are still individual pixels, so they have to be merged into regions according to some rule (a bottom-up process). This is done mainly by changing the connection condition of traditional connected component analysis. Traditionally, a pixel and its 4-connected or 8-connected neighbors belong to the same component when they have the same value. Here the condition is changed: a pixel and a neighboring pixel are connected when the ratio of their stroke widths (SW) does not exceed 3.0.

3. Connected component filtering
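As a rough sketch, the filtering rules listed in this step can be collected into one predicate. `ComponentStats` and `isCharacterCandidate` are hypothetical names, the statistics are assumed to be computed elsewhere, and conditions 2) to 4) are treated as reasons to reject a component (with rule 4 read as "outside the range 0.1 to 10"). Rule 1) is left out because its bounds are incomplete in the text.

```cpp
#include <cassert>

// Hypothetical per-component statistics; the fields follow the quantities
// defined in this step.
struct ComponentStats {
    double varianceSW;        // variance of the stroke widths
    double meanSW;            // mean stroke width
    double medianSW;          // median stroke width
    double diameter;          // diameter of the component
    double width;             // bounding box width
    double height;            // bounding box height
    int containedComponents;  // components inside its bounding box
};

// Returns true when the component survives rules 2) to 5).
bool isCharacterCandidate(const ComponentStats& s) {
    assert(s.meanSW > 0 && s.medianSW > 0 && s.height > 0);
    if (s.varianceSW / s.meanSW > 0.5) return false;    // rule 2
    if (s.diameter / s.medianSW >= 10.0) return false;  // rule 3
    const double aspectRatio = s.width / s.height;
    if (aspectRatio < 0.1 || aspectRatio > 10.0) return false;  // rule 4
    if (s.containedComponents > 2) return false;        // rule 5
    return true;
}
```

A compact letter-like component with a steady stroke passes, while a long thin line is rejected by rule 3) because its diameter is many times its stroke width.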

Many of the connected components obtained in the second step are obviously not characters, so they have to be filtered using prior knowledge, mainly according to the rules below. A component is discarded when any of conditions 2) to 4) holds. The quantities used in the rules are:

VarianceSW: variance of the stroke widths in the connected component.
MeanSW: mean stroke width of the connected component.
MedianSW: median stroke width of the connected component.
AspectRatio: aspect ratio of the connected component.
Diameter: diameter of the connected component.
Width: width of the connected component.
Height: height of the connected component.

1). 10

2). VarianceSW / MeanSW > 0.5

3). Diameter / MedianSW >= 10

4). AspectRatio < 0.1 || AspectRatio > 10

5). The bounding box of a connected component must not contain more than two connected components.

4. Grouping characters into text lines
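The pairing test used in this step could look roughly like the sketch below. `Candidate` and `canPair` are made-up names. The second rule is read here as a limit on the horizontal gap between two characters, which is an interpretation, and the color rule is reduced to a gray-level difference with an arbitrary threshold `maxGrayDiff`, which is purely a placeholder.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical description of one candidate character.
struct Candidate {
    double medianSW;  // median stroke width
    double left;      // x coordinate of the bounding box's left side
    double width;     // bounding box width
    double gray;      // mean intensity, standing in for colour
};

// Returns true when two candidates may belong to the same text line.
bool canPair(const Candidate& a, const Candidate& b, double maxGrayDiff = 40.0) {
    assert(a.medianSW > 0 && b.medianSW > 0);
    // Rule 1: median stroke widths differ by a factor of at most 2.0.
    const double swRatio =
        std::max(a.medianSW, b.medianSW) / std::min(a.medianSW, b.medianSW);
    if (swRatio > 2.0) return false;
    // Rule 2 (as interpreted here): the gap is at most 3 times the width
    // of the wider candidate.
    const Candidate& first = a.left <= b.left ? a : b;
    const Candidate& second = a.left <= b.left ? b : a;
    const double gap = std::max(0.0, second.left - (first.left + first.width));
    if (gap > 3.0 * std::max(a.width, b.width)) return false;
    // Rule 3: similar colour (placeholder threshold).
    return std::fabs(a.gray - b.gray) <= maxGrayDiff;
}
```

Pairs that pass this test can then be chained into longer sequences to form a text line.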

As in the third step, what we have so far are candidate characters (connected components); they now have to be grouped into words and text lines. Two candidate characters are matched according to the following rules:

1). The median stroke width ratio of two candidate characters is not more than 2.0.

2). The distance between the two characters is not more than 3 times the width of the wider character.

3). The colors of the two characters are consistent.

5. Splitting text lines into words

A histogram of the horizontal distances between adjacent characters in a text line is used to segment the line into words (..)
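The text gives no detail on how the histogram is segmented, so the following is only one possible reading and not the paper's procedure: sort the gaps between adjacent characters, cut at the widest jump to get a threshold, and start a new word wherever a gap exceeds it. Both `estimateGapThreshold` and `splitIntoWords` are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Picks a threshold between "inside a word" gaps and "between words" gaps by
// sorting the gaps and cutting at the widest jump between neighbours.
// This stands in for the histogram analysis and is an assumption.
double estimateGapThreshold(std::vector<double> gaps) {
    assert(gaps.size() >= 2);
    std::sort(gaps.begin(), gaps.end());
    std::size_t best = 1;
    for (std::size_t i = 2; i < gaps.size(); ++i) {
        if (gaps[i] - gaps[i - 1] > gaps[best] - gaps[best - 1]) best = i;
    }
    return 0.5 * (gaps[best - 1] + gaps[best]);
}

// gaps[i] is the horizontal distance between character i and character i+1.
// Returns the word index of each of the gaps.size() + 1 characters.
std::vector<int> splitIntoWords(const std::vector<double>& gaps,
                                double threshold) {
    std::vector<int> word(gaps.size() + 1, 0);
    for (std::size_t i = 0; i < gaps.size(); ++i) {
        word[i + 1] = word[i] + (gaps[i] > threshold ? 1 : 0);
    }
    return word;
}
```

For gaps of 2, 3, 12, 2, 3, 11, 2 pixels the widest jump in the sorted list lies between 3 and 11, so the threshold becomes 7 and the eight characters fall into three words.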

Paper Address: http://www.math.tau.ac.il/~turkel/imagepapers/text_detection.pdf

The core parts of the C++ and MATLAB code are shown in the next part.


I have been busy reading for work lately and put the paper aside, so the code I promised earlier was delayed. Here it is now, core parts only. I did not write any of it myself; I collected it from downloads online. The MATLAB code does not perform very well.

Download address for all files: http://pan.baidu.com/s/1qWwNMfM

1. C++ code

Download Address:

The OpenCV and Boost libraries need to be installed first.

Boost library download address: http://www.boost.org/users/download/

Boost installation: http://www.cnblogs.com/pangxiaodong/archive/2011/05/05/2037006.html

To install the Boost library, I simply copied its files into the include folder of the Visual Studio installation directory.

GitHub repository: https://github.com/aperrau/detecttext

2. MATLAB code

function [swtMap] = swt(im, searchDirection)
%swt Performs Stroke Width Transform on input image
%   A novel image operator that seeks to find the value of stroke width
%   for each image pixel. Its use was meant for the task of text
%   detection in natural images.
%
%   im = RGB input image of size m x n x 3
%   searchDirection = gradient direction is either 1 to detect dark text on
%   light background or -1 to detect light text on dark background.
%
%   swtMap = resulting mapping of stroke widths for image pixels

% Convert image to gray scale
im = im2double(rgb2gray(im));
%figure, imshow(im), title('Black and White Image');

% Find edges using canny edge detector
edgeMap = edge(im, 'canny');
%figure, imshow(edgeMap), title('Edges Using Canny');

% Get all edge pixel positions
[edgePointRows, edgePointCols] = find(edgeMap);

% Find horizontal and vertical gradient
sobelMask = fspecial('sobel');
dx = imfilter(im, sobelMask);
dy = imfilter(im, sobelMask');
%figure, imshow(dx, []), title('Horizontal Gradient Image');
%figure, imshow(dy, []), title('Vertical Gradient Image');

% Initializing matrix of gradient direction
theta = zeros(size(edgeMap, 1), size(edgeMap, 2));

% Calculating theta, gradient direction, for each pixel on the image.
% ***This can be optimized by using edgePointCols and edgePointRows
% instead.***
for i = 1:size(edgeMap, 1)
    for j = 1:size(edgeMap, 2)
        if edgeMap(i, j) == 1
            theta(i, j) = atan2(dy(i, j), dx(i, j));
        end
    end
end

% Getting size of the image
[m, n] = size(edgeMap);

% Initializing stroke width array with infinity
swtMap = inf(m, n);

% Set the maximum stroke width. This number is fixed for now but should be
% made more dynamic in the future.
maxStrokeWidth = 350;

% Initialize container for all stroke points found
strokePointsX = zeros(size(edgePointCols));
strokePointsY = zeros(size(strokePointsX));
sizeOfStrokePoints = 0;

% Iterate through all edge points and compute stroke widths
for i = 1:numel(edgePointRows)
    step = 1;
    initialX = edgePointRows(i);
    initialY = edgePointCols(i);
    isStroke = 0;
    initialTheta = theta(initialX, initialY);
    sizeOfRay = 0;
    pointOfRayX = zeros(maxStrokeWidth, 1);
    pointOfRayY = zeros(maxStrokeWidth, 1);

    % Record first point of the ray
    pointOfRayX(sizeOfRay+1) = initialX;
    pointOfRayY(sizeOfRay+1) = initialY;

    % Increase the size of the ray
    sizeOfRay = sizeOfRay + 1;

    % Follow the ray
    while step < maxStrokeWidth
        nextX = round(initialX + cos(initialTheta) * searchDirection * step);
        nextY = round(initialY + sin(initialTheta) * searchDirection * step);

        step = step + 1;

        % Break loop if out of bounds. For some reason this is really slow.
        if nextX < 1 || nextY < 1 || nextX > m || nextY > n
            break
        end

        % Record next point of the ray
        pointOfRayX(sizeOfRay+1) = nextX;
        pointOfRayY(sizeOfRay+1) = nextY;

        % Increase size of the ray
        sizeOfRay = sizeOfRay + 1;

        % Another edge pixel has been found
        if edgeMap(nextX, nextY)
            oppositeTheta = theta(nextX, nextY);

            % Gradient direction roughly opposite
            % (the paper uses a stricter tolerance of pi/6)
            if abs(abs(initialTheta - oppositeTheta) - pi) < pi/2
                isStroke = 1;
                strokePointsX(sizeOfStrokePoints+1) = initialX;
                strokePointsY(sizeOfStrokePoints+1) = initialY;
                sizeOfStrokePoints = sizeOfStrokePoints + 1;
            end

            break
        end
    end

    % Edge pixel is part of a stroke
    if isStroke
        % Calculate stroke width
        strokeWidth = sqrt((nextX - initialX)^2 + (nextY - initialY)^2);

        % Iterate all ray points and populate with the minimum stroke width
        for j = 1:sizeOfRay
            swtMap(pointOfRayX(j), pointOfRayY(j)) = min(swtMap(pointOfRayX(j), pointOfRayY(j)), strokeWidth);
        end
    end
end
%figure, imshow(swtMap, []), title('Stroke Width Transform: First Pass');

% Iterate through all stroke points for a refinement pass. Refer to figure
% 4b in the paper.
for i = 1:sizeOfStrokePoints
    step = 1;
    initialX = strokePointsX(i);
    initialY = strokePointsY(i);
    initialTheta = theta(initialX, initialY);
    sizeOfRay = 0;
    pointOfRayX = zeros(maxStrokeWidth, 1);
    pointOfRayY = zeros(maxStrokeWidth, 1);
    swtValues = zeros(maxStrokeWidth, 1);
    sizeOfSwtValues = 0;

    % Record first point of the ray
    pointOfRayX(sizeOfRay+1) = initialX;
    pointOfRayY(sizeOfRay+1) = initialY;

    % Increase the size of the ray
    sizeOfRay = sizeOfRay + 1;

    % Record the SWT value of the first stroke point
    swtValues(sizeOfSwtValues+1) = swtMap(initialX, initialY);
    sizeOfSwtValues = sizeOfSwtValues + 1;

    % Follow the ray. No bounds check is needed here: the first pass
    % already confirmed that this ray reaches an edge pixel inside the image.
    while step < maxStrokeWidth
        nextX = round(initialX + cos(initialTheta) * searchDirection * step);
        nextY = round(initialY + sin(initialTheta) * searchDirection * step);

        step = step + 1;

        % Record next point of the ray
        pointOfRayX(sizeOfRay+1) = nextX;
        pointOfRayY(sizeOfRay+1) = nextY;

        % Increase size of the ray
        sizeOfRay = sizeOfRay + 1;

        % Record the SWT value of the next stroke point
        swtValues(sizeOfSwtValues+1) = swtMap(nextX, nextY);
        sizeOfSwtValues = sizeOfSwtValues + 1;

        % Another edge pixel has been found
        if edgeMap(nextX, nextY)
            break
        end
    end

    % Calculate stroke width as the median value of all swtValues seen
    strokeWidth = median(swtValues(1:sizeOfSwtValues));

    % Iterate all ray points and populate with the minimum stroke width
    for j = 1:sizeOfRay
        swtMap(pointOfRayX(j), pointOfRayY(j)) = min(swtMap(pointOfRayX(j), pointOfRayY(j)), strokeWidth);
    end
end
%figure, imshow(swtMap, []), title('Stroke Width Transform: Second Pass');

end



Original address: http://www.cnblogs.com/dawnminghuang/p/3807678.html

http://www.cnblogs.com/dawnminghuang/p/3906622.html#3095998
