Java CAPTCHA recognition: training samples with jTessBoxEditorFX and Tesseract-OCR

Source: Internet
Author: User
Tags image filter


Tool preparation:

jTessBoxEditorFX download: https://github.com/nguyenq/jTessBoxEditorFX

Tesseract-OCR download: https://sourceforge.net/projects/tesseract-ocr/

Main steps:

  1. Download jTessBoxEditorFX and Tesseract-OCR (configure the environment variable for Tesseract-OCR) and prepare the jar dependencies (via Maven; see the pom file below).
  2. Download the CAPTCHA images to a local folder (code).
  3. Convert the CAPTCHA images to another format (GIF to TIF).
  4. Denoise the converted images and crop their edges (code).
  5. Proofread the .box file with jTessBoxEditorFX, correcting any misrecognized characters: https://www.cnblogs.com/zhongtang/p/5555950.html
  6. Generate the .traineddata file with the tesseract command line, then load it from Java: https://www.cnblogs.com/zhongtang/p/5555950.html
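
Step 6 can be sketched as a list of command lines built in Java. This is a minimal sketch, assuming the legacy box-training workflow (the one that produces a .tr file) and its standard tools `unicharset_extractor`, `mftraining`, `cntraining` and `combine_tessdata`. The language name `fontyp`, the font name `font`, the `exp0` suffix and the `font_properties` file name are placeholder assumptions; substitute the names you actually use. The program only prints the commands and does not run them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TrainingCommands {

    /** Builds the training command lines for one image set; nothing is executed. */
    static List<List<String>> build(String lang, String font) {
        String base = lang + "." + font + ".exp0"; // e.g. fontyp.font.exp0 (hypothetical names)
        List<List<String>> cmds = new ArrayList<>();
        // 1. Generate the initial .box file, then proofread it in jTessBoxEditorFX.
        cmds.add(Arrays.asList("tesseract", base + ".tif", base, "batch.nochop", "makebox"));
        // 2. Produce the .tr feature file from the corrected .box file.
        cmds.add(Arrays.asList("tesseract", base + ".tif", base, "nobatch", "box.train"));
        // 3. Extract the character set from the .box file.
        cmds.add(Arrays.asList("unicharset_extractor", base + ".box"));
        // 4. Train the shape and normalization prototypes.
        cmds.add(Arrays.asList("mftraining", "-F", "font_properties", "-U", "unicharset",
                "-O", lang + ".unicharset", base + ".tr"));
        cmds.add(Arrays.asList("cntraining", base + ".tr"));
        // 5. After renaming inttemp, pffmtable, normproto and shapetable with the
        //    "<lang>." prefix, merge everything into <lang>.traineddata.
        cmds.add(Arrays.asList("combine_tessdata", lang + "."));
        return cmds;
    }

    public static void main(String[] args) {
        for (List<String> cmd : build("fontyp", "font")) {
            System.out.println(String.join(" ", cmd));
        }
    }
}
```

Each inner list could be passed to `new ProcessBuilder(cmd)` if you prefer to drive the tools from Java instead of typing them in a console.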

The code is as follows:

    package yanzhengmatest.pikachu;

    import java.awt.image.BufferedImage;
    import java.io.BufferedInputStream;
    import java.io.File;
    import java.io.FileNotFoundException;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.net.MalformedURLException;
    import java.net.URL;

    import javax.imageio.ImageIO;
    import javax.net.ssl.HttpsURLConnection;

    import org.opencv.core.Core;
    import org.opencv.core.CvType;
    import org.opencv.core.Mat;
    import org.opencv.core.Rect;
    import org.opencv.core.Size;
    import org.opencv.imgcodecs.Imgcodecs;
    import org.opencv.imgproc.Imgproc;

    import net.sourceforge.tess4j.Tesseract;
    import net.sourceforge.tess4j.TesseractException;

    public class Test {

        static {
            // Loads the native OpenCV library; required before any OpenCV call.
            System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
        }

        public static void main(String[] args) throws FileNotFoundException, IOException, InterruptedException {
            // Folder where the downloaded CAPTCHAs are saved
            File imgFile = new File("C:\\users\\pc\\desktop\\formpic\\unformpic");
            // Path prefix for the saved CAPTCHAs
            String downAddress = "c:\\users\\pc\\desktop\\formpic\\unformpic\\";
            // CAPTCHA URL
            String downUrl = "https://www.qichamao.com/usercenter/varifyimage?t=0.6488481170232967";
            if (imgFile.listFiles().length < 400) {
                for (int i = 1; i <= 400; i++) {
                    downloadPic(downUrl, downAddress + i + ".gif");
                    Thread.sleep(10 + (i % 100));
                }
            }

            // Convert the saved CAPTCHAs to TIF (Tesseract does not support GIF input)
            File imgFile0 = new File("C:\\users\\pc\\desktop\\formpic\\unformpic");
            for (File image : imgFile0.listFiles()) {
                changePicFormat("TIF", image, "c:\\users\\pc\\desktop\\formpic\\formedpic\\");
            }
            System.out.println("Picture format conversion succeeded");

            // Denoise and threshold the TIF images to improve the recognition rate
            int picNum = 1;
            File imageFile1 = new File("C:\\users\\pc\\desktop\\formpic\\formedpic");
            for (File image : imageFile1.listFiles()) {
                filterPic(image.getName(), picNum + ".tif");
                picNum++;
            }

            // Recognize the processed images
            File resultImgs = new File("C:\\users\\pc\\desktop\\result_cut");
            for (File link : resultImgs.listFiles()) {
                String result = getResult(link);
                System.out.println(link.getName() + " recognition result: " + result);
            }
        }

        // Processes one image and stores the result of each stage
        public static void filterPic(String imgName, String fileName) throws FileNotFoundException, IOException {
            // Denoising with a box filter
            Mat src = Imgcodecs.imread("c:\\users\\pc\\desktop\\formpic\\formedpic\\" + imgName, Imgcodecs.IMREAD_UNCHANGED);
            if (src.empty()) {
                System.out.println("No Pictures");
                return; // nothing to process
            }
            System.out.println("Image processing Success");
            // The Mat constructor takes (rows, cols, type)
            Mat dst = new Mat(src.rows(), src.cols(), CvType.CV_8UC1);
            Imgproc.boxFilter(src, dst, src.depth(), new Size(3.2, 3.2));
            Imgcodecs.imwrite("C:\\users\\pc\\desktop\\filter\\" + fileName, dst);

            // Thresholding
            Mat src1 = Imgcodecs.imread("c:\\users\\pc\\desktop\\filter\\" + fileName, Imgcodecs.IMREAD_UNCHANGED);
            Mat dst1 = new Mat(src1.rows(), src1.cols(), CvType.CV_8UC1);
            Imgproc.threshold(src1, dst1, 165, 200, Imgproc.THRESH_TRUNC);
            Imgcodecs.imwrite("C:\\users\\pc\\desktop\\process\\" + fileName, dst1);

            // Cropping the edges
            Mat src2 = Imgcodecs.imread("c:\\users\\pc\\desktop\\process\\" + fileName, Imgcodecs.IMREAD_UNCHANGED);
            // Parameters: x-coordinate, y-coordinate, crop width, crop height
            Rect roi = new Rect(4, 2, src2.cols() - 7, src2.rows() - 4);
            Mat dst2 = new Mat(src2, roi);
            Imgcodecs.imwrite("C:\\users\\pc\\desktop\\result_cut\\" + fileName, dst2);
        }

        // Recognizes the CAPTCHA text in an image
        public static String getResult(File imageFile) {
            if (!imageFile.exists()) {
                System.out.println("Picture does not exist");
            }
            Tesseract tesseract = new Tesseract();
            tesseract.setDatapath("F:\\Program Files (x86)\\Tesseract-OCR\\tessdata");
            tesseract.setLanguage("Fontyp"); // use the self-trained library instead of the default one
            try {
                return tesseract.doOCR(imageFile);
            } catch (TesseractException e) {
                e.printStackTrace();
                return null;
            }
        }

        /**
         * Image format conversion.
         *
         * @param outputFormat target format
         * @param image        image to convert
         * @param downAddress  folder in which the converted image is saved
         * Source: http://www.open-open.com/code/view/1453300186683
         */
        public static void changePicFormat(String outputFormat, File image, String downAddress) {
            try {
                BufferedImage bim = ImageIO.read(image);
                File output = new File(downAddress
                        + image.getName().substring(0, image.getName().lastIndexOf(".") + 1) + outputFormat);
                ImageIO.write(bim, outputFormat, output);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        /**
         * Downloads one CAPTCHA image.
         *
         * @param picUrl     URL that serves the CAPTCHA
         * @param imgAddress path where the picture is saved
         */
        public static void downloadPic(String picUrl, String imgAddress) {
            try {
                URL url = new URL(picUrl);
                HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
                // Set a browser User-Agent; otherwise the request is treated as a bot and no CAPTCHA is returned
                conn.setRequestProperty("User-Agent",
                        "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.75 Safari/537.36");
                conn.connect();
                byte[] buf = new byte[1024];
                try (BufferedInputStream bis = new BufferedInputStream(conn.getInputStream());
                     FileOutputStream fos = new FileOutputStream(imgAddress)) {
                    int read;
                    while ((read = bis.read(buf)) != -1) {
                        // Write only the bytes actually read, not the whole buffer
                        fos.write(buf, 0, read);
                    }
                    fos.flush();
                }
                System.out.println("Image download successful");
            } catch (MalformedURLException e) {
                System.out.println("Picture read failed");
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
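
One detail of the download step deserves a closer look: `InputStream.read(byte[])` may fill only part of the buffer, so only the bytes actually read should be written out. Otherwise the saved image can end with stale bytes left over from an earlier pass. The sketch below shows the pattern with JDK classes only; `StreamCopy` and `copy` are illustrative names, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class StreamCopy {

    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) {
        byte[] buf = new byte[1024];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n); // write only the n bytes that were read
                total += n;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1500]; // deliberately not a multiple of the buffer size
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), out);
        System.out.println(copied + " bytes copied, " + out.size() + " bytes written");
    }
}
```

When the target is a file, `java.nio.file.Files.copy(InputStream, Path)` does the same job in a single call.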

Pom file:

    <dependency>
        <groupId>net.sourceforge.tess4j</groupId>
        <artifactId>tess4j</artifactId>
        <version>4.1.1</version>
        <exclusions>
            <exclusion>
                <groupId>com.sun.jna</groupId>
                <artifactId>jna</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.openpnp</groupId>
        <artifactId>opencv</artifactId>
        <version>3.2.0-0</version>
    </dependency>

Reference articles:

  • Use of OpenCV: https://blog.csdn.net/u012706811/article/details/52779271
  • OpenCV tutorial: https://www.w3cschool.cn/opencv/opencv-me9i28vh.html
  • OpenCV binarization: https://blog.csdn.net/liyuqian199695/article/details/53925046
  • OpenCV Maven address: https://mvnrepository.com/artifact/org.openpnp/opencv/3.4.2-0
  • OpenCV image filtering: https://blog.csdn.net/u012393192/article/details/78528550
  • OpenCV image cropping: https://blog.csdn.net/sileixinhua/article/details/72811093
  • OpenCV example with the tesseract command: https://www.cnblogs.com/zhongtang/p/5555950.html
  • Additional reading: https://blog.csdn.net/lmj623565791/article/details/23960391

Exception handling:

1. Exception when loading the native library:

Exception in thread "main" java.lang.UnsatisfiedLinkError: no opencv_java320 in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
    at java.lang.Runtime.loadLibrary0(Runtime.java:870)
    at java.lang.System.loadLibrary(System.java:1122)
    at yanzhengmatest.pikachu.Test.<clinit>(Test.java:38)

Solution:

Set the native library location to the folder that contains the OpenCV native library, for example G:\Program Files (x86)\apache-maven\repo\org\openpnp\opencv\3.2.0-0\opencv-3.2.0-0\nu\pattern\opencv\windows\x86_64 (adjust this to where your Maven OpenCV package is stored).
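
Before changing project settings, it can help to confirm whether the JVM can see the native file at all. The following is a minimal sketch, assuming the library name `opencv_java320` from the stack trace; `NativeLibCheck` and `find` are illustrative names. It only scans the directories listed in `java.library.path` and does not load anything.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

class NativeLibCheck {

    /** Returns the first file in libraryPath that matches the platform file name of libName. */
    static Optional<Path> find(String libName, String libraryPath) {
        // Platform-specific file name, e.g. opencv_java320.dll on Windows
        String fileName = System.mapLibraryName(libName);
        for (String dir : libraryPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            try {
                Path candidate = Paths.get(dir, fileName);
                if (Files.isRegularFile(candidate)) {
                    return Optional.of(candidate);
                }
            } catch (InvalidPathException e) {
                // Skip malformed entries instead of failing the whole scan
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "opencv_java320";
        Optional<Path> hit = find(name, System.getProperty("java.library.path", ""));
        System.out.println(hit.map(p -> "Found " + p)
                .orElse(System.mapLibraryName(name) + " is not on java.library.path"));
    }
}
```

If the file is reported as missing, starting the JVM with `-Djava.library.path=<folder>` pointing at the folder above is another way to make it visible.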

2. The JDK and OpenCV versions do not match (Exception in thread "main" java.lang.UnsatisfiedLinkError: no jniopencv_highgui in java.library.path)

Workaround: switch to a different OpenCV version.

3. An exception occurs when generating the .tr file from the command line:

Page 4061 dpi. Using269Error during processing.

Resolution: the image format conversion or the download may have gone wrong; replace the affected picture.
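
Since the fix is to replace bad pictures, a quick pre-check can find them before training. This is a minimal sketch, assuming that a file the JDK's `ImageIO` cannot decode is also a poor input for Tesseract (an assumption, not something the original verifies); `ImageCheck` and `isDecodable` are illustrative names.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

class ImageCheck {

    /** True if ImageIO can decode the file into an image with non-zero size. */
    static boolean isDecodable(File file) {
        try {
            BufferedImage img = ImageIO.read(file); // null when no reader understands the data
            return img != null && img.getWidth() > 0 && img.getHeight() > 0;
        } catch (IOException | RuntimeException e) {
            return false; // missing, unreadable or corrupt file
        }
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        File[] files = dir.listFiles();
        if (files == null) {
            System.out.println("Not a directory: " + dir);
            return;
        }
        for (File f : files) {
            if (f.isFile() && !isDecodable(f)) {
                System.out.println("Replace or re-download: " + f.getName());
            }
        }
    }
}
```

Pass the folder holding the downloaded or converted CAPTCHAs as the first argument; every file listed is a candidate for replacement.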
