I recently became interested in OCR on mobile phones, so I looked into OCR options for Java. A Google search turned up Tesseract-OCR, and from there I found tess4j, its Java API wrapper. Getting it to work involved quite a few twists and turns, but after about half a day of debugging it finally ran, so I am sharing the steps here.
I. Download the relevant JAR packages
1. Download the tess4j JAR from: http://sourceforge.net/projects/tess4j/
2. If you are using a 64-bit JVM, you also need the 64-bit builds of liblept168.dll and libtesseract302.dll: https://github.com/charlesw/tesseract/tree/master/src/lib/tesseractocr/x64
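Whether the 64-bit DLLs are needed depends on the JVM, not on Windows itself (a 32-bit JVM can run on 64-bit Windows). The following is a minimal sketch for checking this; the class name JvmBitness is made up for illustration, and it assumes a HotSpot-based JVM that sets the sun.arch.data.model property, falling back to a rough guess from os.arch otherwise.

```java
/** Reports whether the running JVM is 32- or 64-bit, to decide which DLL builds to use. */
final class JvmBitness {

    /**
     * Maps two system property values to a bitness.
     * "sun.arch.data.model" is set by HotSpot-based JVMs (assumption: other JVMs may omit it);
     * "os.arch" is used as a fallback heuristic only.
     */
    static int bits(String dataModel, String osArch) {
        if ("64".equals(dataModel)) return 64;
        if ("32".equals(dataModel)) return 32;
        // Fallback: architecture names such as amd64 or x86_64 contain "64".
        return (osArch != null && osArch.contains("64")) ? 64 : 32;
    }

    static int currentBits() {
        return bits(System.getProperty("sun.arch.data.model"),
                    System.getProperty("os.arch"));
    }

    public static void main(String[] args) {
        System.out.println("JVM is " + currentBits() + "-bit");
    }
}
```

If this reports 64, use the DLLs from the x64 directory linked above.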
II. Project integration
Here is the layout after a successful integration. Development environment: Windows 8.1 64-bit + Eclipse 4.2 + JDK 7 64-bit. The project directory structure is as follows:
The src directory holds the tess4j source code, and the test directory holds the official tess4j demo code. Copy three files, liblept168.dll, libtesseract302.dll, and gsdll64.dll (needed for PDF conversion), into the root of the src directory.
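A missing DLL usually only shows up as a load error at run time, so it can help to check the files up front. Below is a small sketch that lists which of the three files named above are absent from a directory; the class name NativeLibCheck and the default directory "src" are assumptions for illustration, not part of tess4j.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Checks that the native files this setup relies on are present in a directory. */
final class NativeLibCheck {

    // File names taken from the setup described above.
    static final String[] REQUIRED = {
        "liblept168.dll", "libtesseract302.dll", "gsdll64.dll"
    };

    /** Returns the required file names that are not regular files inside dir. */
    static List<String> missing(File dir) {
        List<String> result = new ArrayList<String>();
        for (String name : REQUIRED) {
            if (!new File(dir, name).isFile()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical default: the project's src directory, relative to the working directory.
        File dir = new File(args.length > 0 ? args[0] : "src");
        List<String> absent = missing(dir);
        if (absent.isEmpty()) {
            System.out.println("All native files found in " + dir.getAbsolutePath());
        } else {
            System.out.println("Missing from " + dir.getAbsolutePath() + ": " + absent);
        }
    }
}
```

Run it from the project root, or pass the directory to check as the first argument.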
III. Test code
/**
 * Test of doOCR method, of class Tesseract1.
 */
@Test
public void testDoOCR_File() throws Exception {
    System.out.println("doOCR on a PNG image");
    // Adjust the path so that it points at the eurotext.png sample image.
    File imageFile = new File("eurotext.png");
    String expResult = "The (quick) [brown] {fox} jumps!\nOver the $43,456.78 <lazy> #90 dog";
    // instance is the Tesseract1 object created by the test class.
    String result = instance.doOCR(imageFile);
    System.out.println(result);
    // Compare only the beginning of the recognized text with the expected string.
    assertEquals(expResult, result.substring(0, expResult.length()));
}
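One thing to watch in a check like this: substring(0, n) throws StringIndexOutOfBoundsException when the recognized text is shorter than the expected string, so a bad OCR run shows up as an error rather than a failed assertion. A minimal alternative using String.startsWith is sketched below; the class and method names are hypothetical, and the sample strings are placeholders rather than real OCR output.

```java
/** Prefix comparison that does not throw when the actual text is too short. */
final class PrefixCheck {

    /** True when actual begins with expected; false for null or shorter input. */
    static boolean hasExpectedPrefix(String expected, String actual) {
        return actual != null && actual.startsWith(expected);
    }

    public static void main(String[] args) {
        String expected = "The (quick)";
        // Longer text that starts with the expected string.
        System.out.println(hasExpectedPrefix(expected, "The (quick) and more text"));
        // Shorter text: substring(0, expected.length()) would throw here.
        System.out.println(hasExpectedPrefix(expected, "The"));
    }
}
```

In a JUnit test this could be used as assertTrue(PrefixCheck.hasExpectedPrefix(expResult, result)).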
Demo: http://download.csdn.net/detail/fx_sky/7988469
Java OCR (using tess4j)