I have seen a lot of information about OCR online, and there is also plenty of material on training libraries, but very little on how to use several trained fonts together once training is done. In fact, Tesseract OCR has supported combined multi-font use since version 3.02, so I am writing it down here in the hope that it helps everyone!
1. First, download the latest tess-two project from GitHub: https://github.com/rmtheis/tess-two
2. If the NDK is not installed on your computer, you also need to download it, because tess-two is built with the NDK. Download address: https://dl.google.com/android/ndk/android-ndk-r8e-windows-x86.zip
After installing the NDK, run the following commands:
cd tess-two
ndk-build
android update project -t 1 --path .
ant release
cd ..
cd eyes-two
ndk-build
android update project -t 1 --path .
ant release
3. Call Tesseract to recognize an image. Import the compiled Android projects into Eclipse. There are three projects in total: tess-two, tess-two-test, and eyes-two.
Of these, tess-two and eyes-two are Android library projects that are referenced by other projects. tess-two wraps the Tesseract API for Android, and eyes-two packages image-processing code built on Leptonica. tess-two-test is used for OCR testing. Read the TessBaseAPITest.java code first to understand how the API is used.
private static final String TESSBASE_PATH = "/mnt/sdcard/tesseract/";
private static final String DEFAULT_LANGUAGE = "eng";
private static final String CHINESE_LANGUAGE = "chi_sim";
private static final String CHINESE_CUSTOM = "custom"; // custom font

TessBaseAPI baseApi = new TessBaseAPI();
// Single font use:
// baseApi.init(TESSBASE_PATH, CHINESE_LANGUAGE);
// Multi-font use: join the data names with "+"
baseApi.init(TESSBASE_PATH, CHINESE_LANGUAGE + "+" + CHINESE_CUSTOM);
baseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_AUTO);
baseApi.setImage(params[0]);

// Ensure that the result is correct.
final String outputText = baseApi.getUTF8Text();
baseApi.end();
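Tesseract takes several languages (fonts) as a single string with the names separated by "+", for example chi_sim+custom. As a minimal sketch, the helper below builds that string from individual names. The class and method names (LanguageSpec, join) are hypothetical and are not part of tess-two; the code is plain Java, so it can be tried outside Android.

```java
// Hypothetical helper (not part of tess-two): builds the language
// argument for TessBaseAPI.init() from individual data names.
final class LanguageSpec {
    private LanguageSpec() {}

    /** Joins language codes with '+', skipping null or blank entries. */
    static String join(String... codes) {
        StringBuilder spec = new StringBuilder();
        for (String code : codes) {
            if (code == null || code.trim().isEmpty()) {
                continue; // ignore empty entries instead of emitting "++"
            }
            if (spec.length() > 0) {
                spec.append('+');
            }
            spec.append(code.trim());
        }
        return spec.toString();
    }
}
```

With this, the multi-font call could be written as baseApi.init(TESSBASE_PATH, LanguageSpec.join("chi_sim", "custom")), which passes "chi_sim+custom" as the language argument.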
Finally, you can add a test Activity to the tess-two-test project.
Note: 1. Combined multi-font use requires OCR version 3.02 or later; version 3.01 does not support it. Also, when testing, place the trained data in the tessdata directory under the phone's root directory.
2. Code: http://download.csdn.net/detail/u010897392/8649197
3. For custom fonts, please see another article: http://blog.csdn.net/u010897392/article/details/45339301
4. If you run into a problem, leave a comment so we can discuss it and all make progress together!
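Regarding the data placement in note 1: Tesseract looks for each data file as tessdata/<name>.traineddata under the data path passed to init. A small sketch like the one below can report which files are missing before init is called. The class and method names (TessDataCheck, trainedDataFile, missingLanguages) are hypothetical and are not part of tess-two.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of tess-two): checks that every name in a
// "+"-separated language string has a matching .traineddata file.
final class TessDataCheck {
    private TessDataCheck() {}

    /** Returns the expected location: <baseDir>/tessdata/<code>.traineddata */
    static File trainedDataFile(File baseDir, String code) {
        return new File(new File(baseDir, "tessdata"), code + ".traineddata");
    }

    /** Lists the codes in languageSpec whose data file does not exist. */
    static List<String> missingLanguages(File baseDir, String languageSpec) {
        List<String> missing = new ArrayList<String>();
        for (String code : languageSpec.split("\\+")) {
            String trimmed = code.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray separators
            }
            if (!trainedDataFile(baseDir, trimmed).isFile()) {
                missing.add(trimmed);
            }
        }
        return missing;
    }
}
```

For example, calling TessDataCheck.missingLanguages(new File("/mnt/sdcard/tesseract/"), "chi_sim+custom") before init returns an empty list only if both chi_sim.traineddata and custom.traineddata are present in the tessdata folder.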
Multi-font combination using OCR