I. Introduction to the Framework
Tesseract is an OCR (optical character recognition) tool that extracts the text from an image. It supports multiple languages, with English as the default. It is open source and can be downloaded from GitHub; if you only want to use it and do not need to study the source in depth, you can also search for a download on Google Code.
II. Contents of the Tesseract Package
doc: documentation.
tessdata: stores the language data files (for example, chi_sim.traineddata for Simplified Chinese).
tesseract.exe: the executable. It can be called from cmd: first cd to the directory that contains it, then enter tesseract.exe followed by the image name and the output file name (for example: tesseract.exe 1.jpg 1). The recognition result for 1.jpg, which sits in the same directory as tesseract.exe, is written to 1.txt. To support Chinese, add the chi_sim.traineddata file to tessdata.
The call then becomes: tesseract.exe 1.jpg 1 -l chi_sim, where -l names the language data to use for recognition. The image can be given as a full path, and so can the output. If you only want to use Tesseract as a text extraction tool and do not need a very high success rate, this is enough: call the EXE directly from cmd. Readers who want to try it can download the attached "tesseract simple use" .rar file.
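The command-line call described above can also be driven from a program. The following is a minimal sketch in C#, not part of the original package: the class name TesseractCli, its method names, and the exePath parameter are hypothetical, and it assumes only the documented form tesseract.exe <image> <output base> -l <language>.

```csharp
using System;
using System.Diagnostics;

public static class TesseractCli
{
    // Builds the argument string "<image> <outputBase> [-l <lang>]".
    // Both paths are quoted so that full paths containing spaces survive.
    public static string BuildArguments(string imagePath, string outputBase, string lang = null)
    {
        string args = "\"" + imagePath + "\" \"" + outputBase + "\"";
        if (!string.IsNullOrEmpty(lang))
        {
            args += " -l " + lang;
        }
        return args;
    }

    // Runs tesseract.exe and returns its exit code.
    // exePath is an assumption: point it at wherever tesseract.exe was unpacked.
    public static int Run(string exePath, string imagePath, string outputBase, string lang = null)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = BuildArguments(imagePath, outputBase, lang),
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

Calling TesseractCli.Run with "1.jpg", "1", and "chi_sim" corresponds to typing tesseract.exe 1.jpg 1 -l chi_sim in cmd; the recognized text still ends up in 1.txt.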
III. Advanced Use in .NET Projects
If a .NET project needs to use Tesseract through DLL references, download the Tesseract_dll reference package.
The x86 and x64 folders contain the native DLLs that Tesseract depends on; the program picks the matching folder at run time according to whether it is running as 32-bit or 64-bit.
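To make the x86/x64 selection concrete, here is a small sketch of how a program could work out which folder a native DLL is expected in. It is an illustration only, not the wrapper's actual loading code: NativeLibraryLocator and its methods are hypothetical names, and the layout (an x86 or x64 subfolder next to the application) is assumed from the description above. Note that the deciding factor is the bitness of the running process, so a 32-bit IIS application pool on a 64-bit server still needs the x86 folder.

```csharp
using System;
using System.IO;

public static class NativeLibraryLocator
{
    // Maps process bitness to the subfolder name used by the reference package.
    public static string PlatformFolder(bool is64BitProcess)
    {
        return is64BitProcess ? "x64" : "x86";
    }

    // Path where the native DLL is expected, e.g. <base>\x86\liblept168.dll.
    public static string ExpectedPath(string baseDirectory, string dllName, bool is64BitProcess)
    {
        return Path.Combine(baseDirectory, PlatformFolder(is64BitProcess), dllName);
    }

    // True when the DLL for the current process bitness is actually on disk.
    public static bool IsPresent(string baseDirectory, string dllName)
    {
        return File.Exists(ExpectedPath(baseDirectory, dllName, Environment.Is64BitProcess));
    }
}
```

A check such as NativeLibraryLocator.IsPresent(AppDomain.CurrentDomain.BaseDirectory, "liblept168.dll") at startup can help tell a missing file apart from a DLL that exists but fails to load.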
liblept168.dll: when the site is published to IIS on Windows Server 2003, this file fails to load with the error: Failed to find the library "liblept168.dll" for platform x86. Published on a later version of the system, it works properly. Tesseract.dll is added as a reference in the corresponding project. Here is the code for testing Tesseract.dll:
    // Requires a reference to Tesseract.dll (using Tesseract;) and System.Drawing.
    using (var engine = new TesseractEngine(Server.MapPath(@"~/tessdata"), "eng", EngineMode.Default))
    {
        // Have to load Pix via a bitmap since Pix doesn't support loading a stream.
        using (var image = new System.Drawing.Bitmap(imageFile.PostedFile.InputStream))
        {
            using (var pix = PixConverter.ToPix(image))
            {
                using (var page = engine.Process(pix))
                {
                    meanConfidenceLabel.InnerText = String.Format("{0:P}", page.GetMeanConfidence());
                    resultText.InnerText = page.GetText();
                }
            }
        }
    }
TesseractEngine constructor notes: first, the data path must end with tessdata; second, if you need Chinese, pass chi_sim as the language argument in place of eng.
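Both constructor notes can be checked before the engine is created. The sketch below is a hypothetical helper (TessdataCheck.Validate is not part of Tesseract.dll) that only inspects the file system: it verifies that the path ends in a tessdata folder and that the <language>.traineddata file is present, as the notes above require.

```csharp
using System;
using System.IO;

public static class TessdataCheck
{
    // Returns null when the folder looks usable, otherwise a short description of the problem.
    public static string Validate(string dataPath, string language)
    {
        // Ignore a trailing slash so "...\tessdata\" and "...\tessdata" are treated alike.
        string trimmed = dataPath.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
        string folderName = Path.GetFileName(trimmed);
        if (!string.Equals(folderName, "tessdata", StringComparison.OrdinalIgnoreCase))
        {
            return "data path does not end with 'tessdata': " + dataPath;
        }

        // The language name maps to a file such as chi_sim.traineddata.
        string trainedData = Path.Combine(trimmed, language + ".traineddata");
        if (!File.Exists(trainedData))
        {
            return "missing language file: " + trainedData;
        }
        return null;
    }
}
```

In the web page above, calling TessdataCheck.Validate(Server.MapPath(@"~/tessdata"), "chi_sim") before constructing TesseractEngine would report a missing chi_sim.traineddata file as a readable message.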
From: https://www.cnblogs.com/cleanboy/p/4617438.html