The verification-code (CAPTCHA) OCR below is implemented in .NET, mainly using the Tesseract component.
The .NET version of Tesseract (tessnet2):
http://www.pixel-technology.com/freeware/tessnet2/
Usage is very simple. Note that you need to download a language pack separately. The codes I recognize contain only Latin letters and digits, so I use the English language pack. To improve the recognition rate, you can also train Tesseract on your own samples. Because my requirements are relatively simple, I skipped that step and used the English language pack directly.
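Since a missing language pack is an easy mistake to make, a small check before initialization can help. The following is only a sketch: the class name LanguagePackCheck and the method HasLanguageData are hypothetical helpers, not part of tessnet2, and the check assumes the language data files are named with the language code as a prefix (for example eng.*).

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of tessnet2.
public static class LanguagePackCheck
{
    // Returns true if dataDir exists and contains at least one file
    // named "<lang>.*" (assumed naming of the language data files).
    public static bool HasLanguageData(string dataDir, string lang)
    {
        return Directory.Exists(dataDir)
            && Directory.GetFiles(dataDir, lang + ".*").Length > 0;
    }
}
```

It could be called with the same directory and language code that are later passed to Init, reporting a clear error when it returns false.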
Key test code:
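// Requires: using System.Collections.Generic; using System.Drawing;
//           using System.IO; using System.Net; using System.Windows.Forms;
tessnet2.Tesseract ocr = new tessnet2.Tesseract();
// Restrict recognition to digits and uppercase letters.
ocr.SetVariable("tessedit_char_whitelist", "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ");
// Load the English language pack from the application's directory.
ocr.Init(Application.StartupPath + @"\lng\eng", "eng", false);
WebClient wc = new WebClient();
byte[] oimg = wc.DownloadData("some url"); // the address is hidden here; change it to the image to be recognized
Bitmap bp = new Bitmap(new MemoryStream(oimg), true);
pictureBox1.Image = bp; // show the original image
bp = ImageProcess.RemoveGreen(bp); // remove the interference
bp = ImageProcess.ToBW(bp); // convert to a binary (black-and-white) image
pictureBox2.Image = bp; // show the preprocessed image
List<tessnet2.Word> result = ocr.DoOCR(bp, Rectangle.Empty);
string txt = "";
foreach (tessnet2.Word word in result)
{
    txt += word.Text; // append each recognized word
}
textBox1.Text = txt;

The image is given a simple preprocessing pass that removes interference and converts it to a binary image before OCR. For simple verification codes, the results are good.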
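The ImageProcess helper is not shown in the post, so the following is only a minimal sketch of what such preprocessing might look like. The class name ImageProcessSketch, the green-dominance margin, and the brightness threshold are all assumptions chosen for illustration, not the author's actual implementation.

```csharp
using System.Drawing;

// Hypothetical stand-in for the ImageProcess helper; the margin and
// threshold values are assumptions and would need tuning per CAPTCHA.
public static class ImageProcessSketch
{
    // A pixel counts as green interference when green clearly
    // dominates both red and blue.
    public static bool IsGreenish(Color c, int margin = 40)
    {
        return c.G > c.R + margin && c.G > c.B + margin;
    }

    // Approximate brightness (0..255) using the common 0.299/0.587/0.114 weights.
    public static int Luminance(Color c)
    {
        return (int)(0.299 * c.R + 0.587 * c.G + 0.114 * c.B);
    }

    // Replace greenish pixels with white so they merge into the background.
    public static Bitmap RemoveGreen(Bitmap src)
    {
        Bitmap dst = new Bitmap(src.Width, src.Height);
        for (int y = 0; y < src.Height; y++)
        {
            for (int x = 0; x < src.Width; x++)
            {
                Color c = src.GetPixel(x, y);
                dst.SetPixel(x, y, IsGreenish(c) ? Color.White : c);
            }
        }
        return dst;
    }

    // Binarize: pixels darker than the threshold become black, the rest white.
    public static Bitmap ToBW(Bitmap src, int threshold = 128)
    {
        Bitmap dst = new Bitmap(src.Width, src.Height);
        for (int y = 0; y < src.Height; y++)
        {
            for (int x = 0; x < src.Width; x++)
            {
                bool dark = Luminance(src.GetPixel(x, y)) < threshold;
                dst.SetPixel(x, y, dark ? Color.Black : Color.White);
            }
        }
        return dst;
    }
}
```

GetPixel and SetPixel are slow on large images, but verification-code images are small, so the simple loops keep the sketch readable. For larger images, Bitmap.LockBits would be the usual faster alternative.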