Tesseract is an open source OCR engine released under the Apache License 2.0. This article shows how to build Tesseract for the Android platform and how to quickly create a simple OCR application with it.
Original article: Making an Android OCR Application with Tesseract
Tesseract Android Tools
To build Tesseract for Android, use the tesseract-android-tools project provided by Google.
Get the code:
git clone https://code.google.com/p/tesseract-android-tools/
Open the README and run the following steps on the command line:
cd <project-directory>
curl -O https://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.02.tar.gz
curl -O http://leptonica.googlecode.com/files/leptonica-1.69.tar.gz
tar -zxvf tesseract-ocr-3.02.02.tar.gz
tar -zxvf leptonica-1.69.tar.gz
rm -f tesseract-ocr-3.02.02.tar.gz
rm -f leptonica-1.69.tar.gz
mv tesseract-3.02.02 jni/com_googlecode_tesseract_android/src
mv leptonica-1.69 jni/com_googlecode_leptonica_android/src
ndk-build -j8
android update project --target 1 --path .
ant debug (release)
Note: If you are using NDK r9, the build fails with this error:
format not a string literal and no format arguments [-Werror=format-security]
The workaround is to add a line to Application.mk:
APP_CFLAGS += -Wno-error=format-security
After the build completes, class.jar and several *.so files are generated.
Android OCR Application
Create an Android project and import the generated jar and *.so files.
Create the TessOCR class:
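import java.io.File;

import android.graphics.Bitmap;
import android.os.Environment;

import com.googlecode.tesseract.android.TessBaseAPI;

public class TessOCR {
    private TessBaseAPI mTess;

    public TessOCR() {
        mTess = new TessBaseAPI();
        // Tesseract looks for its language data under <datapath>/tessdata/
        String datapath = Environment.getExternalStorageDirectory() + "/tesseract/";
        String language = "eng";
        // init() throws if the tessdata subfolder is missing, so create it first
        File dir = new File(datapath + "tessdata/");
        if (!dir.exists()) {
            dir.mkdirs();
        }
        mTess.init(datapath, language);
    }

    // Runs recognition on the bitmap and returns the recognized text
    public String getOCRResult(Bitmap bitmap) {
        mTess.setImage(bitmap);
        return mTess.getUTF8Text();
    }

    // Releases the native Tesseract resources
    public void onDestroy() {
        if (mTess != null) {
            mTess.end();
        }
    }
}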
The constructor needs a tessdata directory on the SD card (external storage), which is why it creates the directory before calling init. The init method in the Tesseract source checks for this directory and throws an exception if it does not exist:
public boolean init(String datapath, String language) {
    if (datapath == null) {
        throw new IllegalArgumentException("Data path must not be null!");
    }
    if (!datapath.endsWith(File.separator)) {
        datapath += File.separator;
    }
    File tessdata = new File(datapath + "tessdata");
    if (!tessdata.exists() || !tessdata.isDirectory()) {
        throw new IllegalArgumentException("Data path must contain subfolder tessdata!");
    }
    return nativeInit(datapath, language);
}
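The check above only verifies that the tessdata folder exists. An empty folder passes it, yet recognition still needs the language file itself. A minimal sketch of a pre-flight check follows, assuming the usual Tesseract naming of <language>.traineddata inside tessdata. The class and method names are hypothetical and not part of tesseract-android-tools.

```java
import java.io.File;

/** Illustrative helper (hypothetical, not part of the Tesseract API). */
final class TessDataCheck {

    private TessDataCheck() {}

    /**
     * Returns true if <dataPath>/tessdata/<language>.traineddata exists
     * and is a non-empty regular file.
     */
    static boolean isLanguageReady(File dataPath, String language) {
        File tessdata = new File(dataPath, "tessdata");
        File trained = new File(tessdata, language + ".traineddata");
        return trained.isFile() && trained.length() > 0;
    }
}
```

Calling a check like this before mTess.init(datapath, language) lets the app show a clear "language data missing" message instead of failing later during initialization.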
That's all there is to it. There are three ways to obtain an image for OCR:
Select a picture in the gallery, choose Send or Share, and pick the OCR application
Add an intent filter to AndroidManifest.xml so that the OCR app appears in the gallery's share list:
<intent-filter>
    <action android:name="android.intent.action.SEND" />
    <category android:name="android.intent.category.DEFAULT" />
    <data android:mimeType="text/plain" />
    <data android:mimeType="image/*" />
</intent-filter>
After the app receives the share intent, obtain the URI and decode it into a bitmap:
if (Intent.ACTION_SEND.equals(intent.getAction())) {
    // The shared image arrives as a content URI in EXTRA_STREAM
    Uri uri = (Uri) intent.getParcelableExtra(Intent.EXTRA_STREAM);
    uriOCR(uri);
}

private void uriOCR(Uri uri) {
    if (uri != null) {
        InputStream is = null;
        try {
            // Open the shared image, decode it, show it, and run OCR on it
            is = getContentResolver().openInputStream(uri);
            Bitmap bitmap = BitmapFactory.decodeStream(is);
            mImage.setImageBitmap(bitmap);
            doOCR(bitmap);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } finally {
            // Always close the stream, even if decoding failed
            if (is != null) {
                try {
                    is.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
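The doOCR method called above is not shown in this article. Recognition can take a noticeable amount of time, so one plausible approach is to run it on a worker thread. The following is a minimal plain-Java sketch under that assumption. OcrWorker and its methods are hypothetical names, and the recognizer function stands in for a call such as TessOCR.getOCRResult.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;
import java.util.function.Function;

/** Runs a slow recognition step on a single worker thread (hypothetical helper). */
final class OcrWorker<I> {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final Function<I, String> recognizer;

    OcrWorker(Function<I, String> recognizer) {
        this.recognizer = recognizer;
    }

    /**
     * Queues one image for recognition. The callback is invoked on the
     * worker thread, not on the caller's thread.
     */
    Future<?> recognize(I image, Consumer<String> onResult) {
        return executor.submit(() -> onResult.accept(recognizer.apply(image)));
    }

    /** Stops accepting new work; already queued images are still processed. */
    void shutdown() {
        executor.shutdown();
    }
}
```

In an Activity, the callback would still have to hand the text back to the UI thread (for example with runOnUiThread) before touching any views.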
Start the OCR application and select a picture from the gallery for OCR
Send an intent to open the gallery, then run OCR on the URI returned in onActivityResult:
Intent intent = new Intent(Intent.ACTION_PICK, android.provider.MediaStore.Images.Media.EXTERNAL_CONTENT_URI);
startActivityForResult(intent, REQUEST_PICK_PHOTO);
Start the OCR application and run OCR after taking a picture
To get a high-quality image, pass an output file path with the intent. When the camera returns, decode the image directly from that path:
private void dispatchTakePictureIntent() {
    Intent takePictureIntent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);
    // Ensure that there's a camera activity to handle the intent
    if (takePictureIntent.resolveActivity(getPackageManager()) != null) {
        // Create the File where the photo should go
        File photoFile = null;
        try {
            photoFile = createImageFile();
        } catch (IOException ex) {
            // Error occurred while creating the File
        }
        // Continue only if the File was successfully created
        if (photoFile != null) {
            takePictureIntent.putExtra(MediaStore.EXTRA_OUTPUT,
                    Uri.fromFile(photoFile));
            startActivityForResult(takePictureIntent, REQUEST_TAKE_PHOTO);
        }
    }
}
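The listing above calls createImageFile(), which is not shown in this article. A minimal sketch of what such a helper might look like is given below. The storage directory parameter, the ImageFiles class name, and the JPEG_<timestamp>_ naming pattern are all assumptions made for illustration. In the app, the directory would typically be a pictures folder on external storage.

```java
import java.io.File;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/** Hypothetical helper for creating the camera output file. */
final class ImageFiles {

    private ImageFiles() {}

    /** Creates an empty, uniquely named .jpg file inside storageDir. */
    static File createImageFile(File storageDir) throws IOException {
        if (!storageDir.isDirectory() && !storageDir.mkdirs()) {
            throw new IOException("Cannot create directory " + storageDir);
        }
        String timeStamp = new SimpleDateFormat("yyyyMMdd_HHmmss", Locale.US).format(new Date());
        // createTempFile adds a random part to the name, so two photos
        // taken within the same second do not overwrite each other.
        return File.createTempFile("JPEG_" + timeStamp + "_", ".jpg", storageDir);
    }
}
```

Keeping the returned file's absolute path in a field makes it available later, when the camera activity returns and the photo has to be decoded.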
Finally, don't forget to download the language data files and push them to the tessdata directory on the SD card.
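As an alternative to pushing the file by hand, an app could copy the language data into place on first launch. The sketch below is plain Java and only shows the copy step. LanguageDataInstaller is a hypothetical name, the <language>.traineddata file name follows the usual Tesseract convention, and where the input stream comes from (for example a file bundled in the app's assets) is left as an assumption.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper that copies language data into <dataPath>/tessdata. */
final class LanguageDataInstaller {

    private LanguageDataInstaller() {}

    /** Copies the stream to <dataPath>/tessdata/<language>.traineddata and returns that file. */
    static File install(InputStream source, File dataPath, String language) throws IOException {
        File tessdata = new File(dataPath, "tessdata");
        if (!tessdata.isDirectory() && !tessdata.mkdirs()) {
            throw new IOException("Cannot create directory " + tessdata);
        }
        File target = new File(tessdata, language + ".traineddata");
        try (OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int read;
            // Copy in chunks until the source stream is exhausted
            while ((read = source.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        return target;
    }
}
```

The caller remains responsible for closing the source stream.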
Source
https://github.com/DynamsoftRD/android-tesseract-ocr
git clone https://github.com/DynamsoftRD/android-tesseract-ocr.git