First, the problem:
A book has been scanned, and the page images need to be converted into an editable-text Word document.
Second, analysis of the problem:
1. Extracting the text
2. Laying out the text
Third, the solution:
1. If you are using Adobe Acrobat 8 Professional
Open the scanned PDF document, select a page, and choose
Document → OCR Text Recognition → Recognize Text Using OCR
The Recognize Text dialog box pops up. Take care to select the primary language to recognize:
click the "Edit" button in this dialog box,
and in the new dialog box that pops up, set the primary OCR language to Simplified Chinese.
Then confirm and close it, and in the Recognize Text dialog box choose the current page for recognition.
The software automatically straightens the image on the page, then generates a text layer attached to the image.
You can select the text by clicking the text selection tool icon and dragging over the text on the image.
After copying the selected text into a text file, you can see that the text and punctuation are mostly recognized correctly
and that the line breaks are preserved well. Here and there, however, recognition errors produce wrong or extra characters in the text and punctuation.
These need manual correction: fix them in the text file, then copy the text into the Word document.
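The manual correction pass can be narrowed down a little with a script. The following is only a rough sketch, assuming the OCR text has been saved as Unicode text; the function names and the set of "expected" characters are my own assumptions, not something Acrobat provides. It flags characters that are neither Chinese characters, ASCII letters or digits, whitespace, nor punctuation, so you know where to look first.

```python
import unicodedata


def is_expected(ch: str) -> bool:
    """Return True for characters we expect in Simplified Chinese book text.

    The accepted set is an assumption for illustration; widen it as needed.
    """
    if ch in " \t\u3000":                 # space, tab, full-width space
        return True
    if "\u4e00" <= ch <= "\u9fff":        # CJK Unified Ideographs
        return True
    if ch.isascii() and ch.isalnum():     # ASCII letters and digits
        return True
    # Any Unicode punctuation (covers both ASCII and full-width marks)
    return unicodedata.category(ch).startswith("P")


def flag_suspects(text: str):
    """Return (line_number, column, character) for each unexpected character."""
    suspects = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if not is_expected(ch):
                suspects.append((line_no, col, ch))
    return suspects


if __name__ == "__main__":
    sample = "你好,世界。\n这是¤测试"
    for line_no, col, ch in flag_suspects(sample):
        print(f"line {line_no}, column {col}: {ch!r}")
```

This only catches stray symbols. A wrongly recognized but otherwise valid character or punctuation mark still has to be found by proofreading against the scanned page.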
If you are using Adobe Acrobat XI Professional (Chinese version)
Open the scanned PDF document. The navigation bar on the right has a button labeled "Tools".
Click it to expand the full list of tools.
The list contains a Text Recognition item,
which has two sub-items: In This File and In Multiple Files.
Click In This File, and the same Recognize Text window as in Adobe Acrobat 8 Professional pops up.
The default language setting is Chinese (Simplified), so you do not have to change it.
Recognition then works the same way as in Adobe Acrobat 8 Professional.
2. Adjust the formatting of the Word document to match the original scanned book,
for example the heading font size, typeface, line spacing, spacing before and after paragraphs, and the paper size of the page.
It generally takes about three pages of adjustment before the overall format settles into shape.
3. Note: in the Word document, set the paragraph properties.
The Paragraph dialog box has a Chinese Layout tab (labeled "Asian Typography" in English versions of Word). Under the line break options, tick only the first one,
which controls the first and last characters of a line according to Chinese conventions, and leave the others unticked.
4. You may notice that for some lines the number of characters per line in the printed book does not match the Word document:
a line holds either more or fewer characters. In that case, select the line and change its font properties.
If a line of the original wraps onto two lines after editing, select the characters of that line,
right-click and choose Font, open the Advanced tab in the Font dialog box that pops up, and set the character spacing
to Condensed. Adjust the point value as needed: 0.1 pt is usually enough, occasionally 0.2 pt
or 0.3 pt. Likewise, if a line pulls in characters that belong to the next line in the original, set the character spacing
to Expanded so that those characters are pushed back down to the next line. Again 0.1 pt is usually enough,
occasionally 0.2 pt or 0.3 pt.
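Why such small values are usually enough can be seen with a little arithmetic. The sketch below assumes, as a simplification, that the spacing value is applied once per selected character, so a line of n characters changes in width by roughly n times the spacing value. The function names, the 26-character line, and the overflow amounts are hypothetical figures for illustration only.

```python
import math


def total_change_pt(n_chars: int, spacing_pt: float) -> float:
    """Approximate change in line width when each character's spacing
    is condensed or expanded by spacing_pt points."""
    return n_chars * spacing_pt


def spacing_needed_pt(n_chars: int, overflow_pt: float, step_pt: float = 0.1) -> float:
    """Smallest multiple of step_pt whose total effect covers overflow_pt.

    overflow_pt is how far the line is too wide (or too narrow), in points.
    """
    per_step = n_chars * step_pt
    # Subtract a tiny epsilon so exact multiples do not round up a step.
    steps = math.ceil(overflow_pt / per_step - 1e-9)
    return round(max(steps, 1) * step_pt, 2)


if __name__ == "__main__":
    # Hypothetical line of 26 characters that overflows by 2 pt
    print(spacing_needed_pt(26, 2.0))
    # The same line overflowing by 5 pt
    print(spacing_needed_pt(26, 5.0))
```

With 26 characters on a line, each 0.1 pt step shifts the line width by about 2.6 pt, which is why a single step is usually enough to bring a stray character or punctuation mark back onto the right line.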
5. Setting a custom page size for the PDF printer
A typical small-format book corresponds to this paper size in the Word document:
32mo (13 x 18.4 cm): width 13 cm, height 18.4 cm
So when editing the Word document, set the paper size to 32mo in Page Setup.
Adjust the margins to match the original printed book as well, for example:
Top: 1.5 cm, bottom: 1.5 cm
Left: 1.3 cm, right: 1.3 cm
Gutter: 0 cm
The PDF printer properties do not include a 32mo paper type, so you have to add it yourself:
open the Adobe PDF Settings in the Adobe PDF Document Properties window,
and next to Adobe PDF Page Size click Add to define a custom paper type.
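Depending on which units the custom page size dialog is set to, the centimeter values above may need converting. Here is a small sketch of the conversion, using the page size and margins given above; the function and dictionary key names are my own. It also works out the text area left inside the margins, which is handy for checking against the printed book.

```python
CM_PER_INCH = 2.54
POINTS_PER_INCH = 72


def cm_to_mm(cm: float) -> float:
    return cm * 10


def cm_to_inches(cm: float) -> float:
    return cm / CM_PER_INCH


def cm_to_points(cm: float) -> float:
    return cm / CM_PER_INCH * POINTS_PER_INCH


def page_summary(width_cm, height_cm, top, bottom, left, right, gutter=0.0):
    """Return the page size in several units plus the text area in cm."""
    return {
        "mm": (cm_to_mm(width_cm), cm_to_mm(height_cm)),
        "inches": (cm_to_inches(width_cm), cm_to_inches(height_cm)),
        "points": (cm_to_points(width_cm), cm_to_points(height_cm)),
        # Text area is what remains after margins and gutter are removed
        "text_area_cm": (width_cm - left - right - gutter,
                         height_cm - top - bottom),
    }


if __name__ == "__main__":
    # 32mo page with the margins from the example above
    summary = page_summary(13.0, 18.4, top=1.5, bottom=1.5, left=1.3, right=1.3)
    for unit, (w, h) in summary.items():
        print(f"{unit}: {w:.2f} x {h:.2f}")
```

For the 32mo page this gives 130 x 184 mm, with a text area of about 10.4 x 15.4 cm inside the margins.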
Experience using Word and PDF to reproduce the layout of a printed book