from:http://chenlei.is-programmer.com/posts/14419
(1) PDF to JPG:
Install ImageMagick:
# sudo apt-get install imagemagick
Then convert:
# convert xxx.pdf xxx.jpg
This converts xxx.pdf into a set of xxx-*.jpg files, one JPG per page.
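The default per-page names are not zero-padded, so xxx-10.jpg can sort before xxx-2.jpg. The sketch below shows two small wrappers around the same convert command: one uses a %03d pattern in the output name for zero-padded numbering, the other converts a single page with the [N] suffix on the input file. The function names pdf_pages_to_jpg and pdf_page_to_jpg and the output naming scheme are made up for illustration; they are not part of ImageMagick.

```shell
# Hypothetical helper: convert every page to a zero-padded JPG.
# Usage: pdf_pages_to_jpg xxx.pdf   (writes xxx-000.jpg, xxx-001.jpg, ...)
pdf_pages_to_jpg() {
    pdf=$1
    base=${pdf%.pdf}                 # strip the .pdf suffix
    # %03d in the output name is replaced by the zero-based page index
    convert "$pdf" "$base-%03d.jpg"
}

# Hypothetical helper: convert a single page only (index is zero-based).
# Usage: pdf_page_to_jpg xxx.pdf 2  (third page, written to xxx-page2.jpg)
pdf_page_to_jpg() {
    pdf=$1
    page=$2
    base=${pdf%.pdf}
    # "file.pdf[N]" tells ImageMagick to read only page N of the PDF
    convert "${pdf}[${page}]" "$base-page$page.jpg"
}
```

Source the functions in a terminal, then call for example pdf_pages_to_jpg xxx.pdf in the directory that holds the PDF.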
For sharper output (still experimenting), where N is a JPEG quality value from 1 to 100:
# convert -verbose -colorspace rgb -resize 1800 -interlace none -density 300 -quality N xxx.pdf xxx.jpg
(2) PDF to txt:
We'll use Poppler, which ships with the system.
First, add Chinese support:
# sudo apt-get install poppler-data
Then convert:
# pdftotext -layout -nopgbrk xxx.pdf
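If only part of a document is needed, pdftotext can also be limited to a page range with its -f (first page) and -l (last page) options, and passing - as the output file sends the text to standard output. A minimal sketch, assuming a wrapper named pdf_pages_text (a hypothetical name, not part of Poppler):

```shell
# Hypothetical helper: print the text of pages FIRST..LAST to stdout.
# Usage: pdf_pages_text xxx.pdf 1 3
pdf_pages_text() {
    pdf=$1
    first=$2
    last=$3
    # -f / -l select the page range (1-based); "-" writes to stdout
    pdftotext -layout -nopgbrk -f "$first" -l "$last" "$pdf" -
}
```

Since the text goes to standard output, it can be piped onward, for example pdf_pages_text xxx.pdf 1 3 | less.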
Because pdftotext handles only one PDF per invocation, use a small shell loop for batch processing. Open a terminal, go to the directory containing the PDFs, and run the command below:
# find ./ -name '*.pdf' | while read -r i; do pdftotext -layout -nopgbrk "$i"; done
A txt file soon appears next to each PDF. The "-layout" parameter preserves the page layout, and "-nopgbrk" suppresses the page-break characters between pages; compare the output with and without them to see the difference.
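The batch step can also be wrapped in a reusable function. This is only a sketch: the name pdfs_to_txt and the optional directory argument are assumptions for illustration. It quotes each file name so that PDFs with spaces in their names still work, and it reports any file that pdftotext fails on instead of stopping.

```shell
# Hypothetical helper: convert every PDF under a directory to text.
# Usage: pdfs_to_txt [dir]    (defaults to the current directory)
pdfs_to_txt() {
    dir=${1:-.}
    # IFS= and -r keep leading spaces and backslashes in file names intact
    find "$dir" -name '*.pdf' | while IFS= read -r f; do
        # With no output name given, pdftotext writes NAME.txt next to NAME.pdf
        pdftotext -layout -nopgbrk "$f" || echo "failed: $f" >&2
    done
}
```

Calling pdfs_to_txt with no argument behaves like the one-line loop; calling pdfs_to_txt ~/papers processes another directory without changing into it first (the path here is just an example).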