Python3.x: pdf2htmlEX (PDF parsing) installation and usage

Source: Internet
Author: User

pdf2htmlEX is a command-line tool that converts PDF files into HTML.

Download

Windows: http://soft.rubypdf.com/software/pdf2htmlex-windows-version

Installation

Download pdf2htmlEX-win32-0.14.6-with-poppler-data.zip and unzip it. No further installation is needed; the tool can be run directly from the unzipped directory.

Test

In a Command Prompt window, change to the directory where you unzipped the archive:

cd /d D:\pdf2htmlEX-win32-0.14.6

Run the test command:

pdf2htmlEX -v

If the command prints the version and copyright information, the installation was successful.
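The same check can be scripted from Python. The following is a minimal sketch, not part of the original article: the function names find_pdf2htmlex and get_version_text are hypothetical, and it assumes the executable is called pdf2htmlEX and is either on PATH or in the directory you pass in.

```python
import shutil
import subprocess


def find_pdf2htmlex(install_dir=None):
    """Return the full path to pdf2htmlEX, or None if it cannot be found.

    install_dir is the unzipped directory (an assumption of this sketch);
    when it is None, the normal PATH is searched instead.
    """
    search_path = None if install_dir is None else str(install_dir)
    return shutil.which("pdf2htmlEX", path=search_path)


def get_version_text(install_dir=None):
    """Run `pdf2htmlEX -v` and return what it printed, or None if not found."""
    executable = find_pdf2htmlex(install_dir)
    if executable is None:
        return None
    result = subprocess.run([executable, "-v"], capture_output=True, text=True)
    # The version text may be written to stdout or stderr, so combine both.
    return (result.stdout + result.stderr).strip()
```

For example, get_version_text(r"D:\pdf2htmlEX-win32-0.14.6") would check the directory used above; a return value of None means the executable was not found there.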

pdf2htmlEX command usage

Usage: pdf2htmlEX [options] <input.pdf> [<output.html>]
    -f,--first-page <int>            first page to convert (default: 1)
    -l,--last-page <int>             last page to convert (default: 2147483647)
    --zoom <fp>                      zoom ratio
    --fit-width <fp>                 fit width to <fp> pixels
    --fit-height <fp>                fit height to <fp> pixels
    --use-cropbox <int>              use the CropBox (default: 1)
    --hdpi <fp>                      horizontal resolution for images (default: 144)
    --vdpi <fp>                      vertical resolution for images (default: 144)
    --embed <string>                 specify which elements should be embedded in the output
    --embed-css <int>                embed CSS files in the output (default: 1)
    --embed-font <int>               embed font files in the output (default: 1)
    --embed-image <int>              embed image files in the output (default: 1)
    --embed-javascript <int>         embed JavaScript files in the output (default: 1)
    --embed-outline <int>            embed the outline in the output (default: 1)
    --split-pages <int>              split pages into separate files (default: 0)
    --dest-dir <string>              specify the destination directory (default: ".")
    --css-filename <string>          file name of the generated CSS file (default: "")
    --page-filename <string>         file name for split pages (default: "")
    --outline-filename <string>      file name of the generated outline file (default: "")
    --process-nontext <int>          render graphics in addition to text (default: 1)
    --process-outline <int>          show the outline in the HTML (default: 1)
    --printing <int>                 enable printing support (default: 1)
    --fallback <int>                 output in fallback mode (default: 0)
    --embed-external-font <int>      embed locally matched external fonts (default: 1)
    --font-format <string>           suffix for embedded font files (ttf,otf,woff,svg) (default: "woff")
    --decompose-ligature <int>       decompose ligatures, such as ﬁ -> fi (default: 0)
    --auto-hint <int>                use FontForge autohint on fonts without hints (default: 0)
    --external-hint-tool <string>    external tool for hinting fonts (overrides --auto-hint) (default: "")
    --stretch-narrow-glyph <int>     stretch narrow glyphs instead of padding them (default: 0)
    --squeeze-wide-glyph <int>       shrink wide glyphs instead of truncating them (default: 1)
    --override-fstype <int>          clear the fstype bits in TTF/OTF fonts (default: 0)
    --process-type3 <int>            convert Type 3 fonts for the web (experimental) (default: 0)
    --heps <fp>                      horizontal threshold for merging text, in pixels (default: 1)
    --veps <fp>                      vertical threshold for merging text, in pixels (default: 1)
    --space-threshold <fp>           word break threshold (threshold * em) (default: 0.125)
    --font-size-multiplier <fp>      a value greater than 1 increases rendering accuracy (default: 4)
    --space-as-offset <int>          treat space characters as offsets (default: 0)
    --tounicode <int>                how to handle ToUnicode CMaps (0=auto, 1=force, -1=ignore) (default: 0)
    --optimize-text <int>            try to reduce the number of HTML elements used for text (default: 0)
    --bg-format <string>             specify the background image format (default: "png")
    -o,--owner-password <string>     owner password (for encrypted files)
    -u,--user-password <string>      user password (for encrypted files)
    --no-drm <int>                   override the document's DRM settings (default: 0)
    --clean-tmp <int>                delete temporary files after conversion (default: 1)
    --data-dir <string>              specify the data directory (default: ".\share\pdf2htmlEX")
    --debug <int>                    print debugging information (default: 0)
    -v,--version                     print copyright and version information
    -h,--help                        print usage information
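When calling the tool from a script, it can help to assemble these options programmatically. The sketch below is an illustration only: build_command is a hypothetical helper, not part of pdf2htmlEX. It assumes every option is passed in its long form (--name value), which matches the usage text above, and it does not validate option names.

```python
from typing import List, Optional


def build_command(
    input_pdf: str,
    output_html: Optional[str] = None,
    executable: str = "pdf2htmlEX",
    **options,
) -> List[str]:
    """Build an argument list following: pdf2htmlEX [options] <input.pdf> [<output.html>]."""
    cmd = [executable]
    for name, value in options.items():
        # Python keyword names use underscores; the command-line flags use hyphens.
        cmd.append("--" + name.replace("_", "-"))
        # Flags documented as <int> expect 0 or 1, so convert booleans first.
        if isinstance(value, bool):
            value = int(value)
        cmd.append(str(value))
    cmd.append(input_pdf)
    if output_html is not None:
        cmd.append(output_html)
    return cmd
```

For instance, build_command("in.pdf", "out.html", first_page=1, last_page=3, split_pages=True) yields a list that can be handed to subprocess.run without going through a shell, which avoids quoting problems with paths that contain spaces.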
Example of calling pdf2htmlEX from Python 3
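The original example did not survive on this page, so the following is a minimal sketch of one way to do it with the standard subprocess module, not the author's code. The function names conversion_command and convert_pdf_to_html are hypothetical. It assumes pdf2htmlEX can be started by that name (pass a full path as executable otherwise) and that the output file name is resolved relative to --dest-dir.

```python
import subprocess
from pathlib import Path


def conversion_command(pdf_path, dest_dir, executable="pdf2htmlEX"):
    """Return the argument list for converting one PDF into dest_dir."""
    pdf = Path(pdf_path)
    return [
        executable,
        "--dest-dir", str(dest_dir),
        str(pdf),
        pdf.stem + ".html",  # output file name, e.g. report.pdf -> report.html
    ]


def convert_pdf_to_html(pdf_path, dest_dir=".", executable="pdf2htmlEX"):
    """Convert pdf_path to HTML and return the expected path of the result."""
    pdf = Path(pdf_path)
    if not pdf.is_file():
        raise FileNotFoundError(f"PDF not found: {pdf}")
    out_dir = Path(dest_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if pdf2htmlEX exits with an error status.
    subprocess.run(conversion_command(pdf, out_dir, executable), check=True)
    return out_dir / (pdf.stem + ".html")
```

On the Windows setup described above, a call might look like convert_pdf_to_html(r"D:\docs\sample.pdf", r"D:\docs\html", executable=r"D:\pdf2htmlEX-win32-0.14.6\pdf2htmlEX.exe"), where the PDF and output paths are placeholders.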
