Ways to generate Word and PDF documents using Python programs

This article describes several ways to generate Word and PDF documents from a Python program, with an overview and sample code for each approach.

I. Exporting Word documents programmatically

Java has many ways to export web/HTML content as a Word document, such as Jacob, Apache POI, Java2word, and iText, often combined with a template engine like FreeMarker. PHP has some options as well, but Python has few ways to generate a Word document from web/HTML content. The hardest part is getting data that JavaScript loads asynchronously, and the images it renders, into the exported Word document.

1. Unoconv

Function:

1. unoconv converts local HTML documents to DOCX, so you need to save the web page's HTML to a local file before calling unoconv. The conversion quality is good and the tool is very simple to use.

    # install
    sudo apt-get install unoconv

    # usage
    unoconv -f pdf *.odt
    unoconv -f doc *.odt
    unoconv -f html *.odt
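Because unoconv is a command-line tool, calling it from Python usually means going through `subprocess`. The following is a minimal sketch, assuming unoconv is installed and on the PATH. The helper names `build_unoconv_command` and `html_to_docx` and the file name `page.html` are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess


def build_unoconv_command(source_path, output_format="docx"):
    """Return the unoconv argument list for converting one file."""
    # -f selects the output format, as in the commands above.
    return ["unoconv", "-f", output_format, source_path]


def html_to_docx(source_path):
    """Convert a locally saved HTML file to DOCX by running unoconv."""
    if shutil.which("unoconv") is None:
        raise RuntimeError("unoconv was not found on PATH")
    # check=True raises CalledProcessError if unoconv exits with an error.
    subprocess.run(build_unoconv_command(source_path), check=True)


# Assumption: page.html is a web page you have already saved locally.
command = build_unoconv_command("page.html")
print(" ".join(command))
```

Keeping the command construction separate from the call makes it easy to inspect or log the exact command before running the conversion.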

Disadvantages:

1. Only static HTML can be converted. Content that the page loads asynchronously through AJAX is not picked up, so you must make sure the HTML file saved from the web page already contains the data.

2. Only the HTML itself is converted. If the page uses JavaScript libraries such as ECharts or Highcharts to render charts, those images cannot be carried over into the Word document.

3. The formatting of the resulting Word document is hard to control.

2. Python-docx

Function:

1. python-docx is a Python library that can read and write Word documents.

How to use:

1. Get the data from the web page, then add it to the Word document with python-docx, laying out the content by hand in code.

    from docx import Document
    from docx.shared import Inches

    document = Document()

    document.add_heading('Document Title', 0)

    # A paragraph built from several runs, each with its own formatting
    p = document.add_paragraph('A plain paragraph having some ')
    p.add_run('bold').bold = True
    p.add_run(' and some ')
    p.add_run('italic.').italic = True

    document.add_heading('Heading, level 1', level=1)
    # Styles are looked up by their display name
    document.add_paragraph('Intense quote', style='Intense Quote')

    document.add_paragraph('first item in unordered list', style='List Bullet')
    document.add_paragraph('first item in ordered list', style='List Number')

    document.add_picture('monty-truth.png', width=Inches(1.25))

    # Header row first, then one row per record
    table = document.add_table(rows=1, cols=3)
    hdr_cells = table.rows[0].cells
    hdr_cells[0].text = 'Qty'
    hdr_cells[1].text = 'Id'
    hdr_cells[2].text = 'Desc'
    # recordset is not defined here; it is assumed to be an iterable of
    # records with qty, id, and desc attributes
    for item in recordset:
        row_cells = table.add_row().cells
        row_cells[0].text = str(item.qty)
        row_cells[1].text = str(item.id)
        row_cells[2].text = item.desc

    document.add_page_break()

    document.save('demo.docx')

    from docx import Document
    from docx.shared import Inches

    document = Document()
    for row in range(9):
        t = document.add_table(rows=1, cols=1, style='Table Grid')
        t.autofit = False  # very important!
        # Each table is half an inch wider than the previous one
        w = float(row) / 2.0
        t.columns[0].width = Inches(w)
    document.save('table-step.docx')
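The "get the data from the web page" step is not shown in the examples above. One possible way to do it, sketched here under assumptions, is to pull table cells out of the saved HTML with the standard library's `html.parser` and then write them with python-docx. The names `TableRows`, `extract_rows`, `rows_to_docx`, and `SAMPLE_HTML` are hypothetical, and the sketch assumes the data already sits in a plain HTML table.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return the table contents of an HTML string as a list of rows."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_docx(rows, path):
    """Write rows (first row is the header) into a Word table."""
    # Imported here so the parsing step works without python-docx installed.
    from docx import Document

    document = Document()
    table = document.add_table(rows=1, cols=len(rows[0]), style="Table Grid")
    for cell, text in zip(table.rows[0].cells, rows[0]):
        cell.text = text
    for row in rows[1:]:
        for cell, text in zip(table.add_row().cells, row):
            cell.text = text
    document.save(path)


# Assumption: a tiny static page standing in for the saved web page.
SAMPLE_HTML = (
    "<table><tr><th>Qty</th><th>Desc</th></tr>"
    "<tr><td>3</td><td>Spam</td></tr></table>"
)
rows = extract_rows(SAMPLE_HTML)
```

Calling `rows_to_docx(rows, 'demo.docx')` would then write the table. As with unoconv, this only sees data that is already present in the HTML, not data fetched later by JavaScript.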

Disadvantages:

The feature set is limited. There are many restrictions, such as no template support, so only simply formatted Word documents can be generated.

II. Exporting PDF documents programmatically

1. pdfkit

Function:

1. wkhtmltopdf is used primarily to generate PDFs from HTML.

2. pdfkit is a Python wrapper around wkhtmltopdf. It can convert URLs, local files, and text content to PDF, and ultimately calls the wkhtmltopdf command. Of the Python PDF tools the author has tried so far, it gives the best results.

Advantages:

1. wkhtmltopdf uses the WebKit engine to convert HTML to PDF.

WebKit is an efficient, open-source browser engine. Safari uses it, and Chrome used it before switching to Blink, a fork of WebKit. Chrome's print dialog, for instance, lets you save the current page directly as a PDF.

2. wkhtmltopdf uses WebKit's rendering to convert HTML pages to PDF. The output has high fidelity, the conversion quality is very good, and it is very simple to use.

How to use:

    # install (pdfkit also needs the wkhtmltopdf binary installed separately)
    pip install pdfkit

    # usage
    import pdfkit

    pdfkit.from_url('http://google.com', 'out.pdf')
    pdfkit.from_file('test.html', 'out.pdf')
    pdfkit.from_string('hello!', 'out.pdf')
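In practice the HTML is often built from data first and then handed to pdfkit together with page options. The following is a rough sketch of that flow. The helper names `build_report_html` and `write_pdf`, the option values, and the sample data are assumptions for illustration; the option keys follow wkhtmltopdf's command-line flags without the leading dashes.

```python
from html import escape

# Option names mirror wkhtmltopdf flags; the values are example choices.
PDF_OPTIONS = {
    "page-size": "A4",
    "margin-top": "20mm",
    "margin-bottom": "20mm",
    "encoding": "UTF-8",
}


def build_report_html(title, rows):
    """Build a small static HTML page from (name, value) pairs."""
    body = "".join(
        f"<tr><td>{escape(str(name))}</td><td>{escape(str(value))}</td></tr>"
        for name, value in rows
    )
    return (
        "<html><head><meta charset='utf-8'></head><body>"
        f"<h1>{escape(title)}</h1><table>{body}</table></body></html>"
    )


def write_pdf(html, output_path, wkhtmltopdf_path=None):
    """Render an HTML string to a PDF file with pdfkit."""
    # Imported here so the HTML step can be tried without pdfkit installed.
    import pdfkit

    config = None
    if wkhtmltopdf_path is not None:
        # Point pdfkit at the binary when it is not on PATH.
        config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path)
    pdfkit.from_string(html, output_path, options=PDF_OPTIONS, configuration=config)


# Assumption: sample data standing in for values fetched from a web page.
html = build_report_html("Stock", [("Spam", 3), ("Eggs & ham", 12)])
```

Calling `write_pdf(html, 'out.pdf')` would then produce the file. Escaping the values keeps characters such as `&` from breaking the markup before it reaches wkhtmltopdf.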

Disadvantages:

1. Charts generated by JavaScript libraries such as ECharts and Highcharts cannot be converted to PDF, because the tool's main job is converting HTML to PDF, not rendering JavaScript output. The conversion quality for purely static pages is still good.
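One thing that may be worth trying for JavaScript-rendered charts is wkhtmltopdf's `--javascript-delay` flag, which waits a given number of milliseconds before printing so that scripts have time to run. The sketch below passes it through pdfkit's `options`. The helper names `js_chart_options` and `chart_page_to_pdf` and the 2000 ms default are assumptions, and this is not guaranteed to work: wkhtmltopdf bundles an old WebKit, so chart libraries that rely on newer JavaScript features may still fail to render.

```python
def js_chart_options(delay_ms=2000):
    """Return pdfkit options that give page scripts time to finish."""
    return {
        # Wait this many milliseconds for JavaScript before printing.
        "javascript-delay": str(delay_ms),
        # Flag without a value: do not stop slow-running scripts.
        "no-stop-slow-scripts": None,
    }


def chart_page_to_pdf(url, output_path, delay_ms=2000):
    """Convert a page containing JS charts, waiting before rendering."""
    # Imported here so the options can be inspected without pdfkit installed.
    import pdfkit

    pdfkit.from_url(url, output_path, options=js_chart_options(delay_ms))


options = js_chart_options(500)
```

If the charts still come out blank, rendering them to static images on the server and embedding those in the HTML sidesteps the JavaScript step entirely.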

2. Other

Other libraries that generate PDFs include WeasyPrint, ReportLab, and PyPDF2. In brief testing their results were not as good as pdfkit's, and some of them are more complex to use.
