The python program is used to generate word and PDF documents,

Source: Internet
Author: User
Tags wkhtmltopdf

The python program is used to generate word and PDF documents,

I. Procedure for exporting Word documents

Export web/html content as world documents, and there are many solutions in java, such as using Jacob, Apache POI, Java2Word, iText, and other methods, and use a template engine like freemarker. Php also has some corresponding methods, but there are few methods for generating world documents from web/html content in python. The most difficult solution is how to asynchronously retrieve the filled data using js Code and export the images to the Word documents.

1. unoconv

Function:

1. You can convert a local html document to a document in docx format. Therefore, you need to save the html file on the webpage to your local computer before calling unoconv for conversion. The conversion effect is also good, and the usage is very simple.

\ # Install sudo apt-get install unoconv \ # Use unoconv-f pdf *. odtunoconv-f doc *. odtunoconv-f html *. odt

Disadvantages:

1. Only static html can be converted. ajax cannot be used to obtain data asynchronously on the page (mainly to ensure that there is data in the html file saved from the web page ).

2. Only html can be converted. If images generated by js Code such as echarts and highcharts are used on the page, they cannot be converted to Word documents;

3. The format of the generated Word documents is not easy to control.

2. python-docx

Function:

1. python-docx is a python library that can read and write Word documents.

Usage:

1. Obtain the data on the webpage and add it to the Word documents manually using python layout.

from docx import Documentfrom docx.shared import Inchesdocument = Document()document.add_heading('Document Title', 0)p = document.add_paragraph('A plain paragraph having some ')p.add_run('bold').bold = Truep.add_run(' and some ')p.add_run('italic.').italic = Truedocument.add_heading('Heading, level 1', level=1)document.add_paragraph('Intense quote', style='IntenseQuote')document.add_paragraph( 'first item in unordered list', style='ListBullet')document.add_paragraph( 'first item in ordered list', style='ListNumber')document.add_picture('monty-truth.png', width=Inches(1.25))table = document.add_table(rows=1, cols=3)hdr_cells = table.rows[0].cellshdr_cells[0].text = 'Qty'hdr_cells[1].text = 'Id'hdr_cells[2].text = 'Desc'for item in recordset: row_cells = table.add_row().cells row_cells[0].text = str(item.qty) row_cells[1].text = str(item.id) row_cells[2].text = item.descdocument.add_page_break()document.save('demo.docx')
From docx import Documentfrom docx. shared import Inchesdocument = Document () for row in range (9): t = document. add_table (rows = 1, cols = 1, style = 'table grid') t. autofit = False # very important! W = float (row)/2.0 t. columns [0]. width = inches(ww.document.save('table-step.docx ')

Disadvantages:

Weak functions. There are many restrictions, for example, templates are not supported. You can only generate WORD Documents in simple format.

Ii. Procedure for exporting PDF documents

1. Development Kit

Function:

1. wkhtmltopdf is mainly used to generate PDF in HTML.

2.pdf kit is a python package based on wkhtmltopdf. It supports conversion from URL, local files, text content to PDF. In the end, it still calls the wkhtmltopdf command. Is currently exposed to python to generate pdf better results.

Advantages:

1. wkhtmltopdf: convert HTML to PDF using webkit Kernel

Webkit is an efficient and open-source browser kernel, which is used by browsers including Chrome and Safari. Chrome prints the current webpage. One option is to directly "Save as PDF ".

2. wkhtmltopdf converts HTML pages to PDF using the PDF rendering engine of webkit kernel. High Fidelity, good conversion quality, and easy to use.
Usage:

\ # Install pip install Development Kit \ # Use import Export kitkit. from_url ('HTTP: // google.com ', 'out.}'{kit.from_file('test.html', 'output') to install kit. from_string ('Hello! ', 'Output ')

Disadvantages:

1. The icons generated by js Code such as echarts and highcharts cannot be converted to pdf (because its function is mainly to convert html to pdf rather than JavaScript to pdf ). The pure Static Page conversion effect is good.

2. Miscellaneous

Other pdf INS that generate pdf files include weasyprint, reportlab, and PyPDF2. After a simple experiment, these plug-ins are not as effective as the development kit and have complicated usage.

Summary

The above is all about this article. I hope this article will help you in your study or work. If you have any questions, please leave a message.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.