Practical Experience 1: Third-party Java tools for processing docx

Source: Internet
Author: User

I recently worked on a project that processes docx documents, so pulling in third-party packages was unavoidable. Below are the tools I came across during that project that can process docx (the list is not exhaustive). For each tool I give a brief introduction, my personal impressions, and a Maven repository link:

1.Apache POI

POI was the first tool I found. Perhaps this is only a first impression, but POI is also the library with the most material online about working with Office documents. Personally, though, I feel that POI leans toward Excel documents and offers weaker support for Word documents. POI online documentation: http://poi.apache.org/apidocs/index.html

2.Aspose.Words

I looked at Aspose.Words because, having written one set of code for docx documents, I did not want to write it all again for the older doc format, so doc files had to be converted to docx first. A search suggested that Aspose.Words could do this, so I tried it and found two drawbacks. First, at the time of writing, the Aspose.Words toolkit I used was apparently still a trial or test version: the doc-to-docx conversion has a word-count limit or some similar restriction, and in any case the content of the generated docx did not match the source document. Second, at the time of writing, the toolkit is reportedly a paid product, so anyone planning commercial use should take that into account, or wait for it to be open-sourced (not very realistic).
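Since doc and docx files can end up in the same input folder, it helps to know which files actually need converting. The following is a minimal Java sketch and not part of the original project: it assumes the file extension cannot be trusted and inspects the leading bytes instead. The class `WordFormatSniffer` and its method names are hypothetical. Legacy doc files use the OLE2 compound file container and docx files are ZIP archives, so the check identifies only the container (an xls or xlsx file would match too).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

/** Hypothetical helper: guesses the container format of a Word file from its first bytes. */
final class WordFormatSniffer {

    /** OLE2 = legacy binary container (.doc), ZIP = Office Open XML container (.docx). */
    enum Kind { OLE2, ZIP, UNKNOWN }

    // Signature of an OLE2 compound file, the container used by legacy .doc files.
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // Local file header signature of a ZIP archive, the container used by .docx files.
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Classifies a buffer holding the first bytes of a file. */
    static Kind detectHeader(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return Kind.OLE2;
        if (startsWith(header, ZIP_MAGIC)) return Kind.ZIP;
        return Kind.UNKNOWN;
    }

    /** Reads up to eight bytes from the file and classifies them. */
    static Kind detectFile(Path file) throws IOException {
        byte[] header = new byte[8];
        int read = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) break; // file is shorter than the header buffer
                read += n;
            }
        }
        return detectHeader(Arrays.copyOf(header, read));
    }

    private static boolean startsWith(byte[] data, int[] magic) {
        if (data == null || data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java WordFormatSniffer <file>");
            return;
        }
        System.out.println(detectFile(Paths.get(args[0])));
    }
}
```

Files reported as `OLE2` would be the ones to send through a doc-to-docx conversion step; files reported as `ZIP` can go straight to the docx code path.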

3.Java COM Bridge

JACOB for short. It handles the doc-to-docx conversion perfectly, and I was pleasantly surprised to find that it also supports many other conversions, such as Excel to PDF and Word to PPT. Each format is represented by a constant and the calls are very simple. Strongly recommended. Its drawback, however, is fatal: it cannot run in a Linux environment (T_T).
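Because JACOB drives Microsoft Office through COM, it needs Windows with Office installed. One way to live with that limitation is to decide at startup which conversion path is available. The sketch below is only an illustration under that assumption: `ConverterChooser` and the `Backend` names are hypothetical, and no JACOB API is called here.

```java
import java.util.Locale;

/** Hypothetical helper: picks a conversion backend based on the operating system name. */
final class ConverterChooser {

    /** JACOB_COM = automate Office through COM; PURE_JAVA = any library that runs everywhere. */
    enum Backend { JACOB_COM, PURE_JAVA }

    /** COM is only available on Windows, so every other OS falls back to a pure-Java backend. */
    static Backend choose(String osName) {
        boolean windows = osName != null
                && osName.toLowerCase(Locale.ROOT).startsWith("windows");
        return windows ? Backend.JACOB_COM : Backend.PURE_JAVA;
    }

    public static void main(String[] args) {
        // The JVM reports the host operating system in the "os.name" system property.
        System.out.println(choose(System.getProperty("os.name")));
    }
}
```

The check covers only the operating system; whether Office is actually installed would still have to be verified when the COM call is made.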

4.Apache POI XWPF Converter Core + Apache POI XWPF Converter XHTML

These two toolkits build on Apache POI (despite the names, they are published by the XDocReport project rather than by Apache itself). I used the combination to convert docx files to HTML; for a concrete application see: https://github.com/jeckeyLiu/word2Html/blob/master/src/main/java/com/abc/word2html/util/word2html.java. Plenty of other code that uses core + xhtml for docx-to-HTML conversion can be found online, but debugging it produced a NoSuchMethod exception, which I have not been able to resolve so far ...

5.docx4j

This really is the power tool for docx. It goes straight to the nature of the format: underneath, an Office document is actually XML, so docx4j is essentially parsing XML. Its docx support is strong. Highly recommended.
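To make the "it is XML underneath" point concrete, here is a minimal sketch that uses only the JDK and no docx4j API at all. It relies on two facts about the format: a docx file is a ZIP archive, and the main document body lives in the entry `word/document.xml`. The class `DocxXmlPeek` and its methods are hypothetical names, and the namespace constant is the one used by the common (transitional) WordprocessingML format. A real library such as docx4j handles far more (styles, tables, headers, relationships).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper: reads paragraph text from a docx file using only the JDK. */
final class DocxXmlPeek {

    /** Namespace of WordprocessingML elements such as w:p (paragraph) and w:t (text). */
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the bytes of word/document.xml, or null if the archive has no such entry. */
    static byte[] readMainPart(InputStream docx) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buf = new byte[8192];
                    int n;
                    while ((n = zip.read(buf)) != -1) {
                        out.write(buf, 0, n);
                    }
                    return out.toByteArray();
                }
            }
        }
        return null;
    }

    /** Concatenates the w:t text of each w:p element, one paragraph per line. */
    static String extractText(byte[] documentXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations so untrusted files cannot trigger entity expansion.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document dom = factory.newDocumentBuilder().parse(new ByteArrayInputStream(documentXml));

        StringBuilder text = new StringBuilder();
        NodeList paragraphs = dom.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            NodeList runs = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < runs.getLength(); j++) {
                text.append(runs.item(j).getTextContent());
            }
            text.append('\n');
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java DocxXmlPeek <file.docx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            byte[] part = readMainPart(in);
            System.out.println(part == null ? "no word/document.xml found" : extractText(part));
        }
    }
}
```

Seeing the raw XML this way also makes it easier to understand what a library like docx4j exposes as Java objects.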

6.iText

I have not used it; I have only heard that it can also process these documents. It is listed here for future reference.

That is all for now; I will update the list when there is something new.
