H2P, a PDF generation solution: technical details of h2p

Source: Internet
Author: User
Tags: character set

Many people find H2P an exciting solution, and it is natural for the Javaei site to offer h2p files as a resource. I am fond of h2p because I proposed the project, even though the technology used to implement H2P is all very mature. This article covers the technologies involved in implementing H2P: core J2SE features as well as open source frameworks.

(1) DTD. To help users edit h2p files correctly, I defined a DTD for the h2p file format. I also hope that it can eventually become a standard.

(2) Validating XML (h2p files) against the DTD. H2p-tool parses the XML (h2p file) to extract the URLs, generates a PDF for each one, and merges the results. An invalid XML file cannot produce a correct PDF, so the file must be validated first.
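As a rough illustration of the validation step, the sketch below turns on DTD validation in the JDK's built-in parser and collects the errors it reports. The DTD shown (an h2p root holding nested item elements with title and url attributes) is a made-up stand-in, not the real h2p DTD, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class H2pValidator {
    // Hypothetical DTD for illustration only; the real h2p DTD is not shown in the article.
    static final String SAMPLE_DTD =
          "<!DOCTYPE h2p [\n"
        + "  <!ELEMENT h2p (item+)>\n"
        + "  <!ELEMENT item (item*)>\n"
        + "  <!ATTLIST item title CDATA #REQUIRED url CDATA #IMPLIED>\n"
        + "]>\n";

    /** Parses the XML with DTD validation enabled and returns every problem found. */
    static List<String> validate(String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // check the document against its DOCTYPE
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { problems.add("error: " + e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Malformed XML ends up here via fatalError.
            problems.add("fatal: " + e.getMessage());
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        String bad = SAMPLE_DTD + "<h2p><item url=\"http://example.com/a\"/></h2p>";
        // The item lacks the required title attribute, so at least one error is listed.
        validate(bad).forEach(System.out::println);
    }
}
```

An empty list means the file passed validation; a tool could refuse to start PDF generation whenever the list is non-empty.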

(3) XML parsing. When an h2p file is edited, a JTree shows the tree structure of the XML; this part uses DOM-style parsing (specifically JDOM). Extracting the bookmark directory structure and building the bookmark data structure uses SAX parsing with the help of a stack. The XML file is saved through the DOM.
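The SAX-plus-stack idea can be sketched as follows: each opening tag creates a node, attaches it to whatever node is on top of the stack, and pushes itself; each closing tag pops. The element name item, the attribute names title and url, and the BookmarkNode class are assumptions made for this example, not the actual H2p-tool code.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical bookmark node: a title, a target URL, and nested children. */
class BookmarkNode {
    final String title;
    final String url;
    final List<BookmarkNode> children = new ArrayList<>();

    BookmarkNode(String title, String url) {
        this.title = title;
        this.url = url;
    }
}

class BookmarkHandler extends DefaultHandler {
    final BookmarkNode root = new BookmarkNode("root", null);
    private final Deque<BookmarkNode> stack = new ArrayDeque<>();

    BookmarkHandler() {
        stack.push(root); // the synthetic root is the parent of all top-level items
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (!"item".equals(qName)) {
            return;
        }
        BookmarkNode node = new BookmarkNode(atts.getValue("title"), atts.getValue("url"));
        stack.peek().children.add(node); // attach to the currently open parent
        stack.push(node);                // this node is now the open parent
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("item".equals(qName)) {
            stack.pop(); // the element is closed, so return to its parent
        }
    }

    /** Parses the XML text and returns the synthetic root of the bookmark tree. */
    static BookmarkNode parse(String xml) {
        try {
            BookmarkHandler handler = new BookmarkHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.root;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The stack is what lets a streaming parser recover nesting: at any moment its top element is the bookmark whose children are currently being read.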

(4) Swing. The editing features of H2p-tool are built with Swing; the tree structure is displayed and edited with a JTree.
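A minimal sketch of feeding an XML tree into a JTree is shown below. It uses the JDK's own org.w3c.dom API in place of JDOM so that it runs without extra jars, and it assumes each element carries a title attribute to use as the node label; both choices are assumptions for illustration.

```java
import java.io.StringReader;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class H2pTreeBuilder {
    /** Recursively mirrors a DOM element and its child elements as Swing tree nodes. */
    static DefaultMutableTreeNode toTreeNode(Element element) {
        String title = element.getAttribute("title"); // empty string when absent
        String label = title.isEmpty() ? element.getTagName() : title;
        DefaultMutableTreeNode treeNode = new DefaultMutableTreeNode(label);
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) { // skip text and comment nodes
                treeNode.add(toTreeNode((Element) child));
            }
        }
        return treeNode;
    }

    /** Parses the XML text and returns the root node for a tree model. */
    static DefaultMutableTreeNode fromXml(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return toTreeNode(root);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

To display the result, the returned node can be passed to new JTree(rootNode) and placed in a JScrollPane. Editing support would additionally need a way to write changes back to the XML tree, which this sketch leaves out.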

(5) C# component. A single PDF is generated from each URL by a third-party C# component, wrapped in a small program developed on the VS.NET platform. An h2p file provided by Javaei usually contains dozens of URLs, and generating a PDF takes a while even for one URL, let alone dozens, so multithreading seemed the way to go and I studied C# multithreading (which is interesting, and feels simpler than Java's). However, the third-party component did not work well with multiple threads, so I had no choice but to use a single thread. I also wanted to wrap the C# call with JNI, but from what I read this is rather troublesome, so I gave up and took a simpler route: the component is called through a batch file.
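One plausible way for the Java side to call such a batch file is ProcessBuilder, sketched below. The script name url2pdf.bat and its two arguments (URL and output file) are invented for the example; the article does not describe the real batch file's interface.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class BatchPdfLauncher {
    /** Builds the Windows command line that runs the (hypothetical) batch script. */
    static List<String> buildCommand(String script, String url, String outputPdf) {
        // "cmd /c" runs the script and exits when the script finishes.
        return Arrays.asList("cmd", "/c", script, url, outputPdf);
    }

    /** Runs the command in the given directory, waits for it, and returns its exit code. */
    static int run(List<String> command, File workingDir) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.directory(workingDir);
        builder.inheritIO(); // show the script's output on this process's console
        return builder.start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        List<String> command = buildCommand("url2pdf.bat", "http://example.com/", "page1.pdf");
        System.out.println("exit code: " + run(command, new File(".")));
    }
}
```

Calling run once per URL in a plain loop matches the single-threaded approach described above: each PDF is finished before the next one starts.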

(6) iText. Generating individual PDF pages, merging the PDFs, and generating the bookmarks all use iText. For merging I also looked at another framework, PDFBox. iText is really powerful: in theory it can produce arbitrary output and should be able to reproduce what the browser renders, but doing so is cumbersome. iText handles bookmarks well. A bookmark can point to any position on any page, and you can set how the page is displayed when the bookmark is opened. Developing bookmarks is convenient too; I find building the bookmark tree data structure directly to be the simplest approach, and iText also supports describing the bookmark structure in XML.
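The bookmark data structure for a merged file can be sketched in plain Java. The example assumes the list-of-maps layout documented for iText's SimpleBookmark class (keys such as Title, Action, Page, and Kids for nested entries); that layout should be checked against the iText version in use. It computes the first page of each source PDF inside the merged document, which is where each top-level bookmark should point. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

class MergedBookmarks {
    /**
     * Builds one top-level bookmark per source PDF.
     * titles[i] is the bookmark text and pageCounts[i] the page count of the i-th PDF.
     */
    static List<HashMap<String, Object>> build(String[] titles, int[] pageCounts) {
        List<HashMap<String, Object>> outline = new ArrayList<>();
        int firstPage = 1; // pages in the merged PDF are numbered from 1
        for (int i = 0; i < titles.length; i++) {
            HashMap<String, Object> entry = new HashMap<>();
            entry.put("Title", titles[i]);
            entry.put("Action", "GoTo");
            entry.put("Page", firstPage + " Fit"); // jump to the page and fit it in the window
            outline.add(entry);
            firstPage += pageCounts[i]; // the next PDF starts after this one's pages
        }
        return outline;
    }

    public static void main(String[] args) {
        List<HashMap<String, Object>> outline =
                build(new String[] {"Intro", "Setup", "Usage"}, new int[] {3, 2, 5});
        for (HashMap<String, Object> entry : outline) {
            System.out.println(entry.get("Title") + " -> " + entry.get("Page"));
        }
    }
}
```

With page counts 3, 2 and 5, the three bookmarks point at pages 1, 4 and 6. In iText, a list of this shape is what would be handed to the writer's setOutlines method while the pages are being copied; nested bookmarks would go in a child list stored under the Kids key.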

(7) ClassLoader. H2p-tool depends on many jar files. In everyday development the jars are placed in a designated directory and the application server loads them. H2p-tool has to handle its jars itself, because it is a tool for end users, who should not have to configure anything beyond the JDK environment variables. The usual solution is to list the relative paths of the jars in the MANIFEST.MF file (the Class-Path entry) of the jar that contains the main class (the class with the main method). This approach is inflexible, so H2p-tool overrides class loading so that it automatically loads every jar under a specified directory.
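One common way to load every jar in a directory at startup is a small launcher built on URLClassLoader, sketched below. This is not necessarily how H2p-tool overrides class loading; the directory and main class names passed to launch are placeholders.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LibDirLoader {
    /** Returns a URL for every .jar file directly inside libDir (empty if the directory is missing). */
    static URL[] jarUrls(File libDir) {
        File[] jars = libDir.listFiles((dir, name) -> name.toLowerCase().endsWith(".jar"));
        if (jars == null) {
            return new URL[0]; // not a directory, or it could not be read
        }
        Arrays.sort(jars); // fixed order, so class lookup is reproducible
        List<URL> urls = new ArrayList<>();
        for (File jar : jars) {
            try {
                urls.add(jar.toURI().toURL());
            } catch (MalformedURLException e) {
                throw new IllegalStateException(e);
            }
        }
        return urls.toArray(new URL[0]);
    }

    /** Loads mainClass through a loader that sees all jars in libDir, then calls its main method. */
    static void launch(File libDir, String mainClass, String[] args) throws Exception {
        URLClassLoader loader =
                new URLClassLoader(jarUrls(libDir), LibDirLoader.class.getClassLoader());
        // Some libraries look classes up through the thread's context class loader.
        Thread.currentThread().setContextClassLoader(loader);
        loader.loadClass(mainClass)
                .getMethod("main", String[].class)
                .invoke(null, (Object) args);
    }
}
```

With a launcher like this, adding a dependency means dropping a jar into the directory; no manifest or environment variable has to change. The launcher class itself must stay free of direct references to classes in those jars, since it is loaded before they are available.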

(8) JVM parameters. Many PDFs have to be merged, and merging takes a lot of memory, so it is easy to run out of memory. The batch file therefore sets appropriate JVM parameters, mainly two: -Xmx512m and -Xms512m (the maximum and initial heap size). Their meaning is widely documented online, so I will not repeat it here.
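A quick way to confirm that the heap setting from the batch file actually reached the JVM is to print the maximum heap at startup. The class below is a hypothetical helper, not part of H2p-tool.

```java
class HeapCheck {
    /** Returns the maximum heap the JVM will try to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run as: java -Xmx512m -Xms512m HeapCheck
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

The reported figure may differ slightly from the -Xmx value depending on the JVM and garbage collector, so it is best read as a sanity check, not an exact measurement.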

To sum up, the implementation of H2p-tool seems to involve many technologies but in fact involves very few: apart from JDOM and iText, everything comes from core J2SE, namely the items listed above plus streams and character set handling.
