PDF document generation technology based on PHP and XML

Source: Internet
Author: User
Abstract

This paper briefly introduces the principles of PHP, XML, and PDF and how they are applied. Using the object-oriented features of PHP, it constructs an online PDF document generation system based on PHP and XML. The components of the system and the way each one is implemented are discussed in detail. The paper closes with an example of a dynamically created report produced by this system.



1. Introduction

In an era of rapidly developing information technology, governments, enterprises, and individuals alike are keenly interested in using it to work more efficiently and cut costs. They are looking for a technology that can turn traditional paper documents (statements, bills, manuals, application forms, and so on) into electronic documents that are conveniently generated, distributed, downloaded, browsed, and printed automatically over the Internet or an internal network. Popular ideas such as the "paperless office" and "e-commerce" rest on exactly this.

Such a format exists: Adobe PDF (Portable Document Format), an open and practical standard for distributing electronic documents worldwide. Any browser can view, download, and print PDF documents once the Acrobat Reader 5.0 plug-in is installed. PDF has advantages that other electronic document formats cannot match.

The browser/server (B/S) architecture is currently the most popular software architecture and is likely to remain so, and it is well suited to implementing all kinds of browser-based Web applications. PHP is a good Web programming language that is especially suitable for front-end tasks for browser users, such as processing form input and querying databases. Because PHP is open source, it is more widely used than other comparable Web scripting languages, and its functionality is constantly being extended and improved. The latest version of PHP supports PDF, XML, and more. Through the API it provides, PDF documents can be produced very quickly. Most attractively, PHP can query a database or an XML data file and insert the results into the generated PDF document, producing reports, documents, manuals, and the like that browse and print well.

It is easy to see that a system combining the three technologies PHP, XML, and PDF to generate PDF documents dynamically online is very practical. Its main benefits are:

• Documents can be generated on the network and distributed over the network, saving a great deal of manpower and material resources. With accurate, attractive printed output, a truly paperless office becomes possible.

• The bills and vouchers that arise in e-commerce transactions can be generated online by PHP scripts and sent to customers in PDF format.

• The print-oriented reports of an enterprise MIS can be generated and retrieved directly through the browser, without installing any client software, which is extremely convenient.

Traditionally, documents were "printed first, then distributed", and the annual printing cost was a heavy burden on governments and enterprises. PDF documents are "distributed first, then printed": the recipient browses the document and prints it only as needed. Printing costs drop sharply, which also benefits the environment.

2. Topic Introduction

While developing several software projects we ran into a key problem: generating a large number of print-oriented reports and documents. HTML is well suited to browsing but not to printing with a precisely specified layout. We therefore had to find a document format that PHP can generate dynamically and that prints well; this was the most direct motivation for studying this subject. PDF, together with pdflib, the PDF support library for PHP, was the natural choice. With the API provided by pdflib, PDF documents can easily be created dynamically in PHP scripts. However, this is only a very basic set of functions. It can produce only simple output such as lines, text, and rectangles, and the coordinates of every object must be specified before it is drawn. Building a practical application, such as a complex report, directly on these functions is extremely difficult: to draw a table, we would have to calculate the coordinates of every element in advance and draw each cell as a rectangular frame, which is not workable.
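To give a feel for the bookkeeping this implies, the following minimal sketch computes by hand the rectangle of every cell in a simple grid, which is what would be needed before each cell frame could be drawn with a low-level API. The helper `cell_rects` and its parameters are hypothetical and are not part of pdflib; the sketch assumes a PDF-style coordinate system with the origin at the bottom-left corner of the page.

```php
<?php
// Hypothetical helper (not a pdflib function): returns the rectangle of each
// cell of a grid whose top-left corner is at ($left, $top).
// Assumes the origin is at the bottom-left of the page, so y decreases downward.
function cell_rects(float $left, float $top, array $colWidths, float $rowHeight, int $rows): array
{
    $rects = [];
    for ($r = 0; $r < $rows; $r++) {
        $x = $left;
        // Lower edge of row $r.
        $y = $top - ($r + 1) * $rowHeight;
        foreach ($colWidths as $c => $w) {
            $rects[$r][$c] = ['x' => $x, 'y' => $y, 'w' => $w, 'h' => $rowHeight];
            $x += $w; // the next cell starts where this one ends
        }
    }
    return $rects;
}

// Two rows, three columns of widths 100, 60 and 60 points.
$rects = cell_rects(50, 800, [100, 60, 60], 20, 2);
printf("%d,%d\n", $rects[1][2]['x'], $rects[1][2]['y']);
```

Even this toy version ignores text wrapping, borders, and page breaks, all of which would add further coordinate arithmetic.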

So our first step was to use PHP's object-oriented programming facilities to encapsulate this basic API into a number of practical object modules with independent functions (such as a Page object, a Table object, a Text object, and so on). This is the most basic and most important part of the project. I referred to, and partially adopted, some similar open source programs from abroad, and on that basis developed a more powerful class library. It greatly simplifies the production of PDF documents. In particular, Table objects can be nested arbitrarily, like table tags in HTML, so that all kinds of complex tables can be rendered quickly and easily (which is very useful for generating reports dynamically).
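The idea of nestable objects can be sketched as follows. This is not the class library described above, whose source is not shown; `PdfElement`, `PdfText`, and `PdfTable` are hypothetical names, and the sketch only computes heights to show how a table can contain another table without the caller doing any coordinate arithmetic.

```php
<?php
// Hypothetical classes for illustration only; the real library's API is not shown in the text.
interface PdfElement
{
    public function height(): float;
}

class PdfText implements PdfElement
{
    private string $content;
    private float $lineHeight;

    public function __construct(string $content, float $lineHeight = 12.0)
    {
        $this->content = $content;
        $this->lineHeight = $lineHeight;
    }

    // A single line of text is as tall as its line height.
    public function height(): float
    {
        return $this->lineHeight;
    }
}

class PdfTable implements PdfElement
{
    /** @var PdfElement[][] rows of cells */
    private array $rows = [];

    public function addRow(PdfElement ...$cells): self
    {
        $this->rows[] = $cells;
        return $this;
    }

    // A row is as tall as its tallest cell; the table is the sum of its rows.
    public function height(): float
    {
        $total = 0.0;
        foreach ($this->rows as $cells) {
            $rowHeight = 0.0;
            foreach ($cells as $cell) {
                $rowHeight = max($rowHeight, $cell->height());
            }
            $total += $rowHeight;
        }
        return $total;
    }
}

// A table whose second cell is itself a two-row table.
$inner = (new PdfTable())->addRow(new PdfText('a'))->addRow(new PdfText('b'));
$outer = (new PdfTable())->addRow(new PdfText('left'), $inner);
echo $outer->height(), "\n";
```

Because a table is itself an element, nesting needs no special case: the outer table simply asks each cell for its height, whatever the cell contains.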

With PDF generation solved, we faced a new problem: how does, for example, a database query page pass a result set containing a large amount of data, along with other information, to the PDF generation page? Our first idea was to pass a text file: the database query page writes the data into a text file, with a set of distinguishing tags defined for the different categories of data, and the PDF generation page reads the file and inserts its contents into the PDF. But this is not reliable. The text file uses a specific character (or a space) to separate the data, so what happens if the useful data itself contains that character or a space? Passing data this way is a hidden danger. On the other hand, marking different kinds of data with different tags in a text file is precisely the idea behind XML. Why not go one step further and use XML as the means of data transfer? Moreover, PHP supports XML and XSLT well: with the expat parser we can extract any data from an XML document, and with Sablotron, the XSLT engine for PHP, we can transform an XML document arbitrarily.
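The delimiter problem is easy to reproduce. The sketch below is only an illustration under assumed data: the sample row and the element names `item`, `name`, `lot`, and `quantity` are hypothetical. It uses PHP's DOM and SimpleXML extensions to show that a value containing the separator survives a round trip through XML but not through a comma-separated line.

```php
<?php
// A product name that happens to contain the delimiter and a markup character.
$row = ['W2308, grade <A>', '1234', '200'];

// Ad hoc text format: joining with "," loses the field boundaries.
$line = implode(',', $row);
$fieldsBack = explode(',', $line); // yields 4 fields instead of 3

// XML: each value sits in its own element and special characters are escaped.
$doc = new DOMDocument('1.0', 'UTF-8');
$item = $doc->appendChild($doc->createElement('item'));
foreach (['name', 'lot', 'quantity'] as $i => $tag) {
    $el = $item->appendChild($doc->createElement($tag));
    $el->appendChild($doc->createTextNode($row[$i]));
}
$xml = $doc->saveXML();

// Reading the document back recovers the original value unchanged.
$parsed = simplexml_load_string($xml);
$nameBack = (string) $parsed->name;

echo count($fieldsBack), ' ', $nameBack, "\n";
```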

First, the XML builder places the data (from a database, user input, and so on) in an XML document that conforms to a predefined DTD; this document describes the content of the data and contains no formatting information. The XML converter then transforms it into another XML document that also contains display style information. The PDF builder reads that document and generates the corresponding PDF document from its content and display style. For this process I again used the object-oriented features of PHP to build reusable classes: XMLWriter (generates XML files), XMLParser (parses XML files), and XMLTransformer (encapsulates the XSLT functions).
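As a rough illustration of the XML builder step, the sketch below writes rows of data into a formatting-free XML document. It uses PHP's built-in XMLWriter extension class, which is unrelated to the XMLWriter class of this system despite the shared name. The function `build_report_xml` and all element names are hypothetical, since the DTD used by the system is not shown.

```php
<?php
// Hypothetical element names; the DTD used by the original system is not shown.
function build_report_xml(string $title, string $unit, array $rows): string
{
    $w = new XMLWriter();       // PHP's built-in extension class
    $w->openMemory();           // build the document in memory
    $w->setIndent(true);
    $w->startDocument('1.0', 'UTF-8');
    $w->startElement('report');
    $w->writeElement('title', $title);
    $w->writeElement('unit', $unit);
    foreach ($rows as $row) {
        $w->startElement('item');
        foreach ($row as $field => $value) {
            // One child element per column; content is escaped automatically.
            $w->writeElement($field, (string) $value);
        }
        $w->endElement();       // </item>
    }
    $w->endElement();           // </report>
    $w->endDocument();
    return $w->outputMemory();
}

// In the real system these rows would come from a database query.
$rows = [
    ['product' => 'W2308', 'lot' => '1234', 'quantity' => 200],
    ['product' => 'W2307', 'lot' => '4321', 'quantity' => 100],
];
$reportXml = build_report_xml('Inventory History Transaction Table', 'sqm', $rows);
echo $reportXml;
```

The output describes only what the data is, not how it should look, which is what allows the later stages to change the layout independently.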

Once the system was built, it was put to concrete use, mainly for the dynamic generation of the various reports and documents of an invoicing system.

3. Feasibility analysis

Developing a powerful, adaptable online PDF document generation system requires a flexible, highly adaptable development model. The online PDF generation technique based on PHP and XML proposed here offers a new approach for all kinds of print-oriented applications, such as statements, bills, and manuals. We use PHP to query the database and process user input, and build the original XML document from the results. XSLT then adds display-layer information to this document, producing a new XML document. Finally, the PDF builder converts the new XML document into a PDF document in the appropriate format.

The XML document generated first can be reused, because it contains all the useful information in a form that other applications can process easily. Changing the way information is displayed in the PDF document is also easy: a specialist only has to modify the corresponding XSL style sheet, and no other stage needs to change, which makes the system very flexible. In addition, PHP, XML, and PDF are all highly portable and can be used across platforms.

The system was not conceived in the abstract; it grew out of a direct need. The technology has already been put into practical use with very satisfactory results. Practice has shown that an online PDF document generation system developed with PHP and XML has broad and very practical application prospects.
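The point that only the style sheet has to change can be illustrated with a minimal sketch. It uses PHP's XSL extension (the `XSLTProcessor` class, which requires ext-xsl) in place of the Sablotron engine used by the original system. The function name `transform_xml`, the `<report>` root element, and the style sheet itself are assumptions made for illustration.

```php
<?php
// Sketch using PHP's XSL extension (ext-xsl); the original system used Sablotron.
function transform_xml(string $xml, string $xsl): string
{
    $xmlDoc = new DOMDocument();
    $xmlDoc->loadXML($xml);
    $xslDoc = new DOMDocument();
    $xslDoc->loadXML($xsl);

    $proc = new XSLTProcessor();
    $proc->importStylesheet($xslDoc);
    $out = $proc->transformToXml($xmlDoc);
    if ($out === false || $out === null) {
        throw new RuntimeException('XSLT transformation failed');
    }
    return $out;
}

// Data-only input document (hypothetical structure).
$xml = '<report><title>Inventory History Transaction Table</title></report>';

// Style sheet that adds layout information around the data.
$xsl = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no"/>
  <xsl:template match="/report">
    <pdfreport pagetype="A4">
      <text align="center"><xsl:value-of select="title"/></text>
    </pdfreport>
  </xsl:template>
</xsl:stylesheet>
XSL;

// Run only when the XSL extension is available.
if (class_exists('XSLTProcessor')) {
    $result = transform_xml($xml, $xsl);
    echo $result;
}
```

Changing the layout then means editing only the style sheet; the data document and the calling code stay the same.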

4. Overall design

This work mainly covers the design and programming of four basic modules: PDFCreator, XMLWriter, XMLTransformer, and XMLParser. They sit at different stages of the system, each with its own independent function, and together they form the core of the system (see the following figure).

System composition diagram

As the diagram shows, the four modules are closely linked and form an organic whole. XMLWriter is the input interface of the system and is responsible for generating the original XML data file. We write the format specification (DTD) for this file, and XMLWriter generates an XML document that conforms to it.

This XML document is then handed to XMLTransformer, which is essentially an encapsulation of the XSLT functions provided by PHP. It generally takes two parameters: the XML document to be converted and the corresponding XSL style sheet file. Following the style sheet, XMLTransformer converts the original XML document into another XML document that contains the formatting information for the content to be placed in the PDF document.

The new XML file is then processed by the PDF builder, in two steps. First, the XML document must be parsed and the required data extracted. This is the job of XMLParser, which converts the XML document into an object tree: every node of the document becomes an object, and each object carries its own attributes (that is, all the information about the corresponding node). In this way any part of the XML document can be accessed easily. Second, PDFCreator converts the information read from the XML document, both formatting and content, into the final PDF output.
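The object tree produced by the parsing step might look roughly like the sketch below. It is not the system's own parser, which is built on expat and whose source is not shown; for brevity it uses PHP's DOM extension instead. The class `XmlNode` and the functions `to_node` and `parse_xml_string` are hypothetical, while the property names `name`, `attrs`, `nodes`, and `n` follow the ones used in the program fragment in section 5.

```php
<?php
// Hypothetical node class mirroring the properties used in the text
// (name, attrs, nodes, n); not the author's implementation.
class XmlNode
{
    public string $name = '';
    public array $attrs = [];
    public array $nodes = [];  // child XmlNode objects
    public int $n = 0;         // number of child nodes
    public string $text = '';  // text content directly inside this element
}

// Recursively converts a DOM element into an XmlNode.
function to_node(DOMElement $el): XmlNode
{
    $node = new XmlNode();
    $node->name = $el->nodeName;
    foreach ($el->attributes as $attr) {
        $node->attrs[$attr->name] = $attr->value;
    }
    foreach ($el->childNodes as $child) {
        if ($child instanceof DOMElement) {
            $node->nodes[] = to_node($child);
        } elseif ($child instanceof DOMText) {
            $node->text .= trim($child->nodeValue);
        }
    }
    $node->n = count($node->nodes);
    return $node;
}

function parse_xml_string(string $xml): XmlNode
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    return to_node($doc->documentElement);
}

$root = parse_xml_string(
    '<pdfreport pagetype="A4"><text align="center">Title</text><line size="50%"/></pdfreport>'
);
echo $root->attrs['pagetype'], ' ', $root->n, ' ', $root->nodes[0]->text, "\n";
```

Unlike expat with its default case folding, DOM keeps element and attribute names in their original case, so the keys here are lower-case.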

5. Application example

Here we use the system described above to create a print-oriented report, the "Inventory History Transaction Table". The report contains the report name (Concord Inventory History Transaction Table), the unit, the creation date, and so on, together with data extracted from the database: product name (Llprod), lot (Lloc), grade (LCLS), warehouse (LWHS), location (lloct), quantity (Lnum), date (ldate), and so on. Suppose we have generated the following raw XML document (Report.xml) with XMLWriter:

<?xml version="1.0" encoding="gb2312"?>
<title>Inventory History Transaction Table</title>
<unit>sqm</unit>

This document contains all the useful information for the report; a specific XSL style sheet is now needed to add formatting information to it. The XMLTransformer code that performs the conversion is as follows:


$xslt = new XMLTransformer("Report.xsl", "Report.xml");
$xslt->apply("Pdfreport.xml");


The new XML document generated after the conversion is as follows:

<?xml version="1.0" encoding="gb2312"?>
<pdfreport pagetype="A4" pagesize="" top="" bottom="" left="" right="">
<line top="5" bottom="5" size="50%" linetype="single" show="false"/>
<text fontsize="" fontlaguage="cn" align="center">Inventory History Transaction Table</text>
<line top="5" bottom="" size="80%" linetype="double" show="true"/>
<text fontsize="A" fontlaguage="cn" align="left">Unit: square meters</text>
<tr><th>Product</th><th>Batch</th><th>Grade</th><th>Warehouse</th><th>Location</th><th>Quantity</th><th>Date</th></tr>
<tr><td>W2308</td><td>1234</td><td>a</td><td>01</td><td>0001</td><td>200</td><td>20020609</td></tr>
<tr><td>W2307</td><td>4321</td><td>a</td><td>01</td><td>0001</td><td>100</td><td>20020609</td></tr>
<line top="5" bottom="5" size="50%" linetype="single" show="false"/>
<text fontsize="" fontlaguage="cn" align="center">Date:20020611</text>
</pdfreport>



After this XML document is parsed with XMLParser, we have a tree of objects containing all the information, and its contents can be accessed easily. The resulting PDF report is shown in the following figure:

The program fragment is as follows:

<?php
include("../include/pc_init.inc");
include("Xmlparser.inc");

// Parse the transformed document and get its root node
$xmlobject = getrootnode("Pdfreport.xml");
// Get the attributes of the root element
$pageSet = $xmlobject->attrs;
// Get the report head
$head = $xmlobject->nodes[0];
// Code omitted ...

// Element names and attribute keys are written in upper case here,
// assuming the parser keeps expat's default case folding.
function draw_line(&$parent, $line) {
    // $line is the parsed XML node; $obj is the line object drawn on the page
    $obj = &pc_create_object($parent, "line");
    $obj->pc_set_linestyle($line->attrs["LINETYPE"]);
    $obj->pc_set_width($line->attrs["SIZE"]);
    $obj->pc_set_alignment("center");
    // Attribute values are strings, so compare with "false" rather than false
    if ($line->attrs["SHOW"] == "false") {
        $obj->pc_set_linecolor("white");
    }
    $obj->pc_set_margin(array("top" => $line->attrs["TOP"], "bottom" => $line->attrs["BOTTOM"], "left" => 0, "right" => 0));
}

function draw_text(&$parent, $text) {
    // Code omitted ...
}

function draw_table(&$parent, $table) {
    // Code omitted ...
}

// Draw every child of the report head according to its element name
function AddHead(&$parent, $head) {
    for ($i = 0; $i < $head->n; $i++) {
        switch ($head->nodes[$i]->name) {
            case "LINE":
                draw_line($parent, $head->nodes[$i]);
                break;
            case "TEXT":
                draw_text($parent, $head->nodes[$i]);
                break;
        }
    }
}

// Create a PDF document
$pdf = &pc_create_pdf(array("Author" => "Cyman", "Title" => "A-A-example"));
// Create an A4-format page
$page1 = &pc_create_page($pdf, $pageSet["PAGETYPE"]);
AddHead($page1, $head);
$pdf->pc_draw();
?>


6. Summary

The months of the graduation project were busy but very rewarding. Analyzing, researching, justifying, and implementing a practical topic taught me a great deal. The system is now in use and works very satisfactorily: attractive, practical reports and documents can be produced with ease. However, because time was short and my own ability is limited, the system still has many shortcomings. The most regrettable is that it defines neither a set of XML tags common to all kinds of documents (reports, documents, manuals, and so on) nor a general program that converts such XML documents into PDF, in the way a browser renders HTML. With those in place, each document would no longer need its own XML tags and its own conversion program, which would greatly improve productivity.

Although the graduation project is over, I will continue to study this subject in the days to come.
