Using PHP to read and write XML with the DOM: implementation code

Source: Internet
Author: User
There are many techniques for reading and writing XML in PHP. This article presents three methods for reading XML: using the DOM library, using the SAX parser, and using regular expressions. It also describes how to write XML using the DOM and using PHP text templates.
Reading and writing Extensible Markup Language (XML) in PHP may seem a little intimidating. In fact, XML and all its related technologies can be intimidating, but reading and writing XML in PHP does not have to be. First, you need to learn a little about XML: what it is and what it is used for. Then, you need to learn how to read and write XML in PHP, which can be done in several ways.
This article provides a brief introduction to XML and then explains how to read and write XML with PHP.
What is XML?
XML is a data storage format. It does not define what data is stored or how that data is structured; it only defines tags and the attributes of those tags. A well-formed XML tag looks like this:
<name>Jack Herrington</name>
This <name> tag contains some text: Jack Herrington.
An XML tag that contains no text looks like this:
<name></name>
There is more than one way to write something in XML. For example, this tag produces the same output as the previous one:
<name />
You can also add attributes to an XML tag. For example, this <name> tag contains first and last attributes:
<name first="Jack" last="Herrington" />
You can also encode special characters in XML. For example, an ampersand (&) is encoded like this:
&amp;
An XML file whose tags and attributes are formatted as in these examples is well-formed, which means that the tags are balanced and the characters are correctly encoded. Listing 1 is an example of well-formed XML.
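To make the idea of well-formedness concrete, here is a minimal sketch that asks PHP's DOM extension whether a string parses as XML. The helper name isWellFormed is hypothetical (it is not part of PHP or of this article), and the sketch assumes the DOM and libxml extensions are installed.

```php
<?php
// Hypothetical helper (not from the article): reports whether $xml parses
// as well-formed XML. Assumes the DOM and libxml extensions are available.
function isWellFormed(string $xml): bool
{
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml);

    // Discard the collected errors and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok !== false;
}

var_dump(isWellFormed('<name>Jack Herrington</name>')); // balanced tags
var_dump(isWellFormed('<name>Jack Herrington'));        // missing end tag
var_dump(isWellFormed('<name>Tom & Jerry</name>'));     // unencoded ampersand
var_dump(isWellFormed('<name>Tom &amp; Jerry</name>')); // encoded ampersand
```

A check like this only tests well-formedness; it says nothing about whether the document follows a particular schema.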

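Listing 1. XML book list example

The code is as follows:

<books>
  <book>
    <author>Jack Herrington</author>
    <title>PHP Hacks</title>
    <publisher>O'Reilly</publisher>
  </book>
  <book>
    <author>Jack Herrington</author>
    <title>Podcasting Hacks</title>
    <publisher>O'Reilly</publisher>
  </book>
</books>
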
The XML in Listing 1 contains a list of books. The parent <books> tag contains a set of <book> tags, each of which contains <author>, <title>, and <publisher> tags.
An XML document is valid when its tag structure and content have been verified against an external schema file. Schema files can be specified in different formats. For this article, all we need is well-formed XML.
If you think XML looks a lot like HTML, you are right. XML and HTML are both tag-based languages with many similarities. However, it is important to note that although a well-formed XML document can also be well-formed HTML, not every HTML document is well-formed XML. The line break tag (br) is a good example of the difference between XML and HTML. This line break tag is well-formed HTML, but not well-formed XML:
<p>This is a paragraph<br>
With a line break</p>
This line break tag is both well-formed XML and well-formed HTML:
<p>This is a paragraph<br />
With a line break</p>
If you want to write HTML that is also well-formed XML, follow the W3C standard for Extensible HyperText Markup Language (XHTML). All modern browsers can render XHTML. In addition, you can use XML tools to read XHTML and find the data in a document, which is much easier than parsing HTML.

Using the DOM library to read XML
The easiest way to read a well-formed XML file is to use the Document Object Model (DOM) library that is compiled into some PHP installations. The DOM library reads the entire XML document into memory and represents it as a tree of nodes, as shown in Figure 1.
Figure 1. XML DOM tree for the books XML
The books node at the top of the tree has two book child nodes. Each book contains several nodes: author, publisher, and title. The author, publisher, and title nodes each contain a text child node.
The code that reads the books XML file and displays its contents using the DOM is shown in Listing 2.

Listing 2. Reading the books XML with the DOM

The code is as follows:

<?php
// Load the whole document into memory as a node tree.
$doc = new DOMDocument();
$doc->load('books.xml');

$books = $doc->getElementsByTagName("book");
foreach ($books as $book)
{
  // Each lookup returns a node list; item(0) is the first match.
  $authors = $book->getElementsByTagName("author");
  $author = $authors->item(0)->nodeValue;

  $publishers = $book->getElementsByTagName("publisher");
  $publisher = $publishers->item(0)->nodeValue;

  $titles = $book->getElementsByTagName("title");
  $title = $titles->item(0)->nodeValue;

  echo "$title - $author - $publisher\n";
}
?>

The script starts by creating a new DOMDocument object and loading the books XML into it with the load method. Then the script uses the getElementsByTagName method to get a list of all elements with the specified name.
Inside the loop over the book nodes, the script uses getElementsByTagName again to get the nodeValue of the author, publisher, and title tags. The nodeValue is the text within the node. The script then displays those values.
You can run a PHP script like this on the command line:
% php e1.php
PHP Hacks - Jack Herrington - O'Reilly
Podcasting Hacks - Jack Herrington - O'Reilly
%
As you can see, one line is printed for each book. That is a good start. But what if you do not have access to the XML DOM library?

Reading XML with the SAX parser
Another way to read XML is to use the Simple API for XML (SAX) parser. Most PHP installations include the SAX parser. The SAX parser runs on a callback model: each time a tag is opened or closed, or each time the parser sees some text, it calls user-defined functions with the node or text information.
The advantage of the SAX parser is that it is truly lightweight. The parser does not keep anything in memory for long, so it can be used for very large files.
The disadvantage is that writing SAX parser callbacks is cumbersome. Listing 3 shows the code that reads the books XML file with SAX and displays its contents.

Listing 3. Reading the books XML with the SAX parser

The code is as follows:

<?php
$g_books = array();
$g_elem = null;

// Called for each opening tag. Element names arrive upper-cased
// because case folding is enabled by default.
function startElement($parser, $name, $attrs)
{
  global $g_books, $g_elem;
  if ($name == 'BOOK') $g_books[] = array();
  $g_elem = $name;
}

// Called for each closing tag.
function endElement($parser, $name)
{
  global $g_elem;
  $g_elem = null;
}

// Called for text between tags, possibly several times per element.
function textData($parser, $text)
{
  global $g_books, $g_elem;
  if ($g_elem == 'AUTHOR' ||
      $g_elem == 'PUBLISHER' ||
      $g_elem == 'TITLE')
  {
    $last = count($g_books) - 1;
    $g_books[$last][$g_elem] = ($g_books[$last][$g_elem] ?? '') . $text;
  }
}

$parser = xml_parser_create();

xml_set_element_handler($parser, "startElement", "endElement");
xml_set_character_data_handler($parser, "textData");

$f = fopen('books.xml', 'r');

while ($data = fread($f, 4096))
{
  // The third argument tells the parser when the last chunk has arrived.
  xml_parse($parser, $data, feof($f));
}

fclose($f);
xml_parser_free($parser);

foreach ($g_books as $book)
{
  echo $book['TITLE'] . " - " . $book['AUTHOR'] . " - ";
  echo $book['PUBLISHER'] . "\n";
}
?>

The script first sets up the g_books array, which holds all the books and book information in memory, and the g_elem variable, which stores the name of the tag the script is currently processing. The script then defines the callback functions. In this example, the callbacks are startElement, endElement, and textData. The startElement and endElement functions are called when a tag is opened and closed, respectively. The textData function is called on the text between the start and end tags. Because the parser may deliver the text of one element in several pieces, textData appends each piece to the stored value.
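To see the callback model on its own, the following minimal sketch records each SAX event for a tiny inline document instead of reading a file. It is an illustration rather than part of the article's scripts: it uses closures in place of named functions and globals, and it turns case folding off so that element names are reported as written.

```php
<?php
// Illustrative only: record each SAX callback as it fires.
$events = [];

$parser = xml_parser_create();
// Keep element names as written; by default the parser upper-cases them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$events) {
        $events[] = "start $name";
    },
    function ($parser, $name) use (&$events) {
        $events[] = "end $name";
    }
);
xml_set_character_data_handler(
    $parser,
    function ($parser, $text) use (&$events) {
        // Skip whitespace-only text between tags.
        if (trim($text) !== '') {
            $events[] = "text $text";
        }
    }
);

// The third argument marks this as the final (and only) chunk.
xml_parse($parser, '<book><title>PHP Hacks</title></book>', true);

echo implode("\n", $events), "\n";
```

The printed list shows the order in which the parser reports opening tags, text, and closing tags, which is the information the handlers have to work with.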
In this example, the startElement function looks for the book tag to start a new element in the books array. The textData function then checks the current element to see whether it is a publisher, title, or author tag. If it is, the function puts the current text into the current book.
To get the parsing going, the script creates a parser with the xml_parser_create function and then sets the callback handlers. The script then reads the file and sends it to the parser in chunks. After the file has been read, the xml_parser_free function deletes the parser. The end of the script prints the contents of the g_books array.
As you can see, this is much harder than writing the same functionality with the DOM. What if there is no DOM library and no SAX library? Is there an alternative?

Parsing XML with regular expressions
I am certain that some engineers will criticize me for even mentioning this method, but you can parse XML with regular expressions. Listing 4 shows an example that uses the preg_ functions to read the books file.

Listing 4. Reading XML with regular expressions

The code is as follows:

<?php
// Read the whole file into one string.
$xml = file_get_contents('books.xml');

// The s modifier lets . match newlines, so a block can span lines.
preg_match_all('/<book>(.*?)<\/book>/s', $xml, $bookblocks);

foreach ($bookblocks[1] as $block)
{
  preg_match_all('/<author>(.*?)<\/author>/', $block, $author);
  preg_match_all('/<title>(.*?)<\/title>/', $block, $title);
  preg_match_all('/<publisher>(.*?)<\/publisher>/', $block, $publisher);
  echo($title[1][0] . " - " . $author[1][0] . " - " .
    $publisher[1][0] . "\n");
}
?>

Note how short this code is.
It starts by reading the file into one large string. It then uses a regular expression to read each book item. Finally, a foreach loop iterates over the book blocks and extracts the author, title, and publisher.
So where are the flaws? The problem with using regular expressions to read XML is that the code does not first check that the XML is well-formed, so you cannot know whether the XML is well-formed before you read it. In addition, some well-formed XML might not match your regular expressions, so you would have to modify them later.
I never recommend using regular expressions to read XML, but sometimes it is the most compatible approach, because the regular expression functions are always available. Do not use regular expressions to read XML that comes directly from users, because you cannot control the format or structure of that XML. Always use the DOM library or the SAX parser to read XML from users.
--------------------------------------------------------------------------------
Writing XML with the DOM
Reading XML is only part of the equation. What about writing XML? The best way to write XML is to use the DOM. Listing 5 shows how to build the books XML file with the DOM.

Listing 5. Writing the books XML with the DOM

The code is as follows:

<?php
$books = array();
$books[] = array(
  'title' => 'PHP Hacks',
  'author' => 'Jack Herrington',
  'publisher' => "O'Reilly"
);
$books[] = array(
  'title' => 'Podcasting Hacks',
  'author' => 'Jack Herrington',
  'publisher' => "O'Reilly"
);

$doc = new DOMDocument();
$doc->formatOutput = true;

// Root element.
$r = $doc->createElement("books");
$doc->appendChild($r);

foreach ($books as $book)
{
  $b = $doc->createElement("book");

  $author = $doc->createElement("author");
  $author->appendChild(
    $doc->createTextNode($book['author'])
  );
  $b->appendChild($author);

  $title = $doc->createElement("title");
  $title->appendChild(
    $doc->createTextNode($book['title'])
  );
  $b->appendChild($title);

  $publisher = $doc->createElement("publisher");
  $publisher->appendChild(
    $doc->createTextNode($book['publisher'])
  );
  $b->appendChild($publisher);

  $r->appendChild($b);
}

echo $doc->saveXML();
?>

At the top of the script, the books array is loaded with some example books. This data could come from the user or from a database.
After the example books are loaded, the script creates a new DOMDocument and adds the root books node to it. Then, for each book, the script creates elements for the author, title, and publisher and adds a text node to each one. The final step for each book node is to add it to the root books node.
At the end of the script, the saveXML method outputs the XML to the console. (You can also use the save method to create an XML file.) The output of the script is shown in Listing 6.

Listing 6. Output of the DOM build script

The code is as follows:

% php e4.php
<?xml version="1.0"?>
<books>
  <book>
    <author>Jack Herrington</author>
    <title>PHP Hacks</title>
    <publisher>O'Reilly</publisher>
  </book>
  <book>
    <author>Jack Herrington</author>
    <title>Podcasting Hacks</title>
    <publisher>O'Reilly</publisher>
  </book>
</books>

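One reason DOM output stays well-formed is that the library encodes special characters itself when it serializes text nodes. The short sketch below illustrates this with a made-up publisher name (not taken from the article's data) that contains an ampersand and angle brackets.

```php
<?php
// Illustrative data, not from the article: a value containing characters
// that are special in XML.
$doc = new DOMDocument();

$publisher = $doc->createElement('publisher');
$publisher->appendChild($doc->createTextNode('Smith & Sons <Books>'));
$doc->appendChild($publisher);

// Passing a node to saveXML() serializes only that node,
// without the XML declaration.
$out = $doc->saveXML($publisher);
echo $out, "\n";
```

The raw string is stored in the text node unchanged; the encoding happens only when the document is written out.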
The real value of using the DOM is that the XML it creates is always well-formed. But what can you do if you cannot use the DOM to create XML?
--------------------------------------------------------------------------------
Writing XML with PHP
If the DOM is not available, you can write XML using PHP text templates. Listing 7 shows how PHP builds the books XML file.
Listing 7. Writing the books XML with PHP

The code is as follows:

<?php
$books = array();
$books[] = array(
  'title' => 'PHP Hacks',
  'author' => 'Jack Herrington',
  'publisher' => "O'Reilly"
);
$books[] = array(
  'title' => 'Podcasting Hacks',
  'author' => 'Jack Herrington',
  'publisher' => "O'Reilly"
);
?>
<books>
<?php
foreach ($books as $book)
{
?>
<book>
<title><?php echo($book['title']); ?></title>
<author><?php echo($book['author']); ?></author>
<publisher><?php echo($book['publisher']); ?></publisher>
</book>
<?php
}
?>
</books>

The top of the script is similar to the DOM script. The bottom of the script opens the books tag and then iterates over each book, creating the book tag and all of the internal title, author, and publisher tags.
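The problem with this approach is encoding entities. To make sure entities are encoded properly, you must call an escaping function on every item, as shown in Listing 8. The listing uses htmlspecialchars with the ENT_XML1 flag rather than htmlentities with ENT_QUOTES alone, because htmlentities called that way emits HTML named entities (such as &eacute;) that an XML parser does not define.
Listing 8. Encoding entities with the htmlspecialchars function

The code is as follows:

<books>
<?php
foreach ($books as $book)
{
  // ENT_XML1 restricts the output to XML's predefined entities.
  $title = htmlspecialchars($book['title'], ENT_XML1 | ENT_QUOTES);
  $author = htmlspecialchars($book['author'], ENT_XML1 | ENT_QUOTES);
  $publisher = htmlspecialchars($book['publisher'], ENT_XML1 | ENT_QUOTES);
?>
<book>
<title><?php echo($title); ?></title>
<author><?php echo($author); ?></author>
<publisher><?php echo($publisher); ?></publisher>
</book>
<?php
}
?>
</books>
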
This is what makes writing XML with basic PHP annoying. You think you have created perfect XML, and then, as soon as you try it with real data, you find that some elements are not encoded properly.
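One way to catch this kind of encoding mistake early is to escape every value through a single helper and, on a development machine where the DOM extension happens to be installed, parse the generated text back as a check. The sketch below is an assumption-laden illustration: the helper name xmlEscape and the sample title are hypothetical, and it uses htmlspecialchars with ENT_XML1 in place of htmlentities so that only XML's predefined entities are produced.

```php
<?php
// Hypothetical helper (not from the article): escape a value for use in
// XML element content or a quoted attribute value.
function xmlEscape(string $value): string
{
    return htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// Sample data; the title is made up to include an ampersand.
$books = [
    [
        'title' => 'Tips & Tricks',
        'author' => 'Jack Herrington',
        'publisher' => "O'Reilly",
    ],
];

// Build the document as a string, the way a text template would.
$xml = "<books>\n";
foreach ($books as $book) {
    $xml .= "  <book>\n";
    foreach (['title', 'author', 'publisher'] as $field) {
        $xml .= "    <$field>" . xmlEscape($book[$field]) . "</$field>\n";
    }
    $xml .= "  </book>\n";
}
$xml .= "</books>\n";

// Development-time check: loadXML() returns false on malformed input.
libxml_use_internal_errors(true);
$check = new DOMDocument();
if ($check->loadXML($xml) === false) {
    fwrite(STDERR, "Generated XML is not well-formed\n");
    exit(1);
}

echo $xml;
```

The check is optional; on a server without the DOM extension, the escaping helper alone still applies.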
--------------------------------------------------------------------------------
Conclusion
There has always been a lot of hype and confusion surrounding XML. However, it is not as difficult as you might think, especially in a language as good as PHP. Once you understand XML and implement it properly, you will find many powerful tools at your disposal. XPath and XSLT are two such tools worth studying.
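As a small taste of XPath, the sketch below selects every book title from the books document with a single path expression. It assumes the DOM extension is available and uses an inline copy of the XML so that it runs on its own.

```php
<?php
// Inline copy of the books document so the sketch is self-contained.
$xml = <<<XML
<books>
  <book>
    <author>Jack Herrington</author>
    <title>PHP Hacks</title>
    <publisher>O'Reilly</publisher>
  </book>
  <book>
    <author>Jack Herrington</author>
    <title>Podcasting Hacks</title>
    <publisher>O'Reilly</publisher>
  </book>
</books>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

// DOMXPath evaluates XPath 1.0 expressions against a loaded document.
$xpath = new DOMXPath($doc);

$titles = [];
foreach ($xpath->query('/books/book/title') as $node) {
    $titles[] = $node->nodeValue;
}

echo implode("\n", $titles), "\n";
```

The path expression replaces the nested getElementsByTagName calls that a plain DOM traversal would need.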
