http://trash.chregu.tv/phpconf2003/examples/
New XML features of PHP5
Author: Christian Stocker. Translator: Ice_berg16 (The Scarecrow looking for a dream)
Intended audience
This article is aimed at PHP developers of all levels who are interested in the new XML functionality in PHP5. We assume that the reader is familiar with the basics of XML. If you have already used XML in PHP, this article will benefit you as well.
Introduction
In today's Internet world, XML is no longer a buzzword; it has been widely accepted and standardized. XML support was therefore taken more seriously in PHP5 than in PHP4. In PHP4 you faced non-standard behavior, API breakage, memory leaks, and other incomplete features almost everywhere. Although some of these deficiencies were improved in PHP 4.3, the developers decided to discard the original code and rewrite all of it for PHP5.
This article describes all the exciting new XML features in PHP5.
XML in PHP4
Early versions of PHP already supported XML, but only through a SAX-based interface that could easily parse any XML document. XML support improved in PHP4 with the addition of the domxml extension, and XSLT was added later as a supplement. Throughout the PHP4 era, further features such as HTML, XSLT, and DTD validation were added to the domxml extension. Unfortunately, because the XSLT and domxml extensions always remained experimental and their APIs changed more than once, they could not be enabled by default. In addition, the domxml extension did not follow the DOM standard developed by the W3C but used its own naming scheme. Although PHP 4.3 improved this and fixed many memory leaks and other problems, the extension never reached a stable state, and some deep-rooted problems were almost impossible to fix. Only the SAX extension was enabled by default, and some of the other extensions were never widely used.
For all these reasons, PHP's XML developers decided to rewrite all the code for PHP5 and to follow the commonly used standards.
XML in PHP5
Almost everything related to XML support was rewritten in PHP5. All XML extensions are now based on the GNOME project's libxml2 library. This allows the different extensions to interoperate, and the core developers only need to work with one underlying library. For example, complex memory management has to be implemented only once, which benefits all XML-related extensions.
In addition to the SAX parser known from PHP4, PHP5 supports a standards-compliant DOM and XSLT based on the libxslt engine. The PHP-specific SimpleXML extension and a standards-compliant SOAP extension are also included. As XML becomes more and more important, the PHP developers decided to enable more XML support by default. This means that you can now use SAX, DOM, and SimpleXML, and that these extensions will be available on more servers. Support for XSLT and SOAP still has to be configured explicitly when PHP is compiled.
Stream support
All XML extensions now support PHP streams, even where you do not access them directly from PHP. For example, in PHP5 you can access a PHP stream from a file or from a directive. Basically, you can use a PHP stream anywhere you can access a normal file.
Streams were introduced in PHP 4.3 and have been improved further in PHP5. They unify file access, network access, and other operations under a common set of functions. You can even implement your own streams in PHP code, which makes data access very simple. Please refer to the PHP documentation for more details.
SAX
SAX stands for Simple API for XML. It is a callback-based interface for parsing XML documents. SAX has been supported since PHP3 and has not changed much since. The API is unchanged in PHP5, so your existing code should still run. The only difference is that it is no longer based on the expat library but on libxml2.
This change caused a number of problems with namespace support, which have been resolved in libxml2 2.6. Earlier versions of libxml2 do not have the fix, so if you use xml_parser_create_ns(), installing libxml2 2.6 on your system is highly recommended.
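To make the callback style concrete, here is a minimal sketch that collects the text of every title element with the SAX functions. The sample XML string and the variable names are assumptions made for illustration, and the closures require PHP 5.3 or later (on PHP 5.0 you would pass function names instead).

```php
<?php
// Illustrative input; not the article's original sample file.
$xml = '<articles><item><title>First</title></item>'
     . '<item><title>Second</title></item></articles>';

$titles = array();
$inTitle = false;

$parser = xml_parser_create();
// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    // Called for every opening tag.
    function ($parser, $name, $attrs) use (&$inTitle, &$titles) {
        if ($name === 'title') {
            $inTitle = true;
            $titles[] = '';
        }
    },
    // Called for every closing tag.
    function ($parser, $name) use (&$inTitle) {
        if ($name === 'title') {
            $inTitle = false;
        }
    }
);

// Text may arrive in several chunks, so append instead of assigning.
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$inTitle, &$titles) {
        if ($inTitle) {
            $titles[count($titles) - 1] .= $data;
        }
    }
);

xml_parse($parser, $xml, true);
print implode("\n", $titles) . "\n";
```

The parser never builds a tree; the script has to keep its own state ($inTitle here), which is the main trade-off against the DOM and SimpleXML approaches described below.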
DOM
The DOM (Document Object Model) is a set of W3C standards for accessing XML document trees. In PHP4 you could use domxml for this. The main problem with domxml was that it did not follow the standard naming scheme, and it also suffered from memory leaks for a long time (fixed in PHP 4.3).
The new DOM extension is built on the W3C standard, including its method and property names. If you are familiar with the DOM in other languages, for example JavaScript, it will be very easy to write similar code in PHP. You do not have to consult the documentation every time, because the methods and parameters are the same.
Because of the new standards-based API, domxml-based code will no longer work; the PHP API is quite different. Porting is not very difficult, however, if your code already used method names similar to the standard. You only need to change the load and save functions and remove the underscores from function names (the DOM standard capitalizes the first letter of each following word instead). Some further adjustments will certainly be necessary, but the main logic can stay the same.
Reading the DOM
I will not explain all the features of the DOM extension in this article; that is not necessary. You may want to bookmark the documentation at http://www.w3.org/DOM.
In most of the examples in this article we will use the same XML file, a very simple version of the RSS feed on zend.com. Paste the following text into a text file and save it as Articles.xml.
http://www.zend.com/zend/week/week172.php
http://www.zend.com/zend/tut/tut-hatwar3.php
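The XML listing itself did not survive in this copy; only the two link URLs above remain. The following sketch writes a stand-in Articles.xml so the later examples have something to load. The /articles/item/title layout is inferred from the XPath expressions used later in the article; the title texts and the link element name are placeholders, not the author's original values.

```php
<?php
// Hypothetical stand-in for the lost listing. Only the two URLs come from
// the article; the titles and the <link> element name are assumptions.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<articles>
  <item>
    <title>First article (placeholder)</title>
    <link>http://www.zend.com/zend/week/week172.php</link>
  </item>
  <item>
    <title>Second article (placeholder)</title>
    <link>http://www.zend.com/zend/tut/tut-hatwar3.php</link>
  </item>
</articles>
XML;

// Write the sample file into the current working directory.
$bytes = file_put_contents("Articles.xml", $xml);

// Load it again to confirm that it is well-formed.
$check = new DOMDocument();
$ok = $check->load("Articles.xml");
```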
To load this example into a DOM object, first create a DOMDocument object and then load the XML file.
$dom = new DOMDocument();
$dom->load("Articles.xml");
As mentioned above, you can also load an XML document through a PHP stream, like this:
$dom->load("file:///Articles.xml");
(or any other type of stream)
If you want to output the XML document to the browser or to standard output, use:
print $dom->saveXML();
If you want to save it to a file, use:
print $dom->save("Newfile.xml");
(Note that this also prints the file size to stdout, because save() returns the number of bytes written.)
Of course, this example does not do much, so let's do something more useful and get all the title elements. There are several ways to do this; the simplest is getElementsByTagName($tagname):
$titles = $dom->getElementsByTagName("title");
foreach ($titles as $node) {
    print $node->textContent . "\n";
}
The textContent property comes from DOM Level 3, which was not yet a finished standard when this article was written. It lets us quickly read all the text nodes of an element. The older standard way of reading the text is as follows:
$node->firstChild->data;
(Here you have to make sure that firstChild is the text node you need; otherwise you have to walk all the child nodes to find it.)
Another thing to be aware of is that getElementsByTagName() returns a DOMNodeList object instead of an array, as get_elements_by_tagname() did in PHP4. As you can see in this example, you can easily traverse it with a foreach statement. You can also access nodes directly with $titles->item(0), which returns the first title element.
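The difference between a DOMNodeList and an array can be sketched as follows. The inline XML is a stand-in for Articles.xml with placeholder titles, and the variable names are chosen for illustration only.

```php
<?php
// Inline stand-in for Articles.xml; the titles are placeholders.
$dom = new DOMDocument();
$dom->loadXML(
    '<articles>'
    . '<item><title>First</title></item>'
    . '<item><title>Second</title></item>'
    . '</articles>'
);

$titles = $dom->getElementsByTagName("title");

// A DOMNodeList exposes ->length and ->item($i) rather than array syntax.
$count = $titles->length;
$first = $titles->item(0)->textContent;

// The same text, read through the first child (a text node).
$viaFirstChild = $titles->item(0)->firstChild->data;

// item() returns null for an index that does not exist, so check before use.
$missing = $titles->item(5);

print "$count titles, first: $first\n";
```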
Another way to get all the title elements is to traverse the tree from the root node. As you can see, this approach is more complex, but it is also more flexible if you need more than just the title elements.
foreach ($dom->documentElement->childNodes as $articles) {
    // If the node is an element (nodeType == 1) and its name is item, loop further
    if ($articles->nodeType == 1 && $articles->nodeName == "item") {
        foreach ($articles->childNodes as $item) {
            // If the node is an element and its name is title, print it
            if ($item->nodeType == 1 && $item->nodeName == "title") {
                print $item->textContent . "\n";
            }
        }
    }
}
XPath
XPath is like SQL for XML. With XPath you can query an XML document for the nodes that match a given pattern. To get all the title nodes with XPath, you only need to do this:
$xp = new DOMXPath($dom);
$titles = $xp->query("/articles/item/title");
foreach ($titles as $node) {
    print $node->textContent . "\n";
}
This is similar to using getElementsByTagName(), but XPath is much more powerful. For example, if there were a title element that is a child of articles (rather than of an item), getElementsByTagName() would return it as well. With the /articles/item/title expression we only get the title elements at the specified depth and position. This is just a simple example; slightly more advanced ones might be:
/articles/item[position() = 1]/title returns all title elements of the first item element
/articles/item/title[@id = '23'] returns all title elements that have an id attribute with the value 23
/articles//title returns all title elements below the articles element (translator's note: // matches any depth)
You can also query for nodes that have particular sibling elements, for elements with particular text content, use namespaces, and so on. If you have to query XML documents a lot, learning XPath properly will save you a lot of time: it is easy to use, executes faster, and requires less code than the standard DOM.
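The expressions above can be tried out with a short sketch. The inline document, including the extra title directly under articles and the id attributes, is an assumption made so that each query returns something different; titlesFor() is a hypothetical helper, not part of the DOM extension.

```php
<?php
// Illustrative document: one title directly under <articles>,
// plus two items whose titles carry id attributes.
$dom = new DOMDocument();
$dom->loadXML(
    '<articles>'
    . '<title>Feed title</title>'
    . '<item><title id="23">First</title></item>'
    . '<item><title id="42">Second</title></item>'
    . '</articles>'
);
$xp = new DOMXPath($dom);

// Hypothetical helper: run a query and return the text of every match.
function titlesFor(DOMXPath $xp, $expression) {
    $result = array();
    foreach ($xp->query($expression) as $node) {
        $result[] = $node->textContent;
    }
    return $result;
}

$exact     = titlesFor($xp, "/articles/item/title");                 // item titles only
$firstItem = titlesFor($xp, "/articles/item[position() = 1]/title"); // first item only
$byId      = titlesFor($xp, "/articles/item/title[@id = '23']");     // matched by attribute
$anyDepth  = titlesFor($xp, "/articles//title");                     // any depth below articles

// getElementsByTagName() cannot tell the feed title from the item titles.
$viaDom = $dom->getElementsByTagName("title")->length;

print implode(", ", $anyDepth) . "\n";
```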
Writing data to the DOM
The Document Object Model is not only for reading and querying; you can also manipulate and write documents. (The DOM standard is a bit verbose because its authors wanted to support every conceivable environment, but it works very well.) Take a look at the following example, which adds a new element to our Articles.xml file.
$item = $dom->createElement("item");
$title = $dom->createElement("title");
$titletext = $dom->createTextNode("XML in PHP5");
$title->appendChild($titletext);
$item->appendChild($title);
$dom->documentElement->appendChild($item);
print $dom->saveXML();
First we create all the required nodes: an item element, a title element, and a text node that contains the title of the item. Then we link the nodes together: the text node is added to the title element, and the title element is added to the item element. Finally, we insert the item element into the articles root element. Our XML document now lists a new article.
Extending classes
The example above could also be done in PHP4 with the domxml extension (only the API differs somewhat). The ability to extend the DOM classes yourself is a new feature of PHP5 that makes it possible to write more readable code. Here is the whole example again, rewritten to extend the DOMDocument class:
class Articles extends DOMDocument {
    function __construct() {
        // Must be called!
        parent::__construct();
    }
    function addArticle($title) {
        $item = $this->createElement("item");
        $titlespace = $this->createElement("title");
        $titletext = $this->createTextNode($title);
        $titlespace->appendChild($titletext);
        $item->appendChild($titlespace);
        $this->documentElement->appendChild($item);
    }
}
$dom = new Articles();
$dom->load("Articles.xml");
$dom->addArticle("XML in PHP5");
print $dom->save("Newfile.xml");
HTML
An often overlooked feature of PHP5 is libxml2's support for HTML. With the DOM extension you can load not only well-formed XML documents but also not-well-formed HTML documents. The result is a standard DOMDocument object, so you can use all the available methods and features on it, such as XPath and SimpleXML.
The HTML capability is useful when you need to access content from sites you do not control. With the help of XPath, XSLT, or SimpleXML you save a lot of code compared with string matching via regular expressions or a SAX parser. This is especially useful when the HTML document is not well structured (a frequent problem!).
The following code fetches and parses the front page of php.net and returns the content of the title element.
$dom = new DOMDocument();
$dom->loadHTMLFile("http://www.php.net/");
$title = $dom->getElementsByTagName("title");
print $title->item(0)->textContent;
Note that error messages may appear in your output when expected elements are not found. If your site still outputs HTML 4 code from PHP, there is good news: the DOM extension can not only load HTML documents but also save them in HTML 4 format. After you have finished building the DOM document, use $dom->saveHTML() to save it. Note that if you just want your HTML output to conform to the standard, you are better off using the Tidy extension. The HTML support in libxml2 does not cover every possible case and does not handle input in uncommon formats well.
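The same idea can be tried without network access by parsing an HTML string. This is a sketch under two assumptions: the sloppy HTML snippet is invented for illustration, and libxml_use_internal_errors(), which keeps parser messages out of the output, is only available from PHP 5.1 on.

```php
<?php
// Deliberately sloppy HTML: unclosed <p> and <b>, no doctype.
$html = '<html><head><title>Example page</title></head>'
      . '<body><p>Hello <b>world</body></html>';

// Collect parser messages internally instead of printing them as warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Discard the collected messages and restore the previous setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);

$title = $dom->getElementsByTagName("title")->item(0)->textContent;
print $title . "\n";

// saveHTML() serializes the parsed tree back to HTML.
$out = $dom->saveHTML();
```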
Validation
Validating XML documents is becoming increasingly important. For example, if you receive an XML document from an external source, you need to verify that it follows a certain format before you process it. Fortunately, you do not have to write your own validation code in PHP, because you can use one of the three most widely used standards (DTD, XML Schema, or RelaxNG) to do it.
A DTD is a standard that dates back to the SGML era. It lacks some newer XML features (such as namespaces) and is difficult to parse and transform because it is not written in XML.
XML Schema is a standard developed by the W3C. It is widely used and contains almost everything needed to validate XML documents.
RelaxNG is the counterpart to the complex XML Schema standard and was created by an independent group. Because it is easier to implement than XML Schema, more and more programs are starting to support RelaxNG.
If you do not have legacy schema documents or very complex XML documents, use RelaxNG. It is simple to write and read, and more and more tools support it. There is even a tool called Trang that can automatically create a RelaxNG schema from sample XML documents. Also, only RelaxNG (and the aging DTDs) are fully supported by libxml2, although libxml2 is about to fully support XML Schema as well.
Validating an XML document is fairly straightforward:
$dom->validate(); // takes no argument; uses the DTD named in the document's DOCTYPE, e.g. articles.dtd
$dom->relaxNGValidate('articles.rng');
$dom->schemaValidate('articles.xsd');
Currently, all of these simply return true or false, and errors are output as PHP warnings. Obviously this is not a good way to give user-friendly feedback, and it will be improved in releases after PHP 5.0. How exactly is still under discussion, but error reporting will certainly get better.
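One way to turn the warnings into messages you can handle yourself is sketched below. It assumes the libxml error functions added in PHP 5.1 (libxml_use_internal_errors() and libxml_get_errors()); the RelaxNG schema and the checkArticles() helper are hypothetical and only fit the simplified sample document used here.

```php
<?php
// Hypothetical RelaxNG schema: one or more item elements,
// each containing exactly one title.
$rng = <<<'RNG'
<element name="articles" xmlns="http://relaxng.org/ns/structure/1.0">
  <oneOrMore>
    <element name="item">
      <element name="title"><text/></element>
    </element>
  </oneOrMore>
</element>
RNG;

// Hypothetical helper: validate an XML string against the schema and
// return array(isValid, messages) instead of emitting PHP warnings.
function checkArticles($xml, $rng) {
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadXML($xml);
    $valid = $dom->relaxNGValidateSource($rng);
    $messages = array();
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message);
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return array($valid, $messages);
}

list($goodOk, $goodMessages) = checkArticles(
    '<articles><item><title>A</title></item></articles>', $rng
);
list($badOk, $badMessages) = checkArticles(
    '<articles><item><link>x</link></item></articles>', $rng
);

print ($goodOk ? "valid" : "invalid") . "\n";
print ($badOk ? "valid" : "invalid") . "\n";
```

The collected messages can then be logged or shown to the user in whatever form suits the application.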
SimpleXML
SimpleXML is the latest member of PHP's XML family. Its purpose is to provide an easier way to access XML documents using standard object properties and iterators. The extension does not have many methods, but it is still quite powerful. Getting all the title nodes from our document takes less code than before.
$sxe = simplexml_load_file("Articles.xml");
foreach ($sxe->item as $item) {
    print $item->title . "\n";
}
What does this do? First, Articles.xml is loaded into a SimpleXML object. Then all the item elements in $sxe are retrieved, and finally $item->title returns the content of the title element. That's it. You can also read attributes with associative array syntax: $item->title['id'].
As you can see, there is a lot of magic behind this, and there are several ways to get the result we want. For example, $item->title[0] returns the same result as in the example. On the other hand, foreach ($sxe->item->title as $item) returns only the first title, not all the title elements in the document (as I would have expected from XPath).
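The access patterns described above can be compared side by side in a short sketch. The inline document with its id attributes is an assumption for illustration.

```php
<?php
// Illustrative document; the id attributes are assumptions.
$sxe = simplexml_load_string(
    '<articles>'
    . '<item><title id="23">First</title></item>'
    . '<item><title id="42">Second</title></item>'
    . '</articles>'
);

// Iterating $sxe->item visits every item element.
$all = array();
foreach ($sxe->item as $item) {
    // Cast to string: $item->title is a SimpleXMLElement, not plain text.
    $all[] = (string) $item->title;
}

// $sxe->item without an index refers to the first item only,
// so this loop sees just that item's title.
$firstOnly = array();
foreach ($sxe->item->title as $title) {
    $firstOnly[] = (string) $title;
}

// Attributes use array syntax with a name; elements can be indexed by position.
$id   = (string) $sxe->item[1]->title['id'];
$same = (string) $sxe->item[0]->title[0];

print implode(", ", $all) . "\n";
```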
SimpleXML is actually the first extension that uses the new features of Zend Engine 2. It therefore also serves as a test bed for these new features, and as you know, bugs and unexpected errors are not rare during development.
Besides traversing all nodes as in the example above, SimpleXML also offers an XPath interface, which provides an even easier way to access individual nodes.
foreach ($sxe->xpath('/articles/item/title') as $item) {
    print $item . "\n";
}
Admittedly, this code is not shorter than the previous example, but with more complex or more deeply nested XML documents you will find that using XPath with SimpleXML saves you a lot of typing.
Writing data to a SimpleXML document
Not only can you parse and read documents with SimpleXML, you can also change them, at least to some extent:
$sxe->item->title = "XML in PHP5";   // New content of the title element
$sxe->item->title['id'] = 34;        // New attribute of the title element
$xmlString = $sxe->asXML();          // Returns the SimpleXML object as a serialized XML string
print $xmlString;
Interoperability
Since SimpleXML is also based on libxml2, you can easily convert SimpleXML objects into DOM objects and back with little or no speed penalty (the document does not have to be copied internally). Thanks to this mechanism you get the best of both worlds and can use whichever tool fits the task. It works like this:
$sxe = simplexml_import_dom($dom);
$dom = dom_import_simplexml($sxe); // returns a DOMElement; its ownerDocument is the DOMDocument
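A short round trip shows that both views work on the same tree. The inline document and the variable names are assumptions made for this sketch.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<articles><item><title>First</title></item></articles>');

// DOM -> SimpleXML: both objects work on the same underlying libxml2 tree.
$sxe = simplexml_import_dom($dom);
$sxe->item->title = "Changed via SimpleXML";

// The change is visible through the DOM object without any copying.
$seenByDom = $dom->getElementsByTagName("title")->item(0)->textContent;

// SimpleXML -> DOM: dom_import_simplexml() returns a DOMElement;
// its ownerDocument property gives access to the DOMDocument.
$element  = dom_import_simplexml($sxe);
$document = $element->ownerDocument;

print $seenByDom . "\n";
print $element->nodeName . "\n";
```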
XSLT
XSLT is a language for transforming XML documents into other XML documents. XSLT itself is written in XML and belongs to the family of functional languages, which handle program flow differently from object-oriented languages such as PHP. PHP4 had two XSLT processors: Sablotron (in the widely used xslt extension) and libxslt (in the domxml extension). Their APIs were not compatible with each other and they were used differently. PHP5 only supports the libxslt processor; it was chosen because it is based on libxml2 and therefore fits the XML concept of PHP5 better.
In theory it would also be possible to bind Sablotron to PHP5, but unfortunately nobody has done it. If you are using Sablotron, you will therefore have to switch to libxslt in PHP5. With the exception of JavaScript support, libxslt can do what Sablotron does, and you can even use PHP's powerful streams to re-implement Sablotron's exclusive scheme handlers. In addition, libxslt is one of the fastest XSLT processors, so you get a speed boost for free (it executes about twice as fast as Sablotron).
As with the other extensions discussed in this article, you can exchange XML documents between the XSL extension and the DOM extension and back. In fact you have to, because ext/xsl has no interface for loading and saving XML documents and relies on the DOM extension for that. To start with XSLT transformations you do not need to learn much. There is no W3C standard for this API, so it was "borrowed" from Mozilla.
First you need an XSLT stylesheet. Paste the following text into a new file and save it as articles.xsl.
Then call it with a PHP script:
/* Load the XML and XSL documents into DOMDocument objects */
$xsl = new DOMDocument();
$xsl->load("articles.xsl");
$inputdom = new DOMDocument();
$inputdom->load("Articles.xml");
/* Create an XSLT processor and import the stylesheet */
$proc = new XSLTProcessor();
$xsl = $proc->importStylesheet($xsl);
$proc->setParameter("", "titles", "titles");
/* Transform and output the XML document */
$newdom = $proc->transformToDoc($inputdom);
print $newdom->saveXML();
The example above first loads the XSLT stylesheet articles.xsl with the DOM method load() and then creates a new XSLTProcessor object, which imports the stylesheet object for later use. Parameters can be set with setParameter(namespaceURI, name, value). Finally, the XSLTProcessor object starts the transformation with transformToDoc($inputdom) and returns a new DOMDocument object.
The advantage of this API is that you can transform many XML documents with the same stylesheet: load it once and reuse it, because transformToDoc() can be applied to different XML documents.
Besides transformToDoc() there are two other transformation methods: transformToXML($dom) returns a string, and transformToURI($dom, $uri) saves the transformed document to a file or a PHP stream. Note that if you want to use XSLT output settings such as indent="yes", you cannot use transformToDoc(), because the DOMDocument object cannot keep that information; the settings only take effect if you write the result directly to a string or a file.
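Because the articles.xsl listing is missing from this copy, the following sketch uses a hypothetical stylesheet of its own to show one stylesheet being reused for several input documents with transformToXML(). It assumes that the xsl extension is enabled, and the nowdoc syntax requires PHP 5.3 or later.

```php
<?php
// Hypothetical stylesheet standing in for the lost articles.xsl: it prints
// the value of the "titles" parameter and then one line per title.
$xslSource = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:param name="titles"/>
  <xsl:template match="/articles">
    <xsl:value-of select="$titles"/>
    <xsl:text>:&#10;</xsl:text>
    <xsl:for-each select="item/title">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
XSL;

$xsl = new DOMDocument();
$xsl->loadXML($xslSource);

// Import the stylesheet once.
$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);
$proc->setParameter("", "titles", "Titles");

// Reuse the same processor for several input documents.
$outputs = array();
foreach (array("First", "Second") as $name) {
    $input = new DOMDocument();
    $input->loadXML("<articles><item><title>$name</title></item></articles>");
    // transformToXml() returns a string and applies the xsl:output settings.
    $outputs[] = $proc->transformToXml($input);
}

print $outputs[0];
```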
Calling PHP functions
The last new feature of the XSLT extension is the ability to call any PHP function from within an XSLT stylesheet. XML purists will certainly not like this feature (such stylesheets become more complex and can easily mix logic and design), but it is very useful in some places. XSLT is very limited when it comes to functions; even outputting a date in different languages is cumbersome. With this feature it is as easy as in PHP itself. Here is the PHP code that makes a function available to XSLT:
function dateLang() {
    return strftime("%A");
}
$xsl = new DOMDocument();
$xsl->load("datetime.xsl");
$inputdom = new DOMDocument();
$inputdom->load("Today.xml");
$proc = new XSLTProcessor();
$proc->registerPHPFunctions();
// Load the stylesheet document into the processor
$xsl = $proc->importStylesheet($xsl);
/* Transform and output the XML document */
$newdom = $proc->transformToDoc($inputdom);
print $newdom->saveXML();
Here is the XSLT stylesheet datetime.xsl, which calls this function.
The following is the XML document to be transformed with the stylesheet, Today.xml (Articles.xml would give the same result).
The stylesheet above, together with the PHP script and any XML file, outputs the name of the current weekday in the language of the current locale. You can pass more arguments to php:function(); they are handed on to the PHP function. There is also php:functionString(), which automatically converts all arguments to strings so that you do not have to convert them in PHP.
Note that you need to call $proc->registerPHPFunctions() before the transformation; otherwise the PHP function calls will not be executed, for security reasons (do you always trust your XSLT stylesheets?). An access control system has not been implemented yet and may be added in a future version of PHP5.
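Since the datetime.xsl and Today.xml listings are missing from this copy, here is a self-contained sketch of the same mechanism. Everything in it is an assumption for illustration: the weekdayName() function, the ts attribute, and the stylesheet. It uses gmdate() with a fixed timestamp instead of strftime() so that the result is repeatable, and it passes a function name to registerPHPFunctions(), which later PHP releases accept as a way to restrict what a stylesheet may call. The xsl extension must be enabled.

```php
<?php
// Hypothetical function called from the stylesheet: returns the English
// weekday name (UTC) for a Unix timestamp.
function weekdayName($timestamp) {
    return gmdate("l", (int) $timestamp);
}

// Hypothetical stylesheet: the php namespace makes php:function() available.
$xslSource = <<<'XSL'
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:php="http://php.net/xsl">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select="php:function('weekdayName', string(/today/@ts))"/>
  </xsl:template>
</xsl:stylesheet>
XSL;

$xsl = new DOMDocument();
$xsl->loadXML($xslSource);

// Timestamp 0 is 1 January 1970, a Thursday.
$input = new DOMDocument();
$input->loadXML('<today ts="0"/>');

$proc = new XSLTProcessor();
// Allow only the named function to be called from XSLT.
$proc->registerPHPFunctions("weekdayName");
$proc->importStylesheet($xsl);

$day = $proc->transformToXml($input);
print $day . "\n";
```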
Summary
PHP's XML support has taken a big step forward. It is standards-compliant, powerful, interoperable, and enabled by default. The newly added SimpleXML extension provides a quick and easy way to access XML documents and saves you a lot of code, especially if you have well-structured documents or can use the power of XPath.
Thanks to libxml2, the underlying library of the PHP5 XML extensions, validating XML documents with DTD, RelaxNG, or XML Schema is now supported.
XSL support has also been overhauled and now uses the libxslt library, which performs much better than the old Sablotron library. Calling PHP functions from within XSLT stylesheets lets you write more powerful XSLT code.
If you have already used XML in PHP4 or in other languages, you will like the XML features of PHP5. XML has changed a lot in PHP5: it conforms to the standards and is on a par with other tools and languages.
Links
PHP 4 related
domxml extension: http://www.php.net/domxml/
Sablotron extension: http://www.php.net/xslt/
libxslt: http://www.php.net/manual/en/functi...-stylesheet.php
PHP 5 related
SimpleXML: http://www.php.net/simplexml/
Streams: http://www.php.net/manual/en/ref.stream.php
Standards
DOM: http://www.w3.org/DOM
XSLT: http://www.w3.org/TR/xslt
XPath: http://www.w3.org/TR/xpath
XML Schema: http://www.w3.org/XML/Schema
RelaxNG: http://relaxng.org/
XInclude: http://www.w3.org/TR/xinclude/
Tools
libxml2, the underlying library: http://xmlsoft.org/
Trang, a schema/RelaxNG/etc. converter: http://www.thaiopensource.com/relaxng/trang.html
About the author
Christian Stocker is the founder and CEO of Bitflux GmbH in Zurich. He maintains the xsl, dom, and imagick extensions, is a co-author of the German book PHP de Luxe, and works on other open source projects such as Bitflux Editor and Popoon. He can be reached at chregu@php.net.