There are two main types of XML parser in PHP:
1) Tree-based parsers. A tree-based parser stores the entire document in a tree data structure, so the whole document must be loaded into memory before it can be processed. As a result, performance degrades sharply on large XML documents. The SimpleXML and DOM extensions belong to this type of parser.
2) Stream-based parsers. Instead of loading the entire document into memory at once, a stream-based parser reads one node at a time and lets you interact with it as it is read (when the parser moves to the next node, the previous node is discarded, although it can also be set to keep it). This is efficient and uses little memory; the drawback is that it requires more code.
Therefore, the XMLReader extension (a stream-based parser) can be used to process large XML documents in PHP. It is enabled by default as of PHP 5.1.
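To make the node-by-node behavior concrete, here is a minimal sketch that streams through a small sitemap and collects only the loc values, without building a tree of the whole document. The sample data and the variable names ($reader, $locs) are assumptions for illustration; for a real file or URL, XMLReader::open() would likely be used in place of XMLReader::XML().

```php
<?php
// Hypothetical sample data, shortened from a typical sitemap.
$xmlData = <<<XML
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.php230.com/</loc><priority>1.0</priority></url>
  <url><loc>http://www.php230.com/category/</loc><priority>0.8</priority></url>
</urlset>
XML;

$reader = new XMLReader();
$reader->XML($xmlData);

$locs = array();
// read() advances the cursor one node at a time.
while ($reader->read()) {
    // React only to opening <loc> tags; every other node is skipped.
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'loc') {
        // readString() returns the text content of the current element.
        $locs[] = $reader->readString();
    }
}
$reader->close();

print_r($locs);
```

Only the values of interest are kept in memory here, which is the main reason to prefer this style for large documents.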
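$xmlData = <<<XML
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>http://www.php230.com/</loc>
        <lastmod>2013-06-13 01:20:01</lastmod>
        <changefreq>always</changefreq>
        <priority>1.0</priority>
    </url>
    <url>
        <loc>http://www.php230.com/category/</loc>
        <lastmod>2013-06-13 01:20:01</lastmod>
        <changefreq>always</changefreq>
        <priority>0.8</priority>
    </url>
</urlset>
XML;

$xml = new XMLReader();
// To read from a URL or file instead of a string:
// $url = 'http://www.php230.com/baidu_sitemap1.xml';
// $xml->open($url);
$xml->XML($xmlData);
$assoc = xml2assoc($xml);
$xml->close();

// Recursively converts the nodes under the reader's current position
// into a nested array of 'tag' / 'value' / 'attributes' entries.
function xml2assoc($xml) {
    $tree = null;
    while ($xml->read()) {
        switch ($xml->nodeType) {
            case XMLReader::END_ELEMENT:
                // Closing tag: this level is complete.
                return $tree;
            case XMLReader::ELEMENT:
                // Opening tag: recurse into its children unless it is empty.
                $node = array(
                    'tag'   => $xml->name,
                    'value' => $xml->isEmptyElement ? '' : xml2assoc($xml),
                );
                if ($xml->hasAttributes) {
                    while ($xml->moveToNextAttribute()) {
                        $node['attributes'][$xml->name] = $xml->value;
                    }
                }
                $tree[] = $node;
                break;
            case XMLReader::TEXT:
            case XMLReader::CDATA:
                // Text content: append it to the current value.
                $tree .= $xml->value;
        }
    }
    return $tree;
}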
We can view the final returned results:
echo '<pre>';
print_r($assoc);
echo '</pre>';
Output result:
Array
(
    [0] => Array
        (
            [tag] => urlset
            [value] => Array
                (
                    [0] => Array
                        (
                            [tag] => url
                            [value] => Array
                                (
                                    [0] => Array
                                        (
                                            [tag] => loc
                                            [value] => http://www.php230.com/
                                        )
                                    [1] => Array
                                        (
                                            [tag] => lastmod
                                            [value] => 2013-06-13 01:20:01
                                        )
                                    [2] => Array
                                        (
                                            [tag] => changefreq
                                            [value] => always
                                        )
                                    [3] => Array
                                        (
                                            [tag] => priority
                                            [value] => 1.0
                                        )
                                )
                        )
                    [1] => Array
                        (
                            [tag] => url
                            [value] => Array
                                (
                                    [0] => Array
                                        (
                                            [tag] => loc
                                            [value] => http://www.php230.com/category/
                                        )
                                    [1] => Array
                                        (
                                            [tag] => lastmod
                                            [value] => 2013-06-13 01:20:01
                                        )
                                    [2] => Array
                                        (
                                            [tag] => changefreq
                                            [value] => always
                                        )
                                    [3] => Array
                                        (
                                            [tag] => priority
                                            [value] => 0.8
                                        )
                                )
                        )
                )
            [attributes] => Array
                (
                    [xmlns] => http://www.sitemaps.org/schemas/sitemap/0.9
                )
        )
)
The final result is a nested array, so the value of any item in the XML can be obtained easily.
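As a rough illustration of reading values back out of such an array, the sketch below walks a structure of the same tag/value shape and gathers the text of every element with a given tag. The $assoc literal is a shortened, hand-written stand-in for the real result, and collectValues() is a hypothetical helper introduced only for this example.

```php
<?php
// Hypothetical excerpt with the same shape as the array returned by xml2assoc().
$assoc = array(
    array(
        'tag'   => 'urlset',
        'value' => array(
            array(
                'tag'   => 'url',
                'value' => array(
                    array('tag' => 'loc', 'value' => 'http://www.php230.com/'),
                    array('tag' => 'priority', 'value' => '1.0'),
                ),
            ),
            array(
                'tag'   => 'url',
                'value' => array(
                    array('tag' => 'loc', 'value' => 'http://www.php230.com/category/'),
                    array('tag' => 'priority', 'value' => '0.8'),
                ),
            ),
        ),
    ),
);

// Illustrative helper (an assumption, not part of the original code):
// returns the text value of every node named $tag, searching recursively.
function collectValues(array $nodes, $tag)
{
    $found = array();
    foreach ($nodes as $node) {
        if (is_array($node['value'])) {
            // Element with children: search one level deeper.
            $found = array_merge($found, collectValues($node['value'], $tag));
        } elseif ($node['tag'] === $tag) {
            // Leaf element with a matching tag: keep its text.
            $found[] = $node['value'];
        }
    }
    return $found;
}

print_r(collectValues($assoc, 'loc'));
```

A recursive walk is used because the depth of the array mirrors the depth of the XML, which is not fixed in general.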