Because of project needs, I have spent the last two days parsing XML files in C++. On Linux there is a handy library for working with XML files, libxml2, which provides a set of C interfaces for creating and querying XML documents. This post mainly describes how to use libxml2 to read and parse XML files.
Download and install libxml2
Download location: ftp://xmlsoft.org/libxml2/
Download the latest version (I downloaded libxml2-2.9.1.tar.gz). After downloading, extract the archive to a suitable location and change into the extracted directory.
Building is simple (note: if the configure script does not have execute permission, add it first, for example with chmod +x configure):
./configure
make
make install
At this point the libxml2 header files should be in the /usr/local/include/libxml2 directory, and the libxml2 library files should be in the /usr/local/lib directory.
Two ways to parse an XML document
When parsing XML documents with libxml2, I strongly recommend using XPath. If you think of the XML file as a database, XPath plays the role of SQL: you write an expression in a fixed syntax and get back the matching nodes. Using XPath in libxml2 is very simple. Alternatively, you can start from a node and walk the parent-child relationships of the tree through the libxml2 tree interface until you reach the nodes you want. I describe both approaches below.
We use the following XML test case:
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
Parsing XML documents directly using the LIBXML2 interface
#include <stdio.h>
#include <stdlib.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

int main(int argc, char **argv)
{
    xmlDocPtr pdoc = NULL;
    xmlNodePtr proot = NULL, pcur = NULL;

    /***************** Open the XML document ********************/
    xmlKeepBlanksDefault(0); // do not turn the whitespace between elements into text nodes
    // Passing NULL as the encoding lets libxml2 use the encoding declared in the
    // file; whatever the input encoding, the parsed tree is stored as UTF-8.
    pdoc = xmlReadFile("test.xml", NULL, XML_PARSE_RECOVER);
    if (pdoc == NULL) {
        printf("error: can't open file!\n");
        exit(1);
    }

    /***************** Get the root element of the document ********************/
    proot = xmlDocGetRootElement(pdoc);
    if (proot == NULL) {
        printf("error: file is empty!\n");
        xmlFreeDoc(pdoc);
        exit(1);
    }

    /***************** Find the titles of all books in the bookstore ********************/
    pcur = proot->xmlChildrenNode;
    while (pcur != NULL) {
        // Like char in standard C, xmlChar has its own helpers for dynamic memory
        // allocation and string handling: xmlMalloc allocates memory, xmlFree is the
        // matching release function, xmlStrcmp compares strings, and so on.
        // Given char *ch = "book", use xmlChar *xch = BAD_CAST(ch) or
        // const xmlChar *xch = (const xmlChar *)(ch).
        // Given xmlChar *xch = BAD_CAST("book"), use char *ch = (char *)(xch).
        if (!xmlStrcmp(pcur->name, BAD_CAST("book"))) {
            // Walk the children of this <book>, not the outer list.
            xmlNodePtr nptr = pcur->xmlChildrenNode;
            while (nptr != NULL) {
                if (!xmlStrcmp(nptr->name, BAD_CAST("title"))) {
                    xmlChar *title = xmlNodeGetContent(nptr); // text of <title>; we must free it
                    if (title != NULL) {
                        printf("title: %s\n", (char *)title);
                        xmlFree(title);
                    }
                    break;
                }
                nptr = nptr->next;
            }
        }
        pcur = pcur->next;
    }

    /***************** Release resources ********************/
    xmlFreeDoc(pdoc);
    xmlCleanupParser();
    xmlMemoryDump();
    return 0;
}
The comments in the code explain each step in detail, so I won't walk through it separately here.
Using the XPath language to parse the XML document
For the basics of XPath, see http://www.w3school.com.cn/xpath/index.asp
#include <stdio.h>
#include <stdlib.h>
#include <libxml/parser.h>
#include <libxml/tree.h>
#include <libxml/xpath.h>
#include <libxml/xpathInternals.h>
#include <libxml/xmlmemory.h>
#include <libxml/xpointer.h>

xmlXPathObjectPtr getNodeset(xmlDocPtr pdoc, const xmlChar *xpath)
{
    xmlXPathContextPtr context = NULL; // XPath context pointer
    xmlXPathObjectPtr result = NULL;   // XPath result pointer

    if (pdoc == NULL) {
        printf("pdoc is NULL\n");
        return NULL;
    }
    if (xpath == NULL) {
        printf("xpath is NULL\n");
        return NULL;
    }

    // Create the context only after the arguments have been checked,
    // so that no early return can leak it.
    context = xmlXPathNewContext(pdoc);
    if (context == NULL) {
        printf("context is NULL\n");
        return NULL;
    }

    result = xmlXPathEvalExpression(xpath, context);
    xmlXPathFreeContext(context); // release the context pointer
    if (result == NULL) {
        printf("xmlXPathEvalExpression returned NULL\n");
        return NULL;
    }
    if (xmlXPathNodeSetIsEmpty(result->nodesetval)) {
        xmlXPathFreeObject(result);
        printf("nodeset is empty\n");
        return NULL;
    }
    return result;
}

int main(int argc, char **argv)
{
    xmlDocPtr pdoc = NULL;
    xmlNodePtr proot = NULL;

    /***************** Open the XML document ********************/
    xmlKeepBlanksDefault(0); // do not turn the whitespace between elements into text nodes
    // Passing NULL as the encoding lets libxml2 use the encoding declared in the
    // file; whatever the input encoding, the parsed tree is stored as UTF-8.
    pdoc = xmlReadFile("test.xml", NULL, XML_PARSE_RECOVER);
    if (pdoc == NULL) {
        printf("error: can't open file!\n");
        exit(1);
    }

    /***************** Get the root element of the document ********************/
    proot = xmlDocGetRootElement(pdoc);
    if (proot == NULL) {
        printf("error: file is empty!\n");
        xmlFreeDoc(pdoc);
        exit(1);
    }

    /***************** Find the titles of all books in the bookstore ********************/
    const xmlChar *xpath = BAD_CAST("//book");          // the XPath expression
    xmlXPathObjectPtr result = getNodeset(pdoc, xpath); // evaluate it to get the query result
    if (result == NULL) {
        printf("result is NULL\n");
        xmlFreeDoc(pdoc);
        exit(1);
    }

    xmlNodeSetPtr nodeset = result->nodesetval; // the set of node pointers returned by the query
    xmlNodePtr cur;

    // nodeset->nodeNr is the number of nodes in the set
    for (int i = 0; i < nodeset->nodeNr; i++) {
        cur = nodeset->nodeTab[i];
        cur = cur->xmlChildrenNode;
        while (cur != NULL) {
            // Like char in standard C, xmlChar has its own helpers for memory
            // allocation and string handling (xmlMalloc, xmlFree, xmlStrcmp, ...).
            // Given char *ch = "book", use xmlChar *xch = BAD_CAST(ch);
            // given xmlChar *xch = BAD_CAST("book"), use char *ch = (char *)(xch).
            if (!xmlStrcmp(cur->name, BAD_CAST("title"))) {
                xmlChar *title = xmlNodeGetContent(cur); // text of <title>; we must free it
                if (title != NULL) {
                    printf("title: %s\n", (char *)title);
                    xmlFree(title);
                }
                break;
            }
            cur = cur->next;
        }
    }
    xmlXPathFreeObject(result); // release the result pointer

    /***************** Release resources ********************/
    xmlFreeDoc(pdoc);
    xmlCleanupParser();
    xmlMemoryDump();
    return 0;
}
As before, the comments in the code explain each step, so I won't go through it separately.
For more details on the libxml2 interfaces, see http://xmlsoft.org/html/libxml-tree.html
Compile and run the programs
Compile the two programs above:
g++ search1.cpp -I/usr/local/include/libxml2 -L/usr/local/lib -lxml2 -o search1
g++ search2.cpp -I/usr/local/include/libxml2 -L/usr/local/lib -lxml2 -o search2
Run the programs and check the results
Run ./search1
The following results are displayed:
title: Harry Potter
title: Learning XML
Run ./search2
The following results are displayed:
title: Harry Potter
title: Learning XML
From: http://blog.csdn.net/l_h2010/article/details/38639143
Using libxml2 for reading and querying XML files under Linux