PHP XML parsing functions. Introductions to PHP's XML parsing functions are often not very good; reading this one should make things clearer ...

Source: Internet
Author: User
XML parsing functions of PHP

First, I have to admit that I like computer standards. If everyone complied with industry standards, the Internet would be a better medium. Standardized data interchange formats make an open, platform-independent computing model practical. That is why I am a fan of XML.

Fortunately, my favorite scripting language supports XML and keeps strengthening that support. PHP lets me quickly publish XML documents to the Internet, collect statistics from XML documents, and convert XML documents to other formats. For example, I often use PHP's XML processing capabilities to manage the articles and books I write in XML.

In this article, I discuss how to use the expat parser built into PHP to work with XML documents. Using an example, I demonstrate how expat handles a document. The example also shows you how to:

- build your own handler functions
- transform an XML document into your own PHP data structures

Introducing expat

An XML parser, also known as an XML processor, lets programs access the structure and content of an XML document. Expat is the XML parser for the PHP scripting language. It is also used in other projects, such as Mozilla, Apache, and Perl.

What is an event-based parser?

There are two basic types of XML parsers:

- Tree-based parsers transform an XML document into a tree structure. Such a parser analyzes the entire document and provides an API to access each element of the resulting tree. The common standard is DOM (Document Object Model).
- Event-based parsers treat an XML document as a series of events. When a particular event occurs, the parser calls a function supplied by the developer to handle it.

An event-based parser has a data-centric view of an XML document: it concentrates on the data in the document rather than on its structure.
Such parsers process the document from start to finish and report events, such as the start of an element, the end of an element, and the start of character data, to the application through callback functions. Here is a "Hello World" XML document:

<greeting>Hello World</greeting>

An event-based parser reports it as three events:

- Start element: greeting
- Start of a CDATA item with the value: Hello World
- End element: greeting

Unlike a tree-based parser, an event-based parser does not produce a structure that describes the document. Within a CDATA item, it gives you no information about the parent element, greeting. However, it provides lower-level access, which allows better use of resources and faster processing. The entire document never has to be held in memory; in fact, the document can even be larger than the available memory.

Expat is an event-based parser. Of course, when you use expat you can still build a complete native tree structure in PHP if you need one.

The Hello World example above is well-formed XML. However, it is not valid, because no DTD (document type definition) is associated with it and it has no inline DTD. For expat, this makes no difference: expat is a non-validating parser and therefore ignores any DTD associated with the document. Note, however, that the document must still be well-formed; otherwise expat (like any other XML-compliant parser) stops with an error message.

As a non-validating parser, expat is lightweight, which makes it well suited to Internet applications.

Compiling expat

Expat can be compiled into PHP 3.0.6 and later. Since Apache 1.3.9, expat has been bundled with Apache. On Unix systems, you compile it into PHP by configuring PHP with the --with-xml option.
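Once the XML functions are available, the three events described for the Hello World document can be observed directly. The following is a minimal sketch, not the author's code: the handler names (on_start, on_end, on_cdata) and the $events log are made up for illustration, and the script simply stops if the xml extension is not loaded.

```php
<?php
// Assumption: the xml extension (the expat-style API) is compiled in or loaded.
if (!extension_loaded('xml')) {
    exit("The xml extension is not available\n");
}

$events = [];   // hypothetical log of the events the parser reports

function on_start($parser, $name, $attrs)
{
    global $events;
    $events[] = "start element: $name";
}

function on_end($parser, $name)
{
    global $events;
    $events[] = "end element: $name";
}

function on_cdata($parser, $data)
{
    global $events;
    $events[] = "character data: $data";
}

$parser = xml_parser_create();
// Keep element names as written instead of converting them to uppercase.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, 'on_start', 'on_end');
xml_set_character_data_handler($parser, 'on_cdata');

// The whole document is passed at once, so the final-chunk flag is true.
xml_parse($parser, '<greeting>Hello World</greeting>', true);

echo implode("\n", $events), "\n";
```

The handlers never see the document as a whole; they only receive one event at a time, in document order.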
If you compile PHP as an Apache module, the expat bundled with Apache is used by default. On Windows, you have to load the XML dynamic link library (DLL).

XML example: xmlstats

One way to understand the expat functions is through examples. The example we are going to discuss uses expat to collect statistics about XML documents. For each element in the document, the following information is output:

- the number of times the element is used in the document
- the amount of character data in the element
- the parent elements of the element
- the child elements of the element

Note: For demonstration purposes, we use PHP to build a structure that holds the parent and child elements of each element.

Preparation

The function that creates an XML parser instance is xml_parser_create(). The instance is used in all subsequent function calls. This idea is very similar to the connection handle used by the MySQL functions in PHP. Before parsing a document, an event-based parser typically requires you to register callback functions, which are called when a particular event occurs. Expat is no exception; it defines seven possible events:

Object                     XML parsing function                       Description
Element                    xml_set_element_handler()                  Start and end of an element
Character data             xml_set_character_data_handler()           Start of character data
External entity            xml_set_external_entity_ref_handler()      An external entity occurs
Unparsed external entity   xml_set_unparsed_entity_decl_handler()     An unparsed external entity occurs
Processing instruction     xml_set_processing_instruction_handler()   A processing instruction occurs
Notation declaration       xml_set_notation_decl_handler()            A notation declaration occurs
Default                    xml_set_default_handler()                  Any other event for which no handler is specified

All callback functions must take the parser instance as their first parameter (plus other parameters), as the sample script at the end of this article does.
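To illustrate the rule that every callback receives the parser instance first, here is a small sketch for one of the less common events, the processing instruction. The handler name pi_handler, the $instructions array, and the sample document are assumptions made for this illustration only.

```php
<?php
$instructions = [];   // hypothetical list of [target, data] pairs

// The parser instance always comes first; the remaining parameters
// depend on the kind of event being reported.
function pi_handler($parser, $target, $data)
{
    global $instructions;
    $instructions[] = [$target, $data];
}

$parser = xml_parser_create();
xml_set_processing_instruction_handler($parser, 'pi_handler');

// The XML declaration is not a processing instruction, so only
// the render-hint instruction reaches the handler.
$xml = '<?xml version="1.0"?><doc><?render-hint compact?><p>text</p></doc>';
$ok = xml_parse($parser, $xml, true);

foreach ($instructions as $pi) {
    echo "target: {$pi[0]}, data: {$pi[1]}\n";
}
```

Events for which no handler is registered, such as the element and character data events in this document, are simply not reported to the script.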
Note that the sample script uses both the element handlers and the character data handler. The element callbacks are registered with xml_set_element_handler(). This function requires three arguments:

- the parser instance
- the name of the callback function that handles start elements
- the name of the callback function that handles end elements

The callback functions must exist when the XML document is parsed, and they must be defined in line with the prototypes described in the PHP manual. For example, expat passes three parameters to the handler for start elements. In the sample script, it is defined as follows:

function start_element($parser, $name, $attrs)

The first parameter is the parser instance, the second is the name of the start element, and the third is an array containing all attributes of the element and their values. Once you start parsing the XML document, expat calls your start_element() function and passes these arguments whenever it encounters a start element.

Case folding option

The sample script turns off the case folding option with the xml_parser_set_option() function. This option is on by default, so element names passed to the handler functions are automatically converted to uppercase. However, XML is case sensitive, so case matters when collecting statistics about XML documents. For our example, the case folding option must be turned off.

Parsing the document

After all the preparation, the script can finally parse the XML document: xml_parse_from_file(), a custom function, opens the file named in its parameter and parses it in 4 KB chunks. Both xml_parse() and xml_parse_from_file() return a false value when an error occurs, that is, when the XML document is not well-formed. You can use the xml_get_error_code() function to get the numeric code of the last error, and pass this code to the xml_error_string() function to get the text of the error message.
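The parsing step described above might be sketched as follows. The name xml_parse_from_file() comes from the article, but its body, its parameter list (including the $error out-parameter), and the temporary test files are assumptions, since the original listing is not shown here.

```php
<?php
// Hypothetical reconstruction of the custom helper: parse a file in 4 KB chunks.
function xml_parse_from_file($parser, $file, &$error = null)
{
    $fp = @fopen($file, 'r');
    if ($fp === false) {
        $error = "Cannot open $file";
        return false;
    }
    while (!feof($fp)) {
        $data = fread($fp, 4096);   // one 4 KB chunk
        if ($data === false) {
            $error = "Cannot read $file";
            fclose($fp);
            return false;
        }
        // The third argument tells the parser whether this is the last chunk.
        if (!xml_parse($parser, $data, feof($fp))) {
            $error = sprintf(
                'XML error: %s at line %d',
                xml_error_string(xml_get_error_code($parser)),
                xml_get_current_line_number($parser)
            );
            fclose($fp);
            return false;
        }
    }
    fclose($fp);
    return true;
}

// Usage with one well-formed and one malformed temporary file.
$good = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($good, '<Book><Title>PHP</Title></Book>');
$bad = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($bad, '<Book><Title>PHP</Book>');

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);   // keep case as written
$good_ok = xml_parse_from_file($parser, $good, $good_error);

$parser = xml_parser_create();   // use a fresh parser for each document
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
$bad_ok = xml_parse_from_file($parser, $bad, $bad_error);

echo $good_ok ? "first file: well-formed\n" : "first file: $good_error\n";
echo $bad_ok ? "second file: well-formed\n" : "second file: $bad_error\n";

unlink($good);
unlink($bad);
```

Reading in chunks is what allows the document to be larger than available memory: the parser only ever holds the current chunk plus its own state.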
You can also output the current line number of the XML document (xml_get_current_line_number()), which makes debugging easier. During parsing, the parser calls your callback functions.

Describing the document structure

When parsing a document with expat, the question is: how do you keep a basic description of the document structure? As mentioned earlier, an event-based parser does not itself produce any structural information.

The tag structure, however, is an important feature of XML. For example, the element sequence <book><title> means something different from <figure><title>. Any author will tell you that the title of a book has nothing to do with the title of a figure, although both use the term "title". Therefore, to process XML effectively with an event-based parser, you must use your own stacks or lists to maintain the structural information of the document.

To create a mirror image of the document structure, the script needs to know at least the parent element of the current element. This is not possible with the expat API alone, which only reports events for the current element and carries no information about relationships. Therefore, you need to build your own stack structure.

The sample script uses a first-in, last-out (FILO) stack. The stack is an array that holds all open start elements. In the start element handler, the current element is pushed onto the top of the stack with array_push(). Correspondingly, the end element handler removes the topmost element with array_pop().

For the sequence <book><title></title></book>, the stack is populated as follows:

- Start element book: assigns "book" to the first element of the stack ($stack[0]).
- Start element title: assigns "title" to the top of the stack ($stack[1]).
- End element title: removes the topmost element from the stack ($stack[1]).
- End element book: removes the topmost element from the stack ($stack[0]).
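The stack technique can be sketched in a few lines. This is an illustration under assumptions rather than the author's listing: the $titles array, the slash-separated path, and the sample document are invented here to show how the stack tells a book title from a figure title.

```php
<?php
$stack = [];    // names of the currently open elements, innermost last
$titles = [];   // hypothetical result: element path => character data

function start_element($parser, $name, $attrs)
{
    global $stack;
    array_push($stack, $name);   // the element is now open
}

function end_element($parser, $name)
{
    global $stack;
    array_pop($stack);           // the innermost element is closed
}

function character_data($parser, $data)
{
    global $stack, $titles;
    // The stack supplies the context that the event itself lacks.
    $path = implode('/', $stack);
    if (!isset($titles[$path])) {
        $titles[$path] = '';
    }
    $titles[$path] .= $data;
}

$xml = '<doc><book><title>PHP</title></book>'
     . '<figure><title>Parser</title></figure></doc>';

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, 'start_element', 'end_element');
xml_set_character_data_handler($parser, 'character_data');
xml_parse($parser, $xml, true);

foreach ($titles as $path => $text) {
    echo "$path: $text\n";
}
```

The last entry of $stack is always the element whose content is being reported, and the entry before it is that element's parent.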
The PHP 3.0 version of the example tracks the nesting of elements manually with a $depth variable, which makes the script look more complex. The PHP 4.0 version uses the array_pop() and array_push() functions, which makes the script more concise.

Collecting data

To gather information about each element, the script needs to remember the events for each element. It stores all the different elements of the document in a global array variable, $elements. Each array item is an instance of the element class, which has four properties (class variables):

- $count: the number of times the element was found in the document
- $chars: the number of bytes of character data in the element
- $parents: the parent elements
- $childs: the child elements

As you can see, it is easy to store class instances in an array.

Note: One feature of PHP is that you can traverse a class instance with a while (list() = each()) loop just as you would traverse the corresponding array. All class variables (and, in PHP 3.0, the method names) are exported as strings.

When an element is found, we increment its counter to track how many times it appears in the document; that is, the count property of the corresponding $elements item is incremented.

We also want the parent element to know that the current element is its child. Therefore, the name of the current element is added to the parent element's $childs array. Finally, the current element should remember its parent, so the parent element is added to the current element's $parents array.

Displaying statistics

The remainder of the code loops through the $elements array and its sub-arrays to display the statistics. This is the simplest kind of nested loop. It outputs the correct result, but the code is neither especially concise nor clever; it is just the kind of loop you might use every day to get the job done.
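The collection and display steps can be sketched together as follows. The property names come from the article; everything else is an assumption, including storing per-name counters in $parents and $childs, the sample document, and the output format. The loops use foreach, since each() is no longer available in PHP 8.

```php
<?php
// Assumed shape of the article's element class.
class element
{
    public $count = 0;      // occurrences of the element in the document
    public $chars = 0;      // bytes of character data directly inside it
    public $parents = [];   // parent name => times seen with that parent
    public $childs = [];    // child name => times seen as a child
}

$elements = [];   // element name => element instance
$stack = [];      // currently open elements

function start_element($parser, $name, $attrs)
{
    global $elements, $stack;
    if (!isset($elements[$name])) {
        $elements[$name] = new element();
    }
    $elements[$name]->count++;
    if (count($stack) > 0) {
        $parent = $stack[count($stack) - 1];
        // Tell the parent about its child and the child about its parent.
        $elements[$parent]->childs[$name] = ($elements[$parent]->childs[$name] ?? 0) + 1;
        $elements[$name]->parents[$parent] = ($elements[$name]->parents[$parent] ?? 0) + 1;
    }
    array_push($stack, $name);
}

function end_element($parser, $name)
{
    global $stack;
    array_pop($stack);
}

function character_data($parser, $data)
{
    global $elements, $stack;
    if (count($stack) === 0) {
        return;
    }
    $elements[$stack[count($stack) - 1]]->chars += strlen($data);
}

$xml = '<book><title>PHP</title><chapter><title>XML</title>expat</chapter></book>';

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, 'start_element', 'end_element');
xml_set_character_data_handler($parser, 'character_data');
xml_parse($parser, $xml, true);

// Plain-text report: a simple nested loop over the collected data.
foreach ($elements as $name => $element) {
    printf("%s: count=%d, chars=%d\n", $name, $element->count, $element->chars);
    foreach ($element->parents as $parent => $n) {
        printf("  parent: %s (%d)\n", $parent, $n);
    }
    foreach ($element->childs as $child => $n) {
        printf("  child: %s (%d)\n", $child, $n);
    }
}
```

Because title occurs under both book and chapter, its $parents array ends up with two entries, which is exactly the structural information the parser alone does not provide.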
The sample script is designed to be invoked from the command line using the CGI version of PHP. Therefore, the statistics are output as plain text. If you want to use the script on the Internet, you need to modify the output functions to produce HTML.

Summary

Expat is the XML parser for PHP. As an event-based parser, it does not produce a description of the document's structure. By providing low-level access, however, it allows better use of resources and faster processing.

As a non-validating parser, expat ignores any DTD attached to the XML document, but if the document is not well-formed it stops with an error message.

You provide event handlers to process the document, and you build your own structures, such as stacks and trees, to benefit from XML's structural markup.

New XML programs appear every day, and PHP's support for XML keeps growing (for example, support for the DOM-based XML parser libxml has been added).

With PHP and expat, you can prepare for the upcoming valid, open, and platform-independent standards.

Example
