I. Overview and installation
XML (eXtensible Markup Language) is a data format for structured document interchange on the Web. It is a standard defined by the World Wide Web Consortium (W3C). Information about XML and related technologies is available on the W3C website.
This PHP extension implements support for expat, the XML toolkit written by James Clark. The toolkit lets you parse, but not validate, XML documents. It supports three source character encodings that PHP also provides: US-ASCII, ISO-8859-1, and UTF-8. UTF-16 is not supported.
This extension lets you create XML parsers and then define handlers for different XML events. Each XML parser also has a few parameters you can adjust.
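As a minimal sketch of that workflow (create a parser, register handlers, feed it a document), the snippet below records the events reported for a small inline document. The sample XML and the `$events` array are illustrative assumptions, not part of the extension.

```php
<?php
// Collect a trace of the events the parser reports.
$events = [];

// Create a parser; the encoding argument is passed explicitly for clarity.
$parser = xml_parser_create('UTF-8');

// Register separate callbacks for start tags and end tags.
xml_set_element_handler(
    $parser,
    function ($parser, string $name, array $attrs) use (&$events) {
        $events[] = "start:$name";
    },
    function ($parser, string $name) use (&$events) {
        $events[] = "end:$name";
    }
);

// Register a callback for character data, ignoring whitespace-only chunks.
xml_set_character_data_handler(
    $parser,
    function ($parser, string $data) use (&$events) {
        if (trim($data) !== '') {
            $events[] = "text:$data";
        }
    }
);

// The third argument marks this chunk as the last piece of the document.
$ok = xml_parse($parser, '<note><to>Ann</to></note>', true);

print_r($events);
```

Element names arrive in uppercase here because case folding is on by default. Character data may be delivered in more than one chunk, so real code usually concatenates it rather than treating each call as a complete text node.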
This extension requires the libxml PHP extension. This means passing --enable-libxml, although this happens implicitly because libxml is enabled by default.
By default, this extension uses an expat compatibility layer. It can also use expat itself. The Makefile that ships with expat does not build a library by default; you can use the following make rule to build one:
libexpat.a: $(OBJS)
	ar -rc $@ $(OBJS)
	ranlib $@
A source RPM package of expat is also available.
This extension is enabled by default. It can be disabled at compile time with the option --disable-xml.
These functions are enabled by default and use the bundled expat library. You can disable XML support with --disable-xml. If you compile PHP as a module for Apache 1.3.9 or later, PHP automatically uses the expat library bundled with Apache. If you do not want to use the bundled expat library, pass --with-expat-dir=DIR to the PHP configure script, where DIR points to the root directory of the expat installation.
The Windows version of PHP has built-in support for this extension. You do not need to load any additional extensions to use these functions.
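Whether the extension is actually present in a given build can be checked at run time. The following is a small sketch; the variable name `$hasXml` and the messages are illustrative.

```php
<?php
// Report whether the XML parser extension is available in this PHP build.
$hasXml = extension_loaded('xml') && function_exists('xml_parser_create');

echo $hasXml
    ? "XML parser support is available\n"
    : "XML parser support is missing (PHP may have been built with --disable-xml)\n";
```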
II. Event handlers
The following XML event handlers can be defined:
Supported XML handlers
PHP function to set handler | Event description |
xml_set_element_handler() | Element events are issued whenever the XML parser encounters a start or end tag. There are separate handlers for start tags and end tags. |
xml_set_character_data_handler() | Character data is roughly all the non-markup content of an XML document, including whitespace between tags. Note that the XML parser does not add or remove any whitespace; it is up to the application (you) to decide whether whitespace is significant. |
xml_set_processing_instruction_handler() | PHP programmers should already be familiar with processing instructions (PIs). <?php ?> is a processing instruction, where php is called the "PI target". How PIs are handled is up to the application, except that all PI targets starting with "XML" are reserved. |
xml_set_default_handler() | Whatever is not passed to another handler goes to the default handler. The default handler receives things such as the XML declaration and the document type declaration. |
xml_set_unparsed_entity_decl_handler() | This handler is called for declarations of unparsed (NDATA) entities. |
xml_set_notation_decl_handler() | This handler is called for notation declarations. |
xml_set_external_entity_ref_handler() | This handler is called when the XML parser finds a reference to an external parsed general entity, for example a reference to a file or URL. See the XML external entity example for a demonstration. |
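To illustrate two of the less common handlers in the table above, the sketch below registers a processing instruction handler and a default handler. The document, the `render` PI target, and the `$seen` array are made up for the example. What exactly reaches the default handler can differ between the underlying parser libraries, so no particular output is claimed for it.

```php
<?php
// Record what each handler receives.
$seen = [];

$parser = xml_parser_create('UTF-8');

// Called for every processing instruction except the reserved XML declaration.
xml_set_processing_instruction_handler(
    $parser,
    function ($parser, string $target, string $data) use (&$seen) {
        $seen[] = "pi:$target:$data";
    }
);

// Called for data that no other registered handler has claimed.
xml_set_default_handler(
    $parser,
    function ($parser, string $data) use (&$seen) {
        $seen[] = 'default:' . trim($data);
    }
);

$xml = '<?xml version="1.0"?><doc><?render mode="fast"?><!-- a comment --></doc>';
$ok = xml_parse($parser, $xml, true);

print_r($seen);
```

The PI handler receives the target (`render`) and the remaining data (`mode="fast"`) as two separate strings, so the application can dispatch on the target.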
III. Case folding
The element handler functions may receive their element names case-folded. The XML standard defines case-folding as "a process applied to a sequence of characters, in which those identified as non-uppercase are replaced by their uppercase equivalents". In other words, for XML, case-folding simply means converting to uppercase.
By default, all element names passed to the handler functions are case-folded. This behaviour can be queried and controlled per XML parser with the xml_parser_get_option() and xml_parser_set_option() functions.
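The effect of the option can be seen by parsing the same document twice, once with case folding enabled and once with it disabled. This is a sketch; `collectNames()` is a hypothetical helper written for this example.

```php
<?php
// Hypothetical helper: parse a fixed document and return the element
// names exactly as the start-element handler receives them.
function collectNames(int $fold): array
{
    $names = [];
    $parser = xml_parser_create('UTF-8');

    // 1 enables case folding (the default), 0 disables it.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $fold);

    xml_set_element_handler(
        $parser,
        function ($parser, string $name, array $attrs) use (&$names) {
            $names[] = $name;
        },
        function ($parser, string $name) {
            // End tags are not needed for this demonstration.
        }
    );

    xml_parse($parser, '<Book><title/></Book>', true);

    return $names;
}

print_r(collectNames(1)); // names are uppercased: BOOK, TITLE
print_r(collectNames(0)); // names are left as written: Book, title
```

Disabling case folding is usually the safer choice when element names are compared against a case-sensitive schema, since XML itself treats `Book` and `book` as different names.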
IV. Error codes
The following constants are defined for XML error codes (as returned by xml_get_error_code() after xml_parse() reports a failure):
XML_ERROR_NONE
XML_ERROR_NO_MEMORY
XML_ERROR_SYNTAX
XML_ERROR_NO_ELEMENTS
XML_ERROR_INVALID_TOKEN
XML_ERROR_UNCLOSED_TOKEN
XML_ERROR_PARTIAL_CHAR
XML_ERROR_TAG_MISMATCH
XML_ERROR_DUPLICATE_ATTRIBUTE
XML_ERROR_JUNK_AFTER_DOC_ELEMENT
XML_ERROR_PARAM_ENTITY_REF
XML_ERROR_UNDEFINED_ENTITY
XML_ERROR_RECURSIVE_ENTITY_REF
XML_ERROR_ASYNC_ENTITY
XML_ERROR_BAD_CHAR_REF
XML_ERROR_BINARY_ENTITY_REF
XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF
XML_ERROR_MISPLACED_XML_PI
XML_ERROR_UNKNOWN_ENCODING
XML_ERROR_INCORRECT_ENCODING
XML_ERROR_UNCLOSED_CDATA_SECTION
XML_ERROR_EXTERNAL_ENTITY_HANDLING
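A typical way to use these codes is to check the result of xml_parse() and, on failure, ask the parser for details. The sketch below parses a deliberately malformed document. The numeric code reported for a given mistake may depend on the underlying parser library, so the example only compares it against XML_ERROR_NONE.

```php
<?php
$parser = xml_parser_create('UTF-8');

// <b> is never closed, so the document is not well-formed.
$ok = xml_parse($parser, '<a><b></a>', true);

$code = xml_get_error_code($parser);
$message = '';

if (!$ok && $code !== XML_ERROR_NONE) {
    // Build a human-readable report with the error position.
    $message = sprintf(
        'XML error %d (%s) at line %d, column %d',
        $code,
        xml_error_string($code),
        xml_get_current_line_number($parser),
        xml_get_current_column_number($parser)
    );
    echo $message, "\n";
}
```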
V. Character encoding
PHP's XML extension supports the Unicode character set through several character encodings. There are two types of character encoding: source encoding and target encoding. PHP's internal representation of the document is always encoded in UTF-8.
Source encoding is applied when an XML document is parsed. When creating an XML parser, you can specify a source encoding (this encoding cannot be changed later in the parser's lifetime). The supported source encodings are ISO-8859-1, US-ASCII, and UTF-8. The first two are single-byte encodings, which means that each character is represented by a single byte. UTF-8 encodes characters of up to 21 bits in one to four bytes. The default source encoding used by PHP is ISO-8859-1.
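The variable width of UTF-8 is easy to observe, because strlen() counts bytes rather than characters. The sample characters below are arbitrary choices for illustration.

```php
<?php
// One sample character for each UTF-8 sequence length.
$samples = [
    'A'       => "A",          // U+0041, ASCII range
    'e-acute' => "\u{00E9}",   // U+00E9, also representable in ISO-8859-1
    'euro'    => "\u{20AC}",   // U+20AC
    'emoji'   => "\u{1F600}",  // U+1F600, outside the Basic Multilingual Plane
];

$lengths = [];
foreach ($samples as $label => $char) {
    // strlen() returns the number of bytes in the string.
    $lengths[$label] = strlen($char);
    printf("%s: %d byte(s)\n", $label, $lengths[$label]);
}
```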
Target encoding is applied when PHP passes data to the XML handler functions. When an XML parser is created, the target encoding is set to the same value as the source encoding, but it can be changed at any time. The target encoding affects character data as well as tag names and processing instruction targets.
If the XML parser encounters characters outside the range that its source encoding can represent, it returns an error.
If PHP encounters characters in the parsed XML document that cannot be represented in the chosen target encoding, those characters are "demoted". Currently, this means that they are replaced with a question mark (?).
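The demotion rule can be observed by parsing the same UTF-8 document with different target encodings. This is a sketch; `parseText()` is a hypothetical helper and the sample text is arbitrary.

```php
<?php
// Hypothetical helper: parse a fixed UTF-8 document and return its
// character data as delivered in the given target encoding.
function parseText(string $targetEncoding): string
{
    $text = '';
    $parser = xml_parser_create('UTF-8');
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, $targetEncoding);

    // Character data may arrive in several chunks, so concatenate it.
    xml_set_character_data_handler(
        $parser,
        function ($parser, string $data) use (&$text) {
            $text .= $data;
        }
    );

    // "café €5": U+00E9 fits in ISO-8859-1, U+20AC does not.
    xml_parse($parser, "<p>caf\u{00E9} \u{20AC}5</p>", true);

    return $text;
}

// bin2hex() makes the byte-level differences visible.
foreach (['UTF-8', 'ISO-8859-1', 'US-ASCII'] as $encoding) {
    printf("%-10s %s\n", $encoding, bin2hex(parseText($encoding)));
}
```

With ISO-8859-1 as the target, only the euro sign is replaced by a question mark; with US-ASCII, the accented letter is replaced as well.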