Before parsing XML, it is necessary to understand the basic rules of XML syntax.
Lexical features: 1) XML is case sensitive. For example, an element name must use the same case in its opening and closing tags. ... XML reserved strings must also follow the case requirements ....
2) XML reserved markup characters: <, >, and &. Reserved characters cannot appear literally in an element name, element text, attribute name, or attribute value. < opens a tag, > closes a tag, and & introduces an escape sequence. The common escape sequences are: &lt; produces <, &gt; produces >, &amp; produces &, &apos; produces ', and &quot; produces ".
3) An element name can contain letters, digits, periods, hyphens, underscores, colons, and extended characters from other languages. Element names cannot contain whitespace characters (spaces, tabs, line feeds, or carriage returns). An element name can be prefixed with a namespace. Element text can be any combination of characters other than the XML reserved characters, such as: My money is $2000.
4) Attribute names follow the same rules as element names. Attribute values are enclosed in single or double quotation marks and can contain any characters other than the XML reserved characters. An attribute name with the xmlns prefix indicates that the attribute declares a namespace.
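As a rough illustration of these lexical rules, the following Python sketch uses only the standard library. The sample strings, the `acc` prefix, the `urn:example:account` namespace URI, and the `unit` attribute are assumptions made up for the example; they do not come from the original text.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

# Rule 2: reserved characters are written as escape sequences.
raw = "if a < b && b > c"
escaped = escape(raw)  # escapes &, < and >
print(escaped)         # if a &lt; b &amp;&amp; b &gt; c

# The quote escapes are not handled by default, so they are passed explicitly.
quote_entities = {"&apos;": "'", "&quot;": '"'}
print(unescape("&apos;x&apos; &quot;y&quot;", quote_entities))

# Rule 1: names are case sensitive, so <User> cannot be closed by </user>.
try:
    ET.fromstring("<User>text</user>")
    matched = True
except ET.ParseError:
    matched = False
print(matched)  # False

# Rules 3 and 4: an xmlns-prefixed attribute declares a namespace, and the
# declared prefix can then qualify an element name. Attribute values may use
# single or double quotation marks. (Hypothetical names throughout.)
doc = ET.fromstring(
    "<acc:money xmlns:acc=\"urn:example:account\" unit='USD'>"
    "My money is $2000."
    "</acc:money>"
)
print(doc.tag)     # {urn:example:account}money
print(doc.attrib)  # {'unit': 'USD'}
print(doc.text)    # My money is $2000.
```

ElementTree reports the namespace URI in braces in front of the local name, and it does not list the xmlns declaration among the ordinary attributes.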
Syntax features: 1) An XML document consists of an XML declaration, an optional document type declaration, any number of optional processing instructions, any number of optional comments, and a single root element that holds the data. In addition, CDATA sections can be embedded in element content. The overall layout is:
/* XML declaration */
/* XML document type declaration */
/* XML comment */
/* XML processing instruction */
/* Root data element */
...
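2) The XML declaration is opened by <?xml and closed by ?>. It contains the version, the encoding, and other optional declarations, in the form <?xml version="..." encoding="..."?>.
3) The XML document type declaration is opened by <!DOCTYPE and closed by >, in the form <!DOCTYPE root-name ...>.
4) An XML processing instruction is opened by <? and closed by ?>, in the form <?target ...?>.
5) An XML comment is opened by <!-- and closed by -->, in the form <!-- ... -->.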
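6) An XML element is opened by <element-name> and closed by </element-name>, or is written as a single self-closing tag that ends with />. The opening and closing tags of an element must match, as in <element-name/> or <element-name>...</element-name>. XML elements can be nested, so the tags must match at every level, as in <outer><inner>...</inner></outer>.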
7) A CDATA section is opened by <![CDATA[ and closed by ]]>. Its content is exempt from the XML parsing rules. For example: <![CDATA[select * from mytable where thefield <= '100']]>
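The parts listed above can be seen together in one small document. The following is a minimal sketch that parses such a document with Python's standard `xml.dom.minidom` module and reports which top-level parts it found. The `query` root element, the `app-hint` processing instruction, and the comment text are hypothetical; only the SQL statement inside the CDATA section comes from the text above.

```python
from xml.dom import Node, minidom

# Hypothetical sample: declaration, document type declaration, comment,
# processing instruction, and a root element holding a CDATA section.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE query>
<!-- a comment -->
<?app-hint cache="no"?>
<query><![CDATA[select * from mytable where thefield <= '100']]></query>
"""

NAMES = {
    Node.DOCUMENT_TYPE_NODE: "document type declaration",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
    Node.ELEMENT_NODE: "root element",
}

dom = minidom.parseString(SAMPLE)

# The XML declaration is not a node; its fields are stored on the document.
print(dom.version, dom.encoding)

# Everything else at the top level appears as a child of the document.
parts = [NAMES[node.nodeType] for node in dom.childNodes]
print(parts)

# The CDATA content is returned verbatim, including the reserved character <.
cdata = dom.documentElement.firstChild
print(cdata.nodeType == Node.CDATA_SECTION_NODE, cdata.data)
```

Without the CDATA wrapper, the `<=` in the query would have to be written as `&lt;=` for the document to parse.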
Based on the preceding XML syntax features, you can construct regular expressions for lexical analysis and a pushdown structure (a stack) for syntax analysis.
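To make the pushdown idea concrete, here is a minimal sketch, assuming a deliberately simplified tag pattern: each opening tag pushes its name onto a stack, and each closing tag must match the name on top of the stack. The function name `tags_balanced` and the `TAG` pattern are hypothetical. The sketch ignores comments, CDATA sections, and attribute values that contain `>`, all of which a real parser has to handle.

```python
import re

# Simplified pattern for an opening, closing, or self-closing tag.
# Group 1: "/" for a closing tag; group 2: the name; group 3: "/" if self-closing.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.:-]*)[^<>]*?(/?)>")


def tags_balanced(xml_text: str) -> bool:
    """Return True if every opening tag is closed in the correct nesting order."""
    stack = []
    for match in TAG.finditer(xml_text):
        closing, name, self_closing = match.groups()
        if self_closing:
            continue  # <name/> opens and closes at once; nothing to push
        if closing:
            # A closing tag must match the most recently opened element.
            if not stack or stack.pop() != name:
                return False
        else:
            stack.append(name)
    # Any name left on the stack was opened but never closed.
    return not stack


print(tags_balanced("<a><b>text</b><c/></a>"))  # True
print(tags_balanced("<a><b></a></b>"))          # False
print(tags_balanced("<User></user>"))           # False
```

The comparison is case sensitive, so the third call fails for the same reason a conforming parser would reject it.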
XML lexical regular expressions:
#define digit [0, 1, ..., 9] /* digit characters */
#define letter [a, b, ..., z, A, B, ..., Z] /* letters */
#define signs [~, !, @, #, %, ^, &, *, (, ), ?, :, ;, ", ', ., /, -, _, +, =, |, \] /* symbol characters */
#define ascii2 [0x80, ..., 0xFF] /* extended ASCII characters */
#define space [0x20, \t, \r, \n] /* space, tab, carriage return, line feed */
#define reserve [<, >, &] /* XML reserved characters */
1) Regular expression of the element name:
element_name -> (_ | letter | ascii2) (ε | _ | - | : | . | digit | letter | ascii2)*
2) Regular expression of the element text:
element_text -> (ε | not reserve)*
3) Regular expression of the attribute name:
proper_name -> (_ | letter | ascii2) (ε | _ | - | : | . | digit | letter | ascii2)*
4) Regular expression of the attribute value:
proper_value -> (ε | not reserve)*
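These definitions can be written out with Python's `re` module. This is a sketch under stated assumptions, not a conforming XML name checker: names admit only the characters listed in lexical rule 3 (letters, digits, period, hyphen, underscore, colon, and the extended range 0x80 to 0xFF), and the extended range is interpreted as code points U+0080 to U+00FF because Python strings are Unicode. The helper names `is_name` and `is_text` are hypothetical.

```python
import re

# Character classes mirroring digit, letter, ascii2 and reserve.
DIGIT = r"0-9"
LETTER = r"a-zA-Z"
ASCII2 = r"\x80-\xff"
RESERVE = r"<>&"

# element_name and proper_name share one rule: a restricted first character
# followed by any number of name characters.
NAME = re.compile(rf"[_{LETTER}{ASCII2}][_\-:.{DIGIT}{LETTER}{ASCII2}]*")

# element_text and proper_value: any run of characters that are not reserved.
TEXT = re.compile(rf"[^{RESERVE}]*")


def is_name(candidate: str) -> bool:
    """True if the whole string is a valid element or attribute name."""
    return NAME.fullmatch(candidate) is not None


def is_text(candidate: str) -> bool:
    """True if the whole string contains no reserved character."""
    return TEXT.fullmatch(candidate) is not None


print(is_name("user:name-1.0"))       # True
print(is_name("1user"))               # False: a name cannot start with a digit
print(is_name("my name"))             # False: whitespace is not allowed
print(is_text("My money is $2000."))  # True
print(is_text("a < b"))               # False: < is reserved
```

Using `fullmatch` rather than `match` matters here: `match` would accept `my name` because the prefix `my` is a valid name on its own.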
XML syntax structure:
xml_document -> xml_header (ε | xml_declare | xml_instruct | xml_comments)* xml_element
xml_header -> [
]
xml_declare -> [
]
xml_instruct -> [
]
xml_comments -> [