When working with XML data, special characters need special handling so that they are not confused with markup characters.
All text in an XML document is parsed by the parser.
Only text inside CDATA sections is ignored by the parser.
PCDATA
PCDATA stands for parsed character data.
The XML parser usually parses all the text in the XML document.
When an XML element is parsed, the text between its tags is also parsed:
<message> This text will also be parsed </message>
The parser does this because an XML element can contain other elements, as in this example, where the <name> element contains two other elements (first and last):
<name><first>Bill</first><last>Gates</last></name>
And the parser breaks it down into sub-elements like this:
<name>
<first>Bill</first>
<last>Gates</last>
</name>
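As a rough illustration of this breakdown, the sketch below parses the same <name> element with Python's standard xml.etree.ElementTree module. The choice of Python and the variable names are assumptions made for this example; any conforming XML parser would expose the same structure.

```python
import xml.etree.ElementTree as ET

# The <name> element from the text, containing two child elements.
doc = "<name><first>Bill</first><last>Gates</last></name>"

root = ET.fromstring(doc)

# Each sub-element is exposed as its own node with a tag and its parsed text.
children = [(child.tag, child.text) for child in root]

print(root.tag)   # name
print(children)   # [('first', 'Bill'), ('last', 'Gates')]
```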
Escape characters
Illegal XML characters must be replaced with entity references.
If you place a character like "<" inside an XML element, the parser reports an error because it interprets the character as the start of a new element. So you cannot write this:
<message>if Salary < Then</message>
To avoid this type of error, you need to replace the character "<" with an entity reference, like this:
<message>if Salary &lt; Then</message>
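The difference can be checked with a small sketch. It assumes Python's standard xml.etree.ElementTree parser, and the helper name parses is hypothetical: the message with a raw "<" is rejected, while the version using the &lt; entity reference parses and yields the original character.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# A raw "<" in element content is rejected by the parser.
raw_ok = parses("<message>if Salary < Then</message>")

# With the entity reference, the document is well-formed.
escaped_ok = parses("<message>if Salary &lt; Then</message>")

# The parser turns the entity reference back into the "<" character.
escaped_text = ET.fromstring("<message>if Salary &lt; Then</message>").text

print(raw_ok, escaped_ok, escaped_text)
```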
XML has five predefined entity references:
Entity reference | Character | Meaning |
&lt; | < | Less than |
&gt; | > | Greater than |
&amp; | & | Ampersand |
&apos; | ' | Apostrophe (single quotation mark) |
&quot; | " | Quotation mark (double quote) |
Note: strictly speaking, only the characters "<" and "&" are illegal in XML. Apostrophes, quotation marks, and greater-than signs are legal, but it is good practice to replace them with entity references.
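Escaping is usually done by a library rather than by hand. The following is a minimal sketch using escape and unescape from Python's standard xml.sax.saxutils module; the sample text and variable names are assumptions for illustration. By default escape replaces only "&", "<", and ">", so the two quote characters are passed in as extra replacements.

```python
from xml.sax.saxutils import escape, unescape

text = 'if a < b & c > d then say "hi"'

# escape() always replaces &, < and >.
basic = escape(text)

# Quote characters are only replaced when passed as extra entities.
full = escape(text, {'"': "&quot;", "'": "&apos;"})

# unescape() reverses the process; the extra entities are passed again.
roundtrip = unescape(full, {"&quot;": '"', "&apos;": "'"})

print(basic)
print(full)
```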
CDATA
CDATA stands for character data: text that should not be parsed by the XML parser (unparsed character data).
Inside XML elements, the characters "<" and "&" are illegal.
"<" generates an error because the parser interprets it as the start of a new element.
"&" also generates an error because the parser interprets it as the start of an entity reference.
Some text, such as JavaScript code, contains a large number of "<" or "&" characters. To avoid errors, you can define the script code as CDATA.
Everything inside a CDATA section is ignored by the parser.
A CDATA section starts with "<![CDATA[" and ends with "]]>":
<script>
<![CDATA[
function matchwo(a, b) {
  if (a < b && a < 0) {
    return 1;
  } else {
    return 0;
  }
}
]]>
</script>
In the example above, the parser ignores all the content in the CDATA section.
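Here "ignores" means that the parser does not interpret the content as markup; the text itself is still passed through to the application unchanged. A short sketch, assuming Python's standard xml.etree.ElementTree parser and illustrative variable names:

```python
import xml.etree.ElementTree as ET

doc = """<script><![CDATA[
function matchwo(a, b) {
  if (a < b && a < 0) { return 1; } else { return 0; }
}
]]></script>"""

root = ET.fromstring(doc)

# The CDATA content arrives as plain text; "<" and "&" were not
# treated as markup, and no child elements were created.
code = root.text

print(len(root))  # number of child elements
print(code)
```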
Notes on CDATA sections:
CDATA sections cannot contain the string "]]>". Nested CDATA sections are also not allowed.
The "]]>" that marks the end of a CDATA section cannot contain spaces or line breaks.
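Because a CDATA section ends at the first "]]>", text that itself contains this string has to be handled before wrapping. One common workaround, which is not part of the original text, is to split the string across two adjacent CDATA sections. The sketch below assumes Python's standard xml.etree.ElementTree parser; to_cdata is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def to_cdata(text):
    """Wrap text in CDATA, splitting any "]]>" across two sections.

    Hypothetical helper: each "]]>" becomes "]]" at the end of one
    section and ">" at the start of the next one.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

payload = "a]]>b"
safe = "<code>" + to_cdata(payload) + "</code>"

# The parser joins the adjacent sections back into the original text.
recovered = ET.fromstring(safe).text

print(safe)
print(recovered)
```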