Recently, while working on a joint project with another company, the other side provided an RSS interface whose content we use on our website. However, their RSS feed contained some troublesome characters, such as &, ® and &#8482;. If these characters are placed in XML without special handling, errors occur. For example, take the following XML:
<item>&</item>
Parsing this fragment in IE, or with some DOM parsers, produces an error.
The W3C technical specification also shows that such a character is not allowed to appear unescaped:
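The failure is easy to reproduce outside the browser. The following is only a minimal sketch, assuming Python's standard xml.etree.ElementTree parser instead of IE or a DOM parser; the helper name parses_ok is hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET


def parses_ok(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for a bare ampersand or an undefined entity reference.
        return False


raw_ok = parses_ok("<item>&</item>")          # bare ampersand
escaped_ok = parses_ok("<item>&amp;</item>")  # escaped ampersand
print(raw_ok, escaped_ok)
```

The bare ampersand is rejected, while the escaped form parses without complaint.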
http://www.w3.org/TR/2001/REC-xml-c14n-20010315
For example, the characters allowed in text nodes are described as follows: the string value, except all ampersands are replaced by &amp;, all open angle brackets (<) are replaced by &lt;, all closing angle brackets (>) are replaced by &gt;, and all #xD characters are replaced by &#xD;.
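The replacements listed in the specification can be applied programmatically before text is written into XML. This is a rough sketch, assuming Python and its standard xml.sax.saxutils.escape function; the wrapper name escape_text and the sample string are made up for illustration.

```python
from xml.sax.saxutils import escape


def escape_text(text):
    """Escape text for use inside an XML text node."""
    # escape() replaces &, < and > on its own; the extra mapping
    # additionally turns carriage returns (#xD) into &#xD;.
    return escape(text, {"\r": "&#xD;"})


sample = "AT&T <b> 1 > 0\r"
escaped = escape_text(sample)
print(escaped)
```

The ampersand is replaced first, so the entities produced by the later replacements are not escaped a second time.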
Because there are many such special characters, and replacing each of them in the XML is a lot of work, we can define them as entities in a DTD file instead.
For example, add the following declarations to the DTD file:
<!-- AMPERSAND -->
<!ENTITY amp "&#38;#38;">
<!-- REGISTERED SIGN -->
<!ENTITY reg "&#x00AE;">
<!-- TRADE MARK SIGN -->
<!ENTITY trade "&#x2122;">
Then declare in the XML file that it uses this DTD:
<!DOCTYPE headcount SYSTEM "EULA.DTD">
After that, special characters such as &, ® and &#8482;, written in the XML file as the entity references &amp;, &reg; and &trade;, no longer cause an error.
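The effect of the entity declarations can be checked with a small script. This is only a sketch under stated assumptions: it uses Python's xml.etree.ElementTree, which does not load an external file such as EULA.DTD, so the reg and trade declarations are placed in an internal DTD subset instead; the item element and its text are invented for illustration. The amp entity is predefined in XML, so it is not redeclared here.

```python
import xml.etree.ElementTree as ET

# Assumption: entities declared in an internal subset rather than in
# the external EULA.DTD file, because ElementTree does not fetch
# external DTDs.
DOC = """<!DOCTYPE headcount [
  <!ENTITY reg "&#x00AE;">
  <!ENTITY trade "&#x2122;">
]>
<headcount><item>Foo&reg; Bar&trade; A&amp;B</item></headcount>"""

root = ET.fromstring(DOC)
item_text = root.find("item").text
print(item_text)
```

Parsers that do read external DTDs should resolve the same references from the DTD file named in the DOCTYPE declaration.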