When writing or reading an XML file, you need to filter out illegal characters.
According to the W3C standard, the following hexadecimal characters are not allowed to appear in an XML file, not even inside a <![CDATA[ ]]> section:
\x00-\x08
\x0b-\x0c
\x0e-\x1f
So you need to exclude the characters in these three ranges.
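To see the CDATA point in practice, here is a minimal Java sketch that feeds two small documents to the parser bundled with the JDK. The class name CdataControlCharDemo and the helper parses are illustrative names, not part of the original article, and the exact error message depends on the parser implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class CdataControlCharDemo {

    /** Returns true if the JDK's built-in DOM parser accepts the document (hypothetical helper). */
    public static boolean parses(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Malformed input, including illegal characters, ends up here.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not expected in this sketch.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A plain CDATA section.
        System.out.println(parses("<a><![CDATA[ok]]></a>"));
        // The same section with the control character 0x01 inside it.
        System.out.println(parses("<a><![CDATA[bad\u0001]]></a>"));
    }
}
```

The second document is rejected even though the control character sits inside CDATA, which is why the characters have to be removed or replaced before the XML is written.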
In .NET, you can use the Replace method of the Regex class to substitute the characters in these three ranges, for example:
// Requires: using System.Text.RegularExpressions;
string content = "as FAs fasfadfasdfasdf<234234546456";
content = Regex.Replace(content, "[\\x00-\\x08\\x0b-\\x0c\\x0e-\\x1f]", "*");
Response.Write(content);
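The same replacement can be sketched in Java with java.util.regex. The class name ControlCharRegex and the method mask are illustrative names chosen for this example; only the three ranges come from the text above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ControlCharRegex {

    // The three illegal ranges: 0x00-0x08, 0x0B-0x0C and 0x0E-0x1F.
    // Tab (0x09), line feed (0x0A) and carriage return (0x0D) are deliberately left out.
    private static final Pattern ILLEGAL =
            Pattern.compile("[\\x00-\\x08\\x0B\\x0C\\x0E-\\x1F]");

    /** Replaces every character in the three ranges with the given replacement text. */
    public static String mask(String content, String replacement) {
        // quoteReplacement keeps '$' and '\' in the replacement from being treated specially.
        return ILLEGAL.matcher(content).replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String content = "as FAs\u0001 fasf\u000Badf<2342\t34546456";
        System.out.println(mask(content, "*"));
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the expression on every call, which matters if many strings are cleaned.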
That is enough. If you want to be safer, you can also give the five symbols <, >, &, ' and " special treatment by replacing them with their entity references, namely:
< &lt;
> &gt;
& &amp;
' &apos;
" &quot;
Java processing code:
/**
 * Replaces illegal characters with d.
 * Note: each UTF-16 code unit is checked on its own, so surrogate halves are replaced as well.
 * @param text the text to clean
 * @param d the replacement character
 * @return the cleaned text, or "" if text is null
 */
public static String replaceInvaldateCharacter(String text, char d) {
    if (text != null) {
        char[] data = text.toCharArray();
        for (int i = 0; i < data.length; i++) {
            if (!isXMLCharacter(data[i]))
                data[i] = d;
        }
        return new String(data);
    }
    return "";
}

/**
 * Replaces illegal characters with spaces.
 * @param text the text to clean
 * @return the cleaned text, or "" if text is null
 */
public static String replaceInvaldateCharacter(String text) {
    return replaceInvaldateCharacter(text, (char) 0x20);
}

/**
 * Checks whether the character is a valid XML character.
 * The XML specification (http://www.w3.org/TR/REC-xml#dt-character) defines the allowed character range as:
 * Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
 * @param c the character to check
 * @return true if c is allowed in XML
 */
private static boolean isXMLCharacter(int c) {
    if (c <= 0xd7ff) {
        if (c >= 0x20)
            return true;
        else
            return c == '\n' || c == '\r' || c == '\t';
    }
    return (c >= 0xe000 && c <= 0xfffd) || (c >= 0x10000 && c <= 0x10ffff);
}
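One caveat with the char-by-char loop above: it passes each UTF-16 code unit to the check, so both halves of a surrogate pair fall in 0xD800-0xDFFF and are replaced, even when the pair encodes a legal character in the #x10000-#x10FFFF range. A possible variant that walks the string by code point is sketched below. The class name CodePointSanitizer and its method names are hypothetical and not from the original code.

```java
public class CodePointSanitizer {

    /** Same Char production as in the XML specification, applied to a full code point. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Replaces every illegal code point with d; returns "" for null input. */
    public static String replaceInvalid(String text, char d) {
        if (text == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            // codePointAt joins a valid surrogate pair into one code point;
            // an unpaired surrogate is returned as-is and fails the check.
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isXmlChar(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append(d);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+1F600 is written as a surrogate pair and should survive unchanged.
        String text = "ok \uD83D\uDE00 bad\u0000 end";
        System.out.println(replaceInvalid(text, ' '));
    }
}
```

With this variant, characters outside the Basic Multilingual Plane pass through untouched, while control characters and unpaired surrogates are still replaced.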
Once the illegal characters have been replaced, messages that contained them can be read correctly.
To sum up, the following processing is required for WML content:
1. Replace the illegal XML characters.
2. Escape special characters that may appear, such as '<', '&' and '>', or wrap the content in a CDATA section.
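The two steps can be put together roughly as follows. This is only a sketch: the class name XmlTextPipeline and its method names are made up for illustration, step 1 covers only the three control-character ranges listed earlier, and splitting "]]>" across two CDATA sections is an added precaution that the original text does not mention.

```java
import java.util.regex.Pattern;

public class XmlTextPipeline {

    // Step 1 pattern: the three illegal control-character ranges.
    private static final Pattern ILLEGAL =
            Pattern.compile("[\\x00-\\x08\\x0B\\x0C\\x0E-\\x1F]");

    /** Step 1: replace illegal control characters with a space. */
    public static String replaceIllegal(String text) {
        return ILLEGAL.matcher(text).replaceAll(" ");
    }

    /** Step 2, option A: escape the five special symbols. */
    public static String escape(String text) {
        // '&' goes first so the entities added afterwards are not escaped again.
        return text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("'", "&apos;")
                .replace("\"", "&quot;");
    }

    /** Step 2, option B: wrap the text in CDATA. */
    public static String wrapInCdata(String text) {
        // "]]>" would end the section early, so it is split across two sections.
        return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    public static void main(String[] args) {
        String raw = "a\u0001<b>&]]>c";
        String cleaned = replaceIllegal(raw);
        System.out.println(escape(cleaned));
        System.out.println(wrapInCdata(cleaned));
    }
}
```

Either option for step 2 works after step 1. Escaping keeps the text readable by any consumer, while CDATA avoids touching markup-like content but still cannot carry the illegal characters, so step 1 is needed in both cases.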