Today I ran into a JSON.parse failure under IE7. Troubleshooting revealed that the server-side configuration file was encoded as UTF-8 with a BOM, so the output string began with a BOM character and was not valid JSON.
IE7 has no native JSON support, so our project uses json2.js. Failing to parse JSON that begins with a BOM character is not the fault of json2.js, however; other browsers behave normally only because they ignore a BOM at the start of the response body. Written as follows, the code throws an exception in every browser:
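<script>
// In the original file the string began with a literal, invisible BOM
// character; it is written here as the escape \uFEFF so that it can be seen.
var a = '\uFEFF{"a": 1}';
try {
    JSON.parse(a);
} catch (e) {
    alert(e.message);
}
</script>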
Pasting the code with the literal character into an editor that renders non-printing characters, such as CodeMirror, makes the invisible BOM easy to spot.
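When no such editor is at hand, the same check can be made in code. The following is a minimal sketch, not part of the original troubleshooting: the helper name `revealInvisible` is hypothetical, and it simply rewrites every character outside printable ASCII as a `\uXXXX` escape so that a leading BOM shows up.

```javascript
// Hypothetical helper: make invisible characters visible by escaping
// everything outside printable ASCII as \uXXXX.
function revealInvisible(text) {
    return text.replace(/[^\x20-\x7E]/g, function (ch) {
        var hex = ch.charCodeAt(0).toString(16).toUpperCase();
        return "\\u" + ("0000" + hex).slice(-4);
    });
}

// A response body that starts with a BOM, as in the failing case above.
var body = "\uFEFF{\"a\": 1}";

console.log(revealInvisible(body));          // \uFEFF{"a": 1}
console.log(body.charCodeAt(0) === 0xFEFF);  // true
```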
In today's scenario the problem lies with the interface provider, but for the sake of robust code we can still trim the string before passing it to JSON.parse.
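A minimal sketch of that defensive approach, assuming the response body is available as a string. The name `parseJsonLenient` is hypothetical, and in IE7 `JSON.parse` would be the one supplied by json2.js.

```javascript
// Characters to strip before parsing: \s plus BOM and NBSP, which old
// engines do not include in \s.
var rtrimEnds = /^[\s\uFEFF\xA0]+|[\s\uFEFF\xA0]+$/g;

// Hypothetical helper: tolerate a BOM (or stray whitespace) around JSON text.
function parseJsonLenient(text) {
    return JSON.parse(String(text).replace(rtrimEnds, ""));
}

var strictFailed = false;
try {
    JSON.parse("\uFEFF{\"a\": 1}");   // a BOM is not valid JSON whitespace
} catch (e) {
    strictFailed = true;
}

console.log(strictFailed);                            // true
console.log(parseJsonLenient("\uFEFF{\"a\": 1}").a);  // 1
```

Stripping only at the ends keeps any whitespace inside string values untouched.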
String.prototype.trim is a method added in ES5; for old browsers we still have to provide our own trim implementation. Let's look at how QWrap and jQuery implement trim:
// jQuery 1.7.2:
trimLeft = /^[\s\xA0]+/;
trimRight = /[\s\xA0]+$/;
return text == null ? "" : text.toString().replace(trimLeft, "").replace(trimRight, "");
jQuery 1.7.2 strips \s and \xA0 from both ends of the string. In old versions of IE, \s is equivalent to [ \t\v\f\r\n]. The meanings of these characters are shown in the following table:
Name | Unicode encoding | String representation | Description
<SP> | U+0020 | " ", "\x20", "\u0020" | Half-width space, the keyboard space bar
<TAB> | U+0009 | "\t", "\x09", "\u0009" | Tab, the keyboard Tab key
<VT> | U+000B | "\v", "\x0b", "\u000b" | Vertical tab
<FF> | U+000C | "\f", "\x0c", "\u000c" | Form feed (page break)
<CR> | U+000D | "\r", "\x0d", "\u000d" | Carriage return
<LF> | U+000A | "\n", "\x0a", "\u000a" | Line feed
<NBSP> | U+00A0 | "\xa0", "\u00a0" | No-break space, whitespace that prohibits automatic line wrapping
The last one, the no-break space <NBSP>, is the &nbsp; entity that is used so often in HTML. In HTML, consecutive whitespace characters (half-width spaces, line breaks, tabs and so on) are collapsed into a single space when rendered, whereas &nbsp; is not collapsed with adjacent whitespace characters.
As you can see, at least in old versions of IE, jQuery 1.7.2 cannot strip BOM characters from the ends of a string.
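To see why, the sketch below emulates the old-IE meaning of `\s` with an explicit character class. This emulation is an assumption made only for illustration, since modern engines already treat the BOM as part of `\s`; `oldStyleTrim` is a hypothetical name.

```javascript
// Emulate old IE, where \s only covers [ \t\v\f\r\n]; \xA0 is added as in
// the jQuery 1.7.2 expressions. This emulation is for illustration only.
var trimLeft = /^[ \t\v\f\r\n\xA0]+/;
var trimRight = /[ \t\v\f\r\n\xA0]+$/;

function oldStyleTrim(text) {
    return text == null ? "" : text.toString().replace(trimLeft, "").replace(trimRight, "");
}

var trimmed = oldStyleTrim("\uFEFF  {\"a\": 1}  ");

// The trailing spaces are removed, but the BOM blocks the leading match,
// so the BOM and the two spaces after it survive.
console.log(trimmed.charCodeAt(0) === 0xFEFF);  // true
console.log(trimmed.length);                    // 11
```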
// jQuery 1.8.1:
rtrim = /^[\s\uFEFF\xA0]+|[\s\uFEFF\xA0]+$/g;
return text == null ? "" : text.toString().replace(rtrim, "");
jQuery 1.8.1 builds on the previous version and additionally strips \uFEFF. This is a whitespace character newly added in ES5, called the "byte order mark", which is the BOM mentioned earlier.
Name | Unicode encoding | String representation
<BOM> | U+FEFF | "\ufeff"
Before Unicode 3.2, \uFEFF also denoted the "zero width no-break space". Unicode 3.2 added \u2060 to represent the zero width no-break space, leaving \uFEFF to denote only the byte order mark.
As you can see, jQuery 1.8.1 can strip the BOM. In addition, because the native trim in some browsers does not strip <NBSP> or <BOM>, jQuery adds a detection step: the existence of a native trim does not mean the native one is necessarily used.
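The idea behind that detection can be sketched as follows. This is not jQuery's actual source: `pickTrim` and its `nativeTrim` parameter are hypothetical names, and the native method is passed in as an argument only so that a deficient implementation can be simulated.

```javascript
var rtrim = /^[\s\uFEFF\xA0]+|[\s\uFEFF\xA0]+$/g;

// Hypothetical helper: use nativeTrim only if it exists and strips both
// BOM and NBSP; otherwise fall back to the regular expression.
function pickTrim(nativeTrim) {
    if (nativeTrim && nativeTrim.call("\uFEFF\xA0") === "") {
        return function (text) {
            return text == null ? "" : nativeTrim.call(text);
        };
    }
    return function (text) {
        return text == null ? "" : String(text).replace(rtrim, "");
    };
}

// Simulated deficient native trim that only knows ASCII spaces.
function asciiOnlyTrim() {
    return String(this).replace(/^ +| +$/g, "");
}

var trim = pickTrim(asciiOnlyTrim);
console.log(trim("\uFEFF x \xA0") === "x");  // true, the fallback was chosen
```

In a real page the argument would simply be `String.prototype.trim`, which is undefined in browsers without ES5 support.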
// QWrap 1.1.6:
return s.replace(/^[\s\ufeff\xa0\u3000]+|[\ufeff\xa0\u3000\s]+$/g, "");
QWrap's trim additionally strips \u3000. This is the "ideographic space" used with CJK unified ideographs; simply put, it is the Chinese full-width space we usually encounter.
Unicode encoding | String representation | Description
U+3000 | "　", "\u3000" | IDEOGRAPHIC SPACE, the CJK full-width space
Now let's look at which characters a browser's trim should handle, as described in the ES5 specification:
Let T be a String value that is a copy of S with both leading and trailing white space removed. The definition of white space is the union of WhiteSpace and LineTerminator.
This means that WhiteSpace and LineTerminator characters at both ends of the string are removed.
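The ES5 specification defines WhiteSpace to include, in addition to the <SP>, <TAB>, <VT>, <FF>, <NBSP> and <BOM> mentioned above, the other whitespace characters designated <USP>. USP denotes the characters in the Unicode "Separator, space" (Zs) category. The table below (Source: 1, 2) lists them; note that it also includes a few related zero-width characters (U+200B, U+200C, U+200D and U+2060) that are format characters rather than members of that category: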
Unicode encoding | Description
U+0020 | SPACE, <SP>
U+00A0 | NO-BREAK SPACE, <NBSP>
U+1680 | OGHAM SPACE MARK, the Ogham space
U+180E | MONGOLIAN VOWEL SEPARATOR
U+2000 | EN QUAD
U+2001 | EM QUAD
U+2002 | EN SPACE. Same width as an en (half an em)
U+2003 | EM SPACE. Same width as an em
U+2004 | THREE-PER-EM SPACE. One third of an em
U+2005 | FOUR-PER-EM SPACE. One quarter of an em
U+2006 | SIX-PER-EM SPACE. One sixth of an em
U+2007 | FIGURE SPACE. Same width as a single digit
U+2008 | PUNCTUATION SPACE. Same width as the narrow punctuation of the same font
U+2009 | THIN SPACE. One sixth or one fifth of an em wide
U+200A | HAIR SPACE. Narrower than a thin space
U+200B | ZERO WIDTH SPACE, <ZWSP>
U+200C | ZERO WIDTH NON-JOINER, <ZWNJ>
U+200D | ZERO WIDTH JOINER, <ZWJ>
U+202F | NARROW NO-BREAK SPACE
U+205F | MEDIUM MATHEMATICAL SPACE. Used in mathematical formulas
U+2060 | WORD JOINER. Like U+200B, but does not allow a line break. Added in Unicode 3.2 to replace U+FEFF
U+3000 | IDEOGRAPHIC SPACE. The full-width space
U+FEFF | BYTE ORDER MARK, <BOM>. Its no-break function was deprecated in Unicode 3.2
Next, the specification's definition of LineTerminator: in addition to the <LF> (line feed) and <CR> (carriage return) described earlier, there are two more:
Name | Unicode encoding | Description
<LS> | U+2028 | Line separator
<PS> | U+2029 | Paragraph separator
As you can see, the trim method defined by ES5 is very powerful. As for browser implementations, I tested Chrome and Firefox: most of the invisible characters above are stripped by trim, and they are also matched by \s in regular expressions.
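A rough way to repeat this check in any engine is sketched below. `probeWhitespace` is a hypothetical helper, and the list of code points is only a sample. Results for some code points can differ between engines (U+180E, for example, has changed Unicode category over time), so no full output is claimed here.

```javascript
// Hypothetical helper: for each code point, report whether native trim
// strips it and whether \s matches it.
function probeWhitespace(codePoints) {
    var report = {};
    for (var i = 0; i < codePoints.length; i++) {
        var ch = String.fromCharCode(codePoints[i]);
        var hex = codePoints[i].toString(16).toUpperCase();
        var key = "U+" + ("0000" + hex).slice(-4);
        report[key] = {
            trimmed: (ch + "x" + ch).trim() === "x",
            matchedByS: /^\s$/.test(ch)
        };
    }
    return report;
}

var report = probeWhitespace([0x0020, 0x00A0, 0x200B, 0x2028, 0x3000, 0xFEFF]);
console.log(report);
```

In an ES5-conformant engine, U+FEFF and U+3000 should report true for both fields, while U+200B, which is a format character rather than a space separator, should report false.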
The trim implementations in QWrap and jQuery handle only the common characters, which is usually sufficient. If you need a trim that is more consistent with ES5, take a look at the es5-shim project:
var ws = "\x09\x0a\x0b\x0c\x0d\x20\xa0\u1680\u180e\u2000\u2001\u2002\u2003" +
    "\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u202f\u205f\u3000\u2028" +
    "\u2029\ufeff";
// Install the shim if trim is missing, or if the native trim fails to
// strip every character in ws (ws.trim() is then a non-empty string).
if (!String.prototype.trim || ws.trim()) {
    // http://blog.stevenlevithan.com/archives/faster-trim-javascript
    // http://perfectionkills.com/whitespace-deviations/
    ws = "[" + ws + "]";
    var trimBeginRegexp = new RegExp("^" + ws + ws + "*"),
        trimEndRegexp = new RegExp(ws + ws + "*$");
    String.prototype.trim = function trim() {
        if (this === void 0 || this === null) {
            throw new TypeError("can't convert " + this + " to object");
        }
        return String(this)
            .replace(trimBeginRegexp, "")
            .replace(trimEndRegexp, "");
    };
}
Trim in BOM and JavaScript