這是一個建立於 的文章,其中的資訊可能已經有所發展或是發生改變。
原文:
Hi there,
I just discovered Go and decided to port a little program to Go.
The program reads JSON-Data from an URL and process the Data. The Go
port works well till now.
I dont have any influence on the JSON data and so sometimes there are
control character in it and my program crashes with "invalid character
'\x12' in string literal"
here the code sample of my program:
http_return, err := http.Get(newurl) var http_body []byte; if err == nil { http_body, err = ioutil.ReadAll(http_return.Body) http_return.Body.Close() } param_info := make(map[string]interface{}) param_err := json.Unmarshal(http_body, ¶m_info)
This Unmarshal call crashes with "invalid character '\x12' in string
literal"
How do I remove this or any similar control character from the JSON
data. I dont need those characters so they can simply be removed.
Thanks
Nico
==============================================================
意思大概就是說,json是通過網路傳輸或者其它方法擷取的,然后里面可能包含特殊字元,這樣呢,在go裡面用json 解析就會報錯。
於是有個哥們就出了一招。
========================
If you don't mind them being replaced with spaces, you could do something like:
for i, ch := range http_body { switch { case ch > '~': http_body[i] = ' ' case ch == '\r': case ch == '\n': case ch == '\t': case ch < ' ': http_body[i] = ' ' }}
~K========================================具體什麼原理呢?還不清楚。大概是根據字元集來判斷哪些是json裡面的非法字元,然後替換成空格。 電腦,字元集。。。。。後面需要研究一下了。 ===================================看一下ASCII可顯示字元表,就明白了。 空格開始,~結束。加上\r\n\t這3個字元。
ASCII可顯示字元
二進位 |
十進位 |
十六進位 |
圖形 |
0010 0000 |
32 |
20 |
(空格)(␠) |
0010 0001 |
33 |
21 |
! |
0010 0010 |
34 |
22 |
" |
0010 0011 |
35 |
23 |
# |
0010 0100 |
36 |
24 |
$ |
0010 0101 |
37 |
25 |
% |
0010 0110 |
38 |
26 |
& |
0010 0111 |
39 |
27 |
' |
0010 1000 |
40 |
28 |
( |
0010 1001 |
41 |
29 |
) |
0010 1010 |
42 |
2A |
* |
0010 1011 |
43 |
2B |
+ |
0010 1100 |
44 |
2C |
, |
0010 1101 |
45 |
2D |
- |
0010 1110 |
46 |
2E |
. |
0010 1111 |
47 |
2F |
/ |
0011 0000 |
48 |
30 |
0 |
0011 0001 |
49 |
31 |
1 |
0011 0010 |
50 |
32 |
2 |
0011 0011 |
51 |
33 |
3 |
0011 0100 |
52 |
34 |
4 |
0011 0101 |
53 |
35 |
5 |
0011 0110 |
54 |
36 |
6 |
0011 0111 |
55 |
37 |
7 |
0011 1000 |
56 |
38 |
8 |
0011 1001 |
57 |
39 |
9 |
0011 1010 |
58 |
3A |
: |
0011 1011 |
59 |
3B |
; |
0011 1100 |
60 |
3C |
< |
0011 1101 |
61 |
3D |
= |
0011 1110 |
62 |
3E |
> |
0011 1111 |
63 |
3F |
? |
|
|
二進位 |
十進位 |
十六進位 |
圖形 |
0100 0000 |
64 |
40 |
@ |
0100 0001 |
65 |
41 |
A |
0100 0010 |
66 |
42 |
B |
0100 0011 |
67 |
43 |
C |
0100 0100 |
68 |
44 |
D |
0100 0101 |
69 |
45 |
E |
0100 0110 |
70 |
46 |
F |
0100 0111 |
71 |
47 |
G |
0100 1000 |
72 |
48 |
H |
0100 1001 |
73 |
49 |
I |
0100 1010 |
74 |
4A |
J |
0100 1011 |
75 |
4B |
K |
0100 1100 |
76 |
4C |
L |
0100 1101 |
77 |
4D |
M |
0100 1110 |
78 |
4E |
N |
0100 1111 |
79 |
4F |
O |
0101 0000 |
80 |
50 |
P |
0101 0001 |
81 |
51 |
Q |
0101 0010 |
82 |
52 |
R |
0101 0011 |
83 |
53 |
S |
0101 0100 |
84 |
54 |
T |
0101 0101 |
85 |
55 |
U |
0101 0110 |
86 |
56 |
V |
0101 0111 |
87 |
57 |
W |
0101 1000 |
88 |
58 |
X |
0101 1001 |
89 |
59 |
Y |
0101 1010 |
90 |
5A |
Z |
0101 1011 |
91 |
5B |
[ |
0101 1100 |
92 |
5C |
\ |
0101 1101 |
93 |
5D |
] |
0101 1110 |
94 |
5E |
^ |
0101 1111 |
95 |
5F |
_ |
|
|
二進位 |
十進位 |
十六進位 |
圖形 |
0110 0000 |
96 |
60 |
` |
0110 0001 |
97 |
61 |
a |
0110 0010 |
98 |
62 |
b |
0110 0011 |
99 |
63 |
c |
0110 0100 |
100 |
64 |
d |
0110 0101 |
101 |
65 |
e |
0110 0110 |
102 |
66 |
f |
0110 0111 |
103 |
67 |
g |
0110 1000 |
104 |
68 |
h |
0110 1001 |
105 |
69 |
i |
0110 1010 |
106 |
6A |
j |
0110 1011 |
107 |
6B |
k |
0110 1100 |
108 |
6C |
l |
0110 1101 |
109 |
6D |
m |
0110 1110 |
110 |
6E |
n |
0110 1111 |
111 |
6F |
o |
0111 0000 |
112 |
70 |
p |
0111 0001 |
113 |
71 |
q |
0111 0010 |
114 |
72 |
r |
0111 0011 |
115 |
73 |
s |
0111 0100 |
116 |
74 |
t |
0111 0101 |
117 |
75 |
u |
0111 0110 |
118 |
76 |
v |
0111 0111 |
119 |
77 |
w |
0111 1000 |
120 |
78 |
x |
0111 1001 |
121 |
79 |
y |
0111 1010 |
122 |
7A |
z |
0111 1011 |
123 |
7B |
{ |
0111 1100 |
124 |
7C |
| |
0111 1101 |
125 |
7D |
} |
0111 1110 |
126 |
7E |
~ |
|