Spring MVC 3.1: the @ResponseBody annotation generates a very large Accept-Charset header
With @ResponseBody, Spring 3 MVC generates a very large response header (Accept-Charset can exceed 4 KB), because StringHttpMessageConverter by default writes every available character set into that header.
It is mainly used for character encoding to prevent garbled characters.
String.getBytes() encodes the string into a byte sequence using the platform's default character set and stores the result in a new byte array.
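A minimal sketch of that behavior, using only the JDK: the same string produces byte arrays of different lengths depending on the charset, which is why passing an explicit charset is usually safer than relying on the platform default. The class name `GetBytesDemo` and the helper `byteLength` are illustrative, not taken from the original article.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class GetBytesDemo {
    // Hypothetical helper: number of bytes the string occupies in the given charset.
    static int byteLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        // "café", written with an escape so the source file encoding does not matter.
        String s = "caf\u00e9";

        // No-argument getBytes() uses the platform default, so this value can vary by machine.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("default bytes:   " + s.getBytes().length);

        // Explicit charsets give the same result everywhere.
        System.out.println("UTF-8 bytes:      " + byteLength(s, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 bytes: " + byteLength(s, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-16BE bytes:   " + byteLength(s, StandardCharsets.UTF_16BE));
    }
}
```

Only the explicit-charset calls are portable; the first two lines of output depend on how the JVM was started.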
Strings in Java hold UTF-16 encoded characters (see the blog post "Java correctly traversing strings"). The Charset class establishes mappings between UTF-16 encoded sequences and byte sequences in other character encodings. When reading from the outside of the
With @ResponseBody, Spring 3 MVC produces very large response headers (Accept-Charset can exceed 4 KB). The reason is that StringHttpMessageConverter.writeInternal() writes all available character sets back to the response header by default.
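To get a feel for why the header grows so large, the following sketch joins the names of every charset known to the running JVM into a single header value and prints its length. It only approximates what the converter does (the lowercase names and the ", " separator are assumptions), and the class and method names are illustrative.

```java
import java.nio.charset.Charset;
import java.util.Locale;
import java.util.StringJoiner;

public class AcceptCharsetSize {
    // Joins the name of every charset the JVM knows into one comma-separated value.
    static String buildAcceptCharsetValue() {
        StringJoiner joiner = new StringJoiner(", ");
        for (Charset cs : Charset.availableCharsets().values()) {
            joiner.add(cs.name().toLowerCase(Locale.ENGLISH));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        String value = buildAcceptCharsetValue();
        // Both numbers depend on the JDK build and its installed charset providers.
        System.out.println("available charsets:  " + Charset.availableCharsets().size());
        System.out.println("header value length: " + value.length() + " characters");
    }
}
```

StringHttpMessageConverter exposes a writeAcceptCharset property, and setting it to false is the commonly cited way to suppress this header; verify that against the Spring version in use.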
Use java.nio.charset.CharsetDecoder to automatically recognize character sets
I studied the methods for automatic character set recognition that can be found on the Internet. The effective method is to use the third-party
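One JDK-only way to approach this with CharsetDecoder is sketched below: configure the decoder to report malformed input instead of silently replacing it, then try a list of candidate charsets in order and keep the first that decodes cleanly. This is a heuristic, not real detection; the candidate list, its order, and the names `CharsetGuesser`, `decodesCleanly`, and `guess` are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class CharsetGuesser {
    // True if the bytes decode under cs with no malformed or unmappable input.
    static boolean decodesCleanly(byte[] bytes, Charset cs) {
        CharsetDecoder decoder = cs.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Returns the first candidate that decodes cleanly, or null if none does.
    static Charset guess(byte[] bytes, List<Charset> candidates) {
        for (Charset cs : candidates) {
            if (decodesCleanly(bytes, cs)) {
                return cs;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Strict charsets first; ISO-8859-1 accepts any byte sequence, so it goes last.
        List<Charset> candidates = Arrays.asList(
                StandardCharsets.US_ASCII,
                StandardCharsets.UTF_8,
                StandardCharsets.ISO_8859_1);
        byte[] sample = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println("guessed charset: " + guess(sample, candidates));
    }
}
```

Because single-byte charsets such as ISO-8859-1 never reject input, this approach can only rule charsets out; it cannot confirm that a guess is right.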
You have surely run into this: you open a web page and it shows a pile of garbled text such as "бїяазъся" or "????????". Remember the HTTP header fields Accept-Charset, Accept-Encoding, Accept-Language, Content-Encoding, and Content-Language? And
As is well known, the getBytes() method of String returns the string as a byte array. Note, however, that the array it returns is encoded in the operating system's default encoding. If you do not take this into
using System;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;
class Program
{ // Obtains the HTML content of a web page and automatically determines the encoding based on the charset of the
The original question is as follows:
Http://topic.csdn.net/u/20080902/02/a6445aa1-2e6b-45c6-a47c-79009718c0fa.html
The contents of an HTML Web page are roughly as follows:
CSDN首页 ... .....
I use the following statement to crawl a page
1. request.setCharacterEncoding() sets the encoding applied to values taken from the request or from the database. Once it is specified, the correct string can be obtained directly through getParameter(); if it is not specified, the ISO8859-1
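When a parameter has already been decoded as ISO-8859-1 although the client sent UTF-8 bytes, a widely used workaround is to turn the string back into its raw bytes and decode them again. The sketch below simulates that situation with the JDK alone (no servlet container involved); the class name `ParameterRecoding` and the method `recoverUtf8` are hypothetical, and it assumes the original bytes really were UTF-8.

```java
import java.nio.charset.StandardCharsets;

public class ParameterRecoding {
    // Re-interprets a string that was decoded as ISO-8859-1 although its bytes were UTF-8.
    static String recoverUtf8(String misdecoded) {
        // ISO-8859-1 maps every byte to exactly one char, so the raw bytes come back unchanged.
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "café"

        // Simulate a container that decoded UTF-8 request bytes as ISO-8859-1.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        System.out.println("garbled matches original:   " + misdecoded.equals(original));
        System.out.println("recovered matches original: " + recoverUtf8(misdecoded).equals(original));
    }
}
```

Calling request.setCharacterEncoding() before the first parameter is read avoids the problem at the source, so the re-decoding step is only a fallback.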