WeChat public account development tutorial, Article 1: the length limit on text message content

Source: Internet
Author: User

Many readers have run into this problem: when the text message sent in reply is too long, the public account does not respond at all. What is the maximum length allowed for a text message? How should the length of the text be calculated? And why does the maximum supported length seem to be only a little over 1300? This article answers these questions.

 

The message length limit in the API documentation is 2048.

The interface documentation clearly states that the length of the reply message content cannot exceed 2048 bytes. So why do many people get no response once the message content exceeds roughly 1300 characters? Answering that requires first being clear about how to calculate the number of bytes that text occupies.

 

How to correctly calculate the number of bytes occupied by text

The first thing that comes to mind for calculating the number of bytes a String occupies is the getBytes() method of the String class. It returns the byte array corresponding to the String, and the length of that array is the number of bytes the string occupies. For example:

 

public static void main(String[] args) {
    // "柳峰" is a string of two Chinese characters
    // running result: 4
    System.out.println("柳峰".getBytes().length);
}

In this example, two Chinese characters occupy 4 bytes, that is, one Chinese character occupies 2 bytes. Is that really so? We have overlooked something: the number of bytes a Chinese character occupies differs from one encoding to another. The example does not specify an encoding, so the platform's default encoding is used. Consider three conclusions:

 

1) If the above example runs on a platform whose default encoding is ISO8859-1, the result is 2;

2) If the above example runs on a platform whose default encoding is GB2312 or GBK, the result is 4;

3) If the above example runs on a platform whose default encoding is UTF-8, the result is 6.
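
Which of the three conclusions applies depends on the charset the JVM treats as its default. The following is a minimal sketch for checking this on your own machine; the class name DefaultCharsetDemo and the method defaultByteLength are illustrative names, not part of the original tutorial. The string is written with Unicode escapes so that the result does not depend on the encoding of the source file.

```java
import java.nio.charset.Charset;

class DefaultCharsetDemo {
    // Two Chinese characters (U+67F3, U+5CF0), written as Unicode escapes.
    static final String TWO_CHINESE_CHARS = "\u67f3\u5cf0";

    // Byte length under the JVM's default charset, which is what the
    // no-argument getBytes() uses.
    static int defaultByteLength(String s) {
        return s.getBytes().length;
    }

    public static void main(String[] args) {
        // The printed values vary from machine to machine.
        System.out.println("Default charset: " + Charset.defaultCharset());
        System.out.println("Bytes with default charset: "
                + defaultByteLength(TWO_CHINESE_CHARS));
    }
}
```

If the second line prints 4, the default is a two-byte Chinese encoding such as GBK; if it prints 6, the default is UTF-8.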

If so, does this mean that on our platform String.getBytes() uses GB2312 or GBK encoding by default? Let's look at another example:

 

import java.io.UnsupportedEncodingException;

public static void main(String[] args) throws UnsupportedEncodingException {
    // running result: 2
    System.out.println("柳峰".getBytes("ISO8859-1").length);
    // running result: 4
    System.out.println("柳峰".getBytes("GB2312").length);
    // running result: 4
    System.out.println("柳峰".getBytes("GBK").length);
    // running result: 6
    System.out.println("柳峰".getBytes("UTF-8").length);
}

This example confirms the three conclusions above. With ISO8859-1 encoding, a Chinese or English character occupies only one byte; with GB2312 or GBK encoding, a Chinese character occupies two bytes; and with UTF-8 encoding, a Chinese character occupies three bytes.
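
One caveat about the ISO8859-1 result is worth checking: that charset has no Chinese characters, so the single byte per character is a substitute '?' rather than a real encoding of the character. The sketch below illustrates this with a round trip through bytes; LossyEncodingDemo and roundTrip are names made up for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class LossyEncodingDemo {
    // Two Chinese characters (U+67F3, U+5CF0), written as Unicode escapes.
    static final String TEXT = "\u67f3\u5cf0";

    // Encode the string to bytes and decode it again with the same charset.
    static String roundTrip(String s, Charset charset) {
        return new String(s.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // ISO-8859-1 cannot represent the characters, so each becomes '?'.
        System.out.println(roundTrip(TEXT, StandardCharsets.ISO_8859_1)); // ??
        // UTF-8 preserves the text exactly.
        System.out.println(roundTrip(TEXT, StandardCharsets.UTF_8).equals(TEXT)); // true
    }
}
```

So the byte count of 2 under ISO8859-1 measures a damaged string, which is another reason to name the encoding explicitly.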

 

 

The encoding method used by the platform and the calculation of the number of bytes occupied by the string

So which encoding is used when we return a message to the WeChat server? UTF-8, of course, because the doPost method contains the following code to avoid garbled Chinese characters:

// Set the request and response encoding to UTF-8 (to prevent garbled Chinese characters)
request.setCharacterEncoding("UTF-8");
response.setCharacterEncoding("UTF-8");

To verify this, I wrote an example to test:

 

 

private static String getMsgContent() {
    StringBuffer buffer = new StringBuffer();
    // The original text is song lyrics, appended 70 Chinese characters per line:
    // 682 Chinese characters in total, plus one English exclamation mark.
    // A single repeated Chinese character stands in for the lyrics here;
    // the byte counts are the same.
    for (int i = 0; i < 682; i++) {
        buffer.append('中');
    }
    buffer.append("!");
    return buffer.toString();
}

public static void main(String[] args) throws Exception {
    // occupies 1365 bytes when GB2312 encoding is used
    System.out.println(getMsgContent().getBytes("gb2312").length);
    // occupies 2047 bytes when UTF-8 encoding is used
    System.out.println(getMsgContent().getBytes("UTF-8").length);
}

The content returned by getMsgContent() is exactly the maximum a text message can carry. In other words, with UTF-8 encoding a text message supports at most 2047 bytes: the reply content described in the public platform interface documentation must stay below 2048 bytes, and exactly 2048 bytes is already too many. You can try it yourself: add one more English character to the content returned by getMsgContent(), and the public account will no longer respond.
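
The boundary described here can be sketched as a simple check before sending a reply. This is only an illustration of the rule as the article states it; ReplyLimitCheck, LIMIT_BYTES, fitsInReply and chinese are hypothetical names, and the test strings are built from a repeated placeholder character.

```java
import java.nio.charset.StandardCharsets;

class ReplyLimitCheck {
    // Limit taken from the article: the content must stay below 2048 bytes.
    static final int LIMIT_BYTES = 2048;

    // True if the content, encoded as UTF-8, is strictly below the limit.
    static boolean fitsInReply(String content) {
        return content.getBytes(StandardCharsets.UTF_8).length < LIMIT_BYTES;
    }

    // Builds a string of n copies of one Chinese character (U+4E2D).
    static String chinese(int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append('\u4e2d');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String atLimit = chinese(682) + "!";   // 682 * 3 + 1 = 2047 bytes
        String overLimit = atLimit + "!";       // 2048 bytes
        System.out.println(fitsInReply(atLimit));   // true
        System.out.println(fitsInReply(overLimit)); // false
    }
}
```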

We also see that if the byte count of the text returned by getMsgContent() is calculated with GB2312 encoding, the result is 1365. This is why many people say that the maximum length of a text message seems to be only a little over 1300 bytes rather than the 2048 bytes stated in the interface documentation. They have overlooked the encoding: they simply call the getBytes() method of the String class, rather than getBytes("UTF-8"), to calculate the number of bytes occupied.

 

Encapsulating the UTF-8 byte-count calculation in a Java method

 

/**
 * Calculate the number of bytes occupied by a String when UTF-8 encoding is used
 *
 * @param content the text to measure
 * @return the number of bytes in UTF-8, or 0 if content is null
 */
public static int getByteSize(String content) {
    int size = 0;
    if (null != content) {
        try {
            // a Chinese character occupies three bytes when UTF-8 encoded
            size = content.getBytes("UTF-8").length;
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }
    return size;
}
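
Knowing the byte count is one half of the job; the other is keeping a reply within the limit. The following is a sketch of one possible approach, not something taken from the original tutorial: it cuts a string at a character boundary so that its UTF-8 size stays within a given budget. Utf8Truncate, utf8Length and truncateUtf8 are hypothetical names, and the per-character sizes assume well-formed text.

```java
import java.nio.charset.StandardCharsets;

class Utf8Truncate {
    // Number of bytes UTF-8 needs for one Unicode code point.
    static int utf8Length(int codePoint) {
        if (codePoint < 0x80) return 1;
        if (codePoint < 0x800) return 2;
        if (codePoint < 0x10000) return 3;
        return 4;
    }

    // Returns the longest prefix of content whose UTF-8 encoding is at most
    // maxBytes, never cutting through the middle of a character.
    static String truncateUtf8(String content, int maxBytes) {
        int bytes = 0;
        int i = 0;
        while (i < content.length()) {
            int cp = content.codePointAt(i);
            int len = utf8Length(cp);
            if (bytes + len > maxBytes) {
                break;
            }
            bytes += len;
            i += Character.charCount(cp);
        }
        return content.substring(0, i);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int n = 0; n < 700; n++) {
            sb.append('\u4e2d'); // placeholder Chinese character, 3 bytes in UTF-8
        }
        String cut = truncateUtf8(sb.toString(), 2047);
        System.out.println(cut.length());                                // 682
        System.out.println(cut.getBytes(StandardCharsets.UTF_8).length); // 2046
    }
}
```

Cutting by characters rather than by raw bytes avoids leaving half of a three-byte Chinese character at the end of the message.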

 

That is all for this article. I hope you take away not only the number 2047 but also a better understanding of character encodings.

