Exploring n in char(n)/varchar(n): number of characters or number of bytes?

Source: Internet
Author: User

[Background] When I presented the database table I had designed, yu asked me: why is the top_info field of your top_info defined as varchar(100), and how many Chinese characters can a length of 100 actually hold?

My first thought was that how much text fits into this length of 100 must depend on the encoding used by the database. I searched Baidu for this series of questions, but the answers online were so inconsistent that I had to find out for myself.

The first thing to note is that in utf8 encoding a Chinese character occupies three bytes, while in gbk encoding it occupies two bytes. To check this, store a value containing one Chinese character and one English letter in a varchar field vname, then run select length(vname) from tablename; the result shows at a glance how many bytes the stored data occupies. Running select char_length(vname) from tablename; returns the number of characters instead, and comparing the two results confirms the byte sizes given above.
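The relationship between the two functions can be reproduced outside the database. The following is a minimal Python sketch, not MySQL itself: the helper name byte_and_char_length and the sample string are assumptions made for illustration, and the two returned numbers correspond to what length() and char_length() would report for the same text in each encoding.

```python
def byte_and_char_length(text: str, encoding: str) -> tuple:
    """Return (bytes, characters), analogous to LENGTH() and CHAR_LENGTH()."""
    return len(text.encode(encoding)), len(text)


# Hypothetical sample: one Chinese character plus one ASCII letter.
sample = "中a"

for enc in ("utf-8", "gbk"):
    n_bytes, n_chars = byte_and_char_length(sample, enc)
    print(enc, n_bytes, n_chars)
# utf-8 4 2
# gbk 3 2
```

The character count is the same in both encodings; only the byte count changes.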

Back to the question: how many Chinese characters can a length of 100 hold? Since a Chinese character occupies three bytes in utf8, can the field only store 100/3, that is about 33, Chinese characters? In other words, is the 100 a number of bytes or a number of characters? To find out, I ran the following test:

I defined the following table in MySQL on Linux:

CREATE TABLE `test` (
  `V` varchar(4) DEFAULT NULL,
  `C` char(4) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;


Then I ran insert into test values ('Hello bib hao', 'Hello bib hao'); where each value was a string of four Chinese characters. The statement produced 2 warnings, and a subsequent select showed that only one Chinese character had been saved. This seemed to indicate that the 4 in varchar(4) and char(4) is a number of bytes; by the same logic, 100 would be a number of bytes, and only 33 Chinese characters could be stored. (Note: when a string exceeds the declared length, MySQL in non-strict SQL mode still stores it, but issues a warning and automatically truncates the excess.)
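The arithmetic behind this tentative conclusion can be checked with a small Python sketch. It only models the hypothesis that n is a byte budget; it does not claim that this is how MySQL truncates. The function name chars_that_fit and the sample strings are assumptions made for illustration.

```python
def chars_that_fit(text: str, byte_limit: int, encoding: str = "utf-8") -> str:
    """Keep the longest prefix of text whose encoded size stays within byte_limit."""
    kept = []
    used = 0
    for ch in text:
        size = len(ch.encode(encoding))
        if used + size > byte_limit:
            break  # the next character would exceed the byte budget
        kept.append(ch)
        used += size
    return "".join(kept)


# If 100 were a byte limit, 3-byte characters would fit 33 times (99 bytes).
print(len(chars_that_fit("中" * 100, 100)))  # 33

# If 4 were a byte limit, only one of four Chinese characters would fit.
print(chars_that_fit("中文字符", 4))  # 中
```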

 

The answer seemed clear, but then another question came up!

 

I had heard that n in varchar(n) and char(n) refers to the number of characters stored. Isn't that the exact opposite of what my test had just shown? So I repeated the experiment in MySQL on my local Windows machine: I created the same table, inserted the same data, and looked at the result. I was dumbfounded!

Running insert into test values ('Hello bib hao', 'Hello bib hao'); returned Query OK with no warnings! After the select, I found that all four Chinese characters had been saved. This means that the 4 in the char(4) and varchar(4) just defined is a number of characters, not a number of bytes, and in that case a length of 100 can hold 100 Chinese characters. Doesn't this completely overturn the earlier conclusion? Both results are real observations; the only difference is the environment.
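For contrast with a byte-based reading, here is a minimal Python sketch of a length limit counted in characters. It is a model of the behaviour observed on Windows, not MySQL's implementation; store_with_char_limit and the sample string are hypothetical names chosen for illustration.

```python
def store_with_char_limit(text: str, n: int):
    """Model a column whose length n is counted in characters.

    Returns the stored value and whether anything was cut off.
    """
    stored = text[:n]
    return stored, len(stored) < len(text)


sample = "中文字符"  # hypothetical four-character sample string
stored, truncated = store_with_char_limit(sample, 4)

print(stored == sample, truncated)   # True False
print(len(stored.encode("utf-8")))   # 12
```

Under this reading all four characters are kept without truncation, even though they take 12 bytes in utf8.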

 

So when we define char and varchar fields, how do we determine what n represents in a given environment? Is it bytes or characters?

 

Unwilling to leave this unexplained, I kept asking why there was such a difference. Then yu showed me this command:

show variables like '%char%';

Its output lays out clearly how the character-set variables of MySQL are defined on Linux and on Windows.

 

 

In fact, character_set_client, character_set_connection, and character_set_results can be changed. Running set names gbk or set names utf8 sets all three so that they match the client's encoding and garbled characters are avoided: if the documents edited on the client are in utf8, use set names utf8, and likewise for gbk. This command only affects how text is interpreted and converted between the client and the server; the encoding in which the database stores data is not affected by it, and the server and system settings do not change. Comparing the two outputs, the server and database encodings turned out to be latin1 on Linux and utf8 on Windows.
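One possible explanation for the Linux result, offered as an assumption rather than something the tests in this article verify: the MySQL manual states that since version 4.1 column lengths are counted in characters, so if the connection is declared latin1 while the client actually sends utf8 bytes, the server would treat every byte as one latin1 character. The Python sketch below models that situation with a hypothetical four-character sample string; Python's latin-1 codec stands in for MySQL's latin1, which is only an approximation.

```python
def simulate_latin1_connection(text: str, n: int) -> bytes:
    """Model utf8 bytes being read as latin1 characters, then cut to n characters."""
    raw = text.encode("utf-8")          # bytes the client actually sends
    as_latin1 = raw.decode("latin-1")   # each byte is read as one latin1 character
    truncated = as_latin1[:n]           # the column keeps at most n characters
    return truncated.encode("latin-1")  # the original bytes that survive


sample = "中文字符"  # hypothetical four-character sample string
kept = simulate_latin1_connection(sample, 4)

print(len(sample.encode("utf-8")))            # 12
print(len(kept))                              # 4
print(kept.decode("utf-8", errors="ignore"))  # 中
```

Under this assumption a limit of 4 characters keeps only 4 of the original 12 bytes, which decode to a single Chinese character plus a broken fragment, similar to what was seen on Linux.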

 

We know that latin1 is a single-byte encoding: every character it can represent, such as a digit or a letter, occupies exactly one byte. utf8, by contrast, covers a wide range of scripts, and the number of bytes per character varies. Most sources on the Internet say that char(n) stores n characters, presumably because they assume utf8. Under utf8, a Chinese character, an English letter, and a digit each count as one character, so n is the number of characters, and the number of bytes depends on the content of the string: 100 Chinese characters in a char(100) amount to 300 bytes of data, while 100 letters amount to 100 bytes.
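The 300-byte and 100-byte figures can be confirmed with a short Python sketch. It measures only the encoded size of the text, not how a storage engine lays out a column on disk; the helper name utf8_bytes is an assumption made for illustration.

```python
def utf8_bytes(text: str) -> int:
    """Return the size of text in bytes when encoded as UTF-8."""
    return len(text.encode("utf-8"))


hundred_chinese = "中" * 100  # 100 characters, 3 bytes each in utf8
hundred_letters = "a" * 100   # 100 characters, 1 byte each in utf8

print(len(hundred_chinese), utf8_bytes(hundred_chinese))  # 100 300
print(len(hundred_letters), utf8_bytes(hundred_letters))  # 100 100
```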

However, this still does not prove that n represents characters rather than bytes!

Under what circumstances does n represent bytes, and under what circumstances does it represent characters? And how does this relate to the encoding variables listed above? I will continue exploring. Advice is welcome!

 

Note: some of the descriptions above may be inaccurate; suggestions and corrections are welcome.

 

Appendix: Summary and comparison of char and varchar (not related to this Article)

 
