About some of the character set concepts in MySQL


I recently ran into garbled text in MySQL. I collected some material and am saving it here first; later I will gradually summarize my own way of handling the problem.

Notes:

Summary of the problem environment:

1. The front-end PHP code has not changed.

2. In the original database, all tables are UTF8.

mysql> SHOW VARIABLES LIKE '%char%';
+--------------------------+----------------------------------+
| Variable_name            | Value                            |
+--------------------------+----------------------------------+
| character_set_client     | latin1                           |
| character_set_connection | latin1                           |
| character_set_database   | latin1                           |
| character_set_filesystem | binary                           |
| character_set_results    | latin1                           |
| character_set_server     | latin1                           |
| character_set_system     | utf8                             |
| character_sets_dir       | /pub/mysql/share/mysql/charsets/ |
+--------------------------+----------------------------------+

Character set settings of the backup database (created through MySQL master-slave replication):

mysql> SHOW VARIABLES LIKE '%char%';
+--------------------------+----------------------------------+
| Variable_name            | Value                            |
+--------------------------+----------------------------------+
| character_set_client     | utf8                             |
| character_set_connection | utf8                             |
| character_set_database   | utf8                             |
| character_set_filesystem | binary                           |
| character_set_results    | utf8                             |
| character_set_server     | utf8                             |
| character_set_system     | utf8                             |
| character_sets_dir       | /opt/mysql/share/mysql/charsets/ |
+--------------------------+----------------------------------+
8 rows in set (0.00 sec)

There is also a third database (database three), whose data files were copied in full from the backup database.

Here is the problem:

1. When the existing PHP front end connects to database three, the data it reads is garbled; when it connects to the backup database or the original database, the data is normal.

2. After switching to a different PHP environment (that is, a different front-end server), connections to all three databases are normal.

Reposted from: http://blog.csdn.net/martinkro/article/details/5352474

1 Character set concepts in MySQL
MySQL's character set support involves two concepts: the character set and the collation.
1.1 Collations
A collation is a set of comparison rules. In web development the term is used almost only in MySQL, where its main role is to tell MySQL how to compare characters. For example, within the ASCII character set, a collation specifies that a is less than b, that a equals a, and whether a equals A. In general you can more or less ignore collations, because every character set has a default collation, and the default is usually the one that is used.
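As a rough analogy only (this is Python, not MySQL's collation code), the difference between a case-insensitive comparison and an exact, binary-style comparison can be sketched as follows. The function names equal_ci and equal_bin are made up for this illustration.

```python
def equal_ci(x: str, y: str) -> bool:
    """Case-insensitive comparison: 'a' and 'A' count as equal."""
    return x.casefold() == y.casefold()


def equal_bin(x: str, y: str) -> bool:
    """Binary-style comparison: the code points must match exactly."""
    return x == y


words = ["b", "A", "a", "B"]

# Sorting by code point puts all uppercase ASCII letters first.
sorted_bin = sorted(words)
# Sorting case-insensitively keeps 'A'/'a' together ahead of 'b'/'B'.
sorted_ci = sorted(words, key=str.casefold)

print(equal_ci("a", "A"), equal_bin("a", "A"))
print(sorted_bin)
print(sorted_ci)
```

The same data sorts and compares differently depending on the rule, which is the decision a collation makes for a character set.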
1.2 Character Set
By contrast, the character set is a broader concept; even an ordinary text file on Windows involves character set issues. Different character sets specify different ways of encoding characters. A character set is a set of symbols together with their encodings. The ASCII character set, for example, includes digits, uppercase and lowercase letters, the semicolon, the line break, and so on, and encodes each character in 7 bits (A is encoded as 65, b as 98).
ASCII defines encodings only for English letters and symbols, so other languages cannot be expressed in it. For this reason, different countries created encodings for their own languages; China, for example, has GB2312. But these national encodings differ from one another, which causes cross-platform problems, so international standards organizations developed internationally shared encodings, of which the most commonly used is UTF8.
ASCII encodes only English symbols and letters; GB2312 encodes English symbols, English letters, and Chinese characters; UTF8 encodes the characters of all the world's languages. So the GB2312 characters include the ASCII characters, and the UTF8 characters include the GB2312 characters. UTF8 is therefore the character set covering the widest range of characters, and multilingual web systems generally use it (phpMyAdmin uses UTF8 encoding).
Storing any text involves a character set. This applies to databases as well as to ordinary text files.
The concepts of encoding and character set are easily confused, because the name of an encoding and the name of its character set are usually the same; for example, GB2312 is both a character set name and an encoding name.
Characters: Chinese characters, English letters, punctuation marks, Latin letters, and so on.
Encoding: converting a character into the format in which a computer stores it; for example, A is represented by 65.
Character set: a set of characters together with the corresponding encoding method.
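The three definitions above can be checked with a few lines of Python. This is only an illustration using Python's built-in codecs; the sample character 中 is chosen here and does not come from the original text.

```python
# A character is an abstract symbol; an encoding maps it to stored bytes.
print(ord("A"), ord("b"))           # numeric codes of 'A' and 'b'

ascii_bytes = "A".encode("ascii")   # a single byte in ASCII
print(list(ascii_bytes))

han = "中"
gb_bytes = han.encode("gb2312")     # two bytes in GB2312
utf8_bytes = han.encode("utf-8")    # three bytes in UTF-8
print(gb_bytes.hex(), utf8_bytes.hex())

# ASCII has no code for a Chinese character at all.
try:
    han.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)
```

The same character is stored as different bytes under GB2312 and UTF-8, which is why the reader of the bytes has to know which encoding was used.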

As this shows, a character set and an encoding are two different concepts. One character set can have several encodings; the Unicode character set, for example, can be encoded as UTF-8, UTF-16, UTF-32, and so on. charset=utf-8 in a web page means that the page uses the Unicode character set and is encoded in UTF-8.
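A short sketch of one Unicode character under three encodings, using Python's standard codecs. The big-endian variants are picked here only so that no byte order mark is added to the output.

```python
ch = "中"  # a single Unicode character, code point U+4E2D

encodings = ["utf-8", "utf-16-be", "utf-32-be"]
encoded = {name: ch.encode(name) for name in encodings}

for name, data in encoded.items():
    print(f"{name:10} {len(data)} bytes  {data.hex()}")

# Every encoding decodes back to the same character.
round_trip = {name: data.decode(name) for name, data in encoded.items()}
print(round_trip)
```

The character set (Unicode) stays the same in all three cases; only the byte representation changes.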
1.3 MySQL's character set
MySQL currently supports many character sets and can convert between them (which helps portability and multilingual support).
MySQL lets you set a character set at the server level, the database level, the table level, and the column level. The character set that is actually used is the one on the column where the characters are stored. For example, if column col1 of table table1 is a character type, col1 uses a character set; if column col2 of table1 is of type int, the concept of a character set does not apply to col2.
The server-level, database-level, and table-level character sets are only defaults for the column's character set.
MySQL must have a server character set, which can be specified through start-up parameters, at compile time, or in a configuration file. The server character set serves only as the default for the database level. When you create a database you can specify a character set; if you do not, the server's character set is used. Similarly, when you create a table you can specify a table-level character set; if you do not, the database's character set is used as the table's character set. When you create a column you can specify the column's character set; if you do not, the table's character set is used.
Typically you only need to set the server-level character set; the database-level, table-level, and column-level character sets are then all inherited from it.
Since UTF8 covers the widest range of characters, we generally set the MySQL server-level character set to UTF8.
Inheritance of character sets in MySQL:
Server-level character set (set in the configuration file my.ini: [mysqld] default-character-set=utf8; MySQL 5.5 and later use character-set-server=utf8 instead)
|
|
|
Database-level character set (if a character set is specified when the database is created, that character set is used; if not, the server-level character set is used)
|
|
|
Table-level character set (if a character set is specified when the table is created, that character set is used; if not, the database-level character set is used)
|
|
|
Column-level character set (if a character set is specified for the column when the table is created, that character set is used; if not, the table-level character set is used)
As this shows, the character set that is finally used is the one on the column where the text is stored.
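The fallback chain above can be modelled in a few lines. This is a minimal sketch: effective_charset is a hypothetical helper written for this illustration, not a MySQL API, and MySQL itself settles each default once, when the database, table, or column is created.

```python
from typing import Optional


def effective_charset(
    server: str,
    database: Optional[str] = None,
    table: Optional[str] = None,
    column: Optional[str] = None,
) -> str:
    """Return the character set a text column ends up with.

    Each level falls back to the level above it when nothing is specified.
    """
    database_cs = database or server
    table_cs = table or database_cs
    return column or table_cs


print(effective_charset("utf8"))
print(effective_charset("latin1", database="utf8"))
print(effective_charset("latin1", database="utf8", column="gbk"))
```

A value given at a lower level always wins, and a missing value is filled in from the nearest level above.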
2 Character set issues for plain text
Wherever text is stored there is a character set issue, and ordinary text files are no exception.
On Windows 2000 and later, open Notepad; the "Save As..." dialog box has an option that lets you choose the encoding used to store the text.
Most people on Windows 2000 and later always use the default encoding, so they do not run into character set problems.
On Windows you can choose the encoding when you save a text file, but when you open a text file the encoding is detected automatically. There is a well-known joke online about Notepad on Windows 2000 and later and the words "mobile" and "unicom" (you can search for it); it is caused by Windows guessing the wrong encoding when it opens a text file.
Because automatic detection can sometimes be wrong, some kinds of text file state explicitly which encoding they use. The HTML file is one such example.
HTML is a text file. An HTML file has to be stored in some encoding, and HTML syntax can also declare, inside the file, which encoding the file uses (for example, <meta http-equiv="Content-Type" content="text/html; charset=utf-8">). If the HTML file does not declare an encoding, the browser detects the encoding of the file automatically. If the HTML declares an encoding, the browser uses the declared encoding.
Usually the charset declared in an HTML file matches the encoding the file is actually saved in, but they can differ, and if they do the page is garbled (this garbling involves only the text file and has nothing to do with the database). Dedicated web page editors (such as Dreamweaver) automatically save the file in the encoding given by the charset value in the page.
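How automatic detection can go wrong can be sketched with a deliberately naive detector. Everything here is an assumption made for illustration: looks_like_utf8 is a made-up heuristic, not Notepad's actual algorithm, and the sample word 联通 (Unicom) is the one the Notepad joke is usually told with.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Naive check: do the bytes follow UTF-8 lead/continuation bit patterns?

    Only ASCII and two-byte sequences are handled, and overlong forms are
    not rejected, which is the kind of shortcut that can misfire.
    """
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                        # plain ASCII byte
            i += 1
        elif (b & 0xE0) == 0xC0:            # 110xxxxx: two-byte lead
            if i + 1 >= len(data) or (data[i + 1] & 0xC0) != 0x80:
                return False
            i += 2
        else:
            return False
    return True


gbk_bytes = "联通".encode("gbk")   # bytes as saved in the local GBK encoding
print(gbk_bytes.hex())
print(looks_like_utf8(gbk_bytes))            # the heuristic is fooled
print(looks_like_utf8("中文".encode("gbk")))  # most GBK text is not
print(gbk_bytes.decode("gbk"))               # the correct reading
```

The GBK bytes of this particular word happen to match the bit layout of two-byte UTF-8 sequences, so a detector that only inspects bit patterns picks the wrong encoding.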
Example: test.html
The contents are as follows:
Whatever encoding the above file is saved in, a browser will usually open it without garbled characters, because the browser can detect the encoding automatically.
Now change the contents of the above file to:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
If test.html is now saved in an encoding other than UTF-8, its non-ASCII text will be garbled in the browser, because the browser parses the received data as UTF-8.
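The mismatch between the declared and the actual encoding can be reproduced without a browser. In this sketch the page body (中文) and the choice of GBK as the "wrong" saving encoding are assumptions for illustration, and errors="replace" only stands in for how a browser displays invalid byte sequences.

```python
page = (
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    "<p>中文</p>"
)

# The file is saved on disk as GBK even though it declares UTF-8.
saved_bytes = page.encode("gbk")

# A browser that trusts the declaration decodes those bytes as UTF-8.
shown_wrong = saved_bytes.decode("utf-8", errors="replace")

# Saving in the declared encoding makes the round trip clean.
shown_right = page.encode("utf-8").decode("utf-8")

print("中文" in shown_wrong)
print("中文" in shown_right)
```

The ASCII markup survives either way; only the characters outside ASCII are damaged, which is why such pages often look half correct.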

3 PHP+MySQL character set issues
PHP ultimately generates a text file, but it takes text from the database, or saves text into the database.
Because MySQL supports multiple character sets, MySQL does not know by default what encoding the characters PHP sends it are in, so MySQL requires the client (PHP) to tell it which character sets are being used.
By setting character_set_client, PHP tells MySQL what encoding the data PHP sends for storage is in.
By setting character_set_results, PHP tells MySQL what encoding PHP needs the retrieved data to be in.
By setting character_set_connection, PHP tells MySQL what encoding to use for the text in a query.
MySQL stores text using the encoding it has been configured with.
Suppose MySQL stores text in setserver, PHP's character_set_client is setclient, and PHP's character_set_results is setresult. Then MySQL converts the text PHP sends from the setclient encoding to the setserver encoding and stores it in the database; when PHP retrieves text, MySQL converts it from setserver to setresult and then sends it to PHP.
A PHP file (the resulting HTML) has an encoding of its own. If the encoding of the data MySQL passes back differs from the encoding of the PHP file itself, the whole page is bound to be garbled. So PHP generally tells MySQL its own encoding.
To avoid garbled text, three things must agree: first, the encoding of the web page file itself; second, the encoding declared in the HTML; third, the encoding that PHP tells MySQL (including character_set_client and character_set_results).
The first and second are usually consistent if an editor such as Dreamweaver is used, but pages written in Notepad may be inconsistent.
The third has to be communicated to MySQL manually. This can be done in PHP with mysql_query("SET NAMES characterx"), where characterx is the chosen character set (on PHP 7 and later, where the mysql_* functions no longer exist, mysqli_set_charset serves the same purpose).
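The two conversions described above can be simulated with Python codecs. This is a sketch under stated assumptions: store and fetch are hypothetical helpers, not MySQL functions, and Python's latin1 codec differs slightly from MySQL's latin1, so only the mechanism carries over.

```python
def store(text_bytes: bytes, setclient: str, setserver: str) -> bytes:
    """On write: reinterpret setclient bytes and save them as setserver."""
    return text_bytes.decode(setclient).encode(setserver)


def fetch(stored: bytes, setserver: str, setresult: str) -> bytes:
    """On read: convert setserver bytes into setresult bytes."""
    return stored.decode(setserver).encode(setresult)


php_bytes = "中文".encode("utf-8")   # the bytes PHP actually sends

# Correct declaration: the client says utf-8 and really sends utf-8.
ok = fetch(store(php_bytes, "utf-8", "gbk"), "gbk", "utf-8")

# Wrong declaration: the client claims latin1 while sending utf-8 bytes.
stored_wrong = store(php_bytes, "latin1", "utf-8")
# A second client that declares utf-8 correctly now reads garbled text.
seen = fetch(stored_wrong, "utf-8", "utf-8").decode("utf-8")

print(ok.decode("utf-8"))
print(seen == "中文")
print(len(php_bytes), len(stored_wrong))
```

When the declared client encoding does not match the bytes really sent, the server converts faithfully from the wrong starting point, so the stored data is already damaged and a differently configured front end sees garbled text.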
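SET NAMES is shorthand for assigning the three session variables discussed in this section. The sketch below only builds the equivalent statements as strings; set_names_equivalent is a made-up helper for illustration and does not talk to a server.

```python
from typing import List


def set_names_equivalent(charset: str) -> List[str]:
    """Spell out the three session variables that SET NAMES assigns."""
    variables = (
        "character_set_client",
        "character_set_results",
        "character_set_connection",
    )
    return [f"SET {name} = {charset};" for name in variables]


for statement in set_names_equivalent("utf8"):
    print(statement)
```

Issuing the single SET NAMES statement right after connecting is usually simpler than setting the three variables one by one.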
