PHP and MySQL encoding issues

Source: Internet
Author: User
Tags php and mysql

http://blog.csdn.net/martinkro/article/details/5352474

1 Character set concepts in MySQL
MySQL has two related concepts here: the "character set" and the "collation".
1.1 Collations
A collation is a set of rules for comparing the characters of a character set. In web development the term is met almost only in MySQL, where its main role is to tell MySQL how to compare characters. For the ASCII character set, for example, a collation specifies that a is less than b, and whether a is equal to A. In general you can ignore collations, because every character set has a default collation, and the default is usually the one you want.
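As a rough illustration of what a collation decides, the sketch below uses Python string comparison in place of MySQL. It is only an analogy: the plain comparison stands in for a binary collation and the casefold-based comparison for a case-insensitive one; none of this is MySQL's actual implementation.

```python
# Illustrative only: Python comparisons stand in for two kinds of collation.
words = ["b", "A", "a", "B"]

# "Binary" comparison: characters compare by code point, so uppercase sorts first.
binary_order = sorted(words)

# Case-insensitive comparison, roughly what a *_ci collation does for ASCII letters.
ci_order = sorted(words, key=str.casefold)

# Equality also depends on the rule in use.
binary_equal = "a" == "A"
ci_equal = "a".casefold() == "A".casefold()

print(binary_order)
print(ci_order)
print(binary_equal, ci_equal)
```

The same four strings sort differently and compare differently depending on the rule, which is the decision a collation makes for a MySQL column.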
1.2 Character Set
By contrast, the character set is a broader concept: even an ordinary text file on Windows involves one. Different character sets specify different ways of encoding characters. A character set is a set of symbols together with their encodings. The ASCII character set, for example, includes digits, uppercase and lowercase letters, the semicolon, the line break, and so on, and encodes each character in 7 bits (A is encoded as 65, b as 98).
ASCII covers only English letters and symbols, so text in other languages cannot be represented in it. Different countries therefore defined encodings for their own languages; China, for example, has GB2312. Because these national encodings differ from one another, they cause problems when text moves between platforms, so international standards bodies defined encodings for common use, of which UTF8 is the most widely used.
ASCII encodes English symbols and letters; GB2312 encodes English symbols, English letters, and Chinese characters; UTF8 encodes the characters of the world's languages. The GB2312 repertoire therefore contains the ASCII characters, and the UTF8 repertoire contains the GB2312 characters. UTF8 covers the widest range of characters, which is why multilingual web systems generally use it (phpMyAdmin, for example, uses UTF8).
Storing any text involves a character set, whether in a database or in an ordinary text file.
The concepts of encoding and character set are easily confused, because an encoding and its character set usually share a name: GB2312, for example, is both the name of a character set and the name of an encoding.
Character: a Chinese character, an English letter, a punctuation mark, a Latin letter, and so on.
Encoding: the conversion of a character into the format the computer stores; for example, A is represented by 65.
Character set: a set of characters together with the way they are encoded.

As this shows, character set and encoding are two different concepts. One character set can have several encodings; the Unicode character set, for example, can be encoded as UTF-8, UTF-16, UTF-32, and so on. In a web page, charset=utf-8 means that the page uses the Unicode character set in the UTF-8 encoding.
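The difference between a character and its encodings can be seen in a short sketch. It uses Python's built-in codecs and an arbitrarily chosen sample character; neither appears in the original article.

```python
# One character, one Unicode code point, several byte encodings.
ch = "中"

code_point = ord(ch)             # position in the Unicode character set: 0x4E2D
utf8 = ch.encode("utf-8")        # 3 bytes: e4 b8 ad
utf16 = ch.encode("utf-16-be")   # 2 bytes: 4e 2d
utf32 = ch.encode("utf-32-be")   # 4 bytes: 00 00 4e 2d
gb2312 = ch.encode("gb2312")     # 2 bytes in a different character set: d6 d0

print(hex(code_point))
for name, data in [("utf-8", utf8), ("utf-16-be", utf16),
                   ("utf-32-be", utf32), ("gb2312", gb2312)]:
    print(name, data.hex(), len(data), "bytes")

# ASCII assigns A the value 65 and b the value 98.
print(ord("A"), ord("b"))
```

The three UTF encodings all represent the same Unicode code point, while GB2312 assigns the character an unrelated byte pair.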
1.3 MySQL's character set
MySQL supports many character sets and can convert between them, which helps portability and multilingual support.
MySQL lets you set a character set at the server level, the database level, the table level, and the column level. The character set that actually takes effect is the one on the column that stores the text. For example, if column col1 of table Table1 has a character type, col1 uses a character set; if column col2 of Table1 is of type int, the concept of a character set does not apply to col2.
The server-level, database-level, and table-level character sets serve only as defaults for the column's character set.
MySQL always has a server character set, which can be specified in the start-up parameters, at compile time, or in a configuration file. The server character set serves only as the default for the database level. When you create a database you can specify a character set; if you do not, the server's character set is used. Likewise, when you create a table you can specify a table-level character set; if you do not, the database's character set becomes the table's character set. When you create a column you can specify its character set; if you do not, the table's character set is used.
Typically you need to set only the server-level character set; the database-level, table-level, and column-level character sets are then all inherited from it.
Since UTF8 covers the widest range of characters, we generally set the MySQL server-level character set to UTF8. (In MySQL the name utf8 is an alias for utf8mb3, which stores at most three bytes per character; utf8mb4 covers all of Unicode.)
Inheritance of character sets in MySQL
Server-level character set (set in the configuration file, for example in my.ini: [mysqld] character-set-server=utf8; very old versions used default-character-set=utf8 here)
|
|
|
Database-level character set (if a character set is specified when the database is created, it is used; if not, the server-level character set is used)
|
|
|
Table-level character set (if a character set is specified when the table is created, it is used; if not, the database-level character set is used)
|
|
|
Column-level character set (if a character set is specified for the column when the table is created, it is used; if not, the table-level character set is used)
As this shows, the character set that finally takes effect is the one on the column where the text is stored.
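The fallback chain above can be sketched as a small function. The function name and its parameters are hypothetical and exist only for illustration; MySQL itself applies these defaults when a database, table, or column is created.

```python
from typing import Optional


def effective_charset(server: str,
                      database: Optional[str] = None,
                      table: Optional[str] = None,
                      column: Optional[str] = None) -> str:
    """Return the character set a column ends up with.

    None means "not specified at this level", in which case the
    next level up supplies the default.
    """
    for level in (column, table, database):
        if level is not None:
            return level
    return server


# Only the server level is set: every column inherits it.
print(effective_charset("utf8"))
# The table overrides the database and the server.
print(effective_charset("latin1", database="utf8", table="gbk"))
```

A value given at a lower level always wins over one given at a higher level, so a single gbk column can sit inside a utf8 server.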
2 Character set issues for plain text
Storing any text involves a character set, and ordinary text files are no exception.
On Windows 2000 and later, open Notepad and choose "Save As ..."; the dialog box has an option for choosing the encoding used to store the text.
Most people on Windows 2000 and later always use the default encoding and therefore never run into character set problems.
On Windows you choose the encoding when you save a text file, but when you open one the encoding is detected automatically. A well-known joke about Notepad on Windows 2000 and later, involving China Mobile and China Unicom (you can search for it), comes from Windows guessing the encoding wrongly when opening a text file.
Because automatic detection can be wrong, some text file formats state the encoding they use. HTML is one example.
An HTML file is a text file. It has to be stored in some encoding, and HTML syntax also lets the file declare the encoding it uses (with a meta tag, for example). If the HTML file does not declare an encoding, the browser detects it automatically. If it does, the browser uses the declared encoding.
Usually the charset declared in the HTML file matches the encoding the file is actually stored in, but not always; when they differ, the page is garbled (this garbling involves only the text file and has nothing to do with the database). Dedicated web page editors such as Dreamweaver store the file automatically in the encoding given by the page's charset value.
Example: test.html
The file is an HTML page that does not declare a charset.
Whatever encoding this file is saved in, the browser normally displays it without garbled characters, because the browser detects the encoding automatically.
Now add the following declaration to the file:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
If test.html is now saved in any encoding other than UTF-8, non-ASCII text in it shows up garbled in the browser, because the browser decodes the received data as UTF-8.
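The mismatch described above can be reproduced without a browser. The sketch below assumes a page body saved as GBK and then decoded as UTF-8, as a browser would do when the page declares charset=utf-8; the sample text is arbitrary.

```python
# Text as the author typed it.
text = "中文"

# The file is saved in GBK ...
saved_bytes = text.encode("gbk")

# ... but the page declares charset=utf-8, so the reader decodes it as UTF-8.
# errors="replace" mimics a browser showing replacement characters instead of failing.
wrong = saved_bytes.decode("utf-8", errors="replace")

# Decoding with the encoding the file was really saved in recovers the text.
right = saved_bytes.decode("gbk")

print(repr(wrong))
print(repr(right))
```

The bytes on disk are identical in both cases; only the encoding assumed by the reader differs.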

3 PHP+MySQL character set issues
PHP ultimately generates a text file (the HTML page), but it fetches some of that text from the database and saves other text to the database.
Because MySQL supports many character sets, it cannot know by default which encoding PHP uses for the characters it sends, so MySQL requires the client (PHP) to state which character sets it is using.
By setting character_set_client, PHP tells MySQL which encoding the text it sends is in.
By setting character_set_results, PHP tells MySQL which encoding it wants returned data to be in.
By setting character_set_connection, PHP tells MySQL which encoding to convert the query text into before processing it.
MySQL stores text in the encoding configured for the column.
Suppose MySQL stores text in an encoding called setserver, PHP's character_set_client is setclient, and PHP's character_set_results is setresult. When PHP sends text, MySQL converts it from setclient to setserver and stores it; when PHP fetches text, MySQL converts it from setserver to setresult and sends it to PHP.
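The round trip just described can be imitated with plain byte conversions. This is a simplified sketch, not MySQL's implementation: the store and fetch functions are hypothetical names, and Python codec names stand in for MySQL character set names.

```python
def store(client_bytes: bytes, set_client: str, set_server: str) -> bytes:
    """Convert bytes the client sent (declared as set_client) to the storage encoding."""
    return client_bytes.decode(set_client).encode(set_server)


def fetch(stored_bytes: bytes, set_server: str, set_result: str) -> bytes:
    """Convert stored bytes to the encoding the client asked for."""
    return stored_bytes.decode(set_server).encode(set_result)


text = "中文"

# Correct declaration: the client sends GBK and says so; storage is UTF-8.
stored = store(text.encode("gbk"), "gbk", "utf-8")
returned = fetch(stored, "utf-8", "gbk")
print(returned.decode("gbk"))

# Wrong declaration: the client sends UTF-8 bytes but declares latin-1.
# The conversion treats every byte as its own character, so the stored
# value is no longer the UTF-8 encoding of the original text.
bad_stored = store(text.encode("utf-8"), "latin-1", "utf-8")
print(repr(bad_stored.decode("utf-8")))
```

The second case shows why the declared client encoding has to match the bytes PHP really sends: the conversion itself succeeds, and the damage only becomes visible when the text is read back.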
The PHP file (and the HTML it generates) has an encoding of its own. If the encoding of the text MySQL returns differs from that of the PHP file, the web page is bound to be garbled. PHP therefore normally tells MySQL its own encoding.
To avoid garbled text, three encodings must agree: first, the encoding the web page file is stored in; second, the encoding declared in the HTML; third, the encoding PHP tells MySQL to use (character_set_client and character_set_results).
The first and second usually agree if you use an editor such as Dreamweaver (DW), but may not for pages written in Notepad.
The third must be communicated to MySQL explicitly, which can be done in PHP with mysql_query("SET NAMES charset_name"), where charset_name is the encoding in question. (The mysql_* functions were removed in PHP 7; with the mysqli extension, mysqli_set_charset() serves the same purpose.)

//---------------------------------------------------------------------------------------------------------

Chinese text in MySQL can come out garbled for the following reasons:
1. The server's own settings, for example when it is still left at latin1
2. The table's language settings (both character set and collation)
3. The connection language settings of the client program (PHP, for example)
Using UTF8 is strongly recommended.
UTF8 covers the characters of all the world's languages.
First, avoid garbled text when creating databases and tables, and check which encoding is in use
1. CREATE DATABASE `test`
CHARACTER SET 'utf8'
COLLATE 'utf8_general_ci';
2. CREATE TABLE `database_user` (
`ID` varchar(40) NOT NULL default '', -- the length is garbled in the source; 40 is an assumption
`UserID` varchar(40) NOT NULL default ''
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

With these three settings (the database character set, the database collation, and the table character set) there should be no problem: the same encoding is used when creating the database and when creating the table.
If the database and tables already exist, you can check their encodings as follows.
1. View the default encodings:
mysql> SHOW VARIABLES LIKE '%char%';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | gbk    |
| character_set_connection | gbk    |
| character_set_database   | utf8   |
| character_set_filesystem | binary |
| character_set_results    | gbk    |
| character_set_server     | utf8   |
| character_set_system     | utf8   |
+--------------------------+--------+
Note: the first two variables are the ones that matter here; you can set them with SET NAMES utf8 or SET NAMES gbk.

Executing SET NAMES utf8 has the same effect as the following three statements:
SET character_set_client = 'utf8';
SET character_set_connection = 'utf8';
SET character_set_results = 'utf8';
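The expansion can be written as a small helper that produces the three statements for any character set name. The function name is hypothetical, and the sketch only builds strings; it does not talk to a MySQL server.

```python
from typing import List

# The three session variables that SET NAMES assigns, as listed above.
SESSION_VARIABLES = (
    "character_set_client",
    "character_set_connection",
    "character_set_results",
)


def set_names_equivalent(charset: str) -> List[str]:
    """Return the SET statements that SET NAMES <charset> stands for."""
    return [f"SET {name} = '{charset}';" for name in SESSION_VARIABLES]


for statement in set_names_equivalent("gbk"):
    print(statement)
```

Calling it with "gbk" yields the statements that SET NAMES gbk stands for, which makes it easy to see that a single SET NAMES keeps all three variables in step.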

2. View the encoding of the test database:
mysql> SHOW CREATE DATABASE test;
+----------+--------------------------------------------------------------+
| Database | Create Database                                              |
+----------+--------------------------------------------------------------+
| test     | CREATE DATABASE `test` /*!40100 DEFAULT CHARACTER SET gbk */ |
+----------+--------------------------------------------------------------+

3. View the encoding of the yjdb table:
mysql> SHOW CREATE TABLE yjdb;
| yjdb | CREATE TABLE `yjdb` (
`sn` int(5) NOT NULL auto_increment,
`type` varchar(10) NOT NULL,
`BRC` varchar(6) NOT NULL,
`Teller` int(6) NOT NULL,
`telname` varchar(10) NOT NULL,
`date` int(10) NOT NULL,
`count` int(6) NOT NULL,
`Back` int(10) NOT NULL,
PRIMARY KEY (`sn`),
UNIQUE KEY `sn` (`sn`),
UNIQUE KEY `sn_2` (`sn`)
) ENGINE=MyISAM AUTO_INCREMENT=1826 DEFAULT CHARSET=gbk ROW_FORMAT=DYNAMIC |

Second, avoid garbled Chinese text when importing data
1. If the data file is saved as UTF-8
Set the connection encoding to utf8:
SET NAMES utf8;
Set the default encoding of database db_name to utf8:
ALTER DATABASE `db_name` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
Set the default encoding of table tb_name to utf8:
ALTER TABLE `tb_name` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
Import:
LOAD DATA LOCAL INFILE 'c:\\utf8.txt' INTO TABLE yjdb;
2. If the data file is saved as ANSI (that is, GBK or GB2312)
Set the connection encoding to gbk:
SET NAMES gbk;
Set the default encoding of database db_name to gbk:
ALTER DATABASE `db_name` DEFAULT CHARACTER SET gbk COLLATE gbk_chinese_ci;
Set the default encoding of table tb_name to gbk:
ALTER TABLE `tb_name` DEFAULT CHARACTER SET gbk COLLATE gbk_chinese_ci;
Import:
LOAD DATA LOCAL INFILE 'c:\\gbk.txt' INTO TABLE yjdb;

Note: 1. Do not import GBK data while the encoding is set to UTF8, and do not import UTF8 data while it is set to GBK;
2. The DOS console (Windows command prompt) cannot display UTF8 text;
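Before running LOAD DATA it can help to check which encoding the data file is really in. The sketch below is one possible check, with a hypothetical helper name and in-memory sample data instead of a real file; a successful decode is only a hint, since some byte sequences are valid in more than one encoding.

```python
def can_decode(data: bytes, encoding: str) -> bool:
    """Return True if data is a valid byte sequence in the given encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# Stand-ins for the contents of utf8.txt and gbk.txt.
sample_utf8 = "中文数据".encode("utf-8")
sample_gbk = "中文数据".encode("gbk")

print(can_decode(sample_utf8, "utf-8"))  # the file matches SET NAMES utf8
print(can_decode(sample_gbk, "utf-8"))   # GBK bytes are not valid UTF-8 here
print(can_decode(sample_gbk, "gbk"))     # the file matches SET NAMES gbk
```

For a real file, the bytes would come from open(path, "rb").read(); if the check fails for the encoding you are about to declare with SET NAMES, the import is likely to produce garbled rows.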
Third, fix garbled web pages

Set the site encoding to UTF-8, which covers the characters of all the world's languages.
If the site has been running for a long time and holds a lot of old data, so that its Simplified Chinese setting cannot be changed, set the page encoding to GBK instead. The difference between GBK and GB2312 is that GBK can represent more characters than GB2312; to display traditional characters in a Simplified Chinese encoding, you must use GBK.
1. Edit /etc/my.cnf and add default-character-set=utf8 in the [mysql] section;
2. When writing the connection URL, add the parameters useUnicode=true&characterEncoding=utf-8;
3. In the web page code, run "SET NAMES utf8" or "SET NAMES gbk" to tell MySQL that the content sent over the connection is in UTF8 or GBK;

//-----------------------------------------------------------------------------------

Change the character set to utf8 in my.ini.

After PHP connects to the database, run mysql_query('SET NAMES utf8');
