Reprinted from Sina Blog
Sean
One
The difference between utf8_bin, utf8_general_cs, and utf8_general_ci for the utf8 character set in MySQL
utf8_general_ci is case-insensitive. Use it for fields such as the username and email address collected at registration.
utf8_general_cs is case-sensitive. Using it for usernames and email addresses has bad consequences, because "Bob" and "bob" would count as two different values. (Stock MySQL builds do not ship a utf8_general_cs collation; run SHOW COLLATION to see what your server offers.)
utf8_bin compares strings by the binary value of each character and stores each string as binary data. It is case-sensitive and can hold binary content.
To illustrate:
Suppose your SQL query contains: WHERE first_name = 'Bob'
The following field contents are returned as a match under these collations:
'Bob': utf8_bin, utf8_general_ci, and utf8_general_cs
'B?b' (where ? is an accented o): utf8_general_ci and utf8_general_cs, which treat the accented letter as o
'b?b' (accented o, lowercase b): utf8_general_ci only, because it is case-insensitive
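The matching rules above can be tried on a scratch table. This is a minimal sketch, not part of the original post: the table name people_demo, its column names, and the sample value 'Bób' are made up for illustration, and it assumes the connection uses UTF-8 and that the server still accepts the utf8 collation names (newer MySQL versions report them as utf8mb3). utf8_general_cs is left out because stock MySQL does not provide it.

```sql
-- Hypothetical demo table: the same names stored under two collations.
CREATE TABLE people_demo (
  name_ci  VARCHAR(50) CHARACTER SET utf8 COLLATE utf8_general_ci,
  name_bin VARCHAR(50) CHARACTER SET utf8 COLLATE utf8_bin
);

INSERT INTO people_demo (name_ci, name_bin) VALUES
  ('Bob', 'Bob'),
  ('bob', 'bob'),
  ('Bób', 'Bób');

-- Case-insensitive and accent-insensitive: all three rows match.
SELECT name_ci FROM people_demo WHERE name_ci = 'Bob';

-- Byte-wise comparison: only the exact value 'Bob' matches.
SELECT name_bin FROM people_demo WHERE name_bin = 'Bob';
```

A literal compared against a column takes on the column's collation, which is why the same WHERE clause behaves differently on the two columns.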
Two
The difference between utf8_unicode_ci and utf8_general_ci in a MySQL database
When you start out with PHP and create a new MySQL database, you will probably set the character set to utf8 (UTF-8 Unicode) and then worry about whether the collation should be utf8_general_ci or utf8_unicode_ci. I was puzzled by this too when I was a PHP beginner, so let's take a look at the following article.
Currently, the utf8_unicode_ci collation only partially supports the Unicode Collation Algorithm. Some characters are still not supported, and combining marks are not fully supported either. This mainly affects Vietnamese and some minority languages in Russia, such as Udmurt, Tatar, Bashkir, and Mari.
The most significant feature of utf8_unicode_ci is that it supports expansions, that is, cases where one character compares as equal to a combination of other characters. For example, in German and some other languages 'ß' is equal to 'ss'.
utf8_general_ci is a legacy collation that does not support expansions. It can only make one-to-one comparisons between characters. This means that comparisons under utf8_general_ci are faster, but slightly less accurate, than comparisons under utf8_unicode_ci.
For example, the following equalities hold under both utf8_general_ci and utf8_unicode_ci:
ä = A
ö = O
ü = U
The difference between the two collations is that under utf8_general_ci the following equality holds:
ß = s
whereas under utf8_unicode_ci the following equality holds:
ß = ss
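These equalities can be checked directly with the COLLATE clause. The query below is a sketch under assumptions: it presumes the client sends UTF-8 bytes, uses the _utf8 introducer so the literals belong to the utf8 character set regardless of the connection default, and uses column aliases invented for readability.

```sql
-- Each expression evaluates to 1 (equal) or 0 (not equal).
SELECT
  _utf8'ä' COLLATE utf8_general_ci = _utf8'A'  AS general_a_umlaut,
  _utf8'ä' COLLATE utf8_unicode_ci = _utf8'A'  AS unicode_a_umlaut,
  _utf8'ß' COLLATE utf8_general_ci = _utf8's'  AS general_eszett_s,
  _utf8'ß' COLLATE utf8_general_ci = _utf8'ss' AS general_eszett_ss,
  _utf8'ß' COLLATE utf8_unicode_ci = _utf8's'  AS unicode_eszett_s,
  _utf8'ß' COLLATE utf8_unicode_ci = _utf8'ss' AS unicode_eszett_ss;
```

If the equalities listed above hold on your server, the row should read 1, 1, 1, 0, 0, 1. The explicit COLLATE on the left operand takes precedence over the default collation of the right operand, so each comparison runs under the collation named in it.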
For the utf8 character set, a language-specific collation is implemented only when sorting with utf8_unicode_ci does not work well for that language. For example, utf8_unicode_ci works very well for German and French, so there is no need to create special utf8 collations for these two languages.
utf8_general_ci also works for German and French, except that 'ß' is equal to 's' and not to 'ss'. If your application can accept this, use utf8_general_ci because it is faster. Otherwise, use utf8_unicode_ci because it is more accurate.
To sum up the paragraphs above in one sentence: utf8_unicode_ci is more accurate, while utf8_general_ci is faster. The accuracy of utf8_general_ci is usually good enough. After reading a lot of program source code, I found that most of it uses utf8_general_ci as well, so utf8_general_ci is what is generally chosen for a new database.
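As a rough illustration of that choice, the statements below create a database with utf8_general_ci as its default and show how a single column or a single query can still use a different collation. The names demo_app, customers, and last_name are hypothetical, and the sketch assumes a server that accepts the utf8 collation names.

```sql
-- List the utf8-family collations this server actually offers.
SHOW COLLATION LIKE 'utf8%';

-- Hypothetical database that defaults to the faster collation.
CREATE DATABASE demo_app CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Hypothetical table: one column opts into the more accurate collation.
CREATE TABLE demo_app.customers (
  id INT AUTO_INCREMENT PRIMARY KEY,
  last_name VARCHAR(100) COLLATE utf8_unicode_ci
);

-- Override the column collation for a single query.
SELECT last_name
FROM demo_app.customers
ORDER BY last_name COLLATE utf8_general_ci;
```

Setting the default at the database level and overriding it only where accuracy matters keeps the common case fast without giving up utf8_unicode_ci where it is needed.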
The encoding problem when creating a MySQL database in phpMyAdmin