What is the difference between MySQL utf8_general_ci and utf8_unicode_ci, and how should I choose?

Source: Internet
Author: User

The usual explanation is that utf8_general_ci is faster and utf8_unicode_ci is more accurate. But where exactly do the speed and the accuracy differ?

Start with accuracy. The world has a great many writing systems. Besides the familiar A-Z letters used in English, languages such as French, German, and Russian have letters of their own, some of which look similar to the English ones.

The letter A alone has dozens of variants.
Why not use a single A everywhere and avoid the trouble? Because each variant has a meaning in the language it belongs to: it may represent a different sound or something else. In some languages, two words written with the same base letters but pronounced differently have different meanings.

Collation rules

The purpose of utf8_unicode_ci and utf8_general_ci is to map characters that look different onto common comparison values, so that comparing and sorting text is easier and more accurate.

The pairs in the following example are not identical as written, but both utf8_unicode_ci and utf8_general_ci treat them as equal when comparing:
ä = A
ö = O
ü = U
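
These equalities can be checked directly with a query. The following is a minimal sketch, assuming a MySQL server that still accepts the utf8 character set and the utf8_* collation names, and a client that sends UTF-8. Both are assumptions about the environment, and the column aliases are made up for illustration.

```sql
-- Assumption: the client sends UTF-8 and the server accepts the utf8 character set.
SET NAMES utf8;

-- Each comparison should return 1 (true): both collations ignore
-- case and the umlaut when comparing these letters.
SELECT 'ä' = 'A' COLLATE utf8_general_ci AS general_a,
       'ä' = 'A' COLLATE utf8_unicode_ci AS unicode_a,
       'ö' = 'O' COLLATE utf8_general_ci AS general_o,
       'ü' = 'U' COLLATE utf8_unicode_ci AS unicode_u;
```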

utf8_unicode_ci is more accurate because it has a more complete set of character mappings. It can even treat one special character as equal to more than one English letter, as with the German ß:

Under utf8_unicode_ci, this equality holds:
ß = ss

Under utf8_general_ci, only this equality holds:
ß = s
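
The ß case above can be reproduced with a single query. This is a sketch under the same assumptions as any ad hoc test: a UTF-8 client connection and a server that accepts the utf8_* collation names. The column aliases are illustrative only.

```sql
-- Assumption: the client sends UTF-8 and the server accepts the utf8 character set.
SET NAMES utf8;

SELECT 'ß' = 'ss' COLLATE utf8_unicode_ci AS unicode_ss,  -- 1: ß expands to ss
       'ß' = 'ss' COLLATE utf8_general_ci AS general_ss,  -- 0: no expansion, ß compares as one s
       'ß' = 's'  COLLATE utf8_general_ci AS general_s;   -- 1: one-to-one mapping
```

The middle column shows the practical consequence: a search for 'Strasse' would not match 'Straße' under utf8_general_ci.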

Sorting rules

Because utf8_unicode_ci has more complete character mappings and conversion rules, it also sorts more accurately than utf8_general_ci.
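
One way to see the sorting difference is to order the same small word list under each collation and compare the results. The sketch below uses a made-up word list in a derived table. It assumes a UTF-8 client connection, and the resulting order is left for the reader to observe rather than stated here.

```sql
-- Assumption: the client sends UTF-8 and the server accepts the utf8 character set.
SET NAMES utf8;

-- Sort the sample words with utf8_general_ci ...
SELECT word
FROM (
  SELECT 'Strasse' AS word
  UNION ALL SELECT 'Straße'
  UNION ALL SELECT 'Strast'
  UNION ALL SELECT 'Stras'
) AS words
ORDER BY word COLLATE utf8_general_ci;

-- ... and again with utf8_unicode_ci, then compare the two result sets.
SELECT word
FROM (
  SELECT 'Strasse' AS word
  UNION ALL SELECT 'Straße'
  UNION ALL SELECT 'Strast'
  UNION ALL SELECT 'Stras'
) AS words
ORDER BY word COLLATE utf8_unicode_ci;
```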

[Figure: part of the utf8_general_ci character list]

utf8_unicode_ci has a more complete character table.

Efficiency

Because the character mappings and conversion rules of utf8_unicode_ci are more complex, it is also slower than utf8_general_ci.
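
The size of the speed difference depends on the server version, the hardware, and the data, so it is worth measuring rather than assuming. A rough sketch using MySQL's BENCHMARK() function follows. The iteration count and the sample strings are arbitrary choices, and the test times only the comparison itself, not index lookups or real queries.

```sql
-- Assumption: the client sends UTF-8 and the server accepts the utf8 character set.
SET NAMES utf8;

-- BENCHMARK(count, expr) evaluates expr count times and always returns 0.
-- The figure to compare is the elapsed time the client reports for each statement.
SELECT BENCHMARK(10000000, 'Straße' < 'Strasse' COLLATE utf8_general_ci);
SELECT BENCHMARK(10000000, 'Straße' < 'Strasse' COLLATE utf8_unicode_ci);
```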

Summary

If your application handles German, Russian, or similar languages, or needs to process international content accurately, use utf8_unicode_ci.
Otherwise, utf8_general_ci is sufficient.
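
Whichever collation you choose, it can be set as a table default and overridden per column. The following sketch uses a hypothetical customers table; the table and column names are not from the original article.

```sql
-- Hypothetical table and column names, for illustration only.
-- The table default applies to every character column that does not set its own collation.
CREATE TABLE customers (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL
) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;

-- Override the collation for a single column.
ALTER TABLE customers
  MODIFY name VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL;
```

Changing the collation of an existing column rebuilds any indexes on it, so on large tables this is better done during a maintenance window.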

Note

Although the utf8_unicode_ci collation is more complete, it is still not exhaustive, so MySQL also provides many language-specific collations for particular locales. For details, see the link below.
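
To see which collations a given server offers, and to try a language-specific one, something like the following can be used. This is a sketch: the list returned depends on the server version, and utf8_swedish_ci is picked only as one example of a language-specific collation.

```sql
-- List the available collations whose names start with utf8
-- (this also matches utf8mb3_* and utf8mb4_* names on newer servers).
SHOW COLLATION LIKE 'utf8%';

-- Assumption: the client sends UTF-8 and the server accepts the utf8 character set.
SET NAMES utf8;

-- Compare the same pair under a language-specific collation and under utf8_unicode_ci.
SELECT 'ä' = 'a' COLLATE utf8_swedish_ci AS swedish,
       'ä' = 'a' COLLATE utf8_unicode_ci AS unicode_default;
```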

References

MySQL official documentation: http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html

Original: http://www.cnopensource.org/2012/06/mysql-%E7%9A%84-utf8_general_ci-%E5%92%8C-utf8_unicode_ci-%E6%9C%89%E4%bb%80%e4%b9%88%e5%8c%ba%e5%88%ab%ef%bc%8c%e5%ba%94%e5%a6%82%e4%bd%95%e9%80%89%e6%8b%a9%ef%bc%9f/

Reposted from http://blog.chedushi.com/archives/6462
