Metadata is "data about data." Anything that describes the database, as opposed to being the contents of the database, is metadata. Thus column names, database names, user names, version names, and most of the string results from SHOW statements are metadata. The same is true of the contents of tables in the INFORMATION_SCHEMA database, because those tables by definition contain information about database objects.
Metadata representations must meet these requirements:
· All metadata must be in the same character set. Otherwise, neither SHOW statements nor SELECT statements on tables in the INFORMATION_SCHEMA database would work correctly, because different rows in the same column of the results of these operations would be in different character sets.
· Metadata must include all characters in all languages. Otherwise, users will not be able to use their own language to name columns and tables.
To meet both requirements, MySQL stores metadata in a Unicode character set, namely UTF8. This causes no disruption if you never use accented characters. If you do use them, be aware that metadata is stored in UTF8.
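The second requirement can be illustrated with a small Python sketch. It is not MySQL code: the identifier names are hypothetical, and Python's "cp1252" codec is used only as an approximation of MySQL's latin1 character set. It checks which names each character set can represent.

```python
# Hypothetical table or column names in several languages (illustrative only).
names = ["customers", "clientes_año", "таблица", "顧客"]


def encodable(name: str, charset: str) -> bool:
    """Return True if every character of name exists in charset."""
    try:
        name.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# UTF-8 can represent every name; a single-byte Western set cannot.
utf8_ok = [encodable(n, "utf-8") for n in names]
latin1_ok = [encodable(n, "cp1252") for n in names]  # cp1252 approximates latin1

print(utf8_ok)    # [True, True, True, True]
print(latin1_ok)  # [True, True, False, False]
```

Only a Unicode encoding accepts all four names, which is why a single metadata character set has to be Unicode-based.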
This means that the return values of the USER(), CURRENT_USER(), DATABASE(), and VERSION() functions are in the UTF8 character set by default, as are the results of synonymous functions such as SESSION_USER() and SYSTEM_USER().
The server sets the character_set_system system variable to the name of the metadata character set:
mysql> SHOW VARIABLES LIKE 'character_set_system';
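+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| character_set_system | utf8  |
+----------------------+-------+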
Storing metadata in Unicode does not mean that the server returns column headers and the results of DESCRIBE statements in the character_set_system character set by default. When you use SELECT column1 FROM t, the name column1 itself is returned from the server to the client in the character set determined by the SET NAMES statement. More specifically, the character set used is determined by the value of the character_set_results system variable. If this variable is set to NULL, no conversion is performed and the server returns metadata in its original character set (the one indicated by character_set_system).
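The rule described above can be modelled in a few lines of Python. This is only a sketch of the decision, not the server's implementation: the function name header_bytes is hypothetical, None stands in for NULL, and "cp1252" approximates MySQL's latin1.

```python
from typing import Optional


def header_bytes(name: str,
                 character_set_results: Optional[str],
                 character_set_system: str = "utf-8") -> bytes:
    """Model how a column header might be encoded before it is sent to the client."""
    if character_set_results is None:
        # NULL: no conversion, metadata stays in its original character set.
        return name.encode(character_set_system)
    # Otherwise the name is converted to the results character set.
    return name.encode(character_set_results, errors="replace")


unconverted = header_bytes("año", None)
converted = header_bytes("año", "cp1252")

print(unconverted)  # b'a\xc3\xb1o'
print(converted)    # b'a\xf1o'
```

The same column name reaches the client as different bytes depending on the results character set, so the client must know which one is in effect.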
If you want metadata results returned in a character set other than UTF8, either use the SET NAMES statement to force the server to perform character set conversion, or perform the conversion in the client. Converting in the client is more efficient, but this option is not available to all clients.
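Client-side conversion might look like the following Python sketch. It assumes the client has received metadata as raw UTF8 bytes and wants them in its own character set. The function name to_client_charset is hypothetical, "cp1252" approximates latin1, and replacing unrepresentable characters with "?" is a choice made for this example.

```python
def to_client_charset(raw: bytes, client_charset: str) -> bytes:
    """Re-encode UTF-8 metadata bytes into the client's character set."""
    text = raw.decode("utf-8")
    # Characters missing from the target set are replaced with '?'.
    return text.encode(client_charset, errors="replace")


# Hypothetical column names as they would arrive in UTF-8.
accented = to_client_charset("año".encode("utf-8"), "cp1252")
cyrillic = to_client_charset("таблица".encode("utf-8"), "cp1252")

print(accented)  # b'a\xf1o'
print(cyrillic)  # b'???????'
```

The second result shows the cost of converting to a narrower character set: names outside it are no longer recoverable.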
Do not worry if you use (for example) the USER() function for comparison or assignment within a single statement. MySQL performs some automatic conversion for you.
SELECT * FROM Table1 WHERE USER() = latin1_column;
This works because the contents of latin1_column are automatically converted to UTF8 before the comparison.
INSERT INTO Table1 (latin1_column) SELECT USER();
This works because the value returned by USER() is automatically converted to latin1 before the assignment. Automatic conversion is not yet fully implemented, but it should work properly in a future release.
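The two conversions can be pictured with a short Python model. It is a sketch, not MySQL's implementation: the user name is hypothetical, and "cp1252" approximates latin1. The same text has different bytes in each character set, so a comparison or an assignment only works after one side is converted.

```python
# Hypothetical account name, as USER() would return it (UTF-8)
# and as a latin1 column would store it.
user_utf8 = "rené@localhost".encode("utf-8")
column_latin1 = "rené@localhost".encode("cp1252")

# The raw bytes differ, so a byte-for-byte comparison would fail.
raw_equal = user_utf8 == column_latin1

# Comparison: the latin1 value is converted to Unicode first.
converted_equal = column_latin1.decode("cp1252") == user_utf8.decode("utf-8")

# Assignment: the UTF-8 value is converted to latin1 before it is stored.
stored = user_utf8.decode("utf-8").encode("cp1252")

print(raw_equal)                # False
print(converted_equal)          # True
print(stored == column_latin1)  # True
```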
Although automatic conversion is not part of the SQL standard, the SQL standard document does say that every character set is (in terms of supported characters) a "subset" of Unicode. Because it is a well-known principle that "what applies to a superset can apply to a subset," we believe that a Unicode collation can apply to comparisons with non-Unicode strings.
Note: In MySQL 5.1, the errmsg.txt files all use UTF8. Conversion to the client character set is performed automatically, as it is for metadata.