MYSQL binary solution to mysql data case sensitivity problem _ MySQL

Source: Internet
Author: User
MYSQL binary method to solve mysql data case sensitivity problem bitsCN.com


Mysql> select binary 'ABC' = 'abc' COM1, 'ABC' = 'abc' COM2;
+ -------- + ----------- +
| COM1 | COM2 |
+ -------- + ----------- +
| 0 | 1 |
+ --------- + ----------- +
1 row in set (0.00 sec)

(Just a few! 4. * previously)
Because some MySQL, especially 4. * In the past, there may be inaccurate Chinese retrieval, and binary can be added during retrieval.
Table creation:

Create TABLE usertest (
Id int (9) unsigned not null auto_increment,
Username varchar (30) not null default '',
Primary key (id)
)

Insert data:

Insert into usertest (username) VALUES ('文 ');
Insert into usertest (username) VALUES ('u.s. Project ');
Insert into usertest (username) VALUES ('li Wen ');
Insert into usertest (username) VALUES ('ottan ');
Insert into usertest (username) VALUES ('dream ');
Insert into usertest (username) VALUES ('longfu ');
Insert into usertest (username) VALUES ('shares ');

For example, if select * from usertest where username like '% Xia %', all seven records are displayed, which is depressing.
If you use = instead of like, select * from usertest where username = 'Xia ', only one result is displayed. Because the LIKE Operation in mysql is performed according to ASCII, the LIKE operation may be faulty. Problem continues: If you add:

Insert into usertest (username) VALUES ('wen ');
Insert into usertest (username) VALUES ('tang ');

If you still use select * from usertest where username = 'shares', three records are displayed, which is depressing. The solution is as follows:
1. binary is used during the create operation, instead of the query operation.

Username varchar (30) binary not null default ''. if the table has been created, use:
Alter table usertest modify username varchar (32) binary; to view the attributes of the table.

2. when adding binary to the query, select * from usertest where username like binary '% summer %' can accurately query a record.

Char uses a fixed length of space for storage. char (4) stores 4 characters and occupies different bytes according to the encoding method. the gbk encoding method is used, whether it is Chinese or English, each character occupies 2 bytes of space and UTF-8 encoding. each character occupies 3 bytes of space.
If the length of the string to be stored is not much different from the average length of all values, it is suitable to use char, such as MD5.
For frequently changed values, char is better than varchar because fixed-length rows are not prone to fragmentation.
For a very short column, char is better than varchar because varchar requires one or two additional bytes to store the length of the string.

Varchar stores variable-length strings and uses one or two additional bytes to store the string length. varchar (10) requires 10 characters, it also requires one byte to store the length information (10), and more than 255 of the length requires two bytes to store
Exception: when ROW_FORMAT = FIXED is used in the Myisam engine, the same space is used for each row, resulting in waste.

If char and varchar are followed by spaces, char automatically removes spaces and stores them. although varchar does not remove spaces, it removes spaces for comparison during string comparison.

+ ------- + -------------- + ------ + ----- + --------- + ---------------- +
| Field | Type | Null | Key | Default | Extra |
+ ------- + -------------- + ------ + ----- + --------- + ---------------- +
| Id | int (11) | NO | PRI | NULL | auto_increment |
| Name | varchar (4) | YES | NULL |
| Addr | char (8) | YES | NULL |
| Bn | varbinary (4) | YES | NULL |
| B | binary (8) | YES | NULL |
+ ------- + -------------- + ------ + ----- + --------- + ---------------- +
+ ---------------------- +
| Concat ("$", name, "$") | concat ("$", addr, "$") |
+ ---------------------- +
| $ Asdf $ | $ a $ |
| $ Asdf $ | $ a $ |
| $ A $ |
| $ A $ |
| $ T a $ | $ a $ |
+ ---------------------- +
Mysql> select * from zcy where name = 'a'; // because the name is varchar, 'A' is automatically converted to 'A' during comparison'
+ ---- + ------ + ---------- +
| Id | name | addr | bn | B |
+ ---- + ------ + ---------- +
| 3 | a | AB |
| 4 | a | AB | a |
+ ---- + ------ + ---------- +
2 rows in set (0.00 sec)

Mysql> select * from zcy where name = 'a ';
+ ---- + ------ + ---------- +
| Id | name | addr | bn | B |
+ ---- + ------ + ---------- +
| 3 | a | AB |
| 4 | a | AB | a |
+ ---- + ------ + ---------- +
2 rows in set (0.00 sec)


+ ------- + -------------- + ------ + ----- + --------- + ---------------- +
| Field | Type | Null | Key | Default | Extra |
+ ------- + -------------- + ------ + ----- + --------- + ---------------- +
| Id | int (11) | NO | PRI | NULL | auto_increment |
| Name | varchar (4) | YES | NULL |
| Addr | char (8) | YES | NULL |
| Bn | varbinary (4) | YES | NULL |
| B | binary (8) | YES | NULL |
+ ------- + -------------- + ------ + ----- + --------- + ---------------- +
+ -------------------- + ------------------- +
| Concat ("$", bn, "$") | concat ("$", B, "$") |
+ -------------------- + ------------------- +
| $ AB a $ | NULL |
| $ AB $ |
| $ AB $ |
| $ AB $ | $ a $ |
| NULL | $ a $ |
| NULL | $ abcde $ |
| NULL | $ abcd1234 $ |
+ -------------------- + ------------------- +

Binary stores binary strings, which are stored in bytes rather than characters without character set restrictions.
Binary (8) can save 8 characters, each of which occupies 1 byte, a total of 8 bytes
Compared by byte, rather than by character (char), compared by byte is simpler and faster than character
The characters are case-insensitive, while binary is case-sensitive. The end is filled with/0 instead of spaces.

Mysql> select * from zcy where B = 'A/0/0/0/0/0/0/0 ';
+ ---- + ------ + ---------- +
| Id | name | addr | bn | B |
+ ---- + ------ + ---------- +
| 5 | t a | NULL | a |
+ ---- + ------ + ---------- +
Mysql> select * from zcy where B = 'A/0/0/0/0/0/0 ';
+ ---- + ------ + ---------- +
| Id | name | addr | bn | B |
+ ---- + ------ + ---------- +
| 4 | a | AB | a |
+ ---- + ------ + ---------- +

Varbinary stores variable-length strings, which are not followed by/0
Mysql> select * from zcy where bn = 'AB ';
+ ---- + ------ + ---------- +
| Id | name | addr | bn | B |
+ ---- + ------ + ---------- +
| 3 | a | AB |
+ ---- + ------ + ---------- +
1 row in set (0.01 sec)

Mysql> select * from zcy where bn = 'AB ';
+ ---- + ------ + ---------- +
| Id | name | addr | bn | B |
+ ---- + ------ + ---------- +
| 2 | asdf | a | AB |
+ ---- + ------ + ---------- +
1 row in set (0.00 sec)

Mysql> select * from zcy where bn = 'AB ';
+ ---- + ------ + ---------- +
| Id | name | addr | bn | B |
+ ---- + ------ + ---------- +
| 4 | a | AB | a |
+ ---- + ------ + ---------- +
1 row in set (0.00 sec)

Differences between Blob and Text in MySql

BLOB is a large binary object that can hold a variable amount of data. There are four BLOB types: TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB. They only have different maximum lengths that can hold values.

There are four types of TEXT: TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT. These correspond to four BLOB types, with the same maximum length and storage requirements.

The BLOB column is considered as a binary string (byte string ). The TEXT column is considered as a non-binary string (character string ). BLOB columns do not have character sets, and sort and compare value values based on column value bytes. The TEXT column has a character set and sorts and compares values according to the character set's checking rules.

There is no case conversion during TEXT or BLOB column storage or retrieval.

When not running in strict mode, if you assign a value that exceeds the maximum length of the column type to the BLOB or TEXT column, the value is truncated to ensure that it is suitable. If the truncated characters are not spaces, a warning is generated. When strict SQL mode is used, errors are generated, and the value is denied, rather than truncated, and a warning is given.

In most cases, BLOB columns can be considered as VARBINARY columns that are sufficiently large. Similarly, the TEXT column can be considered as a VARCHAR column. BLOB and TEXT are different from VARBINARY and VARCHAR in the following aspects:

When you save or retrieve the values of BLOB and TEXT columns, trailing spaces are not deleted. (This is the same as the VARBINARY and VARCHAR columns ).

Note that during comparison, TEXT will be expanded with spaces to fit the comparison objects, just as CHAR and VARCHAR.

The index prefix length must be specified for index of BLOB and TEXT columns. For CHAR and VARCHAR, the prefix length is optional.

BLOB and TEXT columns cannot have default values.

LONG and long varchar correspond to the MEDIUMTEXT data type. This is to ensure compatibility. If the TEXT column type uses the BINARY attribute, BINARY proofreaders of the column character set are assigned to the column.

MySQL Connector/ODBC defines BLOB as LONGVARBINARY and TEXT as LONGVARCHAR.

Because BLOB and TEXT values may be very long, you may encounter some constraints when using them:

Only the first max_sort_length bytes of the column are used for sorting. The default value of max_sort_length is 1024. this value can be changed using the -- max_sort_length option when the mysqld server is started.

The value of max_sort_length added during running can make more bytes meaningful during sorting or combination. Any client can change the value of the max_sort_length variable of its session:

Mysql & gt; SET max_sort_length = 2000;

Mysql> SELECT id, comment FROM tbl_name

-> Order by comment;

When you want to make the byte that exceeds max_sort_length meaningful, another way to use group by or order by for long BLOB or TEXT columns is to convert column values to fixed-length objects. The standard method is to use the SUBSTRING function. For example, the following statement sorts the 2000 bytes of the comment column:

Mysql> SELECT id, SUBSTRING (comment, 1,2000) FROM tbl_name

-> Order by substring (comment, 1,2000 );

The maximum size of a BLOB or TEXT object is determined by its type, but the actual maximum value that can be passed between the client and the server is determined by the amount of available memory and the size of the communication cache. You can change the size of the message cache by changing the value of the max_allowed_packet variable, but you must modify both the server and client programs. For example, you can use mysql and mysqldump to change the max_allowed_packet value of the client.

Each BLOB or TEXT value is represented by an internally assigned object. This is a comparison with other column types. The latter allocates a storage engine for each column when the table is opened.

BitsCN.com

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.