Five ways to improve the performance of hexadecimal identifiers in MySQL

Source: Internet
Author: User

Here we discuss how to maintain good performance when storing large amounts of hexadecimal data. The focus is on MySQL, although most of the advice should apply to other databases as well.

1. Be careful about your character encoding

Consider the following SQL statement:

mysql> explain select * from t where id = '0cc175b9c0f1b6a831c399e269772661'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 98
          ref: const
         rows: 1
        Extra: Using index

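Why is the index key 98 bytes long? Simple: the column uses UTF-8, so each of the 32 characters may take up to 3 bytes, and the VARCHAR length prefix adds 2 more (32 × 3 + 2 = 98):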

CREATE TABLE `t` (
  `id` varchar(32) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8

There is no need to store hexadecimal data as UTF-8. Doing so does not increase the space used on disk, but when you use ORDER BY or GROUP BY, the implicit temporary tables (the temporary tables MySQL builds internally while running a query) consume up to three times as much memory and disk space, at least in MySQL.
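The 98 reported in key_len can be reproduced by hand. The sketch below assumes the usual way EXPLAIN computes key_len for a string column: declared length times the maximum bytes per character, plus 2 bytes for a variable-length column and 1 byte if the column is nullable. The function name and its parameters are illustrative only and are not part of MySQL.

```python
def index_key_len(chars, max_bytes_per_char, variable_length=False, nullable=False):
    """Estimate the key_len that EXPLAIN reports for a single string column."""
    key_len = chars * max_bytes_per_char
    if variable_length:
        key_len += 2  # VARCHAR keeps a 2-byte length prefix in the key
    if nullable:
        key_len += 1  # one extra byte flags NULL
    return key_len

# VARCHAR(32) NOT NULL with utf8 (up to 3 bytes per character)
utf8_varchar = index_key_len(32, 3, variable_length=True)
# The same column with a single-byte character set such as latin1 or ascii
single_byte_varchar = index_key_len(32, 1, variable_length=True)

print(utf8_varchar, single_byte_varchar)
```

Under these assumptions, switching the column to a single-byte character set already cuts the key to roughly a third of its size, before any of the other techniques are applied.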

2. Use a fixed length and disallow NULL values

The table above uses a VARCHAR column, which is a variable-length type. If you know that every value has the same length (an MD5 hash in hexadecimal, for example, is always 32 characters), it is better to use the fixed-length CHAR() type. In addition, if the column can never be empty, declare it NOT NULL.

3. Store the data as binary

In fact, you do not need to store strings at all. A hexadecimal string is just another representation of a number, so you can store the number directly. For example, what is 2e2a? It is simply the number 11818 written in hexadecimal. It is better to use an integer of 4 bytes (or fewer) in place of a 32-byte character column.
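The arithmetic is easy to confirm outside the database. This is a small Python sketch, not something from the original article; it parses the text as a base-16 number and also packs it into raw bytes to show the size difference.

```python
hex_text = "2e2a"

value = int(hex_text, 16)           # parse the text as a base-16 number
as_bytes = bytes.fromhex(hex_text)  # the same value, two hex digits per byte

print(value)          # 11818
print(len(hex_text))  # 4 characters when stored as text
print(len(as_bytes))  # 2 bytes when stored as binary
```

The binary form is always half the length of the hexadecimal text, because each byte holds two hex digits.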

The problem is that MySQL has no numeric type large enough for such values: they are far larger than BIGINT can hold. However, MySQL does let us store them in BINARY columns, which makes the data more compact and faster to compare. You can convert between the two forms with HEX() and UNHEX(), or write a value directly with the hexadecimal literal syntax x'...':

mysql> select x'7861707262';
+---------------+
| x'7861707262' |
+---------------+
| xaprb         |
+---------------+

After replacing VARCHAR(32) with BINARY(16):

mysql> explain select * from t where id = x'0cc175b9c0f1b6a831c399e269772661'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 16
          ref: const
         rows: 1
        Extra: Using index

The index key length drops to 16 bytes (from the original 98 bytes), a substantial reduction. If you are using UUID(), strip the "-" characters with REPLACE() before saving, so that the remaining 32 hexadecimal digits can be converted with UNHEX().
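If the conversion is done in the application rather than with UNHEX() and REPLACE(), it might look like the sketch below. The helper name hex_id_to_binary is hypothetical; the idea is only to remove the dashes and pack two hexadecimal digits into each byte, which yields the 16 bytes that a BINARY(16) column expects.

```python
import uuid

def hex_id_to_binary(hex_id):
    """Strip UUID-style dashes and pack the hex digits into raw bytes."""
    return bytes.fromhex(hex_id.replace("-", ""))

# The 32-character identifier used in the queries above
md5_hex = "0cc175b9c0f1b6a831c399e269772661"
md5_bin = hex_id_to_binary(md5_hex)

# A freshly generated UUID in its usual 36-character text form
uuid_text = str(uuid.uuid4())
uuid_bin = hex_id_to_binary(uuid_text)

print(len(md5_hex), len(md5_bin))
print(len(uuid_text), len(uuid_bin))

# Converting back gives the original text, like HEX() in MySQL (lowercase here)
print(md5_bin.hex() == md5_hex)
```

The resulting bytes value can then be passed to the database driver as an ordinary bound parameter.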

4. Use prefix indexes

Most of the time we do not need to index the whole column; the first 8 to 10 characters are enough. If you are currently storing strings, this is very useful: you do not need to convert them to BINARY, you only need to change the indexing strategy.

You can use the following SQL statement to determine an appropriate prefix length:

mysql> select count(distinct id), count(distinct left(id, 8)), count(distinct left(id, 9)) from t\G
*************************** 1. row ***************************
         count(distinct id): 2
count(distinct left(id, 8)): 2
count(distinct left(id, 9)): 2

The prefix index does not have to be unique: it only needs to be selective enough to narrow a lookup down to one row or a few rows.
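The same measurement can be sketched on a sample of identifiers held in memory. The function prefix_selectivity and the sample values are made up for illustration; the ratio it returns compares the number of distinct prefixes with the number of distinct full values, so 1.0 means the prefix distinguishes rows as well as the whole column does.

```python
def prefix_selectivity(ids, prefix_len):
    """Ratio of distinct prefixes to distinct full values (1.0 means no loss)."""
    distinct_full = len(set(ids))
    distinct_prefix = len({value[:prefix_len] for value in ids})
    return distinct_prefix / distinct_full

# Hypothetical sample: the second id shares its first 8 characters with the first
ids = [
    "0cc175b9c0f1b6a831c399e269772661",
    "0cc175b9" + "a" * 24,
    "92eb5ffee6ae2fec3ad71c777531578f",
]

for prefix_len in (8, 9):
    print(prefix_len, prefix_selectivity(ids, prefix_len))
```

In this sample an 8-character prefix leaves two identifiers indistinguishable, while 9 characters separate all three. On real data, pick the shortest prefix whose ratio is close enough to 1.0 for your workload.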

5. Create a hash index

The code speaks for itself:

mysql> alter table t add crc int unsigned not null, add key(crc);
mysql> update t set crc = crc32(id);
mysql> explain select * from t use index(crc) where id = '0cc175b9c0f1b6a831c399e269772661' and crc = crc32('0cc175b9c0f1b6a831c399e269772661')\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
         type: ref
possible_keys: crc
          key: crc
      key_len: 4
          ref: const
         rows: 1
        Extra: Using where

CRC32() computes a checksum of the string. Collisions are generally rare, and an index on an integer is much faster than an index on a character column. The comparison on id stays in the WHERE clause so that any rows that happen to share the same checksum are filtered out. We strongly recommend this technique; it works not only for hexadecimal strings but for any string:

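mysql> select crc32('good good study, and day up!');
+---------------------------------------+
| crc32('good good study, and day up!') |
+---------------------------------------+
|                            2265998365 |
+---------------------------------------+
1 row in set (0.00 sec)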
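The lookup pattern can be sketched in memory to show why the query keeps both conditions. This is an illustration rather than the article's own code: a dictionary stands in for the crc index, the function names are hypothetical, and it assumes that MySQL's CRC32() is the same standard CRC-32 that Python's zlib.crc32 computes.

```python
import zlib

def crc32_of(text):
    """Unsigned 32-bit CRC of the UTF-8 bytes of text."""
    return zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF

def build_crc_index(ids):
    """Map each CRC value to the list of ids that hash to it."""
    index = {}
    for row_id in ids:
        index.setdefault(crc32_of(row_id), []).append(row_id)
    return index

def lookup(index, wanted_id):
    """Probe by CRC first, then compare the full id to discard collisions."""
    candidates = index.get(crc32_of(wanted_id), [])
    return [row_id for row_id in candidates if row_id == wanted_id]

wanted = "0cc175b9c0f1b6a831c399e269772661"
index = build_crc_index([wanted, "92eb5ffee6ae2fec3ad71c777531578f"])

# Simulate a collision by forcing an unrelated, made-up id into the same bucket
index[crc32_of(wanted)].append("ffffffffffffffffffffffffffffffff")

print(lookup(index, wanted))
```

The probe by checksum touches only a small integer key, and the final comparison on the full identifier removes the simulated collision, which is the role that the id condition plays in the SQL query.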

Summary:

Hexadecimal identifiers bloat tables and indexes and slow down comparisons and lookups. We recommend that you avoid them unless you have to use them. If you do have to use them, we hope the five suggestions above will be useful to you.
