(MariaDB) MySQL Data Types and storage mechanisms, mariadbmysql

Source: Internet
Author: User
Tags truncated

(MariaDB) MySQL Data Types and storage mechanisms, mariadbmysql

1.1 data type Overview

A data type is a field constraint that limits the data that each field can store, the data that can be stored, and the format that can be stored. MySQL/MariaDB has five types of data: integer, floating point, String, date and time, and special ENUM and SET types.

The meanings, restrictions, and descriptions of these five data types are shown in:

For the number of bytes occupied by each data type, see the mariadb official manual.

1.2 Storage Mechanism and Operation Method

The reason why the data type can limit the data storage length of a field is that the address space is strictly defined in the memory when the table is created, and the length of the address space is the size of the data that can be stored. Of course, this is a rough concept. For more details about the storage method, see the following description.

There are two ways to limit the range of data types: one is to strictly limit the space, and the data can be stored only when the space is divided, and the data exceeding the limit will be cut off; the second is to use the bit of an extra byte to mark whether the byte in an address space stores data. If the byte is stored, it is marked. If it is not stored, it is not marked.

1.2.1 integer Storage

Here we mainly describe the storage method of Integer type. There are too many things to consider for the Storage Method of floating point data type.

For the integer data type, it strictly limits space, but it is different from the character, because each byte on the bit 0 and 1 can be calculated directly, therefore, its range is calculated based on the number of bit values. A single byte has 8 bits. These 8 bits can constitute 2 ^ 8 = 256 values. Likewise, 2 bytes have 2 ^ 16 = 65536 values, the 4-byte int occupies 32 bits, which can be expressed in the range of 0-2 ^ 32. That is to say, numbers between 0 and occupy only one byte, while numbers between 256-65535 occupy two bytes.

Note that the integer data type in MySQL/mariadb can use the M parameter. M is a positive integer, such as INT (M) and tinyint (M ). This M indicates the display length. For example, int (4) indicates that four integers are displayed in the output. If the actual number of digits is smaller than the display width, spaces are left by default. When the number of results exceeds the limit, the displayed results are not affected. Generally, this function is used with the zerofill attribute to replace space with 0. However, after zerofill is used, the column is automatically converted to an unsigned field. For example:

CREATE TABLE test3(id INT(2) ZEROFILL NOT NULL);INSERT INTO test3 VALUES(1),(2),(11),(111);SELECT id FROM test3;+-----+| id |+-----+| 01 || 02 || 11 || 111 |+-----+4 rows in set (0.00 sec)

The only thing to note is that the display width only affects the display effect, and does not affect storage, comparison, length calculation, and other operations.

1.2.2 storage of Character Types

The storage methods and differences between char and varchar are described here.

Char is often referred to as "fixed-length string type". It strictly limits the space length, but it limits the number of characters rather than the number of bytes, however, in earlier versions, the number of bytes is limited. Therefore, char (M) strictly stores M characters. spaces are used to fill up the missing parts, and the parts exceeding M characters are directly truncated.

Because the char type has the ability to "fill with spaces when it is short", in order to reflect the authenticity of the data, trailing spaces are automatically deleted when retrieving data from the address space. This is a special feature of char. Even the trailing spaces that we manually store will be deemed to be automatically supplemented, so they will be deleted during retrieval. That is to say, in the where statement, the results of name = 'gaoxiaofang 'and name = 'gaoxiaofang' are the same.

For example:

Create table test2 (a char (4) charset utf8mb4); insert into test2 values ('Congratulations, '), ('Congratulations, you have been promoted successfully'), ('hello '), ('H'); select concat (a, 'x') from test2; + ------------- + | concat (a, 'x ') | + --------------- + | congratulations x | congratulations on your achievement of x | hellx | hex | + ------------- + 4 rows in set

As shown in the preceding results, char (4) can only store 4 characters and Remove trailing spaces.

Varchar is often referred to as "variable-length string type". It uses extra bytes of bit to mark whether a byte stores data. Each stored byte (not a character) occupies one bit for record. Therefore, an additional byte can be marked with 256 bytes in total, and two additional bytes can be marked with 65536 bytes. However, MySQL/mariadb limits the storage capacity to 65536 bytes. This indicates that a single-byte character can store up to 65536 characters. If it is a multi-byte character, for example, each character of UTF8 occupies 3 bytes, it can store up to 65536/3 = 21845 UTF-8 characters.

Therefore, in varchar (M) Storage, in addition to the actual data space length, the Bit length of one or two bytes is also calculated, that is, the space occupied by single-byte characters is M + 1 or M + 2 bytes. For multi-byte characters (such as 3 bytes) the actual occupied space is M * 3 + 1 or M * 3 + 2 bytes.

Because the varchar storage requires an additional bit to record each byte, the short data will not be automatically supplemented, therefore, the trailing spaces explicitly stored will also be stored and marked on the Bit, that is, the trailing spaces will not be deleted.

Like char (M), when varchar (2) is specified, only two bytes of characters can be stored. If it exceeds the limit, it is cut off.

For char, varchar, and text string types, trailing spaces are not considered during comparison, but spaces are considered for like matching or regular matching, because the matching characters are precise. For example:

create table test4(a char(4),b varchar(5));insert into test4 values('ab ','ab ');select a='ab ',b='ab ',a=b from test4;+-----------+--------------+-----+| a='ab ' | b='ab ' | a=b |+-----------+--------------+-----+|  1 |  1 | 1 |+-----------+--------------+-----+1 row in setselect a like 'ab ' from test4;+-------------------+| a like 'ab ' |+-------------------+|   0 |+-------------------+1 row in set

It should be noted that when values are stored (or transferred to memory), the numeric storage method saves more space than the numeric or datetime storage. Because the integer is directly calculated through bit during storage, any integer between 0 and occupies only one byte, and any integer between-occupies two bytes, when 4 bytes are occupied, it can represent any one of several billions of integers. This is obviously more space-saving than when character storage occupies one byte. For example, when the value "100" is stored as the numeric type, it occupies three bytes, while the numeric type occupies only one byte. Therefore, the database treats the value not enclosed by quotation marks as a numeric value by default. If you want to explicitly store the value as a numeric or datetime type, enclose it with quotation marks to avoid ambiguity.

1.2.3 Date and Time Storage

To store datetime data, use quotation marks to avoid ambiguity with numerical data. The Input Method of date and time is very loose. The following methods are allowed: Any allowed separators. We recommend that you use a four-digit year.

201101012011-01-01 18:40:202011/01/01 18-40-2020110101184020

1.2.4 ENUM Data Type

The ENUM data type is Enumeration type. Defined as ENUM ('value1 ', 'value2', 'value3 ',...), when inserting data into a field of this type, only one or NULL values can be inserted. when inserting other values or NULL values (that is, ''), NULL data is truncated. The case sensitivity is ignored during storage (the characters in the ENUM will be converted), and trailing spaces will be truncated.

mysql> create table test6(id int auto_increment primary key,name char(20),gender enum('Mail','f'));mysql> insert into test6(name,gender) values('malongshuai','Mail'),('gaoxiaofang','F'),('wugui','x'),('tuner',null),('woniu','');Query OK, 5 rows affectedRecords: 5 Duplicates: 0 Warnings: 2mysql> show warnings;+---------+------+---------------------------------------------+| Level | Code | Message     |+---------+------+---------------------------------------------+| Warning | 1265 | Data truncated for column 'gender' at row 3 || Warning | 1265 | Data truncated for column 'gender' at row 5 |+---------+------+---------------------------------------------+2 rows in setmysql> select * from test6;+----+-------------+--------+| id | name | gender |+----+-------------+--------+| 1 | malongshuai | Mail || 2 | gaoxiaofang | f || 3 | wugui | || 4 | tuner | NULL || 5 | woniu | |+----+-------------+--------+5 rows in set

Data of the ENUM type is stored by the index value. Compared with the string type, it only needs 1 or 2 bytes for storage. Theoretically, when the number of values is less than 256, only one byte is required. If the number of values exceeds 256, but less than 65536, two bytes are used for storage. MySQL/MariaDB can only store up to 65536 values. Of course, this is a theoretical limitation. There are many factors to consider in actual storage. For example, NULL also occupies bit, so in actual storage, 250 values may need 2 bytes.

Each value of ENUM is numbered by the index number. The value of index is used for retrieval and operation of this field. Value1 index = 1, value2 index = 2, and so on. Note that there are two special index values: NULL index = NULL, and NULL index = 0.

For example, ENUM ('A', 'B', 'C'), insert '', 'B', 'A', 'C', NULL to this field in sequence, 'xxx', since the first and last data are truncated to NULL, their index is 0, the inserted NULL index is NULL, And the inserted 'B ', the index values of 'A' and 'C' are 2, 1, and 3, respectively. Therefore, the correspondence between index numbers and values is:

Index Value
NULL NULL
0 ''
0 ''
1 'A'
2 'B'
3 'C'

Use the index of ENUM for data retrieval:

mysql> select * from test6 where gender=2;+----+-------------+--------+| id | name | gender |+----+-------------+--------+| 2 | gaoxiaofang | f |+----+-------------+--------+1 row in set

We recommend that you do not use ENUM to store values, because both sorting, retrieval, and other operations use index values as conditions, which may lead to misunderstandings. For example, the following uses ENUM to store two values and then performs retrieval and sorting.

Mysql> create table test7 (id enum ('3', '1', '2'); mysql> insert into test7 values ('1 '), ('2'), ('3'); # search id = 2, but the result is 1, because id = 2 is the index value of enum, in enum, the value of index = 2 is 1 mysql> select * from test7 where id = 2; + ---- + | id | + ---- + | 1 | + ---- + 1 row in set # When sorting by id, mysql> select * from test7 order by id asc; + ---- + | id | + ---- + | 3 | 1 | 2 | + ---- + 3 rows in set

Therefore, we strongly recommend that you do not store values in ENUM. Even floating-point values are prone to ambiguity.

1.2.5 SET Data Type

For the SET type, similar to enum, It is case-insensitive. Trailing spaces are deleted during storage, and null is also a valid value. But the difference is that multiple values can be combined. For example, set ('A', 'B', 'C', 'D') can store 'a, B ', 'd, B', etc, multiple members are separated by commas. Therefore, when multiple members are used, the values of the Members cannot contain commas. If the content to be stored is not in the set list, it is truncated to a null value.

The size of the space occupied by the SET Data Type depends on the number of SET members M. The calculation method is (M + 7)/8 to take an integer. Therefore, 1-8 members occupy 1 byte;

9-16 members occupy 2 bytes; 17-24 members occupy 3 bytes; 25-32 members occupy 4 bytes; 32-64 members occupy 8 bytes.

MySQL/MariaDB can have up to 64 members.

When storing SET data, duplicate members are ignored and stored in the enumerated order. For example, set ('B', 'B', 'A') Stores 'a, B, A', 'B, a, B', and the result is 'B, A '.

mysql> create table test8(a set('d','b','a'));mysql> insert into test8 values('b,b,a'),('b,a,b'),('bab');Query OK, 3 rows affectedRecords: 3 Duplicates: 0 Warnings: 1mysql> select * from test8;+-----+| a |+-----+| b,a || b,a || |+-----+3 rows in set

Use find_in_set (set_value, set_column_name) to retrieve rows containing the specified set value set_value. For example, to retrieve a row containing member B in field:

mysql> select * from test8 where find_in_set('b',a);+-----+| a |+-----+| b,a || b,a |+-----+2 rows in set

1.3 Data Type attribute: unsigned

The unsigned attribute is used to make the data of the numeric type non-symbolic. The unsigned attribute will change the range of the numeric data type. For example, the signed range of the tinyint type is-128 to 127, and the range will be 0 to 255 When unsigned is used. At the same time, unsigned will also restrict the column from inserting negative values.

create table t(a int unsigned,b int unsigned);insert into t select 1,2;insert into t select -1,-2;

In the preceding statement, an error is reported when the second statement is executed to insert a negative number, prompting that the number is out of the range.

The use of unsigned does play a role in some cases. For example, a general ID Primary Key column does not allow negative numbers, which is equivalent to implementing a check constraint. However, when unsigned is used, unexpected problems may occur: an error is reported if a negative number is obtained during a numerical operation. For example, in table t above, fields a and B are all unsigned columns with a row a = 1 and B = 2.

mysql> select * from t;+---+---+| a | b |+---+---+| 1 | 2 |+---+---+1 row in set

In this case, an error occurs when a-B is calculated. As long as the unsigned column is involved in the calculation, a negative number is returned.

mysql> select a-b from t;1690 - BIGINT UNSIGNED value is out of range in '(`test`.`t`.`a` - `test`.`t`.`b`)'mysql> select a-2 from t;1690 - BIGINT UNSIGNED value is out of range in '(`test`.`t`.`a` - 2)'

If the calculation result is not a negative number, it will not be affected.

mysql> select 2-a,a*3 from t;+-----+-----+| 2-a | a*3 |+-----+-----+| 1 | 3 |+-----+-----+1 row in set

This is not a bug in MySQL/MariaDB, And the unsigned in C language also has a similar problem. Set SQL _mode = 'no _ unsigned_subtraction 'in MySQL/MariaDB to solve this problem.

Therefore, we do not recommend that you use the unsigned attribute to modify fields.

1.4 data type attribute: zerofill

After zerofill modifies a field, the insufficient field display part uses 0 instead of space filling. When zerofill is enabled, unsigned is automatically set. Zerofill is generally used only after the column display width is set. The column display width has been described above.

mysql> create table t1(id int(4) zerofill);mysql> select * from t1;+-------+| id |+-------+| 0001 || 0002 || 0011 || 83838 |+-------+4 rows in set (0.00 sec)

Zerofill only modifies the display results and does not affect the stored data values.

The above (MariaDB) MySQL Data Type and storage mechanism is a comprehensive explanation of all the content that I have shared with you. I hope to give you a reference, and I hope you can provide more support for the customer's house.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.