Why did you suddenly want to study it? It is because you want to use MySQL for something, but you find that the client library, MySql Connector/Net, uses GPL authorization. In other words, source distribution cannot be closed. I heard that MariaDB uses the same communication protocol, so its client library can be compatible with MySQL. It is LGPL, but it does not have a. Net version. Oh, no such bullying. Net. Therefore, I came up with the idea of writing connector first). So I want to study the MySQL communication protocol.
I declare that this is not a tutorial, but a note during the study, so errors are inevitable. In addition, because I don't want to write a large and complete component, I intentionally ignore something.
References: http://dev.mysql.com/doc/internals/en/client-server-protocol.html
First, sort out the basic knowledge:
1. Communication Protocol
The Client and Server support multiple communication modes, the most widely used is TCP communication, and also supports named pipes and shared memory. C/S adopts a semi-dual mode to send and receive data, that is, after the Client sends the request data on a TCP link, the next batch of data can only be sent after receiving the response data from all servers. Other data cannot be sent in the middle, which requires a strong sequence. Take logon as an example:
Client Server | handshake | |<-------------------| | authentication | |------------------->| | auth result | |<-------------------| | |
2. Protocol faults
MySQL expands the communication protocol in version 4.1. Therefore, different versions of servers need to communicate with each other through different protocols. However, we can ignore this issue because it is basically 5.0 or later.
3. Basic Data Types
Note that this refers to the data type used in the communication protocol, rather than the data type of the field. All information in communication protocols is based on two types of data: values and strings.
The values are further divided into fixed-length and variable-length types. The fixed-length type can be 1/2, 3/4, or 6/8 bytes. The variable-length type is stored in different bytes according to the value range, it can be 1/2/3/8 bytes, and we will see it later.
A string can be divided into fixed-length, NULL-ending, and variable-length types.
4. Basic data packet format
+-------------------+------------------+---------------+| data_len(3 bytes) | sequence(1 byte) | data(n bytes) |+-------------------+------------------+---------------+
A common package format can be understood when you write a TCP program. Note that the Data Length refers to the length of the subsequent data, excluding the four bytes in the header.
5. Character Set
Character sets will be encountered in many places, as follows:
+-----+----------------------+| id | collation_name |+-----+----------------------+| 1 | big5_chinese_ci || 2 | latin2_czech_cs || 3 | dec8_swedish_ci || 4 | cp850_general_ci || 5 | latin1_german1_ci || 6 | hp8_english_ci || 7 | koi8r_general_ci || 8 | latin1_swedish_ci || 9 | latin2_general_ci || 10 | swe7_swedish_ci || 11 | ascii_general_ci || 12 | ujis_japanese_ci || 13 | sjis_japanese_ci || 14 | cp1251_bulgarian_ci || 15 | latin1_danish_ci || 16 | hebrew_general_ci || 18 | tis620_thai_ci || 19 | euckr_korean_ci || 20 | latin7_estonian_cs || 21 | latin2_hungarian_ci || 22 | koi8u_general_ci || 23 | cp1251_ukrainian_ci || 24 | gb2312_chinese_ci || 25 | greek_general_ci || 26 | cp1250_general_ci || 27 | latin2_croatian_ci || 28 | gbk_chinese_ci || 29 | cp1257_lithuanian_ci || 30 | latin5_turkish_ci || 31 | latin1_german2_ci || 32 | armscii8_general_ci || 33 | utf8_general_ci || 34 | cp1250_czech_cs || 35 | ucs2_general_ci || 36 | cp866_general_ci || 37 | keybcs2_general_ci || 38 | macce_general_ci || 39 | macroman_general_ci || 40 | cp852_general_ci || 41 | latin7_general_ci || 42 | latin7_general_cs || 43 | macce_bin || 44 | cp1250_croatian_ci || 47 | latin1_bin || 48 | latin1_general_ci || 49 | latin1_general_cs || 50 | cp1251_bin || 51 | cp1251_general_ci || 52 | cp1251_general_cs || 53 | macroman_bin || 57 | cp1256_general_ci || 58 | cp1257_bin || 59 | cp1257_general_ci || 63 | binary || 64 | armscii8_bin || 65 | ascii_bin || 66 | cp1250_bin || 67 | cp1256_bin || 68 | cp866_bin || 69 | dec8_bin || 70 | greek_bin || 71 | hebrew_bin || 72 | hp8_bin || 73 | keybcs2_bin || 74 | koi8r_bin || 75 | koi8u_bin || 77 | latin2_bin || 78 | latin5_bin || 79 | latin7_bin || 80 | cp850_bin || 81 | cp852_bin || 82 | swe7_bin || 83 | utf8_bin || 84 | big5_bin || 85 | euckr_bin || 86 | gb2312_bin || 87 | gbk_bin || 88 | sjis_bin || 89 | tis620_bin || 90 | ucs2_bin || 91 | ujis_bin || 92 | geostd8_general_ci || 93 | geostd8_bin || 94 | latin1_spanish_ci || 95 | cp932_japanese_ci || 96 | cp932_bin || 97 | eucjpms_japanese_ci || 98 | eucjpms_bin || 99 | cp1250_polish_ci || 128 | ucs2_unicode_ci || 129 | ucs2_icelandic_ci || 130 | ucs2_latvian_ci || 131 | ucs2_romanian_ci || 132 | ucs2_slovenian_ci || 133 | ucs2_polish_ci || 134 | ucs2_estonian_ci || 135 | ucs2_spanish_ci || 136 | ucs2_swedish_ci || 137 | ucs2_turkish_ci || 138 | ucs2_czech_ci || 139 | ucs2_danish_ci || 140 | ucs2_lithuanian_ci || 141 | ucs2_slovak_ci || 142 | ucs2_spanish2_ci || 143 | ucs2_roman_ci || 144 | ucs2_persian_ci || 145 | ucs2_esperanto_ci || 146 | ucs2_hungarian_ci || 192 | utf8_unicode_ci || 193 | utf8_icelandic_ci || 194 | utf8_latvian_ci || 195 | utf8_romanian_ci || 196 | utf8_slovenian_ci || 197 | utf8_polish_ci || 198 | utf8_estonian_ci || 199 | utf8_spanish_ci || 200 | utf8_swedish_ci || 201 | utf8_turkish_ci || 202 | utf8_czech_ci || 203 | utf8_danish_ci || 204 | utf8_lithuanian_ci || 205 | utf8_slovak_ci || 206 | utf8_spanish2_ci || 207 | utf8_roman_ci || 208 | utf8_persian_ci || 209 | utf8_esperanto_ci || 210 | utf8_hungarian_ci |+-----+----------------------+
To be continued ......
This article from the rabbit nest blog, please be sure to keep this source http://boytnt.blog.51cto.com/966121/1275952