MySQL lexical analysis

Source: Internet
Author: User

This link gives a short introduction that is enough for a rough understanding: http://blog.imaginea.com/mysql-query-parsing/

Key points:
1. SQL parsing involves a syntax analyzer (parser) and a lexical analyzer (lexer).
A common approach is to use the bison/flex combination. MySQL, however, uses a hand-written lexical analyzer.
The parser entry function is MYSQLparse, and the lexer entry function is MYSQLlex.
2. During lexical analysis, the lexer checks whether each token is a keyword.
The most direct method is to keep a large sorted keyword array and run a binary search on it. MySQL optimizes this step,
and that optimization is the main subject of this article.
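
For comparison, the "most direct method" mentioned above can be sketched as a binary search over a sorted array. This is only an illustration of the baseline that MySQL improves on, not MySQL code: the keyword table and the function name find_keyword_bsearch are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical keyword table. It must stay sorted for binary search to work. */
static const char *const keywords[] = { "ADD", "AND", "DAY", "SELECT", "WHERE" };
static const size_t n_keywords = sizeof keywords / sizeof keywords[0];

/* Returns the index of tok in keywords[], or -1 if tok is not a keyword. */
static int find_keyword_bsearch(const char *tok)
{
    size_t lo = 0, hi = n_keywords;          /* search the half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(tok, keywords[mid]);

        if (cmp == 0)
            return (int)mid;                 /* exact match */
        if (cmp < 0)
            hi = mid;                        /* continue in the lower half */
        else
            lo = mid + 1;                    /* continue in the upper half */
    }
    return -1;
}
```

Each lookup costs on the order of log2(n) full string comparisons, which is the cost that the precomputed search tree described below tries to avoid.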

Because the keyword list is read-only, building a read-only search tree ahead of time can improve lookup performance.
Generating the search tree:
1. Read the keyword array and build a trie.
2. Adjust the tree and emit it as an array (that is, a tree that is not represented with linked nodes).
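
The two steps above can be sketched as follows. This is a simplified illustration written for this article, not the code of gen_lex_hash.cc: the node type, the function names (trie_insert, flatten, build_map) and the fixed-size flat_map buffer are assumptions, memory is never freed, and allocation failures are not checked. It assumes one trie root per token length and a 4-byte entry layout (first char, last char, 16-bit little-endian value), which is how the symbols_map listings later in this article read.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type; field names loosely follow the diagrams in this article. */
typedef struct node {
    int first_char;       /* 0 = empty, -1 = leaf, otherwise first char of the range */
    int last_char;        /* last char of the range (inner nodes only) */
    struct node *tails;   /* children, indexed by (c - first_char) */
    int iresult;          /* index into syms[] (leaves only) */
} node;

static const char *const syms[] = { "&&", "<", "<=", "<>" };
enum { NSYMS = 4, MAX_LEN = 2 };

static unsigned char flat_map[256];   /* 4 bytes per entry */
static int n_entries;

/* Step 1: insert syms[index] into the trie rooted at n. */
static void trie_insert(node *n, const char *rest, int depth, int index)
{
    if (n->first_char == 0) {                 /* empty slot becomes a leaf */
        n->first_char = -1;
        n->iresult = index;
        return;
    }
    if (n->first_char == -1) {                /* leaf: turn it into a one-char range */
        int old = n->iresult;
        n->first_char = n->last_char = (unsigned char)syms[old][depth];
        n->tails = calloc(1, sizeof(node));
        n->tails[0].first_char = -1;
        n->tails[0].iresult = old;
    }
    int c = (unsigned char)*rest;
    int lo = c < n->first_char ? c : n->first_char;
    int hi = c > n->last_char ? c : n->last_char;
    if (lo != n->first_char || hi != n->last_char) {   /* widen the range to cover c */
        node *grown = calloc((size_t)(hi - lo + 1), sizeof(node));
        memcpy(grown + (n->first_char - lo), n->tails,
               (size_t)(n->last_char - n->first_char + 1) * sizeof(node));
        free(n->tails);
        n->tails = grown;
        n->first_char = lo;
        n->last_char = hi;
    }
    trie_insert(&n->tails[c - n->first_char], rest + 1, depth + 1, index);
}

static void put_entry(int i, int b0, int b1, int value)
{
    flat_map[4 * i]     = (unsigned char)b0;
    flat_map[4 * i + 1] = (unsigned char)b1;
    flat_map[4 * i + 2] = (unsigned char)(value & 255);   /* low byte */
    flat_map[4 * i + 3] = (unsigned char)(value >> 8);    /* high byte */
}

/* Step 2: lay out siblings contiguously, then each sibling's children. */
static void flatten(const node *arr, int count)
{
    int base = n_entries;
    n_entries += count;
    for (int i = 0; i < count; i++) {
        const node *n = &arr[i];
        if (n->first_char == -1) {
            put_entry(base + i, 0, 0, n->iresult);   /* leaf: symbol index */
        } else if (n->first_char == 0) {
            put_entry(base + i, 0, 0, NSYMS);        /* empty: "not found" marker */
        } else {
            int child_base = n_entries;              /* entry index of the first child */
            flatten(n->tails, n->last_char - n->first_char + 1);
            put_entry(base + i, n->first_char, n->last_char, child_base);
        }
    }
}

/* Builds the trie for syms[] and flattens it; returns the map size in bytes. */
static int build_map(void)
{
    node root[MAX_LEN];                       /* root[len - 1] holds tokens of length len */
    memset(root, 0, sizeof root);
    n_entries = 0;
    for (int i = 0; i < NSYMS; i++)
        trie_insert(&root[strlen(syms[i]) - 1], syms[i], 0, i);
    flatten(root, MAX_LEN);
    return 4 * n_entries;
}
```

For these four symbols the sketch produces 27 entries, that is 108 bytes, which is the size of the symbols_map[108] listing shown for i = 3 near the end of this article.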

Using the search tree:
This part is relatively simple; see the function get_hash_symbol.

Generating the search tree: the Makefile rules are as follows.
In 'sql/CMakeFiles/sql.dir/build.make':

sql/lex_hash.h: sql/gen_lex_hash
	$(CMAKE_COMMAND) -E cmake_progress_report /home/zedware/Workspace/mysql/CMakeFiles $(CMAKE_PROGRESS_153)
	@$(CMAKE_COMMAND) -E cmake_echo_color --switch=$(COLOR) --blue --bold "Generating lex_hash.h"
	cd /home/zedware/Workspace/mysql/sql && ./gen_lex_hash > lex_hash.h

The central function is get_hash_symbol. Its main call relationships are:

/* sql/lex_hash.h */
get_hash_symbol -> sql_functions_map
get_hash_symbol -> symbols_map

/* sql/sql_lex.cc */
find_keyword -> get_hash_symbol
is_keyword -> get_hash_symbol
is_lex_native_function -> get_hash_symbol

Tree example from the comment in the file gen_lex_hash.cc (for the keywords ADD, AND and DAY):

+------------+---+---+---+
| len        | 1 | 2 | 3 |
+------------+---+---+---+
| first_char | 0 | 0 | a |
| last_char  | 0 | 0 | d |
| link       | 0 | 0 | + |
                       |
                       V
+------------+---+---+---+----+
| 1 char     | a | b | c | d  |
+------------+---+---+---+----+
| first_char | d | 0 | 0 | 0  |
| last_char  | n | 0 | 0 | -1 |
| link       | + | 0 | 0 | +  |
               |           |
               |           V
               |           symbols[2] ("DAY")
               V
+------------+----+---+---+---+---+---+---+---+---+---+----+
| 2 char     | d  | e | f | g | h | i | j | k | l | m | n  |
+------------+----+---+---+---+---+---+---+---+---+---+----+
| first_char | 0  | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0  |
| last_char  | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1 |
| link       | +  | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +  |
               |                                        |
               V                                        V
         symbols[0] ("ADD")                       symbols[1] ("AND")

The trie is easier to understand through examples. The following are the trees, and the generated maps, for input arrays of increasing size.
i = 0

+------------+---+----+
| len        | 1 | 2  |
+------------+---+----+
| first_char | 0 | -1 |
| last_char  | 0 | 0  |
| char_tails | 0 | x  |
| ithis      | 0 | 0  |
| iresult    | 0 | 0  |
                   |
                   &&

static SYMBOL symbols[] = {
{"&&", SYM(AND_AND_SYM)},
};

static uchar symbols_map[8] = {
0, 0, 1, 0,   /* 1 == number of elements in symbols[], meaning "not found" */
0, 0, 0, 0,   /* symbols[0] */
};

i = 1

+------------+----+----+
| len        | 1  | 2  |
+------------+----+----+
| first_char | -1 | -1 |
| last_char  | 0  | 0  |
| char_tails | x  | x  |
| ithis      | 0  | 0  |
| iresult    | 1  | 0  |
               |    |
               <    &&

static SYMBOL symbols[] = {
{"&&", SYM(AND_AND_SYM)},
{"<", SYM(LT)},
};

static uchar symbols_map[8] = {
0, 0, 1, 0,   /* 1 < number of elements in symbols[] (2), so symbols[1] was found */
0, 0, 0, 0,   /* symbols[0] */
};
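
The maps for i = 0 and i = 1 are byte-for-byte identical; only the size of symbols[] changes how the value 1 is read. The sketch below makes that rule explicit. It assumes the 4-byte entry layout suggested by the listings (first char, last char, then a 16-bit value stored low byte first); the function name root_leaf_result and the array name map8 are hypothetical.

```c
#include <stdint.h>

/* The 8-byte map shown above for both i = 0 and i = 1. */
static const uint8_t map8[8] = {
    0, 0, 1, 0,
    0, 0, 0, 0,
};

/*
 * Reads the root entry for tokens of length len (1-based).
 * Returns the symbol index, -1 if the entry holds the "not found" marker
 * (value == n_symbols), or -2 if the entry is a character range, not a leaf.
 */
static int root_leaf_result(const uint8_t *map, unsigned len, unsigned n_symbols)
{
    const uint8_t *e = map + 4 * (len - 1);
    unsigned value = e[2] | ((unsigned)e[3] << 8);   /* 16-bit value, low byte first */

    if (e[0] != 0)
        return -2;                                   /* range entry: needs a further step */
    return value == n_symbols ? -1 : (int)value;
}
```

With one symbol (i = 0), root_leaf_result(map8, 1, 1) yields -1 because no one-character symbol exists yet; with two symbols (i = 1), root_leaf_result(map8, 1, 2) yields 1, the index of "<".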

i = 2

+------------+----+----+
| len        | 1  | 2  |
+------------+----+----+
| first_char | -1 | &  |
| last_char  | 0  | <  |
| char_tails | x  | ^  |
| ithis      | 0  | 0  |
| iresult    | 1  | x  |
               |    |
               <    |
                    V
+------------+----+-----+----+
| 1 char     | &  | ... | <  |
+------------+----+-----+----+
| first_char | -1 | 0   | -1 |
| last_char  | 0  | 0   | 0  |
| char_tails | 0  | 0   | x  |
| ithis      | 0  | 0   | 0  |
| iresult    | 0  | 0   | 2  |
               |          |
               &&         <=

static SYMBOL symbols[] = {
{"&&", SYM(AND_AND_SYM)},
{"<", SYM(LT)},
{"<=", SYM(LE)},
};

static uchar symbols_map[100] = {
0, 0, 1, 0,       /* length 1: symbols[1] ("<") */
'&', '<', 2, 0,   /* length 2: first char in '&'..'<', children start at entry 2 */
0, 0, 0, 0,       /* '&': symbols[0] ("&&") */
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 3, 0,
0, 0, 2, 0,       /* '<': symbols[2] ("<=") */
};

i = 3

+------------+----+----+
| len        | 1  | 2  |
+------------+----+----+
| first_char | -1 | &  |
| last_char  | 0  | <  |
| char_tails | x  | ^  |
| ithis      | 0  | 0  |
| iresult    | 1  | x  |
               |    |
               <    |
                    V
+------------+----+-----+----+
| 1 char     | &  | ... | <  |
+------------+----+-----+----+
| first_char | -1 | 0   | -1 |
| last_char  | 0  | 0   | 0  |
| char_tails | 0  | 0   | x  |
| ithis      | 0  | 0   | 0  |
| iresult    | 0  | 0   | p  |
               |          |
               &&         |
                          V
            +------------+----+----+
            | 2 char     | =  | >  |
            +------------+----+----+
            | first_char | -1 | -1 |
            | last_char  | 0  | 0  |
            | char_tails | x  | x  |
            | ithis      | 0  | 0  |
            | iresult    | 2  | 3  |
                           |    |
                           <=   <>

static SYMBOL symbols[] = {
{"&&", SYM(AND_AND_SYM)},
{"<", SYM(LT)},
{"<=", SYM(LE)},
{"<>", SYM(NE)},
};

static uchar symbols_map[108] = {
0, 0, 1, 0,       /* length 1: symbols[1] ("<") */
'&', '<', 2, 0,   /* length 2: first char in '&'..'<', children start at entry 2 */
0, 0, 0, 0,       /* '&': symbols[0] ("&&") */
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
0, 0, 4, 0,
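'=', '>', 25, 0,  /* '<': second char in '='..'>', children start at entry 25 */
0, 0, 2, 0,       /* '=': symbols[2] ("<=") */
0, 0, 3, 0,       /* '>': symbols[3] ("<>") */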
};
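
A lookup over the i = 3 map can be sketched as below. This is a simplified reading of how a function such as get_hash_symbol could walk the array, not the MySQL implementation: the names set_entry, init_map and lookup_symbol are hypothetical, the 108-byte map is rebuilt in code instead of being typed out, and details of the real function (for example case folding and the separate map for SQL functions) are left out.

```c
#include <stdint.h>
#include <string.h>

enum { N_SYMBOLS = 4, N_ENTRIES = 27, MAX_LEN = 2 };

static const char *const symbols[N_SYMBOLS] = { "&&", "<", "<=", "<>" };
static uint8_t symbols_map[4 * N_ENTRIES];           /* 108 bytes, as in the listing */

static void set_entry(unsigned idx, uint8_t first, uint8_t last, uint16_t value)
{
    uint8_t *p = symbols_map + 4 * idx;
    p[0] = first;
    p[1] = last;
    p[2] = (uint8_t)(value & 255);                   /* low byte */
    p[3] = (uint8_t)(value >> 8);                    /* high byte */
}

/* Recreates the listing for i = 3 without typing out every row. */
static void init_map(void)
{
    for (unsigned i = 0; i < N_ENTRIES; i++)
        set_entry(i, 0, 0, N_SYMBOLS);               /* filler rows: "not found" */
    set_entry(0, 0, 0, 1);                           /* length 1: symbols[1] */
    set_entry(1, '&', '<', 2);                       /* length 2: range, children at entry 2 */
    set_entry(2, 0, 0, 0);                           /* '&': symbols[0] */
    set_entry(24, '=', '>', 25);                     /* '<': range, children at entry 25 */
    set_entry(25, 0, 0, 2);                          /* '=': symbols[2] */
    set_entry(26, 0, 0, 3);                          /* '>': symbols[3] */
}

/* Returns the index into symbols[] for the token s of length len, or -1. */
static int lookup_symbol(const char *s, size_t len)
{
    if (len == 0 || len > MAX_LEN)
        return -1;

    unsigned idx = (unsigned)(len - 1);              /* root entries are indexed by length */
    size_t pos = 0;                                  /* characters consumed so far */

    for (;;) {
        const uint8_t *e = symbols_map + 4 * idx;
        unsigned value = e[2] | ((unsigned)e[3] << 8);

        if (e[0] == 0) {                             /* leaf or filler entry */
            if (value == N_SYMBOLS)
                return -1;                           /* "not found" marker */
            /* The walk only checked a prefix, so compare the remaining characters. */
            return memcmp(s + pos, symbols[value] + pos, len - pos) == 0
                       ? (int)value : -1;
        }
        if (pos == len)
            return -1;                               /* defensive: ran out of input */

        uint8_t c = (uint8_t)s[pos];
        if (c < e[0] || c > e[1])
            return -1;                               /* outside the child range */
        idx = value + (unsigned)(c - e[0]);          /* jump to the child entry */
        pos++;
    }
}
```

After calling init_map() once, lookup_symbol("<=", 2) walks entry 1, then entry 24, then entry 25, and returns 2. Each step is a range check plus an index computation, so a token of length n needs at most n steps and one final comparison.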

As you can see, the array representation wastes some space. With a bit more effort, some of it could be squeezed out.
