Exploring the adventures of SQL in MySQL source code

Source: Internet
Author: User

This article starts from the execution process of a select statement and traverses multiple subsystems of MySQL.

First, let's take a picture to start our adventure.

After the client connects to the MySQL server and sends a request, the server thread is blocked in the my_net_read function (that is, the read in the socket) in do_command (SQL/parse. cc ).

After the client sends an SQL statement (select * from zzz in this example) to the server, my_net_read returns and reads data from the tcpbuffer and writes it to the packet string.

 
 
  1. packet_length= my_net_read(net); 

The first byte of packet is a flag, which determines whether the data packet is a query, a command, a success, or an error.

Next, enter the dispatch_command (SQL/parse. cc) function. The data type is no longer required.

 
 
  1. return_value= dispatch_command(command, thd, packet+1, (uint) (packet_length-1)); 

Go to dispatch_command. We can see that

 
 
  1. statistic_increment(thd->status_var.questions, &LOCK_status); 

This is the value of show status questions.

Next to mysql_parse (SQL/SQL _parse.cc), this function is the total intersection of SQL statement parsing.

After entering this function, the first thing you encounter is query_cache_send_result_to_client. The original name of this function is to query whether there are the same statements in QCache. If yes, the results are immediately returned from QCache, so the entire SQL statement is over.

If the SQL statement does not exist in QCache, the system proceeds to parse_ SQL (SQL/SQL _parse.cc). This function mainly calls MYSQLparse. MYSQLparse is actually the yyparse (pai_^) in bison/yacc ),

 
 
  1. #define yyparse MYSQLparse 

Yes, I started to parse the SQL statement. Let's briefly talk about lexical analysis and syntax matching.

For a statement like select * from zzz, first enter the lexical analysis, find two token (select, from), and then perform syntax Matching Based on the token. The rule is in SQL/yacc. yy, I paste several matched pattern and action.

 
 
  1. select: 
  2. select_init 
  3. LEX *lex= Lex; 
  4. lex->sql_command= SQLCOM_SELECT; 
  5. /* Need select_init2 for subselects. */ 
  6. select_init: 
  7. SELECT_SYM select_init2 
  8. | '(' select_paren ')' union_opt 
  9. select_paren: 
  10. SELECT_SYM select_part2 
  11. LEX *lex= Lex; 
  12. SELECT_LEX * sel= lex->current_select; 
  13. ..... 
  14. select_from: 
  15. FROM join_table_list where_clause group_clause having_clause 
  16. opt_order_clause opt_limit_clause procedure_clause 
  17. Select->context.table_list= 
  18. Select->context.first_name_resolution_table= 
  19. (TABLE_LIST *) Select->table_list.first; 
  20. .... 
  21. select_item_list: 
  22. select_item_list ',' select_item 
  23. | select_item 
  24. | '*' 
  25. THD *thd= YYTHD; 
  26. Item *item= new (thd->mem_root) 
  27. Item_field(&thd->lex->current_select->context, 
  28. NULL, NULL, "*"); 
  29. if (item == NULL) 
  30. MYSQL_YYABORT; 
  31. if (add_item_to_list(thd, item)) 
  32. MYSQL_YYABORT; 
  33. (thd->lex->current_select->with_wild)++; 
  34. select_item: 
  35. remember_name select_item2 remember_end select_alias 
  36. THD *thd= YYTHD; 
  37. DBUG_ASSERT($1 < $3); 
  38. if (add_item_to_list(thd, $2)) 
  39. MYSQL_YYABORT; 
  40. if ($4.str) 
  41. ... 

We can see that the most important thing in the action is the assignment of add_item_to_list and table_list.

After parsing, information about the tables (zzz) and fields (*) to be queried is written to the thd-> lex struct.

For example, thd-> lex-> query_tables indicates the table (zzz), thd-> lex-> current_select-> with_wild indicates whether the statement is used *, and so on.

 
 
  1. (gdb) p *thd->lex->query_tables 
  2. $7 = {next_local = 0x0, next_global = 0x0, prev_global = 0x855a458, db = 0x85a16b8 "test", alias = 0x85a16e0 "zzz", 
  3. table_name = 0x85a16c0 "zzz", schema_table_name = 0x0, option = 0x0, on_expr = 0x0, prep_on_expr = 0x0, cond_equal = 0x0, 

After SQL Parsing is complete, what about optimization? Don't rush to look down.

Next, enter the mysql_execute_command function, which is the total entry of all SQL commands.

Because this SQL statement is a select statement, execute_sqlcom_select is the function to be executed, and then enters mysql_select (anger is so complicated ).

Mysql_select is the optimizer module. The code of this module is complicated. We can clearly see the three steps for creating, optimizing, and executing the optimizer. The optimization details are not listed.

 
 
  1. if (!(join= new JOIN(thd, fields, select_options, result))) 
  2. ... 
  3. if ((err= join->optimize())) 
  4. ... 
  5. join->exec(); 

After optimization, We need to execute join-> exec (). The function actually enters JOIN: exec () (SQL _select.cc ).

Exec () first sends the title function send_fields to the client, which has no data, but the field is also required.

Then enter do_select (), and jump to the specific implementation of the engine based on the table's storage engine (zzz is myisam table ).

Here, mi_scan is the function used by the myisam engine to scan files.

 
 
  1. (gdb) p info->filename 
  2. ./test/zzz 

Is this the physical file corresponding to the zzz table.

After accessing the disk with a series of mi functions, the system sends data to the client through send_data and returns the data from dispatch_command. Finally, the entire SQL statement is ended at net_end_statement.

The execution process behind a simple select statement is very complex, and the above steps only end at a point.

Ps: In SQL _yacc.yy, we can see that MySQL is mocking the commonly used dual tables in Oracle.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.