Optimized MySQL redo operation to the extreme three rounds (ii): Multithreading parallel execution

Source: Internet
Author: User
Tags rand rounds

The previous article has adjusted the single redo statement to the optimal, but the statement is executed in one-threaded fashion. Can the multi-processor be used to allow multi-threading to be executed in parallel to further improve speed? For example, my lab environment is a 4 processor, and if you use 4 threads to execute the SQL at the same time, it should theoretically be nearly 4 times times the performance boost.

First, data fragmentation
When we generate the test data, the Created_time takes one second per record, that is, the maximum and the minimum time difference is 500,000 seconds, and the data is evenly distributed. Therefore, the average data is divided into 4 parts first.

1. Find out the Created_time boundary value of 4 data

Select Date_add (' 2017-01-01 ', Interval 125000 second) dt1, Date_add       (' 2017-01-01 ', Interval 2*125000 second) DT2,       date_add (' 2017-01-01 ', Interval 3*125000 second) DT3,       Max (created_time) dt4 from  T_source;
The query results are shown in.


Figure A

2. View the number of records per copy and confirm the average distribution of the data
Select Case-Created_time >= ' 2017-01-01 ' and Created_time < ' 2017-01-02 10:43:20 ' then ' 2017-01-01 ' when Created_time >= ' 2017-01-02 10:43:20 ' and Created_time < ' 2017-01-03 21:26: 2017-01-02 10:43:20 ' when Created_time >= ' 2017-01-03 21:26:40 ' and create D_time < ' 2017-01-05 08:10:00 ' then ' 2017-01-03 21:26:40 ' Else ' 2017-01-05 08:10:00 ' end            Min_dt, case when Created_time >= ' 2017-01-01 ' and Created_time < ' 2017-01-02 10:43:20 ' Then ' 2017-01-02 10:43:20 ' when Created_time >= ' 2017-01-02 10:43:20 ' and Created_time < ' 2              017-01-03 21:26:40 ' Then ' 2017-01-03 21:26:40 ' when Created_time >= ' 2017-01-03 21:26:40 ' and Created_time < ' 2017-01-05 08:10:00 ' then ' 2017-01-05 08:10:00 ' Else ' 2017-01-06 18:53:  ' End Max_dt,     Count (*) from T_source GROUP by case when Created_time >= ' 2017-01-01 ' and Created_time < ' 2017-0 1-02 10:43:20 ' Then ' 2017-01-01 ' when Created_time >= ' 2017-01-02 10:43:20 ' and creat Ed_time < ' 2017-01-03 21:26:40 ' then ' 2017-01-02 10:43:20 ' when Created_time >= ' 2017-01-03 21 : 26:40 ' and Created_time < ' 2017-01-05 08:10:00 ' then ' 2017-01-03 21:26:40 ' Else ' 2 017-01-05 08:10:00 ' End, case when Created_time >= ' 2017-01-01 ' and Created_time < ' 2017-0             1-02 10:43:20 ' Then ' 2017-01-02 10:43:20 ' when Created_time >= ' 2017-01-02 10:43:20 ' and Created_time < ' 2017-01-03 21:26:40 ' then ' 2017-01-03 21:26:40 ' when Created_time >= ' 2017            -01-03 21:26:40 ' and Created_time < ' 2017-01-05 08:10:00 ' then ' 2017-01-05 08:10:00 ' Else ' 2017-01-06 18:53:"END; 

The results of query two show.


Figure II


The set of 4 copies of the data should cover the entire source dataset, and the data is not duplicated. That is to say, 4 data created_time to be continuous and mutually exclusive, continuous guarantee processing all data, mutual exclusion to ensure that no two check weight. This is actually similar to the concept of time-range partitioning, perhaps better with partitioned tables, except that the steps to rebuild the table are omitted.

3. Create a stored procedure for checking the weight
With the above information, we can write 4 statements to process all the data. In order to invoke the interface as simple as possible, establish the following stored procedure.
Delimiter//create procedure Sp_unique (i smallint) begin set @a:= ' 0000-00-00 00:00:00 ';  Set @b:= "; if (i<4) then insert to T_target SELECT * FROM T_source Force index (idx_sort) where create D_time >= date_add (' 2017-01-01 ', Interval (i-1) *125000 second) and Created_time < Date_add (' 2017-01-01 ', in            Terval i*125000 second) and (@a!=created_time or @b!=item_name) and (@a:=created_time) is not null          and (@b:=item_name) is not NULL for order by Created_time,item_name;    Commit Else INSERT INTO T_target SELECT * FROM T_source Force index (idx_sort) where Created_time >= date_a DD (' 2017-01-01 ', Interval (i-1) *125000 second) and Created_time <= date_add (' 2017-01-01 ', Interval i*125000 se COND) and (@a!=created_time or @b!=item_name) and (@a:=created_time) are NOT null and (@b: =item_name) is isn't null order by CREated_time,item_name;    Commit  End If;end//delimiter;

The execution plan for the query is shown in three.


Might


The MySQL optimizer makes an index range scan and uses the index condition push (ICP) to refine the query.

Second, parallel execution
The following separately uses the shell daemon process and the MySQL Schedule event to implement parallelism.

1. Shell Background Process

(1) The establishment of the duplicate_removal.sh file, the contents are as follows.
#!/bin/bashmysql-vvv-u root-p123456 test-e "truncate t_target" &>/dev/null date ' +%h:%m.%n ' for y in {1..4}do
   sql= "Call Sp_unique ($y)"  mysql-vvv-u root-p123456 test-e "$sql" &>par_sql1_$y.log &donewaitdate ' +%H :%m.%n '

(2) Execute script file

chmod 755 duplicate_removal.sh./duplicate_removal.sh

The execution output is shown in Figure four.


Figure Four

This method takes 3.4 seconds, and 4 procedure calls executed in parallel are taken as shown in five.


Figure Five

as you can see, the execution time of each procedure is less than 3.4 seconds, because it is executed in parallel, the total process execution time is less than 3.4 seconds, and it is nearly 3 times times faster than the single-thread SQL.

2. MySQL Schedule Event
Miss Wu also used the parallel, but he is using the MySQL Schedule event function implemented, the code should be similar to the following.

(1) Create an Event history log table

--Used to view information such as event execution time CREATE TABLE T_event_history  (     dbname  varchar) NOT null default ',     eventname  varchar (+) NOT null default ",     starttime  datetime (3) NOT null default ' 0000-00-00 00:00:00 ',     Endtime  datetime (3) default NULL,     issuccess  int (one) default NULL,     duration  int (one) default NULL,     errormessage  varchar (+) default NULL,     randno  Int (one-by-one) default null);  

(2) Modify Event_scheduler parameters

Set global Event_scheduler = 1;

(3) Create an event for each concurrent thread

Delimiter//create event EV1 on schedule @ current_timestamp + interval 1 hour on completion preserve disable do begin      Declare R_code char (5) Default ' 00000 ';      declare r_msg text;      declare v_error integer;      DECLARE v_starttime datetime default now (3);            Declare v_randno integer default floor (rand () *100001); Insert into T_event_history (Dbname,eventname,starttime,randno) #作业名 values (Database (), ' EV1 ', V_starttime,v_randno)             ;              Begin #异常处理段 Declare continue handler for SqlException begin set V_error = 1;          Get diagnostics Condition 1 R_code = returned_sqlstate, r_msg = Message_text;                    End      #此处为实际调用的用户程序过程 call Sp_unique (1);            End Update T_event_history set Endtime=now (3), Issuccess=isnull (V_error), Duration=timestampdiff (Microsecond,starttime, Now (3)), Errormessage=concat (' error= ', R_code, ', message= ', r_msg), Randno=null where Starttime=v_starttiMe and Randno=v_randno; end//Create event Ev2 on schedule @ current_timestamp + interval 1 hour on completion preserve disable do begin D      Eclare R_code char (5) Default ' 00000 ';      declare r_msg text;      declare v_error integer;      DECLARE v_starttime datetime default now (3);            Declare v_randno integer default floor (rand () *100001); Insert into T_event_history (Dbname,eventname,starttime,randno) #作业名 values (Database (), ' Ev2 ', V_starttime,v_randno)             ;              Begin #异常处理段 Declare continue handler for SqlException begin set V_error = 1;          Get diagnostics Condition 1 R_code = returned_sqlstate, r_msg = Message_text;                    End      #此处为实际调用的用户程序过程 call Sp_unique (2);            End Update T_event_history set Endtime=now (3), Issuccess=isnull (V_error), Duration=timestampdiff (Microsecond,starttime, Now (3)), Errormessage=concat (' error= ', R_code, ', message= ', r_msg), randno=null where Starttime=v_starttime and Randno=v_randno; end//Create event Ev3 on schedule @ current_timestamp + interval 1 hour on completion preserve disable do begin Decl      Is R_code char (5) Default ' 00000 ';      declare r_msg text;      declare v_error integer;      DECLARE v_starttime datetime default now (3);            Declare v_randno integer default floor (rand () *100001); Insert into T_event_history (Dbname,eventname,starttime,randno) #作业名 values (Database (), ' Ev3 ', V_starttime,v_randno)             ;              Begin #异常处理段 Declare continue handler for SqlException begin set V_error = 1;          Get diagnostics Condition 1 R_code = returned_sqlstate, r_msg = Message_text;                    End      #此处为实际调用的用户程序过程 Call Sp_unique (3);            End Update T_event_history set Endtime=now (3), Issuccess=isnull (V_error), Duration=timestampdiff (Microsecond,starttime, Now (3)), Errormessage=concat (' error= ', R_code, ', Message= ', r_msg), Randno=null where Starttime=v_starttime and Randno=v_randno; end//Create event ev4 on schedule @ current_timestamp + interval 1 hour on completion preserve disable do begin Decl      Is R_code char (5) Default ' 00000 ';      declare r_msg text;      declare v_error integer;      DECLARE v_starttime datetime default now (3);            Declare v_randno integer default floor (rand () *100001); Insert into T_event_history (Dbname,eventname,starttime,randno) #作业名 values (Database (), ' ev4 ', V_starttime,v_randno)             ;              Begin #异常处理段 Declare continue handler for SqlException begin set V_error = 1;          Get diagnostics Condition 1 R_code = returned_sqlstate, r_msg = Message_text;                    End      #此处为实际调用的用户程序过程 Call Sp_unique (4);            End Update T_event_history set Endtime=now (3), Issuccess=isnull (V_error), Duration=timestampdiff (Microsecond,starttime, Now (3)), errormessage=concat (' error= ', R_code, ', message= ', r_msg), Randno=null where Starttime=v_starttime and Randno=v_randno;    End//delimiter;

Note: In order to record the time of each event execution, the logic of the Operation log table has been added to the event definition because only one insert is executed in each event, and the effect of a update,4 event on a total of 8 very simple statements will be negligible. Execution time is accurate to milliseconds.

(4) Triggering event execution

Mysql-vvv-u root-p123456 test-e "Truncate T_target;alter event EV1 on schedule at Current_timestamp Enable;alter event Ev2 on schedule @ current_timestamp Enable;alter event ev3 on schedule @ current_timestamp Enable;alter event Ev4 on SC Hedule at Current_timestamp enable; "

Note: This command line sequence triggers 4 events, but does not wait until the previous one finishes executing the next, but executes immediately down. This can also be seen clearly from the output in figure six. So four procedure calls are executed in parallel.

Figure Six
(5) View event execution log
SELECT * from T_event_history;

The results of query 7 show.


Figure Seven


As you can see, each procedure executes for 3.5 seconds, and because it is executed in parallel, the total execution is also 3.5 seconds, and the optimization effect is almost the same as the shell daemon process.

Reference:
Increasing slow query performance with the parallel query execution
Mysql Event Scheduling History


Optimized MySQL redo operation to the extreme three rounds (ii): Multithreading parallel execution

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.