Using Perl for database migration from MSSQL to MySQL (III) -- V1.1 ~ Multithreading + HandlerSocket
Judging from the previous version, the program works, but it is too slow. The speed was still acceptable before about 2000 million records had been read and written (about 400 records/second), but after that it dropped off. Of course, this also depends on SQL Server read performance, the network, and the server itself, but I don't know whether anyone else who tested it could put up with this speed. I certainly couldn't. So I thought it over: a single thread is slow, so it has to change. Switch to multiple threads, multiple processes.
In my own tests, the multithreaded version is much faster.
Enough talk. Straight to the code.
    use DBI;
    use Switch;
    use strict;
    use Net::HandlerSocket;
    use threads;
    use Time::HiRes;

    # "..." marks code that is missing from this copy of the listing.
    ...    # connection settings, including $source_user_name, $source_user_psd and $aim_ip
    my $hs_port = 9999;

    my $dbh = DBI->connect(..., $source_user_name, $source_user_psd);
    # Get all user tables that have no geographic fields
    my $sth = $dbh->prepare("select name, object_id from sys.all_objects ao where type = 'U' and not exists (...)");
    $sth->execute();

    # Number of threads... this one is tricky. When I ran the import
    # against my server, it hung with more than five threads ~~~~
    my $threads_cnt = (not defined $ARGV[0]) ? 5 : $ARGV[0];
    # Number of records each thread handles per batch
    my $per_records = (not defined $ARGV[1]) ? 3000 : $ARGV[1];

    while (my @data = $sth->fetchrow_array())
    {
        my ($select_columns, $insert_columns, $column_count, $sort_column, $column_types);
        # Get the columns of a table and build the select list, the insert list,
        # the total number of columns and the column types.
        # Input parameters:
        ###   $data[0]: table name, $data[1]: object ID
        # Returned values:
        ###   $select_columns: column string used when building the SELECT statement
        ###   $insert_columns: column string used when building the insert. The two are
        ###                    kept separate because the SELECT may call a method on a
        ###                    column, for example geometry.STAsText()
        ###   $column_count:   number of columns. It can be obtained from @$column_types,
        ###                    but @$column_types was added later, so this parameter was
        ###                    never removed
        ###   $sort_column:    column used for sorting. In my experience the first column
        ###                    is usually the ID and the primary key, so only the first
        ###                    column is taken here
        ###   $column_types:   list of column types, an array. Some SQL Server types need
        ###                    special handling on the MySQL side, such as geometry
        ($select_columns, $insert_columns, $column_count, $sort_column, $column_types) = get_columns($data[0], $data[1]);
        # Result of the export: false if the import failed, otherwise empty
        my ... = export_data_in($select_columns, $insert_columns, $column_count, $sort_column, $data[0], $column_types);
    }
    $dbh->disconnect;

    sub export_data_in
    {
        my ($select_columns, $insert_columns, $columns_count, $sort_column, $table_name, $column_types) = @_;
        my ... = 0;
        my ... = DBI->connect(..., $source_user_name, $source_user_psd);
        my $sth_sc = ...;
        $sth_sc->execute();
        my @data_count = $sth_sc->fetchrow_array();
        my $begin_cnt = 0;
        my $end_cnt = $per_records - 1;
        while ($begin_cnt <= $data_count[0])
        {
            my @threads;
            for (my $count = 1; $count <= $threads_cnt; $count++)
            {
                # Basically, the following SQL statement has become the biggest
                # performance bottleneck of the program. In testing, the first rows
                # were fine, but further into the table the query performance of this
                # statement drops sharply. Of course, I was testing against a remote
                # server. (Also, my table has no partitions. If you have experience
                # with MSSQL optimization, the table can be partitioned.)
                my $sql_select = "select *
                    FROM
                    (
                        SELECT $select_columns, ROW_NUMBER() OVER (order by $sort_column) AS RowNum
                        FROM $table_name
                    ) as t
                    ...";
                my $res0 = threads->new(\&export_data, $table_name, $sql_select, $insert_columns, $columns_count, $column_types);
                push(@threads, $res0);
                $begin_cnt = $begin_cnt + $per_records;
                $end_cnt = $end_cnt + $per_records;
            }
            foreach (@threads)
            {
                $_->join;
            }
        }
    }

    sub export_data
    {
        my $startTime = time;
        my ($table_name, $sql_select, $insert_columns, $columns_count, $column_types) = @_;
        my $dbh_mssql = DBI->connect(..., $source_user_name, $source_user_psd);
        my $sth_select = $dbh_mssql->prepare($sql_select);
        $sth_select->execute();
        $sth_select->{LongTruncOk} = 1;
        my ... = rand(3200);
        my $data_str = "";
        # Switched to fetchrow_arrayref(); in my tests fetchrow_array() cannot
        # compete with it on speed
        while (my $select_data = $sth_select->fetchrow_arrayref())
        {
            if ($data_str ne "")
            {
                ...;
            }
            $data_str .= ...;    # built from @{$select_data}
        }
        printf(..., time - $startTime);    # read time
        $startTime = time;
        # Statement for viewing the data during testing.
        ...
        if ($data_str ne "")
        {
            ...;
            my $args = { host => $aim_ip, port => $hs_port };
            my $hs = new Net::HandlerSocket($args);
            my $res = $hs->open_index(...);
            die $hs->get_error() if $res != 0;
            $res = $hs->execute_multi(eval($data_str));
            die $hs->get_error() if $hs->get_error() != 0;
            $hs->close();
        };
        printf(..., time - $startTime);    # write time
        # Here are my results ^-^
        # exporting data t_p_areagroup_plate_userdiy_l; total: 42758121; now: 12825000
        # exporting data t_p_areagroup_plate_userdiy_l; total: 42758121; now: 12830000
        # exporting data t_p_areagroup_plate_userdiy_l; total: 42758121; now: 12835000
        # exporting data t_p_areagroup_plate_userdiy_l; total: 42758121; now: 12840000
        # exporting data t_p_areagroup_plate_userdiy_l; total: 42758121; now: 12845000
        # Read time 18.9 seconds.
        # write time 1.3 seconds.
        # Read time 23.3 seconds.
        # write time 1.4 seconds.
        # Read time 23.7 seconds.
        # write time 1.1 seconds.
        # Read time 25.6 seconds.
        # write time 0.6 seconds.
        # Read time 25.6 seconds.
        # write time 0.9 seconds.
    }

    sub get_columns
    {
        ...;
        my $sql = "select col.name, tp.name from sys.all_columns col
            inner join sys.types tp on col.system_type_id = tp.system_type_id and col.user_type_id = tp.user_type_id
            ...";
        my $dbh2 = DBI->connect(..., $source_user_name, $source_user_psd);
        my $cols = $dbh2->prepare($sql);
        $cols->execute();
        my $cols_select = "";
        my $cols_insert = "";
        my $cols_count = 0;
        my $sort_column = "";
        while (my @col = $cols->fetchrow_array())
        {
            my ($col_name, $type_name) = @col;
            $cols_types[$cols_count] = $type_name;
            if ($cols_count > 0)
            {
                ...;
            }
            else
            {
                ...;
            }
            if (...)
            {
                ...;
                ...;
            }
            else
            {
                ...;
                ...;
            }
            $cols_count++;
        }
        $dbh2->disconnect;
        return ($cols_select, $cols_insert, $cols_count, $sort_column, \@cols_types);
    }
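The filter that cuts the ROW_NUMBER() result into per-thread batches is not visible in the listing, so here is only a rough sketch of how such windows could be cut. It assumes an inclusive BETWEEN filter on RowNum with 1-based windows. The helper names batch_windows and paged_sql are made up for illustration, and the table and column names in the usage loop are placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper: split $total rows into inclusive, 1-based
# [begin, end] windows of at most $per_records rows each.
sub batch_windows {
    my ($total, $per_records) = @_;
    my @windows;
    for (my $begin = 1; $begin <= $total; $begin += $per_records) {
        my $end = $begin + $per_records - 1;
        $end = $total if $end > $total;    # last window may be short
        push @windows, [$begin, $end];
    }
    return @windows;
}

# Hypothetical helper: wrap the ROW_NUMBER() subquery from the listing
# and filter it on RowNum. The BETWEEN filter is an assumption.
sub paged_sql {
    my ($select_columns, $sort_column, $table_name, $begin, $end) = @_;
    return "SELECT * FROM ("
         . " SELECT $select_columns,"
         . " ROW_NUMBER() OVER (ORDER BY $sort_column) AS RowNum"
         . " FROM $table_name"
         . ") AS t WHERE t.RowNum BETWEEN $begin AND $end";
}

# 12 rows, 5 per batch: prints one paged query per window.
my @windows = batch_windows(12, 5);
for my $w (@windows) {
    print paged_sql('id, name', 'id', 'some_table', @$w), "\n";
}
```

Each window still has to number the rows up to its end, which would be consistent with the read times growing as the export moves deeper into the table.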
How to run it (the output goes to out.log):
nohup perl export_data_muti_thread_v0.5.pl 10 5000 > out.log &
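The two arguments are the thread count and the number of records per thread (the listing defaults to 5 and 3000). As a minimal sketch of how these two numbers drive the rounds of work, the snippet below starts up to that many workers per round and waits for all of them before starting the next round. The names worker and run_rounds are hypothetical, the worker only counts rows instead of touching a database, and the sequential fallback for a perl built without ithreads is my own addition.

```perl
use strict;
use warnings;
use Config;

# Same defaults as the listing: 5 threads, 3000 records per thread.
my $threads_cnt = defined $ARGV[0] ? $ARGV[0] : 5;
my $per_records = defined $ARGV[1] ? $ARGV[1] : 3000;

# Stand-in worker (hypothetical): a real one would read rows
# $begin..$end from SQL Server and write them to MySQL.
sub worker {
    my ($begin, $end) = @_;
    return $end - $begin + 1;    # number of rows "handled"
}

# Hypothetical driver: start up to $threads_cnt workers per round,
# wait for all of them, then move on to the next round.
sub run_rounds {
    my ($total, $threads_cnt, $per_records) = @_;
    # Use threads only if this perl was built with ithreads.
    my $use_threads = $Config{useithreads} && eval { require threads; 1 };
    my ($begin, $handled, $rounds) = (1, 0, 0);
    while ($begin <= $total) {
        my @jobs;
        for (1 .. $threads_cnt) {
            last if $begin > $total;
            my $end = $begin + $per_records - 1;
            $end = $total if $end > $total;
            push @jobs, $use_threads
                ? threads->create({ context => 'scalar' }, \&worker, $begin, $end)
                : worker($begin, $end);    # sequential fallback
            $begin = $end + 1;
        }
        # Collect each worker's row count (join blocks until it is done).
        $handled += $use_threads ? $_->join : $_ for @jobs;
        $rounds++;
    }
    return ($handled, $rounds);
}

my ($handled, $rounds) = run_rounds(42_000, $threads_cnt, $per_records);
print "handled $handled rows in $rounds rounds\n";
```

With the arguments from the command above (10 and 5000), each round would cover up to 50,000 rows.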
One more thing, off topic: posts on cnblogs don't get many replies. Even criticism is welcome. Don't let it be so quiet here.