Massive log warehousing

There are 10 log files under the log directory, each about 60 MB after compression; the files have a .gz suffix, e.g. a.gz, b.gz. Each line of a file is one record with fields such as id = 2112112, email = xxx@163.com, plus other columns.
Now I want to insert the content of every file in this directory into the database. The tables are sharded by email into about 1000 tables, log_1 through log_1000. I would appreciate a detailed solution: in particular, how can each file be loaded into the database in a short time, so the script runs efficiently?
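For reference, the email-to-table routing can be written as a small helper. This is a sketch: the crc32-modulo rule is taken from the script below, and note that `% 1000` actually yields log_0 through log_999, not log_1 through log_1000 as described above, so adjust to your real sharding rule.

```php
<?php
// Hypothetical shard router: map an email to one of the log_* tables.
// The crc32 % 1000 rule mirrors the script below; substitute your own rule.
function shard_table($email, $prefix = 'log_', $shards = 1000)
{
    // abs() guards against negative crc32 results on 32-bit builds
    return $prefix . (abs(crc32($email)) % $shards);
}

echo shard_table('xxx@163.com'), "\n";
```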
First, here is the current code:
error_reporting(E_ALL & ~E_NOTICE);

// Receive parameters
$mysql_host  = 'xx.xx';
$mysql_user  = 'xxx';
$mysql_pass  = 'xx';
$mysql_port  = 3306;
$mysql_db    = 'test';
$table_pre   = 'log_';
$gz_log_file = 'a.gz';

// Script execution log
$exec_log = '/data_log/record.txt';
file_put_contents($exec_log, '**************** START ****************' . "\r\n", FILE_APPEND);
file_put_contents($exec_log, 'param is mysql_host=' . $mysql_host . ' mysql_user=' . $mysql_user . ' mysql_pass=' . $mysql_pass . ' mysql_port=' . $mysql_port . ' mysql_db=' . $mysql_db . ' table_pre=' . $table_pre . ' gz_log_file=' . $gz_log_file . ' start_time=' . date('Y-m-d H:i:s') . "\r\n", FILE_APPEND);

// Helper referenced below but missing from the original post
function microtime_float()
{
    list($usec, $sec) = explode(' ', microtime());
    return (float) $usec + (float) $sec;
}

// Read logs into the database
$z_handle   = gzopen($gz_log_file, 'r');
$time_start = microtime_float();

// Connect to the database
$conn = mysql_connect("$mysql_host:$mysql_port", $mysql_user, $mysql_pass);
if (!$conn) {
    file_put_contents($exec_log, 'could not connect database, error=' . mysql_error() . "\r\n", FILE_APPEND);
    exit;
}
$select_db = mysql_select_db($mysql_db);
if (!$select_db) {
    file_put_contents($exec_log, 'select database error, database=' . $mysql_db . "\r\n", FILE_APPEND);
    exit;
}

while (!gzeof($z_handle)) {
    $each_gz_line  = gzgets($z_handle, 4096);
    $line_to_array = explode("\t", $each_gz_line);
    // Filter invalid logs
    if (!empty($line_to_array[3]) && !empty($line_to_array[2]) && !empty($line_to_array[4])) {
        // Pick the shard table first, then build the SQL
        // (the original built the SQL before $table_name was assigned)
        $table_id   = abs(crc32($line_to_array[2]) % 1000);
        $table_name = $table_pre . $table_id;
        $insert_value = "('" . $line_to_array[3] . "','" . $line_to_array[2] . "','" . $line_to_array[1] . "','" . $line_to_array[4] . "','" . $line_to_array[0] . "')";
        // NOTE: the original listed only (uid, email, ip, ctime) for five
        // values; 'other' is a placeholder fifth column -- match your schema
        $insert_sql = "insert into $table_name (uid, email, ip, ctime, other) values $insert_value";
        $result = mysql_query($insert_sql);
        if (!$result) {
            // Record a log entry when an insert fails
            file_put_contents($exec_log, 'insert error, table_name=' . $table_name . ' email=' . $line_to_array[2] . "\r\n", FILE_APPEND);
        }
    }
}

$time_end = microtime_float();
$diff = $time_end - $time_start;
file_put_contents($exec_log, 'success to insert database, log_file is ' . $gz_log_file . ', time-consuming=' . $diff . 's' . "\r\n", FILE_APPEND);
file_put_contents($exec_log, '****************************************' . "\r\n", FILE_APPEND);
gzclose($z_handle);
The code above is intolerably slow. Could anyone help me improve the efficiency of this massive-log analysis and import script?
------ Solution --------------------
Change the table type to InnoDB, then do the inserts inside transactions, batching many rows per statement.
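A minimal sketch of the batching idea: accumulate rows per shard table and flush them as one multi-row INSERT inside a transaction, instead of one round-trip per line. `build_batch_sql()` is a hypothetical helper, and the column list (uid, email, ip, ctime, other) is an assumption -- match it to your schema.

```php
<?php
// Build one multi-row INSERT for a batch of rows destined for one shard table.
// Escaping via addslashes() is illustrative only; prepared statements or
// mysqli::real_escape_string are safer in production.
function build_batch_sql($table_name, array $rows)
{
    $values = array();
    foreach ($rows as $r) {
        $values[] = "('" . implode("','", array_map('addslashes', $r)) . "')";
    }
    return "INSERT INTO $table_name (uid, email, ip, ctime, other) VALUES "
         . implode(',', $values);
}

// Usage sketch against mysqli (untested, assumes an InnoDB table):
//   $db->begin_transaction();
//   foreach ($pending as $table => $rows) {
//       if (count($rows) >= 1000) {
//           $db->query(build_batch_sql($table, $rows));
//           $pending[$table] = array();
//       }
//   }
//   $db->commit();
```

Batching cuts both the per-statement network round-trips and, with InnoDB transactions, the per-row fsync that makes autocommit single-row inserts so slow.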
If that is still not fast enough, use LOAD DATA INFILE.
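One way to apply LOAD DATA here, sketched under the assumptions of the original script (tab-separated lines, email in column 2, crc32 routing): stream each .gz once, split the lines into one TSV file per shard table, then load each file with a single LOAD DATA statement. `split_gz_by_shard()` and the output paths are hypothetical.

```php
<?php
// Split a gzipped log into one TSV file per shard table, so each table can
// be loaded with a single LOAD DATA statement afterwards.
function split_gz_by_shard($gz_file, $out_dir, $prefix = 'log_', $shards = 1000)
{
    $handles = array();
    $z = gzopen($gz_file, 'r');
    while (!gzeof($z)) {
        $line = gzgets($z, 4096);
        $cols = explode("\t", rtrim($line, "\r\n"));
        if (empty($cols[2])) {
            continue;                       // email column, as in the script
        }
        $table = $prefix . (abs(crc32($cols[2])) % $shards);
        if (!isset($handles[$table])) {
            $handles[$table] = fopen("$out_dir/$table.tsv", 'w');
        }
        fwrite($handles[$table], implode("\t", $cols) . "\n");
    }
    gzclose($z);
    foreach ($handles as $h) {
        fclose($h);
    }
    return array_keys($handles);            // shard tables that received rows
}

// Then, for each returned table (column list is an assumption):
//   LOAD DATA LOCAL INFILE '/tmp/log_7.tsv' INTO TABLE log_7
//   FIELDS TERMINATED BY '\t' (uid, email, ip, ctime, other);
```

LOAD DATA bypasses per-row SQL parsing entirely and is typically the fastest bulk-load path MySQL offers; the split pass costs one extra read of the data but turns 1000 shard tables into 1000 bulk loads instead of millions of single-row inserts.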