Importing massive logs into a database

Source: Internet
Author: User
There are 10 log files in the log directory. Each file is about 60 MB after compression and has the suffix .gz, for example a.gz and b.gz. Each file holds one record per line, in this form:
id = 2112112, email = xxx@163.com, etc.
id = 2112112, email = xxx@163.com, etc.
id = 2112112, email = xxx@163.com, etc.

Now we want to insert the full contents of every file in this directory into the database. The database tables are sharded by email into roughly 1000 tables: log_1, log_2, and so on up to log_1000. I would like a detailed solution. For example, how can we load each file into the database in a short time and make the script more efficient?
Here is my current code:

<?php
error_reporting(E_ALL & ~E_NOTICE);

// Receive parameters
$mysql_host  = 'XX.XX';
$mysql_user  = 'XXX';
$mysql_pass  = 'XX';
$mysql_port  = 3306;
$mysql_db    = 'test';
$table_pre   = 'log_';
$gz_log_file = 'a.gz';

// Script execution log
$exec_log = '/data_log/record.txt';
file_put_contents($exec_log, '***************************************** START ***********************************' . "\r\n", FILE_APPEND);
file_put_contents($exec_log, 'param is mysql_host=' . $mysql_host . ' mysql_user=' . $mysql_user . ' mysql_pass=' . $mysql_pass . ' mysql_port=' . $mysql_port . ' mysql_db=' . $mysql_db . ' table_pre=' . $table_pre . ' gz_log_file=' . $gz_log_file . ' start_time=' . date("Y-m-d H:i:s") . "\r\n", FILE_APPEND);

// Read logs into the database
$z_handle = gzopen($gz_log_file, 'r');
$time_start = microtime(true);
$mysql_value_ary = array();

// Connect to the database
$conn = mysql_connect("$mysql_host:$mysql_port", $mysql_user, $mysql_pass);
if (!$conn) {
    file_put_contents($exec_log, 'could not connect database error, error=' . mysql_error() . "\r\n", FILE_APPEND);
    exit;
}
$selec_db = mysql_select_db($mysql_db);
if (!$selec_db) {
    file_put_contents($exec_log, 'select database error, database=' . $mysql_db . "\r\n", FILE_APPEND);
    exit;
}

while (!gzeof($z_handle)) {
    $each_gz_line = gzgets($z_handle, 4096);
    $line_to_array = explode("\t", $each_gz_line);
    // Filter invalid logs
    if (!empty($line_to_array[3]) && !empty($line_to_array[2]) && !empty($line_to_array[4])) {
        // Pick the shard table first, because the INSERT statement needs $table_name
        $table_id = abs(crc32($line_to_array[2]) % 1000);
        $table_name = $table_pre . $table_id;
        $insert_value = "('" . $line_to_array[3] . "','" . $line_to_array[2] . "','" . $line_to_array[1] . "','" . $line_to_array[4] . "','" . $line_to_array[0] . "')";
        // Note: five values are supplied for four listed columns; the list must match the real table schema
        $insert_sql = "insert into $table_name (uid, email, ip, ctime) values $insert_value";
        $result = mysql_query($insert_sql);
        if (!$result) {
            // Record the row in the log if the insert fails
            file_put_contents($exec_log, 'table_name=' . $table_name . ' email=' . $line_to_array[2] . "\r\n", FILE_APPEND);
        }
    }
}

$time_end = microtime(true);
$diff = $time_end - $time_start;
file_put_contents($exec_log, 'success to insert database, log_file is ' . $gz_log_file . ' time-consuming is=' . $diff . "s\r\n", FILE_APPEND);
file_put_contents($exec_log, '*****************************************************************************' . "\r\n", FILE_APPEND);
gzclose($z_handle);

The code above is intolerably slow. Could you please help me import these massive logs more efficiently?
------ Solution --------------------
Change the table engine to InnoDB and wrap the inserts in transactions.
If that is still not fast enough, use LOAD DATA INFILE.
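To make the suggestion above concrete, here is a minimal sketch of one way it could look: rows are grouped by shard table, each group is sent as a few multi-row INSERT statements, and the whole batch is wrapped in a single transaction. The function names, the batch size of 500, and the two-column layout in the usage lines are assumptions for illustration only and do not come from the original post. The sketch only builds the SQL strings; it assumes the values have already been escaped and that each statement is then sent over an existing connection. Switching a table to InnoDB beforehand can be done with a statement of the form ALTER TABLE log_1 ENGINE=InnoDB.

```php
<?php
// Hypothetical helpers; names, batch size and column layout are assumptions.

/** Map an email to its shard table, using the same crc32 rule as the question's script. */
function shard_table(string $email, string $prefix = 'log_', int $shards = 1000): string
{
    return $prefix . abs(crc32($email) % $shards);
}

/** Build one multi-row INSERT. Every value in $rows must already be escaped. */
function build_batch_insert(string $table, array $columns, array $rows): string
{
    $tuples = array();
    foreach ($rows as $row) {
        $tuples[] = "('" . implode("','", $row) . "')";
    }
    return 'INSERT INTO ' . $table . ' (' . implode(', ', $columns) . ') VALUES ' . implode(', ', $tuples);
}

/** Group parsed log rows by shard table; $emailIndex is the position of the email field. */
function group_by_shard(array $rows, int $emailIndex): array
{
    $groups = array();
    foreach ($rows as $row) {
        $groups[shard_table($row[$emailIndex])][] = $row;
    }
    return $groups;
}

/** Return the statements for one transaction: START, batched INSERTs per shard, COMMIT. */
function build_transaction(array $rows, array $columns, int $emailIndex, int $batchSize = 500): array
{
    $statements = array('START TRANSACTION');
    foreach (group_by_shard($rows, $emailIndex) as $table => $tableRows) {
        foreach (array_chunk($tableRows, $batchSize) as $chunk) {
            $statements[] = build_batch_insert($table, $columns, $chunk);
        }
    }
    $statements[] = 'COMMIT';
    return $statements;
}

// Usage: buffer a few thousand parsed lines, then run each statement in order.
$rows = array(
    array('1', 'a@163.com'),
    array('2', 'a@163.com'),
    array('3', 'b@163.com'),
);
foreach (build_transaction($rows, array('uid', 'email'), 1) as $sql) {
    echo $sql, "\n"; // in the real script, send $sql to the database instead
}
```

The gain comes from replacing one round trip and one commit per log line with one round trip per batch and one commit per buffer. For the LOAD DATA INFILE route, the same grouping step could instead write one tab-separated file per shard table and load each file in a single statement.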
