Analysis of apache access log access. log and how to import it to mysql Apache
BitsCN.com
Analysis of apache access log access. log and how to import it to mysql
Access. log import mysqlload data local inf
First, let's explain the apache access log,
220.181.108.92--[22/Aug/2013: 23: 59: 59 + 0800] "GET/min/f =/media/js/jquery-1.7.2.media/js/global_interaction.js & 201308221836 HTTP/1.1" 200 70550 "http://adfdfs.com.cn1071-2130-1.html" "Mozilla/5.0 (X11; u; Linux x86_64; en-US; rv: 1.9) Gecko Minefield/3.0"
The first information is the address of the remote host, that is, it indicates who is accessing the website.
The second item is blank and replaced with a "-" placeholder. In fact, this is the case most of the time. This location is used to record the visitor's identity. it is not just the visitor's login name, but the viewer's email address or other unique identifier, usually just "-";
The third item is blank and replaced by a placeholder. This location is used to record the name provided by the viewer for identity authentication. Of course, this information will not be blank if users are required to perform authentication on some content of the website. However, for most websites, this field is still blank in most records of log files.
The fourth item indicates the access time record of the visitor, which does not need to be explained. The last "+ 800" of the time information indicates that the time zone of the server is 8 hours after UTC.
The fifth item is the most useful information in the entire log. it tells us what kind of request the server receives.
Item 6: 200 status code
Seventh, 70500 is the total number of bytes sent by the server to the client.
Item 8: The Directory or URL of the customer when making the request.
Item 9: client details
Create a table as needed
Create table if not exists 'Apache _ log '(
'Id' int (11) not null AUTO_INCREMENT COMMENT 'id ',
'Hostip' char (32) not null default '0. 0.0.0 'COMMENT 'remote host IP address ',
'Marking' char (30) comment' viewer identifi ',
'Client _ name' char (30) COMMENT 'viewer name ',
'Addtime' char (64) not null comment' browsing time ',
'Time _ zone 'char (8) not null comment 'timezone ',
'Method' char (8) not null comment' method ',
'Resource' char (64) not null comment 'requested resource ',
'Protocol' char (8) comment' request protocol ',
'Status' int (5) comment' status code ',
'Bytes 'int (5) comment' total number of bytes sent to the client ',
'Referer' char (128) Comment' The directory or URL of the customer when making the request ',
'Client _ info' char (128) COMMENT 'directory or URL of the customer when making the request ',
Primary key ('id ')
) ENGINE = MyISAM default charset = utf8 COLLATE = utf8_bin;
When awk is used to process apache logs and separate each item with spaces, you will be confused by the following numbers. What is this. First, awk '{print 0. a value of 0 is written in each line of apache log text. The reason is that the first field in the table structure is an auto-incrementing column. I only need to import 0, and 0 indicates the default value, it can automatically increase.
Cat apache_log | awk '{print 0 "" $1 "" $2 "" $3 "" for (I = 4; I <= NF; I ++) printf $ I; printf "/n"} '> apache_log.txt
Mysql> load data infile "/root/apache_log.txt" into table apache_log fields terminated by "" lines terminated by '/N ';
ERROR 13 (HY000): Can't get stat of '/root/apache_log.txt' (Errcode: 13-Permission denied)
Mysql> load data local infile "/root/apache_log.txt" into table apache_log fields terminated by "" lines terminated by '/N ';
Query OK, 2000 rows affected, 2672 warnings (0.61 sec)
Records: 2000 Deleted: 0 Skipped: 0 Warnings: 2672
By the way, compare the differences between load data infile and load data local infile
If you do not provide a local address, the server locates it as follows:
1) If your filename is an absolute path, the server will start searching for the file from the root directory.
2) If your filename is a relative path, the server will start searching for the file from the database data directory.
If you provide local, the file will be located as follows:
1) If your filename is an absolute path, the client will start searching for the file from the root directory.
2) If your filename is a relative path, the client will start searching for the file from the current directory.
Mysql> desc apache_log;
+ -------------- + ----------- + ------ + ----- + --------- + ---------------- +
| Field | Type | Null | Key | Default | Extra |
+ -------------- + ----------- + ------ + ----- + --------- + ---------------- +
| Id | int (11) | NO | PRI | NULL | auto_increment |
| Hostip | char (32) | NO | 0.0.0.0 |
| Marking | char (30) | YES | NULL |
| Browser_name | char (30) | YES | NULL |
| Addtime | char (64) | NO | NULL |
| Time_zone | char (8) | NO | NULL |
| Method | char (8) | NO | NULL |
| Resource | char (64) | NO | NULL |
| Protocol | char (8) | YES | NULL |
| Status | int (5) | YES | NULL |
| Bytes | int (5) | YES | NULL |
| Refer | char (128) | YES | NULL |
| Client_info | char (128) | YES | NULL |
+ -------------- + ----------- + ------ + ----- + --------- + ---------------- +
13 rows in set (0.00 sec)
Mysql> select * from apache_log limit 3;
+ ---- + ----------------- + --------- + -------------- + --------------------- + ----------- + -------- + ---------- + -------- + ------- + Accept +
| Id | hostip | marking | browser_name | addtime | time_zone | method | resource | protocol | status | bytes | refer | client_info |
+ ---- + ----------------- + --------- + -------------- + --------------------- + ----------- + -------- + ---------- + -------- + ------- + Accept +
| 1 | 220.181.108.116 |-| [21/Aug/2013: 23: 59: 59 | + 0800] | "GET |/search/prj/tag/2089-2124-2350-2354-1.html | HTTP/1.1 | 200 | 8181 |"-"|" Mozilla/5.0 (compatible; baiduspider/2.0; + http://www.baidu.com/search/spider.html) "|
| 2 | 220.181.165.5 |-| [21/Aug/2013: 23: 59: 59 | + 0800] | "GET |/event/bushanping/admin_login.php | HTTP/1.1 | 200 | 32095 |"-"|" Mozilla/5.0 (Windows; U; WindowsNT5.1; zh-CN; rv: 1.9.1.1) Gecko/20090715 Firefox/3.5.1 "|
| 3 | 220.181.165.136 |-| [21/Aug/2013: 23: 59: 59 | + 0800] | "GET |/search/diy/server-status/| HTTP/1.1 | 404 | 21795 |"-"|" Mozilla/5.0 (Windows; U; windowsNT5.1; zh-CN; rv: 1.9.1.1) Gecko/20090715 Firefox/3.5.1 "|
+ ---- + ----------------- + --------- + -------------- + --------------------- + ----------- + -------- + ---------- + -------- + ------- + Accept +
3 rows in set (0.00 sec)
Imported
BitsCN.com