Use the Hive regular parser RegexSerDe to analyze nginx logs and regexserdenginx
1. Environment:
Hadoop-2.6.0 apache-hive-1.2.0-bin
2. Use Hive to analyze nginx logs. The website access logs are as follows:
Cat/home/hadoop/hivetestdata/nginx.txt
192.168.1.128--[09/Jan/2015: 12: 38: 08 + 0800] "GET/avatar/helloworld.png HTTP/1.1" 200 1521 "http://write.blog.csdn
PHP. The FastCGI is a development improvement from CGI. The main disadvantage of the traditional CGI interface approach is poor performance, because every time the HTTP server encounters a dynamic program, it needs to restart the script parser to perform the parsing, and the result is returned to the HTTP server. This is almost not available when handling high concurrent access. In addition, the traditional CGI interface is also very poor security,
Recently, in the company's Nginx log analysis, one of the requirements is to extract the daily access to the TOP10 page this month, and its number of visits.
To do this, the first thing to do is to clean up effective page access. I use the exclusion method to remove the. js. css and other access. But initially, I was not able to fully understand the request to remove the suffix.
Cleaning, sampling, cleaning
Zabbix as a monitoring software is very flexible, supporting the data types are very rich, such as digital (no plus or minus), digital (floating-point), log, text and so on. All we need to do is use the script to collect the data, then Zabbix collect and draw, and set the alarm line. Here we learn to use Zabbix to monitor memcached, PHP-FPM, Tomcat, Nginx, MySQL, and web logs. memcached MonitoringCustom Key
This article mainly records the Nginx log statistics process given to Python, the main reason is the recent system is often unknown program crazy crawl data, although did a anti-crawling mechanism, but still have to find out which IP access times more. The idea is to find these IP rankings by analyzing the ngxin logs. The procedures for specific scenarios include
on the path, assuming that the user name and user group set by the nginx USR directive are www, and the logs directory's username and group is root, then the log file cannot be created;
Nginx Log Cutting script
#!/usr/bin/env python
import datetime,os,sys,shutil
1. Edit the shell program for log cutting and customize the Directory
[Plain] view plaincopy
# Vi/data/nginx/cut_nginx_log.sh
Enter the code:
[Python] view plaincopy
#! /Bin/bash
# This script run at 00:00
Function cutAccess ()
{
Dir = $1
Newdir = "$ {dir}/$ (date-d" yesterday "+" % Y ")/$ (date-d" yesterday "+" % m ")"
Suffix = $ (date-d "yesterday
If a 502 Bad GateWay error occurs in nginx, the program may fail or the response may be slow. However, sometimes Nginx itself has problems. For example, after nginx is restarted, all accesses are 502 errors. The error log contains a large number of no live upstream logs.We used to misuse the check upstream module. Let&
entity label. To learn more about this, you can refer to Using NGINX as an application Gateway with Uwsgi and Django.The following code configures NGINX, which is used to cache various types of static files, including JPEG files, GIF, PNG files, MP4 video files, PPT, etc. (using your URL instead of www.example.com):server { # substitute your web server‘s URL for "www.example.com" server_name www.exam
The interview will be quilt cover to the problem is: to give the Web server access logs, please write a script to statistics access to the top 10 of the IP? What are the top 10 requests for access? When you have a taste of goaccess, you understand that these problems, in addition to testing your script memorization ability, the only role is to install a or C.For nginx log analysis, there are many tools to m
Reference address
http://www.ttlsa.com/nginx/nginx-modules-ngxtop-ttlsa/
Ngxtop Real-time parsing nginx access log, and the processing results output to the terminal, functions similar to the system command top, so the software named Ngxtop. With Ngxtop, you can immediately understand the current
: This article mainly introduces [log analysis] to extract valid requesturi from nginx logs. For more information about PHP tutorials, see. Recently, I am working on the company's nginx log analysis. one of the requirements is to extract the top 10 pages visited every day this month and their visits.
To meet this r
; ')For data in Cur.fetchall ():Print dataV. Installation of Djangowget https://www.djangoproject.com/m/releases/1.5/Django-1.5.4.tar.gzTar xzvf django-1.5.4.tar.gzCD Django-1.5.4Python setup.py InstallTest your Django InstallationEnter Python in the terminal, enter Python interactive mode and enter as follows:>>>import DjangoDjango. VERSIONThe version number is displayed in normal conditions.Six,
Nginx log cutting and recording Cookies [python] www.2cto.com #! /Bin/bash # log file storage directory logs_path = "/data/Service/nginx/logs" # Name of the log file. Separate multiple log
visitor, and the number of visitors (00:00-24:00) within a day. Access to the same cookie within one day is calculated 1 times.Standalone IPThe same IP address is computed only once in the 00:00-24:00Second, nginx configurationVersionNginx version:nginx/1.10.2Log Configuration entryAccess_log/var/log/access.log access;Log format
Log_format access
writing (or such) F:/python25/
Test both writing methods.
Use this section to replace CONF/nginx. conf and back up the original file.# User nobody;
Worker_processes 1;
# Error_log logs/error. log;
# Error_log logs/error. Log notice;
# Error_log logs/error. Log Info;
# PID logs/n
Be able to monitor the client link response time and response return code in the log, monitor the number of 50.100.1000 over 10 seconds, monitor the average response time, and monitor the number of response return codes, respectively, 302.404.502Nginx Log Format:
| Log_format upstream2 ' $remote _addr $remote _user [$time _local] "$request" $http _host '' $body _bytes_sent ' $http _referer ' "$http _user_a
, nginx, uwsgi, and django are successfully connected.
6. Optimize uwsgi startupWhen starting the uwsgi service, we use the command line method to start and set parameters. This is not easy to remember and may forget the parameters. Here we will introduce another way to set parameters, that is, using the configuration file to record uwsgi parameters, loading parameters from the configuration file at startup. The parameters are as follows:
# WHPAIWecha
The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion;
products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the
content of the page makes you feel confusing, please write us an email, we will handle the problem
within 5 days after receiving your email.
If you find any instances of plagiarism from the community, please send an email to:
info-contact@alibabacloud.com
and provide relevant evidence. A staff member will contact you within 5 working days.