In a statistics project, the hardest part to implement is collecting the log data. The logs are spread across server rooms all over the country, and the data volume is fairly large, so an approach like rsync+inotify clearly cannot meet the requirement of fast log synchronization. Of course, we could use Fluentd or Flume to collect the log data, but beyond those we can also write a simple system of our own.
The workflow of the log analysis system I wrote is:
1. The client collects the data and sends it to the server via Redis PUBLISH.
2. The server side is a Redis subscriber: it aggregates the received data into a file, or filters it first (a minimal subscriber sketch follows this list).
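A minimal sketch of the server side, assuming Redis is already running and reachable at 127.0.0.1:6379; the channel name logs and the output path are placeholders chosen here for illustration. When its output is piped, redis-cli prints each message as three raw lines (the word message, the channel name, the payload), so we keep every third line after the initial subscribe confirmation:

#!/bin/bash
# server side: subscribe to the "logs" channel and append each payload to a file;
# assumes every published payload is a single line, since redis-cli emits three
# raw lines per message and the payload is the third one
redis-cli -h 127.0.0.1 -p 6379 subscribe logs \
    | awk 'NR > 3 && NR % 3 == 0' >> /data/logs/collected.log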
The client collects the newly appended log data:
#!/bin/bash
date=$(date +%s)
logfile=$1
if [ ! -f ${logfile} ];then
    echo "log file was not given or is not a file"
    exit 1
fi
sleep_time="2"
count_init=$(wc -l ${logfile} | awk '{print $1}')
while true
do
    date_new=$(date +%s)
    # date=$(date +%s)
    count_new=$(wc -l ${logfile} | awk '{print $1}')
    add_count=$((${count_new}-${count_init}))
    count_init=${count_new}
    if [ ! -n "${add_count}" ]
    then
        add_count=0
    fi
    qps=$((${add_count}/${sleep_time}))
    info=$(tail -n ${add_count} ${logfile})
    echo "$info"
    # here we can pass ${info} on to the server, e.g. via Redis PUBLISH (see the sketch below)
    echo "the QPS at $(date -d "1970-01-01 UTC ${date_new} seconds" +"%Y-%m-%d %H:%M:%S") is ${qps}"
    # echo "date_new:" ${date_new} "date_plus:" ${date_plus}
    sleep ${sleep_time}
done
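Where the loop above only echoes ${info}, the client can publish it to the server at this point instead. A hedged sketch, assuming redis-cli is installed on the client and that the hostname log-server.example.com and the channel name logs are placeholders that must match the subscriber shown earlier; publishing line by line keeps each payload on a single line, which is what the subscriber's awk filter expects:

# inside the while loop, instead of the plain echo:
echo "${info}" | while IFS= read -r line; do
    # one PUBLISH per log line; discard redis-cli's reply (the receiver count)
    redis-cli -h log-server.example.com -p 6379 publish logs "$line" > /dev/null
done

The watcher script itself takes the log file as its only argument, for example (script name chosen here): ./log_watch.sh /var/log/nginx/access.log. It then prints the newly appended lines and the QPS every two seconds.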