When using Logstash, I wrote some regular expressions (grok patterns) to split logs into finer-grained fields. They are used like this:
input {
  file {
    type => "billin"
    path => "/data/logs/product/result.log"
  }
}
filter {
  grok {
    type => "billin"
    pattern => "%{BILLINCENTER}"
    patterns_dir => "/data/logstash/patterns/my_patterns"
  }
}
output {
  redis {
    host => "192.168.50.13"
    data_type => "list"
    key => "logstash:redis"
  }
}
The contents of the pattern file are shown below (cat my_patterns):
TAB \t
META \-+
WZ ([^ ]*)
IPPORT %{IP}:%{POSINT}|%{META}
REQUEST (?:/[A-Za-z0-9$.+!*'(),~:#%_-]*)+\?[A-Za-z0-9$.+!*'(),~#%&/=:;_-]*
TY (?:(?<!\\)(?:"(?:\\.|[^\\"]+)*"
#EVERYURL ((\w+://)?([^\.]+)(\.[^/:]+)(:\d*)?([^#]*))|-
#EVERYURL ((\w+://)?([^\.]+)(\.[^/:]+)?([^#]*))+)|(\w+)|-
#EVERYURL ((\w+://)?([^\.]+)(\.[^/:]+)?([^#]*))+)|-
EVERYURL (http://+[\w\d:#@%/;$()~_?\+-=\\\.&]+)|(-)

#Logformat
#######
#nginx access log example########
#122.137.199.113 "122.137.199.113" www.xxxx.com 172.16.10.110 172.16.12.114:80 18/Jun/2013:15:51:03 +0800 GET /g/getsalecounts.do?rnd=1371541857448&showstatus=True&goodsids=215abd2e8fa95bc8 HTTP/1.1 200 78 "http://www.xxxx.com/goods-215abd2e8fa95bc8.html" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; NP06)" "a8fdb711-a695-43bd-abdd-a224fb07350d"
###############################
NGINXACCESSLOG %{IP:remote_ip}%{SPACE}%{QS:x_forward}%{SPACE}%{HOSTNAME:server_name}%{SPACE}%{IP:server_ip}%{SPACE}%{IPPORT:upstrem_ip}%{SPACE}%{HTTPDATE:timestamp}%{SPACE}%{WORD:verb}%{SPACE}%{WZ:request}%{SPACE}HTTP/%{NUMBER:httpversion}%{SPACE}%{NUMBER:response}%{SPACE}%{NUMBER:bytes}%{SPACE}%{QS:uri}%{SPACE}%{QS:agent}%{SPACE}%{QS:guid}

#picture p0.xxxx.com access log. 2012.07.19 add
PICLOG %{IP:remote_ip}%{SPACE}%{QS:x_forward}%{SPACE}%{HOSTNAME:server_name}%{SPACE}%{IP:server_ip}%{SPACE}%{HTTPDATE:timestamp}%{SPACE}%{WORD:verb}%{SPACE}%{WZ:request}%{SPACE}HTTP/%{NUMBER:httpversion}%{SPACE}%{NUMBER:response}%{SPACE}%{NUMBER:bytes}%{SPACE}%{QS:uri}%{SPACE}%{QS:agent}

#iis log format 20120618 add
##########
#iis log example###############
#2013-06-18 08:00:00 172.16.10.233 GET /js/functions.js - 80 - 117.136.34.2 Mozilla/5.0+(Linux;+U;+Android+4.1.2;+zh-cn;+LT22i+Build/6.2.A.0.400)+AppleWebKit/534.31+(KHTML,+like+Gecko)+UCBrowser/9.0.1.275+U3/0.8.0+Mobile+Safari/534.31 200 0 0 0
###################################
IISLOG %{DATE_EU:log_date} %{TIME:log_time} %{IP:server_ip} %{WORD:verb} %{URIPATH:uri_stem} %{WZ:uri_query} %{POSINT:s_port} %{WZ:cs_username} %{IP:c_ip} %{WZ:agent} %{POSINT:request} %{POSINT:substatus} %{POSINT:win32_status} %{POSINT:time_taken}

#2012/07/12 add
ZW \w+
##java date example
# 2012-11-27 14:52:42
############
JAVA_DATE %{DATE_EU} %{TIME}
EARTHLOG \[%{JAVA_DATE:log_date}\] \[%{WORD:level}\] \[%{WORD:action}\] \[\{"desc":"%{ZW:desc}","DateTime":%{ZW:datetime},"UserId":"%{ZW:userid}","code":%{ZW:code}\}\]
EAGLEUPDATE \[%{JAVA_DATE:log_date}\] \[%{WORD:level}\] \[%{WORD:action}\] \[\{"desc":%{QS:desc},"DateTime":%{ZW:datetime},"UserId":"%{ZW:userid}","Code":%{ZW:code},"OrderId":"%{ZW:orderid}"\}\]
EAGLELOGIN \[%{JAVA_DATE:log_date}\] \[%{WORD:level}\] \[%{WORD:action}\] \[\{"desc":%{QS:desc},"DateTime":%{ZW:datetime},"UserId":"%{ZW:userid}","Code":%{ZW:code}\}\]

#2012/10/23 add
LJF (-\s+-)
RESINLOG %{IP:remote_ip}%{SPACE}%{NUMBER}%{SPACE}%{LJF}%{SPACE}\[%{HTTPDATE:timestamp}\]%{SPACE}"%{WORD:verb}%{SPACE}%{WZ:request}%{SPACE}HTTP/%{NUMBER}"%{SPACE}%{NUMBER:response}%{SPACE}%{NUMBER:bytes}%{SPACE}%{QS:uri}%{SPACE}%{QS:agent}%{SPACE}%{QS:session}
#RESINLOG %{IP:ip} %{NUMBER} - - \[%{HTTPDATE:time}\] "%{WORD:verb} %{WZ:request} HTTP/%{NUMBER}" %{NUMBER:response} %{NUMBER:bytes} %{QS:uri} %{QS:agent} %{QS:session}

#2012/11/13 add
DKH (\{.*\})
STOREGREP (\[\/\/\/\-\] INFO \-)
DHMH ([^;|=]*)
CENTERLOG %{JAVA_DATE} %{STOREGREP} BID=%{NUMBER:BID};BR=%{DHMH:BR}; BP=%{DKH:BP}

#2012/11/20 add
JAVAGREP (\[\/\/\/\-\])
ORDERCENTERERR %{JAVA_DATE} \[RMI TCP Connection\(%{NUMBER:threadid}\)-%{IP:ip}\]%{JAVAGREP} %{WORD:level}%{SPACE}%{WZ}-%{QS:message}
ORDERCENTERRESULT %{JAVA_DATE} \[RMI TCP Connection\(%{NUMBER:threadid}\)-%{IP:ip}\]%{JAVAGREP} %{WORD:level}%{SPACE}%{WZ}-%{DKH:message}

#2012/11/27 add
###log example#######
#2013-06-18 15:28:12 INFO: {message: parameters passed by the media {"UID":["0"],"CID":["a100054947||0000"],"url":["http://www.xxxx.com/?from=lianmeng-weiyi"],"src":["weiyi"]}}
#
PARTNER %{JAVA_DATE:timestamp} %{WORD:level}: %{DKH:message}

#2012/11/28 add
PARTNERAPI %{JAVA_DATE:timestamp} %{WZ:level}: %{DKH:message}

#2013/06/18 add
#pattern all in the '[adskfjl}{\]'
FKH ([^;]*)
######
#aether.log#####
#[2013-06-18 15:27:29] [INFO] [com.tuan.web.controller.IndexController] [{message: sethotstore#hot store size:5}]
AETHERLOG \[%{JAVA_DATE:timestamp}\] \[%{WZ:level}\] \[%{WZ:method}\]%{FKH:message}

USERNAME [a-zA-Z0-9._-]+
USER %{USERNAME}
INT (?:[+-]?(?:[0-9]+))
BASE10NUM (?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+)))
NUMBER (?:%{BASE10NUM})
BASE16NUM (?<![0-9A-Fa-f])(?:[+-]?(?:0x)?(?:[0-9A-Fa-f]+))
BASE16FLOAT \b(?<![0-9A-Fa-f.])(?:[+-]?(?:0x)?(?:(?:[0-9A-Fa-f]+(?:\.[0-9A-Fa-f]*)?)|(?:\.[0-9A-Fa-f]+)))\b

POSINT \b(?:[1-9][0-9]*)\b
NONNEGINT \b(?:[0-9]+)\b
WORD \b\w+\b
NOTSPACE \S+
SPACE \s*
DATA .*?
GREEDYDATA .*
#QUOTEDSTRING (?:(?<!\\)(?:"(?:\\.|[^\\"])*"|(?:'(?:\\.|[^\\'])*')|(?:`(?:\\.|[^\\`])*`)))
QUOTEDSTRING (?>(?<!\\)(?>"(?>\\.|[^\\"]+)+"|""|(?>'(?>\\.|[^\\']+)+')|''|(?>`(?>\\.|[^\\`]+)+`)|``))
UUID [A-Fa-f0-9]{8}-(?:[A-Fa-f0-9]{4}-){3}[A-Fa-f0-9]{12}

# Networking
MAC (?:%{CISCOMAC}|%{WINDOWSMAC}|%{COMMONMAC})
CISCOMAC (?:(?:[A-Fa-f0-9]{4}\.){2}[A-Fa-f0-9]{4})
WINDOWSMAC (?:(?:[A-Fa-f0-9]{2}-){5}[A-Fa-f0-9]{2})
COMMONMAC (?:(?:[A-Fa-f0-9]{2}:){5}[A-Fa-f0-9]{2})
IPV6 ((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?
IPV4 (?<![0-9])(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))(?![0-9])
IP (?:%{IPV6}|%{IPV4})
HOSTNAME \b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b)
HOST %{HOSTNAME}
IPORHOST (?:%{HOSTNAME}|%{IP})
HOSTPORT (?:%{IPORHOST=~/\./}:%{POSINT})

# Paths
PATH (?:%{UNIXPATH}|%{WINPATH})
UNIXPATH (?>/(?>[\w_%!$@:.,-]+|\\.)*)+
#UNIXPATH (?<![\w\/])(?:/[^\/\s?*]*)+
TTY (?:/dev/(pts|tty([pq])?)(\w+)?/?(?:[0-9]+))
WINPATH (?>[A-Za-z]+:|\\)(?:\\[^\\?*]*)+
URIPROTO [A-Za-z]+(\+[A-Za-z+]+)?
URIHOST %{IPORHOST}(?::%{POSINT:port})?
# uripath comes loosely from RFC1738, but mostly from what Firefox
# doesn't turn into %XX
URIPATH (?:/[A-Za-z0-9$.+!*'(){},~:;=@#%_\-]*)+
#URIPARAM \?(?:[A-Za-z0-9]+(?:=(?:[^&]*))?(?:&(?:[A-Za-z0-9]+(?:=(?:[^&]*))?)?)*)?
URIPARAM \?[A-Za-z0-9$.+!*'|(){},~@#%&/=:;_?\-\[\]]*
URIPATHPARAM %{URIPATH}(?:%{URIPARAM})?
URI %{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{URIPATHPARAM})?

# Months: January, Feb, 3, 03, 12, December
MONTH \b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\b
MONTHNUM (?:0?[1-9]|1[0-2])
MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])

# Days: Monday, Tue, Thu, etc...
DAY (?:Mon(?:day)?|Tue(?:sday)?|Wed(?:nesday)?|Thu(?:rsday)?|Fri(?:day)?|Sat(?:urday)?|Sun(?:day)?)

# Years?
YEAR (?>\d\d){1,2}
# Time: HH:MM:SS
#TIME \d{2}:\d{2}(?::\d{2}(?:\.\d+)?)?
# I'm still on the fence about using grok to perform the time match,
# since it's probably slower.
# TIME %{POSINT<24}:%{POSINT<60}(?::%{POSINT<60}(?:\.%{POSINT})?)?
HOUR (?:2[0123]|[01]?[0-9])
MINUTE (?:[0-5][0-9])
# '60' is a leap second in most time standards and thus is valid.
SECOND (?:(?:[0-5][0-9]|60)(?:[:.,][0-9]+)?)
TIME (?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9])
# datestamp is YYYY/MM/DD-HH:MM:SS.UUUU (or something like it)
DATE_US %{MONTHNUM}[/-]%{MONTHDAY}[/-]%{YEAR}
DATE_EU %{MONTHDAY}[./-]%{MONTHNUM}[./-]%{YEAR}
ISO8601_TIMEZONE (?:Z|[+-]%{HOUR}(?::?%{MINUTE}))
ISO8601_SECOND (?:%{SECOND}|60)
TIMESTAMP_ISO8601 %{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}?
DATE %{DATE_US}|%{DATE_EU}
DATESTAMP %{DATE}[- ]%{TIME}
TZ (?:[PMCE][SD]T|UTC)
DATESTAMP_RFC822 %{DAY} %{MONTH} %{MONTHDAY} %{YEAR} %{TIME} %{TZ}
DATESTAMP_OTHER %{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{TZ} %{YEAR}

# Syslog Dates: Month Day HH:MM:SS
SYSLOGTIMESTAMP %{MONTH} +%{MONTHDAY} %{TIME}
PROG (?:[\w._/%-]+)
SYSLOGPROG %{PROG:program}(?:\[%{POSINT:pid}\])?
SYSLOGHOST %{IPORHOST}
SYSLOGFACILITY <%{NONNEGINT:facility}.%{NONNEGINT:priority}>
HTTPDATE %{MONTHDAY}/%{MONTH}/%{YEAR}:%{TIME} %{INT}

# Shortcuts
QS %{QUOTEDSTRING}

# Log formats
SYSLOGBASE %{SYSLOGTIMESTAMP:timestamp} (?:%{SYSLOGFACILITY} )?%{SYSLOGHOST:logsource} %{SYSLOGPROG}:
COMBINEDAPACHELOG %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}

# Log levels
LOGLEVEL ([A-a]lert|ALERT|[T|t]race|TRACE|[D|d]ebug|DEBUG|[N|n]otice|NOTICE|[I|i]nfo|INFO|[W|w]arn?(?:ing)?|WARN?(?:ING)?|[E|e]rr?(?:or)?|ERR?(?:OR)?|[C|c]rit?(?:ical)?|CRIT?(?:ICAL)?|[F|f]atal|FATAL|[S|s]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?)
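To make the role of the pattern file clearer, here is a minimal Python sketch of how %{NAME:field} references might be expanded into an ordinary regular expression with named groups. It is an illustration only, not how Logstash itself is implemented. The DEMO_REQUEST pattern and the helper names load_patterns, expand and grok are hypothetical, and only simple patterns are used because Python's re module does not accept the atomic groups (?>...) found in several of the patterns above before Python 3.11.

```python
import re

# A tiny subset of definitions in the same "NAME regex" layout as the
# pattern file. DEMO_REQUEST is a hypothetical pattern for this sketch.
PATTERNS_TEXT = r"""
# comment lines are skipped
WORD \b\w+\b
POSINT \b(?:[1-9][0-9]*)\b
SPACE \s*
WZ ([^ ]*)
DEMO_REQUEST %{WORD:verb}%{SPACE}%{WZ:request}%{SPACE}%{POSINT:response}
"""

# Matches %{NAME} and %{NAME:field}
REFERENCE = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def load_patterns(text):
    """Parse 'NAME regex' lines into a dict, ignoring blanks and comments."""
    patterns = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, regex = line.split(None, 1)
        patterns[name] = regex
    return patterns


def expand(regex, patterns):
    """Recursively replace %{NAME:field} with the referenced regex."""
    def replace(match):
        name, field = match.group(1), match.group(2)
        inner = expand(patterns[name], patterns)
        if field:
            return "(?P<%s>%s)" % (field, inner)
        return "(?:%s)" % inner
    return REFERENCE.sub(replace, regex)


def grok(pattern_name, line, patterns):
    """Return the named fields captured from line, or None if no match."""
    match = re.search(expand(patterns[pattern_name], patterns), line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    patterns = load_patterns(PATTERNS_TEXT)
    print(grok("DEMO_REQUEST", "GET /index.html 200", patterns))
```

Each named reference becomes one field in the resulting event, which is why a composite pattern such as NGINXACCESSLOG is assembled from small building blocks like WZ and IPPORT.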