Using Rsyslog to Collect Logs into Kafka
The project needs to collect logs for storage and analysis. The data flow is rsyslog (collection) -> Kafka (message queue) -> Logstash (cleaning) -> Elasticsearch / HDFS. Today we will cover the first stage: collecting logs into Kafka with rsyslog.
I. Environment preparation
The official rsyslog documentation shows that Kafka output is supported only from v8.7.0 onward; the changes in each v8.x release can be seen in the ChangeLog.
The latest v8 stable release already ships the rsyslog-kafka plugin as an RPM package, so it can be installed directly with yum. First add the yum source:
```
[rsyslog_v8]
name=Adiscon CentOS-$releasever - local packages for $basearch
baseurl=http://rpms.adiscon.com/v8-stable/epel-$releasever/$basearch
enabled=1
gpgcheck=0
gpgkey=http://rpms.adiscon.com/RPM-GPG-KEY-Adiscon
protect=1
```
After adding the source, install the packages to complete the setup:

```
yum install rsyslog rsyslog-kafka.x86_64
```
II. Configuration

1. Processing principles
- Inputs submit received messages to rulesets; if no ruleset is explicitly bound, the default ruleset is used.
- A ruleset contains rules; a rule consists of a filter and an action list.
- Actions consist of the action call itself (e.g. `:omusrmsg:`) as well as all action-defining configuration statements (`$Action...` directives).
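These relationships can be sketched in a minimal RainerScript configuration; the module choice, file path, and tag here are illustrative:

```
# an input bound to a named ruleset
module(load="imfile")
input(type="imfile" file="/var/log/app.log" tag="app" ruleset="demo")

# the ruleset holds rules; each rule = filter + action list
ruleset(name="demo") {
    if $msg contains "ERROR" then {                            # filter
        action(type="omfile" file="/var/log/app-errors.log")   # action
    }
}
```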
2. Statement types
rsyslog supports both the traditional sysklogd syntax and the newer RainerScript syntax; RainerScript statements are generally preferred for concise, clear configuration. A traditional selector line looks like:

```
mail.info /var/log/mail.log
```
3. Process control
- Control structures
- Filter conditions
1) Selectors: the traditional filter type, with the following format:
```
<facility>[,facility...][,*].[=|!]<priority>[,priority...][,*];<facility>[,facility...][,*].[=|!]<priority>[,priority...][,*]...
```
The available facilities are auth, authpriv, cron, daemon, kern, lpr, mail, mark, news, security (same as auth), syslog, user, uucp, and local0 through local7.
The available priorities are debug, info, notice, warning, warn (same as warning), err, error (same as err), crit, alert, emerg, and panic (same as emerg).
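For example, two simple selector rules (the target file paths are illustrative):

```
# write all mail messages at priority info or above to one file
mail.info /var/log/mail-info.log
# log only kernel messages of exactly priority crit
kern.=crit /var/log/kern-crit.log
```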
2) Property-based filters: a newer filter type. The format is as follows:

```
:property, [!]compare-operation, "value"
```

The fields are the property name, the comparison operation, and the value to compare against. Supported comparison operations include contains, isequal, startswith, regex, and ereregex.
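Two property-based filter examples (the file path and the "noisyd" program name are hypothetical):

```
# route any message whose text contains "error" to a dedicated file
:msg, contains, "error" /var/log/app-errors.log
# discard messages from a noisy program entirely
:programname, isequal, "noisyd" stop
```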
3) Expression-based filters:

```
if expr then action-part-of-selector-line
```
- BSD-style blocks
- Example:

```
if $syslogfacility-text == 'local0' and $msg startswith 'DEVNAME' and not ($msg contains 'error1' or $msg contains 'error0') then /var/log/somelog
```
4. Data processing: the set, unset, and reset operations are supported.
Note: only message JSON (CEE/Lumberjack) properties can be modified by the set, unset, and reset statements.
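A small sketch of these statements operating on the JSON (`$!`) property tree; the variable name and values are illustrative:

```
# set a JSON (CEE) property on the message
set $!source = "nginx";
# reset assigns a new value even if the property already holds one
reset $!source = "web";
# remove the property again
unset $!source;
```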
5. Inputs
There are many input modules. Take the imfile module as an example: it forwards the contents of text files, line by line, as syslog messages.

```
input(type="imfile" tag="kafka" file="analyze.log" ruleset="imfile-kafka")
# an optional facility can also be set, e.g. facility="local7"
```
6. Outputs
Outputs are also called actions. The format is:

```
action(type="omkafka" topic="kafka_test" broker="10.120.169.149:9092")
```
7. Rulesets and rules
A ruleset contains multiple rules. A rule is rsyslog's unit for processing messages; each rule consists of a filter and one or more actions.

```
input(type="imfile" tag="kafka" file="analyze.log" ruleset="rulesetname")

ruleset(name="rulesetname") {
    action(type="omfile" file="/path/to/file")
    action(type="..." ...)
    /* and so on... */
}
```

By binding a ruleset in the input, the input stream is fed into that ruleset for rule matching, and matching rules then execute their actions, completing the stream processing.
8. Queue parameters
Feeding different input streams into different queues allows data to be processed in parallel. Queues are usually configured on a ruleset or an action; by default there is only one main queue. An example configuration:

```
action(type="omfwd"
       target="192.168.2.11"
       port="10514"
       protocol="tcp"
       queue.filename="forwarding"
       queue.size="1000000"
       queue.type="LinkedList")
```
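Queue parameters can also be attached to a ruleset, giving everything bound to it its own queue; a minimal sketch, where the ruleset name, sizes, and file path are illustrative:

```
# a ruleset with its own in-memory linked-list queue and two worker threads
ruleset(name="parallel-demo"
        queue.type="LinkedList"
        queue.size="100000"
        queue.workerThreads="2") {
    action(type="omfile" file="/var/log/parallel-demo.log")
}
```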
9. Templates
Templates are an important feature of rsyslog. They let you customize the format of the output stream, and can also be used to generate log file names dynamically. If no template is specified, a built-in default format is used.
The general form is:

```
template(parameters) { list-descriptions }
```
- List: a list template, consisting of a name, type="list", and multiple constant/property pairs:

```
template(name="tpl1" type="list") {
    constant(value="Syslog MSG is: '")
    property(name="msg")
    constant(value="', ")
    property(name="timereported" dateFormat="rfc3339" caseConversion="lower")
    constant(value="\n")
}
```
- String: a string-based template, consisting of a name, type="string", and string="<constant text and replacement variables>", for example:

```
"%TIMESTAMP:::date-rfc3339% %HOSTNAME% %syslogtag%%msg:::sp-if-no-1st-sp%%msg:::drop-last-lf%\n"
```
Custom variables and the property replacer can be used to extract and transform each field of a log message into globally readable variables.
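A small sketch of property-replacer options inside a string template; the template name and the particular transformations are illustrative:

```
# lowercase the hostname and keep only the first 10 characters of the message
template(name="shortmsg" type="string"
         string="%hostname:::lowercase% %msg:1:10%\n")
```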
Note:
- The original format (the pre-v6 syntax) is still supported:

```
$template strtpl,"PRI: %pri%, MSG: %msg%\n"
```
- Use the template parameter in an action to bind a template to that action, for example:

```
action(template=TEMPLATENAME type="omfile" file="/var/log/all-msgs.log")
```
III. Example
Here is an example that ships nginx access logs to Kafka through rsyslog: put the following nginx_kafka.conf in the /etc/rsyslog.d directory and restart rsyslog.
```
# load the omkafka and imfile modules
module(load="omkafka")
module(load="imfile")

# nginx template
template(name="nginxAccessTemplate" type="string" string="%hostname%<-+>%syslogtag%<-+>%msg%\n")

# ruleset
ruleset(name="nginx-kafka") {
    # forward logs to kafka
    action(type="omkafka"
        template="nginxAccessTemplate"
        confParam=["compression.codec=snappy", "queue.buffering.max.messages=400000"]
        partitions.number="4"
        topic="test_nginx"
        broker="10.120.169.149:9092"
        queue.spoolDirectory="/tmp"
        queue.filename="test_nginx_kafka"
        queue.size="360000"
        queue.maxdiskspace="2G"
        queue.highwatermark="216000"
        queue.discardmark="350000"
        queue.type="LinkedList"
        queue.dequeuebatchsize="4096"
        queue.timeoutenqueue="0"
        queue.maxfilesize="10M"
        queue.saveonshutdown="on"
        queue.workerThreads="4")
}

# define the message source and bind it to the ruleset
input(type="imfile" tag="nginx,aws" file="/var/log/access.log" ruleset="nginx-kafka")
```
To check whether the conf file is correct, run rsyslogd in debug mode and watch the log output:

```
rsyslogd -dn
```

or validate the configuration directly with:

```
rsyslogd -N 1
```