First of all, installation of the tools is not covered here; there are plenty of guides online that you can follow on your own.
Here we use an example to walk through the configuration of each tool and the final result.
Suppose we have a batch of tracklog logs that need to be displayed in near real time through the ELK stack:
I. Collect the logs with Flume
An agent is deployed on each log server and sends the logs to a collector; the configuration is as follows:
Agent (there can be more than one):
agent.sources = s1
agent.channels = m1
agent.sinks = k1

agent.sources.s1.interceptors = i1
agent.sources.s1.interceptors.i1.type = org.apache.flume.interceptor.HostBodyInterceptor$Builder
# For each one of the sources, the type is defined
agent.sources.s1.type = com.source.tailDir.TailDirSourceNG
agent.sources.s1.monitorpath = d:\\trackloguc
agent.sources.s1.channels = m1
agent.sources.s1.fileencode = gb2312

# Each sink's type must be defined
agent.sinks.k1.type = avro
agent.sinks.k1.hostname = 10.130.2.249
agent.sinks.k1.port = 26003
#agent.sinks.k1.type = logger
agent.sinks.k1.channel = m1

# Each channel's type is defined
#agent.channels.m1.type = memory
#agent.channels.m1.capacity = 100000
agent.channels.m1.type = file
agent.channels.m1.checkpointDir = .\\mobilecheck
agent.channels.m1.dataDirs = .\\mobiledata
agent.channels.m1.transactionCapacity = 3000000
agent.channels.m1.capacity = 10000000
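For reference, a sketch of how this agent might be started, assuming the properties above are saved as tracklog-agent.properties (a hypothetical file name) and FLUME_HOME points at the Flume installation; on a Windows log server the corresponding wrapper script shipped with the Flume distribution would be used instead:

# Start the agent; --name must match the "agent." prefix used in the property keys
${FLUME_HOME}/bin/flume-ng agent --conf ${FLUME_HOME}/conf --conf-file tracklog-agent.properties --name agent -Dflume.root.logger=INFO,console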
Collector:
agent.sources = s1
agent.channels = m1 m2
agent.sinks = k1 k2

agent.sources.s1.selector.type = replicating
# For each one of the sources, the type is defined
agent.sources.s1.type = avro
agent.sources.s1.bind = 10.130.2.249
agent.sources.s1.port = 26002
agent.sources.s1.channels = m1 m2

# Send the events to Kafka
agent.sinks.k1.type = org.apache.flume.plugins.KafkaSink
agent.sinks.k1.metadata.broker.list = bdc53.hexun.com:9092,bdc54.hexun.com:9092,bdc46.hexun.com:9092
agent.sinks.k1.serializer.class = kafka.serializer.StringEncoder
agent.sinks.k1.request.required.acks = 0
agent.sinks.k1.max.message.size = 100
agent.sinks.k1.producer.type = sync
agent.sinks.k1.custom.encoding = UTF-8
agent.sinks.k1.custom.topic.name = tracklogt
agent.sinks.k1.channel = m2

# The channel uses the file type because the log volume is too large
agent.channels.m1.type = file
agent.channels.m1.checkpointDir = /opt/modules/apache-flume-1.5.2-bin/tracklog-kafka/checkpoint
agent.channels.m1.dataDirs = /opt/modules/apache-flume-1.5.2-bin/tracklog-kafka/datadir
agent.channels.m1.transactionCapacity = 1000000
agent.channels.m1.capacity = 1000000
agent.channels.m1.checkpointInterval = 30000
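The collector side can be started the same way, assuming the properties above are saved as tracklog-collect.properties (a hypothetical file name) under the Flume installation already referenced in the channel paths:

# Start the collector agent on the Linux collection host
/opt/modules/apache-flume-1.5.2-bin/bin/flume-ng agent --conf /opt/modules/apache-flume-1.5.2-bin/conf --conf-file tracklog-collect.properties --name agent -Dflume.root.logger=INFO,console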
II. Get the data into Kafka
The topic used by the collector above has to be created in Kafka beforehand; everything else needed to get the data into Kafka is already handled by the collector configuration.
Reference command for creating the topic:
${KAFKA_HOME}/bin/kafka-topics.sh --create --zookeeper bdc41.hexun.com --replication-factor 3 --partitions 3 --topic tracklogt
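To confirm the topic was created as expected, it can be described with, for example:

# List partitions, replicas and leaders for the tracklogt topic
${KAFKA_HOME}/bin/kafka-topics.sh --describe --zookeeper bdc41.hexun.com --topic tracklogt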
Reference command for viewing the topic's data:
${KAFKA_HOME}/bin/kafka-console-consumer.sh --zookeeper bdc46.hexun.com:2181,bdc40.hexun.com:2181,bdc41.hexun.com:2181 --topic tracklogt
III. From Kafka to Elasticsearch
We use Logstash to move the Kafka data into ES, mainly because Logstash integrates closely with ES and Kibana.
To load the data from the tracklogt topic into ES, the Logstash configuration is as follows:
input {
  kafka {
    zk_connect => "bdc41.hexun.com:2181,bdc40.hexun.com:2181,bdc46.hexun.com:2181,bdc54.hexun.com:2181,bdc53.hexun.com:2181"
    group_id => "logstash"
    topic_id => "tracklogt"
    reset_beginning => false   # boolean (optional)
    consumer_threads => 5      # number (optional)
    decorate_events => true
  }
}
filter {
  # multiline merges several physical lines into one event; lines are matched with a regular expression
  multiline {
    pattern => "^\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}\s\d{4}-\d{1,2}-\d{1,2}\s\d{2}:\d{2}:\d{2}"
    negate => true
    what => "previous"
  }
  # Split each line on spaces and map the columns to named fields
  ruby {
    init => "@kname = ['hostip','dateday','datetime','ip','cookieid','userid','logserverip','referer','requesturl','remark1','remark2','alexaflag','ua','wirelessflag']"
    code => "event.append(Hash[@kname.zip(event['message'].split(/ /))])"
    remove_field => ["message"]
    add_field => {
      "logsdate" => "%{dateday}"
    }
  }
  # Replace the '-' in the logsdate field with an empty string
  mutate {
    gsub => ["logsdate", "-", ""]
    # convert => { "dateday" => "integer" }
  }
  # Drop records whose logsdate does not match the expected format
  if [logsdate] !~ /\d{8}/ {
    drop {}
  }
  # Resolve public IP addresses; geo-location information is obtained automatically
  geoip {
    source => "ip"
    # type => "linux-syslog"
    add_tag => ["geoip"]
  }
  # Parse the user agent
  useragent {
    source => "ua"
    # type => "linux-syslog"
    add_tag => ["useragent"]
  }
}
output {
  # Write into Elasticsearch
  elasticsearch {
    hosts => ["10.130.2.53:9200", "10.130.2.46:9200", "10.130.2.54:9200"]
    flush_size => 50000
    workers => 5
    index => "logstash-tracklog"
  }
}
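Assuming the configuration above is saved as kafka-tracklog.conf (a hypothetical file name), Logstash can then be started with something like:

# Run Logstash with the Kafka-to-ES pipeline configuration
${LOGSTASH_HOME}/bin/logstash -f kafka-tracklog.conf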
Points to note:
1. logsdate is rewritten because a value such as 2016-01-01 is treated as a time when it enters ES and gets auto-completed to 2016-01-01 08:00:00, which makes per-day views of the field in Kibana incorrect.
2. Some records are abnormal: the logsdate column should be a numeric date such as 20160101, but occasionally contains alphabetic characters. Such records cause display problems in Kibana, so they are dropped.
3. Different business data come in different formats, so the data has to be processed in the filter section with the appropriate plugins and syntax; the official Logstash documentation is worth reading for this.
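When in doubt about how ES actually mapped a field (for example, whether logsdate ended up as a string or a date), a simple check is to query the mapping of the index defined in the output section above; the host below is one of the ES nodes from that section:

# Show the field mappings of the logstash-tracklog index
curl -XGET 'http://10.130.2.53:9200/logstash-tracklog/_mapping?pretty'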
IV. Display the data in Kibana
The following is a usage example, for reference only:
1. First open the Kibana page and click the menu "Settings" -> "Indices".
@ You can enter a wildcard-style name here so that one pattern covers multiple indexes (the data is typically indexed by day).
Then click Create. (To confirm which indices the wildcard actually matches, see the curl sketch after this step.)
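As a quick check (a sketch, assuming the same ES nodes as above), the indices matched by a wildcard such as logstash-* can be listed directly from Elasticsearch:

# List all indices matching the logstash-* pattern, with a header row
curl -XGET 'http://10.130.2.53:9200/_cat/indices/logstash-*?v'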
2. Click the menu "Discover" and select the index pattern you just created; you should see something like the following:
@ Then click Save in the upper right corner and enter a name.
@ This saved search is the data source used in the charts below, and you can also search your data here; note that it is best to wrap string values in double quotation marks, as in the query sketch after this step.
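For example, a search restricted to a single field could look like the following; the field name comes from the ruby filter above, while the value is purely hypothetical:

referer:"http://www.hexun.com/"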
3. Click "Visualize" to make various icons.
You can choose which kind of chart to make, such as a histogram of daily statistics, and click the last one.
@order by The field type must be Date or int This is why it is important to emphasize the data type in the previous guide.
4. Finally click on the "DashBoard" menu to make the dashboard, you can set the previous discover and visualize the data and graphs stored in this instrument panel.