Simple data analysis based on ELK

Source: Internet
Author: User

Original link: http://www.open-open.com/lib/view/open1455673846058.html

Environment
    • CentOS 6.5 64-bit
    • JDK 1.8.0_20
    • Elasticsearch 1.7.3
    • Logstash 1.5.6
    • Kibana 4.1.4
Introduction
    • Elasticsearch is a well-known open-source search engine, and many companies now use the ELK stack for log analysis. Sina, for example, uses ELK to process 3.2 billion records per day. The original post links to a detailed introduction.

    • Our data volume is not as large as Sina's: about 60 million records on a normal day, and around 100 million on a busy one. Inspired by the Sina case, we built our own simple data analysis system on ELK. The reasons for choosing it at the start were: (1) I like to tinker; (2) I don't do front-end work, but the Kibana in ELK can be used as is; (3) Hadoop/HBase, Storm, and other big data stacks have a learning cost that is too steep in the short term; (4) the number of machines available to us is very modest.

Environment setup
    • Install Java, set JAVA_HOME, and add its bin directory to the PATH environment variable
Elasticsearch
    • Download Elasticsearch, then unzip it to /opt
    • Running /opt/elasticsearch-1.7.3/bin/elasticsearch -d starts it in the background, but to manage all three ELK processes together, I chose Supervisor for unified management
    • After starting Elasticsearch, we need to turn off the tokenizer (analyzer). Data analysis does not need it and it causes problems there, although it is necessary when Elasticsearch is used as a search engine.
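The post does not show how the tokenizer was turned off. One common approach in Elasticsearch 1.x is an index template whose dynamic mapping stores every string field as "not_analyzed", so values such as "Android 4.4.4" are counted as one term instead of being split into words. The following is a minimal sketch of such a template body; the index pattern "logstash-*" and the rule name are assumptions, not taken from the original setup.

```python
import json


def build_not_analyzed_template(index_pattern):
    """Build an Elasticsearch 1.x index template body that keeps every
    string field as a single, un-tokenized term."""
    return {
        # Indices whose names match this pattern pick up the template.
        "template": index_pattern,
        "mappings": {
            "_default_": {
                "dynamic_templates": [
                    {
                        # Hypothetical rule name, chosen for readability.
                        "strings_not_analyzed": {
                            "match": "*",
                            "match_mapping_type": "string",
                            "mapping": {
                                "type": "string",
                                "index": "not_analyzed",
                            },
                        }
                    }
                ]
            }
        },
    }


if __name__ == "__main__":
    # "logstash-*" is an assumed pattern; use the prefix of your own indices.
    print(json.dumps(build_not_analyzed_template("logstash-*"), indent=2))
```

The printed JSON could be sent with an HTTP PUT to /_template/<name> on the cluster. It only affects indices created afterwards, so it would need to be in place before Logstash writes the first record of the day.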
     
Kibana
    • Download Kibana, then unzip it to /opt
    • Run /opt/kibana-4.1.4-linux-x64/bin/kibana; it is also managed by Supervisor
    • Then visit http://YourIP:5601
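Elasticsearch and Kibana are both placed under Supervisor above, but the post does not include the Supervisor configuration. As a rough sketch, the snippet below renders one [program:x] section per process. The program names and the autostart/autorestart choices are assumptions; the commands are the paths quoted above. Elasticsearch is started without -d here because Supervisor expects the processes it manages to stay in the foreground.

```python
import configparser
import io

# (name, command) pairs. The names are hypothetical labels; the commands
# are the install paths mentioned in the text.
PROGRAMS = [
    ("elasticsearch", "/opt/elasticsearch-1.7.3/bin/elasticsearch"),
    ("kibana", "/opt/kibana-4.1.4-linux-x64/bin/kibana"),
]


def render_supervisor_config(programs):
    """Render one [program:<name>] section per (name, command) pair."""
    parser = configparser.ConfigParser()
    for name, command in programs:
        parser["program:" + name] = {
            "command": command,
            "autostart": "true",    # start when supervisord starts
            "autorestart": "true",  # restart if the process exits
        }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_supervisor_config(PROGRAMS))
```

A Logstash entry could be added the same way once its install path and configuration file name are known.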
Logstash
    • So far, we don't have a data source.
    • Download Logstash, then unzip it to /opt
    • Write the following configuration file

Our data comes from a Kafka topic in JSON format and is written to Elasticsearch indices that change by day.
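To illustrate what "indices that change by day" means, here is a small sketch that maps one JSON message to a daily index name. Logstash's elasticsearch output typically does this with a date pattern in the index name, evaluated in UTC. The "logstash" prefix, the "timestamp" field name, and the epoch-seconds format are all assumptions, since the original configuration is not shown.

```python
import json
from datetime import datetime, timezone


def daily_index(message, prefix="logstash", ts_field="timestamp"):
    """Return the daily index name for one JSON-encoded event.

    Assumes the event carries its time as epoch seconds in `ts_field`
    (a hypothetical field name) and that days are cut in UTC.
    """
    event = json.loads(message)
    ts = datetime.fromtimestamp(event[ts_field], tz=timezone.utc)
    return "{}-{}".format(prefix, ts.strftime("%Y.%m.%d"))


if __name__ == "__main__":
    # Made-up sample message, for illustration only.
    sample = '{"timestamp": 86400, "os_version": "4.4.4"}'
    print(daily_index(sample))
```

One index per day keeps each index to a bounded size at this data volume and lets old days be dropped by deleting whole indices.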

Simple data analysis
    • After running for four hours, the system held almost 8.9 million records.
    • Let's look at the distribution of device OS versions (Android 4.4.4 has the most devices, almost 3 million)

    • Device model distribution
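Charts like the two above are the kind of result a terms aggregation produces: group records by a field and count each value. The sketch below builds such a request body for Elasticsearch and runs the same counting locally on a few made-up events so the logic can be checked without a cluster. The field name "os_version" and the sample data are hypothetical.

```python
import json
from collections import Counter


def terms_agg_body(field, size=10):
    """Search body that returns no hits, only the top `size` values of `field`."""
    return {
        "size": 0,
        "aggs": {"top_values": {"terms": {"field": field, "size": size}}},
    }


def top_values(events, field, size=10):
    """Local equivalent of the aggregation: count values, most frequent first."""
    counts = Counter(event[field] for event in events if field in event)
    return counts.most_common(size)


if __name__ == "__main__":
    # Made-up events, not data from the original system.
    events = [
        {"os_version": "4.4.4"},
        {"os_version": "4.4.4"},
        {"os_version": "5.0.2"},
        {"model": "unknown"},
    ]
    print(json.dumps(terms_agg_body("os_version")))
    print(top_values(events, "os_version"))
```

For the counts to be per whole value, the field has to be stored un-tokenized, which is why the tokenizer was turned off earlier.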
