Building a Social-Engineering Database with ELK

Source: https://mp.weixin.qq.com/s?__biz=MjM5MDkwNjA2Nw==&mid=2650373776&idx=1&sn=e823e0d8d64e6e31d22e89b3d23cb759
0x01 What is ELK?
ELK is short for three applications: Elasticsearch, Logstash, and Kibana. Elasticsearch (commonly abbreviated ES) is mainly used to store and retrieve data. Logstash is mainly used to write data into ES. Kibana is mainly used to display the data.
0x02 Why ELK?
Traditional social-engineering databases are usually built on MySQL, whose retrieval efficiency drops badly once the data gets large. Querying such a relational database also requires explicitly naming the column to search. ES, by contrast, provides full-text indexing, and queries over big data come back at millisecond level almost without exception. ELK was originally used for collecting and analyzing large volumes of log data; its frightening speed also makes it a good choice for a social-engineering database.
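To illustrate the difference (the index name, table name, and values here are my own illustration, not from the original):

    -- MySQL: the column to search must be named explicitly
    SELECT * FROM users WHERE email = 'test@example.com';

    # ES: one query can hit every indexed field at once
    curl "http://192.168.1.5:9200/socialdb/_search?q=test@example.com"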
0x03 Installation and Configuration
The prerequisite is a large hard disk, roughly 2.5 times the size of the source data (ES will build the related indexes).
You need at least Java 7 installed, with the JAVA_HOME environment variable configured.
Installation is very simple: just download the corresponding archives and extract them, so I won't dwell on it here.
I am demonstrating on Windows 8.1. Note that on Linux, ES refuses to run with root privileges.
Versions used: ES 2.0.0, Logstash 2.0.0, Kibana 4.2.0
Modify the configuration file es/config/elasticsearch.yml:

    cluster.name: esdemo        # description of the cluster
    node.name: 63               # node name
    network.host: 192.168.1.5   # bound IP address
    http.port: 9200             # port number, default 9200
Some Linux environments also need the es/bin/elasticsearch file modified to add:

    export JAVA_HOME=<path to your JDK>
Start it with es/bin/elasticsearch (or elasticsearch.bat on Windows).
Then visit http://localhost:9200 (or http://<bound IP>:9200) to check that it is running.
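If it is running, ES answers with a JSON banner roughly like this (the name and cluster_name follow from the config above; other values will differ on your machine):

    {
      "name" : "63",
      "cluster_name" : "esdemo",
      "version" : { "number" : "2.0.0" },
      "tagline" : "You Know, for Search"
    }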
Modify kibana/config/kibana.yml to point at ES:

    elasticsearch.url: "http://192.168.1.5:9200"
Start it with kibana/bin/kibana (or kibana.bat on Windows).
Then visit http://localhost:5601 to check that it started normally.
Elasticsearch has excellent built-in support for distributed operation; if your environment allows it, run more than one node to share the load.
For ease of understanding, here is a comparison of ES concepts with their MySQL counterparts:

    ES       index      type    document   field
    MySQL    database   table   row        column
0x04 Building the Social-Engineering Database
With the above preparation done, it is time to build. First decide which columns will exist in ES. I store 10 columns, listed here for reference: nickname (nickname), password (password), email (e-mail address), qq (QQ number), telno (mobile number), idno (ID-card number), realname (real name), address (home address), salt (salt value), from (data source). By comparison this division is quite fine-grained, but it costs extra time when writing into ES with Logstash. If you want to be lazy, you can simply dump the existing data into a single field and still query it; it just ends up messy and uncomfortable to read.
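If you want explicit control over how these fields are indexed, you can create the index with a mapping before importing. A minimal sketch for ES 2.x follows; the index name socialdb and type name user are my assumptions (if you skip this step, Logstash creates a dynamic mapping automatically). Marking a field not_analyzed keeps values such as phone numbers and e-mail addresses searchable as exact strings:

    curl -XPUT "http://192.168.1.5:9200/socialdb" -d '
    {
      "mappings": {
        "user": {
          "properties": {
            "nickname": { "type": "string" },
            "password": { "type": "string", "index": "not_analyzed" },
            "email":    { "type": "string", "index": "not_analyzed" },
            "qq":       { "type": "string", "index": "not_analyzed" },
            "telno":    { "type": "string", "index": "not_analyzed" },
            "idno":     { "type": "string", "index": "not_analyzed" },
            "realname": { "type": "string" },
            "address":  { "type": "string" },
            "salt":     { "type": "string", "index": "not_analyzed" },
            "from":     { "type": "string", "index": "not_analyzed" }
          }
        }
      }
    }'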
My current practice is to use a script to clean the existing CSV files into the field format specified above. If a dump is a .sql file, I load it into MySQL first and then export a CSV. In other words, every site's database ultimately becomes a CSV file in the same format; fields a source doesn't have are filled with empty strings. That way everything imports into ES without changing the Logstash configuration, and migration is easy. A sketch of such a cleaning script follows.
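This is a rough Python 2 sketch, not the author's original script; the source column mapping differs for every site and is purely illustrative:

    # -*- coding: utf-8 -*-
    import csv

    # The unified 10-column format described above.
    FIELDS = ["nickname", "password", "email", "qq", "telno",
              "idno", "realname", "address", "salt", "from"]

    # How one particular dump's columns map onto the unified fields;
    # adjust per source. Anything unmapped stays an empty string.
    SOURCE_MAP = {"username": "nickname", "passwd": "password", "mail": "email"}

    def clean(src_path, dst_path, source_name):
        with open(src_path, "rb") as fin, open(dst_path, "wb") as fout:
            reader = csv.DictReader(fin)
            writer = csv.writer(fout)
            for row in reader:
                out = dict.fromkeys(FIELDS, "")   # missing fields -> empty string
                for src_col, dst_col in SOURCE_MAP.items():
                    out[dst_col] = row.get(src_col, "")
                out["from"] = source_name          # record where the data came from
                writer.writerow([out[f] for f in FIELDS])

    clean("site_dump.csv", "site_clean.csv", "somesite.com")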
Configure the Logstash file (test.conf).
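The original test.conf is not reproduced in this copy of the article; the sketch below shows what a minimal version might look like for Logstash 2.0 (the CSV path, separator, and index name are my assumptions):

    input {
      file {
        path => "D:/data/*.csv"
        start_position => "beginning"
      }
    }

    filter {
      csv {
        separator => ","
        columns => ["nickname", "password", "email", "qq", "telno",
                    "idno", "realname", "address", "salt", "from"]
      }
    }

    output {
      elasticsearch {
        hosts => ["192.168.1.5:9200"]
        index => "socialdb"
      }
    }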
Then, from the Logstash bin directory, execute:

    logstash.bat -f test.conf
You should then see output scrolling by as the data is written into ES.
When repeating a test, be careful to delete the sincedb file in your home directory, otherwise Logstash treats the CSV files as already read and skips them. I wrote a small Python script to handle this; a sketch is below.
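The original script is not included in this copy; here is a minimal sketch of such a cleanup script (the ES address and index name are assumptions and must match your test.conf):

    # -*- coding: utf-8 -*-
    import os
    import glob
    import requests  # third-party: pip install requests

    ES = "http://192.168.1.5:9200"
    INDEX = "socialdb"  # hypothetical index name

    # Delete the index and everything in it so the import can be re-run.
    r = requests.delete("%s/%s" % (ES, INDEX))
    print("%s %s" % (r.status_code, r.text))

    # Remove Logstash's sincedb bookkeeping files from the home directory;
    # otherwise Logstash remembers the CSVs as already read and skips them.
    for p in glob.glob(os.path.expanduser("~/.sincedb*")):
        os.remove(p)
        print("removed %s" % p)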
The script is mainly used to delete the data in the created index and to remove some temporary files.
After the write finishes, you can view the index information at:

    http://<IP>:9200/_cat/indices?v
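The output looks roughly like this (the index name and numbers are made up):

    health status index    pri rep docs.count docs.deleted store.size pri.store.size
    yellow open   socialdb   5   1  300000000            0     85.2gb         85.2gb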
Log in to Kibana at http://localhost:5601/.
Click Settings -> Index Patterns -> Add New, enter the name of the index you created, and click Create.
You can then search on the Discover tab.
You can also search on a specific field, for example: telno:13588888888. This is Lucene query syntax.
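A few more queries in the same syntax (field names come from the schema above; the values are made up):

    qq:12345678
    telno:135* AND from:somesite.com
    nickname:admin OR realname:admin

Trailing wildcards (telno:135*) and the AND/OR operators all work here.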
Have fun!!!
PS:
1. Tested at 100 million records, queries were still millisecond-level; it currently holds 300 million records and remains very fast.
2. All three pieces of software are open source and can be downloaded from the official Elastic site; after downloading, only simple configuration is needed.