We build a website or an application, want to add search to it, and then discover that getting search to work well is hard. We want our search solution to be fast. We want zero configuration and a completely free search schema. We want to index data simply by sending JSON over HTTP, and we want our search server to be always available. We want to start with one server and scale out to several hundred. We want real-time search, simple multi-tenancy, and a solution built for the cloud. Elasticsearch aims to solve all of these problems and more.
In a simple test, about 20 million records were inserted into each system on two identical virtual machines. Elasticsearch inserted data much more slowly than MongoDB (though tolerably so), but its search/query speed was more than 10 times faster. That was on a single machine; Elasticsearch performs even better in a multi-machine cluster. The following installation steps were carried out on Ubuntu Server 14.04 LTS.
Install Elasticsearch
Upgrade the system first, then install Oracle Java 7. Elasticsearch officially recommends Oracle JDK 7, so do not try JDK 8 or OpenJDK:
$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get install software-properties-common
$ sudo add-apt-repository ppa:webupd8team/java
$ sudo apt-get update
$ sudo apt-get install oracle-java7-installer
Add the official Elasticsearch package source, then install Elasticsearch:
$ wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb http://packages.elasticsearch.org/elasticsearch/1.1/debian stable main" | sudo tee -a /etc/apt/sources.list
$ sudo apt-get update
$ sudo apt-get install elasticsearch
Register Elasticsearch to start at boot, start the service, and use curl to check that the installation succeeded (Elasticsearch listens for HTTP on port 9200 by default):
$ sudo update-rc.d elasticsearch defaults 95 1
$ sudo /etc/init.d/elasticsearch start
$ curl -X GET 'http://localhost:9200'
{
  "status": 200,
  "name": "Fer-de-Lance",
  "version": {
    "number": "1.1.1",
    "build_hash": "f1585f096d3f3985e73456debdc1a0745f512bbc",
    "build_timestamp": "2014-04-16T14:27:12Z",
    "build_snapshot": false,
    "lucene_version": "4.7"
  },
  "tagline": "You Know, for Search"
}
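The same check can be scripted. The following is only a minimal sketch, not part of the original setup: it assumes the lowercase field names that Elasticsearch 1.x returns from its root endpoint (status, name, version.number, version.lucene_version), and the helper name summarize_node_info is made up for illustration.

```python
import json


def summarize_node_info(response_text):
    """Pull a few fields out of the JSON returned by the root endpoint.

    Hypothetical helper; the field names are assumed to match the
    Elasticsearch 1.x response format.
    """
    info = json.loads(response_text)
    version = info.get("version", {})
    return {
        "ok": info.get("status") == 200,
        "name": info.get("name"),
        "elasticsearch": version.get("number"),
        "lucene": version.get("lucene_version"),
    }


# Abbreviated copy of the curl response, used as sample input.
SAMPLE = """{
  "status": 200,
  "name": "Fer-de-Lance",
  "version": {"number": "1.1.1", "lucene_version": "4.7"},
  "tagline": "You Know, for Search"
}"""

if __name__ == "__main__":
    print(summarize_node_info(SAMPLE))
```

In practice the text passed to the helper would be the body returned by the curl command rather than a hard-coded sample.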
Marvel, Elasticsearch's cluster and data management interface, is excellent. Unfortunately, it is free only for development environments; if it were completely free, it would be unbeatable. Installation is simple. Once it completes, restart the service and open http://192.168.2.172:9200/_plugin/marvel/ to see the interface:
$ sudo /usr/share/elasticsearch/bin/plugin -i elasticsearch/marvel/latest
$ sudo /etc/init.d/elasticsearch restart
* Stopping Elasticsearch Server [OK]
* Starting Elasticsearch Server [OK]
Install the Python client driver
As with MongoDB, we usually interact with Elasticsearch from a program. Elasticsearch provides client drivers for many languages; only the Python driver is installed here. For other languages, see the official documentation.
$ sudo apt-get install python-pip
$ sudo pip install elasticsearch
Write a simple program to import data from gene_info.txt to Elasticsearch:
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Python 2 script: import the tab-separated gene_info.txt into Elasticsearch.
import csv

from elasticsearch import Elasticsearch

def import_to_db():
    # 'rb' is the file mode the Python 2 csv module expects
    data = csv.reader(open('gene_info.txt', 'rb'), delimiter='\t')
    next(data)  # skip the header line
    es = Elasticsearch()  # connects to localhost:9200 by default
    for row in data:
        doc = {
            'tax_id': row[0],
            'geneid': row[1],
            'symbol': row[2],
            'locustag': row[3],
            'synonyms': row[4],
            'dbxrefs': row[5],
            'chromosome': row[6],
            'map_location': row[7],
            'description': row[8],
            'type_of_gene': row[9],
            'symbol_from_nomenclature_authority': row[10],
            'full_name_from_nomenclature_authority': row[11],
            'nomenclature_status': row[12],
            'other_designations': row[13],
            'modification_date': row[14]
        }
        # index the row as a 'gene_info' document in the 'gene' index
        res = es.index(index="gene", doc_type='gene_info', body=doc)

def main():
    import_to_db()

if __name__ == "__main__":
    main()
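The script above sends one HTTP request per row, which is part of why inserts feel slow. The Python client also ships a bulk helper, elasticsearch.helpers.bulk, that batches documents into fewer requests. The following is a sketch under assumptions rather than a measured improvement: the function names gene_actions and bulk_import are made up for illustration, only the first three columns are mapped to keep it short, and the `_type` key matches the Elasticsearch 1.x setup used here.

```python
import csv


def gene_actions(rows, index="gene", doc_type="gene_info"):
    """Yield one bulk action per row (hypothetical helper).

    Only the first three columns are mapped in this sketch; the full
    field list from the import script would go into "_source".
    """
    for row in rows:
        yield {
            "_index": index,
            "_type": doc_type,
            "_source": {
                "tax_id": row[0],
                "geneid": row[1],
                "symbol": row[2],
            },
        }


def bulk_import(path):
    """Index every data row of a tab-separated file using the bulk helper."""
    # Imported here so gene_actions can be used without the client installed.
    from elasticsearch import Elasticsearch, helpers

    with open(path) as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader)  # skip the header line
        # With default arguments, helpers.bulk returns
        # (number of successful actions, list of errors).
        return helpers.bulk(Elasticsearch(), gene_actions(reader))
```

Calling bulk_import('gene_info.txt') would replace the per-row es.index() loop; whether it closes the gap with MongoDB was not tested here.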
Kibana is a powerful data visualization client that integrates with Elasticsearch as a plugin. It is easy to install: download it, unpack it, and move it into the Elasticsearch plugins directory. Then restart the Elasticsearch service and open http://192.168.2.172:9200/_plugin/kibana/ to see the interface:
$ wget https://download.elasticsearch.org/kibana/kibana/kibana-3.0.1.tar.gz
$ tar zxvf kibana-3.0.1.tar.gz
$ sudo mv kibana-3.0.1 /usr/share/elasticsearch/plugins/_site
$ sudo /etc/init.d/elasticsearch restart
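Besides browsing in Kibana, the imported data can be queried directly from Python. The following is a minimal sketch, assuming the documents were indexed into the gene index with a lowercase symbol field; the function names symbol_query, hit_sources and search_symbol are hypothetical, while es.search() with a body argument is the client call used for the request.

```python
def symbol_query(symbol, size=10):
    """Build a _search request body: a match query on the 'symbol' field."""
    return {"size": size, "query": {"match": {"symbol": symbol}}}


def hit_sources(response):
    """Return the stored documents from a search response dictionary."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


def search_symbol(symbol):
    """Run the query against a local node and return the matching documents."""
    # Imported here so the two helpers above work without the client installed.
    from elasticsearch import Elasticsearch

    es = Elasticsearch()  # assumes a node on localhost:9200
    return hit_sources(es.search(index="gene", body=symbol_query(symbol)))
```

Keeping the query builder separate from the network call makes it easy to inspect the request body before sending it.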