Installing the Elasticsearch search tool and configuring the Python client driver

Elasticsearch is a search server built on Lucene. It provides a distributed, multi-tenant full-text search engine behind a RESTful web interface. Elasticsearch is written in Java, released as open source under the Apache license, and is the second most popular enterprise search engine. It was designed for cloud computing and offers real-time search while being stable, reliable, fast, and easy to install and use.
When we build a website or application and want to add search, we quickly find that search is hard to get right. We want the search solution to be fast, to need zero configuration, and to be completely free. We want to index data simply by sending JSON over HTTP, we want the search server to always be available, we want to start with one node and scale out to hundreds, we want real-time search and simple multi-tenancy, and we want a solution built for the cloud. Elasticsearch is designed to solve all of these problems and more.
Elasticsearch is a newer member of the open source search platforms and a powerful tool for real-time data analysis that has developed rapidly. It is based on Lucene, RESTful, distributed, and designed for the cloud, and it offers real-time full-text search. It is described as stable, highly reliable, scalable, and easy to install and use. The descriptions all sound appealing, so it is worth taking for a spin.
I ran a simple test on two identical virtual machines with about 20 million records. Elasticsearch inserted data much more slowly than MongoDB (tolerably so), but searches and queries were more than 10 times faster. This was only a single-machine test; in a multi-machine cluster Elasticsearch should perform even better. The following installation steps were carried out on Ubuntu Server 14.04 LTS.





Installing Elasticsearch
After upgrading the system, install Oracle Java 7. Elasticsearch officially recommends Oracle JDK 7, so do not try JDK 8 or OpenJDK:

$ sudo apt-get update
$ sudo apt-get upgrade
 
$ sudo apt-get install software-properties-common
$ sudo add-apt-repository ppa:webupd8team/java
$ sudo apt-get update
 
$ sudo apt-get install oracle-java7-installer
Add the official Elasticsearch package repository, then install Elasticsearch:

$ wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb http://packages.elasticsearch.org/elasticsearch/1.1/debian stable main" | sudo tee -a /etc/apt/sources.list
 
$ sudo apt-get update
$ sudo apt-get install elasticsearch
Register Elasticsearch to start at boot, start the service, and use curl to check that the installation succeeded:

$ sudo update-rc.d elasticsearch defaults 95 1
 
$ sudo /etc/init.d/elasticsearch start
 
$ curl -X GET 'http://localhost:9200'
{
 "status": 200,
 "name": "Fer-de-Lance",
 "version": {
  "number": "1.1.1",
  "build_hash": "f1585f096d3f3985e73456debdc1a0745f512bbc",
  "build_timestamp": "2014-04-16T14:27:12Z",
  "build_snapshot": false,
  "lucene_version": "4.7"
 },
 "tagline": "You Know, for Search"
}
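
The same check can be scripted. The following is a minimal sketch, assuming the root endpoint returns JSON shaped like the response above. The helper name `summarize_root_response` is hypothetical and is not part of any Elasticsearch library.

```python
import json


def summarize_root_response(body):
    """Return (status, version number) from the JSON text served at the root URL.

    Hypothetical helper: it only relies on the 'status' and 'version.number'
    fields that appear in the sample response above.
    """
    info = json.loads(body)
    return info.get("status"), info.get("version", {}).get("number")


# Sample body copied from the curl output above.
sample = """
{
 "status": 200,
 "name": "Fer-de-Lance",
 "version": {
  "number": "1.1.1",
  "build_hash": "f1585f096d3f3985e73456debdc1a0745f512bbc",
  "build_timestamp": "2014-04-16T14:27:12Z",
  "build_snapshot": false,
  "lucene_version": "4.7"
 },
 "tagline": "You Know, for Search"
}
"""

status, version = summarize_root_response(sample)
print("status=%s version=%s" % (status, version))
```

In a live check, the body would come from an HTTP GET against http://localhost:9200 rather than from a pasted string.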
Marvel, Elasticsearch's cluster and data management interface, is very good. Unfortunately it is free only for development use; if it were free everywhere it would be unbeatable. Installation is simple: install the plugin, restart the service, and the interface becomes available:

$ sudo /usr/share/elasticsearch/bin/plugin -i elasticsearch/marvel/latest
 
$ sudo /etc/init.d/elasticsearch restart
 * Stopping Elasticsearch Server [OK]
 * Starting Elasticsearch Server [OK]

Install the Python client driver
As with MongoDB, we usually interact with Elasticsearch from programs. Elasticsearch has client drivers for many languages; only the Python driver is installed here. For other languages, refer to the official documentation.

$ sudo apt-get install python-pip
$ sudo pip install elasticsearch
Write a simple program to import the data of gene_info.txt into Elasticsearch:

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import csv

from elasticsearch import Elasticsearch


def import_to_db():
    es = Elasticsearch()  # connects to localhost:9200 by default
    # Python 2: the csv module expects the file opened in binary mode.
    with open('gene_info.txt', 'rb') as f:
        data = csv.reader(f, delimiter='\t')
        next(data)  # skip the first (header) line
        for row in data:
            doc = {
                'tax_id': row[0],
                'GeneID': row[1],
                'Symbol': row[2],
                'LocusTag': row[3],
                'Synonyms': row[4],
                'dbXrefs': row[5],
                'chromosome': row[6],
                'map_location': row[7],
                'description': row[8],
                'type_of_gene': row[9],
                'Symbol_from_nomenclature_authority': row[10],
                'Full_name_from_nomenclature_authority': row[11],
                'Nomenclature_status': row[12],
                'Other_designations': row[13],
                'Modification_date': row[14]
            }
            # Index each row as one gene_info document in the gene index.
            es.index(index="gene", doc_type='gene_info', body=doc)


def main():
    import_to_db()


if __name__ == "__main__":
    main()
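
Indexing one document per request is simple but slow for a file with millions of rows. As a possible alternative, the sketch below prepares the same documents as actions for `elasticsearch.helpers.bulk`, which sends them in batches. The names `FIELDS`, `row_to_action`, and `bulk_import` are hypothetical, and the `_type` key is an assumption that matches the `doc_type` used above and the 1.x server installed here; newer Elasticsearch releases dropped document types.

```python
# Column names in the order used by the import script above.
FIELDS = [
    'tax_id', 'GeneID', 'Symbol', 'LocusTag', 'Synonyms', 'dbXrefs',
    'chromosome', 'map_location', 'description', 'type_of_gene',
    'Symbol_from_nomenclature_authority',
    'Full_name_from_nomenclature_authority',
    'Nomenclature_status', 'Other_designations', 'Modification_date',
]


def row_to_action(row, index='gene', doc_type='gene_info'):
    """Turn one parsed row into an action dict for elasticsearch.helpers.bulk."""
    return {
        '_index': index,
        '_type': doc_type,
        '_source': dict(zip(FIELDS, row)),
    }


def bulk_import(rows):
    """Send all rows in batches; needs the elasticsearch package and a running server."""
    from elasticsearch import Elasticsearch, helpers
    return helpers.bulk(Elasticsearch(), (row_to_action(r) for r in rows))


# Made-up row for illustration only (not real gene_info.txt data).
example_row = ['0', '1', 'SYM'] + ['-'] * 12
action = row_to_action(example_row)
print(action['_source']['Symbol'])
```

Only `row_to_action` runs here; `bulk_import` would be called with the rows produced by `csv.reader` once the server is up.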
Kibana is a powerful data visualization client. It integrates with Elasticsearch as a plugin and is easy to install: download and unpack it, restart the Elasticsearch service, and then visit http://192.168.2.172:9200/_plugin/kibana/ (192.168.2.172 is the server used here) to see the interface:

$ wget https://download.elasticsearch.org/kibana/kibana/kibana-3.0.1.tar.gz
$ tar zxvf kibana-3.0.1.tar.gz
$ sudo mv kibana-3.0.1 /usr/share/elasticsearch/plugins/_site
$ sudo /etc/init.d/elasticsearch restart
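
Kibana is one way to look at the indexed data; the gene index can also be queried from the Python client installed earlier. The following is a minimal sketch, assuming the documents were imported with a `Symbol` field as in the script above. The helper names `build_symbol_query` and `search_symbol` are hypothetical, `BRCA1` is just an example value, and the `body=` argument follows the client versions of that period.

```python
def build_symbol_query(symbol, size=10):
    """Build a match query body on the 'Symbol' field of the imported documents."""
    return {
        'size': size,
        'query': {'match': {'Symbol': symbol}},
    }


def search_symbol(symbol):
    """Run the query; needs the elasticsearch package and a running server."""
    from elasticsearch import Elasticsearch
    es = Elasticsearch()
    res = es.search(index='gene', body=build_symbol_query(symbol))
    # Each hit carries the original document under '_source'.
    return [hit['_source'] for hit in res['hits']['hits']]


body = build_symbol_query('BRCA1')
print(body)
```

Only the query body is built here; `search_symbol` would return the matching documents once the server is running and the data has been imported.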

