Install the Elasticsearch search tool and configure the Python driver

Source: Internet
Author: User
Tags: kibana

Elasticsearch is a Lucene-based search server. It provides a distributed, multi-tenant full-text search engine with a RESTful web interface. Elasticsearch is developed in Java and released as open source under the terms of the Apache license. It is the second most popular enterprise search engine. Designed for cloud computing, it offers real-time search and is stable, reliable, fast, and easy to install and use.
Suppose we build a website or application and want to add search, which is hard to do well. We want the search solution to be fast, to need zero configuration, and to be completely free. We want to index data simply by sending JSON over HTTP, we want the search server to be always available, and we want to start with one server and scale to several hundred. We need real-time search and simple multi-tenancy, and we want to build a cloud solution. Elasticsearch aims to solve all these problems and more.
Elasticsearch is a relatively new member of the open-source search platform family. It provides real-time data analysis and is growing rapidly. It is built on Lucene, RESTful, distributed, and designed for cloud computing, and it offers real-time full-text search that is stable, highly reliable, scalable, and easy to install and use. The description sounds good, so it seemed worth trying out.
In a simple test, about 20 million records were inserted on two identical virtual machines. Elasticsearch inserted data much more slowly than MongoDB (though tolerably), but its search/query speed was more than 10 times faster. This was only a single-machine case; Elasticsearch performs better in multi-machine clusters. The following installation steps were completed on Ubuntu Server 14.04 LTS.

Install Elasticsearch
Upgrade the system, then install Oracle Java 7. Elasticsearch officially recommends Oracle JDK 7, so do not try JDK 8 or OpenJDK:

$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get install software-properties-common
$ sudo add-apt-repository ppa:webupd8team/java
$ sudo apt-get update
$ sudo apt-get install oracle-java7-installer

After adding the official Elasticsearch package source, install elasticsearch:

$ wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb http://packages.elasticsearch.org/elasticsearch/1.1/debian stable main" | sudo tee -a /etc/apt/sources.list
$ sudo apt-get update
$ sudo apt-get install elasticsearch

Register the service to start at boot, start the elasticsearch service, and use curl to test whether the installation succeeded:

$ sudo update-rc.d elasticsearch defaults 95 1
$ sudo /etc/init.d/elasticsearch start
$ curl -X GET 'http://localhost:9200'
{
  "status" : 200,
  "name" : "Fer-de-Lance",
  "version" : {
    "number" : "1.1.1",
    "build_hash" : "f1585f096d3f3985e73456debdc1a0745f512bbc",
    "build_timestamp" : "2014-04-16T14:27:12Z",
    "build_snapshot" : false,
    "lucene_version" : "4.7"
  },
  "tagline" : "You Know, for Search"
}
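The same check can be scripted. The following is a minimal sketch that is not part of the original walkthrough: it uses only the Python standard library, and the helper names parse_root_response and fetch_root are hypothetical. It parses the JSON body returned by the root endpoint and extracts the node name and version number; the sample string is an abbreviated copy of the curl output above, so the parsing step can be tried without a running server.

```python
import json

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2, the default on Ubuntu 14.04


def parse_root_response(text):
    """Return (node_name, version_number) from the root endpoint's JSON body."""
    info = json.loads(text)
    return info["name"], info["version"]["number"]


def fetch_root(url="http://localhost:9200"):
    """Fetch the root endpoint and parse it; assumes the service is running."""
    response = urlopen(url, timeout=5)
    try:
        return parse_root_response(response.read().decode("utf-8"))
    finally:
        response.close()


# Abbreviated copy of the curl output, so no server is needed for this part.
SAMPLE = (
    '{"status": 200, "name": "Fer-de-Lance", '
    '"version": {"number": "1.1.1", "lucene_version": "4.7"}, '
    '"tagline": "You Know, for Search"}'
)

name, number = parse_root_response(SAMPLE)
print("%s is running Elasticsearch %s" % (name, number))
```

Calling fetch_root() against a live node should return the same kind of tuple; it raises an error if nothing is listening on the given URL.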

Marvel, Elasticsearch's cluster and data management interface, is excellent. Unfortunately, it is free of charge only for development environments; if it were completely free, it would be unbeatable. Installation is simple: install the plugin, restart the service, and open http://192.168.2.172:9200/_plugin/marvel/ to see the interface:

$ sudo /usr/share/elasticsearch/bin/plugin -i elasticsearch/marvel/latest
$ sudo /etc/init.d/elasticsearch restart
 * Stopping Elasticsearch Server                      [ OK ]
 * Starting Elasticsearch Server                      [ OK ]

Install the Python client driver
As with MongoDB, we generally interact with Elasticsearch through a program. Elasticsearch supports client drivers in multiple languages; only the Python driver is installed here. For other languages, see the official documentation.

$ sudo apt-get install python-pip
$ sudo pip install elasticsearch

Write a simple program to import data from gene_info.txt to Elasticsearch:

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import csv

from elasticsearch import Elasticsearch


def import_to_db():
    es = Elasticsearch()  # connects to localhost:9200 by default
    # Python 2's csv module expects binary mode; on Python 3 use
    # open('gene_info.txt', newline='') instead.
    with open('gene_info.txt', 'rb') as f:
        data = csv.reader(f, delimiter='\t')
        next(data)  # skip the header line
        for row in data:
            doc = {
                'tax_id': row[0],
                'GeneID': row[1],
                'Symbol': row[2],
                'LocusTag': row[3],
                'Synonyms': row[4],
                'dbXrefs': row[5],
                'chromosome': row[6],
                'map_location': row[7],
                'description': row[8],
                'type_of_gene': row[9],
                'Symbol_from_nomenclature_authority': row[10],
                'Full_name_from_nomenclature_authority': row[11],
                'Nomenclature_status': row[12],
                'Other_designations': row[13],
                'Modification_date': row[14]
            }
            # Index one document per row into index "gene", type "gene_info".
            es.index(index="gene", doc_type='gene_info', body=doc)


def main():
    import_to_db()


if __name__ == "__main__":
    main()
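Indexing one document per request, as the loop above does, can be slow for a file with millions of rows. One possible refinement, sketched here under assumptions and not taken from the original article, is to send documents in batches through the client's helpers.bulk function. The FIELDS list and the rows_to_actions and bulk_import names are hypothetical. The "_type" key mirrors the doc_type used above and assumes a 1.x-era cluster; more recent Elasticsearch versions do not use mapping types.

```python
# Column names in file order, matching the keys used in the script above.
FIELDS = [
    'tax_id', 'GeneID', 'Symbol', 'LocusTag', 'Synonyms', 'dbXrefs',
    'chromosome', 'map_location', 'description', 'type_of_gene',
    'Symbol_from_nomenclature_authority',
    'Full_name_from_nomenclature_authority',
    'Nomenclature_status', 'Other_designations', 'Modification_date',
]


def rows_to_actions(rows, index="gene", doc_type="gene_info"):
    """Yield one bulk action per row, pairing column names with values."""
    for row in rows:
        yield {
            "_index": index,
            "_type": doc_type,  # assumption: 1.x-era cluster, as in the article
            "_source": dict(zip(FIELDS, row)),
        }


def bulk_import(rows):
    """Send all rows in batches; needs the elasticsearch package and a running node."""
    # Imported here so rows_to_actions works without the client installed.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch()
    return helpers.bulk(es, rows_to_actions(rows))


# Dry run on one made-up row (placeholder values, not real gene_info data).
sample_row = ["col%d" % i for i in range(len(FIELDS))]
action = next(rows_to_actions([sample_row]))
print(action["_source"]["Symbol"])
```

To use it with the real file, pass the csv.reader object (after skipping the header line) to bulk_import. Because rows_to_actions is a generator, the whole file is never held in memory at once.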

Kibana is a powerful data visualization client that integrates with Elasticsearch as a plugin. Installation is easy: download and unpack the archive, move it into the Elasticsearch plugins directory, restart the Elasticsearch service, and open http://192.168.2.172:9200/_plugin/kibana/ to see the interface:

$ wget https://download.elasticsearch.org/kibana/kibana/kibana-3.0.1.tar.gz
$ tar zxvf kibana-3.0.1.tar.gz
$ sudo mv kibana-3.0.1 /usr/share/elasticsearch/plugins/_site
$ sudo /etc/init.d/elasticsearch restart
