Using RethinkDB from Python: a summary

Source: Internet
Author: User
Tags: bind, mongodb

Like MongoDB, RethinkDB is a database engine used primarily to store JSON documents (MongoDB stores BSON). It is easy to join multiple nodes into a distributed database, and it has a very useful query language that supports table joins and group-by operations.
Yesterday I played with RethinkDB a bit. In a test on a virtual machine, inserting 25 million records showed fairly stable performance, holding at 1.5K to 2K records per second. RethinkDB's data sharding feature is very simple and can be completed with one click. The installation and testing below were done on Ubuntu 12.04.4 LTS Server.
Add the official RethinkDB package source, then install:

The code is as follows:
$ sudo apt-get install python-software-properties
$ sudo add-apt-repository ppa:rethinkdb/ppa
$ sudo apt-get update
$ sudo apt-get install rethinkdb

Copy the sample configuration file and edit the bind setting so that the server can be reached from other machines:
The code is as follows:
$ cd /etc/rethinkdb/
$ sudo cp default.conf.sample instances.d/default.conf

$ sudo vi instances.d/default.conf
...
# bind=127.0.0.1
bind=0.0.0.0
...


Start RethinkDB:
The code is as follows:
$ sudo /etc/init.d/rethinkdb start
rethinkdb: default: Starting instance. (logging to '/var/lib/rethinkdb/default/data/log_file')
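
Once the instance is running, one quick way to confirm that the bind change took effect is to probe the server's ports from another machine. The sketch below is only an illustration: `port_open` is a helper name I made up, and it checks nothing beyond whether a TCP connection can be opened.

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, or timed out: treat all as "not reachable".
        return False
    sock.close()
    return True
```

For the setup in this article, `port_open("192.168.2.39", 8080)` would test the web admin port and `port_open("192.168.2.39", 28015)` the client driver port. If both return False from a remote machine but True on the server itself, the bind setting (or a firewall) is the likely cause.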

Visit http://192.168.2.39:8080/ to see the RethinkDB admin interface.

If you don't like working on the command line, the web interface also provides Data Explorer, an online query tool with syntax highlighting and inline function hints, so no separate help files are needed.


To work with RethinkDB from a program, you need to install a client driver. The officially supported drivers cover three languages: JavaScript, Ruby, and Python. Community-supported drivers cover almost all major programming languages, including C, Go, C++, Java, PHP, Perl, Clojure, Erlang, and so on. I use Python a bit more, so I install the Python client driver here:
The code is as follows:
$ sudo apt-get install python-pip
$ sudo pip install rethinkdb

Test whether the driver works. If import rethinkdb raises no error, the module was most likely installed successfully:
The code is as follows:
$ python
Python 2.7.3 (default, Feb 27 2014, 19:58:35)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import rethinkdb
>>>

gene2go.txt is a text file containing gene data, with more than 10 million records, formatted as follows:
The code is as follows:
$ head -2 gene2go.txt
#Format: tax_id GeneID GO_ID Evidence Qualifier GO_term PubMed Category (tab is used as a separator, pound sign - start of a comment)
3702 814629 GO:0005634 ISM - nucleus - Component
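
To make the record layout concrete, here is a minimal sketch of how one line maps onto the eight columns named in the header. The sample string is the record shown above with the tabs written out explicitly; reading it as eight fields (with "-" for the empty Qualifier and PubMed columns) and the capitalisation of the field names are my assumptions, and `parse_line` is a hypothetical helper, not part of any library.

```python
# Column names taken from the "#Format:" header line of gene2go.txt.
FIELDS = ["tax_id", "GeneID", "GO_ID", "Evidence",
          "Qualifier", "GO_term", "PubMed", "Category"]

def parse_line(line):
    """Turn one tab-separated record into a dict; return None for comments and blanks."""
    line = line.rstrip("\r\n")
    if not line or line.startswith("#"):
        return None
    values = line.split("\t")
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(values)))
    return dict(zip(FIELDS, values))

sample = "3702\t814629\tGO:0005634\tISM\t-\tnucleus\t-\tComponent"
record = parse_line(sample)
print("%s %s" % (record["GO_ID"], record["Category"]))
```

All values stay strings here; converting tax_id or GeneID to integers would be a separate decision.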

Write a simple program to import the gene2go.txt data into RethinkDB:
The code is as follows:
#!/usr/bin/python
# -*- coding: utf-8 -*-

import csv

import rethinkdb as r

def csv2db():
    # Python 2: the csv module expects the file opened in binary mode
    data = csv.reader(open('gene2go.txt', 'rb'), delimiter='\t')
    data.next()  # skip the "#Format:" header line

    # repl() makes this the default connection, so run() needs no argument
    r.connect('localhost', 28015).repl()
    r.db('test').table_create('gene2go').run()
    gene2go = r.db('test').table('gene2go')
    for row in data:
        # soft durability and noreply trade write safety for speed: the
        # client waits neither for the write to reach disk nor for a reply
        gene2go.insert({
            'tax_id': row[0],
            'GeneID': row[1],
            'GO_ID': row[2],
            'Evidence': row[3],
            'Qualifier': row[4],
            'GO_term': row[5],
            'PubMed': row[6],
            'Category': row[7]
        }).run(durability="soft", noreply=True)

def main():
    csv2db()

if __name__ == "__main__":
    main()
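
The program above sends one insert per record. Since insert also accepts a list of documents, a possible variation is to send the records in batches. The following is only a sketch under assumptions, not something measured in this article: `batches` and `import_in_batches` are hypothetical names, the batch size of 200 is an arbitrary starting point, the table is assumed to exist already, and the driver is assumed to be a version that supports import rethinkdb as r as used above (newer driver releases expose the same API through a RethinkDB() object instead).

```python
import csv

FIELDS = ["tax_id", "GeneID", "GO_ID", "Evidence",
          "Qualifier", "GO_term", "PubMed", "Category"]

def batches(rows, size):
    """Yield lists of at most `size` items from any iterable."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch

def import_in_batches(path, batch_size=200):
    """Insert the records of a gene2go-style file, one list of documents per query."""
    # Imported here so that batches() can be used without the driver installed.
    import rethinkdb as r
    conn = r.connect('localhost', 28015)
    table = r.db('test').table('gene2go')  # assumed to have been created already
    # Text mode is enough for this simple tab-separated file.
    with open(path) as f:
        reader = csv.reader(f, delimiter='\t')
        next(reader)  # skip the "#Format:" header line
        docs = (dict(zip(FIELDS, row)) for row in reader)
        for batch in batches(docs, batch_size):
            table.insert(batch).run(conn, durability="soft", noreply=True)
    conn.close()  # by default this waits for outstanding noreply writes
```

Whether batching actually improves on the 1.5K to 2K records per second mentioned earlier would have to be measured on the same machine.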
