Comparison of eight NoSQL database systems


Article Source: http://article.yeeyan.org/view/271351/239915

SQL databases are very useful tools, but their monopoly of roughly 15 years is coming to an end. It was only a matter of time: many workloads were forced into relational databases that never really fit them.

The differences between NoSQL databases, however, are far greater than the differences between any two SQL databases. This puts more responsibility on the software architect to choose a suitable NoSQL database at the very beginning of a project.

This article compares Cassandra, MongoDB, CouchDB, Redis, Riak, Membase, Neo4j, and HBase:

1. CouchDB (v1.1.0)

• Language: Erlang

• Key features: database consistency, ease of use

• License: Apache

• Protocol: HTTP/REST

• Bidirectional replication, continuous or ad hoc, with conflict detection, and therefore master-master replication

• MVCC: write operations do not block read operations

• Versioning: previous versions of documents remain available

• Crash-only (reliable) design

• Needs compaction from time to time

• Views: embedded map/reduce (MapReduce is a programming model for parallel operations on large datasets)

• Formatting views: lists and shows

• Server-side document validation is possible

• Authentication is possible

• Real-time updates of data changes (the _changes feed)

• Attachment handling

• CouchApps (standalone JavaScript applications)

• jQuery library included

Best used for: applications that accumulate large amounts of data that changes only occasionally and is read through predefined queries. Also a fit where data versioning is important.

For example: CRM and CMS systems. Master-master replication is an especially interesting feature because it makes multi-site deployments easy.
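
The MVCC and conflict-detection points above can be illustrated with a toy model. This is a minimal sketch in Python, not CouchDB's actual API: the `TinyDocStore` class and its method names are hypothetical, and revisions are simplified to integer counters. It shows the idea that a write must name the revision it was based on, and that a write based on a stale revision is rejected as a conflict instead of silently overwriting newer data.

```python
class ConflictError(Exception):
    """Raised when a write is based on an outdated revision."""


class TinyDocStore:
    """Toy in-memory document store with revision checks (hypothetical, not CouchDB)."""

    def __init__(self):
        # doc_id -> list of (rev, body); older revisions stay readable
        self._history = {}

    def get(self, doc_id):
        """Return (rev, body) of the latest revision."""
        return self._history[doc_id][-1]

    def put(self, doc_id, body, based_on_rev=None):
        """Store a new revision; the caller must name the revision it read."""
        revisions = self._history.get(doc_id, [])
        current_rev = revisions[-1][0] if revisions else None
        if based_on_rev != current_rev:
            raise ConflictError(
                f"{doc_id}: expected rev {current_rev}, got {based_on_rev}"
            )
        new_rev = 1 if current_rev is None else current_rev + 1
        self._history[doc_id] = revisions + [(new_rev, dict(body))]
        return new_rev


store = TinyDocStore()
rev1 = store.put("customer-1", {"name": "Ada"})
rev2 = store.put("customer-1", {"name": "Ada L."}, based_on_rev=rev1)

try:
    # A second client still holds rev1, so its write is rejected.
    store.put("customer-1", {"name": "A."}, based_on_rev=rev1)
    conflict_detected = False
except ConflictError:
    conflict_detected = True
```

In this sketch the rejected client would have to re-read the document and retry. How conflicts are resolved after replication between two masters is a separate question that the toy model does not cover.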

-----------------------------------------------------------------------

2. Redis (v2.4)

• Language: C

• Key feature: extremely fast

• License: BSD

• Protocol: Telnet-like

• Disk-backed in-memory database

• Currently no disk swap (VM and Diskstore were abandoned)

• Master-slave replication

• Simple values or hash tables stored by key

• Complex operations such as ZREVRANGEBYSCORE

• INCR and related commands (atomic counters, useful for rate limiting or statistics)

• Sets (with union, difference, and intersection)

• Lists (usable as queues; blocking pop is supported)

• Hashes (objects with multiple fields)

• Sorted sets (high-score tables; good for range queries)

• Transactions

• Values can be set to expire (as in a cache; useful for separating hot and cold data)

• Publish/subscribe lets you implement messaging (for example, push over long-lived connections)

Best used for: rapidly changing data with a foreseeable database size, so that memory capacity can be planned sensibly.

For example: stock prices, analytics, real-time data collection, real-time communication.
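
The rate-limiting use of INCR together with expiring keys, mentioned in the list above, can be sketched as follows. This is an in-memory stand-in written in plain Python, not a Redis client: `TinyCounterStore`, `allow_request`, the key format, and the limit and window values are all hypothetical. With a real Redis server the same pattern is typically built from the INCR and EXPIRE commands.

```python
class TinyCounterStore:
    """In-memory stand-in for INCR plus key expiry (hypothetical, not a Redis client)."""

    def __init__(self):
        self._values = {}   # key -> integer counter
        self._expires = {}  # key -> absolute expiry time in seconds

    def incr(self, key, now, ttl):
        """Increment key; a missing or expired key restarts at 1 with a fresh TTL."""
        if key not in self._values or now >= self._expires[key]:
            self._values[key] = 0
            self._expires[key] = now + ttl
        self._values[key] += 1
        return self._values[key]


def allow_request(store, client_id, now, limit=3, window=60):
    """Allow at most `limit` requests per client in each `window`-second period."""
    return store.incr(f"rate:{client_id}", now, window) <= limit


store = TinyCounterStore()
# Four requests in quick succession, then one after the window has expired.
decisions = [allow_request(store, "alice", now=t) for t in (0, 1, 2, 3, 61)]
print(decisions)  # [True, True, True, False, True]
```

The time is passed in explicitly so the example is deterministic. This is a fixed-window limiter, the simplest variant; it allows short bursts at window boundaries.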

-----------------------------------------------------------------------

3. MongoDB

• Language: C++

• Key feature: retains some friendly properties of SQL (queries, indexes)

• License: AGPL (drivers: Apache)

• Protocol: custom, binary (BSON)

• Master/slave replication (data is replicated between servers, with automatic failover)

• Built-in automatic sharding (supports horizontally scaled clusters)

• Queries are JavaScript expressions

• Arbitrary JavaScript functions can be executed on the server side

• Better update-in-place than CouchDB

• Uses memory-mapped files for data storage

• Favors performance over features

• Journaling is best turned on (the --journal option)

• On 32-bit operating systems the database size is limited to about 2.5 GB

• An empty database takes up about 192 MB

• GridFS stores large data plus metadata (it is not a real file system)

Best used for: cases where you need dynamic queries, prefer defining indexes to writing map/reduce functions, and need good performance on a large database. Also a fit if you wanted to use CouchDB but your data changes too often and fills up the disks.

For example: most things you would otherwise do with MySQL or PostgreSQL, where the need for a predefined table structure holds you back.
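
To make "dynamic queries without a predefined table structure" concrete, here is a small sketch in plain Python. It is not MongoDB's driver API: the `matches` and `find` helpers and the sample documents are hypothetical, and only equality plus the `$gt` and `$lt` comparison operators are handled. A real server would also use indexes instead of scanning every document.

```python
OPERATORS = {
    "$gt": lambda value, operand: value > operand,
    "$lt": lambda value, operand: value < operand,
}


def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict):
            # Operator form, e.g. {"price": {"$lt": 500}}
            for name, operand in condition.items():
                if name not in OPERATORS:
                    raise ValueError(f"unsupported operator: {name}")
                if value is None or not OPERATORS[name](value, operand):
                    return False
        elif value != condition:
            # Plain form, e.g. {"material": "oak"}: field must equal the value
            return False
    return True


def find(collection, query):
    """Full scan: return all documents that match the query."""
    return [doc for doc in collection if matches(doc, query)]


# Documents in one collection do not need to share the same fields.
products = [
    {"name": "laptop", "price": 1200, "tags": ["office"]},
    {"name": "mouse", "price": 25},
    {"name": "desk", "price": 300, "material": "oak"},
]

cheap = find(products, {"price": {"$lt": 500}})
oak = find(products, {"material": "oak"})
```

Here `cheap` contains the mouse and the desk, and `oak` contains only the desk; documents that lack a queried field simply do not match.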

-----------------------------------------------------------------------

4. Riak (v1.0)

• Language: Erlang and C, with some JavaScript

• Key feature: fault tolerance

• License: Apache

• Protocol: HTTP/REST or custom binary

• Tunable trade-offs for distribution and replication (N: number of replica nodes; R: minimum number of nodes that must answer for a read to succeed; W: minimum number of nodes that must acknowledge for a write to succeed)

• Pre-commit and post-commit hooks in JavaScript or Erlang, for validation and security checks

• Map/reduce in JavaScript or Erlang

• Links and link walking: can be used as a graph database

• Secondary indexes: search in metadata

• Large object support (Luwak)

• Available in "open source" and "enterprise" editions

• Riak Search server (beta) supports full-text search, indexing, and querying

• In the process of migrating the storage back end from "Bitcask" to Google's "LevelDB"

• Masterless multi-site replication and SNMP monitoring require a commercial license

Best used for: cases where you want something like Cassandra (Dynamo-like; Dynamo is Amazon's key-value storage platform) but do not want to deal with its bloat and complexity. Also a fit if you need very good single-site scalability, availability, and fault tolerance, and are prepared to pay for multi-site replication.

For example: point-of-sale data collection, factory control systems, and other places where even seconds of downtime hurt. It can also be used as an easily updatable web server.
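
The N, R, and W parameters described above trade consistency against availability. The following sketch assumes the idealized Dynamo-style model, ignoring details such as sloppy quorums and hinted handoff; the function name and the returned keys are hypothetical. The key observation is that when R + W > N, every read quorum overlaps every write quorum, so a read contacts at least one node that holds the latest successful write.

```python
def quorum_summary(n, r, w):
    """Summarize what an (N, R, W) setting implies in the idealized model."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must each be between 1 and N")
    return {
        # Overlap of read and write quorums is guaranteed only if R + W > N.
        "reads_see_latest_write": r + w > n,
        # A read still succeeds while at least R of the N replicas respond.
        "read_failures_tolerated": n - r,
        # A write still succeeds while at least W of the N replicas respond.
        "write_failures_tolerated": n - w,
    }


balanced = quorum_summary(n=3, r=2, w=2)
fast_but_weak = quorum_summary(n=3, r=1, w=1)
```

With N=3, R=2, W=2 the quorums overlap and one node may be down for either operation. With N=3, R=1, W=1 two nodes may be down, but a read can miss the latest write.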

-----------------------------------------------------------------------

5. Membase

• Language: Erlang and C

• Key feature: compatible with memcached, but with persistence and clustering

• License: Apache 2.0

• Protocol: memcached with extensions

• Very fast (200k+ operations per second) access to data by key

• Persistence to disk

• All nodes are identical (master-master replication)

• Also provides memcached-style in-memory cache buckets

• Write de-duplication to reduce I/O

• Good web interface for cluster management

• Software upgrades without taking the database offline

• Connection proxy for connection pooling and multiplexing

Best used for: applications that require low-latency data access, high concurrency, and high availability.

For example: low-latency use cases such as ad targeting, and highly concurrent web applications such as online games (e.g. Zynga).
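
One plausible reading of "write de-duplication to reduce I/O" is that repeated writes to the same key are coalesced in memory, so only the latest value per key reaches the disk. The sketch below illustrates that idea only; it is not Membase's implementation, and the `WriteBuffer` class and its attributes are hypothetical.

```python
class WriteBuffer:
    """Coalesces pending writes per key before flushing (hypothetical sketch)."""

    def __init__(self):
        self._pending = {}    # key -> latest value not yet written to disk
        self.disk = {}        # stand-in for persistent storage
        self.disk_writes = 0  # number of simulated disk writes

    def set(self, key, value):
        # A later write to the same key replaces the earlier pending one.
        self._pending[key] = value

    def flush(self):
        """Write each pending key once, then clear the buffer."""
        for key, value in self._pending.items():
            self.disk[key] = value
            self.disk_writes += 1
        self._pending.clear()


buffer = WriteBuffer()
for score in (10, 20, 30):
    buffer.set("player:1:score", score)
buffer.set("player:2:score", 5)
buffer.set("player:2:score", 7)
buffer.flush()
```

Five `set` calls result in two simulated disk writes, one per key, and the disk holds only the final values. The trade-off is that values still in the buffer are lost if the process stops before a flush.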

-----------------------------------------------------------------------

6. Neo4j (v1.5M02)

• Language: Java

• Key feature: graph database

• License: GPL, with some features under AGPL/commercial license

• Protocol: HTTP/REST (or embedded in Java)

• Can be used standalone or embedded in Java applications

• Full ACID conformity (including durable data)

• Both nodes and relationships can have metadata

• Integrated pattern-matching-based query language (Cypher)

• The graph traversal language "Gremlin" can also be used

• Indexing of nodes and relationships

• Built-in web administration interface

• Advanced path finding with multiple algorithms

• Indexing of keys and relationships

• Optimized for reads

• Transactions (in the Java API)

• Scriptable in Groovy

• Online backup, advanced monitoring, and high availability require the AGPL/commercial license

Best used for: graph-style, rich, or complex interconnected data. This is the most significant difference between Neo4j and the other NoSQL databases.

For example: social relations, public transport networks, road maps, network topologies.
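
Path finding over connected data, as in the public transport example, can be sketched with a plain breadth-first search. This is ordinary Python over an adjacency dictionary, not Cypher, Gremlin, or any Neo4j API; the `shortest_path` helper and the station names are made up for illustration.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search: return a path with the fewest hops, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Each station maps to the stations directly connected to it.
network = {
    "Central": ["Museum", "Harbor"],
    "Museum": ["Central", "University"],
    "Harbor": ["Central", "Airport"],
    "University": ["Museum", "Airport"],
    "Airport": ["Harbor", "University"],
}

route = shortest_path(network, "Central", "Airport")
print(route)  # ['Central', 'Harbor', 'Airport']
```

A graph database stores such relationships natively and can follow them without the join tables a relational schema would need, which is why this kind of query is its natural territory.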

-----------------------------------------------------------------------

7. Cassandra

• Language: Java

• Key feature: the best of BigTable and Dynamo (Amazon's key-value storage platform)

• License: Apache

• Protocol: custom, binary (Thrift)

• Tunable trade-offs for distribution and replication (N: number of replica nodes; R: minimum number of nodes that must answer for a read to succeed; W: minimum number of nodes that must acknowledge for a write to succeed)

• Querying by column and by range of keys

• BigTable-like features: columns, column families

• Write operations are faster than read operations

• Map/reduce is possible with Apache Hadoop (a distributed system infrastructure developed by the Apache Foundation)

I admit to being biased against Cassandra because of its bloat and complexity, which is partly due to Java (configuration, exceptions, and so on).

Best used for: workloads with more writes than reads (logging), or when every component of the system must be written in Java ("no one gets fired for choosing Apache software").

For example: banking and the financial industry (not necessarily for financial transactions themselves, since these industries are much bigger than that). Because writes are faster than reads, one natural niche is real-time data analysis.

-----------------------------------------------------------------------

8. HBase

(With the help of ghshephard)

• Language: Java

• Key feature: billions of rows by millions of columns

• License: Apache

• Protocol: HTTP/REST (Thrift is also supported)

• Modeled after BigTable

• Map/reduce with Hadoop

• Query predicate push-down via server-side scan and get filters

• Optimizations for real-time queries

• High-performance Thrift gateway

• HTTP interface supports XML, Protobuf, and binary

• Cascading, Hive, and Pig source and sink modules

• JRuby-based (JIRB) shell

• No single point of failure

• Rolling restart for configuration changes and minor upgrades

• Random access performance comparable to MySQL

Best used for: applications that favor the BigTable model and require random, real-time access to big data.

For example: the Facebook messaging database (more general examples coming soon).
