Comparison of 8 NoSQL database systems


This article was translated by Tang Yuhua (Bole Online) from the original by Kristóf Kovács. For reprint requirements, see the end of this article.

Guide: Kristóf Kovács is a software architect and consultant who recently published an article comparing various types of NoSQL databases.

Although SQL databases are very useful tools, their monopoly of about 15 years is coming to an end. It was only a matter of time: many things were forced into relational databases that never really fit them. NoSQL databases, however, differ from one another far more than SQL databases do. This means that the software architect should choose a suitable NoSQL database at the beginning of the project. This article compares Cassandra, MongoDB, CouchDB, Redis, Riak, Membase, Neo4j, and HBase.

(Note 1: NoSQL is a database movement whose advocates promote the use of non-relational data storage. Today's computer architectures require a huge level of scalability in data storage, and NoSQL sets out to provide it. Google's BigTable and Amazon's Dynamo are both NoSQL databases. See also the NoSQL entry.)

CouchDB

  • Language: Erlang
  • Features: DB consistency, ease of use
  • License: Apache
  • Protocol: HTTP/REST
  • Bidirectional replication, continuous or ad hoc, with conflict detection; hence master-master replication (see Note 2)
  • MVCC: write operations do not block read operations
  • Previous versions of documents are available
  • Crash-only (reliable) design
  • Needs compaction from time to time
  • Views: embedded map/reduce
  • Formatting views: lists and shows
  • Server-side document validation supported
  • Authentication supported
  • Real-time updates via the changes feed
  • Attachment handling
  • Hence CouchApps (standalone JS applications)
  • jQuery library included

Best used: applications whose data changes only occasionally, that run predefined queries, and that accumulate statistics. Also suited to applications that need data versioning.
For example: CRM and CMS systems. Master-master replication is especially useful for multi-site deployments.

(Note 2: Master-master replication is a database synchronization method that lets data be shared among a group of computers and updated by any member of the group.)

Redis

  • Language: C
  • Features: blazingly fast
  • License: BSD
  • Protocol: Telnet-like
  • In-memory database backed by disk storage; it can swap data to disk since version 2.0 (note: this feature is not supported in version 2.4)
  • Master-slave replication (see Note 3)
  • Simple values or hash tables indexed by key, but also complex operations such as ZREVRANGEBYSCORE
  • INCR and co. (good for rate limiting or statistics)
  • Sets (also union/diff/inter)
  • Lists (also usable as a queue; blocking pop)
  • Hashes (objects with multiple fields)
  • Sorted sets (high-score tables; good for range queries)
  • Transactions
  • Values can be set to expire (as in a cache)
  • Pub/Sub lets users implement messaging

Best used: applications where data changes quickly and the database size is foreseeable (it should mostly fit in memory).
For example: stock prices, analytics, real-time data collection, real-time communication.

(Note 3: Master-slave replication is a setup in which a single server handles all replication requests at any given time; it is typically applied to server clusters that need to provide high availability.)

MongoDB

  • Language: C++
  • Features: retains some friendly properties of SQL (queries, indexes)
  • License: AGPL (drivers: Apache)
  • Protocol: custom, binary (BSON)
  • Master/slave replication (automatic failover with replica sets)
  • Built-in sharding
  • Queries are JavaScript expressions
  • Can run arbitrary JavaScript functions on the server
  • Better update-in-place than CouchDB
  • Uses memory-mapped files for data storage
  • Favors performance over features
  • Journaling (the --journal option) is best turned on
  • On 32-bit operating systems, the database size is limited to approximately 2.5 GB
  • An empty database takes up about 192 MB
  • GridFS stores big data plus metadata (it is not a real file system)

Best used: when you need dynamic queries; when you prefer defining indexes to writing map/reduce functions; when you need good performance on a large database; when you wanted CouchDB, but your data changes too frequently and fills up the disk.
For example: most things you would otherwise do with MySQL or PostgreSQL, but where predefined columns hold you back.

Riak

  • Language: Erlang and C, with some JavaScript
  • Features: fault tolerance
  • License: Apache
  • Protocol: HTTP/REST or custom binary
  • Tunable trade-offs for distribution and replication (N, R, W)
  • Pre-commit and post-commit hooks in JavaScript or Erlang, for validation and security
  • Map/reduce in JavaScript or Erlang
  • Links and link walking: can be used as a graph database
  • Indexes: search on metadata (supported as of version 1.0)
  • Large object support (Luwak)
  • Comes in "open source" and "enterprise" editions
  • Full-text search, indexing, and querying with the Riak Search server (beta)
  • Masterless multi-site replication and SNMP monitoring are commercially licensed

Best used: when you want something Cassandra-like (Dynamo-like) but cannot deal with the bloat and complexity; when you plan multi-site replication but also need scalability, availability, and fault tolerance within a single site.
For example: point-of-sale data collection, factory control systems, places with strict limits on downtime. It could also be used as an easy-to-update web server.
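The tunable N, R, W settings mentioned for Riak can be illustrated with a small sketch. This is not Riak's API; the `QuorumConfig` class and its method names are hypothetical, and the only rule it encodes is the usual Dynamo-style one: a read is guaranteed to touch at least one replica that saw the latest successful write when R + W > N.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical helper for reasoning about Dynamo-style N, R, W settings."""
    n: int  # number of replicas kept for each key
    r: int  # replicas that must answer a read
    w: int  # replicas that must acknowledge a write

    def __post_init__(self):
        if not (1 <= self.r <= self.n and 1 <= self.w <= self.n):
            raise ValueError("r and w must be between 1 and n")

    def reads_overlap_writes(self) -> bool:
        # Any R replicas and any W replicas share a member when R + W > N.
        return self.r + self.w > self.n

    def tolerated_read_failures(self) -> int:
        # Replicas that may be down while reads still succeed.
        return self.n - self.r

    def tolerated_write_failures(self) -> int:
        # Replicas that may be down while writes still succeed.
        return self.n - self.w


for cfg in (QuorumConfig(3, 2, 2), QuorumConfig(3, 1, 1), QuorumConfig(3, 1, 3)):
    print(cfg, "overlap:", cfg.reads_overlap_writes(),
          "read failures tolerated:", cfg.tolerated_read_failures(),
          "write failures tolerated:", cfg.tolerated_write_failures())
```

Lower R and W values favor latency and availability; higher values favor consistency. That is the trade-off the N, R, W knobs expose.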

Membase

  • Language: Erlang and C
  • Features: Memcache compatible, but with persistence and clustering
  • License: Apache 2.0
  • Protocol: memcached plus extensions
  • Very fast (200k+ per second) access to data by key
  • Persistence to disk
  • All nodes are identical (master-master replication)
  • Also provides memcached-style in-memory caching buckets
  • Write de-duplication to reduce IO
  • Very good cluster-management web interface
  • Software upgrades without taking the database offline
  • Connection proxy for connection pooling and multiplexing

Best used: applications that require low-latency data access, high concurrency support, and high availability.
For example: low-latency use cases such as ad targeting, or highly concurrent web applications such as online games (e.g., Zynga).

Neo4j

  • Language: Java
  • Features: graph database built around relationships
  • License: GPL; some features are under AGPL/commercial licensing
  • Protocol: HTTP/REST (or embedded in Java)
  • Standalone, or embeddable in Java applications
  • Both nodes and edges of the graph can have metadata
  • Good self-contained web administration
  • Path search using multiple algorithms
  • Indexing of keys and relationships
  • Optimized for read operations
  • Transactions (in the Java API)
  • The Gremlin graph traversal language can be used
  • Scriptable in Groovy
  • Online backup, advanced monitoring, and high availability are under AGPL/commercial licensing

Best used: graph-style data. This is the most significant difference between Neo4j and the other NoSQL databases.
For example: social relations, public transport networks, maps, and network topologies.
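To make the "nodes and edges with metadata" model and the idea of path search concrete, here is a minimal sketch in plain Python. It does not use Neo4j or Gremlin; the sample graph, the names in it, and the `shortest_path` function are all made up for illustration, and breadth-first search stands in for whichever algorithms a graph database actually ships.

```python
from collections import deque

# Hypothetical sample data: node -> list of (neighbor, edge metadata).
GRAPH = {
    "alice": [("bob", {"type": "friend"}), ("carol", {"type": "colleague"})],
    "bob": [("dave", {"type": "friend"})],
    "carol": [("dave", {"type": "friend"}), ("erin", {"type": "friend"})],
    "dave": [("erin", {"type": "colleague"})],
    "erin": [],
}


def shortest_path(graph, start, goal):
    """Breadth-first search over directed edges.

    Returns the node list of a fewest-hops path, or None if goal is unreachable.
    """
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor, _metadata in graph.get(node, []):
            if neighbor in previous:
                continue  # already visited
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the predecessor chain back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


print(shortest_path(GRAPH, "alice", "erin"))
print(shortest_path(GRAPH, "erin", "alice"))
```

A relational schema would need repeated self-joins to answer the same question, which is why this kind of data is singled out as the natural fit for a graph database.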

Cassandra

  • Language: Java
  • Features: the best of BigTable and Dynamo
  • License: Apache
  • Protocol: custom, binary (Thrift)
  • Tunable trade-offs for distribution and replication (N, R, W)
  • Querying by column and by range of keys
  • BigTable-like features: columns, column families
  • Write operations are faster than read operations
  • Map/reduce possible with Apache Hadoop
  • I admit to being a bit biased against Cassandra, partly because of its bloat and complexity, and partly because of Java issues (configuration, exceptions, etc.)

Best used: when you write more than you read (logging); when every component of the system must be written in Java ("no one gets fired for choosing Apache software").
For example: banking and the financial industry (not necessarily for financial transactions, but these industries are much bigger than that). Because writes are faster than reads, a natural niche is real-time data analysis.

HBase (with the help of ghshephard)

  • Language: Java
  • Features: billions of rows by millions of columns
  • License: Apache
  • Protocol: HTTP/REST (also Thrift, see Note 4)
  • Modeled after BigTable
  • Map/reduce on a distributed architecture
  • Optimizations for real-time queries
  • High-performance Thrift gateway
  • Query predicates pushed down via server-side scan and filters
  • HTTP supports XML, Protobuf, and binary
  • Cascading, Hive, and Pig source and sink modules
  • JRuby-based (JIRB) shell
  • Rolling restart for configuration changes and minor upgrades
  • No single point of failure
  • Random access performance comparable to MySQL

Best used: if you are fond of BigTable :) and when you need random, real-time access to big data.
For example: the Facebook messaging database (more general use cases coming soon).

(Note 4: Thrift is an interface definition language used to define and create services for a variety of languages. It was developed by Facebook and later open-sourced.)
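The BigTable-style layout that Cassandra and HBase share (rows addressed by key, columns grouped into families, range queries over keys, filtering done next to the data) can be sketched as a toy in-memory structure. This is neither the Cassandra nor the HBase API; the `WideColumnTable` class, its methods, and the sample keys are hypothetical.

```python
import bisect


class WideColumnTable:
    """Toy in-memory table: row key -> column family -> column -> value."""

    def __init__(self):
        self._keys = []  # row keys, kept sorted so key-range scans are cheap
        self._rows = {}  # row key -> {family: {column: value}}

    def put(self, row_key, family, column, value):
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        # Rows may have different columns; nothing is predefined.
        self._rows[row_key].setdefault(family, {})[column] = value

    def scan(self, start_key, stop_key, row_filter=None):
        """Yield (key, row) for start_key <= key < stop_key that pass row_filter."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, stop_key)
        for key in self._keys[lo:hi]:
            row = self._rows[key]
            if row_filter is None or row_filter(row):
                yield key, row


table = WideColumnTable()
table.put("user#001", "info", "name", "Ann")
table.put("user#001", "stats", "logins", 12)
table.put("user#002", "info", "name", "Bob")
table.put("user#003", "info", "name", "Cy")
table.put("user#003", "stats", "logins", 40)

# Range scan over keys, with the filter applied inside the table ("server side").
active = [
    key
    for key, _row in table.scan(
        "user#001",
        "user#004",
        row_filter=lambda row: row.get("stats", {}).get("logins", 0) > 10,
    )
]
print(active)
```

Keeping row keys sorted is what makes range scans over keys cheap, and applying the filter inside `scan` mirrors the idea of pushing query predicates down to the server instead of shipping every row to the client.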

Of course, none of these systems has only the features listed above. I have listed only the ones I consider important, based on my own opinion. Technology also moves quickly, so this content has to be updated constantly. I will do my best to keep this list current.

Original link: Kristóf Kovács
Translation: Tang Yuhua, Bole Online
Translation link: http://blog.jobbole.com/1344/
[Reprints must mark and retain, in the text, the original link, the translation link, the translator, and other related information.]

 
