Four types of NoSQL
NoSQL databases hold an established place in the database field. Relational databases (RDBMS) remain excellent, but in the era of big data, with data volumes growing rapidly and data models becoming increasingly complex, they struggle with many data-processing tasks. NoSQL databases gained their foothold through easy scale-out, support for large data volumes, high performance, and flexible data models.
At present, NoSQL databases are generally divided into four categories: key-value stores, document databases, column-family (wide-column) stores, and graph databases. Each type solves problems that relational databases handle poorly. In practice, the boundaries between these categories are not sharp, and many products combine several types.
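To make the four categories concrete, here is a minimal sketch that models each one with plain Python structures. These are in-memory stand-ins only, not the API of any real product, and every key, field, and value in them is a hypothetical example.

```python
# Toy in-memory stand-ins for the four NoSQL data models.
# All names and values below are hypothetical illustrations.

# 1) Key-value store: an opaque value looked up by a single key.
key_value = {"session:42": "alice"}

# 2) Document store: each record is a self-describing nested document.
documents = [
    {"_id": 1, "name": "alice", "tags": ["admin", "dev"]},
    {"_id": 2, "name": "bob"},  # documents need not share the same fields
]

# 3) Column-family store: row key -> column family -> column -> value.
column_family = {
    "row-001": {"info": {"name": "alice"}, "stats": {"logins": 7}},
}

# 4) Graph store: nodes plus labelled edges between them.
nodes = {"alice", "bob"}
edges = [("alice", "follows", "bob")]


def admins(docs):
    """Return the names of documents tagged 'admin'."""
    return [d["name"] for d in docs if "admin" in d.get("tags", [])]


def followers_of(user):
    """Return everyone with a 'follows' edge pointing at user."""
    return sorted(src for src, label, dst in edges
                  if label == "follows" and dst == user)


print(key_value["session:42"])
print(admins(documents))
print(column_family["row-001"]["stats"]["logins"])
print(followers_of("bob"))
```

The point of the sketch is the shape of the access path: one key, a nested document, a row/family/column coordinate, or a traversal over edges.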
Mainstream NoSQL: MongoDB, HBase, Redis
MongoDB
MongoDB is a high-performance, open-source, schema-free document database written in C++. In many scenarios it can replace a traditional relational database or a key/value store.
1.MongoDB Features
Language used: C++
Features: Preserves some of the SQL friendly features (queries, indexes).
License for use: AGPL (drivers: Apache)
Protocol: Custom, binary (BSON)
Master/slave replication (automatic failover with replica sets)
Built-in sharding
Support for JavaScript expression queries
Arbitrary JavaScript functions can be executed on the server side
Better update-in-place support than CouchDB
Uses memory-mapped files for data storage
Favors performance over features
Turning on journaling is recommended (parameter --journal)
On 32-bit operating systems, the database size is limited to about 2.5 GB
An empty database takes up approximately 192 MB
Uses GridFS to store large data together with metadata (not a real file system)
2.MongoDB Advantages:
1) Higher write loads: MongoDB has a high insertion speed.
2) Very large single tables: when a table grows too large, it can easily be split.
3) High availability: setting up master-slave (M-S) replication is convenient and fast, and MongoDB can also fail over between nodes (or data centers) quickly, safely, and automatically.
4) Fast queries: MongoDB supports two-dimensional spatial indexes (for data such as pipelines), so data can be retrieved quickly and accurately from a specified location. After startup, MongoDB maps the database files into memory, which greatly increases query speed when memory is plentiful.
5) Explosive growth of unstructured data: in a relational database, adding a column may in some cases lock the entire database or raise the load enough to degrade performance. Because MongoDB's schema is only weakly structured, adding a new field has no effect on existing data, and the whole operation is very fast.
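Point 5 can be illustrated with a small sketch. It uses a plain Python list of dicts as a stand-in for a document collection, so it is not the MongoDB API; the `find` and `with_field` helpers and all field names are hypothetical.

```python
# In-memory stand-in for a document collection; not the MongoDB API.
collection = [
    {"_id": 1, "name": "alice"},
    {"_id": 2, "name": "bob"},
]

# A new document introduces an "email" field. Nothing is migrated or
# locked, and the two older documents are left exactly as they were.
collection.append({"_id": 3, "name": "carol", "email": "carol@example.com"})


def find(coll, **criteria):
    """Return documents whose fields equal every given criterion."""
    return [doc for doc in coll
            if all(doc.get(key) == value for key, value in criteria.items())]


def with_field(coll, field):
    """Return documents that actually carry the given field."""
    return [doc for doc in coll if field in doc]


print(find(collection, name="alice"))
print(with_field(collection, "email"))
```

Old documents simply lack the new field, so readers have to tolerate its absence (here via `doc.get`), which is the price of skipping the schema migration.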
3.MongoDB Disadvantages:
1) Does not support multi-document transactions (these arrived only in MongoDB 4.0).
2) Takes up too much storage space.
3) Lacks mature maintenance tools.
4.MongoDB Application Scenarios
1) Applications that need real-time inserts, updates, and queries, together with the replication and high scalability required for real-time data storage;
2) Data that is naturally stored and queried in document format;
3) Highly scalable scenarios: MongoDB is well suited to databases spanning dozens or hundreds of servers;
4) Cases where performance matters more than features.
HBase
HBase originated as a subproject of Apache Hadoop and is an open-source implementation of BigTable. It is written in Java (and therefore depends on the Java SDK). HBase uses Hadoop's HDFS (distributed file system) as its underlying storage layer.
1.HBase Features:
Language used: Java
Features: Supports billions of rows by millions of columns
License for use: Apache
Protocol: HTTP/REST (Thrift is also supported)
Modeled after BigTable
Distributed architecture using Map/Reduce
Optimized for real-time queries
High-performance Thrift gateway
Query predicate push-down through server-side scan and get filters
HTTP interface supports XML, Protobuf, and binary
Cascading, Hive, and Pig source and sink modules
JRuby-based (JIRB) shell
Rolling restarts for configuration changes and minor upgrades
No single point of failure
Random access performance comparable to MySQL
2.HBase Advantages
1) Large storage capacity: a single table can hold hundreds of millions of rows and millions of columns;
2) Data can be retrieved by version, so the required historical version of a value can be looked up;
3) When the load is high, horizontal scale-out is achieved simply by adding machines, and the seamless integration with Hadoop provides data reliability (HDFS) and high-performance analysis of massive data (MapReduce);
4) Building on point 3, single points of failure can be effectively avoided.
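The version-based retrieval in point 2 can be sketched as a map from (row, column) to a list of timestamped values. This is an in-memory illustration, not the HBase client API; the `VersionedTable` class, its `max_versions` default, and the sample row and column names are assumptions made for the example.

```python
class VersionedTable:
    """Toy table that keeps several timestamped versions per cell."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # (row, column) -> list of (timestamp, value), newest first
        self._cells = {}

    def put(self, row, column, value, timestamp):
        versions = self._cells.setdefault((row, column), [])
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        del versions[self.max_versions:]  # drop versions beyond the limit

    def get(self, row, column, as_of=None):
        """Return the newest value, or the newest one at or before as_of."""
        for timestamp, value in self._cells.get((row, column), []):
            if as_of is None or timestamp <= as_of:
                return value
        return None


table = VersionedTable()
table.put("user-1", "info:city", "Paris", timestamp=100)
table.put("user-1", "info:city", "London", timestamp=200)

print(table.get("user-1", "info:city"))             # current version
print(table.get("user-1", "info:city", as_of=150))  # historical version
```

Keeping versions newest-first means the common case, reading the current value, touches only the first entry.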
3.HBase Disadvantages
1) The Java-based implementation and the Hadoop architecture mean that its API is best suited to Java projects;
2) A Node development environment requires many dependencies, configuration is troublesome (or it is unclear how to configure things such as persistence), and documentation is lacking;
3) Memory consumption is high, and because it is built on HDFS, which is optimized for batch analysis, read performance is not high;
4) The API is relatively clumsy compared with other NoSQL databases.
4.HBase Applicable Scenarios
1) BigTable-style data storage;
2) Data that must be queried by version;
3) Large data volumes that need simple scale-out.
Redis
Redis is an open-source key-value database written in ANSI C. It is accessed over the network, keeps its data in memory with optional persistence, and provides APIs for multiple languages. At the time of writing, its development was sponsored by VMware.
1.Redis Features:
Language used: C
Features: Extremely fast
License for use: BSD
Protocol: Telnet-like
Disk-backed in-memory database; since version 2.0 data can also be swapped to disk (note that this feature is not supported in version 2.4)
Master-slave replication
Values are simple data or hash tables indexed by key, but complex operations such as ZREVRANGEBYSCORE are also supported
INCR and related commands (suitable for rate limiting or statistics)
Supports sets (including union/diff/inter)
Supports lists (also usable as a queue; blocking pop operations)
Supports hashes (objects with multiple fields)
Supports sorted sets (high-score tables, good for range queries)
Supports transactions
Supports expiring keys (as in a cache)
Pub/Sub lets users implement messaging
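As a rough illustration of the sorted-set feature, the sketch below keeps a high-score table in memory and answers a descending range query. It is a stand-in written in plain Python, not a Redis client: the `SortedSet` class is hypothetical, and its method names only borrow the names of the corresponding Redis commands, with simplified arguments.

```python
class SortedSet:
    """Toy sorted set: unique members, each with a numeric score."""

    def __init__(self):
        self._scores = {}  # member -> score

    def zadd(self, member, score):
        self._scores[member] = score

    def zincrby(self, member, amount):
        """Add amount to a member's score and return the new score."""
        self._scores[member] = self._scores.get(member, 0) + amount
        return self._scores[member]

    def zrevrangebyscore(self, max_score, min_score):
        """Members with min_score <= score <= max_score, highest first."""
        hits = [(score, member) for member, score in self._scores.items()
                if min_score <= score <= max_score]
        hits.sort(reverse=True)  # ties fall back to reverse member order
        return [member for score, member in hits]


board = SortedSet()
board.zadd("alice", 120)
board.zadd("bob", 95)
board.zadd("carol", 120)

print(board.zrevrangebyscore(200, 100))
board.zincrby("bob", 30)
print(board.zrevrangebyscore(200, 100))
```

This version re-sorts on every query for brevity; a real sorted set keeps its members ordered so that range queries stay cheap as the set grows.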
2.Redis Advantages
1) A very rich set of data structures;
2) Redis provides transactions, which guarantee that a sequence of commands is executed as a unit without being interrupted by other operations;
3) Data lives in memory, so reads and writes are very fast and can reach about 100,000 operations per second.
3.Redis Disadvantages
1) An official cluster solution arrived only with Redis 3.0, and it still has some architectural problems;
2) Persistence is awkward to work with: the snapshot method writes the entire database to disk at intervals, while the AOF method records only the changes, similar to MySQL's binlog, but the append log can grow too large and every operation must be replayed on recovery, so recovery is slow;
3) Because it is an in-memory database, the amount of data a single machine can store is bounded by that machine's memory. Although Redis has a key expiration policy, memory use still has to be estimated and economized in advance. If memory grows too fast, data must be deleted periodically.
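The trade-off between the two persistence methods in point 2 can be sketched in a few lines. This is a simplified in-memory model, not how Redis itself implements snapshots or the AOF; the `TinyStore` class, the JSON snapshot format, and the helper functions are all assumptions made for illustration.

```python
import json


class TinyStore:
    """Toy key-value store that records every write in an append-only log."""

    def __init__(self):
        self.data = {}
        self.aof = []  # append-only log of write commands

    def set(self, key, value):
        self.data[key] = value
        self.aof.append(("SET", key, value))

    def delete(self, key):
        self.data.pop(key, None)
        self.aof.append(("DEL", key))

    def snapshot(self):
        """Serialize the whole current dataset in one go."""
        return json.dumps(self.data, sort_keys=True)


def restore_from_snapshot(blob):
    store = TinyStore()
    store.data = json.loads(blob)
    return store


def restore_from_aof(log):
    """Rebuild state by replaying every logged command in order."""
    store = TinyStore()
    for command in log:
        if command[0] == "SET":
            store.set(command[1], command[2])
        else:
            store.delete(command[1])
    return store


store = TinyStore()
for i in range(1000):
    store.set("counter", i)  # the same key overwritten many times
store.set("temp", 1)
store.delete("temp")

print(len(store.aof), len(json.loads(store.snapshot())))
```

Both restore paths arrive at the same data, but the log holds one entry per write while the snapshot holds one entry per surviving key, which is why a long append log makes recovery slow.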
4.Redis Application Scenario:
Best suited to: applications where data changes rapidly and the database size is predictable (it should fit in memory).
Examples: Weibo-style microblogging, data analysis, real-time data collection, real-time communication, and so on.