For many years, relational databases were the default choice for data persistence, and developers only chose among traditional products such as SQL Server, Oracle, or MySQL. Some choices were almost automatic: .NET projects generally picked SQL Server, Java projects leaned toward Oracle, Ruby toward MySQL, Python toward PostgreSQL or MySQL, and so on.
The reason is simple: the robustness of relational databases has been proven in most applications over a long period of time. These traditional databases give us good control over concurrent operations, transactions, and so on. But if the traditional relational database is so reliable, what is NoSQL for? NoSQL has survived and grown because it handles workloads that traditional relational databases handle poorly.
Problems with relational databases
Impedance mismatch
We write applications in Python, Ruby, Java, and the .NET languages, which share a common trait: they are object-oriented. But we store data in MySQL, PostgreSQL, Oracle, and SQL Server, which share a different trait: they are relational. This is where the term "impedance mismatch" comes from: the in-memory structure is object-oriented but the database is relational, so we have to convert every time we store or query data. ORM frameworks such as Hibernate and Entity Framework do simplify this process, but they struggle when queries have high performance requirements.
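The conversion cost can be made concrete with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for any relational database; the Order class, the table names, and the save_order/load_order helpers are hypothetical names invented for this illustration, not part of any framework.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Order:
    """One object in application code: an order that owns its line items."""
    order_id: int
    customer: str
    items: list = field(default_factory=list)  # list of (product, quantity)


def save_order(conn, order):
    # The single object has to be split across two tables.
    conn.execute(
        "INSERT INTO orders (id, customer) VALUES (?, ?)",
        (order.order_id, order.customer),
    )
    conn.executemany(
        "INSERT INTO order_items (order_id, product, quantity) VALUES (?, ?, ?)",
        [(order.order_id, product, qty) for product, qty in order.items],
    )


def load_order(conn, order_id):
    # Reading it back means two queries and a manual reassembly step.
    row = conn.execute(
        "SELECT id, customer FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    items = conn.execute(
        "SELECT product, quantity FROM order_items "
        "WHERE order_id = ? ORDER BY rowid",
        (order_id,),
    ).fetchall()
    return Order(row[0], row[1], [tuple(item) for item in items])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute(
    "CREATE TABLE order_items (order_id INTEGER, product TEXT, quantity INTEGER)"
)

original = Order(1, "alice", [("book", 2), ("pen", 5)])
save_order(conn, original)
restored = load_order(conn, 1)
```

The mapping code in save_order and load_order is exactly the work an ORM automates; it does not go away, it only moves into the framework.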
Applications keep growing in scale
Web applications keep growing: we need to store more data, serve more users, and provide more computing power. To cope with this, we have to keep scaling. There are two ways to scale: one is vertical scaling (scale-up), that is, buying a better machine with more disks, more memory, and so on; the other is horizontal scaling (scale-out), that is, buying more machines to form a cluster. At very large scale, vertical scaling does not help much. Improving the performance of a single machine is expensive and eventually hits a ceiling, and no single machine could ever carry the full load of Google or Facebook. This calls for a new kind of database, because relational databases do not run well on clusters. You can build a relational database cluster, but such clusters typically rely on shared storage, which is not the kind of scaling we want. This is what brought about the NoSQL era, as companies such as Google, Facebook, and Amazon looked for ways to handle more traffic.
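To illustrate scale-out without shared storage, here is a minimal sketch in which each node holds only its own slice of the data and a hash of the key decides which node that is. The ShardedStore class is a hypothetical name, and plain Python dictionaries stand in for separate machines. Picking the node with a simple modulo is the naive approach: changing the number of nodes remaps most keys, which is why real systems tend to use more elaborate placement schemes such as consistent hashing.

```python
import hashlib


class ShardedStore:
    """Toy scale-out store: every 'node' is an independent dict."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]

    def _node_for(self, key):
        # Use a stable hash so the same key always maps to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.nodes)

    def put(self, key, value):
        self.nodes[self._node_for(key)][key] = value

    def get(self, key):
        return self.nodes[self._node_for(key)].get(key)


store = ShardedStore(node_count=3)
for user_id in range(10):
    store.put(f"user:{user_id}", {"id": user_id})

# Each node ends up with only part of the data set.
sizes = [len(node) for node in store.nodes]
```

Adding capacity in this model means adding nodes rather than buying a bigger machine, at the price of having to decide where each key lives.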
The NoSQL era
There are many NoSQL databases now, such as MongoDB, Redis, Riak, HBase, and Cassandra. They generally share the following characteristics:
- They do not use SQL as the query language; MongoDB and Cassandra, for example, each have their own query language
- They are usually open source projects
- They are designed to run on clusters
- They have a weak schema: there are no strict restrictions on the structure or types of the data
Types of NoSQL databases
NoSQL databases can be broadly divided into four categories: key-value, document-oriented, column-family, and graph-oriented databases. The characteristics of each type are listed below:
I. Key-value databases
A key-value database is like the hash table found in most programming languages. You can add, query, or delete data by key, and because all access goes through the primary key, you get good performance and scalability.
Products: Riak, Redis, Memcached, Amazon's Dynamo, Project Voldemort
Who is using it: GitHub (Riak), Best Buy (Riak), Twitter (Redis and Memcached), Stack Overflow (Redis), Instagram (Redis), YouTube (Memcached), Wikipedia (Memcached)
Applicable scenarios
Storing user-related information such as sessions, configuration, parameters, and shopping carts. This information is usually tied to an ID (the key), which makes a key-value database a good fit.
Unsuitable scenarios
1. Querying by value instead of by key. A key-value database offers no way to look data up by its value.
2. Storing relationships between data. A key-value database cannot relate the data stored under two or more keys.
3. Transactions. A key-value database generally cannot roll back when a failure occurs.
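The strengths and the first limitation can be seen in a minimal in-memory sketch. KeyValueStore and find_keys_by_value are hypothetical names made up for this illustration; they do not mirror the API of Redis, Riak, or any other product.

```python
class KeyValueStore:
    """Toy key-value store backed by a Python dict (a hash table)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        # Direct lookup by key: the operation the model is built around.
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)

    def find_keys_by_value(self, predicate):
        # There is no index on values, so the only option is to scan
        # every entry. This is why querying by value is a poor fit.
        return [key for key, value in self._data.items() if predicate(value)]


store = KeyValueStore()
store.put("session:1001", {"user": "alice", "cart": ["book"]})
store.put("session:1002", {"user": "bob", "cart": []})

alice_session = store.get("session:1001")
non_empty_carts = store.find_keys_by_value(lambda session: session["cart"])
store.delete("session:1002")
```

The store treats each value as an opaque blob, so anything beyond get, put, and delete by key has to be built by the application.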
II. Document-oriented databases
Document-oriented databases store data as documents. Each document is a self-contained unit of data made up of a collection of data items. Each item has a name and a corresponding value, which can be a simple type such as a string, a number, or a date, or a complex type such as an ordered list or a nested object. The smallest unit of storage is the document; documents stored in the same collection (the counterpart of a table) can have different properties, and data can be stored in formats such as XML, JSON, or JSONB.
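As a minimal sketch of this model, the example below keeps three documents with different fields in one collection and serializes one of them as JSON. A plain Python list stands in for the collection, and the find helper is a hypothetical function written for this illustration, not the query API of MongoDB or any other product.

```python
import json

# One collection, three documents, three different shapes.
collection = [
    {"_id": 1, "type": "login", "user": "alice", "ip": "10.0.0.5"},
    {
        "_id": 2,
        "type": "purchase",
        "user": "bob",
        "items": [{"sku": "A-1", "qty": 2}],  # nested list of objects
        "total": 19.98,
    },
    {"_id": 3, "type": "login", "user": "carol"},  # no "ip" field at all
]


def find(docs, **criteria):
    """Return the documents whose fields match every given criterion."""
    return [
        doc for doc in docs
        if all(doc.get(name) == value for name, value in criteria.items())
    ]


logins = find(collection, type="login")

# A document is self-contained, so it round-trips through JSON as one unit.
serialized = json.dumps(collection[1])
restored = json.loads(serialized)
```

Unlike a key-value store, the database can see inside the document, which is what makes queries on fields such as type possible.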
Products: MongoDB, CouchDB, RavenDB
Who is using it: SAP (MongoDB), Codecademy (MongoDB), Foursquare (MongoDB), NBC News (RavenDB)
Applicable scenarios
1. Logging. In an enterprise environment, each application produces different log information. A document-oriented database has no fixed schema, so it can store all of these differing records.
2. Analytics. Because the schema is weak, we can store different metrics and add new ones without changing any schema.
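Unsuitable scenarios
Transactions that span multiple documents. Document-oriented databases have traditionally offered little or no support for cross-document transactions (MongoDB, for example, only added multi-document transactions in version 4.0), so check this carefully before choosing one if you have that requirement.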
III. Column-family (wide-column store) databases
A column-family database stores data in column families, where each column family holds related data that is often queried together. For example, given a Person class, we usually look up name and age together, but not salary. In that case, name and age are placed in one column family, while salary goes into another.
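The Person example can be sketched as follows. ColumnFamilyStore and the family names "profile" and "payroll" are hypothetical, and nested dictionaries stand in for the on-disk layout; this is an illustration of the idea, not how Cassandra or HBase is implemented.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Toy layout: column family -> row key -> {column name: value}."""

    def __init__(self):
        self._families = defaultdict(dict)

    def put(self, family, row_key, columns):
        # Columns of one row are grouped by family, not stored as one record.
        self._families[family].setdefault(row_key, {}).update(columns)

    def get(self, family, row_key):
        # A read touches a single family and never loads the others.
        return dict(self._families[family].get(row_key, {}))


people = ColumnFamilyStore()
people.put("profile", "person:1", {"name": "Alice", "age": 30})
people.put("payroll", "person:1", {"salary": 5000})

# Looking up name and age does not read the salary at all.
profile = people.get("profile", "person:1")
```

Grouping columns this way is a decision about access patterns: data that is read together is stored together.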
Products: Cassandra, HBase
Who is using it: eBay (Cassandra), Instagram (Cassandra), NASA (Cassandra), Twitter (Cassandra and HBase), Facebook (HBase), Yahoo! (HBase)
Applicable scenarios
1. Logging. Because data can be stored in separate column families, each application can write its information to its own column family.
2. Blogging platforms. We store each kind of information in a different column family. For example, tags can go in one, categories in another, and the articles themselves in a third.
Unsuitable scenarios
1. ACID transactions. Cassandra does not support full ACID transactions.
2. Prototyping. If we examine Cassandra's data structures, we find that they are designed around the way we expect the data to be queried. At the start of a design we cannot predict the query patterns, and once the query patterns change, we must redesign the column families.
IV. Graph databases
A graph database lets us store data as a graph. Entities are treated as vertices, and relationships between entities are treated as edges. For example, given three entities, Steve Jobs, Apple, and NeXT, there will be two "founded by" edges connecting Apple and NeXT to Steve Jobs.
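The Steve Jobs example can be written down as a minimal sketch. The Graph class and its add_edge, outgoing, and incoming methods are hypothetical names for this illustration and are not the API of Neo4j or any other graph database.

```python
from collections import defaultdict


class Graph:
    """Toy directed graph with labeled edges."""

    def __init__(self):
        self._out = defaultdict(list)  # vertex -> [(label, target)]
        self._in = defaultdict(list)   # vertex -> [(label, source)]

    def add_edge(self, source, label, target):
        self._out[source].append((label, target))
        self._in[target].append((label, source))

    def outgoing(self, vertex, label):
        return [t for edge_label, t in self._out[vertex] if edge_label == label]

    def incoming(self, vertex, label):
        return [s for edge_label, s in self._in[vertex] if edge_label == label]


graph = Graph()
graph.add_edge("Apple", "founded by", "Steve Jobs")
graph.add_edge("NeXT", "founded by", "Steve Jobs")

# Follow the "founded by" edges backwards from the person.
companies = graph.incoming("Steve Jobs", "founded by")
founders_of_apple = graph.outgoing("Apple", "founded by")
```

Queries are expressed as walks along edges, so answering "which companies did this person found" needs no join.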
Products: Neo4j, InfiniteGraph, OrientDB
Who is using it: Adobe (Neo4j), Cisco (Neo4j), T-Mobile (Neo4j)
Applicable scenarios
1. Highly connected data, where the relationships matter as much as the entities themselves.
2. Recommendation engines. When the data is represented as a graph, it becomes much easier to produce recommendations.
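A recommendation of the "people who bought this also bought" kind is a short walk over such a graph. The sketch below is one simple way to do it under made-up assumptions: the purchases data is invented, and recommend is a hypothetical helper that scores each candidate product by how many two-step paths lead to it.

```python
from collections import Counter, defaultdict

# Hypothetical purchase edges: (user, product).
purchases = [
    ("alice", "book"), ("alice", "pen"),
    ("bob", "book"), ("bob", "lamp"),
    ("carol", "pen"), ("carol", "lamp"), ("carol", "desk"),
]

bought_by_user = defaultdict(set)
buyers_of_product = defaultdict(set)
for user, product in purchases:
    bought_by_user[user].add(product)
    buyers_of_product[product].add(user)


def recommend(user):
    """Walk user -> product -> other user -> product and count the paths."""
    scores = Counter()
    for product in bought_by_user[user]:
        for other in buyers_of_product[product]:
            if other == user:
                continue
            for candidate in bought_by_user[other]:
                if candidate not in bought_by_user[user]:
                    scores[candidate] += 1
    # Highest score first; ties are broken alphabetically.
    return [p for p, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


suggestions = recommend("alice")
```

Here the lamp is reached through both of alice's purchases and the desk through only one, so the lamp is ranked first.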
Unsuitable scenarios
Data that does not fit the graph model. The range of applications for graph databases is fairly narrow, because very few operations involve the entire graph.