An overview of current NoSQL types, applicable scenarios, and the companies that use them

Last Update:2018-12-05 Source: Internet

Author: User

Tags cassandra riak neo4j


For many years, relational databases were the only choice for data persistence. Data practitioners only chose among traditional databases such as SQL Server, Oracle, or MySQL. There were even default pairings: if you used .NET, you generally selected SQL Server; if you used Java, you probably preferred Oracle; Ruby went with MySQL, and Python with PostgreSQL or MySQL.

The reason is simple: the robustness of relational databases has been proven in most applications over a long time, and we can rely on them to control concurrent operations and transactions. But if traditional relational databases are always so reliable, what is NoSQL for? NoSQL survives and grows because it does what traditional relational databases cannot do.

Problems in relational databases

Impedance mismatch

We write applications in Python, Ruby, Java, .NET, and other languages. These languages share a common feature: they are object-oriented. We store data in MySQL, PostgreSQL, Oracle, and SQL Server, which also share a common feature: they are relational. This is where the term "impedance mismatch" comes from: the in-memory structure is object-oriented, but the database is relational, so data has to be converted every time it is stored or queried. ORM frameworks such as Hibernate and Entity Framework simplify this conversion, but they struggle when queries have high performance requirements.
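A minimal sketch of the conversion described above, using Python's built-in sqlite3 module. The Order class, the two tables, and the helper functions are hypothetical names chosen for illustration; the point is only that one object has to be split into rows on the way in and reassembled on the way out.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Order:
    """An object as the application sees it: one order that owns its items."""
    order_id: int
    customer: str
    items: list = field(default_factory=list)  # list of (product, quantity)


def save_order(conn, order):
    # One object is split across two tables.
    conn.execute("INSERT INTO orders (id, customer) VALUES (?, ?)",
                 (order.order_id, order.customer))
    conn.executemany(
        "INSERT INTO order_items (order_id, product, quantity) VALUES (?, ?, ?)",
        [(order.order_id, product, qty) for product, qty in order.items],
    )


def load_order(conn, order_id):
    # Reading it back means querying both tables and rebuilding the object.
    row = conn.execute("SELECT id, customer FROM orders WHERE id = ?",
                       (order_id,)).fetchone()
    items = conn.execute(
        "SELECT product, quantity FROM order_items "
        "WHERE order_id = ? ORDER BY rowid",
        (order_id,)).fetchall()
    return Order(order_id=row[0], customer=row[1],
                 items=[tuple(item) for item in items])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute(
    "CREATE TABLE order_items (order_id INTEGER, product TEXT, quantity INTEGER)")

original = Order(1, "alice", [("book", 2), ("pen", 5)])
save_order(conn, original)
restored = load_order(conn, 1)
```

An ORM automates the save_order and load_order steps, but the mapping work itself does not go away.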

Application scale grows

As web applications grow, we need to store more data, serve more users, and supply more computing power, so we have to keep scaling. There are two ways to scale: vertical scaling, which means buying a better machine with more disks, more memory, and so on; and horizontal scaling, which means buying more machines to form a cluster. At large scale, vertical scaling helps little: improving a single machine is very expensive and runs into a performance limit. At the scale of Google or Facebook, one machine can never support the whole load. This situation calls for a new kind of database, because relational databases do not run well on clusters. You can build a relational database cluster, but such clusters use shared storage, which is not what we want. The result was the NoSQL era, driven by companies such as Google, Facebook, and Amazon that needed to handle ever larger volumes of data.
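One common way to spread data over a cluster without shared storage is to hash each key to a node. The sketch below is an assumption made for illustration, not the method of any particular product: ShardedStore is a hypothetical class, and each dict stands in for one machine.

```python
import hashlib


class ShardedStore:
    """Toy horizontal scaling: every key lives on exactly one node."""

    def __init__(self, node_count):
        # Each dict stands in for a separate machine with its own storage.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # A stable hash, so the same key always maps to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key):
        return self._node_for(key).get(key)


store = ShardedStore(node_count=3)
for i in range(1000):
    store.put(f"user:{i}", {"id": i})

# How many keys ended up on each node.
sizes = [len(node) for node in store.nodes]
```

Simple modulo placement like this moves most keys when the node count changes, which is one reason real systems tend to use more elaborate partitioning schemes.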

NoSQL Era

There are already many NoSQL databases, such as MongoDB, Redis, Riak, HBase, and Cassandra. They generally share the following features:

They do not use SQL. For example, MongoDB and Cassandra have their own query languages.
They are usually open-source projects.
They are built to run on clusters.
They are loosely structured: data structures and types are not strictly constrained.

NoSQL Database Types

NoSQL databases can be divided into four types: key-value, document-oriented, column-family, and graph-oriented. The following sections describe the features of each type.

I. Key-value Database

Key-value databases work like the hash tables found in most programming languages. You add, query, or delete data by key. Because all access goes through the primary key, performance and scalability are good.

Products: Riak, Redis, Memcached, Amazon's Dynamo, Project Voldemort

Who is using: GitHub (Riak), Best Buy (Riak), Twitter (Redis and Memcached), Stack Overflow (Redis), Instagram (Redis), YouTube (Memcached), Wikipedia (Memcached)

Applicable scenarios

Storing user information such as sessions, configuration files, parameters, and shopping carts. This information is generally tied to an ID (the key), which makes a key-value database a good choice.
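The session scenario can be sketched with a plain dictionary standing in for the database. KeyValueStore and the session ID below are hypothetical; real products expose a similar put/get/delete surface over the network, with their own client APIs.

```python
class KeyValueStore:
    """Toy in-memory key-value store; a dict stands in for the database."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        # Returns True if the key existed, False otherwise.
        if key in self._data:
            del self._data[key]
            return True
        return False


sessions = KeyValueStore()

# The session ID is the key; the whole session is stored as one value.
sessions.put("session:9f2c", {"user_id": 17, "cart": ["book", "pen"]})
cart = sessions.get("session:9f2c")["cart"]

# Logging out removes the session by the same key.
sessions.delete("session:9f2c")
expired = sessions.get("session:9f2c")
```

Every operation names exactly one key, which is what lets such a store place keys on different machines independently.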

Unsuitable scenarios

1. Querying by value instead of by key. Key-value databases offer no way to look data up by its value.

2. Storing relationships between data. A key-value database cannot relate two or more keys to each other.

3. Transaction support. A key-value database cannot roll back when a failure occurs.
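To see why value lookups are a poor fit, consider what the application would have to do on its own. In this sketch a dict again stands in for the store, and find_keys_where is a hypothetical helper: a lookup by key touches one entry, while a lookup by value has to scan every entry.

```python
store = {
    "user:1": {"name": "Ann", "city": "Hangzhou"},
    "user:2": {"name": "Bob", "city": "Beijing"},
    "user:3": {"name": "Cara", "city": "Hangzhou"},
}

# Lookup by key: a single direct access.
by_key = store["user:2"]


def find_keys_where(store, field, wanted):
    """Lookup by value: no index exists, so every entry is inspected."""
    return sorted(key for key, value in store.items()
                  if value.get(field) == wanted)


in_hangzhou = find_keys_where(store, "city", "Hangzhou")
```

On a cluster the same scan would have to visit every node, so workloads that need it are usually better served by another type of database.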

II. Document-oriented Database

Document-oriented databases store data as documents. Each document is a self-contained unit of data made up of a collection of data items. Each data item has a name and a corresponding value, which can be a simple type such as a string, number, or date, or a complex type such as an ordered list or a nested object. The document is the smallest unit of storage. Documents stored in the same collection can have different attributes. Data can be stored as XML, JSON, BSON, and other formats.
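As an illustration, the sketch below keeps JSON-style documents with different attributes side by side in one collection. The collection list and the insert and find helpers are hypothetical stand-ins, not the API of any product named in this article.

```python
import json

collection = []  # stand-in for one collection of documents


def insert(document):
    # Round-trip through JSON to copy the document and to check that
    # it only contains JSON-compatible values.
    collection.append(json.loads(json.dumps(document)))


def find(**criteria):
    """Return the documents whose top-level fields match all criteria."""
    return [doc for doc in collection
            if all(doc.get(name) == value for name, value in criteria.items())]


# Two documents in the same collection with different attributes:
# one has a list and a nested object, the other has neither.
insert({"_id": 1, "name": "Ann", "roles": ["admin", "dev"],
        "address": {"city": "Hangzhou"}})
insert({"_id": 2, "name": "Bob", "joined": "2018-12-05"})

bobs = find(name="Bob")
```

Nothing had to be declared up front to store the second document, which is the flexibility the description above refers to.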

Products: MongoDB, CouchDB, RavenDB

Who is using: SAP (MongoDB), Codecademy (MongoDB), Foursquare (MongoDB), NBC News (RavenDB)

Applicable scenarios

1. Logs. In an enterprise environment, each application produces different log information. A document-oriented database has no fixed schema, so it can store all of these different records.

2. Analytics. Because the schema is loose, you can store different metrics and add new ones without changing the schema.

Unsuitable scenarios

Transactions across multiple documents. Document-oriented databases generally do not support cross-document transactions. If you need them, you should not choose this type of database.

III. Column Store (wide column store/column-family) Database

A column store database stores data in column families. A column family holds data that is frequently queried together. For example, given a person class, we usually query a person's name and age together but not their salary. In this case, the name and age go into one column family, while the salary goes into another.
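The person example can be sketched with nested dictionaries. ColumnFamilyTable and the family names profile and payroll are hypothetical; real column stores add persistence, distribution, and their own query interfaces on top of this idea.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy column-family layout: family name -> row key -> {column: value}."""

    def __init__(self, families):
        self._families = {name: defaultdict(dict) for name in families}

    def put(self, row_key, family, column, value):
        self._families[family][row_key][column] = value

    def get_family(self, row_key, family):
        # Reads only the requested family; other families are never touched.
        return dict(self._families[family].get(row_key, {}))


people = ColumnFamilyTable(["profile", "payroll"])

# Name and age are queried together, so they share a family.
people.put("person:1", "profile", "name", "Ann")
people.put("person:1", "profile", "age", 30)

# Salary is queried separately, so it lives in its own family.
people.put("person:1", "payroll", "salary", 5000)

profile = people.get_family("person:1", "profile")
```

Reading the profile family returns the name and age without loading the salary column at all.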

Products: Cassandra, HBase

Who is using: eBay (Cassandra), Instagram (Cassandra), NASA (Cassandra), Twitter (Cassandra and HBase), Facebook (HBase), Yahoo! (HBase)

Applicable scenarios

1. Logs. Because data can be stored in different columns, each application can write its information to its own column family.

2. Blog platforms. Each kind of information is stored in a different column family. For example, tags can be stored in one, categories in another, and articles in a third.

Unsuitable scenarios

1. When ACID transactions are required. Cassandra does not support transactions.

2. Prototyping. Cassandra's data structures are designed around the expected query patterns. At the start of model design, the query patterns cannot be predicted, and once they change, the column families have to be redesigned.

IV. Graph-oriented Database

Graph databases let us store data as graphs. Entities are stored as vertices, and the relationships between entities are stored as edges. For example, given three entities, Steve Jobs, Apple, and NeXT, there are two "founded by" edges connecting Apple and NeXT to Steve Jobs.
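The Steve Jobs example can be written down as a small labeled graph. The Graph class below is a hypothetical in-memory sketch, not the API of any graph database; it stores each edge as a (source, label, target) triple.

```python
class Graph:
    """Toy labeled graph stored as a list of (source, label, target) edges."""

    def __init__(self):
        self.edges = []

    def add_edge(self, source, label, target):
        self.edges.append((source, label, target))

    def outgoing(self, source, label):
        # Targets reached from `source` over edges with this label.
        return [t for s, l, t in self.edges if s == source and l == label]

    def incoming(self, target, label):
        # Sources that point at `target` over edges with this label.
        return [s for s, l, t in self.edges if t == target and l == label]


graph = Graph()
graph.add_edge("Apple", "founded by", "Steve Jobs")
graph.add_edge("NeXT", "founded by", "Steve Jobs")

founder_of_apple = graph.outgoing("Apple", "founded by")
companies_founded = graph.incoming("Steve Jobs", "founded by")
```

Following the same edges in either direction answers both "who founded Apple?" and "what did Steve Jobs found?" without a join.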

Products: Neo4j, InfiniteGraph, OrientDB

Who is using: Adobe (Neo4j), Cisco (Neo4j), T-Mobile (Neo4j)

Applicable scenarios

1. Highly interconnected data.

2. Recommendation engines. Representing the data as a graph makes it much easier to generate recommendations.

Unsuitable scenarios

Data that does not fit the model. Graph databases have a narrow scope of application, because few operations involve the entire graph.

Original article:
NoSQL databases, why we should use, and which one we should choose (compiled by ZhongHao, reviewed by Zhou Xiaolu)

This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.
