Relational databases and NoSQL databases
What is NoSQL?
Have you ever heard of "NoSQL"? In recent years the term has attracted a great deal of attention. Seeing the word, you might mistake it for an abbreviation of "No! SQL" and protest, "How can SQL be unnecessary?" In fact, it stands for "Not Only SQL". The idea is this: use a relational database where a relational database fits, and where it does not fit, there is no need to insist on one; consider a more suitable data store instead.
A variety of NoSQL databases have emerged to compensate for the shortcomings of relational databases.
To better understand the NoSQL databases presented in this book, an understanding of relational databases is essential. So let's start with the history, classification, and characteristics of relational databases.
A brief history of relational databases
In 1969, Edgar Frank Codd published an epoch-making paper that first presented the concept of the relational data model. Unfortunately, it appeared in IBM Research Report, an internal IBM publication, so it drew little response. In 1970, he published another paper in Communications of the ACM, entitled "A Relational Model of Data for Large Shared Data Banks", which finally attracted attention.
The relational data model proposed by Codd became the foundation of today's relational databases. At the time, relational databases were not put to practical use because hardware performance was poor and processing was slow. As hardware improved, however, relational databases came into wide use thanks to their ease of use and high performance.
Versatility and High performance
Although this book is about NoSQL databases, there is one important premise that must not be misunderstood: the performance of relational databases is by no means low. They offer very good versatility and very high performance, and for most applications they are undoubtedly the most effective solution.
Outstanding Advantages
As widely used general-purpose databases, relational databases have the following main advantages:
- Maintain data consistency (transaction processing)
- Thanks to normalization, the cost of updating data is small (the same field basically exists in only one place)
- Complex queries such as joins can be made
- There is an extensive track record and a wealth of technical information (mature technology)
Among these, the ability to maintain data consistency is the greatest advantage of relational databases. When strict data consistency and processing integrity are required, a relational database is certainly the right choice. However, for processing that does not need joins, or where the advantages above are not particularly necessary, there is no need to cling to a relational database.
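To make the consistency advantage concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names, and the transfer function are hypothetical and exist only for illustration; the point is that the two updates of a transfer either both take effect or neither does.

```python
import sqlite3

# Hypothetical schema for illustration: balances may never go below zero.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the credit to dst was undone too.
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

ok = transfer(conn, "alice", "bob", 30)       # fits within alice's balance
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
```

After the failed transfer, bob's balance is unchanged even though his row was updated first: the rollback undoes the partial work.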
The shortcomings of relational databases
What relational databases are not good at
As mentioned before, the performance of relational databases is very high. But because they are general-purpose databases, they are not perfectly suited to every use. Specifically, they are not good at the following kinds of processing:
- Writing large amounts of data
- Changing indexes or the table structure (schema) of tables that are being updated
- Applications where the fields are not fixed
- Processing that must return results quickly for simple queries
。。。。。。
NoSQL Database
NoSQL databases have emerged, especially in recent years, to compensate for the shortcomings of relational databases. Relational databases are widely used and can handle complex processing such as transactions and joins. NoSQL databases, by contrast, are used only in specific areas and mostly do not perform complex processing, but they make up for exactly the weaknesses of relational databases listed earlier.
Easy to distribute data
As mentioned earlier, relational databases are not good at writing large amounts of data. Relational databases are built on the premise of joins; indeed, the relationships among data are where the name "relational" comes from. To process joins, a relational database has to keep the data on the same server, which makes the data hard to distribute. NoSQL databases, in contrast, do not support joins, and each piece of data is designed to be independent, so it is easy to spread the data across multiple servers. Because the data is spread out, the amount of data on each server is smaller, and even a large volume of writes becomes easier to handle. Reads become easier to handle in the same way.
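The idea of spreading independent records across servers can be sketched as follows. This is only an illustration: the server names are made up, and choosing a server from a hash of the key modulo the server count is one simple assumed scheme, not the method of any particular NoSQL product.

```python
import hashlib

# Hypothetical server names; a real deployment would use actual hosts.
SERVERS = ["server-a", "server-b", "server-c"]

def server_for(key, servers=SERVERS):
    """Pick a server from a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]

def distribute(keys, servers=SERVERS):
    """Group keys by the server that would store them."""
    placement = {server: [] for server in servers}
    for key in keys:
        placement[server_for(key, servers)].append(key)
    return placement

# Each record stands alone (no joins), so any key can live on any server.
placement = distribute(f"user:{i}" for i in range(1000))
```

Each server ends up holding only part of the 1,000 keys, so writes and reads for different keys go to different machines. One known drawback of plain modulo placement is that most keys move when the number of servers changes; consistent hashing is a common way to limit that.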
Improving performance and increasing scale
A short digression: if you want servers to handle a larger amount of data comfortably, there are only two choices. One is to improve performance (scaling up); the other is to increase scale (scaling out). Let's look at the difference between the two.
Improving performance means raising processing power by upgrading the existing server itself. This is a very simple approach and requires no changes to the program, but it costs money. A server with twice the performance often costs more than twice as much, and sometimes 5 to 10 times as much. The method is simple but expensive.
Increasing scale, on the other hand, means using multiple inexpensive servers to raise processing power. It requires changes to the program, but costs stay under control because the servers are inexpensive. Moreover, to scale further you simply add more inexpensive servers in the same way.
Is NoSQL pointless if you are not processing large amounts of data?
NoSQL databases are basically designed to make it easier to write large amounts of data (that is, to make it easier to increase the number of servers). So is there no point in using a NoSQL database if you are not handling large amounts of data?
The answer is no. NoSQL does have an advantage when dealing with large amounts of data, but NoSQL databases also have a variety of other features that can be very helpful when used properly. Specific examples are introduced in Chapters 2 and 3, which will give you a feel for the benefits of using NoSQL. Typical cases include the following:
- You want to cache data smoothly
- You want high-speed processing of array-type data
- You want to save all of the data
A variety of NoSQL databases
There are various kinds of NoSQL databases, such as key-value stores, document-oriented databases, and column-oriented databases, and each kind has its own characteristics. In the next section, let's look at the types and characteristics of NoSQL databases.
What is a NoSQL database?
"NoSQL" is easy to say, but how many kinds are there in practice? At the time of writing, I checked the official NoSQL website and found 122 of them listed. The site also lists categories that this book does not cover, such as graph databases and object databases. Before anyone noticed, this many NoSQL databases had appeared.
This section introduces some representative NoSQL databases.
Key-value storage
This is the most common kind of NoSQL database, and its data is stored as key-value pairs. It is very fast, but it can basically only fetch data by an exact match on the key. Depending on how the data is saved, it can be divided into three kinds: temporary, permanent, and both.
Temporary
memcached belongs to this type. "Temporary" means that the data may be lost. memcached keeps all data in memory, so saving and reading are very fast, but when memcached stops, the data is gone. And because the data lives only in memory, data beyond the memory capacity cannot be kept (old data is lost).
- Saving data in memory
- Enables very fast save and read processing
- Data is likely to be lost
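The behavior listed above can be sketched with a small in-memory store. This is an illustrative toy, not memcached's implementation: the TemporaryStore class is hypothetical, capacity is counted in items rather than bytes, and the eviction rule (drop the least recently used entry) is a simplifying assumption.

```python
from collections import OrderedDict

class TemporaryStore:
    """Toy in-memory key-value store with a fixed capacity."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._data = OrderedDict()  # ordered from least to most recently used

    def set(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        # Over capacity: the oldest entries are silently dropped.
        while len(self._data) > self.max_items:
            self._data.popitem(last=False)

    def get(self, key):
        """Exact-match lookup by key; returns None when the key is absent."""
        if key not in self._data:
            return None
        self._data.move_to_end(key)
        return self._data[key]

cache = TemporaryStore(max_items=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")      # "a" is now the most recently used entry
cache.set("c", 3)   # capacity exceeded, so "b" is evicted
```

Nothing is written to disk here, so everything also disappears when the process stops, which is the "temporary" property described above.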
Permanent
Tokyo Tyrant, Flare, ROMA, and others belong to this type. In contrast to temporary, "permanent" means that the data will not be lost. Instead of keeping data in memory as memcached does, these key-value stores save the data on the hard disk. Because disk I/O is unavoidable, their performance still lags behind memcached, which works entirely in memory. But not losing data is their biggest advantage.
- Saving data on a hard disk
- Enables very fast save and read processing (though not as fast as memcached)
- Data is not lost
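As a rough illustration of disk-backed key-value storage, the sketch below uses Python's standard shelve module as a stand-in. It is not Tokyo Tyrant, Flare, or ROMA, and the file path and key names are made up for the example; it only shows that data written to disk is still there after the store is closed and reopened.

```python
import os
import shelve
import tempfile

# Hypothetical location for the data file.
path = os.path.join(tempfile.mkdtemp(), "kv_store")

# First "run": save a value, then close the store.
with shelve.open(path) as db:
    db["user:1"] = {"name": "alice"}

# Second "run": reopen the store, as if the process had been restarted.
with shelve.open(path) as db:
    restored = db["user:1"]
```

Every write here goes through the file system, which is the disk I/O cost mentioned above.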
Both
Redis belongs to this type. Redis is special: it is both temporary and permanent, combining the benefits of temporary and permanent key-value storage. Redis first saves data in memory and writes it to the hard disk when certain conditions are met (by default: at least 1 key changed within 15 minutes, at least 10 keys changed within 5 minutes, or at least 10,000 keys changed within 1 minute). This keeps processing in memory fast while persisting the data by writing it to the hard disk. This type of database is particularly well suited to working with array-type data.
- Save data on both memory and hard disk
- Enables very fast save and read processing
- The data saved on the hard drive will not disappear (can be restored)
- Suitable for handling data of array types
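The write-to-disk conditions can be expressed as a small rule check. This sketch assumes the thresholds are "at least 1 change once 900 seconds have passed", "at least 10 changes once 300 seconds have passed", and "at least 10,000 changes once 60 seconds have passed". The function is a hypothetical helper for illustration, not part of Redis or any Redis client.

```python
# (seconds since last save, minimum number of changed keys); assumed defaults.
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]

def should_snapshot(seconds_since_last_save, changed_keys, rules=SAVE_RULES):
    """Return True if any rule's time and change thresholds are both met."""
    return any(
        seconds_since_last_save >= seconds and changed_keys >= changes
        for seconds, changes in rules
    )
```

For example, 10,000 changes within one minute trigger a write, while 9 changes after five minutes do not until the 15-minute rule applies. Changes made after the last write can still be lost if the server stops, which is why only the data already on disk can be restored.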
Document-oriented database
MongoDB and CouchDB belong to this type. They are NoSQL databases, but they differ from key-value stores.
No need to define a table structure
A document-oriented database has the following characteristic: even without defining a table structure, you can use it just as if one had been defined. In a relational database, altering the table structure is cumbersome, and programs must be modified to stay consistent with it. A document-oriented database saves that hassle (provided the program handles the data correctly), which makes it quick and easy to work with.
You can use complex query conditions
Unlike key-value storage, a document-oriented database can fetch data using complex query conditions. Although it lacks the transaction and join processing that relational databases offer, almost everything else is possible. This makes it a very easy NoSQL database to use.
- No need to define table structure
- Can take advantage of complex query conditions
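Both characteristics can be imitated with plain Python dictionaries. The sketch below is not the MongoDB or CouchDB API; the sample documents and the find helper are hypothetical. It only illustrates that documents need no shared schema and that queries can go beyond exact key matches.

```python
# Documents in one collection need not share the same fields.
documents = [
    {"name": "alice", "age": 31, "tags": ["admin", "dev"]},
    {"name": "bob", "age": 24},
    {"name": "carol", "age": 45, "address": {"city": "Tokyo"}},
]

def find(docs, conditions):
    """Return documents where every listed field exists and passes its test."""
    return [
        doc for doc in docs
        if all(field in doc and test(doc[field])
               for field, test in conditions.items())
    ]

over_30 = find(documents, {"age": lambda age: age >= 30})
in_tokyo = find(documents,
                {"address": lambda addr: addr.get("city") == "Tokyo"})
admins_over_30 = find(documents, {
    "age": lambda age: age >= 30,
    "tags": lambda tags: "admin" in tags,
})
```

Documents that lack a queried field are simply skipped, so adding a new field to some documents requires no schema change.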
Column-oriented database
Cassandra, HBase, and Hypertable belong to this type. This type of NoSQL database has attracted particular attention because of the explosive growth in data volumes in recent years.
Row-oriented and column-oriented databases
An ordinary relational database stores data in units of rows and is good at reading in units of rows, for example fetching the records that match a specific condition. For this reason a relational database is also called a row-oriented database. In contrast, a column-oriented database stores data in units of columns and is good at reading data column by column.
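The difference between the two layouts can be pictured with ordinary Python structures. This is only a conceptual sketch with made-up sample data; real column-oriented databases use far more elaborate storage formats.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"id": 1, "name": "alice", "age": 31},
    {"id": 2, "name": "bob", "age": 24},
    {"id": 3, "name": "carol", "age": 45},
]

# Column-oriented layout: all values of one column are stored together.
columns = {field: [row[field] for row in rows] for field in rows[0]}

def get_row(rows, row_id):
    """Row layout suits fetching one whole record by a condition."""
    return next(row for row in rows if row["id"] == row_id)

# Column layout suits scanning a single column across all records.
average_age = sum(columns["age"]) / len(columns["age"])
```

Fetching bob's whole record touches one entry in the row layout, while computing the average age touches only the "age" list in the column layout.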
High scalability
Column-oriented databases are highly scalable: processing speed (especially write speed) does not drop even as the data grows, so they are used mainly where large amounts of data must be handled. Taking advantage of this, they are also useful as storage for batch programs that update large amounts of data. However, because the way of thinking behind a column-oriented database differs so much from conventional database storage, it is very difficult to apply.
- High scalability (especially for write processing)
- Very difficult to apply
Recently, the advantages of column-oriented databases have proved very useful for services that need to update and query large amounts of data, such as Twitter and Facebook. They are not covered in detail here, however, because they have little to do with the content of this book.
Summary:
NoSQL means "Not Only SQL", not "No SQL".
NoSQL appeared to compensate for the performance limits that SQL databases face, because of mechanisms such as transactions, when handling large amounts of data and highly concurrent requests.
NoSQL is not a replacement for SQL; it is a complementary option, not the default solution.
Most NoSQL products depend on large amounts of memory and high-performance random reads and writes (for example, fast SSD arrays), so small businesses in particular should be cautious when choosing NoSQL. Do not adopt NoSQL for its own sake; doing so may waste money and delay the project.
NoSQL is not a cure-all, but in large projects you often need it!