Relational databases offer high performance, but they are general-purpose databases and cannot be equally well suited to every use case. Specifically, they are not good at the following kinds of processing:
Writing a large amount of data
Creating indexes or changing the schema of tables whose data is being updated
Applications in which the fields are not fixed
Returning results quickly for simple queries
(1) Writing a large amount of data
On the data-reading side, the master-slave mode produced by replication (the master database handles writes, the slave databases handle reads) lets you scale out simply by adding slave databases. For data writing, however, there is no equally simple solution. For example, to handle a larger write volume you might consider going from one primary database to two, configured as dual masters that replicate to each other. It seems the load on each primary would be cut in half, but update processing can then produce conflicts (the same data being updated to different values on the two servers), and the data may become inconsistent. To avoid this problem, the writes for each table must be routed to the appropriate primary database, which is not that simple.
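As a rough illustration of routing writes per table, here is a minimal Python sketch. Everything in it is hypothetical: the table names, the table-to-primary mapping, and the in-memory dicts standing in for the two master databases are placeholders for whatever driver and topology you actually use.

```python
# A minimal sketch of per-table write routing between two primaries.
# The dicts below are hypothetical stand-ins for real database connections.

TABLE_TO_PRIMARY = {
    "users":  "primary_a",   # all writes to `users` go to primary A
    "orders": "primary_b",   # all writes to `orders` go to primary B
}

# Stand-ins for the two master databases.
PRIMARIES = {"primary_a": {}, "primary_b": {}}

def write(table: str, row_id: int, row: dict) -> None:
    """Send the write to the one primary responsible for this table,
    so the same row is never updated on both masters."""
    primary = PRIMARIES[TABLE_TO_PRIMARY[table]]
    primary.setdefault(table, {})[row_id] = row

write("users", 1, {"name": "alice"})
write("orders", 7, {"user_id": 1, "total": 42})
print(PRIMARIES)
```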
You can also split the database across different database servers, for example placing this table on one server and that table on another. Splitting reduces the amount of data held by each database server, which cuts down hard-disk I/O and lets more processing happen in memory, so the effect can be significant. However, tables stored on different servers cannot be JOINed, so this must be considered before splitting. If JOIN processing is still needed after the split, the association has to be performed in the application program, which is difficult.
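When the tables live on different servers, the JOIN has to be reassembled in application code. A minimal sketch, with hypothetical in-memory lists standing in for the results fetched from the two servers and made-up table and column names:

```python
# Hypothetical data as it might be fetched from two separate servers;
# in reality these would be the results of two independent queries.
users_from_server_a = [
    {"user_id": 1, "name": "alice"},
    {"user_id": 2, "name": "bob"},
]
orders_from_server_b = [
    {"order_id": 10, "user_id": 1, "total": 42},
    {"order_id": 11, "user_id": 2, "total": 7},
]

# The "JOIN" is done by hand: index one side by the join key, then merge.
users_by_id = {u["user_id"]: u for u in users_from_server_a}
joined = [
    {**order, "name": users_by_id[order["user_id"]]["name"]}
    for order in orders_from_server_b
    if order["user_id"] in users_by_id
]
print(joined)
```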
(2) Indexes or schema changes on tables whose data is being updated
When using a relational database, you create indexes to speed up queries, and to add a required field you must change the table structure. These operations lock the table, and while the lock is held data changes (updates, inserts, and deletes) cannot proceed. If the operation is time-consuming (for example, creating an index on a table with a large amount of data, or altering its structure), be aware that the data may remain un-updatable for a long time (a sketch of such statements follows the lock definitions below).
Shared lock: other connections can read the data but cannot modify it; it is a read lock.
Exclusive lock: other connections can neither read nor modify the data; it is a write lock.
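To make the operations in question concrete, here is a minimal sketch using Python's built-in sqlite3 module; the table and column names are made up. On a production database with a large table, statements like these are exactly where long-held locks can stall updates.

```python
import sqlite3

# In-memory database just for illustration; any DB-API driver looks similar.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")

# Creating an index: on a table with a large amount of data, this is the
# kind of statement that can hold a lock for a long time.
cur.execute("CREATE INDEX idx_articles_title ON articles (title)")

# Changing the schema to add a required field has the same problem.
cur.execute("ALTER TABLE articles ADD COLUMN author TEXT")

conn.commit()
conn.close()
```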
(3) Applications in which the fields are not fixed
If the fields are not fixed, a relational database is also awkward to use. Adding a field to the table structure every time the requirements change is painful. You could instead pre-define a large number of spare fields, but then it becomes easy, over time, to lose track of which field holds which data, so this approach is not recommended.
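To make the "spare field" problem concrete, here is a small hypothetical sketch: with pre-allocated generic columns you need a separate mapping just to remember what each column currently means, whereas a schema-less record simply names its own fields.

```python
# Pre-set spare fields: the table carries generic columns, and a separate
# mapping is needed to remember what each one means (all names hypothetical).
row_with_spare_fields = {"id": 1, "spare1": "red", "spare2": "44", "spare3": None}
spare_field_meaning = {"spare1": "color", "spare2": "shoe_size"}  # easy to lose track of

# Schema-less record: each field carries its own name, no mapping required.
flexible_record = {"id": 1, "color": "red", "shoe_size": 44}
```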
(4) Returning results quickly for simple queries
Relational databases are not good at returning results quickly for simple queries. Because a relational database reads data through the dedicated SQL language, it must parse the SQL and pay additional costs such as locking and unlocking tables. This does not mean relational databases are too slow; the point is simply that if all you need is to process simple queries at high speed, a relational database is not necessary.
Relational databases are widely used for complex processing such as transactions and JOINs. By comparison, NoSQL databases are applied only in specific fields and basically do not perform complex processing; they simply make up for the weaknesses of relational databases listed above.
NoSQL databases do not support JOIN processing in the first place: each piece of data is designed independently, which makes it easy to distribute the data across multiple servers. Because the data is spread over multiple servers, the amount of data on each server is reduced, so even writing a large amount of data becomes easier to handle, and reading data is of course just as easy.
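Because each item is independent and addressed by its key, spreading the data across servers can be as simple as hashing the key. A minimal sketch, with hypothetical server names and key formats:

```python
import hashlib

# Hypothetical list of servers holding the data.
SERVERS = ["node-1", "node-2", "node-3"]

def server_for(key: str) -> str:
    """Pick a server deterministically from the key alone; since there are
    no JOINs, no item needs to know which server holds any other item."""
    digest = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    return SERVERS[digest % len(SERVERS)]

print(server_for("user:1"), server_for("user:2"), server_for("order:10"))
```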
There are two ways to improve the capacity to handle large amounts of data: improving performance (scaling up, vertical) and increasing scale (scaling out, horizontal). Improving performance means raising processing capability by upgrading the current server, which is costly.
Increasing scale means using multiple cheap servers to raise processing capability. It requires changes to the program, but because the servers are cheap the cost can be kept under control, and later you only need to add more servers.
Key-value stores:
This is the most common type of NoSQL database. Its data is stored in the form of key-value pairs. Processing is very fast, but data can only be retrieved by an exact match on the key. Data can be stored in a temporary or permanent manner.
Temporary: memcached keeps all data in memory, so storing and reading it is very fast.
Permanent: the data is stored on the hard disk. Compared with memcached, which processes data in memory, there is still a performance gap because hard-disk I/O is required.
Both: Redis is of this type. Data is first kept in memory and is written to the hard disk when certain conditions are met (by default, at least 1 key changed within 15 minutes, at least 10 keys changed within 5 minutes, or at least 10,000 keys changed within 1 minute). This keeps the processing speed of in-memory data while also persisting the data to disk (a usage sketch follows below).
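A minimal usage sketch, assuming a Redis server running on localhost and the third-party redis-py client; the key name is made up.

```python
import redis  # redis-py client (assumed installed)

# Assumes a Redis server listening on the default localhost:6379.
r = redis.Redis(host="localhost", port=6379)

# Writes land in memory first; Redis persists snapshots to disk
# according to its configured save rules.
r.set("session:42", "alice")
print(r.get("session:42"))  # b'alice' -- values come back as bytes
```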
Document-oriented databases:
Document-oriented databases do not define a table structure in advance. Unlike key-value stores, they can retrieve data using complex query conditions. Although they lack the transaction and JOIN capabilities of relational databases, the rest of the processing can basically be achieved.
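A minimal sketch of querying by a complex condition in a document-oriented database, assuming a local MongoDB instance and the pymongo driver as one widely used example; the database, collection, and field names are made up.

```python
from pymongo import MongoClient  # pymongo driver (assumed installed)

# Assumes MongoDB running on the default localhost:27017.
client = MongoClient("mongodb://localhost:27017")
products = client["shop"]["products"]   # no table structure defined up front

# Documents in the same collection may carry different fields.
products.insert_one({"name": "mug", "price": 8, "color": "blue"})
products.insert_one({"name": "desk", "price": 120, "material": "oak"})

# Unlike a key-value store, retrieval can use a complex condition,
# not just an exact match on the key.
for doc in products.find({"price": {"$lt": 100}}):
    print(doc["name"], doc["price"])
```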
Column-oriented databases:
Common relational databases store data row by row. They are good at reading and processing data in units of rows, such as retrieving the rows that match specific conditions, which is why relational databases are also called row-oriented databases.
A column-oriented database, by contrast, is good at reading just a few columns from a large number of rows, and at updating a specific column of all rows at once.
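A minimal in-memory sketch of the difference in layout, with made-up data: the row store keeps whole rows together, while the column store keeps each column together, so reading or updating one column of many rows touches only that column.

```python
# Row-oriented layout: each row is stored as one unit (hypothetical data).
row_store = [
    {"id": 1, "name": "alice", "age": 30},
    {"id": 2, "name": "bob",   "age": 25},
    {"id": 3, "name": "carol", "age": 41},
]

# Column-oriented layout: each column is stored as one unit.
column_store = {
    "id":   [1, 2, 3],
    "name": ["alice", "bob", "carol"],
    "age":  [30, 25, 41],
}

# Reading one column from many rows touches every row in the row store...
ages_from_rows = [row["age"] for row in row_store]
# ...but only a single contiguous list in the column store.
ages_from_columns = column_store["age"]

# Updating a specific column of all rows is likewise a single-column operation.
column_store["age"] = [a + 1 for a in column_store["age"]]
```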
Column-oriented databases are highly scalable: even as the data grows, the processing speed (especially the write speed) does not drop, so they are mainly used when a large amount of data must be handled. They are also very useful as storage for batch programs that update large amounts of data, taking advantage of these column-oriented strengths.