A new understanding of databases: data redundancy

Source: Internet
Author: User

Today I am writing down what I learned from a technology-sharing session at my company, which made me realize that data redundancy can also be a useful tool.

Database normal forms

Any database course at school covers the normal forms, so here is a brief review.
- First normal form (1NF)
All domains must be atomic: each column of a database table holds an indivisible atomic data item, not a collection, an array, a record, or any other non-atomic item. In other words, there are no repeating fields.
- Second normal form (2NF)
On top of satisfying the first normal form, each instance or record in a database table must be uniquely distinguishable. That is, the table has a primary key, and every non-key attribute depends on the whole key.
- Third normal form (3NF)
On top of satisfying the second normal form, no non-key attribute depends on another non-key attribute.

A certain amount of redundancy can improve performance

Data redundancy means that the same data is stored more than once within a data set.

1. Trading space for time

Suppose there is a dictionary table city with two fields, ID and cityname, and a business table with the fields ID, cityid, xxx, xxx ...
Querying the business table then requires a join with the city dictionary table, and if the business table is very large the query becomes slow. We can use redundancy to solve this: store cityname directly in the business table in place of cityid, so that queries on the business table no longer need to join the city dictionary table. This clearly violates the normal forms of database design, but such redundancy may be necessary.
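As a minimal sketch of the city example above, the following uses Python's built-in sqlite3 module with an in-memory database. The table name business_denorm, the detail column, and the sample rows are assumptions made for illustration; only the city and business tables come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city (id INTEGER PRIMARY KEY, cityname TEXT NOT NULL);
    -- Normalized design: business rows reference the city by id.
    CREATE TABLE business (id INTEGER PRIMARY KEY, cityid INTEGER NOT NULL, detail TEXT);
    -- Redundant design (hypothetical table): the city name is copied into every row.
    CREATE TABLE business_denorm (id INTEGER PRIMARY KEY, cityname TEXT NOT NULL, detail TEXT);
""")
conn.executemany("INSERT INTO city VALUES (?, ?)", [(1, "Hangzhou"), (2, "Beijing")])
conn.executemany(
    "INSERT INTO business VALUES (?, ?, ?)",
    [(1, 1, "order A"), (2, 2, "order B"), (3, 1, "order C")],
)
# Fill the redundant table once from the normalized tables.
conn.execute("""
    INSERT INTO business_denorm (id, cityname, detail)
    SELECT b.id, c.cityname, b.detail
    FROM business b JOIN city c ON c.id = b.cityid
""")

# The normalized query needs a join ...
joined = conn.execute("""
    SELECT b.id, c.cityname
    FROM business b JOIN city c ON c.id = b.cityid
    ORDER BY b.id
""").fetchall()
# ... while the redundant table answers the same question on its own.
direct = conn.execute(
    "SELECT id, cityname FROM business_denorm ORDER BY id"
).fetchall()
```

Both queries return the same rows. The price of the redundant copy is that renaming a city now means updating every business row that carries the old name.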

2. Querying data by a status value

The business table has a status field that records whether a row is submitted or unsubmitted. Suppose the table holds far fewer unsubmitted rows than submitted ones. When a user queries all unsubmitted data, the database has to go through the full data set and filter out the unsubmitted rows. If the business table is very large, such a query is very slow.
In this case we can redundantly store the unsubmitted rows of the business table in a new table, so that a query for unsubmitted data reads the unsubmitted table directly and runs much faster.
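One possible way to keep such a redundant table in sync is sketched below with Python's sqlite3 module. The table business_unsubmitted, the status strings, and the helper functions add_record and submit_record are hypothetical names chosen for this example, not part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE business (id INTEGER PRIMARY KEY, status TEXT NOT NULL, detail TEXT);
    -- Redundant copy that holds only the rows still waiting to be submitted.
    CREATE TABLE business_unsubmitted (id INTEGER PRIMARY KEY, detail TEXT);
""")

def add_record(record_id, detail):
    """Insert a new unsubmitted row into both tables in one transaction."""
    with conn:
        conn.execute(
            "INSERT INTO business VALUES (?, 'unsubmitted', ?)", (record_id, detail)
        )
        conn.execute(
            "INSERT INTO business_unsubmitted VALUES (?, ?)", (record_id, detail)
        )

def submit_record(record_id):
    """Mark a row as submitted and drop its redundant copy."""
    with conn:
        conn.execute(
            "UPDATE business SET status = 'submitted' WHERE id = ?", (record_id,)
        )
        conn.execute("DELETE FROM business_unsubmitted WHERE id = ?", (record_id,))

for i in range(1, 6):
    add_record(i, f"record {i}")
for i in (1, 2, 3, 4):
    submit_record(i)

# The small table answers the "unsubmitted" query without scanning business.
pending = conn.execute("SELECT id, detail FROM business_unsubmitted").fetchall()
total_rows = conn.execute("SELECT COUNT(*) FROM business").fetchone()[0]
```

Every write now touches two tables, so both statements run inside one transaction to keep the copy from drifting away from the main table.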

3. Splitting active and inactive data

Some business tables have this characteristic: users mostly search the data of the last three months (or the last N months), while the data keeps growing every day. As the table grows, queries become slower and performance hits a bottleneck. In this case the business data can be split into data from the last three months and data older than three months, and the older data can be split further by year (or by month or quarter) into different shards. Most user queries then hit the table holding the last three months, whose data volume is bounded and not particularly large, which resolves the performance bottleneck.
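The split described above could look roughly like the following sketch, again using sqlite3. The table names business_recent and business_archive, the created_on column, the 90-day window, and the archive_old_rows helper are all assumptions for illustration; a real system would likely run such a job on a schedule.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Active data: only rows inside the retention window stay here.
    CREATE TABLE business_recent (id INTEGER PRIMARY KEY, created_on TEXT NOT NULL, detail TEXT);
    -- Inactive data: everything older is moved here.
    CREATE TABLE business_archive (id INTEGER PRIMARY KEY, created_on TEXT NOT NULL, detail TEXT);
""")

def archive_old_rows(today, keep_days=90):
    """Move rows older than keep_days from the recent table to the archive."""
    # ISO dates (YYYY-MM-DD) sort correctly as plain text.
    cutoff = (today - timedelta(days=keep_days)).isoformat()
    with conn:
        conn.execute(
            "INSERT INTO business_archive "
            "SELECT * FROM business_recent WHERE created_on < ?",
            (cutoff,),
        )
        conn.execute("DELETE FROM business_recent WHERE created_on < ?", (cutoff,))

conn.executemany(
    "INSERT INTO business_recent VALUES (?, ?, ?)",
    [
        (1, "2023-11-15", "old order"),
        (2, "2024-02-01", "older order"),
        (3, "2024-05-20", "recent order"),
        (4, "2024-06-29", "latest order"),
    ],
)
archive_old_rows(date(2024, 6, 30))

recent_ids = [row[0] for row in conn.execute("SELECT id FROM business_recent ORDER BY id")]
archived_ids = [row[0] for row in conn.execute("SELECT id FROM business_archive ORDER BY id")]
```

With the sample date, the cutoff is 2024-04-01, so rows 1 and 2 move to the archive and rows 3 and 4 stay in the recent table. Splitting the archive further by year or quarter would follow the same pattern with one table per period.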

4. Aggregated data stored separately

A business table holds daily transaction data, and users sometimes want to see the total transaction amount for a particular quarter or year. With only a few users this is fine, but as the number of users and queries grows, computing the total in real time can no longer keep up. In this case we can store this kind of summary data in a separate table and compute it on a schedule at night, when few users are online. When a user queries again, we show the data from the summary table directly instead of computing it in real time from the transaction table, which improves performance considerably.
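A minimal sketch of such a summary table, assuming hypothetical tables transactions and quarterly_summary, amounts stored as integers, and a refresh_summary function that stands in for the nightly job:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY, traded_on TEXT NOT NULL, amount INTEGER NOT NULL
    );
    -- Precomputed totals, one row per (year, quarter).
    CREATE TABLE quarterly_summary (
        year INTEGER, quarter INTEGER, total INTEGER, PRIMARY KEY (year, quarter)
    );
""")
conn.executemany(
    "INSERT INTO transactions (traded_on, amount) VALUES (?, ?)",
    [("2024-01-10", 100), ("2024-03-31", 250), ("2024-04-01", 400), ("2024-12-25", 50)],
)

def refresh_summary():
    """Rebuild the summary table; meant to run off-peak, for example at night."""
    with conn:
        conn.execute("DELETE FROM quarterly_summary")
        conn.execute("""
            INSERT INTO quarterly_summary (year, quarter, total)
            SELECT year, quarter, SUM(amount) FROM (
                SELECT CAST(strftime('%Y', traded_on) AS INTEGER) AS year,
                       -- Months 1-3 map to quarter 1, 4-6 to quarter 2, and so on.
                       (CAST(strftime('%m', traded_on) AS INTEGER) + 2) / 3 AS quarter,
                       amount
                FROM transactions
            )
            GROUP BY year, quarter
        """)

def quarter_total(year, quarter):
    """Read a total from the summary table without touching transactions."""
    row = conn.execute(
        "SELECT total FROM quarterly_summary WHERE year = ? AND quarter = ?",
        (year, quarter),
    ).fetchone()
    return row[0] if row else 0

refresh_summary()
```

After the refresh, quarter_total(2024, 1) reads a single precomputed row. The trade-off is freshness: transactions added after the last refresh do not appear in the totals until the job runs again.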
