Two Misunderstandings of Database Design That Programmers Should Know

Source: Internet
Author: User

In the projects I have worked on, certain database design problems keep turning up, especially the two described below. I think both should be avoided, so I summarize them here for discussion.

Misunderstanding 1: Backup Fields

Symptom description:

In a data table, we design not only the fields that are required, but also set aside several spare fields as backups.

For example, I design a Person table with the necessary fields, including Name, Sex, and Birthday. Once it is done, it occurs to me that the system will probably need much more people-related information in the future, such as the school a person graduated from or their employer. None of it is required now, but it might be used later. So, on a whim, I add five varchar2 fields named Text1, Text2, ..., Text5. Then I figure that some date fields will be needed too, so I add three date fields named date1, date2, date3, and so on.

Cause analysis:

The problem should already be visible. This table contains a large number of fields that are useless for now. We can call them backup fields. What are they for? They are there just in case, to guard against situations that might arise.

This looks like a sensible precaution. When a new attribute is needed, no new field has to be added to the table, and the table's data stays in adjacent physical storage, which is also supposed to be good for performance.

Another reason is that in older databases, changing a definition (adding a field, changing a field's type, deleting a field, and so on) caused the existing data to be lost, which made such changes very troublesome. We had to create a temporary table, back up the data into it, create the new table, import the data, and then delete the original table.
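The rebuild procedure described above can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module and an in-memory database; the table name person, the new column school, and the helper rebuild_with_new_column are assumptions made for the example, and the exact order of steps varies between database systems.

```python
import sqlite3


def rebuild_with_new_column(conn: sqlite3.Connection) -> None:
    """Add a 'school' column the old-fashioned way: back up, recreate, reload."""
    cur = conn.cursor()
    # 1. Back up the existing rows into a temporary table.
    cur.execute(
        "CREATE TABLE person_backup AS "
        "SELECT id, name, sex, birthday FROM person"
    )
    # 2. Drop the original table and create it again with the new definition.
    cur.execute("DROP TABLE person")
    cur.execute(
        "CREATE TABLE person ("
        "id INTEGER PRIMARY KEY, name TEXT, sex TEXT, birthday TEXT, school TEXT)"
    )
    # 3. Import the backed-up rows; the new column starts out as NULL.
    cur.execute(
        "INSERT INTO person (id, name, sex, birthday) "
        "SELECT id, name, sex, birthday FROM person_backup"
    )
    # 4. Remove the temporary backup table.
    cur.execute("DROP TABLE person_backup")
    conn.commit()


# Hypothetical sample data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    "id INTEGER PRIMARY KEY, name TEXT, sex TEXT, birthday TEXT)"
)
conn.execute(
    "INSERT INTO person (name, sex, birthday) VALUES ('Alice', 'F', '1990-01-01')"
)

rebuild_with_new_column(conn)

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(person)")]
rows = conn.execute("SELECT name, school FROM person").fetchall()
print(columns)
print(rows)
```

Four statements and a full copy of the data just to add one column explains why designers working with such systems were tempted to reserve fields in advance.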

Problem:

This approach can cause many problems for a project, and the problem it was meant to solve may remain unsolved. If you do not believe it, read on.

Problem 1: Adding a large number of backup fields wastes space. Even when they hold no data, the empty fields can still occupy a certain amount of space.

Problem 2: Because the names are generic, unless the documentation is managed thoroughly, before long (perhaps two or three years) no one can say what each field means. Even with documentation, the management work is troublesome: every use of a backup field has to be applied for, and conflicts can still occur.

Problem 3: Will these backup fields be enough? Not necessarily, because we set aside only a few spares for each field type. If more are needed, or if we need a special or uncommon type, we still have to add new fields. For example, to store photos in the Person table above, we may need a blob field named photo, which the initial design probably did not reserve. And without thorough management, who can tell which fields are already in use and which are still free? In the end, new fields get added anyway.

Solution:

In fact, the design approach above is "over-design". What we should do is "on-demand design": after detailed and effective analysis, place only the necessary fields in the data table, rather than reserving a large number of backup fields.

When new information does need to be stored, decide case by case:

If there are only a few new attributes and they are closely related to the original table, add the fields directly to the original table and update the affected data.

If there are many new attributes, or they are not essential attributes of the object the original table describes, create a new table and link it to the original table by a key.
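The two cases can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed schema: the column graduate_school, the table person_employment, and the sample values are all hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE person ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, sex TEXT, birthday TEXT)"
)
conn.execute(
    "INSERT INTO person (name, sex, birthday) VALUES ('Alice', 'F', '1990-01-01')"
)

# Case 1: a single, closely related attribute.
# Add a properly named column to the original table when it is actually needed.
conn.execute("ALTER TABLE person ADD COLUMN graduate_school TEXT")
conn.execute(
    "UPDATE person SET graduate_school = 'Example University' WHERE name = 'Alice'"
)

# Case 2: more information, or information that is not a core attribute.
# Put it in a new table and link it to person by key.
conn.execute(
    "CREATE TABLE person_employment ("
    "id INTEGER PRIMARY KEY, "
    "person_id INTEGER NOT NULL REFERENCES person(id), "
    "employer TEXT NOT NULL, "
    "start_date TEXT)"
)
conn.execute(
    "INSERT INTO person_employment (person_id, employer, start_date) "
    "VALUES (1, 'Example Corp', '2015-07-01')"
)

# Both kinds of information are reachable through the key.
row = conn.execute(
    "SELECT p.name, p.graduate_school, e.employer "
    "FROM person p JOIN person_employment e ON e.person_id = p.id"
).fetchone()
print(row)
```

Either way, every column has a name that says what it holds, which is exactly what Text1 through Text5 cannot offer.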

As for performance problems caused by where table data is physically stored, they can be addressed by reorganizing the database data at a scheduled time. A long-running database needs this kind of reorganization on a regular basis anyway.

Misunderstanding 2: Meaningful Encoding

Symptom description:

A meaningful code is used as the ID of a record, or even as the primary key of the table. For example, an employee's code is set to 0203004, where 02 indicates the employee's branch, 03 indicates the department, and 004 is the serial number of the employee within the department.
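The kind of parsing such a code invites might look like the sketch below. The 2-2-3 digit layout comes from the example above; the names EmployeeCode and parse_employee_code are hypothetical.

```python
from typing import NamedTuple


class EmployeeCode(NamedTuple):
    branch: str      # first two digits
    department: str  # next two digits
    serial: str      # last three digits: order of joining the department


def parse_employee_code(code: str) -> EmployeeCode:
    """Split a 7-digit meaningful employee code into its parts."""
    if len(code) != 7 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected a 7-digit code, got {code!r}")
    return EmployeeCode(branch=code[0:2], department=code[2:4], serial=code[4:7])


parsed = parse_employee_code("0203004")
print(parsed.branch, parsed.department, parsed.serial)
```

Note how much the layout already constrains: at most 100 branches, 100 departments per branch, and 1,000 employees per department, regardless of how many employees the company has overall.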

Cause analysis:

An ID can be assigned in several ways. One is a pure serial number that starts at 1 and increases by 1 each time, possibly converted to a fixed-width string such as "0000001". Another is a meaningless random code, such as a GUID. The third is a meaningful code, in which specific digit positions carry specific meanings.

I think people like this method mainly because they want to read information directly from the code; some programs even have dedicated modules for parsing codes. It is like an ID card number: from the number alone you can tell the holder's place of registration, birthday, gender, and other information.

Problem:

In fact, meaningful encoding can cause many problems. Consider the following:

Problem 1: Waste of coding resources. A pure serial number running from 1 to 10,000 can represent 10,000 records. With a meaningful encoding, however, as few as 1,000 records may exhaust a five-digit code. I have seen this happen. In our company's insurance policy numbers, the first digit is meaningful: it represents the channel the policy belongs to, and it is followed by a long serial number (9 digits). In theory these numbers would never run out. At first, the three channels used the leading digits 1, 4, and 7. Then a new insurance law came into force, the original policy forms were voided, and the digits 2, 5, and 8 were put into use. Then the company changed its name, and the three channels each retired their policy forms again and started new leading digits. In this way, within just a few years, all the policy numbers had been used up, even though only about 1 million policies had actually been printed.
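The scale of the waste in the insurance example can be checked with a little arithmetic. This sketch assumes that all ten digits 0 to 9 could in principle serve as the leading digit, which the account above does not state; the figures of 9 serial digits and 1 million policies come from the account.

```python
# Assumption: any of the ten decimal digits can be the leading (channel) digit.
LEADING_DIGIT_VALUES = 10
SERIAL_DIGITS = 9

# Each leading digit owns its own block of serial numbers.
numbers_per_leading_digit = 10 ** SERIAL_DIGITS
theoretical_capacity = LEADING_DIGIT_VALUES * numbers_per_leading_digit

policies_issued = 1_000_000
utilization = policies_issued / theoretical_capacity

print(f"numbers per leading digit: {numbers_per_leading_digit:,}")
print(f"theoretical capacity:      {theoretical_capacity:,}")
print(f"share actually used:       {utilization:.4%}")
```

Every time a leading digit is retired, a block of a billion numbers goes with it, so the number space runs out after a handful of business events rather than after billions of policies.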

Problem 2: A meaningful code is not necessarily unique, which makes it a poor primary key. Think about ID card numbers again. In the older 15-digit number, the last three digits are a serial number, with odd numbers for men and even numbers for women. That leaves room for at most 500 men and 500 women born on the same day in the same region; beyond that, numbers would be duplicated. The probability is low, but the possibility objectively exists.

Problem 3: The meaning is not necessarily accurate. For example, suppose a meaningful code is used as the employee ID and encodes the department, position, and so on. If an employee transfers between departments or changes rank, do you change the code? If you change it, all the historical data has to be modified, and the workload is very large. If you do not change it, the meaning is no longer accurate, and we cannot get correct information about the employee from the code.
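The dilemma can be made concrete with a small sketch using Python's built-in sqlite3 module. The tables employee and timesheet, the separate department column, and the sample rows are hypothetical and exist only to show the two bad options.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (code TEXT PRIMARY KEY, name TEXT, department TEXT);
    CREATE TABLE timesheet (employee_code TEXT, work_date TEXT);
    INSERT INTO employee VALUES ('0203004', 'Alice', '03');
    INSERT INTO timesheet VALUES ('0203004', '2020-01-02');
    INSERT INTO timesheet VALUES ('0203004', '2020-01-03');
    """
)

# Alice transfers from department 03 to department 05.
conn.execute("UPDATE employee SET department = '05' WHERE code = '0203004'")

# Option A: keep the old code. The department digits inside it are now wrong.
code, department = conn.execute(
    "SELECT code, department FROM employee"
).fetchone()
department_in_code = code[2:4]
code_is_stale = department_in_code != department

# Option B: issue a new code. Every historical row that references the
# old code would then have to be rewritten.
rows_to_rewrite = conn.execute(
    "SELECT COUNT(*) FROM timesheet WHERE employee_code = ?", (code,)
).fetchone()[0]

print(code_is_stale, rows_to_rewrite)
```

In a real system the second number is the total across every table that stores the employee code, which is where the very large workload comes from.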

Solution:

Therefore, meaningful encoding is not recommended for IDs. Use a pure serial number instead. This may require a type with a fairly large range, and it may still not be enough for a massive number of records; in that case, use a GUID, which in practice will not repeat and provides an enormous code space.
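A minimal sketch of both meaningless key styles, using Python's built-in sqlite3 and uuid modules. The table layout and the helper add_employee are hypothetical, and the table carries both a serial id and a GUID only so that one example can show the two options; a real design would normally pick one. The information that a meaningful code would have carried lives in ordinary columns.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "  # pure serial number
    "guid TEXT NOT NULL UNIQUE, "             # random, meaningless identifier
    "name TEXT NOT NULL, "
    "branch TEXT, "                           # meaning is stored in columns,
    "department TEXT)"                        # not encoded in the key
)


def add_employee(name: str, branch: str, department: str) -> int:
    """Insert an employee and return the serial id the database assigned."""
    cur = conn.execute(
        "INSERT INTO employee (guid, name, branch, department) "
        "VALUES (?, ?, ?, ?)",
        (str(uuid.uuid4()), name, branch, department),
    )
    return cur.lastrowid


alice_id = add_employee("Alice", "02", "03")
bob_id = add_employee("Bob", "02", "03")

# A transfer is a one-column update; the key and all references stay valid.
conn.execute("UPDATE employee SET department = '05' WHERE id = ?", (alice_id,))

alice_department = conn.execute(
    "SELECT department FROM employee WHERE id = ?", (alice_id,)
).fetchone()[0]
guids = [row[0] for row in conn.execute("SELECT guid FROM employee")]
print(alice_id, bob_id, alice_department)
```

Because the key says nothing about the employee, nothing about the employee can make it wrong.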

These two points show that some practices in database design have been adopted in many systems and yet bring many problems. When we meet them, we should think carefully and then firmly resist them.
