Foreword: Author Shivprasad Koirala, former Microsoft asp/asp. NET's Mvcproject league. Now he is in the CEO Office of India. The first time a student translates, if wrong. Advised.
Basic introduction
Before you start reading this article, let me tell you that I am not a master of database design. The following 11 guidelines are derived from the project, from my own experience and from my own understanding and learning.
I personally think that the use of these guidelines in database design can benefit me.
I also welcome any criticism or advice.
The reason why I write such an exhaustive article. As a result, many developers have embraced the "three paradigm" when designing databases (the Translator notes: But not the actual situation). They feel that the three paradigms are the only true meaning of database design. With the development of the project, the people who hold these ideas will often hit the wall and encounter endless troubles.
Suppose you don't quite know what a paradigm is. To be able to click here:3 normal forms: part1-part2-part3(Translator Note: The original yotube video, because it is known for reasons that cannot be viewed domestically, I have shared the original video to the csdn download area, download the free points, this tutorial will teach you these three paradigms.
"Paradigm" is a very important criterion. But suppose to obey completely. If you don't make the slightest change, you'll often get into trouble. When I design a database, I keep these 11 important guidelines in mind. Guideline 1: What is the nature of the application (OLTP or OLAP)?
(Translator Note: Online transaction processing or online analysis processing, that is, preference for additions or deletions or bias to data analysis)
When you start designing a database, the first thing to do is to analyze the nature of the application you're designing, whether it's transactional or analytic. You'll find that many developers simply don't consider what type of application they are, but all of them are designed according to the "paradigm", and then come across a variety of performance and customization issues. As I have said. There are two applications: transactional-based and analytic-based, so let's take a look at what these two types are in detail.
transaction type : This kind of application, finally the user to add and delete to change the attention, for example, adding, reading, updating and deleting records.
For this type of database, formally we call it OLTP (translator: Online Transaction processing, online transaction processing).
Analytic type : This kind of application, finally the user pays more attention to the data analysis, produces the report and to the data forecast and so on.
A database of this type rarely runs an operation that inserts or updates data. The main intent of such a design is to obtain data from the database at the fastest speed. and analyzed. For such a type of database. Formally, we call it OLAP (translator Note: Online analytical processing, on-line Analytical processing).
In other words, it is more important to assume that you are making inserts, updates, or deletions on the database today. Then you should follow the standard paradigm to design, or you should create a flat non-canonical database structure.
Shows how to use denormalized design ideas in the left-hand name and address table to create flat, non-canonical data structures on the right side that do not conform to the standard paradigm.
Guideline 2: Divide your data into several logical fragments. Make Life better
This rule is actually the first paradigm. One obvious feature of the violation of this rule is that your query statement uses a whole bunch of string parsing functions. such as substring, charindex and so on. Assuming this is the case, then you need to apply this rule.
For example, there is a field in the table "Student Name". If you want to query a student whose name has "Koirala" without "Harisingh", you can imagine what the query statement would look like when you wrote it.
So a better solution would be to break the field down into more specific logical fragments so that we can write more cleanly. Also more perfect query statements.
watermark/2/text/ahr0cdovl2jsb2cuy3nkbi5uzxqvagfjdhjvea==/font/5a6l5l2t/fontsize/400/fill/i0jbqkfcma==/ Dissolve/70/gravity/southeast ">
Guideline 3: Do not over-use guideline 2
The developers are all smart men. Suppose you tell them a way, then they always use this method.
Guideline 2 Using your head will result in something you don't want at all.
But guideline 2 is very practical in itself.
When you think of "Take this thing down", pause and ask yourself, is it really necessary to dismantle it? As I have just said. The split must be logical.
For example, you can see that there is a field in the phone number that you almost don't have on the number ISD (International subscriber dialing. International subscriber dialing) are managed separately (unless there is a real need in which project). So it's obviously wiser to put the number on it. If you really tear them apart, then you will have to face a lot of "split Sequela".
watermark/2/text/ahr0cdovl2jsb2cuy3nkbi5uzxqvagfjdhjvea==/font/5a6l5l2t/fontsize/400/fill/i0jbqkfcma==/ Dissolve/70/gravity/southeast ">
Guideline 4: Redundant, non-uniform data is your biggest enemy.
Be careful with redundant data. and refactor them. I'm not worried about redundant data taking up a lot of disk space, what I'm worried about is the confusion and confusion caused by these redundant data.
For example, in the, you will find that "5th standard" and "fifth standard" is a meaning, just descriptive narrative form slightly different. You might be explaining that. These data were previously recorded in Random. and has not yet been verified.
When you plan to export a report, two different reports will be generated for this field. When the user sees the report, they're afraid they're going to get messy in the wind.
One solution is to move the data to a new table. The original table then uses the data in a referenced manner. As you can see, I created a new table called "Standards" and joined the two tables with a foreign key.
Guideline 5: Beware of data separated by separators
The second section of the first paradigm specifies that multiple values cannot be stored in a single position. is a sample that has multiple values stored. Careful observation of the "syllabus" field will reveal that we've stuffed too much data into this field. Such fields we become "repeating groups", assuming that we want to manipulate the data, not only the query statement will write very complex (and right or wrong). And I very much doubt the efficiency of such queries.
Some of these columns are filled with very many data fields. To be particularly careful, a better solution is to move these fields to a different table and then connect them with foreign keys for better management of logic and operations.
We need to comply with the second rule of the first paradigm: a location cannot store multiple values. As you can see, I have created a separate "syllabus" table and then established a many-to-many association with the main table.
In such a way, the "Syllabus" field in the main table is no longer crammed into a whole bunch of data separated by separators.
It is logically easy to understand and more operational.
Guideline 6: Beware of partial dependence
Beware of fields that are partially dependent on the primary key, such as the primary key in roll number and standard.
Now look at "syllabus", which is associated with the standard field, not directly with the student ("Roll number" field).
The "Syllabus" field is associated with the "standard" field that the student is learning. Without being directly associated with the student.
So tomorrow I want to update the "syllabus", I have to have all the relevant "Student" field also updated. Man, this is a hard work of strength. And there's no logic! Therefore, it is more meaningful to pull the "syllabus" field separately and then associate it with "standard" through a foreign key.
You can see how I pulled out the "syllabus" field and then contacted "standards".
This rule is actually the second paradigm itself: all attributes must be all dependent, and there should be no fields that are partially dependent on the primary key.
Guideline 7: Select derived columns in detail
watermark/2/text/ahr0cdovl2jsb2cuy3nkbi5uzxqvagfjdhjvea==/font/5a6l5l2t/fontsize/400/fill/i0jbqkfcma==/ Dissolve/70/gravity/southeast ">
Suppose you are designing an OLTP database, a transactional database. It would be wise to avoid deriving columns. Unless there is a high demand for the performance of this piece of content.
But for an OLAP database that requires a lot of summation and computation. That is, an analytic database. These derived columns are very necessary and can make database performance more advantageous.
You can see. Average relies on total marks and total subject to calculate, which is one of the manifestations of data redundancy. So when you come across fields that derive from other columns, do you really need to think about this field?
As the third Paradigm says, a field cannot depend on any non-primary key field. I personally feel that we should not blindly follow the third paradigm, depending on the circumstances. Having redundant data is not always a bad thing. Assuming that these redundant data are computed, it is up to the situation to decide whether to follow the third paradigm.
Guideline 8: Assuming that performance is the key, do not strictly avoid redundant data
watermark/2/text/ahr0cdovl2jsb2cuy3nkbi5uzxqvagfjdhjvea==/font/5a6l5l2t/fontsize/400/fill/i0jbqkfcma==/ Dissolve/70/gravity/southeast ">
Don't think of "avoiding all redundant data" as a dead order. Assuming that the performance requirements are very high, consider the non-standardized bar. comply with the standards. You often need to use the join command to connect very many tables, but in non-normalization, set up some redundant information. There are not so many join commands. So the performance will be much better.
Guideline 9: Multidimensional data is a completely different beast.
Projects that use OLAP (analytic) databases mainly deal with multidimensional data. For example, you can see, you want to get sales per country, sales per customer, and other data. In short, the sales you're focusing on are the intersection of three dimension data.
In this case, it would be better to design a dimension. In simple terms. You can create a sales table with a "Sales Amount" field (which is the central table), and then set up some foreign keys to associate with other one-dimensional tables.
Guideline 10: Design of a centralized key-value table
I often come across key-value tables.
The key value table means that it stores the key and the data corresponding to the key (the translator notes: Like the use of maps in a data structure, it is a fairly simple table). For example, the following table you can see there is a currency table and a country table. Careful observation reveals that there is actually only one key and one corresponding value in both tables.
For such a type of data table, create a centralized table and then set up a "type" field to differentiate.
This will make your database work better.
Guideline 11: The primary key to reference itself for multilevel data. or set up a foreign key
I will often encounter multiple episodes of data. Consider a multi-level sales solution where a salesperson can have very many salespeople. In such cases, referencing your own primary key or setting up a foreign key will help you achieve the desired effect.
watermark/2/text/ahr0cdovl2jsb2cuy3nkbi5uzxqvagfjdhjvea==/font/5a6l5l2t/fontsize/400/fill/i0jbqkfcma==/ Dissolve/70/gravity/southeast ">
This article does not allow you to not abide by the standard paradigm. Instead of blindly obeying, the first thing you need to consider is the properties of the project and what type of data you are dealing with.
Here is a simple school table to explain the three standard paradigm of video (translator Note: With the above "3 normal forms" for the same video, downloaded readers do not have to repeatedly download)
SQL Server--Can explain First,second and third normal form in SQL Server [Part1]
SQL Server--Can explain First,second and third normal form in SQL Server [Part2]
SQL Server--Can explain First,second and third normal form in SQL Server [PART3]
Applause
From the original: Http://www.codeproject.com/Articles/359654/11-important-database-designing-rules-which-I-foll
I'm going to abide by the 11 article Database Design Guide