Recently intended to write a database modeling article, so I intend to analyze Microsoft's official SQL Server sample database AdventureWorks, to see what is worth learning in this database.
First we need to download the installation of a SQL Server database engine, and then download the sample database, where the author is sql2008r2, so the download is adventureworks2008r2,:
http://msftdbprodsamples.codeplex.com/
You can see this database after you download the database and attach it to SQL Server.
This is a database of bicycle manufacturing and sales companies, the company builds its own sales website and offers online sales. First, look at the structure of this database, which establishes multiple schemas that divide the modules of the table, such as humanresources,person,production,purchasing and sales, by schema. If it is a very common table, such as a log table, then it does not belong to any module, using the system default schema:dbo.
For such a complex model, we can analyze it in the order of the relationship between the primary entity, the subsidiary entity, and the transaction entity.
Primary entity
For the system as a whole, businessentity is the core entity used to denote a "person", where the person is quoted because it can represent both a real natural person or a company, an organization, or even a store, which can be considered a legal entity. For this database model, there are 3 entities that inherit from BusinessEntity, which is Person,store,vendor.
?
For people who are natural, he may be a company employee or a client, so we have two entity employee and customer connected. It is important to note that in this model, he does not identify a natural person as a customer, but to a different store, which will form a different customer. In other words, for the company, he does not have customer master data, the same person in different stores to spend, then will be in different stores as a customer. Why this design, it is really strange, may be the business needs it.
?
This extends to the employee, you can humanresources the following entity analysis, it is clear that department is the main entity, as for the relationship between employee and department, we will then analyze, here we only find the main entity. Employee if in the sales department, then is a salesperson, so this entity is inherited from employee. In addition, there is an important entity product in production, which is used to represent the production and sales of products.
?
Affiliated Entities
The so-called subsidiary entity is the entity that is attached to the principal entity, which complements the attributes of the primary entity, and if the primary entity does not exist, then the data in the subsidiary entity is meaningless. For the main entity we are looking for, we have one analysis:
BusinessEntity
BusinessEntity has two subordinate entities: Businessentitycontact and businessentityaddress, for contact entities, is a many-to-many relationship with person, So Businessentitycontact is a multi-to-many intermediate table, plus contacttype describes the type of contact. And for the business entity address, the system also abstracted an address table, make businessentity and address form a many-to-many relationship.
?
Person
For the person table, the associated table is divided into two categories, one is a one-to-many or many-to-many common association tables, such as a man with multiple personphone, a person with multiple emailaddress, or a person holding multiple credit cards Personcreditcard. Here CreditCard and person set to many-to-many relationship, I think the creditcard here should include the company's business card situation, the real holder of this card is the company, but the company will be sent to sales, if sales leave, then this card will be withdrawn, Distributed to other employees, so this creates a many-to-many relationship. Another category is a one-to split or inheritance relationship, such as the password table. If it is a simple design, we can completely put the password related fields in the person table, and here is independent of the formation of a one-off relationship, mainly may be the following reasons:
Security Considerations: Password content is confidential, independent into the table can be separate to the table encryption, permission allocation and so on.
Performance considerations: Password's content is only used to authenticate when logging on to the system, and all subsequent queries do not use these fields, so it is not placed in the person table, and the system does not need to look up unwanted fields when querying the person table.
?
Employee
Here is mainly related to the table under the HumanResources, in addition to the basic information of employees also recorded the employee's resume, salary changes, department changes. An employee corresponds to multiple jobcandidate, why is it a one-to-many relationship? Because the candidate can make multiple versions of a CV, and then invest in different departments of the company, and finally if the candidate is admitted, then you can set the BusinessEntityID in JobCandidate as the employee ID, if the application fails, Then BusinessEntityID is null. EmployeePayHistory is the employee's payroll, but is not the payroll record sheet, just record the employee's salary basic information, if the wage change creates a new record. Employee and department are many-to-many relationships, not because one employee is working in multiple departments at the same time, but because the employee's departmental transfers are recorded, so all history is maintained and many-to-many relationships are formed. In addition to the general company's departmental staff table, the system also has a shift table shift, that is because this is a manufacturing company also has stores for sale, so it will be divided into the early, mid-shift and night shifts, a staff of the shift is fixed, if the change, such as the previous night shift, now to the early morning, A new record is also generated in the employeedepartmenthistory.
?
Sales
Sales inheritance to employee, there are sales areas, sales quotas and other additional attributes. Its own sales area and sales quotas can be seen as attributes of the sales table, but in order to record history, there is a one-to-many table: Salesterritoryhistory and SalesPersonQuotaHistory.
?
It is necessary to state that the SalesTerritory table is not a subsidiary table of sales, and that he is a separate entity.
Product
This entity should be the most complex entity in each of the main entities. Mainly divided into ProductModel and product two blocks.
First said ProductModel, can be understood as a sample, prototype or model, in the amount of pre-natal needs to produce ProductModel. For ProductModel, there are mainly parts of the product diagram illustration and description productdescription. ProductModel and illustration are common many-to-many relationships, a model has multiple part diagrams, and a part diagram can also be used in multiple prototypes. And for the description, in addition to the common many-to-many relationship, but also added a multi-lingual relationship. Thus, the culture table was added, forming a many-to-many relationship between the three tables. In fact, this multi-language model is not good, it is easy to produce errors, for multi-lingual processing, can build a better model.
?
Next is the product entity, which divides the related tables into three categories:
Many-to-one: Product classification category and the aforementioned ProductModel.
One-to-many: product cost history productcosthistory, product composition BillOfMaterials, product inventory productinventory, product price history productlistpricehistory, Product Review ProductReview.
Many-to-many: product documentation productdocument and product photo ProductPhoto.
?
Product Classification There is nothing to say, is the ordinary classification of two, a large category in the ProductCategory, two-class small classification in ProductSubCategory, and then all products must belong to the two-level small classification. Both Productcosthistory and ProductListPriceHistory are due to record a one-to-many relationship based on the history of the time period, where there must be startdate and enddate to divide the time interval. "Historical data Record"
With regard to product documentation and product photos, product photos can be reused due to the presence of reuse (e.g. the appearance of the product is identical, but some internal parameters are different). Therefore, many-to-many relationships are formed, and there is an intermediate table for many-to-many relationships. Product image because there will be detailed photos, various angles of the photos, so in the many-to-many relationship table also defined a primary field to indicate whether the currently selected photo is the main photo.
Transaction entity
The previously analyzed entities are the objects of the subject in the principal predicate, and next we want to analyze the objects which are related to each other and are generated after the transaction operation.
For salesperson, Product, and customer, it is the sales order that we associate with:
SalesOrder
As long as there is something involved in the form (sales order, expense statement, purchase order, invoice, etc.) most of the cases will be divided into header and itemdetail two tables, in the sales order header used to record the sales of documents, customers, the total amount of information, And the Itemdetail recorded the specific sales of products, quantity and other information.
The following first analyzes the header:
?
The header establishes the link between the salesperson and the customer, as well as some of the normalized fields, such as Shiptoaddress,billtoaddress,shipmethod. In addition to these entities, we need to analyze the following entities separately:
Territory, this is in the earlier introduction of sales, said that this sales area and salesperson is related, supposedly, the header table has been associated with the salesperson table, Why do we need to add this header to the territory if we can get the corresponding territory when we order it by salesperson? This is due to the performance of the increased redundancy, for the timeliness of the object, it is best to directly associate, rather than through the intermediate object Jion multiple tables to associate. Let's look at how complex our query is if we don't have a direct correlation with territory:
Select h. *,St. *
from Sales. SalesOrderHeader h
Left Join Sales. Salesperson SP
on H. SalesPersonID=sp. BusinessEntityID
Left Join Sales. Salesterritoryhistory STH
on SP. BusinessEntityID=sth. BusinessEntityID and H. OrderDate between sth. StartDate and sth. EndDate
Left Join Sales. SalesTerritory St
on Sth. TerritoryID=St. TerritoryID
Here's a look at the currency and exchange rate related table currency. In this system, the header does not directly indicate what currency to pay, what currency to settle, exchange rate is how much, but put these fields in the CurrencyRate table, by reference to CurrencyRate to represent. Although it is not straightforward to put in the Header table after independence, but reduce redundancy, only need to do a join to get the results, so performance is acceptable. "Although it is necessary to join the currencyrate and then joincurrency the table to be complete, the Currency table is generally used only for currencyrate qualification and does not need to use the Currency table at query time. Because CurrencyCode is an international standard code, it is enough to show code. 】
Header and SalesReason is a many-to-many relationship, in the customer orders when the user check to buy the reason, is because of the promotion, or read the magazine ads and so on, simple many-to-many relationship, this has nothing to say.
SalesOrderDetail
Header and OrderDetail is a one-to-many relationship, detail records the specific purchase of what products, purchase price, quantity, etc., so the associated is product, but in this system, he is not directly related to the Product object, Instead, a specialofferproduct is established between the multiple-to-many intermediate tables of product and Specialoffer.
?
Why would you do that? This is mainly related to the specific business. Product after production has a price product.listprice, but in the actual sale, the merchant will have a variety of promotional activities (such as buy more than 10 98 percent, more than 25 90 percent, etc.), so will form product and specialoffer many-to-many relationship, maintenance of which products can have what discount 。 In order to unify the model, if the product does not make any discount promotion, also maintains a record "no Discount" in the Specialoffer.
Here is a special technique, specialofferproduct is not independent of its own primary key, but using ProductID and Specialofferid as the joint primary key, Then, when OrderDetail references specific specialofferproduct, ProductID and Specialofferid are referenced to their columns. So on the model, it is the OrderDetail Association specialofferproduct, and then the product, but in the actual query, we can completely ignore the specialofferproduct table, Directly with the OrderDetail to join product, so there is no effect on performance, this is a beautiful design.
When employee, product, and vendor are together, we associate a purchase order:
PurchaseOrder
Like sales orders, purchase orders are also divided into PurchaseOrderHeader and PurchaseOrderDetail.
?
The system first uses ProductVendor to define which vendor can supply which products are generated based on the contents of the purchase order, but not directly on the model, because product belongs to the detail table, and vendor belongs to the Header table. This restriction cannot be passed by reference as mentioned in Specialofferproduct.
The header records the relationship between the purchaser's employee and the supplier vendor. A purchase order header contains multiple detail detail that document which product is purchased. The purchase order is much simpler than the sales order, the most buyers, not to record the promotion, the purchase reason and so on information. In addition, the purchase does not involve currency exchange rate issue, I estimate this is because the products are in the domestic procurement and settlement, so there is only one currency, and sales are oriented around the world, so it involves currency exchange rates.
WorkOrder
In addition to the sales orders and purchase orders mentioned earlier, there are production orders in the production process that represent the production of the product. There are mainly WorkOrder and workorderrouting of two entities.
?
WorkOrder records the quantity, scrap and time of the production of a product, while workorderrouting records what processes are in the specific production process of a particular product, the time and cost of each operation. Overall, this is a very, very simplified model of production work orders.
Other entities
In addition to the entities mentioned earlier, there are several other independent entities that need to be explained:
TransactionHistory
There is also an archive table transactionhistoryarchive, the structure and transactionhistory exactly the same, which is recorded in the production of work orders or purchase orders or sales orders of the 3 transactions of the product, date, quantity and other public information. This table can be considered as a transaction log table, usually do not participate in the query of various entities, only when the audit or tracking data changes.
The data in the TransactionHistory table is trigger automatically inserted into each order table, rather than being controlled by external program code. Because the data of the transaction table is relatively large, and this table has stored three data in the transaction table, so the growth is very fast, must archive operation, move old data to another archive table, so as to guarantee the speed of querying the new TransactionHistory table.
Awbuildversion
This is the version of the database that records the current database definition when it is created, or the version of the current database definition script. For a generic product, this table is important because the product may need to be upgraded, the upgrade program reads the table before the upgrade, knows what version the current database definition is, and can then query to the SQL that needs to be modified to upgrade the current version of the database to the new version of the database, and then execute the SQL.
The first thing the application does at run time is to check the version information in this table to ensure that the version of the database definition matches the version required by the program so that the program runs correctly.
For the enterprise internal system, there is generally only one instance, and by the enterprise internal IT personnel development and maintenance, so this table is not also no problem.
DatabaseLog
This is a log table that records database DDL (data definition language, such as Create, ALTER, drop, etc.) operations. This table is automatically maintained by database trigger, and when DDL is executed in this database, the system triggers trigger to record a single piece of data into the table. This is a good thing!
? There are also some because the normalization of the abstract out of the code table, I did not mention in the previous model, such as countryregion,stateprovince, such as these are relatively simple, there is no one tired of the.
This article I simply analyzed the next entity and entity relationship, the following article will further analyze the details, what are the pros and cons.?
?
?
?
?
?
Modeling from AdventureWorks Learning Database--Entity analysis