(Excerpt) _ Database Design entry classic: planning and preparation through analysis _ 9.2 Analysis

Source: Internet
Author: User
9.2 Analysis
As described above, analysis is the beginning of establishing a good relational database model. Analysis is about the operational factors and business affairs of a company. It has nothing to do with the technical aspects of computer systems. The Analysis and database model or the content that the database administrator and the programmer want and want to obtain have nothing to do. Analysts must understand the business. Participation from business personnel (technical and non-technical employees in the company (end users, and even the primary management) is critical to final success. Some companies develop software only by hiring on a temporary basis, rather than by engaging internal technical personnel. The result is a database model completely based on the end user. We need to strike a balance between our technical staff and our non-technical staff.
It is important to understand the design stage as a demand behavior. What content is required? When establishing a database model, application, or software product, it is important to understand the process of describing the content in the database. What tables do you need? In computer terms, these processes are generally called methodologies (a set of rules, the creator of a computer system can obtain results from A to B (from graffiti on a draft paper to useful computer systems ). Many people have spent many years thinking about these rules and refining them constantly to provide them with steps from A to B that are sometimes simple and complex.
Tip:
Standardization is a methodology. Normalization is a complex set of rules used to refine the table structure in a relational database. It is also a methodology to divide the database model design process into analysis, design, construction, and implementation steps.
The best database model can be obtained by paying attention to the details in the analysis phase. Before you start analysis in a hurry, it is very important to accurately understand what you need. If you need to modify the content later in the development phase, you can add the modification content later. However, modifying the database model used by the production system may cause many problems. This is not a good choice. This is because the application usually depends on the database and thus the database model.
Analysis is planning. It is of double importance to plan a database model. The reason is: for the entire company, the database model is the basic element of all database-based applications. With existing products, database models can drive identical or semi-dedicated applications for hundreds or even thousands of companies. It is very important to obtain the database model immediately. The more thorough understanding of the database model in the analysis phase, the better the database model and product will be obtained.
Some database modeling and design tools allow the generation of table scripts for different database engines. The tool used in database modeling is called Erwin. Erwin can be used to generate table creation scripts for many database engines. Microsoft
Access
It has a built-in ERD modeling tool. Database models can usually be designed with beautiful images (such as Erwin) in graphical database modeling tools. When designing a database model with beautiful images and intuitive graphical data packets, you can consider the analysis status and approaches, while ignoring the technical details. In fact, at the beginning, deep technical aspects (such as the Data Type and precise composition of fields) may affect the analysis process due to excessive complexity.
Tip:
The analysis focuses on the required content. The design focuses on how to provide the content through completed analysis.
When an existing system is overwritten (that is, the existing database model is re-constructed, the analysis only contains old systems obtained from access and discussion by end users and technicians. If a system is being rewritten, the original system may be inappropriate in some aspects. By executing the analysis process, we can understand the missing, wrong, and inappropriate aspects of the old system.
End users may tell us what tasks they cannot do in the existing system. End users may also list the features they want. For analysts, a secure approach is to evaluate the importance of enhanced functionality and new features. The reason for this is that the most important feature in the analysis phase is to estimate the cost of the entire process. As the workload increases, the development cost also increases.
Programmers and administrators may tell us about system errors. They will also tell us how to avoid these problems, such as kludge.
Tip:
Kludge is a term that computer programmers often use to describe a method that can solve problems but is not very useful. Kludge is often used to generate a computer system consisting of poorly matched elements.
9.2.1 considerations in Analysis
Since the analysis phase is the process of establishing and limiting the content required, we need to remember the following factors and considerations.
● Overall goal --
These goals include all the goals of a company when creating a database model. Including the content in the database model, expected results, and expected conclusions. For example, is this a new application or an application that needs to be rewritten?
● Company operation-these actions include the company's means of earning a living and the methods for calculating these measures.
● Business rules-this is the core of the analysis phase. Business Rules describe the analysis content and the content to be created to design the database model. Including the required tables and the basic relationships between tables.
● Planning and timetable --
For large projects, planning and timelines are very useful. A typical project plan includes the completion person, completion work, and completion time. To share these plans with more people and ensure task independence is met, more comprehensive plans combine multiple tasks. For example, if Task B needs to complete task a, one person can complete task a and Task B. If there is no dependency, then, two people can complete task a and Task B in the same time.
● Budget --
What is the total project cost? Are these expenses valid, worthwhile, and affordable? Whether these expenses will lead to bankruptcy before bringing the competitive advantages to the company. The budget includes the following considerations:
· Expenses required for hiring third-party personnel-do you need to hire third-party personnel? The overhead of the support staff is usually high. Are skilled personnel available? Do you need to add new recruits? Do you need to hire a consultant at a high price?
· Hardware overhead-evaluates hardware requirements based on speed and storage capacity needs. For OLTP databases, concurrency, on-board
Ram, dedicated hardware devices (RAID matrix), and network bandwidth are all important factors. The storage space design and I/O performance are also important for data warehouses.
· Maintenance-the simplicity of maintenance means long-term low overhead.
· Training --
End users and programmers must know how to use the created system. Otherwise, even if the system is available, it is difficult to maximize the utilization. If the Database Designer introduces a new database engine for a company (such as Oracle
Database), then training is a required requirement, and training must be budgeted. Technical training is very expensive and time-consuming.
● Other factors-other factors that are less noticed (but not less correlated) include the following:
· Data repeatability --
Avoid data duplication (without sacrificing performance) and granularity. Narrow granularity is often the result of over-standardization. Excessive standardization will make SQL code compilation too complex and difficult to execute, and cause many problems and high maintenance costs.
Tip:
Excessive normalization can clear potential errors in data (errors that violate the integrity of the reference ). This will also make ERD contain many tables and cause a lot of trouble for programmers. Over-standardization is a good thing for database designers, but it is a nightmare for programmers. Relational Database Model Design is a means to achieve the goal, but it is not an aim. In other words, the database model is created for programmers and end users. The database model should not be built on a mathematical rule that is too perfect. The consequence is that only database modelers can understand the model. We should keep the purpose of the design in mind at all times. The purpose of the design is to meet the needs of applications and provide the company with advantages, rather than making life more complex.
· Read-only or read/write --
Is the database read-only or fully writable? Databases in data warehouses are often partially read-only. The content of some static data in the OLTP database must be readable. Considering the flexibility of the OLTP database, a dedicated method is required for static tables.
Don't be surprised. The consideration of all goals, including company operations, business rules, plans and time limits, and budget, can be used as a framework for the entire analysis process. In fact, these considerations will be included in the development of case analysis examples later in this section. Before learning the case analysis examples, first understand the problems hidden in the analysis of enterprise database models.
Tip:
The "other factors" mentioned above are described in detail in this section. Theme-related factors have been described in the previous sections of this book.
9.2.2 potential problems and misunderstandings
When analyzing the company's database model, there are many negative factors worth considering and remembering. These factors include:
● Standardization and data integrity
● More normalization results in better query results
● Performance
● General and standard database models
Let's further discuss these contents.
1. Standardization and data integrity
Many analysts propose to normalize various layers by using all available paradigms to ensure that data is not lost (redundant) and fully match the integrity of the reference (isolated records ). A database is a repository of information. Poor application programming or incorrect database usage will produce problematic data. A highly standardized database model ensures good data integrity, but the database model itself does not produce such problematic data. Application programmers and end users may encounter these problems due to incorrect coding or incorrect database modification.
2. More normalization results in better query results
This is another point of view raised by many analysts. Some declarations are as follows:
● "Problem query is generally the product of an inappropriate (non-standardized) database structure ."
● "Compared with the standardized database structure, it is easier to write fast queries and updates ."
The author's point of view is totally different from the above two. The nonstandard structure generates fast queries and makes it easier for end users to write query operations. From an operational point of view, non-standardized tables are more realistic to commercial applications. In other words, busy executors responsible for writing ad-hoc queries can understand the contact methods between tables.
Creating a query between many tables results in a large number of join queries. Join queries are not only complex in writing, but also time-consuming in execution. For end users who have just started to use databases, writing too complex connection queries is almost impossible. This is unfair. Even for skilled programmers, it is often ridiculous to establish a connection query on the dknf-level standardized database model. For example, creating a connection query for 15 tables in a database will occupy 1 GB of space and 30 seconds. This approach is totally unacceptable. I have seen this situation in the past.
Tip:
Database Modeling should not be an expression of over-perfect mathematical rules, but rather a method to achieve the goal. This method is used for database modeling. The purpose is to meet user requirements. Our goal is not to generate a database model with the best granularity but not meeting user needs.
3. Performance
Some analysts claim that computer performance, applications, and database models are irrelevant to business analysis. I totally disagree with this statement. I have also seen this problem because I didn't take into account the most important performance factors at the beginning, resulting in the elimination of applications and the entire service environment. Performance has always been an important factor. Leaving the performance to analyze the database model is like buying a new ship but cannot afford it --
They just threw money into the water.
All SQL code depends on the underlying table structure and table content, and even the table ID and table information content. These contents should be determined in the analysis phase. Undoubtedly, decisions made during analysis will affect future performance. The performance of databases and applications is very important.
4. General and standard database models
We need to be careful to use specific industries or applications as standard general database models. Common Database models are usually used in the purchased semi-customized applications. The general model can meet the requirements of a variety of applications. If you are using your own database model, you can create a database model that meets your company's requirements. Many scopes of the general database model cannot be applied to specific company requirements. In other words, your database may have many useless empty tables. Technically, useless tables are not really a problem, but they can confuse programmers and end users. Another drawback of a general model is that it may not be able to meet all requirements accurately and may need to be modified.
Standard database models, especially models that meet the needs of specific industries and serve as a standard, are also very dangerous. Because they are probably outdated. Like General models, standard models may include redundant tables, which may cause all possible problems and deficiencies. If you are creating a database model, you will also analyze the actual situation and customize the model for your company and specific applications. However, the reduction of the scope will lead to inflexibility and ultimately cause limitations in the system. This requires a perfect balance.
Now we can understand the reasons for describing the above content, the reasons for writing this book, and the reasons for reading this chapter. How can we apply theory to practice? Let's start with the analysis.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.