Database knowledge point summary

Source: Internet
Author: User
1. Related Concepts and knowledge points 1. data Dependency: it reflects the constraint relationship between attributes within a link. It is an abstraction of the associations between attributes in the real world. It is a reflection of the inherent nature and semantics of data. 2. Standardization Theory: it is the basic theory used to design a good relationship model. It eliminates inappropriate Data Dependencies through the decomposition Relationship Mode.

1. Related Concepts and knowledge points 1. data Dependency: it reflects the constraint relationship between attributes within a link. It is an abstraction of the associations between attributes in the real world. It is a reflection of the inherent nature and semantics of data. 2. Standardization Theory: it is the basic theory used to design a good relationship model. It eliminates inappropriate Data Dependencies through the decomposition Relationship Mode.

1. Related Concepts and knowledge points

1.Data Dependency: it reflects the constraint relationship between attributes within a link. It is an abstraction of the associations between attributes in the real world. It is a reflection of the inherent nature and semantics of data.
2.Standardization Theory: it is the basic theory used to design a good relationship model. It eliminates inappropriate Data Dependencies by means of the decomposition Relationship Mode to solve insertion exceptions, deletion exceptions, update exceptions, and data redundancy issues.
3.Function dependency: To put it simply, for the two attribute subsets X and Y of the relational mode, if any value of X uniquely determines the value of Y, the Y function depends on X, record as X → Y.
4.Non-trivial function dependency: for the two attribute subsets X and Y of the relational mode, if X → Y, but Y !? X, which is called X → Y as a non-trivial function dependency; If X → Y, but Y? X, that is, X → Y is a non-trivial function dependency.
5.Full function dependency: for the two attribute subsets X and Y of the relational mode, if X → Y, and for any real subset X, there is no x' → Y, Y is called the full function dependency on X.
6.Paradigm: a set of relational patterns that conform to a certain level. When designing a relational database, it is defined as different paradigms based on different requirements that meet dependency requirements.
7.Normalization: refers to the process of converting the relational model of a lower-level paradigm into a set of Relational Models of several higher-level paradigms through pattern decomposition.
8.1NF: If all attributes of the relational mode are basic data items that cannot be divided, the relational mode belongs to 1NF.
9.2NF: 1NF relational mode. If the full function of each non-primary attribute is dependent on the code, the relational mode belongs to 2NF.
10.3NF: if each non-primary attribute in the relational mode does not depend on code or code, the relational mode is 3NF.
11.BCNF: if every determining factor in a relational model contains a code, the relational model belongs to BCNF.
12.Database Design: refers to the construction of an optimized database logical mode and physical structure for a given application environment, and the establishment of a database and its Application System accordingly, so that it can effectively store and manage data, meet the application requirements of various users, including information management requirements and data operation requirements.
13. Six basic steps for Database Design: requirement analysis, conceptual structure design, logical structure design, physical structure design, database implementation, database operation and maintenance.
14.Conceptual Structure Design: it refers to the process of abstracting the user requirements obtained from the requirement analysis into an information structure, that is, a conceptual model. That is, a conceptual model independent of a specific DBMS is formed by integrating, summarizing, and abstracting user requirements.
15.Logical Structure Design: converts a conceptual structure model (Basic E-R diagram) to a logical structure that complies with the data model supported by a DBMS product, and optimizes it.
16.Physical Structure Design: it refers to the process of selecting a physical structure most suitable for the application environment for a given Logical Data Model. Including designing the storage structure and access methods of databases.
17.Abstract: it refers to human processing of actual people, things, things, and concepts, extracting common characteristics of interest and ignoring non-essential details, these features are precisely described using various concepts, which constitute a model.

18. database design must follow the principle of combining structure design and behavior design.

19. The data dictionary consists of data items, data structures, data streams, data storage, and processing processes.
20. Three Common abstract methods are classification, aggregation, and generalization.
21. The conflict between partial E-R graphs is mainly manifested in three aspects: attribute conflict, Name Conflict and structure conflict.

22. Common Database Access methods include index, clustering, and HASHThree methods.
23. To determine the data storage location and storage structure, consider the following factors:Access time, storage space utilization, and maintenance cost.

Ii. elaborate on the three paradigm of Database

2.1 columns in the first paradigm (1NF) with no duplicates

In 1NF, each column of the database table is an inseparable basic data item.
The same Column cannot contain multiple values.
That is, an attribute in an object cannot have multiple values or duplicate attributes.
In short, the first paradigm is a non-repeated column.

In any relational database, the first paradigm (1NF) is the basic requirement for the relational model. databases that do not meet the first paradigm (1NF) are not relational databases.

2.2 The 2NF attribute of the second paradigm is completely dependent on the primary key [Removing the dependence of partial molecular functions]

To satisfy the second Paradigm (2NF), we must first satisfy the first paradigm (1NF ).

The second Paradigm (2NF) requires that each instance or row in the database table be able to be distinguished by a unique region.

To implement differentiation, you usually need to add a column to the table to store the unique identifier of each instance.
The second Paradigm (2NF) requires that the attributes of an object fully depend on the primary keyword. The so-called full dependency refers to the fact that there cannot be an attribute that only depends on a part of the primary keyword. If so, this attribute and this part of the primary keyword should be separated to form a new entity, the relationship between the new object and the original object is one-to-multiple. To implement differentiation, you usually need to add a column to the table to store the unique identifier of each instance. In short, the second paradigm is that attributes fully depend on the primary key.
2.3 The 3NF attribute does not depend on other non-primary attributes [eliminate transmission dependencies]

The third paradigm (3NF) must satisfy the second Paradigm (2NF) first ).

In short, the third paradigm (3NF) requires that a database table do not contain information about non-primary keywords already contained in other tables.

For example, there is a department information table, where each department has a department ID (dept_id), department name, Department profile, and other information. After the Department numbers are listed in the employee information table, you cannot add the Department name, Department profile, and other information related to the department to the employee information table. If the department information table does not exist, it should also be constructed based on the third paradigm (3NF), otherwise there will be a large amount of data redundancy. In short, the third paradigm is that attributes do not depend on other non-primary attributes.

2.4 analyze specific instances

The following lists examples of a school's student system to demonstrate the application of several paradigms.

Before designing the database table structure, we should first determine the content to be designed. Student ID, Student name, age, gender, course, course credits, department, subject score, subject address, telephone number, and other information. For simplicity, we only consider the field information. We are concerned with the following information.

1) what basic information does students have?
2) What are the scores of students who have chosen the courses?
3) What are the credits of each course?
4) What is the basic information of the student's department.

First, the first paradigm (1NF): Fields in the database table are single attributes and cannot be divided. This single attribute is composed of basic types, including integer, real number, complex type, logical type, and date type. In any current Relational Database Management System (DBMS), you are not allowed to divide one or more columns of the database table into two or more columns. Therefore, all the databases that conform to the first paradigm are made.

Let's look at the second paradigm.Put all the information in one table (student ID, Student name, age, gender, course, course credits, department, subject score, subject address, office phone number) the following dependencies exist.
1) (student ID) → (name, age, gender, Department, Department address, Department phone number)
2) (Course name) → (credits)
3) (student ID, course) → (subject score)

Based on the dependency, we can change the SelectCourse table to the following three tables:

Student: Student (Student ID, name, age, gender, Department, Department address, Department phone number );
Course: Course (Course name, credits );
Course Selection relationship: SelectCourse (student ID, course name, score ).

In fact, in contrast to the requirements of the second paradigm, This is a database table that meets the second paradigm. If it does not meet the second paradigm, the following problems will occur:
Data redundancy: The same course is selected by n students, and "Credits" are repeated for n-1 times. The same student takes m courses, and the name and age are repeated for m-1 times.

Update exception: 1) if the credits of a course are adjusted, the "Credits" value of all rows in the data table must be updated. Otherwise, different credits of the same course may occur.
2) If you want to open a new course, no one will take it now. In this way, the course name and credits cannot be recorded in the database because the "student ID" keyword is not yet available.

Deletion exception : Assuming that a group of students have completed the electives, These electives should be deleted from the database table. However, the course name and credit information are also deleted. Obviously, this will also cause insertion exceptions.

We will also consider how to change it to a database table that meets the third paradigm.Next, let's look at the Student table Student (Student ID, name, age, gender, Department, Department address, Department phone number). The keyword is a single keyword "Student ID" because of the following decision-making relationship:

(Student ID) → (name, age, gender, Department, Department address, Department phone number)
However, the following decision relationships still exist:
(Student ID) → (school location, school phone number)
That is, the transfer function dependency of the non-Keyword section "school location" and "college phone" on the keyword section "student ID" exists.
It also has data redundancy, update exceptions, insertion exceptions, and deletion exceptions (For details, refer to the analysis in the second paradigm ). Based on the third paradigm, the Student Relationship table can be divided into the following two tables to meet the third paradigm:

Student (student ID, name, age, gender, Department );
Department: (Department, Department address, Department phone number ).


As shown in the following example, after summarizing the basic knowledge of the database (especially the SQL statements and paradigms), let's take a small test. The difficulty of the questions is roughly the same as that of the university database principle and the final examination of the application course. Let's take a look at it and familiarize yourself with the knowledge through the questions.

Relational Database standardization is introduced to solve the (A) Problem in relational databases.

A. insertion, deletion, and data redundancy B. Improved query speed

C. Reduce the complexity of data operations D. ensure data security and integrity

The development of data management technology has gone through the manual management, file system, and database system stages. In these stages, the most independent data is stage.

A. Database System B. File System C. manual management D. data item management


The relationship between database (DB), Database System (DBS), and database management system (DBMS) is (C ).

A. DBMS includes DB and dbs B. DB includes DBS and DBMS

C. DBS include DB and dbms d. DBS are DB, that is, DBMS


The database management system can define, modify, and delete data tables, indexes, and other objects in the database. This type of language is called ().

A. Data Definition Language (DDL) B. Data manipulation language (DML)

C. Data Query Language (DQL) D. Data Control Language (DCL)


Any two tuples of the same relational model ().

A. It cannot be the same as B. It can be the same as C. It must be the same as D. None of the above


The conceptual model is (B)

A. It is used for modeling the information world. It is related to A specific DBMS.

B. It is used for information world modeling and has nothing to do with a specific DBMS.

C. It is used for realistic modeling, which is related to a specific DBMS.

D. It is used for realistic modeling and has nothing to do with a specific DBMS.


Physical Data independence refers to (D ).

A. The mode changes. The external mode and application do not change. B. The internal mode does not change.

C. Internal Mode change, mode unchanged D. Internal Mode change, external mode and application remain unchanged


The SQL language is (C.

A. Hierarchical database B. Network Database C. Relational Database D. Non-Database


Natural connection is an effective way to form a new relationship. Generally, when using a natural connection to the relationship R and S, it is required that R and S contain one or more common (D ).

A. tuples B. Row C. Record D. Attributes


The longest time-consuming operation in relational operations is (C ).

A. Projection B. Select C. Cartesian Product D. Division


The biggest difference between a file system and a database system is (C ).

A. Data Sharing B. Data independence C. Data structuring D. data redundancy


The SQL statement used for transaction rollback is (D)

A, create table B, COMMIT

C. GRANT, revoke d, and ROLLBACK


(C) used to record every update operation performed on the data in the database.

A. database file B. Buffer C. Log File D. backup copy


Data inconsistency caused by concurrent operations (D ).

A. Lost modifications, non-repeated reads, dirty data reading, and deadlocks

B. Repeatable reading, dirty data reading, and deadlock

C. Loss of modification, dirty data reading, and deadlock

D. Lost modifications, non-repeated reads, and dirty Data Reading


To improve efficiency, the relational database system must perform (B) processing.

A. Define View B. Query Optimization

C. Create an index D. normalize data to the highest paradigm


The physical storage method of the database is described as (B)

A. External mode B. Internal Mode

C. Concept mode D. logic mode


The main function of introducing A second-level image between database-Level 3 models is ()

A. Improve data and program independence B. Improve data and program Security

C. Maintain consistency between data and programs D. Improve data and program portability


A view is a "virtual table", and its structure is based on (C)

A. Basic Table B. View

C. Basic Tables or views D. Data Dictionary


Which of the following clauses in the SELECT statement does the π operator in relational algebra correspond? ()

A. select B. FROM

C. where d. GROUP


Converts a E-R model to a relational model (C) of a database ).

A. Requirement Analysis B. Conceptual Design

C. Logical design D. Physical Design


The purpose of transaction logs is (C)

A. Transaction Processing B. integrity constraints

C. Data Recovery D. Security Control

If transaction T has added the X lock to the data R, other transactions are on the Data R (D)

A. Only the X lock can be added. B. Only the S lock can be added.

C. S or X locks can be added. D. No locks can be added.


Differences and relationships between views and basic tables.

A view is a table exported from one or more basic tables. It is different from a basic table. It is a virtual table. In a database, only view definitions are stored, rather than View data, the data is stored in the original basic table. When the data in the basic table changes, the data queried from the view changes accordingly. As defined, a view can be queried or deleted like a basic table, or a new view can be defined on top of a view, but there are restrictions on the update operation of the view.


Briefly describe the features of transactions.

Transactions have four features: ACID:

(1) atomicity: All operations included in a transaction are either done or not done.

(2) Consistency: the transaction must change the database from one consistent state to another consistent state.

(3) isolation: operations and data used within a transaction are isolated from other concurrent transactions.

(4) Continuity: Once a transaction is committed, changes to the database are permanent.


Test the integrity rules of a relational model..

Integrity rules:

If attribute (or attribute group) F is the external code of the basic relationship R, it corresponds to the primary code Ks of the basic relationship S (the basic relationship R and S are not necessarily different), then the value of each tuples in R on F must be: take null values (each attribute value of F is null) or equal to the primary key value of a tuples in S.


The existing relational databases are as follows:

Student (student ID, name, gender, Major)

Course (course number, course name, credits)

Learning (student ID, course number, score)

Please useRelational algebra expressions and SQL statements.

1. Retrieve the student ID and score of all students whose course number is "C112;

2. Retrieve Information about the courses learned by English students, including student IDs, names, course names, and scores;

3. Retrieve the student ID, name, Major, and score of all students whose score is higher than 90 for the "database principles" course;

4. Retrieve the student information for the "C135" course, including the student ID, name, and major;

5. retrieve information about students whose course numbers are "C135" and "C219", including student IDs, names, and majors.


A teaching management database is set up with the following attributes: Student ID (S #), course number (C #), score (G), and instructor (TN ), the instructor's Department (D ). The data has the following semantics:

1. Each course a student takes has a score;

2. Each course has only one instructor, but each instructor can have multiple courses;

3. There are no duplicate teachers. Each teacher belongs to only one department.

(1) determine the function dependency set based on the preceding semantics.

(2) If all the above attributes form a relational mode, what is the mode of this relational mode? Examples are provided to illustrate exceptions during addition and deletion operations.

(3). Break it into 3NF with dependency persistence.

Solution:

(1). F = {(S #, C #) → G, C # → TN, TN → D}

(2). The link mode is 1NF.

The candidate keyword of the relational mode is (S #, C #)

The non-primary attributes include G, TN, and G.

In option F, C # → TN

Token has partial dependencies of non-primary attributes TN on candidate keywords (S #, C #)

That is, (S #, C #) --- → TN.

Exception:

§ If a new course has not been selected yet, the insert operation cannot be performed because the S # keyword is missing.

§ If a teacher is transferred out of school and wants to delete the relevant information, the information about the course (C #) that should not be deleted will be deleted.

(3). Then F = f' = {(S #, C #) → G, C # → TN, TN → D}

Limit P = {R1, R2, R3}

Where: R1 = (S #, C #, G)

R2 = (C #, TN)

R3 = (TN, D)

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.