One, the three paradigms of relational database
The first paradigm requires that each column is not re-divided, that is, to satisfy the atomicity, which is the most basic requirement in relational database design. The second Paradigm is based on the requirement that all non-primary key columns be completely dependent on the primary key column, not just the partial primary key column (if this is the case, the table should be split). The third paradigm is that the second paradigm requires that all non-primary key columns must be directly dependent on the primary key column, rather than relying on a non-primary key column to indirectly rely on primary key columns, such as user tables with user IDs, user names, organization IDs, and organization name fields that do not meet the third paradigm. Because the organization name is directly dependent on the organization ID. However, this redundancy design is used in the design of relational database tables to reduce association table queries.
Connection relationship between database tables and links
The common connections between database tables are inner joins, outer joins, left (right) outer joins, and Cross joins. Within which the connection returns records that meet the criteria in two tables, outer joins (full outer joins) return all records in two tables (including records that meet the criteria, other left table records + blank on the right table), and the left table record + blank on the left side, and the outer joins to the left are whichever Returns all left Table records (no return null for associated records in the right table), which returns the product of the records on both sides.
Third, the database index concept and matters needing attention
The purpose of database indexing is to increase the efficiency of query sorting, to avoid the efficiency of full table scan, especially in the case of table data, the correct use of indexes can greatly improve the efficiency of database query. Usually the database index uses B + tree structure, other similar commonly used index structure has hash index, LSM tree index.
The index is divided into clustered and nonclustered indexes from the point of view of whether the record order is consistent with the physical disk order. Clustered index refers to the index record order consistent with the physical record order, the query more efficient, more suitable for the interval record query, each table can only build a clustered index, default in the table primary key when it is set to clustered index (if you want to set the other column clustered index, to specify the primary key before setting).
Common index classifications have normal indexes (common indexes), unique indexes (indexes with unique column values, null values allowed), primary key indexes (special unique indexes), singleton indexes, and federated indexes. Using indexes has the following considerations:
- Avoid excessive use of indexes, if the table records are not many, or is a lot of duplicate records (such as the sex column), or the increase, change, delete the table is very frequent, do not recommend the use of indexes, because the index effect is not obvious also increased maintenance costs.
- Proper use of indexes, especially federated indexes, should be preceded by a combination of more differentiated records, a pre-condition of the query, and a left-hand index rule like the MySQL database (assuming that the Union index is listed as a, B, and C if no column A in the condition does not work, Pgsql does not seem to have the problem).
- Any action on a column causes a table scan, which includes database functions, calculation expressions, and so on, to move the operation to the right of the equals sign whenever possible.
- The IN, or clauses often use worksheets to invalidate the index, and if you do not produce a large number of duplicate values, consider taking the sentence apart; the split clause should contain an index.
Which conditions need to be indexed:
1 primary key, unique index
2 fields that are frequently used as query criteria need to be indexed
3 fields that often require sorting, grouping, and statistics need to be indexed
4 fields associated with other tables in the query, foreign key relationships indexed
Which cases do not index:
There are too few records for 1 tables, the data below millions does not need to be indexed
2 frequently deleted and changed tables do not need to create an index
3 The data is duplicated and the distributed average field does not need to create an index, such as True,false.
4 frequently-updated fields are not suitable for creating indexes
5 fields not available in the Where condition do not need to create an index
Some summary of the index
Mysql Index Optimization Analysis
Iv. three types of values that should not be stored in a relational database table
- Large file, sound picture file, although the database field has a blog type, but finally do not directly save the file to the database, but by depositing the corresponding file path to resolve. Because the database operation is not as fast as the direct operation of the file, it will increase the database file backup migration data trouble. A netizen indicates that the image file column is stored in the MySQL database, although it slows down query efficiency even if the field is not included in the query.
- Temporary data, such as session, hourly, and day-to-date cleanup data should not exist in the database, which is more appropriate in caches such as Redis.
- A large number of log files, if you want to save the log to the database to facilitate the log query is also the best way to establish a separate log library, to avoid the frequent write log to affect the main business table access efficiency.
Common knowledge of Database Foundation