1. Establish A Logical Data Model as the first stage, including modeling the information that the application needs to process and store, and ensuring that all necessary data can be correctly, completely, and unambiguous. In the implementation of relational databases, this usually refers to the construction of a standardized entity-relational (E-R) model. 2. Map the Logical Data Model to the physical data model as the second stage. For relational databases, physical data models describe tables, indexes, views, keys, and other database features. 3. Third paradigm: All data in an object (table) depends entirely on the primary key. Duplicate attributes (columns) or attribute groups are not allowed. There is no entity data that depends only on some primary keys. There is no entity data dependent on other non-primary keys. Use a motto to describe: "the key, the complete key, except the key. "4. From a document or definition perspective, the data type, length, and accuracy of attributes defined in a logical model are advantageous. Since Oracle uses a general and flexible internal implementation method for physical storage of data types, from the perspective of storage or performance, the use of highly restrictive data types or precision is not advantageous. 5. No matter how limited the data type is, in most cases, Oracle uses a large range of high-precision floating point storage methods. 6. A string of the fixed length type occupies a fixed storage space regardless of the actual length of the string. Using a fixed length can reduce fragments, but it will lead to a large average length, which will increase the overhead of full table scanning. Therefore, unless the data length is fixed, the variable-length string type (VARCHAR) should be preferred ). 7. Artificial key: a numeric column generated by Oracle sequence. There is no meaning, just to uniquely identify the records in the object. It is never updated. Natural key: consists of multiple columns and can contain any data type. It is composed of a unique natural attribute in an object. If a natural key is updated, the foreign key that references it also needs to be updated, which significantly increases IO overhead and lock contention. Obviously, index queries must be performed on columns with natural keys. To meet this requirement, you can create common indexes or unique constraints on these columns. 8. The main purpose of the logic stage is to ensure that the design can meet the functional requirements of the application. The physical design stage aims to ensure that the database can meet the performance requirements of applications. When converting a logical self-ray into a table, you must avoid using a separate parent-class table and sub-class table method. Instead, you must map all sub-classes to a table, or use a separate subclass table instead of the upper-level table. 9. The table types include heap table, index organization table, and cluster. Hash Cluster: it can effectively optimize primary key queries for tables with relatively fixed sizes. Compared with B * tree indexes, hashed clustering can also reduce the contention for latches and locks on Hotspot blocks. Index Cluster: records in multiple tables that share the Cluster key value are stored together to optimize the multi-table join. Although multi-table join is enhanced, the performance of full table scan for a table in a cluster is reduced. Nested Table: optimizes the search of detailed rows for a row in the master Table. However, if you want to skip the primary table to retrieve detailed rows, the performance is generally compromised. Index organization table: if most tables are accessed through the primary key for query, and the table data volume changes significantly, it is not suitable for using hash clustering. Using indexes to organize tables will be more efficient. Object Table: All rows are defined as an Oracle Object data type. There are few scenarios where an Object Table needs to be used. 10. The significance of precision lies more in the constraints of data or definition documents, rather than in order to optimize performance. However, if a high-precision value is accidentally specified to a numeric field with a weak precision limit, setting an accuracy may be helpful for performance. For example, NUMBER (*, 2) will be truncated, and the length of the row will be reduced accordingly. Setting the precision of the numeric column can reduce the length of the row. 11. DATE can be accurate to seconds, TIMESTAMP can be sub-seconds, and the precision can be configured to the nanosecond level. The default precision is microsecond. 12. If the string is smaller than 4000 bytes, The VARCHAR2 type is preferred. If the string is larger than 4000 bytes, we recommend that you use the LOB type, CLOB to store character data, and BLOB to store binary data. The 11g provides a high-performance LOB storage type called SecureFiles. Each table can only have one LONG data column, which is compatible with earlier Oracle versions. If VARCHAR2 is not a VARCHAR, VARCHAR will be modified in the future to conform to ANSI standards. 13. B * tree indexes cannot store NULL values. Therefore, a full table scan is required to search for NULL values. Bitmap indexes and (partial columns with NULL) Multi-Column Composite indexes can store NULL values. Using NULL can reduce the average length of rows, thus improving the performance of full table scan to a certain extent. If most of the values in the column are NULL and only the non-NULL values need to be retrieved, the index of the column will be compact and efficient. Therefore, when determining whether a column can be NULL, consider whether the column needs to use the B * tree index to query NULL. If yes, the column cannot be specified as NULL. Instead, the column is defined as not null (NOT empty) and a default value is specified. If you want to query unknown values, you cannot define the column as null. Instead, you must define the column as non-empty and specify a default value. For character data, the default value can be A string, such as UNKNOWN or N/. It is difficult to select an appropriate default value for the numeric type. For example, to count records that contain the AGE field, you may need to find UNKNOWN records based on index scans or AGE column queries. If you specify a default value for AGE, an incorrect result may be returned when you query the average AGE, minimum AGE, and maximum AGE. In this case, it is necessary to use NULL, but the query cannot quickly return those records whose AGE is not certain, or perform de-normalization and add a tag column to mark whether the AGE is known, index the tag column to query records with AGE uncertainty (AGEKNOWN = N. NULL values can be stored in BITMAP indexes. In the preceding example, BITMAP indexes are not selected for this reason. If BITMAP indexes exist on the AGE column, NULL values can be retrieved efficiently. 14. Unless each field in the table is a fixed-length string, Oracle cannot know the specific position of a column in the row physical storage structure. Compared to the top columns, the back columns in the Access Table consume a small amount of CPU resources. Because Oracle must scan the row structure sequentially to obtain the location of a specific column, storing frequently accessed columns in front of the table will have some positive performance impact. NULL fields usually require a byte storage space. However, if the values of subsequent columns in this row are NULL, Oracle does not need to allocate any space for these NULL fields. If most of the NULL columns are stored at the end of the table, the actual length of the row is reduced, which helps improve the table scan performance. The performance improvements brought about by these adjustments are relatively small. Therefore, the logical sequence of columns should make the data model easier to understand and maintain, instead of confusing the order of columns in the logical model for these minor optimizations. 15. denormalization: refers to the process of re-introducing redundancy, repetition, or other non-normalized structures in the physical model. The main purpose is to improve performance. 16. If real-time Summary of data in the summary table is required, the summary data must be updated whenever the source data is updated. You can manually implement this through database triggers or materialized views. However, frequent updates may cause lock contention. If real-time summary information is not required, you can update the summary table through regular job scheduling-billing business peak. Use the Oracle materialized view mechanism. However, this may cause inaccurate summary information in real time. 17. Oracle 11g result set cache. 18. The query for aggregation operations usually occupies a large amount of database resources. It is best to use the materialized view to maintain the normalized summary information. 19. Generally, implementing the logical subtype as a slave table reduces the performance of common SQL operations. Vertical Partitioning. 20. View: Only one table is displayed. It is usually used by applications to query sub-tables to obtain the best performance, at the cost of complicate SQL applications. 21. Oracle sometimes allows users to physically split a table into multiple data segments and keep the impression of a single logical table. For example, LOB data is usually stored in a separate data segment. In the index organization table, some data is stored in the overflow section. If the table is large and is expected to scan frequently, You can migrate columns with long fields and infrequently accessed columns to a separate sub-table, to reduce the length and improve the table scan performance. 22. database Triggers are preferentially used to ensure consistency of denormalized data and avoid maintenance through application code. Database Triggers reduce the risk of data inconsistency, simplify application code, and improve efficiency. 23. fact table foreign keys are almost all artificial digital keys generated by sequences. From the storage point of view, if the fact table's Foreign keys are all meaningful strings, the storage cost is relatively high: for example, A product name may occupy 10 times the product ID space. Maintaining a shorter row length is important for fact tables because fact tables often have full (or partition) scans. 24. Avoid the snowflake mode whenever possible. When a dimension table does not contain foreign keys, the query performance is generally optimized. 25. By creating a hierarchical dimension, you can improve the performance of queries that need to use high-level aggregated data. For example, you do not need to accumulate the daily revenue records of the month to query the total monthly income, this greatly improves the performance. You can consider using multiple fact tables for multi-level aggregated data. Using multiple fact tables is a common data warehouse solution, however, Oracle also provides a more complex method called materialized view. Materialized View is essentially a physical table that contains records that will be returned by the view definition. If you think of a common view as a query statement stored in the database, the materialized view can be seen as the query result stored in the database. Oracle can automatically maintain materialized views to ensure they are up-to-date or regularly updated. 26. materialized view best practices: Create materialized view logs for each table included in the materialized view. Use the create dimension statement to identify the hierarchical relationship between dimensions. Set the QUERY_REWRITE_ENABLED parameter to enable the query rewrite function. Select an appropriate materialized view refresh policy. We recommend that you use the on demand refresh policy instead of the on commit refresh policy. This policy is suitable for refreshing large amounts of data when it is updated. Select QUERY_REWRITE_INTEGRITY. Create a materialized view reasonably. If you have doubts, you need to compare and evaluate the performance impact of creating or not creating materialized views on queries and DML statements, to measure whether the extra overhead of the materialized view can be compensated from the improvement of query performance.