Copyright NOTICE: This article for Bo Master original article, without Bo Master permission not reproduced.
Because the Data warehouse system stores a large amount of historical data and current data, and the data volume is increasing, in order to improve the efficiency of data storage and retrieval, to build an excellent data warehouse system, Data Warehouse designers and developers need to master the knowledge of Oracle optimization theory, which can help design, Developed a high-performance data Warehouse system.
1 External optimization principles
The CPU is an external factor that directly affects the performance of the database, and the memory size of Oracle also affects the SQL query
Efficiency, a lot of Net8 communication can also make SQL performance worse.
The following describes the specific adjustment methods of the external optimization principle.
1) appropriately increase the number of server CPUs, the performance of the server is limited by the CPU, the best way is to add extra CPU to the server , or shut down a lot of resources waiting to process the various components to reduce The waste of CPU resources.
2) when memory paging, if memory capacity is insufficient, the best way is to add more memory, reduce the size of the SGA, or turn off the Oracle Multi-threaded server to improve the efficiency of the database system.
2 basic rules for SQL optimization
1) in the large table for full table search, in order to avoid unnecessary full table search caused by a large number of I/O operations, the most common tuning method is to increase the index in moderation, speed up the query, to improve the efficiency of database operation.
2) ensuring optimal index usage is critical to improve the speed of queries and improve database performance . In some cases, you can also select multiple indexes for querying, and also include bitmap indexes and function-based index usage.
3 SQL Usage Specifications
1) Avoid using cursors as much as possible. Because cursors are inefficient, if the cursor is manipulating more than 1W of data, you should use a different approach: If a cursor is used, try to avoid table joins in the cursor loop.
2) do not perform functions, arithmetic operations, or other expression operations on the left side of the "=" in the WHERE clause, or the database system may not use the index correctly.
3) try to use exists instead of the Select COUNT (*) statement to determine if there are any records in the table that meet the criteria. The Count function is used only for all rows in the statistics table, and count (1) or count (' X ') is more efficient than count (*).
4) Note the data type of the associated field between tables, and avoid using different types of fields as association criteria for multi-table joins.
4 Index usage Specifications
1) the creation of the index should be considered in conjunction with the actual application, it is recommended that large online transaction processing system (OLTP) tables should not exceed 6 indexes, so as not to increase the unnecessary burden of the system.
2) use indexed fields as query criteria whenever possible to improve query efficiency.
3) try to avoid the use of full-table scanning in large table queries, and if necessary, consider rebuilding the index.
4) note Periodic maintenance of the index, which can periodically rebuild the index and recompile the stored procedure.
For example, use:
SELECT ' ALTER INDEX ' | | index_name| | ' REBUILD; '
From User_indexes
The statement queries All index rebuild statements under the current user, and then executes the rebuilt index statement in the next SQL window.
5 Temporary table Usage specifications
1) try to avoid using distinct, order BY,Group BY, have, join in temporary tables, because these statements can aggravate the burden of temporary tables. Also try to avoid frequent creation and deletion of temporary tables to reduce the consumption of system table resources.
2) when creating a temporary table, if the amount of data inserted at one time is very large, you can use the SELECT INTO statement instead of the CREATE TABLE statement to avoid a large number of operations on the log, which can shorten the running time, if the amount of data is not large, To mitigate the resource consumption of system tables, it is recommended that you create a temporary table before inserting.
3) If the temporal table has a large amount of data, you will need to establish an appropriate index for the staging table, while maintaining the database system's use of the temporary table index.
4) If a temporary table is used in the stored procedure, be sure to delete all temporary tables at the end of the code, it is common practice to execute the TRUNCATE TABLE statement before executing the drop TABLE statement. This avoids the burden on the database system by preventing the database system from locking the connection Query and association of the temporary table with other tables for a long time.
6 Index Creation Principles
Index creation Principles
1) Insert the data first and then the index creation work.
2) limit the number of indexes per table and avoid a large number of indexes resulting in degraded performance of the database system.
3) specify tablespace for each index, which helps avoid i/0 collisions.
4) periodically rebuild the index to reduce the fragmentation of the index, improve the performance of the database system, in general, the enumeration type of fields may consider using bitmap index, rather than binary Tree class index.
Maintenance principles of the 7 Big Data Scale
1) If it is a large data volume of the table, and often to add, delete, modify, query and other operations, the best way is to regularly collect statistical information to the Oracle optimizer to improve the performance of the database system.
2) If it is a large data volume table, consider using time or hash partitioning techniques, and put different partitions into the specified table space to improve the efficiency of database queries.
8 View Creation Principles
1) It is absolutely forbidden to create a view on the basis of the view, which can seriously affect the performance of the database.
2) It is recommended to use materialized view technology to improve the efficiency of database table query, but the materialized view refresh can not be too high, because it will also affect the performance of the whole database system.
3) in order to ensure data consistency and security, it is recommended to set the normal view to read-only type, can not do any delete operation.
9 guidelines for using indexes in code programs
1) use the in operator as little as possible . Use exists instead of the in operator to enable the query's child tables to use indexes to improve the efficiency of database table queries.
where<> or !=. For example, the condition count<0 can be modified to count>0 or Count <0, this can play the role of index, improve the efficiency of database table query.
3) like statement try not to use the wildcard character "%" or "_" as the first character of a query condition. For example,the count like '%100% ', which performs a full table scan of a database table, severely affects the performance of a database table query, and can be modified to count as ' 2100% ' or count as ' 3,100% ', so The Count field uses the index on the field to make a different range of queries, greatly improving the query efficiency of the database tables.
4) because the fields of the operation cannot be indexed, the where clause should try to avoid any calculations on the indexed fields. For example,substr (count,1,3) = ' 100 ' can be modified to count like ' 100% '.
5) After the where condition, the table connection statement is generally written at the top, and the conditions that can filter out a large number of records are written at the end. Because Oracle performs a multi-table query, the SQ code is executed from the back forward .
10 some suggestions in the Code program
1) try not to use the SELECT * statement, the best way is to list the fields that will be queried to improve the efficiency of SQL operation. Because Oracle in the process of parsing, "*" will be converted to all column names, this work is done by querying the data dictionary, which means that more time and database resources will be spent.
2) Use the Execute immediate statement as much as possible to improve the efficiency of SQL execution.
3) When using cursors, if a large number of data sets are assigned to the cursor, the runtime will typically present a memory overflow error message. You need to adjust the size of the buffer, so be aware of the memory size when using cursors.
4) for a table with very large amounts of data, you should perform an operation that collects statistics at intervals. For example, execute the dbms_stats.gather_table_stats command to pass statistics to the Oracle optimizer to improve the performance of the database system.
5) An object-oriented best practice method is to define a get method for each attribute , but the object-oriented implementation method cannot be applied to the relational database. A table in a relational database cannot be equated to an object-oriented class, and a field in a table cannot be equal to a property of a class, so when querying a field in a data table, it should be taken out all at once.
6) Use the Custom function sparingly. Custom functions often affect the optimizer's role in optimizing queries.
7) Check whether a record that satisfies a condition exists, never use the select COUNT (*) statement to judge, and consider using the merge statement to determine if the record exists.
In short, the basic principle of writing a SQL code program is to minimize the connection of the database, minimize the association between tables (in the design of the table can be converted to the third normal form of the table into a second paradigm), less use of temporary tables, to avoid the large amount of data to be divided into small pieces of data to deal with.
In addition, it is recommended to avoid using large fields such as blobs, CLOB, etc. in the table design process, because doing so may cause unnecessary inconvenience to the database's migration and backup.
Elvis
2012.12.23
Knowledge sharing ~ Common progress
Reprint Please specify:
http://blog.csdn.net/elvis_dataguru/article/details/8393933
The theory of database optimization needed to be understood in the field of business intelligence