Large Oracle Database Design Scheme


Abstract: Working from the ORACLE system structure and working mechanism, this article analyzes four different levels of adjustment in a large ORACLE database environment and then summarizes optimization and adjustment schemes for ORACLE databases in nine different aspects.
Keywords: ORACLE database, environment, adjustment, optimization, design scheme
There are four levels at which ORACLE database data access can be adjusted. The first is the operating system level, including the hardware platform; the second is the ORACLE RDBMS level; the third is the database design level; and the last is the SQL level. When a database is adjusted and optimized across all four levels, its overall performance improves greatly. The following sections describe ORACLE database optimization design solutions from nine different aspects.
I. Optimize the Database Structure with OFA (Optimal Flexible Architecture)
The logical configuration of a database has a great impact on its performance, so ORACLE proposed an optimized tablespace structure, the Optimal Flexible Architecture (OFA). Designing with this structure greatly simplifies data management at the physical design stage. OFA distributes logical data objects in the database efficiently, so these objects must first be classified according to how they are used and how they affect the physical structure of the database. This classification includes separating system data from user data, general data from index data, and low-activity tables from high-activity tables.
The logical design of the database should comply with the following guidelines: (1) store segments that are used in the same way together; (2) design the system according to standards; (3) provide a separate area for exceptions; (4) minimize tablespace conflicts; (5) separate the data dictionary.
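As a minimal sketch of guidelines (1), (4), and (5) (the tablespace names, file paths, and sizes here are illustrative assumptions, not from the original text), segments used in different ways are given their own tablespaces:

    -- Hypothetical OFA-style layout: one tablespace per usage class
    CREATE TABLESPACE user_data
        DATAFILE '/u02/oradata/orcl/user_data01.dbf' SIZE 500M;
    CREATE TABLESPACE user_indexes
        DATAFILE '/u03/oradata/orcl/user_indexes01.dbf' SIZE 250M;
    CREATE TABLESPACE high_activity
        DATAFILE '/u04/oradata/orcl/high_activity01.dbf' SIZE 1000M;
    -- Keep user segments out of SYSTEM, where the data dictionary lives
    ALTER USER app_owner DEFAULT TABLESPACE user_data;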
II. Make Full Use of the System Global Area (SGA)
The SGA is the heart of an ORACLE database. User processes send transactions to this memory area, and it serves as the cache from which hot data is read, for acceleration. Sizing the SGA correctly is critical to database performance. The SGA includes the following parts:
1. Data block buffer cache: a high-speed cache in the SGA, typically 1%-2% of the size of the entire database, used to hold the data blocks (tables, indexes, clusters, etc.) read from the database so that rereads hit memory. Its space is managed with a least recently used (LRU) algorithm.
2. Dictionary cache: the information in this buffer includes user account data, data file names, segment names, disk locations, table descriptions, and permissions. It is also managed in LRU fashion.
3. Redo log buffer: this buffer holds the redo entries that are replayed as roll-forward operations during database recovery.
4. Shared SQL pool: it stores the execution plans and parsed syntax trees of the SQL statements run against the database, also managed with the LRU algorithm. If it is sized too small, statements are continually reloaded into the cache, degrading system performance.
In addition, the SGA includes the large pool, the JAVA pool, and multiple buffer pools, but it consists mainly of the four caches above. Setting these memory buffers reasonably can greatly speed up data queries. A sufficiently large memory area can keep the vast majority of data in memory, leaving only rarely used data to be read from disk, which greatly increases the hit rate in the memory area.
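As an illustration (the parameter values are assumptions; correct sizes depend entirely on the host's physical memory and workload), the caches above map onto initialization parameters roughly as follows:

    -- Illustrative SGA sizing; tune the values to the available memory
    ALTER SYSTEM SET db_cache_size    = 800M SCOPE = SPFILE; -- data block buffer cache
    ALTER SYSTEM SET shared_pool_size = 200M SCOPE = SPFILE; -- shared SQL pool and dictionary cache
    ALTER SYSTEM SET log_buffer       = 8M   SCOPE = SPFILE; -- redo log buffer
    -- SPFILE-scoped changes take effect after the instance restarts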
III. Design the Database with Normalization and Denormalization
1. Normalization
A normal form is a relational schema that satisfies constraints of a certain level; according to those constraints, there are generally three normal forms, 1NF, 2NF, and 3NF, and normalization theory is built around them. The basic idea of normalization is to eliminate, step by step, the inappropriate parts of the data dependencies, so that the relational schemas in the model reach a certain degree of "separation", that is, to adopt the "one fact in one place" design principle; normalization is therefore essentially a simplification of concepts. Normalizing data in a database reduces data redundancy, saves storage space, and correspondingly reduces logical and physical I/O, while speeding up inserts, deletes, and updates. However, a fully normalized design does not always produce the best performance, because queries then usually require more join operations, which slows them down. So sometimes, to improve the performance of certain queries or applications, the normalization rules are deliberately broken; this is denormalization.
2. Denormalization
(1) The necessity of denormalization
Is a higher degree of normalization always better? No; it should be determined by actual needs, because the deeper the "separation", the more relations are produced and the more complicated the structure becomes. The more relations there are, the more frequent the join operations, and joins are among the most time-consuming operations. In database design, especially in designs dominated by queries, frequent joins seriously affect query speed. Therefore, during database design, non-normalized structures are deliberately retained, or the schema is denormalized after normalization, usually to improve the database's query performance and speed up the response of the database system.
(2) Denormalization techniques
Before denormalizing, fully consider the data access requirements, the sizes of commonly used tables, special calculations, and the physical storage of the data. Common denormalization techniques include adding redundant columns, adding derived columns, and regrouping tables. The advantage of denormalization is that it reduces the need for joins, reduces the number of foreign keys and indexes, and may reduce the number of tables, thereby improving query speed. For database systems with high performance requirements it can effectively improve system performance, but the corresponding cost is that it may compromise data integrity and, while speeding up queries, slow down modifications.
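A minimal sketch of the redundant-column technique (the orders/customers tables and their columns are hypothetical): the customer name is copied into the orders table so that a frequent report no longer needs a join.

    -- Hypothetical denormalization: copy customer_name into orders
    ALTER TABLE orders ADD (customer_name VARCHAR2(100));
    UPDATE orders o
       SET o.customer_name = (SELECT c.name
                                FROM customers c
                               WHERE c.customer_id = o.customer_id);
    -- Reports on orders now skip the join, at the cost of keeping
    -- customer_name consistent whenever customers.name changes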
3. Optimization Strategies in Database Design
Organize data into two categories: frequently accessed data and frequently modified data. Data that is frequently accessed but rarely modified should be physically denormalized; data that is frequently modified but rarely accessed should remain physically normalized. A more elaborate method is to use normalized tables as the basis of the logical database design and then denormalize physically according to the needs of the application system as a whole. Normalization and denormalization are both constrained by actual operations and are meaningless in isolation; only by combining the two reasonably can they complement each other and bring out their respective advantages.
IV. Rational Design and Management of Tables
1. Use table partitions
Partitioning physically separates data: the data in different partitions can be stored in data files on different disks. When such a table is queried, the scan can be confined to the relevant partitions instead of requiring a full table scan (Full Table Scan), which significantly shortens query time. In addition, placing the partitions on different disks spreads the table's data across multiple disk I/O channels, and a well-configured partitioning scheme evenly distributes data transfer and reduces disk I/O contention.
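A minimal range-partitioning sketch (the table, columns, and tablespace names are assumptions); each partition sits in a tablespace that can be placed on its own disk:

    -- Hypothetical range partitioning spread across tablespaces/disks
    CREATE TABLE sales (
        sale_id   NUMBER,
        sale_date DATE,
        amount    NUMBER
    )
    PARTITION BY RANGE (sale_date) (
        PARTITION p_h1 VALUES LESS THAN (TO_DATE('2003-07-01','YYYY-MM-DD'))
            TABLESPACE sales_ts1,
        PARTITION p_h2 VALUES LESS THAN (TO_DATE('2004-01-01','YYYY-MM-DD'))
            TABLESPACE sales_ts2
    );
    -- A query filtered on sale_date scans only the matching partition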
2. Avoid row chaining and row migration
If the PCTFREE and PCTUSED parameters are set incorrectly when a table is created, row chaining and row migration appear in the data blocks, that is, the data of a single row is no longer stored in one data block. When such data is encountered during a query, the disk head must seek again to read the row, which greatly reduces the database's execution speed. Therefore, when creating a table, estimate the future changes in the data as fully as possible and set these two parameters correctly, so as to minimize row chaining and row migration in the database.
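A short sketch (the table name and values are illustrative, not recommendations): a table whose rows are updated often and tend to grow should reserve more free space in each block:

    -- Reserve 20% of each block for row growth to limit row migration
    CREATE TABLE customer_notes (
        note_id NUMBER,
        body    VARCHAR2(4000)
    )
    PCTFREE 20   -- space kept free for updates that enlarge rows
    PCTUSED 40;  -- block returns to the free list below 40% usage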
3. Control fragmentation
Fragmentation describes a set of non-adjacent extents belonging to database objects. Fragmentation means that extra resources (disk I/O, disk rotational latency, dynamic extension, chained blocks, etc.) are consumed when database operations execute, and that a great deal of disk space is wasted. When two or more data objects grow dynamically in the same tablespace, their extents interleave and the extents of each object are no longer adjacent. To eliminate interleaved extents, place static or small tables in one tablespace and dynamically growing objects in tablespaces of their own. When issuing CREATE TABLE, CREATE INDEX, CREATE TABLESPACE, and CREATE CLUSTER, set the parameters of the STORAGE clause reasonably to reduce the creation of fragments.
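A sketch under assumptions (the names and sizes are hypothetical): a dynamically growing table gets its own tablespace with uniform extents so its intervals stay contiguous:

    -- Hypothetical: isolate a growing table in its own tablespace
    CREATE TABLESPACE growth_ts
        DATAFILE '/u05/oradata/orcl/growth_ts01.dbf' SIZE 500M
        EXTENT MANAGEMENT LOCAL UNIFORM SIZE 1M;
    CREATE TABLE event_log (
        event_id NUMBER,
        logged   DATE
    )
    TABLESPACE growth_ts
    STORAGE (INITIAL 1M NEXT 1M PCTINCREASE 0);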
4. Alias usage
Using aliases is a technique for large databases: single letters are used as aliases for table and column names. According to the source, queries written this way can run about 1.5 times faster than equivalent joins that repeat the full table names.
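For example, on Oracle's classic emp/dept demo tables, single-letter aliases shorten every reference in the join:

    -- Single-letter aliases on the standard demo tables
    SELECT e.ename, e.sal, d.dname
      FROM emp e, dept d
     WHERE e.deptno = d.deptno;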
5. Use rollback segments in rotation
In this configuration the data dictionary is relatively static while the application tables carry extremely high transaction rates, and the database's system index and data segments are also relatively static; the heaviest concurrent application load therefore falls on the rollback segment tablespace. Define several rollback segments and reference them in rotation, so that transactions are allocated to rollback segments cyclically and the load is distributed evenly across them.
V. Index Optimization Design
1. Manage and organize indexes
Indexes can greatly accelerate database queries. An index maps the logical values in a table to ROWIDs, so it can quickly locate the physical address of data. However, some DBAs find that indexing a very large table does not improve query speed and instead degrades the performance of the whole database. This is mainly related to how the SGA manages data: in ORACLE's data block cache management, index data has higher residency priority than ordinary data, so during space contention ORACLE evicts ordinary data first. When an indexed large table is queried, the index data may use up all of the block cache space, and ORACLE must then read from and write to disk frequently to obtain data. Therefore, after partitioning a large table, create partition indexes on the corresponding partitions; for a large table whose data is not queried frequently, it may be better not to create the index at all. In addition, when creating an index, the DBA should try to ensure that it will actually be usable by the WHERE clauses of queries; creating an index does not by itself speed up a query, because the index must offer a suitable access path.
2. Use clusters
ORACLE provides another method of increasing query speed: clusters. Simply put, a cluster stores several tables together, intermixed according to a common attribute. A cluster stores the data of multiple tables in the same ORACLE blocks according to a common key value, so retrieving one group of ORACLE blocks yields data from both tables; this reduces the number of ORACLE blocks that must be stored and read, improving application performance.
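A minimal sketch (the table and cluster names are assumptions): rows of two tables sharing a deptno value are stored in the same blocks.

    -- Hypothetical cluster keyed on the shared deptno column
    CREATE CLUSTER emp_dept_cluster (deptno NUMBER(2)) SIZE 512;
    CREATE INDEX idx_emp_dept_cluster ON CLUSTER emp_dept_cluster;
    CREATE TABLE dept_c (
        deptno NUMBER(2),
        dname  VARCHAR2(14)
    ) CLUSTER emp_dept_cluster (deptno);
    CREATE TABLE emp_c (
        empno  NUMBER(4),
        ename  VARCHAR2(10),
        deptno NUMBER(2)
    ) CLUSTER emp_dept_cluster (deptno);
    -- emp_c and dept_c rows with the same deptno now share Oracle blocks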
3. Optimize the configured indexes and make full use of them to speed up database access. For ORACLE to use an index, some basic conditions must hold: (1) the field in the WHERE clause must be the leading field of the composite index; (2) the field in the WHERE clause must not be involved in any form of calculation.
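A sketch of both conditions (the index and table names are assumptions): the leading column of the composite index appears in the WHERE clause, free of any calculation.

    -- Hypothetical composite index; deptno is the leading column
    CREATE INDEX idx_emp_dept_sal ON emp (deptno, sal);

    -- Can use the index: leading column referenced, no calculation on it
    SELECT ename FROM emp WHERE deptno = 10 AND sal > 2000;

    -- Defeats the index: the indexed column sits inside an expression
    SELECT ename FROM emp WHERE deptno + 0 = 10;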
VI. Use of Multiple CPUs and the Parallel Query Option (PQO)
1. Use multiple CPUs for transaction processing and queries whenever possible
The rapid development of CPUs has led ORACLE to pay ever more attention to multi-CPU parallel technology, and access to a single database can be completed by several CPUs working together. Distributed computing is also now quite common: whenever possible, separate the CPU load of the database server from that of the applications, or move CPU-intensive requests from one server to another. For multi-CPU systems, use the Parallel Query Option (PQO) for database operations.
2. Use the Parallel Query Option (PQO) to query data
With PQO, not only can the processing of a SQL statement be distributed across multiple CPUs, but when the queried data resides on different disks, it can also be read by independent processes in parallel.
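A minimal sketch (the table name and degree of parallelism are assumptions): a hint asks the optimizer to scan with several parallel server processes.

    -- Hypothetical: scan sales with four parallel query servers
    SELECT /*+ PARALLEL(s, 4) */ SUM(s.amount)
      FROM sales s;
    -- Alternatively, give the table a default degree of parallelism
    ALTER TABLE sales PARALLEL 4;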
3. Use the SQL*Loader direct path option to load large amounts of data
When data is loaded this way, the program formats the data blocks itself and writes them directly to the data files, bypassing the database kernel's normal I/O path.
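A sketch of a direct path load (the file, table, and column names are assumptions). The control file describes the input, and direct=true on the sqlldr command line selects the direct path:

    -- emp_load.ctl (hypothetical SQL*Loader control file)
    LOAD DATA
    INFILE 'emp_data.dat'
    APPEND INTO TABLE emp
    FIELDS TERMINATED BY ','
    (empno, ename, sal)
    -- invoked with, e.g.: sqlldr scott/tiger control=emp_load.ctl direct=true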
VII. Implement a System Resource Allocation Plan
ORACLE provides the Database Resource Manager (DRM) to control the allocation of resources to users. With it, DBAs can assign percentages of system resources to classes of users and jobs. In an OLTP system, for example, 75% of CPU resources can be allocated to online users and the remaining 25% reserved for batch users, and multi-level CPU allocation is also possible. Besides CPU allocation, DRM can also limit the parallel operations available to a resource consumer group.
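A minimal sketch of the 75/25 split using the DBMS_RESOURCE_MANAGER package (the plan and group names are assumptions; OTHER_GROUPS is the catch-all group Oracle requires in every plan):

    BEGIN
        DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA;
        DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
            consumer_group => 'online_users', comment => 'interactive work');
        DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP(
            consumer_group => 'batch_users', comment => 'batch work');
        DBMS_RESOURCE_MANAGER.CREATE_PLAN(
            plan => 'daytime_plan', comment => '75/25 CPU split');
        DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
            plan => 'daytime_plan', group_or_subplan => 'online_users',
            comment => 'online users first', cpu_p1 => 75);
        DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
            plan => 'daytime_plan', group_or_subplan => 'batch_users',
            comment => 'batch users second', cpu_p1 => 25);
        DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
            plan => 'daytime_plan', group_or_subplan => 'OTHER_GROUPS',
            comment => 'everyone else', cpu_p2 => 100);
        DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA;
    END;
    /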
VIII. Use the Best Database Connection and SQL Optimization Schemes
1. Use the direct OLE DB connection method.
ADO can connect to a database in two ways: the traditional ODBC method and the OLE DB method. ADO is built on OLE DB technology, so supporting ODBC requires setting up a call conversion from OLE DB to ODBC; connecting directly through OLE DB needs no such conversion and therefore increases processing speed.
2. Use the Connection Pool mechanism
In database processing, the largest resource cost is establishing a database connection, and users may still face long waits for a connection. The solution is to reuse existing connections, that is, to use a connection pool mechanism.
The connection pool works as follows: the IIS + ASP system maintains a pool of buffered connections, so when the next user arrives, a database connection is taken directly from the pool instead of a new connection being made to the database, which greatly improves the system's response speed.
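As an illustration (the data source name and credentials are assumptions), an ADO connection string that goes through the native Oracle OLE DB provider, with the "OLE DB Services=-1" attribute leaving the built-in session pooling enabled:

    Provider=OraOLEDB.Oracle;Data Source=orcl;User Id=scott;Password=tiger;OLE DB Services=-1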
3. Efficient SQL statement Design
Generally, the following methods can be used to optimize the performance of SQL statements on data operations:
(1) Reduce the number of queries against the database, for example by using distributed database objects such as snapshots and views.
(2) Use identical or very similar SQL statements for queries wherever possible. This not only makes full use of the parsed syntax trees in the shared SQL pool but also greatly increases the chance of hitting the required data in the SGA (a bind-variable sketch illustrating this appears after this list).
(3) Restrict the use of dynamic SQL. Although dynamic SQL is useful, a dynamically built statement is parsed again even when a completely identical statement already sits in the shared SQL pool.
(4) Avoid executing SQL statements without any conditions. An unconditioned SQL statement usually has to perform a full table scan (FTS): the database locates the first data block and then searches the rest sequentially, a long process for a large table.
(5) If the data in some tables is subject to constraints, it is best to declare them with declarative integrity constraints in the CREATE TABLE statement rather than enforcing them in application SQL programs.
(6) Turn off autocommit so that SQL statements are grouped together and committed as a batch; programs can also commit and roll back transactions explicitly with COMMIT and ROLLBACK.
(7) Retrieving a large amount of data takes a long time. Setting a row prefetch count improves system performance; also set a maximum value, so that when a SQL statement returns more rows than this value the database temporarily pauses execution and begins organizing and displaying the data already fetched instead of keeping the user waiting, until the user issues a new command.
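A sketch of points (2) and (3) above (the table and variable names are assumptions): a bind variable keeps the statement text identical across executions, so the parsed form in the shared SQL pool is reused instead of being re-parsed for every new literal.

    -- Literal values make each statement unique and force re-parsing:
    --   SELECT ename FROM emp WHERE empno = 7369;
    --   SELECT ename FROM emp WHERE empno = 7499;
    -- A bind variable keeps the text identical, so the plan is shared:
    VARIABLE v_empno NUMBER
    EXEC :v_empno := 7369
    SELECT ename FROM emp WHERE empno = :v_empno;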
IX. Make Full Use of Background Processing in the Database to Reduce Network Traffic
1. Create a temporary table or view reasonably
Creating a temporary table or view means creating, as needed, a new table or view on top of the base database. For information queried through multi-table joins, a new table can be created; for single-table queries, a view can be created; in this way the database's capacity and scalability are fully exploited. All condition evaluation and numerical calculation can be processed in the background on the database server and the results appended to a temporary table, and the production of the result set can be implemented by a database procedure or function.
2. Make full use of database packages
Use the database's procedural language to write procedures or functions, and group them into a unified package that runs in the database background.
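A minimal PL/SQL sketch (the package, table, and column names are assumptions): related logic is grouped into one package that executes in the database background:

    -- Hypothetical package grouping a background calculation
    CREATE OR REPLACE PACKAGE order_stats AS
        FUNCTION total_for (p_customer_id NUMBER) RETURN NUMBER;
    END order_stats;
    /
    CREATE OR REPLACE PACKAGE BODY order_stats AS
        FUNCTION total_for (p_customer_id NUMBER) RETURN NUMBER IS
            v_total NUMBER;
        BEGIN
            SELECT SUM(amount) INTO v_total
              FROM orders
             WHERE customer_id = p_customer_id;
            RETURN v_total;
        END total_for;
    END order_stats;
    /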
3. Use data replication, snapshots, views, and remote procedure call technology
Data replication copies data to the local database once, so that later queries use the local copy; it is suitable only for data that changes little. Snapshots can also be used to copy data dynamically between distributed databases, with an automatic refresh schedule or manual refresh defined to maintain the integrity of the referenced data. Remote procedure calls likewise greatly reduce the network traffic caused by frequent SQL statement round trips.
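A sketch of a periodically refreshed snapshot (the remote database link, table, and refresh interval are assumptions; in later Oracle releases snapshots are called materialized views):

    -- Hypothetical snapshot of a remote table, refreshed once a day
    CREATE MATERIALIZED VIEW customers_local
        REFRESH COMPLETE
        START WITH SYSDATE NEXT SYSDATE + 1
        AS SELECT * FROM customers@remote_db;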
In short, no single solution fits all performance problems, but ORACLE provides a rich environment of choices, and the architecture, software structure, schema objects, and the specific business and technical implementation of an ORACLE database should be weighed as a whole. Improving system performance requires a system-wide approach: during database optimization, the application, the I/O subsystem, and the operating system (OS) should all be optimized accordingly. Optimization is the process of purposefully changing one or more components of a system to meet one or more goals; for ORACLE, optimization means adjusting components to improve performance, that is, to increase throughput and reduce response time. If DBAs weigh the optimization schemes from the nine aspects above comprehensively, most ORACLE applications can be expected to achieve optimal data access.
