Prompted by a question on the forum (still being revised)

Source: Internet
Author: User

 

1. How does Oracle process a command?
The client connects over the network to the Oracle listener.
The listener is responsible for forking (spawning) a shadow process in memory. The shadow process (also called the server process) handles all interaction with the client.
The shadow process submits the client's commands to the Oracle instance. Statements submitted by the client fall into three categories: DML, DDL, and DCL.
Take an ordinary DML statement as an example; it is sent to and received by the shadow process.

1. First, Oracle checks in memory whether the user has the privilege to execute the statement; specifically, the check is made against the dictionary cache in the shared pool of the SGA.
(The user's schema information is loaded there from the database's dictionary tables.)
2. If the user does not have the privilege, the shadow process returns an error to the client.
If the user does have the privilege, the shadow process submits the SQL statement to the library cache in the shared pool. At this point the DML statement goes through several steps.
3. The normal execution steps are parse -> execute -> fetch.
Parsing consumes a lot of CPU: as I understand it, the CPU analyzes the statement, and the analysis produces an execution plan.

If the same or a similar DML statement was executed before, a soft parse may occur: the new statement is executed directly with the existing execution plan.
If no similar DML statement exists, an entirely new execution plan is generated. This is a hard parse, which is what consumes CPU. (If you want Oracle to reuse execution plans across similar statements, the cursor_sharing parameter is involved. We recommend not setting it to FORCE; instead, when writing the application, try to write statements so that they can be reused.)
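
As a minimal sketch of why reusable statements matter (the table emp and column empno are hypothetical): literal values make every distinct statement text require its own hard parse, while a bind variable keeps the text identical so later executions can be soft parsed.

    -- Literal values: each distinct value is a new SQL text, so each one is hard parsed
    SELECT ename FROM emp WHERE empno = 7369;
    SELECT ename FROM emp WHERE empno = 7499;

    -- Bind variable: the SQL text stays identical, so repeat executions are soft parsed
    VARIABLE id NUMBER
    EXEC :id := 7369
    SELECT ename FROM emp WHERE empno = :id;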

Execute hands the execution plan to the CPU for execution.
Sort and union operations are often involved; these run in the work area of the PGA in memory. In recent Oracle versions the PGA is tuned automatically by default, but you can turn automatic tuning off and set parameters such as sort_area_size in the PGA yourself.
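
A quick way to see how the PGA is currently managed (workarea_size_policy, pga_aggregate_target, and sort_area_size are standard initialization parameters; the manual values shown are examples only):

    -- Is the PGA auto-tuned, and how big is the overall target?
    SHOW PARAMETER workarea_size_policy    -- AUTO means work areas are tuned automatically
    SHOW PARAMETER pga_aggregate_target

    -- To size work areas manually instead (generally not recommended):
    -- ALTER SYSTEM SET workarea_size_policy = MANUAL;
    -- ALTER SESSION SET sort_area_size = 1048576;   -- 1 MB, example value only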

4. In the fetch step, the Oracle process retrieves the data from memory (specifically, from the buffer cache) or from disk (if it cannot be found in the buffer cache), and then modifies it.
When data is fetched from the buffer cache, two linked lists come into play: the LRU list and the dirty list.
The LRU list holds pointers to all cached blocks, ordered by the least-recently-used algorithm.
The dirty list holds the buffers that have been modified but not yet written back to disk.
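
If you want to look inside the buffer cache, V$BH exposes one row per cached block; the query below is only an illustrative sketch (it requires DBA privileges):

    -- Count cached blocks per object and how many are dirty (changed but not yet written out)
    SELECT o.object_name,
           COUNT(*) AS cached_blocks,
           SUM(CASE WHEN b.dirty = 'Y' THEN 1 ELSE 0 END) AS dirty_blocks
    FROM   v$bh b
           JOIN dba_objects o ON o.data_object_id = b.objd
    GROUP  BY o.object_name
    ORDER  BY cached_blocks DESC;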

5. Finally, the shadow process returns the fetched data to the user.

 

2. What is the approach to Oracle performance optimization?

Performance optimization: you could write a whole book about it.
But first, what is the purpose of Oracle performance optimization?
In my view, the purpose is to minimize disk I/O while preserving the application's logic.
And what is the result of performance optimization?
Shorter response times and faster responses.

Any method that serves this purpose can be counted as performance optimization.
At what levels does performance optimization happen?

As I see it:
code-level optimization, middleware-level optimization, and optimization of the underlying database.
Code-level optimization means designing code that is as sound and reusable as possible. For example, writing statements identically makes a soft parse much more likely,
which puts less CPU pressure on the database.
Java code, C code, and PL/SQL code each have their own conventions for making statements more efficient (see the PL/SQL sketch below).
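
As a minimal PL/SQL sketch of reusable statement writing (the table emp is hypothetical): referencing a PL/SQL variable inside SQL binds it automatically, so the statement text stays the same on every iteration.

    DECLARE
      v_count NUMBER;
    BEGIN
      FOR i IN 7000 .. 7100 LOOP
        -- i is bound automatically; the SQL text is identical on every iteration,
        -- so after the first hard parse the remaining executions are soft parsed.
        SELECT COUNT(*) INTO v_count FROM emp WHERE empno = i;
      END LOOP;
    END;
    /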

Middleware-level optimization:
If the application uses middleware, try to obtain locally whatever information can be obtained locally.
For example, statements such as select sysdate from dual should no longer be executed on the database; getting that value at the middleware layer greatly reduces the overhead.
The middleware layer can also add more caching to hold information that is frequently fetched from the database.
In addition, try to keep connections to the database open, because repeatedly connecting to and disconnecting from the database
also consumes a lot of memory and CPU.

Back-end (database) optimization:
How to configure memory better (OLAP systems, DSS decision-support systems, OLTP systems)?
(Allocation between the SGA and the PGA;
allocation of the shared pool and buffer cache within the SGA;
allocation of the sort area size within the PGA;
how to size the log buffer?
A quick inspection query follows this list.
)
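
As a minimal sketch for checking the current allocation (v$sgainfo and v$pga_target_advice are standard Oracle views):

    -- Current SGA breakdown (includes the shared pool, buffer cache, and redo log buffer)
    SELECT name, bytes / 1024 / 1024 AS size_mb FROM v$sgainfo;

    -- Overall memory targets and the advisor's view of PGA sizing
    SHOW PARAMETER sga_target
    SHOW PARAMETER pga_aggregate_target
    SELECT pga_target_for_estimate / 1024 / 1024 AS target_mb,
           estd_pga_cache_hit_percentage
    FROM   v$pga_target_advice;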
How to plan the disks better?
(How are the LUNs spread across the underlying disk array?
Should operating-system-level LVs be striped?
Must the redo logs and datafiles be separated?
Use raw devices, a file system, or ASM?
)
Logical structure design of the database:
(How should indexes be created?
How should tables be created? (What is the most reasonable PCTFREE setting? Should frequently used small tables be kept in the keep pool? Where should blocks of a large, rarely used table sit in the LRU list?)
Should materialized views be used?
Top SQL optimization?
Top session analysis.
A small DDL sketch follows this list.
)
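
As a hypothetical DDL sketch (table and index names are made up): a small, frequently accessed lookup table can be assigned to the KEEP buffer pool, and PCTFREE can be lowered when its rows are rarely updated.

    -- Small, hot lookup table kept in the KEEP pool; low PCTFREE because rows rarely change
    CREATE TABLE dim_region (
      region_id   NUMBER PRIMARY KEY,
      region_name VARCHAR2(50)
    )
    PCTFREE 5
    STORAGE (BUFFER_POOL KEEP);

    CREATE INDEX idx_dim_region_name ON dim_region (region_name);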

 

 

 

3. What is the difference between Oracle functions and stored procedures?
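
The question is left unanswered here; as a minimal, hypothetical sketch of the difference: a function must return a value and can be called inside a SQL statement, whereas a procedure returns nothing directly (results come back through OUT parameters) and is invoked as a statement of its own.

    -- A function returns a value and can be used inside SQL, e.g. SELECT get_tax(sal) FROM emp
    CREATE OR REPLACE FUNCTION get_tax(p_salary NUMBER) RETURN NUMBER IS
    BEGIN
      RETURN p_salary * 0.1;   -- 10% is just an example rate
    END;
    /

    -- A procedure has no RETURN value; callers receive results through OUT parameters
    CREATE OR REPLACE PROCEDURE calc_tax(p_salary IN NUMBER, p_tax OUT NUMBER) IS
    BEGIN
      p_tax := p_salary * 0.1;
    END;
    /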

4. Can you use shell? Can you use ETL tools?

What is ETL?


From Baidu

Function: ETL extracts data from distributed, heterogeneous data sources, such as relational databases and flat data files, into a temporary staging layer, where it is cleaned, transformed, and integrated; finally it loads the data into a data warehouse or data mart. It is the foundation of online analytical processing and data mining.

ETL is an important, and necessary, part of building a data warehouse. Compared with relational databases, data warehouse technology has no strict mathematical foundation; it is oriented toward practical engineering. From an engineering point of view, data is loaded and processed in a series of steps according to the requirements of the physical data model, and the process relies heavily on experience. At the same time, this work directly determines the quality of the data in the warehouse, and therefore the quality of the online analytical processing and data mining results.

A data warehouse is an independent data environment. Data must be imported into it from the online transaction processing environment, from external data sources, and from offline storage media through the extraction process. Technically, ETL mainly involves association, transformation, incremental loading, scheduling, and monitoring. Data in the warehouse does not have to be synchronized with the online transaction processing systems in real time, so ETL can be run on a regular schedule. However, the timing, ordering, and success or failure of the various ETL jobs are critical to keeping the information in the warehouse valid.

ETL (extract, transform, load) is the core and soul of BI/DW (business intelligence / data warehousing). It integrates data and adds value to it according to unified rules; it is the process of moving data from the source systems into the target data warehouse and an essential step in building one. If the data warehouse model is the blueprint of a building and the data is the bricks, then ETL is the construction work. The hardest part of such a project is requirements analysis and model design, but designing and implementing the ETL rules is the largest share of the work, roughly 60% to 80% of the whole project. This is the general consensus from practice at home and abroad.

ETL is the process of data extraction (extract), transformation (transform), cleansing (cleansing), and loading (load), and it is an important step in building a data warehouse. The user extracts the required data from the data sources, cleanses it, and finally loads it into the data warehouse according to a predefined data warehouse model.

Information is an important resource for modern enterprises and the basis for scientific management and decision analysis. Most enterprises today spend a great deal of money and time building online transaction processing (OLTP) business systems and office automation systems that record transactional data. Statistics suggest that data volume doubles every two to three years. This data holds enormous commercial value, yet enterprises typically pay attention to only about 2% to 4% of it. As a result, they still fail to make the most of their existing data resources, wasting time and money and missing the best moments for key business decisions. How an enterprise can use technology to turn data into information and knowledge has therefore become the main bottleneck for improving its core competitiveness, and ETL is one of the main technical means. How do you choose the right ETL tool, and how do you apply ETL correctly?

Typical ETL tools currently include Informatica, DataStage, OWB, Microsoft DTS, and others.

Open-source tools include CloverETL and its plug-ins: http://www.cloveretl.org/clover/

Data integration: Quick ETL implementation

ETL quality problems show up as problems of correctness, integrity, consistency, completeness, validity, timeliness, and usability. The causes are many. For system integration and historical data, the main ones include: inconsistent data models across different periods of the business systems; changes in business processes over time; inconsistencies among legacy system modules for operations, personnel, finance, office systems, and other related information; and inconsistencies left behind when legacy systems were never fully integrated with the new business and management systems.

Implementing ETL starts with implementing the ETL transformation process, which covers the following aspects:

Null value handling: capture null values in a field, load them as-is or replace them with other meaningful values, and route records to different target tables depending on whether the field is null.
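
A minimal SQL sketch of null handling and routing (all table and column names here are hypothetical):

    -- Replace a null region code with a default, and route rows without a phone number
    -- into a separate error table
    INSERT INTO stg_customer_clean (customer_id, region_code)
    SELECT customer_id, NVL(region_code, 'UNKNOWN')
    FROM   stg_customer_raw
    WHERE  phone_number IS NOT NULL;

    INSERT INTO stg_customer_errors (customer_id, error_reason)
    SELECT customer_id, 'missing phone number'
    FROM   stg_customer_raw
    WHERE  phone_number IS NULL;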

Normalized data formats: define field format constraints, so that loading formats can be specified for time, numeric, character, and other data types in the source.

Data splitting: decompose a field according to business requirements. For example, the calling number 861084613409 can be split into an area code and a telephone number.
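
A hypothetical split of the calling number with SUBSTR, assuming a fixed layout of country code (86), area code (10), and local number (84613409); the table and column names are made up:

    SELECT SUBSTR(calling_no, 1, 2) AS country_code,   -- 86
           SUBSTR(calling_no, 3, 2) AS area_code,      -- 10
           SUBSTR(calling_no, 5)    AS local_number    -- 84613409
    FROM   stg_call_detail;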

Data correctness verification: lookup and splitting can be combined to verify data. For example, after the calling number 861084613409 is split into an area code and a phone number, a lookup against the area recorded by the gateway or switch can be used to check the data.

Data replacement: invalid data, or data missing for business reasons, can be replaced.

Lookup for missing data: use lookups (sub-queries) to detect missing data and fill in the missing fields from data obtained by other means, ensuring field completeness.

Handling of primary and foreign key constraints: illegal data that violates referential integrity can be replaced or exported to an error data file, ensuring that only records with unique primary keys are loaded.
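
One way to achieve this in Oracle is DML error logging; DBMS_ERRLOG and the LOG ERRORS clause are standard features, while the table names below are hypothetical:

    -- Create an error table for the target, then divert constraint-violating rows into it
    EXEC DBMS_ERRLOG.CREATE_ERROR_LOG('FACT_CALLS', 'ERR$_FACT_CALLS')

    INSERT INTO fact_calls (call_id, customer_id, call_date)
    SELECT call_id, customer_id, call_date
    FROM   stg_call_detail
    LOG ERRORS INTO err$_fact_calls ('nightly load') REJECT LIMIT UNLIMITED;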

In order to better implement ETL, we recommend that you pay attention to the following points during the ETL implementation process:

First, if conditions permit, use a data staging area to pre-process the operational data, which helps guarantee the efficiency of integration and loading;

Second, if the ETL process is driven by "pull" rather than "push", controllability is greatly improved;

Third, process-based configuration management and standard protocols should be established before ETL begins;

Fourth, key data standards are crucial. The greatest challenge ETL faces today is that the data it receives from the various sources is heterogeneous and of low quality. Taking China Telecom as an example, system A manages data by statistics code, system B by account number, and system C by voice ID. When ETL needs to integrate these three systems to build a complete view of the customer, complex matching rules and name/address normalization and standardization are required. ETL therefore defines key data standards during processing and develops the corresponding data interface standards on that basis.

The ETL process is strongly shaped by how well the enterprise understands its source data, which is to say that data integration matters greatly from the business perspective. A good ETL design should provide the following:

Simple administration; centralized management based on metadata; strict specifications for interfaces, data formats, and transmission; as little software as possible installed at the external data sources; an automated extraction process with automatic scheduling; extracted data that is timely, accurate, and complete; interfaces to a wide range of data systems and strong adaptability; a software framework in which the application needs only small changes when system functions change; and good scalability.

Data model: standardize data definitions

A reasonable business model design is crucial to ETL. The data warehouse is the enterprise's single, true, reliable, integrated data platform. Data warehouse design and modeling are generally based on third normal form, the star schema, or the snowflake schema. Whichever design approach is used, the key business data should be captured as fully as possible, and the messy, disordered data structures of the operational environment unified into a reasonable, related, analyzable new structure. ETL then extracts data from the sources according to the model's definitions, transforms and cleanses it, and finally loads it into the target data warehouse.

What matters about the model is that it standardizes and defines the data, achieving unified coding and unified classification and organization. Standardized definition includes unified standard codes and unified business terms. ETL integrates data according to the model, covering, for example, initial loads, incremental loads, slowly growing dimensions, slowly changing dimensions, and fact table loads; the corresponding loading, refresh, aggregation, and maintenance policies are then defined according to business needs.

Metadata: the basis for extending to new applications

Metadata is the description and definition of the business data itself and of the environment it runs in; in other words, metadata is data about data. In a sense, business data mainly supports the applications of the business systems, while metadata is indispensable for newer applications such as enterprise information portals, customer relationship management, data warehouses, decision support, and B2B.

A typical example of metadata is the description of objects: databases, tables, columns, column attributes (type, format, constraints, and so on), primary/foreign key relationships, and the like. As today's applications become more heterogeneous and more distributed, unified metadata becomes ever more important. "Information islands" is how many enterprises describe and sum up their application landscape, while sound metadata effectively captures how the pieces of information relate to one another.

For ETL, metadata plays a concentrated role in the following: defining the location and attributes of the data sources, determining the mapping rules from source data to target data, determining the relevant business logic, and carrying out the other necessary preparation before data is actually loaded. Metadata generally runs through the entire data warehouse project, and every ETL process should reference it as much as possible, so that ETL can be implemented quickly.

 
