Notes from Tom Kyte's Oracle programming book (9i, 10g, 11g)
PMON
PMON is the process monitor. It has three main purposes:
1. Clean up after a process terminates abnormally. For example, if a dedicated server fails or is killed for some reason, PMON does two things: it rolls back the uncommitted work of the failed process, and it releases the resources the process was holding, including its locks and its SGA memory.
2. Monitor the other Oracle background processes and restart them when needed. If a shared server or dispatcher fails, PMON steps in, cleans up after the failed process, and starts a new shared server or dispatcher. Where a restart is not appropriate, PMON terminates the instance instead. For example, a failure of the LGWR process is a very serious error, and the safest response is to terminate the instance immediately and let normal recovery repair the data.
3. Register the instance with the Oracle TNS listener. When the instance starts, PMON checks whether a listener is running on the default Oracle port (1521). If one is, PMON passes it information about the instance, including the service name. If the listener is not running, PMON periodically tries to contact it until registration succeeds. When Oracle uses a port other than 1521, registration works the same way, except that the listener address must be specified in the LOCAL_LISTENER parameter.
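As an illustration of the non-default port case, here is a minimal Python sketch that formats the listener address and the matching ALTER SYSTEM statement. The helper names and the host name dbhost.example.com are hypothetical and chosen only for this example; the sketch builds strings and does not talk to a database.

```python
DEFAULT_LISTENER_PORT = 1521  # Oracle's well-known listener port


def listener_address(host, port=DEFAULT_LISTENER_PORT):
    """Format an Oracle Net TCP address for a listener on host:port."""
    return f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))"


def local_listener_statement(host, port):
    """Return the ALTER SYSTEM statement for a non-default port.

    Returns None for port 1521, where PMON can find the listener
    without LOCAL_LISTENER being set.
    """
    if port == DEFAULT_LISTENER_PORT:
        return None
    return f"ALTER SYSTEM SET LOCAL_LISTENER='{listener_address(host, port)}'"


# Hypothetical host, listener moved to port 1522.
print(local_listener_statement("dbhost.example.com", 1522))
```

The statement it prints would be run by a DBA; after that, PMON registers with the listener at the new address.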
SMON
SMON is the system monitor. Its work is as follows:
1. Clean up temporary space.
2. Coalesce free space. In a dictionary-managed tablespace, SMON is responsible for coalescing adjacent free extents into larger free extents. This happens only when the tablespace is dictionary-managed and its PCTINCREASE setting is non-zero.
3. Recover transactions against unavailable files. SMON recovers failed transactions that were skipped during instance or crash recovery because a file was unavailable. For example, if a file was on a disk that was unavailable, SMON recovers it once the file becomes available again.
4. Recover a failed instance in RAC. In a RAC environment, if one instance in the cluster fails (for example, the machine hosting it crashes), another node in the cluster opens the failed instance's redo logs and performs recovery on its behalf.
5. Clean up OBJ$. OBJ$ is a low-level data dictionary table that contains an entry for almost every object in the database. It often holds entries for objects that have been dropped, or entries that no longer represent a current object. SMON is responsible for deleting these unneeded entries.
6. Shrink undo (rollback) segments. SMON automatically shrinks a rollback segment back to its optimal size.
7. Take rollback segments offline. A DBA may take offline a rollback segment that still has active transactions. While transactions are using it, the segment is not really offline; it is marked "pending offline". In the background, SMON keeps retrying until the segment is truly offline.
In addition, SMON flushes the monitoring statistics that appear in the DBA_TAB_MONITORING view. SMON can consume a lot of CPU. It wakes up periodically, or is woken by other background processes, to perform this cleanup work.
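The free-space coalescing in item 2 can be pictured with a small Python sketch. It is not Oracle's implementation; it assumes each free extent is a hypothetical (start_block, block_count) pair within one data file and merges the ones that touch.

```python
def coalesce_free_extents(extents):
    """Merge free extents that are physically adjacent.

    Each extent is a (start_block, block_count) tuple within one data file.
    """
    merged = []
    for start, count in sorted(extents):
        if merged and merged[-1][0] + merged[-1][1] == start:
            # The previous extent ends exactly where this one begins.
            prev_start, prev_count = merged[-1]
            merged[-1] = (prev_start, prev_count + count)
        else:
            merged.append((start, count))
    return merged


free = [(100, 8), (108, 8), (200, 16), (116, 4)]
print(coalesce_free_extents(free))  # [(100, 20), (200, 16)]
```

The three extents starting at blocks 100, 108 and 116 are contiguous, so they become one 20-block extent; the extent at block 200 is isolated and stays as it is.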
CKPT
CKPT is the checkpoint process. Despite its name, CKPT does not perform checkpoints; that is the job of DBWn. CKPT only updates the data file headers with checkpoint information. Before Oracle 8.0, CKPT was an optional process; from 8.0 onward it is always started. Previously, the checkpoint information in the data file headers was updated by LGWR. As the number of database files grew, this became too heavy a burden: if LGWR had to update hundreds or even thousands of file headers, many sessions would wait a long time to commit. CKPT took over this work.
DBWn
DBWn is the database writer. It writes dirty blocks from the buffer cache to disk. A checkpoint occurs when Oracle switches log files, and once that checkpoint completes, the redo log it covers can be overwritten. If the redo logs fill up and Oracle needs to reuse a log file whose checkpoint has not yet completed, Oracle reports "checkpoint not complete".
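The "checkpoint not complete" condition can be sketched with a toy Python model. The class and method names are hypothetical, and the model is a deliberate simplification: it tracks only whether each log group's checkpoint has finished.

```python
class RedoLogRing:
    """Toy model: log groups are reused in a circle, but only once the
    checkpoint that protects their contents has completed."""

    def __init__(self, group_count):
        self.current = 0
        # True means DBWn has written every block that group's redo protects.
        self.checkpointed = [True] * group_count

    def switch(self):
        """Try to switch to the next group; return a status string."""
        nxt = (self.current + 1) % len(self.checkpointed)
        if not self.checkpointed[nxt]:
            return "checkpoint not complete"
        # The group being left needs a checkpoint before it can be reused.
        self.checkpointed[self.current] = False
        self.current = nxt
        return f"switched to group {nxt}"

    def complete_checkpoint(self, group):
        """Record that DBWn has finished the checkpoint for this group."""
        self.checkpointed[group] = True


ring = RedoLogRing(2)
print(ring.switch())         # switched to group 1
print(ring.switch())         # checkpoint not complete
ring.complete_checkpoint(0)  # DBWn catches up
print(ring.switch())         # switched to group 0
```

With only two groups, the second switch arrives before the first group's checkpoint has finished, which is the situation the message describes.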
DBWn performance is very important. If DBWn does not write blocks fast enough, buffers are not freed quickly for reuse, and the Free Buffer Waits and Write Complete Waits figures grow rapidly.
Oracle can be configured with up to 36 DBWn processes, named DBW0 through DBW9 and DBWa through DBWz. Most systems run only one, but multi-CPU systems may run more. The aim is to spread the write workload and keep enough free buffers available in the SGA.
Ideally, DBWn writes to disk using asynchronous I/O. With asynchronous I/O, DBWn gathers blocks into a batch and submits the batch to the OS. It does not wait for the OS to finish writing; it returns immediately and continues collecting the next batch. When the OS completes the write, it notifies DBWn asynchronously that the batch has been written to disk.
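The submit-and-continue pattern described above can be imitated in Python. This is only an analogy: a thread pool stands in for the operating system's asynchronous I/O, and write_batch and flush_dirty_blocks are hypothetical names, not anything Oracle exposes.

```python
from concurrent.futures import ThreadPoolExecutor


def write_batch(batch):
    """Stand-in for the OS performing the physical write of one batch."""
    return len(batch)


def flush_dirty_blocks(dirty_blocks, batch_size):
    """Submit batches without waiting; record completions as they arrive."""
    completed = []
    with ThreadPoolExecutor(max_workers=2) as os_io:
        for i in range(0, len(dirty_blocks), batch_size):
            batch = dirty_blocks[i:i + batch_size]
            # submit() returns at once, so the loop moves on to the next batch.
            future = os_io.submit(write_batch, batch)
            # The callback plays the role of the asynchronous notification.
            future.add_done_callback(lambda f: completed.append(f.result()))
    # Leaving the with-block waits for every outstanding write to finish.
    return sorted(completed)


print(flush_dirty_blocks(list(range(10)), 4))  # [2, 4, 4]
```

Ten blocks in batches of four give two full batches and one of two blocks; the result is sorted because completion order is not guaranteed.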
Finally, DBWn performs scattered writes, whereas LGWR writes the redo log sequentially. Scattered writes take longer than sequential writes. However, DBWn does its scattered writes in the background, while LGWR's sequential writes keep user wait time short.
Question: Tom says that DBWn gathers blocks into a batch and hands it to the OS asynchronously, and the OS writes it to disk. Why, then, is DBWn said to perform scattered writes? Isn't the operating system the one doing the writing?
LGWR
LGWR is the log writer: the process that flushes the redo log buffer in the SGA to the redo log files. LGWR writes in the following situations:
1. Every 3 seconds.
2. Whenever any transaction commits.
3. When the redo log buffer is one-third full or contains 1 MB of data.
For these reasons, a very large redo log buffer is pointless: LGWR flushes it long before it can fill.
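The three triggers listed above can be written as a small Python decision function. It is a sketch of the rules as stated in these notes, not of LGWR's internals, and the function name and parameters are hypothetical.

```python
ONE_MB = 1024 * 1024


def lgwr_should_write(buffer_size, bytes_buffered, seconds_since_write, commit_pending):
    """Return the reason LGWR would flush now, or None if it would not."""
    if commit_pending:
        return "commit"
    if seconds_since_write >= 3:
        return "3-second timeout"
    if bytes_buffered * 3 >= buffer_size:
        return "buffer one-third full"
    if bytes_buffered >= ONE_MB:
        return "1 MB buffered"
    return None


# A 300 MB buffer still gets flushed as soon as 1 MB of redo accumulates.
print(lgwr_should_write(300 * ONE_MB, ONE_MB, 0, False))  # 1 MB buffered
```

The example shows why a huge buffer buys nothing: the 1 MB rule fires long before the buffer is anywhere near full.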
ARCn
ARCn is the archiver process. After LGWR fills an online redo log file, ARCn copies its contents to another location. Online redo logs are used to repair data files after an instance failure; archived logs are used to restore data files during media recovery.