Data guard learns physical standby redo transmission services

Source: Internet
Author: User

When watching movies or TV series, many people often think that the key to influencing the plot development is the fate of the main character. So I have to say that you only see the surface of the problem, what really affects the development of the story ............... is the director. For data Guard data applications, the director behind the scenes is log_archive_dest_n. We will learn a lot in this chapter. This time, I hope you can clarify the key points of your study.

Redo transport services not only controls the transmission of redo data to other databases, but also manages the process of unreceiving archive files due to network interruptions.

I. How to send data
In the primary database, mongouard can use the archive process (arcn) or log write process (lgwr) to collect redo data and transmit it to standby. However, whether you select the archiving process or log write process, all are controlled by a core parameter, which is log_archive_dest_n. Therefore, let's first:

1. Understand the log_archive_dest_n Parameter
Log_archive_dest_n (N from 1 to 10) defines the redo file path. This parameter specifies the path of the archive file through location or service. Location indicates the local path. The service is usually the net service name, which is the standby database that receives redo data.

For each log_archive_dest_n parameter, there is also a corresponding log_archive_dest_state_n parameter. The log_archive_dest_state_n parameter is used to specify whether the corresponding log_archive_dest_n parameter takes effect. It has four attribute values:
Enable: the default value, indicating that the transmission service is allowed.
Defer: This attribute specifies the corresponding log_archive_dest_n parameter.
Alternate: transmission is prohibited. However, if the connection to another destination fails, it will become enable.
Reset: similar to the defer attribute. However, if an error has occurred before the destination, it clears all its error messages.

For example, specifying the local archiving path
Log_archive_dest_1 = 'location =/arch1/Chicago /'
Log_archive_dest_state_1 = Enable

For example, specify the redo transmission service.
Log_archive_dest_2 = 'service = jsspdg'
Log_archive_dest_state_2 = Enable
Of course, the attributes of the log_archive_dest_n parameter are far more than those.

These parameters can be directly modified online through the alter system set statement. The modification will take effect in the next log switch of primary. Of course, if you directly shut down and restart the database, the modification will take effect immediately. For example:
SQL> alter system set log_archive_dest_2 = 'service = jsspdg valid_for = (online_logfiles, primary_role )';

In addition to show parameter log_archive_dest, you can also view the parameter configuration by querying the V $ archive_dest view, and the V $ archive_dest view can also see more detailed synchronization information.

2. Use arcn to archive redo data
By default, redo transmission uses the arcn process to archive redo logs. However, the arcn archiving process only supports the highest-performance protection mode. If the standby database is in another protection mode, you must use lgwr to transmit redo data (Why? Think twice, we know that for the two modes of maximum protection and maximum availability, the redo data must be applied to the standby database in real time. What does archive mean? It is a redolog that has completed switching. What does the redolog that has completed switching represent? It indicates that all data in the redo file has been submitted to the data file. Now let's look back at it. The redo file that has been submitted and switched has been copied as an archive, at this time, we are ready to start transmitting data to the standby database. This is not as bad as the real-time application required by the maximum protection and maximum availability. Now, do you understand why the arcn archiving process only supports the highest-performance protection mode ).

1). initialization parameters control arcn archiving Behavior
The initialization parameters that affect the arcn process include:
Log_archive_dest_n and log_archive_max_processes.

The configuration of the log_archive_dest_n parameter is described earlier. We know that some of the attributes of this parameter are related to the arcn process. The log_archive_max_processes initialization parameter can be tailored to the arcn process, this parameter is used to specify the maximum number of arcn processes that can be called. Note that we specify the maximum value, that is to say, during the running process, the database automatically adjusts the number of archiving processes according to the heavy workload of archiving tasks. Of course, if you think that your system archiving tasks are heavy, you can increase the archiving concurrency by setting a large number of archiving processes, but this number is not as high as possible, an excessively high number of archiving processes may affect the system performance. (This is what we call the best practice. Therefore, you need to balance these processes. Of course, optimization is more involved in this aspect, it is not the focus of this chapter ). You can use the following statements to adjust this parameter:
Alter system set log_archive_max_processes = N;
Note: n> 0 and n <= 30

2). arcn archiving process
Archive will be started when the primary database log is switched:
In the primary database (assuming there are two archiving processes), once the arc0 Process completes redolog archiving, The arc1 process begins to transmit the archive to the specified path of the standby database.
In the standby database, The RFS process writes redo data to the standby redo log in turn, and then the arcn process in the standby database writes it to the archive, then, the data is applied to the standby database through the redo application or SQL application.

In addition, because there is no connection between the local archiving process and the remote archiving process, note that if there is a local deletion policy for archiving the backup, before deleting the archive, you must first confirm that the archive has been transferred to the standby database.

3. Use lgwr to archive redo data
The process of using lgwr is significantly different from that of using arcn. lgwr does not need to wait for log switching and archiving.
The lgwr process of the standby database selects a standby redo log file to map the sequence (and file size) of the current redolog of the primary database. Once the primary database generates redo data, depending on the sync or async attribute settings in the log_archive_dest_n initialization parameter, the data is transmitted to the standby database in synchronous or non-synchronous mode.

To continue with the following content, we must first understand the attributes of several log_archive_dest_n parameters closely related to the lgwr archiving process. If you select lgwr to archive redo data, in log_archive_dest_n, the service and lgwr attributes must be specified to allow the log transmission service to transmit redo data to a remote archiving destination using the lgwr process. We also need to specify the sync (synchronous) or async (asynchronous) transmission mode. If the sync attribute is specified (if not explicitly specified, the default value is sync ), then, any redo operation generated by the primary database will synchronously trigger network I/O, and the following submission will continue until all network I/O operations are completed. If the async attribute is specified, the operation of the primary database will first record online redologs and then be transmitted to standby. The following describes the process in detail:
1). lgwr synchronization archiving process
For example, the initialization parameters include the following settings:
Log_archive_dest_1 = 'location = E:/ora10g/oradata/jssweb /'
Log_archive_dest_2 = 'service = jsspdg lgwr sync net_timeout = 30'
Log_archive_dest_state_1 = Enable
Log_archive_dest_state_2 = Enable
If you set the sync attribute of the log_archive_dest_n initialization parameter, we recommend that you set the net_timeout attribute at the same time. This attribute controls the timeout of network connections. If the timeout still does not respond, an error message is returned.

Demonstrate the process of synchronizing the redo data to the standby database while writing online redologs in the primary database lgwr.
We can still explain it in two parts:
In the primary database, lgwr submits redo data to the lnsn (lgwr network server process) process (n> 0), and The lnsn starts network transmission.
The RFS (Remote File Server) of the standby database writes the received redo data to the standby redolog. Note that during this period, the transactions of the primary database will remain until all the log_archive_dest_n specified paths with the lgwr sync attribute have been received.

Once the primary database performs log switching, it will trigger the standby arcn to write standby redo to the archive, and then use the redo application (MRP process) or SQL application (LSP process) read the archive file and apply the data to the standby database. (If real-time applications are enabled, MRP/LSP will directly read standby redolog and apply it to the standby database without waiting for archiving ).

2). lgwr does not synchronize the archiving process
For example, the initialization parameters include the following settings:
Log_archive_dest_1 = 'location = E:/ora10g/oradata/jssweb /'
Log_archive_dest_2 = 'service = jsspdg lgwr async'
Log_archive_dest_state_1 = Enable
Log_archive_dest_state_2 = Enable
Net_timeout does not need to be specified for async archiving. Because lgwr and lnsn are no longer associated, It is not affected to specify net_timeout. Therefore, for asynchronous transmission, even if the communication between primary and standby is interrupted due to a network failure, the submission of the primary database is not affected.

The lnsn process asynchronously transmits redo data to the RFS process of the standby database.

The general steps are the same as synchronous transmission. The difference is only in the lnsn process. lgwr writes data to online redolog, the lnsn process accesses the online redolog and transmits data to the Rfs of the remote standby instead of being associated with the local lgwr. The processing logic of standby databases remains unchanged.

Ii. When to send
This section contains two content: What to send and who to send:
1. Specify the transmission and receiving objects through the valid_for attribute.
Valid_for is effective based on XX. In combination with its redo_log_type and database_role attributes, we can basically understand it as: Setting the log file archive path for a specified role, the main purpose is to assist the normal operation of the database after the role switching operation.
Redo_log_type can be set to online_logfile, standby_logfile, all_logfiles
Database_role can be set to: primary_role, standby_role, all_roles
Note that the valid_for parameter has a default value. If this parameter is not set, the default value is equivalent:
Valid_for = (all_logfiles, all_roles)
We recommend that you set this parameter rather than the default value. In some cases, the default parameter value is not necessarily appropriate. For example, logical standby is not like physical standby, and logical standby is in open mode, not only redo data, but also multiple log files (online redologs, archived redologs, and standby redologs ). In most cases, the online redologs generated by logical standby and standby redologs are generated in the same directory. Therefore, we recommend that you set the appropriate valid_for attribute for each * DeST.

2. Specify the database through the db_unique_name attribute
The db_unique_name attribute mainly specifies a unique database name for a database, which makes it possible to dynamically add standby to the DG configuration of the primary database containing the RAC structure, and for the service attribute in log_archive_dest_n, the value of the attribute must correspond to db_unique_name, and because of db_unique_name, the redo data can be transmitted to the database you want to be transmitted during transmission. Of course, to ensure that the redo data is transmitted to the specified server, in addition to db_unique_name and log_archive_dest_n, there is also an initialization parameter: log_archive_config.

For details about log_archive_config, we have some contacts before. In section 2, we also have some simple introductions. The initialization parameters of log_archive_config also include several attributes that can be used to control the transmission and receipt of databases, send, nosend, receive, noreceive:
Send allows the database to send data to the remote end
Receive allows standby to receive data from other databases
Nosend and noreceive are naturally forbidden.

For example, if the primary database does not receive any archive data, you can make the following settings:
Log_archive_config = 'noreceive, dg_config = (jssweb, jsspdg )'

Of course, you also need to note that once the above settings are made, assuming that the server has a role switch, it still does not have the ability to receive redo data.

Iii. Error
For archiving failure handling, the log_archive_dest_n parameter has several attributes that can be used to control the measures to be taken when a fault occurs during archiving. They are:
1. Reopen tries archiving again after a specified time
Use the reopen = seconds (default = 300) attribute to repeat archiving to the archiving destination at the specified time. If this parameter is set to 0, once the redo Data fails, it will not try again and send it again until the next time the redo data is archived, but it will not be archived to the failed archiving destination, instead, it is sent to the backup archiving destination.

2. Alternate specifies the backup archiving destination
The alternate attribute defines a backup archive destination. The so-called backup means that once the master archive destination is unavailable for various reasons, it is temporarily written to the path specified in the alternate attribute.
Note that the reopen attribute has a higher priority than the alternate attribute. If you specify a value greater than 0 for the reopen attribute, lgwr/arch first tries to write data to the primary archive destination, write to the path specified by the alternate attribute is not performed until the maximum number of retries is reached.

3. max_failure controls the number of failed attempts
The max_failure attribute specifies the maximum number of failed attempts. You should be able to think of reopen. Yes, the two should usually be used in combination. Reopen specifies the retry time period after failure, max_failure controls the number of failed attempts, for example:
Log_archive_dest_1 = 'location = E:/ora10g/oradata/jsspdg/reopen = 60 max_failure = 3'

4. Select data protection mode
Everyone has heard of the glorious deeds of the three models. If you do not know them, Please consciously repeat the article "Mixed Face first, let's show you the superb skills of the three models.

For better understanding, we first draw a table that describes the attributes that should be set by the log_archive_dest_n parameter in different protection modes:

Maximum protection, maximum availability, and maximum performance
Redo write process lgwr or arch
In network transmission mode, sync is used when the lgwr process is sync or async, while the arch process is sync.
Disk write operations: affrem or noaffrem
Whether standby redologs is required. Yes, but none are recommended.

All the requirements above are the minimum requirements to meet the protection model. These requirements are based on their characteristics. At the same time, you must understand that, these requirements only ensure that you complete the settings of the Data guard protection mode. If you really want to understand the reasons for satisfying the requirements of their features, you still need to do a lot of necessary work. For example, for the highest availability, the root goal is to restore services in a very short time when one or more master or slave databases are paralyzed, this requires you not only to correctly configure parameters in the highest availability protection mode, but also to have enough standby databases. Do you understand? What, Wood? 555555555. You are cracking down on your language description capabilities ~~~

Again: both the maximum protection and maximum availability require the standby database to be configured with standby redo logs (if role switching is considered, the master database must also be configured ), for more information about standby redo logs, see Chapter 1 1.3 in section 2.

Next we will go to practice to change a data guard configuration from the highest-performance mode to the highest-availability mode:

1. First, check the current protection mode-primary database operations
SQL> select protection_mode, protection_level from V $ database;

Protection_mode protection_level
Maximum performance

2. Modify the initialization parameter-primary database operation
SQL> alter system set log_archive_dest_2 = 'service = jsspdg
2 optional lgwr sync affrem valid_for = (online_logfiles, primary_role)
3 db_unique_name = jsspdg ';

The system has been changed.

3. Set a new data protection mode and restart the database-primary database operation
The statement is very simple, as follows:
SQL> alter database set standby database to maximize availability;

The database has been changed.
Tip: Maximize and {protection | availability | performance} correspond to the maximum protection, maximum availability, and maximum performance respectively.

Drop the database and restart
SQL> shutdown immediate
The database has been closed.
The database has been detached.
The Oracle routine has been disabled.
SQL> startup
The Oracle routine has been started.

Total system global area 167772160 bytes
Fixed size 1289484 bytes
Variable Size 121635572 bytes
Database buffers 37748736 bytes
Redo buffers 7098368 bytes
The database has been loaded.
The database has been opened.

4. Check the current protection mode-primary database operations

SQL> select protection_mode, protection_level from V $ database;
Protection_mode protection_level
Maximum availability

5. Modify the standby initialization parameter settings (mainly considering Role Switching. skip this step if you only want to test) --- standby database operations

SQL> alter system set log_archive_dest_2 = 'service = jssweb optional lgwr sync affsert
2 valid_for = (online_logfiles, primary_role) db_unique_name = jssweb ';

The system has been changed.
View the current protection mode
SQL> select instance_name from V $ instance;


SQL> select protection_mode, protection_level from V $ database;
Protection_mode protection_level
Maximum availability

Configuration successful. Test the configuration on the front.

6. Stop the standby database and check the status of the primary database.
SQL> select protection_mode, protection_level from V $ database;
Protection_mode protection_level
Maximum availability Resynchronization
After the standby database is shut down, the protection level of the primary database is switched to be synchronized.

5. Manage archive interruptions
If the archive log in the primary database fails to be successfully sent to the standby database, archive interruption occurs. Of course, you usually don't need to worry about this, because DG will automatically detect and try to copy the Lost Archive to solve the interruption problem. What can be done? Fal (fetch archive log ).
FAL is divided into server and client. We should have noticed the initialization parameters in the previous steps. The FAL client automatically requests the transmission of the archive file, and the FAL server responds to the request sent by the FAL client. What a wonderful couple.

The FAL mechanism automatically handles the following similar archive interruptions:
When a physical or logical standby database is created, the FAL mechanism automatically obtains the archive files generated during the hot backup of the primary database.
When the following problems occur in the received archive file, the FAL will automatically obtain the archive file to solve the problem:
When the received archive is deleted after the application;
When the disk of the received archive is damaged;
When multiple physical standby instances exist, the FAL mechanism automatically attempts to obtain the lost archive files from other standby servers.

Let everyone know that this is not to let you do anything, but to let you rest assured that Oracle will manage this well. If it is not managed well, you will understand this principle, you can also analyze why it fails to be managed and how it can be used to restore its ability to manage effectively. Of course, you must do it yourself, the following example shows how to manually handle archive interruptions (Note: I will only describe the steps and will not do the demonstration. Because think twice is currently the highest availability mode of land use, it will not be lost in archiving, wow, haha :))

1. First, check whether the archive interruption exists in standby.
You can query the V $ archive_gap view. If there is a record, there is an interruption.
SQL> select * from V $ archive_gap;

The purpose of this step is to retrieve the corresponding low_sequence # And high_sequence #. If RAC exists, pay attention to thread #.

2. Find the archive file corresponding to sequence

SQL> select name from V $ archived_log where thread # = 1 and dest_id = 1 and sequence # between 7 and 10;

3. Copy the corresponding archive file to standby
NOTE: If it is RAC, find the right machine :)
Then run the alter database register logfile command to re-register these files. For example:
SQL> alter database register logfile 'e:/ora10g/jsspdg/xxxx. arc ';

Restart the redo application.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.