DRBD configuration parameters


Related terms in DRBD and its configuration file:

Failover: when node A can no longer serve a client, the system automatically switches over so that node B continues to provide the service in time, ideally without the client noticing that the serving node has changed. (Note: adapted from Baidu Baike.)

Diskless mode: in this mode a node abandons its underlying device and instead reads and writes its data over the network to a remote node. Reference: http://www.drbd.org/users-guide/s-handling-disk-errors.html

Peer: in a two-node environment, the peer is simply the other node.

Stonith: short for "shoot the other node in the head", i.e. blow the other node's head off. It is a measure to avoid split brain: if split brain occurs, one node sends a command that shuts the other node down, so that both nodes cannot end up as primary at the same time. This function also carries some risk.

Fence-peer: DRBD has a mechanism for isolating the peer node when the replication connection is interrupted, and it defines an interface for this mechanism. The drbd-peer-outdater helper bound to Heartbeat is one tool that implements this mechanism through that interface, but you can easily implement your own peer-fencing helper program. One prerequisite for doing so is that a fence-peer handler must be defined in the handlers configuration section. Reference: http://www.drbd.org/users-guide/s-fence-peer.html

Standalone mode: the state in which a node cannot communicate with the other node. Reference: http://www.drbd.org/users-guide/ch-admin.html#s-connection-states

Write barriers: a protection mechanism used by journaling file systems to maintain data consistency. A journaling file system keeps a journal in a dedicated area of the disk. When the file system detects metadata changes, it first writes these changes to the journal instead of changing the metadata immediately. After the change records have been written to the journal, the file system appends a "commit record" indicating that the metadata changes are valid, and only then does the kernel write the actual metadata. Before the commit record is written, the change records must already be on disk. That alone is not enough: drives also maintain large internal caches and reorder operations for better performance, so before writing the commit record the file system must explicitly instruct the disk to flush all journal data to the media. If the commit record reached the journal before the change records, the journal would be corrupted. The kernel's I/O subsystem uses barriers to implement this: once a barrier is issued, all subsequent blocks are prevented from being written until the blocks before the barrier have reached the media. Note: the ext3 and ext4 file systems do not use barriers by default; administrators must enable them explicitly, and not all file systems support barriers. Reference: http://lwn.net/Articles/283161

Meta data: DRBD stores various information about its data in a dedicated area; this is its metadata, which includes: 1. the size of the DRBD device; 2. the generation identifier (GI); 3. the activity log (AL); 4. the quick-sync bitmap. There are two storage methods for metadata, internal and external, and the storage method is specified in each resource configuration section.
The two storage methods each have advantages and disadvantages.

Internal metadata: configuring a resource to use internal metadata means that DRBD stores its metadata on the same underlying physical device as the actual production data, in an area set aside at the end of the device. The advantage is that the metadata stays closely tied to the production data: if the hard disk fails and is replaced, no additional work is required from the administrator, because the metadata is lost together with the production data and restored together with it. The disadvantage is that if the underlying device is a single physical hard disk (as opposed to a RAID set), this storage method hurts write throughput, because every write request from the application also triggers an update of DRBD's metadata; with the metadata on the same disk, each write causes two additional head seeks. Note: if you plan to use internal metadata on an underlying device that already contains data, you need to calculate and set aside the space the DRBD metadata will occupy and perform some special steps, otherwise the existing data may be destroyed. For those special steps, refer to the link at the end of this entry; what I want to say is that it is best not to do this at all.

External metadata: this storage method is straightforward: the metadata is stored on a dedicated block device separate from the production data. Its advantage is a potential improvement for some write workloads. Its disadvantage is that the metadata is separated from the production data, so if the hard disk fails and is replaced, the administrator must intervene manually to resynchronize the data from the surviving node to the replacement disk.

When should external metadata be used? When the device already contains data and the device does not support extending (as LVM does) or shrinking its size.

The space required by DRBD's metadata can be calculated with the following formula:

Note: The images are from the drbd official website.

In the formula, Cs is the size of the data device in sectors and Ms is the resulting metadata size, also in sectors; to convert sectors to MB, divide by 2048. The device size in sectors can be obtained with echo $(($(blockdev --getsize64 device) / 512)). For more information, see http://www.drbd.org/users-guide-emb/ch-internals.html and http://www.wenzizone.cn/?p=280
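For reference, the estimation formula in the official guide is, as far as I recall, Ms = ceil(Cs / 2^18) x 8 + 72 sectors; verify it against the pages linked above. A small shell sketch of the calculation, using /dev/sdb1 as a purely hypothetical backing device:

    # size of the backing device in sectors (Cs); /dev/sdb1 is a placeholder
    CS=$(( $(blockdev --getsize64 /dev/sdb1) / 512 ))
    # estimated metadata size in sectors (Ms), per the formula above (ceiling division)
    MS=$(( (CS + 262143) / 262144 * 8 + 72 ))
    # convert sectors to MB
    echo "$(( MS / 2048 )) MB"

In the resource configuration, the storage method itself is then declared with meta-disk internal; for internal metadata, or with meta-disk pointing at a dedicated device for external metadata (check the drbd.conf man page of your version for the exact syntax).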

The DRBD configuration file:

After DRBD is installed, a configuration file /etc/drbd.conf is created automatically. Its content is empty, but you can find a template file under /usr/share/doc/drbd.../drbd.conf. The template file contains only two include statements. Why is it so concise? Because the DRBD configuration is divided into three parts: global, common, and resource. Although all three parts can be written into the single /etc/drbd.conf file, the official practice is to keep them separate: the global_common.conf file in the /etc/drbd.d directory contains the global and common configuration sections, and each .res file contains one resource configuration section.

About the global configuration section: if all configuration sections are kept in the same drbd.conf file, this section must be placed at the very top. Common options:

minor-count: the number of devices, with a value range of 1 ~ 255 and a default of 32. This option sets how many resources may be defined; if you need to define more resources than this, the DRBD kernel module has to be reloaded.

dialog-refresh: takes a time value of 0 or any positive number; the default is 1. I do not really understand the official explanation of this option, and it is rarely set.

disable-ip-verification: whether to disable verification of IP addresses.

usage-count: whether to participate in DRBD's usage statistics; valid values are yes, no, and ask. According to the official examples, usage-count is usually the only global option you need to configure. To be honest, the official DRBD documentation is vague in places, although my own limited understanding is certainly part of the problem.

About the common configuration section: this section is not required, but it can hold options shared by multiple resources to reduce repetition. It in turn contains the following configuration sections: disk, net, startup, syncer, and handlers.

The disk configuration section: this section fine-tunes the properties of DRBD's underlying storage. Note: this is a very important section, because if the underlying device (disk) reports an error or is damaged, DRBD reacts according to the policy set here. Common options:

on-io-error: this option sets the policy to follow when the underlying device reports an I/O error to the layer above. Valid policies are:

pass_on: pass the I/O error up to the layer above. If the error occurs on the primary node it is reported to the file system, and the upper layer handles it (for example by remounting the file system read-only), which may cause DRBD to stop providing service. If the error occurs on the secondary node it is ignored (because the secondary has no upper layer to report to). This policy used to be the default, but it has now been replaced by detach.

call-local-io-error: invoke the predefined local-io-error handler. This policy requires a local-io-error command to be defined in the handlers section of the resource configuration, so that the administrator can decide, via that command or script, how I/O errors are handled.
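As an illustration of the layout described above (the include lines below match the shipped template as far as I recall, and the option values are only examples):

    # /etc/drbd.conf - top-level file containing only the two include statements
    include "drbd.d/global_common.conf";
    include "drbd.d/*.res";

    # /etc/drbd.d/global_common.conf - global and common sections
    global {
        usage-count no;    # do not report usage statistics
    }
    common {
        # options shared by all resources (disk, net, startup, syncer, handlers) go here
    }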

detach: if an I/O error occurs, the node drops its underlying device and continues working in diskless mode. In diskless mode, as long as the network connection is up, DRBD reads and writes data on the peer node and no failover takes place. This policy costs some performance, but the benefit is obvious: the DRBD service is not interrupted. It is the officially recommended policy and the current default.
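A minimal disk section using this policy might look like the following (illustrative only):

    disk {
        on-io-error detach;    # drop the backing device and continue diskless on I/O errors
    }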

Note: if the detach policy is used to handle underlying I/O errors, will the error also be reported to the administrator so that the disk on this node can be replaced manually? The handlers section should cover this.

fencing: this option sets a policy for avoiding split-brain situations. Valid policies are:

dont-care: the default policy; no fencing measures are taken.

resource-only: with this policy, if a node finds itself in a split-brain situation, it tries to outdate the peer's disk. This is done by calling the fence-peer handler, which reaches the peer node over another communication path and runs drbdadm outdate res on it.

resource-and-stonith: with this policy, if a node finds itself in a split-brain situation, it freezes I/O and calls the fence-peer handler. The handler reaches the peer over another communication path and runs drbdadm outdate res there; if the peer node cannot be reached, a shutdown command is sent to it. Once the problem is resolved, I/O resumes; if the handler fails, I/O can be restarted with the resume-io command.

The net configuration section: this section fine-tunes DRBD's network-related properties. Common options:

sndbuf-size: adjusts the size of the TCP send buffer. In DRBD versions before 8.2.7 the default is 0, meaning the size is adjusted automatically; newer versions default to 128 KiB. In high-throughput networks (such as a dedicated gigabit link or bonded connections) it can be appropriate to increase this value, but it is best not to exceed 2M.

timeout: sets a time value in units of 0.1 seconds. If the partner node does not send a response packet within this time, it is considered dead and the TCP/IP connection is dropped. The default is 60, i.e. 6 seconds. The value must be smaller than connect-int and ping-int.

connect-int: if the remote DRBD device cannot be connected immediately, the system keeps retrying at intervals. This option sets the interval between two attempts, in seconds; the default is 10 seconds.

ping-int: sets a time value in seconds. If the TCP/IP connection to the remote DRBD device has been idle for longer than this, a keep-alive packet is sent to check whether the peer is still alive. The default is 10 seconds.

ping-timeout: sets a time value in units of 0.1 seconds. If the peer does not answer the keep-alive packet within this time, it is considered dead. The default is 500 ms.

max-buffers: sets the maximum number of buffers DRBD allocates, measured in pages (PAGE_SIZE, 4 KB on most systems). These buffers hold data waiting to be written to disk. The minimum is 32 (128 KB); a larger value is generally better.

max-epoch-size: sets the maximum number of data blocks between two write barriers. Values below 10 hurt system performance; a larger value is generally better.
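A sketch of a net section combining several of the options above together with the peer authentication options described further below (all values are illustrative and the shared secret is a placeholder):

    net {
        sndbuf-size    512k;           # example value for a dedicated gigabit link
        timeout        60;             # 6 seconds (units of 0.1 s)
        connect-int    10;
        ping-int       10;
        max-buffers    2048;
        max-epoch-size 2048;
        cram-hmac-alg  "sha1";              # enable peer authentication (algorithm from /proc/crypto)
        shared-secret  "example-secret";    # placeholder password, up to 64 characters
    }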
Another part of the user manual also discusses max-buffers and max-epoch-size, in terms that differ slightly from the drbd.conf help. In the drbd.conf help, max-buffers sets the maximum number of requests and max-epoch-size sets the maximum number of data blocks, and both affect write performance on the secondary node. In the throughput tuning section, max-buffers sets the maximum number of buffers DRBD allocates for data waiting to be written to disk, while max-epoch-size sets the maximum number of write requests permitted between two write barriers. In most cases the two options should be set together, and to the same value; both default to 2048, and on most reasonable high-performance hardware RAID controllers a value of 8000 is a better choice. Taking the two explanations together: max-buffers limits the number of data buffers, and max-epoch-size limits the number of requests allowed between two barriers. Reference: http://www.drbd.org/users-guide-emb/s-throughput-tuning.html#s-tune-disable-barriers

ko-count: this option sets a value that, multiplied by the timeout value, gives a period; if the secondary node fails to complete a single write request within that period, it is kicked out of the cluster (and the primary node enters standalone mode). The value range is 0 ~ 200; the default is 0, which disables the feature.

allow-two-primaries: a feature introduced with DRBD 8.0 and later, allowing a cluster to have two primary nodes. This mode requires a file system that supports it; currently only OCFS2 and GFS do, while the traditional ext3, ext4, and XFS do not.

cram-hmac-alg: this option specifies the HMAC algorithm used for peer authentication, which DRBD strongly recommends enabling. Any algorithm listed in the /proc/crypto file can be used, and you must specify an algorithm here to explicitly enable the peer authentication mechanism.

shared-secret: the password used for peer authentication; it can contain at most 64 characters.

data-integrity-alg: this option sets a kernel-supported algorithm used to verify the integrity of user data sent over the network. Ordinarily, data integrity is only checked by the 16-bit checksum in the TCP/IP header; this option can use any algorithm supported by the kernel. The feature is disabled by default.

In addition, there are the after-sb-0pri, after-sb-1pri, and after-sb-2pri options, which control split-brain recovery.

The startup configuration section: this section fine-tunes DRBD's behaviour when the configured node is started or restarted.

Note: I once tried leaving this section unconfigured. The first time I brought up the DRBD block device on the second node, I ran into a problem: the startup script kept waiting for the peer node to appear, and if you are not willing to keep waiting for the service to start, you have to stop it manually (the screenshots showing this are not reproduced here).


Common options:

wfc-timeout: sets a time value in seconds. When the DRBD block device is brought up, the init script blocks the boot process until the peer node appears; this option limits that wait time. The default is 0, meaning no limit, i.e. wait forever.

degr-wfc-timeout: also a time value in seconds used to limit the wait, but for a different scenario: it applies when a degraded cluster (a cluster with only one node left) is restarted.

outdated-wfc-timeout: as above, a wait time in seconds, applied when waiting for a node that is known to be outdated.

The syncer configuration section: this section fine-tunes the synchronization process. Common options:

rate: sets the synchronization speed. The default unit is KB/sec, and the suffixes K, M, and G are allowed, for example 40M. Note: the speed in the syncer section is expressed in bytes, not bits. The rate set in the configuration file is permanent, but it can be changed temporarily with drbdsetup /dev/drbdnum syncer -r rate, replacing num with the minor device number of your DRBD device and rate with the new speed; run this command on only one of the nodes. To revert to the rate set in the drbd.conf configuration file, run drbdadm adjust resource. Official tip: it is best to set the rate to about 30% of the effective available bandwidth, where the effective available bandwidth is the smaller of the network bandwidth and the disk read/write speed. Two examples: if the I/O subsystem can sustain 180 MB/s of reads and writes while the gigabit network sustains 110 MB/s of throughput, the effective available bandwidth is 110 MB/s and the recommended value is 110 x 0.3 = 33 MB/s. If the I/O speed is 80 MB/s while the network link is 1 gigabit, the effective available bandwidth is 80 MB/s and the recommended rate is 80 x 0.3 = 24 MB/s.

al-extents: this option sets the number of hot extents (the active set); the value range is 7 ~ 3843 and the default is 127. Each extent covers 4 MB of the underlying storage. DRBD tracks the hot area automatically: if the primary node unexpectedly leaves the cluster, the areas covered by the hot extents must be resynchronized when it rejoins. Every change to the hot area is in fact a write to the metadata area, so the larger this value, the longer resynchronization takes, but the less often the metadata is updated. The throughput tuning part of the user manual adds that al-extents adjusts the size of the activity log: if the applications on top of DRBD are write-intensive, a large activity log (active set) is recommended, otherwise frequent metadata updates hurt write performance. Recommended value: 3389. Reference: http://www.drbd.org/users-guide-emb/s-throughput-tuning.html#s-tune-al-extents

verify-alg: this option specifies the algorithm used for online verification; the kernel usually supports the md5, sha1, and crc32c algorithms. Online verification is disabled by default and must be enabled explicitly by setting this option. DRBD's online verification efficiently checks data consistency between the nodes.
Online verification affects CPU load and usage, but only slightly. DRBD 8.2.5 and later support this feature. Once it is enabled, you can run an online verification with drbdadm verify resource. This command checks the specified resource; if any blocks are found to be out of sync, they are marked and a message is written to the kernel log. The process does not disturb programs that are using the device. If unsynchronized blocks are detected, you can resynchronize them after the check completes by running drbdadm disconnect resource followed by drbdadm connect resource. You can also schedule a cron job to run online verification automatically, for example:

42 0 * * 0 root /sbin/drbdadm verify resource

or, if online verification is enabled for all resources:

42 0 * * 0 root /sbin/drbdadm verify all

csums-alg: this option specifies a checksum algorithm used to mark data blocks. If it is not enabled, resynchronization sends every data block from the source to the destination; if it is enabled, only blocks whose checksums differ are exchanged. This is useful when network bandwidth is limited, and it also reduces CPU load after a crashed primary node is restarted.

The handlers configuration section: this section defines a series of handlers that are invoked in response to specific events.
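Pulling the last few sections together, a resource definition in an /etc/drbd.d/*.res file might look roughly like the sketch below. It follows the section layout described in this article (DRBD 8.x style with a syncer section); the resource name, host names, device paths, and addresses are all placeholders, and the fence-peer helper path is the Heartbeat-bound drbd-peer-outdater mentioned earlier, so adjust everything to your environment:

    resource r0 {
        startup {
            wfc-timeout      120;   # wait at most 120 s for the peer at boot
            degr-wfc-timeout 60;    # shorter wait when restarting a degraded cluster
        }
        syncer {
            rate       33M;         # roughly 30% of the effective available bandwidth
            al-extents 3389;        # larger activity log for write-intensive workloads
            verify-alg sha1;        # enable online verification
        }
        handlers {
            # peer fencing helper bound to Heartbeat, as referenced above
            fence-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
        }
        on node1 {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   192.168.1.1:7789;
            meta-disk internal;
        }
        on node2 {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   192.168.1.2:7789;
            meta-disk internal;
        }
    }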
