I recently ran into a strange problem.
An SMS alert reported very high disk IO and growing replication latency.
The output of iostat -x 1 10 was as follows:
The QPS was as follows:
The load was very low and so was the query pressure, which made no sense. The server runs only one MySQL instance.
It turned out to be a hardware problem.
Background: the MegaSAS RAID card BBU learn cycle
Recently, some of our servers with MegaSAS RAID cards saw IO load suddenly soar and IO performance drop sharply at peak business hours. After checking the logs and various settings, we found that the RAID card's cache write policy had changed from WriteBack to WriteThrough. The underlying cause was that the BBU had entered its learn cycle, which automatically switches the cache policy to WriteThrough.
WriteBack and WriteThrough
Before going further, two terms need explaining: WriteBack and WriteThrough.
- WriteBack: a write goes to the RAID card cache and returns immediately; the RAID controller flushes the data to disk when system load is low or the cache is full. This greatly improves RAID card write performance and in most cases lowers the system IO load. Data reliability is guaranteed by the RAID card's BBU (Battery Backup Unit).
- WriteThrough: writes bypass the cache and go directly to disk. RAID card write performance drops, and in most cases this setting raises the system IO load.
Cache policy for MegaSAS RAID cards
For LSI's MegaSAS RAID cards, the default cache policy is: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
How to view the RAID card cache policy
# MegaCli -LDInfo -Lall -aALL
(Target Id: 0)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 557.861 GB
Mirror Data         : 557.861 GB
State               : Optimal
Strip Size          : KB
Number Of Drives    : 2
Span Depth          : ...
... if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disabled
Encryption Type     : None
Is VD Cached: No
Exit Code: 0x00
- Default Cache Policy: the configured default policy. A different policy can be set for each RAID volume.
- Current Cache Policy: the policy that is currently in effect.
Policy description
A policy string has four comma-separated fields.
First field: WriteBack, WriteThrough.
Second field: ReadAheadNone, ReadAdaptive, ReadAhead.
- ReadAheadNone: read-ahead is off. This is the default setting.
- ReadAhead: on a read, the data that follows sequentially is loaded into the cache in advance. This improves sequential read performance but reduces random read performance.
- ReadAdaptive: adaptive read-ahead. When cache memory and IO are idle, the controller reads ahead sequentially, balancing sequential and random read performance at the cost of some computing power.
Third field: Direct, Cached.
- Direct: Direct IO mode. Reads are not buffered in cache memory; data is transferred to the cache and the application at the same time, and if the same data block is read again it is served from cache memory. This is the default setting.
- Cached: Cached IO mode. All reads are buffered in cache memory.
Fourth field: Write Cache OK if Bad BBU, No Write Cache if Bad BBU.
- Write Cache OK if Bad BBU: if the BBU has a problem (such as a battery failure), the write cache is still used, with some risk of data loss.
- No Write Cache if Bad BBU: the write cache is not used when the BBU has a problem.
Issues with automatic policy switching
Because MegaSAS RAID cards default to No Write Cache if Bad BBU, the write cache policy can change on its own (from WriteBack to WriteThrough), which degrades write performance. If this automatic change happens at peak business hours, when system IO load is already high, it can cause unpredictable problems such as a system hang. The following conditions cause the write cache policy to change:
- The RAID card enters the BBU learn cycle (described in detail below).
- A battery fault is detected, such as low battery capacity, usually as a result of battery aging. IBM recommends replacing the RAID card battery once a year.
- No battery is installed. Some servers are purchased without a battery, in which case the policy is automatically set to WriteThrough.
How can I temporarily force the write cache on when the BBU has a problem?
./MegaCli -LDSetProp CachedBadBBU -Lall -aALL
./MegaCli -LDSetProp WB -Lall -aALL
# The following command changes the setting back
./MegaCli -LDSetProp NoCachedBadBBU -Lall -aALL
BBU Learn Cycle
The BBU consists of a lithium-ion battery and an electronic control circuit. The life of a lithium-ion battery depends on how far it has aged: from the moment it leaves the factory, whether or not it is charged and however few charge and discharge cycles it has been through, its capacity slowly decreases. An old battery therefore cannot last as long as a new one. This is also why the BBU's Relative State of Charge does not equal its Absolute State of Charge.
To record the battery's discharge curve so that the controller knows the battery's state (for example its maximum and minimum voltages), and to prolong battery life, automatic calibration (Auto Learn mode) is enabled by default. During the learn cycle, the RAID controller does not enable the BBU until calibration completes. The whole process may take up to 12 hours. During this time WriteBack mode is disabled to protect data integrity, at the cost of degraded performance. The learn cycle has three stages:
- The controller fully charges the BBU battery. This stage may discharge the battery first and then charge it, or charge it directly; if the battery is already full, the cycle goes straight to the second stage.
- Calibration starts and the BBU battery is discharged.
- Once the discharge is complete, calibration is finished and the battery is recharged, which ends the learn cycle.
Note: if the second or third stage is interrupted, the recalibration task stops and is not re-run.
IBM servers default to a 30-day learn cycle, while Dell servers use 90 days. Turning off Auto Learn mode is not recommended: calibration prolongs battery life, and without it the RAID card battery's life drops from the normal 2 years to 8 months.
View the current BBU learn settings
# MegaCli -AdpBbuCmd -GetBbuProperties -aALL
...
Next Learn time: 394618008 Sec
Learn Delay Interval: 0 Hours
Auto-Learn Mode: Enabled
- Auto Learn Period: the automatic calibration interval, in seconds. IBM servers default to a 30-day learn cycle and Dell servers to 90 days. This setting cannot be modified.
- Next Learn time: the time of the next automatic calibration, in seconds since January 1, 2000. This setting cannot be modified; it is calculated as the completion time of the last automatic calibration plus the automatic calibration interval. When converting it to wall-clock time, the RAID card's clock error must be added, and on some RAID cards the result is still wrong even after conversion to GMT.
The actual time is calculated as follows, in pseudocode:
UTC 2000-01-01 00:00:00 + $RealTime seconds
- Learn Delay Interval: how long to delay the start of automatic calibration, in hours; the maximum is 7 days. It applies only to the next learn cycle and is reset to zero automatically when that cycle completes.
- Auto-Learn Mode: turns automatic calibration on or off.
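The Next Learn time conversion described above can be sketched in shell. This is a minimal example that assumes GNU date (for -d @EPOCH); next_learn_utc and BBU_EPOCH_OFFSET are illustrative names, not part of MegaCli. It does not correct for the RAID card's clock error, which has to be added separately.

```shell
# Unix timestamp of 2000-01-01 00:00:00 UTC, the base that "Next Learn time" counts from.
BBU_EPOCH_OFFSET=946684800

# next_learn_utc (hypothetical helper): print the UTC time for a
# "Next Learn time" value given in seconds since 2000-01-01.
next_learn_utc() {
    local secs_since_2000=$1
    date -u -d "@$((BBU_EPOCH_OFFSET + secs_since_2000))" '+%Y-%m-%d %H:%M:%S UTC'
}

# The value from the sample output above.
next_learn_utc 394618008   # prints 2012-07-03 08:06:48 UTC
```

Adding the offset as plain integer arithmetic avoids relying on how date parses relative expressions such as "+ N seconds".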
View the status of the current BBU
# MegaCli -AdpBbuCmd -GetBbuStatus -aALL
BBU status for Adapter: 0
BatteryType: iBBU
Voltage: 3837 mV
Current: -152 mA
Temperature: 23 C
Battery State: Operational
BBU Firmware Status:
  Charging Status              : Discharging
  Voltage                      : OK
  Temperature                  : OK
  Learn Cycle Requested        : Yes
  Learn Cycle Active           : Yes
  Learn Cycle Status           : OK
  Learn Cycle Timeout          : No
  I2C Errors Detected          : No
  Battery Pack Missing         : No
  Battery Replacement required : No
  Remaining Capacity Low       : No
  Periodic Learn Required      : No
  Transparent Learn            : No
  No space to cache offload    : No
  Pack is about to fail & should be replaced : No
  Cache Offload premium feature required : No
  Module microcode update required : No
...
- Charging Status: the battery's current state: Charging, Discharging, or None, meaning the battery is charging, discharging, or doing neither.
- Learn Cycle Requested: a learn cycle has been requested. When this is Yes and Learn Cycle Active below is No, the first stage of the learn cycle has started; the policy changes to WriteThrough at this point, and the battery is either charged directly or discharged and then charged.
- Learn Cycle Active: whether the learn cycle is in its calibration stage. If Yes, the cycle has entered the second stage and the controller has begun calibrating the battery.
- Battery Replacement required: whether the battery needs replacing. If Yes, replace the battery as soon as possible.
- Remaining Capacity Low: whether the remaining capacity is low. If Yes, the battery needs replacing.
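The rules above for Learn Cycle Requested and Learn Cycle Active can be turned into a small status check. This is a rough sketch only: learn_cycle_phase is a hypothetical helper name, it expects the text printed by MegaCli -AdpBbuCmd -GetBbuStatus on stdin, and it assumes a single adapter (with several adapters the last one's values win).

```shell
# learn_cycle_phase (hypothetical helper): classify the learn cycle stage
# from BBU status text on stdin, using the two "Learn Cycle" flags.
learn_cycle_phase() {
    awk -F':' '
        # Trim surrounding whitespace and lowercase the value for comparison.
        function val(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return tolower(s) }
        tolower($1) ~ /learn cycle requested/ { requested = val($2) }
        tolower($1) ~ /learn cycle active/    { active    = val($2) }
        END {
            if (active == "yes")
                print "stage 2: calibration (discharge) in progress"
            else if (requested == "yes")
                print "stage 1: learn cycle requested, charging before calibration"
            else
                print "idle: no learn cycle in progress"
        }'
}
```

A typical call would be MegaCli -AdpBbuCmd -GetBbuStatus -aALL | learn_cycle_phase. For the sample output above, where both flags are Yes, it reports stage 2.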
How to force a learn cycle to start
The following command forces a calibration. A few seconds after it is issued, the policy automatically changes to WriteThrough.
# MegaCli -AdpBbuCmd -BbuLearn -aALL
This command lets you roughly adjust when the next automatic calibration runs, but not with 100% accuracy:
- The completion time of the current learn cycle cannot be calculated precisely, because it depends on how fast the battery discharges and charges.
- The next battery relearn may be postponed for some reason. For example, if the battery is charging, the whole relearn is postponed until charging finishes.
How to see whether the current cache policy has changed
If the Default Cache Policy and the Current Cache Policy differ, the policy has changed.
# MegaCli -LDInfo -Lall -aALL
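For monitoring, the comparison of the default and current cache policy can be automated. The following is a minimal sketch: cache_policy_changed is a hypothetical helper, and it assumes each logical drive in the LDInfo output prints its Default Cache Policy line before its Current Cache Policy line.

```shell
# cache_policy_changed (hypothetical helper): read "MegaCli -LDInfo -Lall -aALL"
# output on stdin, print one line per logical drive whose current policy differs
# from its default, and return non-zero if any drive differs.
cache_policy_changed() {
    awk '
        /^[ \t]*Default Cache Policy/ {
            def = $0; sub(/^[^:]*:[ \t]*/, "", def)   # keep only the value
        }
        /^[ \t]*Current Cache Policy/ {
            cur = $0; sub(/^[^:]*:[ \t]*/, "", cur)
            n++                                       # n-th logical drive seen
            if (cur != def) {
                printf "logical drive #%d: default=[%s] current=[%s]\n", n, def, cur
                changed = 1
            }
        }
        END { exit (changed ? 1 : 0) }'
}
```

It could be called as MegaCli -LDInfo -Lall -aALL | cache_policy_changed || echo "cache policy changed", for example from an alerting script.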
How do I change the learn mode to manual?
echo 'autoLearnMode=1' > /tmp/megaraid.conf
MegaCli -AdpBbuCmd -SetBbuProperties -f /tmp/megaraid.conf -aALL
# 1 disables, 0 enables; the relearn operation runs immediately when switching from disabled to enabled
# Confirm that the change took effect
MegaCli -AdpBbuCmd -GetBbuProperties -aALL
Suggestions
Recommended cache policy: use No Write Cache if Bad BBU, sacrificing performance to keep data safe when the BBU has a problem.
WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Here are a few options to choose from:
- Force a learn cycle on the BBU during off-peak hours. The next automatic learn cycle is then delayed by 5-6 hours, depending on how long the whole learn cycle takes: each time a learn cycle runs, the next execution time shifts later by the duration of the cycle just completed, typically about 5 hours. It is recommended to run the manual learn cycle during off-peak hours (typically 02:00~05:00), adjusted for the delay actually observed.
- Switch to manual mode and trigger the learn cycle periodically through crontab or some other means. The interval then has to be chosen per hardware, and choosing the wrong interval shortens battery life. IBM machines use 30 days, Dell machines 90 days.
- Check the time of the next learn cycle and, before it starts, set Write Cache OK if Bad BBU so that the write cache policy does not change during the learn cycle; afterwards, switch back to the original configuration. With this approach, data is not protected during the learn cycle (about 5 hours), and data loss will occur if there is a power outage.
- Check the time of the next learn cycle and trigger it 1-2 days early, during off-peak hours. This method works best and is the most convenient, but it requires a dedicated script to calculate the next learn cycle time.
Recommended practice: keep Auto Learn mode enabled and use crontab to periodically check the time of the next learn cycle; 1-2 days before it is due, force a relearn on the RAID card during off-peak hours (typically 02:00~05:00).
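A cron-driven script for this recommended practice might look roughly like the sketch below. The helper names, the 2-day lead time, and the cron path are all assumptions made for illustration. The sketch uses only the first adapter's Next Learn time, ignores the RAID card's clock error, and also triggers when the reported time has already passed.

```shell
# Unix timestamp of 2000-01-01 00:00:00 UTC, the base that "Next Learn time" counts from.
BBU_EPOCH_OFFSET=946684800

# parse_next_learn (hypothetical helper): extract the "Next Learn time" value
# (seconds) from "MegaCli -AdpBbuCmd -GetBbuProperties -aALL" output on stdin.
parse_next_learn() {
    awk -F':' 'tolower($1) ~ /next learn time/ { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# learn_due_soon (hypothetical helper): succeed when the next automatic learn
# cycle starts within lead_days of now_epoch (a Unix timestamp).
learn_due_soon() {
    local next_learn_secs=$1 now_epoch=$2 lead_days=$3
    local next_epoch=$((BBU_EPOCH_OFFSET + next_learn_secs))
    [ $((next_epoch - now_epoch)) -le $((lead_days * 86400)) ]
}

# Meant to run from cron in the off-peak window, e.g. (path is an assumption):
#   0 2 * * * /usr/local/sbin/bbu-relearn.sh
# Defined but not called here, because it needs MegaCli and a real controller.
trigger_learn_if_due() {
    local next
    next=$(MegaCli -AdpBbuCmd -GetBbuProperties -aALL | parse_next_learn)
    if [ -n "$next" ] && learn_due_soon "$next" "$(date +%s)" 2; then
        MegaCli -AdpBbuCmd -BbuLearn -aALL
    fi
}
```

Keeping the decision in learn_due_soon, separate from the MegaCli calls, makes it possible to check the date arithmetic on a machine without a RAID card.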