Our MySQL backup system has hit a serious bug,
caused by a bug in the open-source tool xtrabackup:
https://bugs.launchpad.net/percona-xtrabackup/+bug/722638
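Our earlier large-scale deployments never ran into this problem.
After migrating the counters to MySQL, we deployed the backup system and the backups failed again and again, so we decided to get to the bottom of it.
After a series of tests, we found that a backup cannot survive the counters' data-load operation running while the backup is in progress.
The backup system reports: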
[code]
[01] Copying ./cnt_it/cnt_referrer_channel_2011.ibd
to /usr/local/mysql/crontab/cnt_it/backup/innodb/full/2011-06-10_18-18-25/cnt_it/cnt_referrer_channel_2011.ibd
[01] ...done
[01] Copying ./cnt_it/cnt_goals_abandon_201109.ibd
to /usr/local/mysql/crontab/cnt_it/backup/innodb/full/2011-06-10_18-18-25/cnt_it/cnt_goals_abandon_201109.ibd
[01] ...done
[01] Copying ./cnt_it/cnt_referrer_search_keyword_201107.ibd InnoDB: Error: tablespace id is 43167 in the data dictionary
InnoDB: but in file ./cnt_it/cnt_referrer_summary_work.ibd it is 43178!
110610 18:37:57 InnoDB: Assertion failure in thread 1201920320 in file /home/buildbot/slaves/percona-server-51-12/TGZ_CentOS_5_x86_64/work/xtrabackup-1.6/Percona-Server-5.5/storage/innobase/fil/fil0fil.c line 780
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
to /usr/local/mysql/crontab/cnt_it/backup/innodb/full/2011-06-10_18-18-25/cnt_it/cnt_referrer_search_keyword_201107.ibd
[01] ...done
[01] Copying ./cnt_it/cnt_goals_referrer_201205.ibd
to /usr/local/mysql/crontab/cnt_it/backup/innodb/full/2011-06-10_18-18-25/cnt_it/cnt_goals_referrer_201205.ibd
[01] ...done
./backup.sh: line 109: 24002 backup failed xtrabackup
--defaults-file=$CNF --backup --target-dir=$BACKUP/$ENGINE/full/$day
--datadir=$DATADIR
+ return 1
+ critical
+ df -h
[/code]
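So what is the problem here?
While a backup is running, tables in the database must not be rebuilt, for example by TRUNCATE TABLE, or by DROP TABLE followed by re-creating the table. As far as we can tell, a rebuilt table gets a new tablespace id, which is the mismatch the log complains about.
Judging from the error message, xtrabackup seems to have anticipated this case but did not handle it at the time, and instead placed an assertion at the relevant point in the code.
When the code hits that assertion, the process simply exits.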
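To tell this particular failure apart from other backup errors, the log can be checked for the two InnoDB lines quoted above. The following is only a minimal sketch: the helper names `has_tablespace_mismatch` and `mismatched_ibd` are made up for illustration, and the patterns are taken from the single log excerpt in this post, so other xtrabackup versions may word the message differently.

```shell
#!/bin/sh
# Hypothetical helpers, not part of xtrabackup or of our backup.sh.

# Returns 0 if the given log file contains the tablespace id mismatch error.
has_tablespace_mismatch() {
    grep -q 'InnoDB: Error: tablespace id is [0-9]* in the data dictionary' "$1"
}

# Prints the .ibd file named on the "but in file ... it is N!" line, if any.
mismatched_ibd() {
    sed -n 's/^InnoDB: but in file \(.*\.ibd\) it is [0-9]*!$/\1/p' "$1"
}
```

Something like `has_tablespace_mismatch backup.log && mismatched_ibd backup.log` would then print the table file that was rebuilt during the backup (`backup.log` being whatever file the backup output was saved to).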
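The bug is present in versions 1.5, 1.5.1 and 1.6, and a fix is not expected until version 1.7.
We will just have to wait.
For now, the workaround is to take the backups from a replica instead.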
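A rough sketch of what the replica-side backup could look like is below. It reuses the xtrabackup flags and the `$CNF`, `$BACKUP`, `$ENGINE`, `$day` and `$DATADIR` variables from the `backup.sh` trace above. Pausing the replication SQL thread for the duration of the backup is our own assumption here (so that no TRUNCATE or DROP is replayed mid-backup), not something the bug report prescribes. The function name `backup_on_replica` and the `MYSQL` / `XTRABACKUP` override variables are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper, meant to run on the replica host.
# Assumes CNF, BACKUP, ENGINE, day and DATADIR are already set, as in backup.sh.
backup_on_replica() {
    # Assumption: stop applying replicated statements so no table is rebuilt mid-backup.
    "${MYSQL:-mysql}" -e 'STOP SLAVE SQL_THREAD' || return 1

    # Same invocation as in the trace above.
    "${XTRABACKUP:-xtrabackup}" --defaults-file="$CNF" --backup \
        --target-dir="$BACKUP/$ENGINE/full/$day" --datadir="$DATADIR"
    status=$?

    # Resume replication whether or not the backup succeeded.
    "${MYSQL:-mysql}" -e 'START SLAVE SQL_THREAD'
    return "$status"
}
```

The replica falls behind the master while the SQL thread is stopped and catches up afterwards, so this trades some replication lag for a backup that is not interrupted by table rebuilds.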