MHA GTID-based failover: code analysis

Source: Internet
Author: User

This post supplements the article linked below by walking through how MHA handles a GTID-based failover.
http://blog.chinaunix.net/uid-20726500-id-5700631.html

MHA performs a GTID-based failover only when all three of the following conditions are met (see the get_gtid_status function):
gtid_mode is enabled on all nodes (gtid_mode = 1)
Executed_Gtid_Set is not empty on any node
At least one node has Auto_Position = 1
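
The three conditions can be sketched as a single predicate. This is an illustrative Python sketch, not MHA's Perl code: the function name `gtid_failover_possible` and the dictionary layout (one dict per node, keyed by the status field names used above) are assumptions made for the example.

```python
from typing import Dict, List


def gtid_failover_possible(nodes: List[Dict]) -> bool:
    """Return True only if all three conditions listed above hold.

    Each node is a hypothetical dict such as:
    {"gtid_mode": "ON", "Executed_Gtid_Set": "uuid:1-10", "Auto_Position": 1}
    """
    if not nodes:
        return False
    # Condition 1: GTID mode is enabled on every node.
    if not all(n.get("gtid_mode") in ("ON", 1) for n in nodes):
        return False
    # Condition 2: every node reports a non-empty Executed_Gtid_Set.
    if not all(n.get("Executed_Gtid_Set") for n in nodes):
        return False
    # Condition 3: at least one node replicates with Auto_Position = 1.
    return any(n.get("Auto_Position") == 1 for n in nodes)
```

If any one condition fails, the sketch returns False, which corresponds to MHA falling back to the non-GTID failover path.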


GTID-based MHA failover

  MHA::MasterFailover::main()
  -> do_master_failover
    Phase 1: Configuration Check Phase
    -> check_settings:
      check_node_version: check the MHA version information
      connect_all_and_read_server_status: check whether the MySQL instance on each node can be connected
      get_dead_servers / get_alive_servers / get_alive_slaves: double-check the status of each node
      start_sql_threads_if: check whether Slave_SQL_Running is Yes; if not, start the SQL thread

    Phase 2: Dead Master Shutdown Phase (for us, its only function is to stop the IO threads)
    -> force_shutdown($dead_master):
      stop_io_thread: stop the IO thread on every slave (so they stop reading from the dead master)
      force_shutdown_internal (this actually runs master_ip_failover_script / shutdown_script from the configuration file; a script that is not configured is skipped):
        master_ip_failover_script: if a VIP is set, switch the VIP first
        shutdown_script: if a shutdown script is set, run it

    Phase 3: Master Recovery Phase
    -> Phase 3.1: Getting Latest Slaves Phase (find the most up-to-date slaves)
      read_slave_status: obtain the binlog file/position of each slave
        check_slave_status: run "SHOW SLAVE STATUS" to obtain the following slave information:
          Slave_IO_State, Master_Host,
          Master_Port, Master_User,
          Slave_IO_Running, Slave_SQL_Running,
          Master_Log_File, Read_Master_Log_Pos,
          Relay_Master_Log_File, Last_Errno,
          Last_Error, Exec_Master_Log_Pos,
          Relay_Log_File, Relay_Log_Pos,
          Seconds_Behind_Master, Retrieved_Gtid_Set,
          Executed_Gtid_Set, Auto_Position,
          Replicate_Do_DB, Replicate_Ignore_DB, Replicate_Do_Table,
          Replicate_Ignore_Table, Replicate_Wild_Do_Table,
          Replicate_Wild_Ignore_Table
      identify_latest_slaves:
        compare Master_Log_File/Read_Master_Log_Pos across the slaves to find the latest slaves
      identify_oldest_slaves:
        compare Master_Log_File/Read_Master_Log_Pos across the slaves to find the oldest slaves
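
The comparison in Phase 3.1 can be illustrated with a short Python sketch. It is not MHA's implementation: the helper name `binlog_coordinate`, the `host` key, and the choice to order binlog files by their numeric suffix (for example `mysql-bin.000012`) are assumptions made for this example.

```python
from typing import Dict, List, Tuple


def binlog_coordinate(slave: Dict) -> Tuple[int, int]:
    """Turn (Master_Log_File, Read_Master_Log_Pos) into a sortable tuple.

    Assumption: the file name ends in a numeric sequence, e.g. "mysql-bin.000012".
    """
    file_no = int(slave["Master_Log_File"].rsplit(".", 1)[1])
    return (file_no, int(slave["Read_Master_Log_Pos"]))


def identify_latest_slaves(slaves: List[Dict]) -> List[Dict]:
    """Slaves that have read the furthest position from the master."""
    newest = max(binlog_coordinate(s) for s in slaves)
    return [s for s in slaves if binlog_coordinate(s) == newest]


def identify_oldest_slaves(slaves: List[Dict]) -> List[Dict]:
    """Slaves that have read the least from the master."""
    oldest = min(binlog_coordinate(s) for s in slaves)
    return [s for s in slaves if binlog_coordinate(s) == oldest]
```

Both functions return lists because several slaves can share the same coordinate. The file number is compared first, so a slave on an older binlog file counts as older even if its position within that file is numerically larger.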

    -> Phase 3.2: Determining New Master Phase
      get_most_advanced_latest_slave: find the most advanced slave (by Relay_Master_Log_File, Exec_Master_Log_Pos)

      select_new_master: select the new master node
        If a preferred node is specified, one of the active preferred nodes will be the new master.
        If the latest server is too far behind (i.e. its SQL thread was stopped for online backups),
        we should not use it as the new master, but we should fetch the relay log from it. Even if a preferred
        master is configured, it does not become the master if it is far behind.
        get_candidate_masters:
          the nodes configured with candidate_master > 0 in the configuration file
        get_bad_candidate_masters:
          # The following servers can not be master:
          # - Dead servers
          # - Set no_master in conf files (i.e. DR servers)
          # - log_bin is disabled
          # - Major version is not the oldest
          # - Too much replication delay (the binlog position difference between slave and master is greater than 100000000)
        Search among the candidate_master slaves that have received the latest relay log events
        If not found:
          search among all candidate_master slaves
        If not found:
          search among all slaves that have received the latest relay log events
        If not found:
          search among all slaves
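
The four-tier search order and the exclusion rules above can be sketched as follows. This is a simplified Python illustration, not MHA's code: the dict keys (`host`, `dead`, `no_master`, `log_bin`, `oldest_major_version`, `binlog_lag`, `candidate_master`) and the `latest_hosts` argument are hypothetical names chosen for the example; only the 100000000 threshold comes from the text.

```python
from typing import Dict, List, Optional, Set

MAX_BINLOG_LAG = 100000000  # delay threshold quoted in the outline above


def is_bad_candidate(s: Dict) -> bool:
    """Apply the five exclusion rules; missing keys are treated as healthy."""
    return bool(
        s.get("dead", False)
        or s.get("no_master", False)
        or not s.get("log_bin", True)
        or not s.get("oldest_major_version", True)
        or s.get("binlog_lag", 0) > MAX_BINLOG_LAG
    )


def select_new_master(slaves: List[Dict], latest_hosts: Set[str]) -> Optional[str]:
    """Return the host of the first acceptable slave, trying four tiers in order."""
    candidates = [s for s in slaves if s.get("candidate_master", 0) > 0]
    tiers = [
        [s for s in candidates if s["host"] in latest_hosts],  # latest candidates
        candidates,                                            # any candidate
        [s for s in slaves if s["host"] in latest_hosts],      # latest slaves
        slaves,                                                # any slave
    ]
    for tier in tiers:
        for s in tier:
            if not is_bad_candidate(s):
                return s["host"]
    return None
```

A later tier is consulted only when every server in the earlier tiers is excluded, which mirrors the repeated "If not found" steps in the outline.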

    -> Phase 3.3: New Master Recovery Phase
      recover_master_gtid_internal:
        wait_until_relay_log_applied
        stop_slave
        If the new master is not a slave with the latest relay log:
          $latest_slave->wait_until_relay_log_applied: wait until Exec_Master_Log_Pos equals Read_Master_Log_Pos on the slave with the latest relay log
          change_master_and_start_slave($target, $latest_slave)
          wait_until_in_sync($target, $latest_slave)
        save_from_binlog_server:
          iterate over all binlog servers and run save_binary_logs --command=save to fetch the binlogs
        apply_binlog_to_master:
          apply the binlogs obtained from the binlog servers (if any)
        If master_ip_failover_script is set, call $master_ip_failover_script --command=start to enable the VIP.
        If skip_disable_read_only is not set, set read_only = 0.
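
To make the ordering and the conditional branches of Phase 3.3 easier to follow, here is a small Python sketch that only builds the list of steps as strings; it does not talk to MySQL. The function name, its parameters, and the step strings are assumptions for illustration and do not correspond to an MHA API.

```python
from typing import List, Optional


def new_master_recovery_steps(
    is_latest_slave: bool,
    binlog_servers: List[str],
    master_ip_failover_script: Optional[str] = None,
    skip_disable_read_only: bool = False,
) -> List[str]:
    """Return the ordered Phase 3.3 steps for the chosen new master."""
    steps = ["wait_until_relay_log_applied", "stop_slave"]
    if not is_latest_slave:
        # The new master must first catch up from the most advanced slave.
        steps += [
            "latest_slave.wait_until_relay_log_applied",
            "change_master_and_start_slave(target, latest_slave)",
            "wait_until_in_sync(target, latest_slave)",
        ]
    for server in binlog_servers:
        steps.append(f"save_binary_logs --command=save (binlog server {server})")
    if binlog_servers:
        steps.append("apply_binlog_to_master")
    if master_ip_failover_script:
        steps.append(f"{master_ip_failover_script} --command=start")
    if not skip_disable_read_only:
        steps.append("SET GLOBAL read_only = 0")
    return steps
```

For a new master that already holds the latest relay log, with no binlog servers and no VIP script, the plan reduces to waiting for the relay log, stopping the slave, and clearing read_only.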

    Phase 4: Slaves Recovery Phase
      recover_slaves_gtid_internal
      -> Phase 4.1: Starting Slaves in parallel
        Run change_master_and_start_slave on all slaves.
        If wait_until_gtid_in_sync is set, use "SELECT WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS(?, 0)" to wait for each slave to catch up.

    Phase 5: New master cleanup phase
      reset_slave_on_new_master
      Cleaning up the new master simply resets its slave info, that is, it discards the replication settings it had as a slave. At this point the entire master failover process is complete.



The online switch process with GTID enabled is the same as without GTID; the only difference is the CHANGE MASTER statement that is executed, so it is omitted here.
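
The difference in the CHANGE MASTER statement can be illustrated as follows. With GTID, the replica is pointed at the new master with MASTER_AUTO_POSITION=1; without GTID, an explicit binlog file and position are required. The helper below is a hypothetical Python sketch (its name and parameters are assumptions, and it does plain string formatting without escaping or a password clause), intended only to show the two statement shapes.

```python
from typing import Optional


def build_change_master(
    host: str,
    port: int,
    user: str,
    use_gtid: bool,
    log_file: Optional[str] = None,
    log_pos: Optional[int] = None,
) -> str:
    """Build an illustrative CHANGE MASTER TO statement (no escaping applied)."""
    parts = [
        f"MASTER_HOST='{host}'",
        f"MASTER_PORT={int(port)}",
        f"MASTER_USER='{user}'",
    ]
    if use_gtid:
        # GTID mode: the replica works out where to resume by itself.
        parts.append("MASTER_AUTO_POSITION=1")
    else:
        # Non-GTID mode: the binlog coordinates must be supplied explicitly.
        if log_file is None or log_pos is None:
            raise ValueError("log_file and log_pos are required without GTID")
        parts.append(f"MASTER_LOG_FILE='{log_file}'")
        parts.append(f"MASTER_LOG_POS={int(log_pos)}")
    return "CHANGE MASTER TO " + ", ".join(parts)
```

In real code the values should be passed through the database driver's quoting or parameter mechanism rather than interpolated directly.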
