Example of using Python to incrementally delete MySQL table data

Source: Internet
Author: User

Scenario:

A business database runs on MySQL 5.5 and receives a large amount of data every day. From time to time, data older than a specified point in time has to be deleted from several tables. This could be done with a few WHILE loops, and MySQL does offer similar functionality, but I am not proficient in it, so I implemented the cleanup in Python.


Script:


# coding: utf-8
import MySQLdb
import time

# Delete config
DELETE_DATETIME = '2017-08-31 23:59:59'
DELETE_ROWS = 10000
EXEC_DETAIL_FILE = 'exec_detail.txt'
SLEEP_SECOND_PER_BATCH = 0.5

# Date followed by the locale's time representation
DATETIME_FORMAT = '%Y-%m-%d %X'
# MySQL connection config
Default_MySQL_Host = 'localhost'
Default_MySQL_Port = 3358
Default_MySQL_User = "root"
Default_MySQL_Password = 'roo@01239876'
Default_MySQL_Charset = "utf8"
Default_MySQL_Connect_TimeOut = 120
Default_Database_Name = 'testdb001'


def get_time_string(dt_time):
    """
    Obtain the time string in the specified format.
    :param dt_time: the time (struct_time) to convert to a string
    :return: the time as a string in DATETIME_FORMAT
    """
    return time.strftime(DATETIME_FORMAT, dt_time)


def print_info(message):
    """
    Output the message to the console and append it to the log file.
    :param message: string to be output
    :return: no return value
    """
    print(message)
    # Timestamp on one line, message on the next ('\n' rather than chr(13),
    # which is a carriage return and not a line feed)
    new_message = get_time_string(time.localtime()) + '\n' + str(message)
    write_file(EXEC_DETAIL_FILE, new_message)


def write_file(file_path, message):
    """
    Append the incoming message to the file specified by file_path.
    The directory containing the file is not created here and must already exist.
    :param file_path: path of the file to be written
    :param message: information to be written
    :return: no return value
    """
    # 'a' opens the file in append mode and creates it if it does not exist
    with open(file_path, 'a') as file_handle:
        file_handle.write(message)
        # Append a line feed to facilitate browsing
        file_handle.write('\n')


def get_mysql_connection():
    """
    Return a database connection according to the default configuration.
    :return: database connection
    """
    conn = MySQLdb.connect(
        host=Default_MySQL_Host,
        port=Default_MySQL_Port,
        user=Default_MySQL_User,
        passwd=Default_MySQL_Password,
        connect_timeout=Default_MySQL_Connect_TimeOut,
        charset=Default_MySQL_Charset,
        db=Default_Database_Name
    )
    return conn


def mysql_exec(sql_script, sql_param=None):
    """
    Execute the input script and return the number of affected rows.
    :param sql_script: SQL statement to execute
    :param sql_param: optional parameters for the statement
    :return: number of rows affected by the last statement of the script
    """
    try:
        conn = get_mysql_connection()
        print_info("execute script on server {0}: {1}".format(
            conn.get_host_info(), sql_script))
        cursor = conn.cursor()
        if sql_param is not None:
            cursor.execute(sql_script, sql_param)
            row_count = cursor.rowcount
        else:
            cursor.execute(sql_script)
            row_count = cursor.rowcount
        conn.commit()
        cursor.close()
        conn.close()
    except Exception as e:
        print_info("execute exception:" + str(e))
        row_count = 0
    return row_count


def mysql_query(sql_script, sql_param=None):
    """
    Run the input SQL script and return the query result.
    :param sql_script: SQL statement to execute
    :param sql_param: optional parameters for the statement
    :return: the SQL query result, or None if the query failed
    """
    try:
        conn = get_mysql_connection()
        print_info("execute script on server {0}: {1}".format(
            conn.get_host_info(), sql_script))
        cursor = conn.cursor()
        if sql_param is not None:
            cursor.execute(sql_script, sql_param)
        else:
            cursor.execute(sql_script)
        exec_result = cursor.fetchall()
        cursor.close()
        conn.close()
        return exec_result
    except Exception as e:
        print_info("execute exception:" + str(e))
        # Return None explicitly so that callers can detect the failure
        return None


def get_id_range(table_name):
    """
    Obtain the maximum ID, minimum ID, and total number of rows to be deleted from the input table.
    :param table_name: table to delete from
    :return: the maximum ID, minimum ID, and total number of rows to be deleted
    """
    sql_script = """
    SELECT
    MAX(ID) AS MAX_ID,
    MIN(ID) AS MIN_ID,
    COUNT(1) AS Total_Count
    FROM {0}
    WHERE create_time <= '{1}';
    """.format(table_name, DELETE_DATETIME)

    query_result = mysql_query(sql_script=sql_script, sql_param=None)
    # mysql_query returns nothing usable if the query failed
    if not query_result:
        return 0, 0, 0
    max_id, min_id, total_count = query_result[0]
    # Pitfall: max_id and min_id may be None even when total_count is not 0,
    # so check max_id and min_id for NULL explicitly
    if (max_id is None) or (min_id is None):
        max_id, min_id, total_count = 0, 0, 0
    return max_id, min_id, total_count


def delete_data(table_name):
    max_id, min_id, total_count = get_id_range(table_name)
    if total_count == 0:
        print_info("The current table {0} has no data to be deleted".format(table_name))
        return
    temp_id = min_id
    while temp_id <= max_id:
        sql_script = """
        DELETE FROM {0}
        WHERE id <= {1}
        AND id >= {2}
        AND create_time <= '{3}';
        """.format(table_name, temp_id + DELETE_ROWS, temp_id, DELETE_DATETIME)
        temp_id += DELETE_ROWS
        print(sql_script)
        row_count = mysql_exec(sql_script)
        print_info("affected rows: {0}".format(row_count))
        # Avoid dividing by zero when only a single ID matches,
        # and cap the progress at 100% once temp_id passes max_id
        id_span = max(max_id - min_id, 1)
        current_percent = min((temp_id - min_id) * 100.0 / id_span, 100.0)
        print_info("current progress {0}/{1}, remaining {2}, progress: {3}%".format(
            temp_id, max_id, max(max_id - temp_id, 0), "%.2f" % current_percent))
        time.sleep(SLEEP_SECOND_PER_BATCH)
    print_info("The current table {0} has no data to be deleted".format(table_name))


delete_data('tb001')
delete_data('tb002')
delete_data('tb003')
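
The script builds its SQL by formatting values directly into the statement text, although mysql_exec already accepts an optional parameter argument. As a minimal sketch, assuming MySQLdb's %s placeholder style, the DELETE could be built with bound values instead. The helper name build_delete_statement and the table-name check are hypothetical and not part of the original script; a table name cannot be bound as a parameter, so it is validated and formatted in separately.

```python
import re

# Hypothetical rule: only letters, digits and underscores in table names
_TABLE_NAME_PATTERN = re.compile(r'^[A-Za-z0-9_]+$')


def build_delete_statement(table_name, low_id, high_id, cutoff):
    """Return (sql, params) for deleting one ID range older than cutoff."""
    # Identifiers cannot be passed as bound parameters, so validate the name
    if not _TABLE_NAME_PATTERN.match(table_name):
        raise ValueError("unexpected table name: {0!r}".format(table_name))
    sql = (
        "DELETE FROM `{0}` "
        "WHERE id >= %s AND id <= %s AND create_time <= %s"
    ).format(table_name)
    # The driver escapes these values when the statement is executed
    return sql, (low_id, high_id, cutoff)
```

The returned pair could then be passed on as mysql_exec(sql, params).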

Execution result:

Implementation principle:

Because the table has an auto-increment ID, we can find the maximum and minimum IDs that meet the deletion condition, then step through that ID range and delete one small range (for example, 10000 IDs) at a time.
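
The range stepping described here can be sketched in isolation as a small generator. This is an illustration under assumptions, not the script's own loop: iter_id_batches is a hypothetical helper, and unlike the loop in the script it yields inclusive ranges that do not share a boundary ID.

```python
def iter_id_batches(min_id, max_id, batch_size):
    """Yield inclusive (low_id, high_id) ranges that cover min_id..max_id."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    low_id = min_id
    while low_id <= max_id:
        # The last range is clipped so it never goes past max_id
        high_id = min(low_id + batch_size - 1, max_id)
        yield low_id, high_id
        low_id = high_id + 1


for low_id, high_id in iter_id_batches(1, 25, 10):
    print("delete ids {0}..{1}".format(low_id, high_id))
```

Each yielded pair corresponds to one small DELETE, so the size of every transaction is bounded by batch_size IDs no matter how many rows qualify in total.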

Advantages:

It works like chopping a large log with a small axe: each transaction is small and has little impact on the online workload. The script prints the ID it is currently processing, so it can be stopped at any time, and with a slight modification to the code it can be restarted from that ID.
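
One possible way to make the restart automatic is to store the next starting ID in a small checkpoint file after every batch. The sketch below is an assumption about how that could look: load_checkpoint, save_checkpoint and the JSON file layout are hypothetical and do not appear in the original script.

```python
import json
import os


def load_checkpoint(path, table_name, default_id):
    """Return the saved next ID for table_name, or default_id if none exists."""
    if not os.path.exists(path):
        return default_id
    with open(path) as handle:
        state = json.load(handle)
    return state.get(table_name, default_id)


def save_checkpoint(path, table_name, next_id):
    """Record the ID at which the next batch for table_name should start."""
    state = {}
    if os.path.exists(path):
        with open(path) as handle:
            state = json.load(handle)
    state[table_name] = next_id
    with open(path, 'w') as handle:
        json.dump(state, handle)
```

In delete_data, temp_id could then start from load_checkpoint(path, table_name, min_id), with save_checkpoint called after each successful batch.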

Shortcomings:

To keep master/slave replication delay from growing too high, the script sleeps for a fixed interval (SLEEP_SECOND_PER_BATCH) after each deletion batch, which is rather crude. A better approach would be to check the replication link periodically and adjust the sleep interval according to the measured delay. Since everything is already scripted, why not make it a little smarter?
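
A minimal sketch of the delay-aware pause, assuming the delay is read from the Seconds_Behind_Master column that SHOW SLAVE STATUS reports on a replica (NULL there means replication is not running). The function name, the threshold and the scaling rule are illustrative assumptions, not something the original script implements; fetching the value from the replica is left out.

```python
def next_sleep_seconds(delay_seconds, base_sleep=0.5,
                       delay_threshold=5, max_sleep=30.0):
    """Return how long to pause before the next delete batch."""
    if delay_seconds is None:
        # NULL delay: replication is not running, so back off as far as allowed
        return max_sleep
    if delay_seconds <= delay_threshold:
        # The replica is keeping up, so keep the normal pace
        return base_sleep
    # Lengthen the pause in proportion to how far the delay exceeds the threshold
    scaled = base_sleep * (float(delay_seconds) / delay_threshold)
    return min(scaled, max_sleep)
```

The result could replace the fixed SLEEP_SECOND_PER_BATCH in time.sleep(), with the delay re-read every few batches rather than on every iteration.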
