Using MySQL partitioned tables in Rails to improve performance
MySQL table partitioning is a simple and effective feature for handling large tables. With it, an application can work efficiently with a large table while changing very little code. However, because of some design choices in Rails ActiveRecord, certain operations do not benefit from partitioning and can become very slow instead, so partitioned tables need extra care.
Here is an example. In the light system there is a table named diet_items. Its main fields are id, schedule_id, meal_order, food_id, weight, calory, and so on. Each record represents one diet item in the daily weight-loss program (diet plus exercise plan) generated for a user. A plan contains more than 10 such records on average, so the data volume is very large: more than 1 million records are expected per day. The table is therefore split by a hash of schedule_id into 60 partitions. The CREATE TABLE statement for diet_items after partitioning is as follows:
CREATE TABLE `diet_items` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `schedule_id` int(11) NOT NULL,
  `meals_order` int(11) NOT NULL,
  `food_id` int(11) DEFAULT NULL,
  ....
  KEY `id` (`id`),
  UNIQUE KEY `index_diet_items_on_schedule_id_and_id` (`schedule_id`, `id`)
)
PARTITION BY HASH (schedule_id)
PARTITIONS 60;
After partitioning, every query against diet_items must include schedule_id. For example, all diet_items of a schedule are fetched with schedule.diet_items, and a single diet_item with a given id is fetched with schedule.diet_items.find(id). Creating a diet_item is not a problem either, because records are built with schedule.diet_items.build(data), which sets schedule_id.
The New Relic logs showed that requests that update or destroy a diet_item were very slow. Closer analysis showed that both operations are slow because the SQL generated by ActiveRecord does not contain schedule_id. For an update, ActiveRecord generates a statement like update diet_items set ... where id = <id>; for a destroy, it generates a statement like delete from diet_items where id = <id>. Without schedule_id, MySQL has to scan all 60 partitions to execute a single statement, which is very slow.
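To make the cost difference concrete, here is a minimal Ruby sketch of how a HASH-partitioned table picks partitions. It assumes MySQL's documented rule for PARTITION BY HASH on an integer column, MOD(value, number of partitions). The method name partitions_to_scan and the sample ids are hypothetical, and the model ignores everything else the optimizer does.

```ruby
PARTITION_COUNT = 60 # matches PARTITIONS 60 in the table definition

# Returns the partition numbers a statement has to visit, given the
# columns its WHERE clause fixes by equality (passed as a Hash).
def partitions_to_scan(where, partition_count: PARTITION_COUNT)
  if where.key?(:schedule_id)
    # The partition key is known, so only one partition can hold the row.
    [where[:schedule_id] % partition_count]
  else
    # No partition key in the condition: every partition must be checked.
    (0...partition_count).to_a
  end
end

# WHERE id = 12345 (what ActiveRecord generates for update and destroy)
puts partitions_to_scan({ id: 12_345 }).size
# WHERE id = 12345 AND schedule_id = 1234
puts partitions_to_scan({ id: 12_345, schedule_id: 1_234 }).inspect
```

The first call reports 60 partitions and the second reports the single partition 34, which is the gap between the slow and the fast statements described here.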
Once the cause is known, the fix is simple: replace the original update and destroy calls with custom versions that include schedule_id.
Change diet_item.update(attributes) to DietItem.where(id: diet_item.id, schedule_id: diet_item.schedule_id).update_all(attributes)
Change diet_item.destroy to DietItem.where(id: diet_item.id, schedule_id: diet_item.schedule_id).delete_all
The generated SQL statements now all carry the schedule_id condition, so MySQL no longer scans every partition, and performance improves immediately.
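If these replacements are needed in many places, they could be wrapped in a small module. The following is only a sketch: PartitionScopedWrites, partition_scope, update_in_partition and delete_in_partition are hypothetical names, not part of Rails, and the module assumes it is included in an ActiveRecord model that has id and schedule_id columns.

```ruby
# Hypothetical helper, not part of Rails. Meant to be included in an
# ActiveRecord model whose table is partitioned by schedule_id.
module PartitionScopedWrites
  # Relation matching exactly this row, with the partition key in the
  # WHERE clause so that MySQL only has to touch one partition.
  def partition_scope
    self.class.where(id: id, schedule_id: schedule_id)
  end

  # Issues UPDATE ... WHERE id = ? AND schedule_id = ?
  # update_all skips validations and callbacks.
  def update_in_partition(attributes)
    partition_scope.update_all(attributes)
  end

  # Issues DELETE ... WHERE id = ? AND schedule_id = ?
  # delete_all skips callbacks.
  def delete_in_partition
    partition_scope.delete_all
  end
end

# Usage in the application (assumes the DietItem model from this article):
#   class DietItem < ActiveRecord::Base
#     include PartitionScopedWrites
#   end
#   diet_item.update_in_partition(weight: 120)
#   diet_item.delete_in_partition
```

One trade-off to keep in mind: update_all and delete_all write straight to the database, so validations, callbacks and automatic timestamp updates that diet_item.update and diet_item.destroy would run are bypassed. If the model relies on any of them, that logic has to be invoked explicitly.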