Exploring the massive data insertion in mysql (qualitative change due to quantitative change) classification: See the Visio diagram
Pay attention to the following points for importing large amounts of data:
Batch import: after importing a batch and finally committing (commit), you can use jdbc (executeBatch) for batch processing, but pay attention to its maximum limit. Otherwise, only part of the SQL statements will be executed, SQL statements that exceed the upper limit are automatically discarded;
Pay attention to memory usage (do not get out of memory, Java heap space;
Try to use a large number of import tools (bcp, sqlLoader) to complete (not tested );
When the data volume is large, it is best not to make too many judgments during import, which will affect the import speed;
Large data volume: whether to add a primary key when creating a table (assuming that the primary key is not a self-increasing ID column), insert data, or add a primary key after inserting data. The main problem of the former is that the insertion speed of a large amount of data slows down, frequent insert operations may lead to database crash. The latter problem is that a large amount of data tables are added with primary keys (repeated records need to be deleted and processed based on mysql's own primary key adding mechanism, slow speed, and frequent operations can also lead to database crash );
If an index exists in the original data table, delete the index first and create an index after importing the data. (the primary key is the unique index column, as described in section 5)
Due to the large data volume, you need to configure mysql's my. ini file. if you do not understand the relationship between parameters in this process, you need to modify a parameter to improve the overall performance of mysql;
In the processing of insert statements, replacing "+" with StringBuilder improves the performance;
Use the preparedstatement supported by jdbc to manually control transactions (split the inserted data into multiple commits ), thus, the optimized program can achieve the high-efficiency insertion of multi-thread Thread concurrency;
Use create table select * from soucret. the table creation method is different from the show create table tablename method. The table created in the latter is different from the original table source. the table structure is the same, but the table structure obtained by the former method does not have a primary key;