Whether in daily business data processing or database import/export, large amounts of data sometimes need to be inserted. Both the insertion method and the storage engine affect the insertion speed. This article analyzes and compares several methods, in theory and in practice, to make it easier to choose an insertion method in future work.
Insert analysis
The time required to insert a record in MySQL is made up of the following steps; the number in parentheses indicates the approximate proportion of the total cost:
- Connecting: (3)
- Sending the query to the server: (2)
- Parsing the query: (2)
- Inserting the record: (1 x size of record)
- Inserting indexes: (1 x number of indexes)
- Closing: (1)
If we execute a separate SQL statement for every row, the connect and close steps happen only once, but every other step is repeated N times, which is very time-consuming. For example, with the proportions above, inserting N rows one statement at a time pays the send-and-parse cost (2 + 2) N times, while a single multi-row statement pays it only once. There are several ways to optimize:
- Write multiple rows in each INSERT statement and insert rows in batches
- Wrap all the insert statements in a single transaction
- Use LOAD DATA INFILE to import the data
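As a rough sketch, the three optimizations look like this in SQL, using the two-column table `test(id, name)` from the benchmarks below (the file path is a placeholder):

```sql
-- 1. Multi-row insert: one statement carries many rows,
--    so the send/parse overhead is paid once per batch
INSERT INTO `test` (`id`, `name`) VALUES
    (1, 'a'), (2, 'b'), (3, 'c');

-- 2. Transaction: group many single-row inserts into one commit
START TRANSACTION;
INSERT INTO `test` (`id`, `name`) VALUES (4, 'd');
INSERT INTO `test` (`id`, `name`) VALUES (5, 'e');
COMMIT;

-- 3. LOAD DATA INFILE: bulk-load rows from a file on the server
LOAD DATA INFILE '/tmp/test.csv'
INTO TABLE `test`
FIELDS TERMINATED BY ',';
```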
The performance of each method is as follows.
Innodb engine
InnoDB gives MySQL transaction-safe (ACID-compliant) tables with commit, rollback, and crash-recovery capabilities, and it provides row-level locking and foreign key constraints. InnoDB is designed for high-volume database systems, and its CPU efficiency is hard for other disk-based relational database engines to match. Technically, InnoDB is a complete database system running behind MySQL: it maintains its own buffer pool in main memory for caching data and indexes.
Test Environment
Macbook Air 12mid apache2.2.26 php5.5.10 mysql5.6.16
1,000,000 data entries in total (the value of $sum in the test code below)
After the data is inserted, the database size is 38.6 MB (without an index) and 46.8 MB (with an index).
- Single-row inserts, no index: total time 229 s, peak memory 246 KB
- Single-row inserts, with index: total time 242 s, peak memory 246 KB
- Batch inserts, no index: total time 10 s, peak memory 8643 KB
- Batch inserts, with index: total time 16 s, peak memory 8643 KB
- Transaction, no index: total time 78 s, peak memory 246 KB
- Transaction, with index: total time 82 s, peak memory 246 KB
- LOAD DATA, no index: total time 12 s, peak memory 246 KB
- LOAD DATA, with index: total time 11 s, peak memory 246 KB
MyISAM engine
MyISAM was MySQL's default storage engine before version 5.5 (InnoDB is the default in the MySQL 5.6 used here). Its design is simple, and it supports full-text search.
Test Environment
Macbook Air 12mid apache2.2.26 php5.5.10 mysql5.6.16
1,000,000 data entries in total (the value of $sum in the test code below)
After the data is inserted, the database size is 19.1 MB (without an index) and 38.6 MB (with an index).
- Single-row inserts, no index: total time 82 s, peak memory 246 KB
- Single-row inserts, with index: total time 86 s, peak memory 246 KB
- Batch inserts, no index: total time 3 s, peak memory 8643 KB
- Batch inserts, with index: total time 7 s, peak memory 8643 KB
- LOAD DATA, no index: total time 6 s, peak memory 246 KB
- LOAD DATA, with index: total time 8 s, peak memory 246 KB

(MyISAM does not support transactions, so there is no transaction test for this engine.)
Summary
I tested only a small amount of data, but it is enough to show how these insertion methods affect speed. The fastest method is LOAD DATA. It is somewhat more troublesome because it involves writing a file first, but it does well on both memory usage and speed.
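Since LOAD DATA reads from a file, the script must first write the rows out in a format that matches the load statement. A minimal sketch (the path `/tmp/test.csv` is a placeholder; the line terminator matches the `\r\n` the test script writes):

```sql
-- /tmp/test.csv contains one row per line, e.g.:
--   0,testtest
--   1,testtest
LOAD DATA INFILE '/tmp/test.csv'
INTO TABLE `test`
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n'
(`id`, `name`);
```

Listing the target columns explicitly at the end keeps the load correct even if the table later gains extra columns.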
Test Code
```php
<?php
$dsn = 'mysql:host=localhost;dbname=test';
$db = new PDO($dsn, 'root', '', array(PDO::ATTR_PERSISTENT => true));

// Delete the data left over from the previous run
$db->query('DELETE FROM `test`');

// Start timing
$start_time = time();
$sum = 1000000;

// Test option: 1 = single inserts, 2 = batch inserts, 3 = transaction, 4 = LOAD DATA
$num = 1;

if ($num == 1) {
    // Insert one row per statement
    for ($i = 0; $i < $sum; $i++) {
        $db->query("INSERT INTO `test` (`id`, `name`) VALUES ($i, 'testtest')");
    }
} elseif ($num == 2) {
    // Batch insert: accumulate rows and flush every 100,000 rows
    // so each statement stays under max_allowed_packet
    for ($i = 0; $i < $sum; $i++) {
        if ($i == $sum - 1) { // last row: flush whatever has accumulated
            if ($i % 100000 == 0) {
                $values = "($i, 'testtest')";
            } else {
                $values .= ", ($i, 'testtest')";
            }
            $db->query("INSERT INTO `test` (`id`, `name`) VALUES $values");
            break;
        }
        if ($i % 100000 == 0) { // flush the previous batch, start a new one
            if ($i == 0) {
                $values = "($i, 'testtest')";
            } else {
                $db->query("INSERT INTO `test` (`id`, `name`) VALUES $values");
                $values = "($i, 'testtest')";
            }
        } else {
            $values .= ", ($i, 'testtest')";
        }
    }
} elseif ($num == 3) {
    // Wrap all the single-row inserts in one transaction
    $db->beginTransaction();
    for ($i = 0; $i < $sum; $i++) {
        $db->query("INSERT INTO `test` (`id`, `name`) VALUES ($i, 'testtest')");
    }
    $db->commit();
} elseif ($num == 4) {
    // Write the rows to a file, then bulk-load it with LOAD DATA INFILE
    $filename = dirname(__FILE__) . '/test.sql';
    $fp = fopen($filename, 'w');
    for ($i = 0; $i < $sum; $i++) {
        fputs($fp, "$i,'testtest'\r\n");
    }
    fclose($fp);
    $db->exec("LOAD DATA INFILE '$filename' INTO TABLE test FIELDS TERMINATED BY ','");
}

$end_time = time();
echo 'Total time consumed: ', ($end_time - $start_time), " seconds\n";
echo 'Peak memory: ', round(memory_get_peak_usage() / 1000), " KB\n";
?>
```
The above is a performance analysis and comparison of several MySQL data insertion methods. I hope it helps.
Blog Source: http://yansu.org/2014/04/16/insert-large-number-of-data-in-mysql.html