[Reprint] Improve the efficiency of SQLite data insertion is low, slow method

Source: Internet
Author: User
Tags documentation sqlite sqlite database stmt


Reprint Address: http://blog.csdn.net/majiakun1/article/details/46607163, if there is infringement of the original, please leave a message to inform, I will promptly delete. Objective

Because of its simplicity, flexibility, lightness and open source, SQLite database has been applied to small and medium-sized applications more and more. It has even been said that SQLite can be used to replace file read and write operations in the C language. So when I recently wrote a program on remote sensing data processing, I also introduced sqlite in order to improve the structure of the data and to improve the processing power of Big data (SQLite supports 2PB size data). But at first, I found that the efficiency of inserting SQL statements directly is simply appalling. Then constantly check the documents, check the data, only to find a fast "data insertion" road. This paper takes the inserting data as an example, integrates the various methods of improving the efficiency of sqlite in the Internet and the Data book, and gives a complete method to improve the efficiency of sqlite data insertion. (Do not spray the gods!)




1 data
I use the computer is Win7 64-bit system, using VC2010 compilation, SQLite version of 3.7.15.2, computer CPU for the second generation i3 processor, memory 6G. Before you experiment, create a table to insert the data into:
  1. Create table t1 (id integer, x integer, y integer, Weight real)
2 Slow--the most brutal method
The
functions of SQL directly executed in SQLite's API are:
  1. int sqlite3_exec (   sqlite3*,    const char *sql ,    int  (*callback) (void*, int,char**,char**),    < span class= "keyword" >void *,   char **errmsg)   
Insert a string directly using the INSERT statement, Program section code (see the complete code below), as follows:
  1. for (int i=0;i<ncount;++i)
  2. {
  3. Std::stringstream SSM;
  4. ssm<<"INSERT into T1 values (" <<i<< "," <<i*2<<"," <<i/2<<"  , "<<i*i<<") ";
  5. Sqlite3_exec (Db,ssm.str (). C_STR (), 0,0,0);
  6. }
This program is running too slow, I have no time to wait, estimated, basically 7.826/s
3 Medium Speed--Explicitly open transaction
  So-called "transactions" refers to a set of SQL commands that either execute together or are not executed. In SQLite, each call to the sqlite3_exec () function implicitly opens a transaction, and if a data is inserted, the function is called once, and the transaction is repeatedly opened and closed, increasing the IO amount. If you explicitly open the transaction before inserting the data, and then commit it together, the IO efficiency is greatly increased, and the speed of data insertion is fast. To open a transaction, you only need to open and commit the transaction's command before and after the preceding code:

 

  1. sqlite3_exec (Db, " Begin; ", 0,0,0);   
  2. for (int i=0;i <ncount;++i)   
  3. {&NBSP;&NBSP;
  4.      std::stringstream ssm;  
  5.     ssm<< "Insert into t1 values (" <<i<<
  6.      Sqlite3_exec (Db,ssm.str (). C_STR (), 0,0,0);   
  7. }&NBSP;&NBSP;
  8. sqlite3_exec ( Db, "commit;", 0,0,0);   

 

After explicitly opening the transaction, the program runs significantly faster, with an estimated efficiency of 34,095/s, which is about 5,000 times times higher than the original method.


4 High Speed--write sync (synchronous)


I want to use a remote sensing processing algorithm to process the image of 10000*10000, there is a step in the middle to insert 100 million data into the database, if the speed after opening the transaction 34,095/s, you need 100000000÷34095 = 2,932 seconds = 48.9 points, Still not acceptable, so I went on to find a way to increase speed. Finally, in the information about the SQLite configuration, you see the "Write Synchronization" option.

In SQLite, the parameters of the database configuration are implemented by the compilation instructions (pragma), where the synchronous option is available in three selectable states, full, normal, off, respectively. This blog and the official documentation contain details of these three parameters. Briefly, the full write speed is the slowest, but the data is guaranteed to be secure, unaffected by power outages, system crashes, and off can speed up some operations on the database, but if the system crashes or loses power, the database may be damaged.

SQLite3, the default value for this option is full, which is more efficient if we change the data to off before inserting it. If you just use SQLite as a temporary database, there is absolutely no need to set it to full. In the code, the Setup method is to insert the following statement directly after opening the database:

  1. Sqlite3_exec (db,"PRAGMA synchronous = OFF;", 0,0,0);
At this point, after testing, the insertion speed has become 41,851/s, that is, inserting 100 million data, it takes 2,389 seconds = 39.8 points.



5 Speed--Execution preparation


Although write synchronization is set to OFF, there is a slight increase in speed, but it is still relatively slow. Once again I embarked on the road to find ways to improve the efficiency of sqlite insertion. Finally, I found that when SQLite executes the SQL statement, there are two ways: one is to use the function sqlite3_exec () mentioned earlier, which invokes the string containing the SQL statement directly, and the other is "perform preparation" (similar to a stored procedure) operation. That is, the SQL statement is compiled first, and then one step (or one line) to execute. If the former, even if the business, SQLite still want to loop every sentence of SQL statement "lexical analysis" and "parsing", which for the simultaneous insertion of large amounts of data, is simply a waste of time. Therefore, to further improve the efficiency of insertion, the latter should be used. The

Execution preparation is divided into three main steps:

1. Calling functions

  1. int Sqlite3_prepare_v2 (sqlite3 *db, const char *zsql, int nbyte, sqlite3_stmt **ppstmt, const CHA  R **pztail);
It also declares a pointer to the SQLITE3_STMT object that compiles the parameterized SQL statement zsql and stores the compiled state in ppstmt.


2. Call the function Sqlite3_step (), this function is to perform a step (in this case, insert a row), if the function returns a Sqlite_row, the description is still continuing to execute, otherwise the operation has been done;

3. Call function Sqlite3_finalize () to close the statement.


See the official documentation for the specific syntax for executing the prepared API. The C + + code to be prepared in this article is as follows:



At this point the test data insertion efficiency is: 265,816/S, that is, insert 100 million data, need 376 seconds = 6.27 points. This speed has been very satisfactory.





5 Summary





In summary, the most efficient way for SQLite to insert data is: Transaction + turn off write synchronization + Execute preparation (stored procedure), if there is a requirement for database security, turn on write synchronization.



Resources:


1. SQLite Official document: HTTP://WWW.SQLITE.ORG/DOCS.HTML2. "Resolving sqlite3 Insert Data slow problem": http://blog.csdn.net/victoryknight/ Article/details/74617033. "The Definitive Guide to SQLite" Apress Published: Http://www.apress.com/9781430232254 (This is a good book)



The final complete code is attached:


#include <iostream>
#include <string>
#include <sstream>
#include <time.h>
#include "Sqlite3.h"
const int ncount = 500000;
int main (int argc,char** argv)
{
SQLITE3* DB;
Sqlite3_open ("testdb.db", &db);
Sqlite3_exec (db,"PRAGMA synchronous = OFF;", 0,0,0);
Sqlite3_exec (db,"drop table if exists T1", 0,0,0);
Sqlite3_exec (DB,"CREATE TABLE T1 (ID integer,x integer,y integer, Weight real)", 0,0,0);
clock_t t1 = clock ();
Sqlite3_exec (db,"begin;", 0,0,0);
Sqlite3_stmt *stmt;
const char* sql = "INSERT into T1 values (?,?,?,?)";
SQLITE3_PREPARE_V2 (Db,sql,strlen (SQL), &stmt,0);
For (int i=0;i<ncount;++i)
{
//Std::stringstream SSM;
//ssm<< "INSERT into T1 values (" <<i<< "," <<i*2<< "," <<i/2<< "," <<  i*i<< ")";
//Sqlite3_exec (Db,ssm.str (). C_STR (), 0,0,0);
Sqlite3_reset (stmt);
Sqlite3_bind_int (Stmt,1,i);
Sqlite3_bind_int (stmt,2,i*2);
Sqlite3_bind_int (STMT,3,I/2);
Sqlite3_bind_double (Stmt,4,i*i);
Sqlite3_step (stmt);
}
Sqlite3_finalize (stmt);
Sqlite3_exec (db,"commit;", 0,0,0);
clock_t t2 = clock ();
Sqlite3_close (DB);
std::cout<<"cost Tima:" << (T2-T1)/1000.<<"s" <<std::endl;
return 0;
}





[Reprint] Improve the efficiency of SQLite data insertion is low, slow method


Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.