In many cases, when we use big data, we will find that although the implementation of the SQLite database is very fast, but can not meet our needs, it is easy to consider the use of concurrent access to the SQLite database, However, the unique mechanism of SQLite data will allow us to encounter various problems in the use, such as deadlock, error and so on. In the afternoon to detail the business of SQLite, to understand that sqlite transactions for our concurrent operation of the SQLite database is of great help.
The preliminary knowledge of this article
Let us first understand the basic process of performing transactions in SQLite, the state change process, and then analyze how to use the more excellent. The status of the locks defined by SQLite is as follows:
- UNLOCK: The initial state, without any locks on the database;
- shared: Share state, allow to read data, but not write and modify, at the same time allow more than one shared, shared lock is only for the operating system disk cache;
- RESERVED: This lock means that the process is going to write to the database. There can be only one reserved lock at a time, but reserved and shared locks can coexist, and a new shared lock can be added to the database. This state is introduced in order to improve the concurrency, in this state can first modify the cache data, until the changes are written to the disk when the addition of an exclusive lock;
- EXCLUSIVE: The process of actually writing data to disk, at this time not allowing any other write read operation, is an exclusive lock;
- PENDING: A process that can be understood as an intermediate state, from a small state to a limit of high state changes. For example, the transition from reserved to exclusive requires this state, waiting for the existing read-write connection to complete before entering exclusive.
The transaction changes between lock states during execution, as shown in:
Screen Shot 2017-03-28 at 10.55.14 pm.png
SQLite One transaction process
One transaction. png
The whole detailed process to see ?? Over here
In general, Reserved Lock and log files are one by one corresponding. An integrity check is done when pager first opens the database. If a log file is found but does not have reserved Lock, the database enters recovery mode.
Recovery.png
After entering recovery mode, it is directly from the shared state to the pending state. There is no other action before the database connection recovers the database successfully.
According to the normal business, one operation will probably go through:
- One-time file creation (rollback log)
- Two writes (modification of raw data to rollback log/data page before modification---Data page in system cache)
- Two flush files (rollback logs and database changes rushed to the local disk)
- Rollback LOG Delete Once
- Three times plus lock
We can optimize it based on the actual usage scenario.
Optimization points:
- You need to bulk update data, you can use transactions explicitly
- Disk synchronization Mode
- Set up an efficient logging mode
- Use transactions to avoid deadlocks
An explicit transaction
The creation and shutdown of a new transaction is very expensive because it needs to open, modify, and close the log file. By default, when you call SQLite statement to execute an SQL statement, a transaction is created, and the transaction is automatically closed after the statement is executed. So if we execute a lot of SQL in succession, we will create and close transactions constantly, which is very wasteful and has a very large impact on performance. In this case, we can use the BEGIN TRANSACTION and end transaction to help select the timing of transaction creation and submission, for example:
sqlite_exec(sqlitedb, "BEGIN TRANSACTION;",...); ... 执行N条SQL ... sqlite_exec(sqlitedb, "END TRANSACTION;",...);
After the begin end, n SQL creates only one transaction, and without it it will open n transactions to complete, the effect can be imagined.
Disk synchronization Mode
SQLite writes the data to the system (osbuffers) by the system to write to the disk, but in this process may have a power loss or write failures and other anomalies, if SQLite does not wait for the system to execute the results, it may be mistaken that the operation was successful, but in fact the data is inconsistent. For this scenario, SQLite offers 3 ways to sync:
PRAGMA synchronous = (0 | OFF) | (1 | NORMAL) | (2 | FULL)
In full mode, the SQLite database engine will always pause to make sure that the data has been written to disk, which ensures that the system crashes or the restart data is not corrupted after power-down, very safe but slow. In normal mode, the SQLite database engine is paused in most cases, but not as frequent in full mode, which is faster than full mode, but there is a minimum probability that the database will be corrupted when the system is down or fails. In off mode, SQLite does not wait for the result after submitting the data to the system, which means that the flush file operation is reduced two times during a transaction. In this mode, if the system crashes or becomes abnormal while writing, the database can be corrupted, but some operations can be an order of magnitude faster than full under this mode.
Synchronous_off.png
The default is normal, if the security requirements are very high, you can choose the full mode, if the pursuit of efficiency and do not mind the database corruption (such as regular database automatic backup, corruption can still be restored), you can choose off.
Set up an efficient logging mode
The log file is a key thing for SQLite to implement rollback. By default, SQLite brushes the modification log into the disk before writing the modification to the disk, then writes the modified page to the disk, and then cleans up the log after the write is completed. If Crash,sqlite is able to recover from the log file on the next boot, during the write process. However, this increases the additional disk read and write overhead, which affects the overall transaction execution time. However, SQLite provides a variety of log modes, which can be set by the following commands:
PRAGMA journal_mode = DELETE | TRUNCATE | PERSIST | MEMORY | WAL | OFF
Delete is the default, which is to delete the log file after the transaction executes, truncate the file contents without deleting the file (on many systems, this is faster than deleting the file), and the persist method does not delete the file, but instead, the Length field in the file header is set to 0. On some platforms this will be better than the previous two; memory means that the log is placed directly in the RAM, without disk storage, so that the speed will be very fast but if the outage, the log will be lost, data may be destroyed unrecoverable; WAL (Write-ahead Logging) is a Sqlite3.7 after the model, the principle is that the modification is not directly written to the database file, but written to another file called Wal, in the subsequent time is written back to the database file, this way can improve the concurrency of the transaction, but once entered this mode cannot change the page size, cannot be Read-only open the database, and all programs that access the database must be on the same host and support shared memory technology, for the read many write less scenes will be slower, such as this frequently read-write app, very suitable for wal;off is the ability to completely disable the rollback log.
In general, can choose truncate or persist mode, there will be performance help; For those scenarios where real-time requirements are high but data consistency is not very high, you can choose Memory mode 3.7 or more versions, consider the WAL mode if the amount of data modified is not particularly large or if it is not a scene that reads many writes less.
Use transactions to avoid deadlocks
The database file is locked when the transaction is created, so you need to pay attention to end the transaction in a multithreaded situation, otherwise it will affect other operations. Although SQLite has a mechanism to prevent deadlocks (the principle is to retry a finite number of times when acquiring a lock, to return a sqlite_busy error), and to avoid the program from dying, the following phenomenon is not what we want to see:
Deadlock.png
Both session A and session B fail, and this problem can be avoided by selecting the appropriate transaction type.
3 Types of transactions
- DEFERRED
- IMMEDIATE
- EXCLUSIVE
We can specify by following the BEGIN command:
BEGIN [ DEFERRED | IMMEDIATE | EXCLUSIVE ] TRANSACTION
Deferred is the default type, the transaction start will not acquire any locks, starting from the unlocked state, until the transaction needs to read or write to the database to obtain the corresponding lock, immediate start to try to obtain a reserved lock, to ensure that no other connection can write the database, But other connections can read the database, which means it blocks other connections from begin immediate or begin EXCLUSIVE, while EXCLUSIVE transactions try to get a EXCLUSIVE lock on the database, and once successful, The exclusive transaction guarantees that there is no other connection, so you can read and write to the database.
In the example, session A and session B both need to write the database, and if both are created with the immediate transaction selected, then this failure will not happen.
Thanks a lot, vedon_fu.
Links: http://www.jianshu.com/p/9d0a7d3e5001
Source: Pinterest
Copyright belongs to the author. Commercial reprint please contact the author for authorization, non-commercial reprint please specify the source.
SQLite Knowledge Digest---Transaction