SQLite overview: temporary files and in-memory databases




I. Seven types of temporary files
A distinguishing feature of SQLite is that a database consists of a single disk file. This simplifies the use of SQLite, because moving or backing up a database is as simple as copying a single file, and it also makes SQLite suitable for use as an application file format. However, although the database itself is stored in a single file, SQLite makes use of many temporary files while processing it.
SQLite currently uses 7 different types of temporary files:
* Rollback journals
* Master journals
* Statement journals
* TEMP databases
* Materializations of views and subqueries
* Transient indices
* Transient databases used by VACUUM



(1) Rollback journal
A rollback journal is a temporary file used to implement atomic commit and rollback. The rollback journal is always located in the same directory as the database file and has the same name as the database file with "-journal" appended. It is usually created when a transaction first starts and deleted when the transaction commits or rolls back. Without the rollback journal, SQLite could not roll back an incomplete transaction, and a system crash or power loss in the middle of a transaction would likely leave the database corrupted. Creating the journal at the start of a transaction and deleting it at the end is the usual pattern, but there are exceptions.
If a crash or power loss occurs in the middle of a transaction, the rollback journal is left behind on disk. The next time an application tries to open the database file, it notices the leftover rollback journal (called a "hot journal") and uses the information in it to restore the database to its state before the incomplete transaction began. This is the basic mechanism by which SQLite implements atomic commit.
If an application puts SQLite into exclusive locking mode with the directive "PRAGMA locking_mode=EXCLUSIVE;", SQLite creates a new rollback journal at the start of the first transaction in the exclusive locking mode session but does not delete it when the transaction ends. The journal may be truncated, or its header may be zeroed (depending on the version of SQLite in use), but the file itself is not deleted until exclusive locking mode is exited.
The creation and deletion of the rollback journal can also be changed with the journal_mode pragma. The default journal mode is DELETE, which deletes the rollback journal at the end of each transaction. PERSIST mode forgoes deleting the journal file and instead overwrites its header with zeros, which prevents other processes from rolling back the journal. This has the same effect as deleting the file, although the file is not actually removed from disk. In other words, PERSIST journal mode behaves the same as exclusive locking mode. OFF mode makes SQLite omit the rollback journal altogether, which disables SQLite's atomic commit and rollback capabilities and makes the ROLLBACK command unavailable. If a transaction running in OFF journal mode is interrupted by a crash or power loss, the database file cannot be recovered and may be corrupted.
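The journal lifecycle described above can be observed from outside SQLite by checking whether the "-journal" file exists. The following is a minimal sketch that assumes Python's standard sqlite3 module as a stand-in for the C API; the function name and the file name demo.db are illustrative and not part of SQLite.

```python
import os
import sqlite3
import tempfile


def journal_exists_during_and_after(journal_mode):
    """Return (journal exists mid-transaction, journal exists after COMMIT)."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")        # illustrative file name
        journal_path = db_path + "-journal"           # naming rule from the text
        # isolation_level=None: the module issues no implicit BEGIN for us.
        conn = sqlite3.connect(db_path, isolation_level=None)
        try:
            conn.execute(f"PRAGMA journal_mode={journal_mode}")
            conn.execute("CREATE TABLE t(x)")
            conn.execute("BEGIN")
            conn.execute("INSERT INTO t VALUES (1)")  # first write of the transaction
            during = os.path.exists(journal_path)
            conn.execute("COMMIT")
            after = os.path.exists(journal_path)
        finally:
            conn.close()
        return during, after


if __name__ == "__main__":
    for mode in ("DELETE", "PERSIST"):
        print(mode, journal_exists_during_and_after(mode))
```

On a typical build, DELETE mode should show the journal present only while the transaction is open, while PERSIST mode leaves the file on disk after the commit.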



(2) Master journal
The master journal is used as part of the atomic commit process when a single transaction modifies multiple databases that have been attached to one database connection with the ATTACH command. The master journal is always located in the same directory as the main database file (the main database file is the one named when the database connection was created by calling sqlite3_open(), sqlite3_open16(), or sqlite3_open_v2()) and has the same name as the main database file with a random suffix appended. The master journal contains the names of all the attached auxiliary databases. It is deleted once the multi-database transaction commits.
The master journal is created only when a database connection has two or more databases attached with ATTACH and a single transaction modifies more than one of those database files. Without the master journal, the commit of a multi-database transaction would be atomic for each database individually but not across the whole set of databases. In other words, if the commit were interrupted by a crash or power loss, the changes to one database might complete while the changes to another were rolled back. The master journal ensures that all changes to all databases are either rolled back together or committed together.
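A multi-database transaction of the kind described above can be sketched as follows, assuming Python's standard sqlite3 module. The function name, file names, and table names are illustrative. The sketch only shows the all-or-nothing outcome of COMMIT and ROLLBACK across two attached files; it does not simulate a crash, which is where the master journal actually matters.

```python
import os
import sqlite3
import tempfile


def multi_db_transaction(commit):
    """Modify two attached databases in one transaction; return both row counts."""
    with tempfile.TemporaryDirectory() as tmp:
        main_path = os.path.join(tmp, "main.db")   # illustrative file names
        aux_path = os.path.join(tmp, "aux.db")
        conn = sqlite3.connect(main_path, isolation_level=None)
        try:
            conn.execute("ATTACH DATABASE ? AS aux1", (aux_path,))
            conn.execute("CREATE TABLE main.t(x)")
            conn.execute("CREATE TABLE aux1.t(x)")
            conn.execute("BEGIN")
            conn.execute("INSERT INTO main.t VALUES (1)")
            conn.execute("INSERT INTO aux1.t VALUES (1)")
            # One COMMIT or ROLLBACK covers the changes to both database files.
            conn.execute("COMMIT" if commit else "ROLLBACK")
            n_main = conn.execute("SELECT count(*) FROM main.t").fetchone()[0]
            n_aux = conn.execute("SELECT count(*) FROM aux1.t").fetchone()[0]
        finally:
            conn.close()
        return n_main, n_aux


if __name__ == "__main__":
    print(multi_db_transaction(commit=True))
    print(multi_db_transaction(commit=False))
```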



(3) Statement journal
A statement journal is used to roll back the partial results of a single SQL statement within a larger transaction. For example, suppose an UPDATE statement tries to modify 100 rows in the database but stops after 50 rows have been modified because of an unexpected condition such as a constraint violation. The statement journal is used to undo the changes to those 50 rows so that the database returns to the state it was in before the statement started.
A statement journal is created only for an UPDATE or INSERT statement that may change multiple rows of the database and may hit a constraint or a RAISE exception within a trigger, so that partial results might need to be undone. If the UPDATE or INSERT is not contained within BEGIN...COMMIT and there are no other active statements on the same database connection, no statement journal is created, because the ordinary rollback journal can be used instead. The statement journal is also omitted if an alternative conflict resolution algorithm is used, for example:


 
UPDATE OR FAIL ...
UPDATE OR IGNORE ...
UPDATE OR REPLACE ...
INSERT OR FAIL ...
INSERT OR IGNORE ...
INSERT OR REPLACE ...
REPLACE INTO ....


Statement journals are given randomized file names, are not necessarily located in the same directory as the main database, and are deleted automatically at the end of the transaction. The size of a statement journal is proportional to the size of the change made by the UPDATE or INSERT statement that created it.
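The statement-level undo described in this section can be seen in the following minimal sketch, which assumes Python's standard sqlite3 module. The table layout and function name are illustrative. An in-memory database is used for brevity, so no journal file appears on disk here, but the visible behavior is the same: the failed UPDATE is undone as a whole while the enclosing transaction and its earlier INSERT survive.

```python
import sqlite3


def failed_update_inside_transaction():
    """Run an UPDATE that violates a CHECK constraint inside BEGIN...COMMIT."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    try:
        conn.execute(
            "CREATE TABLE t(id INTEGER PRIMARY KEY, v INTEGER CHECK (v < 7))")
        conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 1), (2, 5), (3, 6)])
        conn.execute("BEGIN")
        conn.execute("INSERT INTO t VALUES (4, 0)")   # earlier work in the transaction
        failed = False
        try:
            # The row with v = 6 would become 7 and violate the CHECK constraint.
            conn.execute("UPDATE t SET v = v + 1")
        except sqlite3.IntegrityError:
            failed = True
        conn.execute("COMMIT")                        # the transaction is still open
        rows = conn.execute("SELECT id, v FROM t ORDER BY id").fetchall()
    finally:
        conn.close()
    return failed, rows


if __name__ == "__main__":
    print(failed_update_inside_transaction())
```

With the default ABORT conflict handling, no row keeps its incremented value, yet the row inserted before the failing statement is committed.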



(4) TEMP databases
Tables created with the CREATE TEMP TABLE command are visible only to the database connection that executed the command. These TEMP tables, along with any associated indices, triggers, and views, are stored together in a separate temporary database file that is created the first time a CREATE TEMP TABLE command is encountered. This temporary database file also has an associated rollback journal. The TEMP database that stores the TEMP tables is deleted automatically when the database connection is closed with sqlite3_close().
The TEMP database file is very similar to an auxiliary database file added with the ATTACH command, but it has a few special properties. It is always deleted automatically when the database connection is closed. It always uses the pragma settings synchronous=OFF and journal_mode=PERSIST. It cannot be detached with DETACH, and another process cannot attach it with ATTACH. The TEMP database file and its rollback journal are created only if the application uses the CREATE TEMP TABLE command.
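The per-connection visibility of TEMP tables can be checked with a short sketch, assuming Python's standard sqlite3 module. The function name, the table name scratch, and the file name demo.db are illustrative.

```python
import os
import sqlite3
import tempfile


def temp_table_visibility():
    """Create a TEMP table on one connection and probe it from another."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")   # illustrative file name
        conn_a = sqlite3.connect(db_path)
        conn_b = sqlite3.connect(db_path)        # second connection, same file
        try:
            conn_a.execute("CREATE TEMP TABLE scratch(x)")
            conn_a.execute("INSERT INTO scratch VALUES (42)")
            seen_by_a = conn_a.execute("SELECT x FROM scratch").fetchall()
            try:
                conn_b.execute("SELECT x FROM scratch")
                visible_to_b = True
            except sqlite3.OperationalError:     # "no such table" on connection B
                visible_to_b = False
        finally:
            conn_a.close()
            conn_b.close()
        return seen_by_a, visible_to_b


if __name__ == "__main__":
    print(temp_table_visibility())
```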



(5) Materializations of views and subqueries
A query that contains a subquery must sometimes evaluate the subquery separately, store the results in a temporary table, and then run the outer query against the contents of that temporary table. We call this "materializing" the subquery. SQLite's query optimizer tries to avoid materialization, but sometimes it is unavoidable. Each temporary table created by materialization is stored in its own separate temporary file, which is deleted automatically at the end of the query. The size of these temporary tables depends on the amount of data in the materialized subquery.
A subquery on the right-hand side of the IN operator must usually be materialized. For example:


SELECT * FROM ex1 WHERE ex1.a IN (SELECT b FROM ex2);


In the query above, the result of the subquery "SELECT b FROM ex2" is stored in a temporary table (actually a temporary index), which allows SQLite to determine whether a value ex2.b exists by binary search. Once this temporary table has been built, the outer query runs: for each candidate result row, SQLite checks whether ex1.a is contained in the temporary table and, if so, outputs the row.
To avoid creating temporary tables, the query can be rewritten in the following form:


SELECT * FROM ex1 WHERE EXISTS (SELECT 1 FROM ex2 WHERE ex2.b=ex1.a);


If there is an index on the column ex2.b, SQLite version 3.5.4 and later performs this rewrite automatically.
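That the IN form and the EXISTS form select the same rows can be checked with a small sketch, assuming Python's standard sqlite3 module. The sample data and function name are illustrative, and the sketch compares results only; it does not show which form SQLite materializes internally.

```python
import sqlite3


def in_vs_exists():
    """Run the IN form and the EXISTS form of the query on the same data."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE ex1(a)")
        conn.execute("CREATE TABLE ex2(b)")
        conn.executemany("INSERT INTO ex1 VALUES (?)", [(1,), (2,), (3,), (4,)])
        conn.executemany("INSERT INTO ex2 VALUES (?)", [(2,), (4,), (4,), (9,)])
        with_in = conn.execute(
            "SELECT * FROM ex1 WHERE ex1.a IN (SELECT b FROM ex2) ORDER BY a"
        ).fetchall()
        with_exists = conn.execute(
            "SELECT * FROM ex1 WHERE EXISTS "
            "(SELECT 1 FROM ex2 WHERE ex2.b = ex1.a) ORDER BY a"
        ).fetchall()
    finally:
        conn.close()
    return with_in, with_exists


if __name__ == "__main__":
    print(in_vs_exists())
```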
If the right-hand side of the IN operator is a list of values, like this:


SELECT * FROM ex1 WHERE a IN (1,2,3);


The list of values on the right of IN is treated as a subquery that must be materialized. In other words, the query behaves as if it were written:


SELECT * FROM ex1 WHERE a IN (SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3);


When the right-hand side of IN is a list of values, a temporary index is used to hold those values.
A subquery that appears in the FROM clause of a SELECT statement may also need to be materialized. For example:


SELECT * FROM ex1 JOIN (SELECT b FROM ex2) AS t ON t.b=ex1.a;


Depending on the query, SQLite may need to materialize the "(SELECT b FROM ex2)" subquery into a temporary table and then perform the join between ex1 and that temporary table. The query optimizer tries to avoid materializing the subquery by "flattening" the query. In this example the query can be flattened, and SQLite automatically transforms it into:


SELECT ex1.* FROM ex1 JOIN ex2 ON ex2.b=ex1.a;


More complex queries may or may not be flattened to avoid the temporary table. Whether a query can be flattened depends on factors such as whether the subquery or the outer query contains aggregate functions, ORDER BY or GROUP BY clauses, LIMIT clauses, and so on.
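The equivalence that makes flattening possible can be illustrated with a minimal sketch, assuming Python's standard sqlite3 module. The sample data and function name are illustrative. Both queries select ex1.* so that their result columns match, and the sketch compares results only; whether SQLite actually flattens a given query depends on its version and on the query.

```python
import sqlite3


def join_subquery_vs_flattened():
    """Compare a join against a FROM-clause subquery with the flattened join."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE ex1(a)")
        conn.execute("CREATE TABLE ex2(b)")
        conn.executemany("INSERT INTO ex1 VALUES (?)", [(1,), (2,), (3,)])
        conn.executemany("INSERT INTO ex2 VALUES (?)", [(2,), (2,), (3,)])
        nested = conn.execute(
            "SELECT ex1.* FROM ex1 JOIN (SELECT b FROM ex2) AS t ON t.b = ex1.a"
        ).fetchall()
        flat = conn.execute(
            "SELECT ex1.* FROM ex1 JOIN ex2 ON ex2.b = ex1.a"
        ).fetchall()
    finally:
        conn.close()
    # Row order is not specified without ORDER BY, so compare sorted results.
    return sorted(nested), sorted(flat)


if __name__ == "__main__":
    print(join_subquery_vs_flattened())
```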


(6) Transient indices
SQLite uses transient (temporary) indices to implement many SQL language features, including:
* ORDER BY or GROUP BY clauses
* The DISTINCT keyword in aggregate queries
* Compound SELECT statements joined by UNION, EXCEPT, or INTERSECT
Each temporary index is stored in its own temporary file and is deleted automatically at the end of the SQL statement that uses it.
SQLite tries to implement an ORDER BY clause using an existing index. If an index already exists on the specified columns, SQLite walks that index (instead of creating a temporary index) to extract the information it needs and output the result rows in the requested order. If no suitable index is found, the query is evaluated and each row is stored in a temporary index whose key is the set of columns named by ORDER BY. SQLite then goes back and walks the temporary index from beginning to end, outputting each row in the requested order.
For a GROUP BY clause, SQLite sorts the output rows by the specified columns. Each output row is compared with the previous one to see whether it starts a new group. Sorting by the GROUP BY columns works the same way as sorting by the ORDER BY columns: an existing index is used if there is one, and a temporary index is created otherwise.
The DISTINCT keyword on an aggregate query is implemented by creating a temporary index in a temporary file and storing each result row in it. Each new result row is checked against the temporary index and discarded if it is already present.
The UNION operator of a compound query is implemented by creating a temporary index in a temporary file and storing the results of the left and right subqueries in it, discarding duplicate rows. After both subqueries have been evaluated, the temporary index is walked from beginning to end to produce the final output.
The EXCEPT operator of a compound query is implemented by creating a temporary index in a temporary file, storing the results of the left subquery in it, removing the results of the right subquery from it, and then walking the temporary index to produce the final output.
The INTERSECT operator of a compound query is implemented by creating two separate temporary indices in two separate temporary files. The left and right subqueries are evaluated, each into its own temporary index. The two indices are then walked together, and only the rows that appear in both are output.
Note that the UNION ALL operator of a compound query does not itself use a temporary index, although the subqueries on its left and right may each use temporary indices, depending on how they are composed.
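Whether an ORDER BY is served by an existing index or by a temporary one can be inspected with EXPLAIN QUERY PLAN. The following is a minimal sketch assuming Python's standard sqlite3 module; the function names and the index name t_x are illustrative, and the exact wording of the plan text varies between SQLite versions, so the sketch only looks for the phrase "TEMP B-TREE".

```python
import sqlite3


def order_by_plan_details(with_index):
    """Return the detail strings of the query plan for an ORDER BY query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t(x)")
        if with_index:
            conn.execute("CREATE INDEX t_x ON t(x)")   # illustrative index name
        rows = conn.execute(
            "EXPLAIN QUERY PLAN SELECT x FROM t ORDER BY x").fetchall()
    finally:
        conn.close()
    # The human-readable detail text is the last column of each plan row.
    return [row[-1] for row in rows]


def uses_temp_btree(details):
    """True if any plan step reports sorting through a temporary b-tree."""
    return any("TEMP B-TREE" in detail for detail in details)


if __name__ == "__main__":
    for flag in (False, True):
        details = order_by_plan_details(flag)
        print(flag, uses_temp_btree(details), details)
```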


(7) Transient database used by VACUUM
The VACUUM command works by creating a temporary file, rebuilding the entire database into that temporary file, copying the contents of the temporary file back into the original database file, and finally deleting the temporary file. The temporary file created by VACUUM is never larger than the original database file.
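The effect of the rebuild can be seen by comparing the database file size before and after VACUUM. This is a minimal sketch assuming Python's standard sqlite3 module; the function name, file name, and row sizes are illustrative, and auto_vacuum is switched off explicitly so that deleted pages stay in the file until VACUUM runs.

```python
import os
import sqlite3
import tempfile


def vacuum_sizes():
    """Return (file size before VACUUM, file size after VACUUM) in bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")   # illustrative file name
        conn = sqlite3.connect(db_path, isolation_level=None)
        try:
            conn.execute("PRAGMA auto_vacuum=NONE")
            conn.execute("CREATE TABLE t(payload BLOB)")
            conn.execute("BEGIN")
            conn.executemany(
                "INSERT INTO t VALUES (?)",
                [(b"x" * 1000,) for _ in range(1000)])
            conn.execute("COMMIT")
            conn.execute("DELETE FROM t")        # frees pages but keeps the file size
            before = os.path.getsize(db_path)
            conn.execute("VACUUM")               # rebuilds through a temporary file
            after = os.path.getsize(db_path)
        finally:
            conn.close()
        return before, after


if __name__ == "__main__":
    print(vacuum_sizes())
```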



II. The SQLITE_TEMP_STORE compile-time parameter and the temp_store pragma
Rollback journals, master journals, and statement journals are always written to disk, but the other kinds of temporary files may be kept in memory and never written to disk, which reduces the amount of I/O. Whether they are written to disk or kept in memory depends on the SQLITE_TEMP_STORE compile-time parameter, the temp_store pragma set at run time, and the size of the temporary file.
The SQLITE_TEMP_STORE compile-time parameter is a macro (#define) in the source code whose value ranges from 0 to 3 (the default is 1), as follows:
* 0: temporary files are always stored on disk, regardless of the temp_store pragma setting.
* 1: temporary files are stored on disk by default, but this can be overridden by the temp_store pragma.
* 2: temporary files are stored in memory by default, but this can be overridden by the temp_store pragma.
* 3: temporary files are always stored in memory, regardless of the temp_store pragma setting.
The temp_store pragma takes a value from 0 to 2 (the default is 0) and can be set dynamically while the program is running, as follows:
* 0: the storage of temporary files is determined entirely by the SQLITE_TEMP_STORE compile-time parameter.
* 1: if SQLITE_TEMP_STORE selects memory by default, the pragma overrides that choice and temporary files are stored on disk. Otherwise the behavior of SQLITE_TEMP_STORE applies unchanged.
* 2: if SQLITE_TEMP_STORE selects disk by default, the pragma overrides that choice and temporary files are stored in memory. Otherwise the behavior of SQLITE_TEMP_STORE applies unchanged.
To repeat, the SQLITE_TEMP_STORE compile-time parameter and the temp_store pragma affect only temporary files other than the rollback journal and the master journal. Both of these journals are always written to disk.
For both settings, a value that selects in-memory storage by default means that data is written to disk, according to an internal algorithm, only once the temporary file grows beyond a certain threshold. This keeps temporary files from consuming so much memory that other programs are slowed down.
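The interaction between the two settings listed above can be written down as a small decision function. This is a sketch of the rules as stated in this section, not code taken from SQLite, and the function names are hypothetical. The second helper assumes Python's standard sqlite3 module and simply sets the pragma on a connection and reads it back.

```python
import sqlite3


def effective_temp_store(compile_time, pragma):
    """Combine SQLITE_TEMP_STORE (0-3) and temp_store (0-2) per the rules above."""
    if compile_time not in (0, 1, 2, 3) or pragma not in (0, 1, 2):
        raise ValueError("compile_time must be 0-3 and pragma must be 0-2")
    if compile_time == 0:
        return "disk"      # always disk, pragma ignored
    if compile_time == 3:
        return "memory"    # always memory, pragma ignored
    if pragma == 0:
        # Pragma defers to the compile-time default (1 = disk, 2 = memory).
        return "disk" if compile_time == 1 else "memory"
    return "disk" if pragma == 1 else "memory"


def set_and_read_temp_store(value):
    """Set PRAGMA temp_store on a fresh connection and return what it reports."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"PRAGMA temp_store={int(value)}")
        return conn.execute("PRAGMA temp_store").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    for ct in range(4):
        print(ct, [effective_temp_store(ct, p) for p in range(3)])
    print(set_and_read_temp_store(2))
```

Note that the value reported by the pragma is only the run-time request; under compile-time settings 0 or 3 it does not change where temporary files are stored.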



III. Other temporary file optimizations
SQLite keeps a page cache of recently read and written database pages, and this cache is used for temporary files as well as for the main database. So even if a temporary file is designated for disk storage, SQLite does not actually flush it to a disk file until it grows large enough to fill the page cache. Until then it stays in memory. This means that in most cases, when temporary tables and temporary indices are small enough to fit in the page cache, they are never written to disk and no disk I/O occurs. They are flushed to a disk file only when they grow too large to fit in memory.
Each temporary table and index has its own page cache, and the maximum number of database pages it can hold is set by the SQLITE_DEFAULT_TEMP_CACHE_SIZE compile-time parameter. This parameter determines how large a temporary table or index can grow in the page cache before it must be flushed to a disk file. Its default value is 500 pages, and it cannot be changed at run time.




IV. In-memory databases
In SQLite, a database is usually stored in a single disk file. In some cases, however, the database can be kept entirely in memory. The most common way to do this is to pass the special file name ":memory:" as the database file name when calling sqlite3_open(), sqlite3_open16(), or sqlite3_open_v2(), for example:


rc = sqlite3_open(":memory:", &db);


After this call no disk file is created. Instead, a new database is created purely in memory. Because nothing is persisted, the database ceases to exist as soon as the database connection is closed. Note that every :memory: database is distinct from every other: opening two database connections with the file name ":memory:" creates two independent in-memory databases.
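The independence of two ":memory:" connections can be checked with a minimal sketch, assuming Python's standard sqlite3 module, whose connect() function accepts the same special file name. The function and table names are illustrative.

```python
import sqlite3


def memory_databases_are_independent():
    """Open two ":memory:" connections and check that they share nothing."""
    first = sqlite3.connect(":memory:")
    second = sqlite3.connect(":memory:")
    try:
        first.execute("CREATE TABLE t(x)")
        first.execute("INSERT INTO t VALUES (1)")
        rows_in_first = first.execute("SELECT x FROM t").fetchall()
        try:
            second.execute("SELECT x FROM t")
            second_sees_table = True
        except sqlite3.OperationalError:   # "no such table" in the second database
            second_sees_table = False
    finally:
        first.close()
        second.close()
    return rows_in_first, second_sees_table


if __name__ == "__main__":
    print(memory_databases_are_independent())
```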
The file name ":memory:" can be used anywhere a database file name is permitted. For example, it can be used in the ATTACH command to attach an in-memory database to the current connection like any other database:


ATTACH DATABASE ':memory:' AS aux1;


Note that to create an in-memory database, the file name must be exactly ":memory:" with no additional text. A name such as "./:memory:" creates a database backed by a disk file instead. ":memory:" can also be used with file names in URI format, for example:


rc = sqlite3_open("file::memory:", &db);


Or


ATTACH DATABASE 'file::memory:' AS aux1;


If an in-memory database is opened with a URI file name, it can use the shared cache. If it is specified with the unadorned ":memory:" name, the database always has a private cache that is not visible to other connections. With a URI file name, the same in-memory database can be opened by two or more database connections, for example:


rc = sqlite3_open("file::memory:?cache=shared", &db);


Or


ATTACH DATABASE 'file::memory:?cache=shared' AS aux1;


This allows multiple database connections to share the same in-memory database. Naturally, the connections sharing an in-memory database must be in the same process. The in-memory database is deleted automatically when the last connection to it is closed.
If a process needs several distinct but shareable in-memory databases, append the mode=memory query parameter to the URI file name to create a named in-memory database:


rc = sqlite3_open("file:memdb1?mode=memory&cache=shared", &db);


Or


ATTACH DATABASE 'file:memdb1?mode=memory&cache=shared' AS aux1;


An in-memory database named in this way shares its cache only with other connections that use exactly the same name.
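Sharing by name can be sketched as follows, assuming Python's standard sqlite3 module with uri=True and a SQLite build that has shared-cache support enabled. The function name and the second database name memdb2 are illustrative.

```python
import sqlite3


def shared_memory_visibility():
    """Two connections with the same name share data; a different name does not."""
    uri_a = "file:memdb1?mode=memory&cache=shared"
    uri_b = "file:memdb2?mode=memory&cache=shared"   # illustrative second name
    first = sqlite3.connect(uri_a, uri=True, isolation_level=None)
    same = sqlite3.connect(uri_a, uri=True, isolation_level=None)
    other = sqlite3.connect(uri_b, uri=True, isolation_level=None)
    try:
        first.execute("CREATE TABLE t(x)")
        first.execute("INSERT INTO t VALUES (1)")
        rows_via_same_name = same.execute("SELECT x FROM t").fetchall()
        try:
            other.execute("SELECT x FROM t")
            other_sees_table = True
        except sqlite3.OperationalError:   # "no such table" under the other name
            other_sees_table = False
    finally:
        first.close()
        same.close()
        other.close()
    return rows_via_same_name, other_sees_table


if __name__ == "__main__":
    print(shared_memory_visibility())
```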




V. Temporary databases created with an empty file name
When sqlite3_open() is called or the ATTACH command is executed with an empty string as the database file name, a new temporary file is created to hold a temporary database, for example:


rc = sqlite3_open("", &db);


Or


ATTACH DATABASE '' AS aux2;


A different temporary file is created each time. As with in-memory databases, the temporary databases created by two connections are independent of each other. When the connection is closed, the temporary database disappears and its backing file is deleted automatically.
Although a disk file is allocated for each temporary database, in practice a temporary database behaves much like an in-memory database. The only difference is that when a temporary database grows too large, SQLite writes part of its data to the disk file so that more memory remains available for other operations, whereas an in-memory database always keeps all of its data in memory.
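The behavior of the empty file name can be tried with a minimal sketch, assuming Python's standard sqlite3 module, whose connect() function passes the empty string through to SQLite. The function and table names are illustrative.

```python
import sqlite3


def empty_name_databases_are_independent():
    """Open two connections with an empty file name and compare what they see."""
    first = sqlite3.connect("")    # private temporary database
    second = sqlite3.connect("")   # a different private temporary database
    try:
        first.execute("CREATE TABLE t(x)")
        first.execute("INSERT INTO t VALUES (1)")
        rows_in_first = first.execute("SELECT x FROM t").fetchall()
        try:
            second.execute("SELECT x FROM t")
            second_sees_table = True
        except sqlite3.OperationalError:   # "no such table" in the second database
            second_sees_table = False
    finally:
        first.close()              # closing discards the temporary database
        second.close()
    return rows_in_first, second_sees_table


if __name__ == "__main__":
    print(empty_name_databases_are_independent())
```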


