Problem
We hit a "table XXX already exists" error while using SQLite in a production environment, even though no such table existed in the DB file. It turned out to be a bug in SQLite's implementation, related to data dictionary consistency. This article discusses SQLite's caching mechanisms and how cache consistency is implemented, in the hope that it helps in understanding how SQLite caching works.
Cache
The cache in SQLite has two main parts: the data dictionary cache and the data page cache. SQLite is a file-based database: all data lives in a single DB file, which is stored in blocks (pages). By default each page is 1024 bytes (the default at the time of writing; releases since 3.12.0 default to 4096 bytes). To avoid disk I/O on every access, SQLite implements a cache layer for these data blocks, the page cache.
The role of the page cache is to cache page data. Besides user data, SQLite also stores metadata, including tables, views, indexes, and triggers. In the database field this metadata is generally called the data dictionary, and it also lives in the DB file. Every statement needs the data dictionary for semantic analysis and execution plan optimization (does the table exist, does the column exist, is an index available, is there a trigger, and so on), so reading it from the DB file every time would be very expensive. You might ask: isn't there a page cache already? Yes, the data dictionary's pages are cached in the page cache too, but page contents are binary and must be parsed into structured data before they can be used. So, to avoid re-parsing the data dictionary every time a statement is parsed, the parsed result is cached and reused, which improves efficiency.
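As a rough illustration, the page size, the page cache limit, and the raw data dictionary rows can all be inspected through Python's standard sqlite3 module. The function name inspect_database and the sample table and index are made up for this sketch, and the values the PRAGMAs return depend on the SQLite build in use.

```python
import sqlite3


def inspect_database():
    """Return (page_size, cache_size, dictionary rows) for a throwaway database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INT)")
    conn.execute("CREATE INDEX idx_t_id ON t (id)")

    # Size of one page in bytes; build dependent.
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    # Page cache limit. A positive value counts pages; a negative value
    # is a budget in KiB (newer builds report it this way).
    cache_size = conn.execute("PRAGMA cache_size").fetchone()[0]
    # The data dictionary as stored on disk: one row per object, with the
    # defining SQL text that has to be parsed before it can be used.
    objects = conn.execute(
        "SELECT type, name, tbl_name, sql FROM sqlite_master ORDER BY name"
    ).fetchall()
    conn.close()
    return page_size, cache_size, objects


if __name__ == "__main__":
    page_size, cache_size, objects = inspect_database()
    print("page_size:", page_size)
    print("cache_size:", cache_size)
    for row in objects:
        print(row)
```

The sql column is the unparsed definition text, which is why a separate, already-parsed dictionary cache pays off.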
Data page cache consistency
The data page cache discussed here corresponds to the buffer pool in MySQL; other databases such as Oracle and SQL Server have similar concepts.
Traditional databases on PCs allocate a buffer pool of a fixed size all at once when the database service starts, according to configured parameters. SQLite uses a lazy allocation strategy instead, allocating only as much as it needs; the default page cache size is 2000 pages, which acts as the upper limit of the cache. The advantage of allocating everything at once is that the memory is contiguous and less prone to fragmentation, while lazy allocation saves memory. Since SQLite is generally used on end-user devices, lazy allocation is probably the better trade-off. SQLite's cache replacement policy is LRU: it keeps the most recently accessed pages and evicts the oldest ones.
In SQLite, each database connection corresponds to a DB handle, and the application operates on the database through it. The page cache hangs off the DB handle as a member, so each DB handle has its own independent cache. This differs from traditional PC databases (in MySQL, for example, all connections share one buffer pool). Since each DB handle has a separate cache, how are the caches kept in sync? Take two connections, Connection1 and Connection2: Connection1 first reads page_a from the file and adds it to its cache, then Connection2 also reads page_a from the file and updates it. When Connection1 reads page_a again, how does it know that its cached page_a is stale and must be re-read from the DB file?
To handle this, SQLite stores a version number for the DB in the header of the DB file. When a SQL statement starts, SQLite reads the version number and compares it with the value it cached; if they differ, the DB file has been modified by someone else, and the connection clears its own cache. Each time a transaction commits, pager_write_changecounter is called to update the version number, which is stored at byte offset 24 of the first page and occupies 4 bytes.
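The combination of lazy allocation and LRU eviction can be sketched with a toy cache. This is only a model of the idea, not SQLite's actual page cache code; the class name LazyLruPageCache and the read_page callback are hypothetical.

```python
from collections import OrderedDict


class LazyLruPageCache:
    """Toy page cache: grows on demand up to max_pages, then evicts LRU."""

    def __init__(self, max_pages, read_page):
        self.max_pages = max_pages   # upper limit, e.g. 2000 pages
        self.read_page = read_page   # callable: page number -> bytes
        self.pages = OrderedDict()   # nothing is allocated up front
        self.misses = 0

    def get(self, page_no):
        if page_no in self.pages:
            # Cache hit: mark the page as most recently used.
            self.pages.move_to_end(page_no)
            return self.pages[page_no]
        # Cache miss: this is where the physical read happens.
        self.misses += 1
        data = self.read_page(page_no)
        self.pages[page_no] = data
        if len(self.pages) > self.max_pages:
            # Over the limit: drop the least recently used page.
            self.pages.popitem(last=False)
        return data


if __name__ == "__main__":
    cache = LazyLruPageCache(2, lambda n: bytes([n]) * 4)
    for page_no in (1, 2, 1, 3):
        cache.get(page_no)
    print(list(cache.pages), cache.misses)
```

Memory is only consumed for pages that were actually read, which is the property that matters on memory-constrained devices.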
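The version number can be observed directly by reading 4 bytes at offset 24 of the DB file. The sketch below assumes the rollback-journal mode and forces journal_mode=DELETE, because in WAL mode the counter might not be incremented on every transaction. The helper names read_change_counter and demo_change_counter are made up for illustration.

```python
import os
import sqlite3
import struct
import tempfile


def read_change_counter(path):
    """Read the 4-byte big-endian change counter at offset 24 of the header."""
    with open(path, "rb") as f:
        f.seek(24)
        return struct.unpack(">I", f.read(4))[0]


def demo_change_counter():
    """Return the counter before and after one committed write."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # isolation_level=None: every statement commits on its own.
        conn = sqlite3.connect(path, isolation_level=None)
        conn.execute("PRAGMA journal_mode=DELETE")
        conn.execute("CREATE TABLE t (id INT)")
        before = read_change_counter(path)
        conn.execute("INSERT INTO t VALUES (1)")  # one more commit
        after = read_change_counter(path)
        conn.close()
        return before, after


if __name__ == "__main__":
    print(demo_change_counter())
```

Any other connection that cached pages while the counter still had the old value sees the new value on its next read and discards its cache.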
Data dictionary cache consistency
The data dictionary discussed here corresponds to the INFORMATION_SCHEMA system tables in MySQL, and the dictionary cache is a structured, in-memory form of that system table information. In SQLite, dictionary information is stored in hash tables (tblHash, idxHash, trigHash, fkeyHash, and so on), and whether an object exists is decided by whether it is present in the corresponding hash table. The openDatabase function initializes the data dictionary by calling sqlite3Init and sets a flag.
As with the data page cache, each DB handle has its own copy of the dictionary cache. Similarly, the SQLite file header holds the data dictionary version at byte offset 40 of the first page, occupying 4 bytes. When a DDL operation is performed (CREATE, DROP, ALTER, and so on), sqlite3ChangeCookie is called to update the dictionary version number (the schema cookie). When a statement is parsed in the prepare phase, finding that an object does not exist triggers a schema cookie check; if the data dictionary is not up to date, sqlite3SchemaClear is called to clear it and the data dictionary is reloaded.
SQLite's data dictionary is also very simple: it consists mainly of the sqlite_master table, in which each object is one row that contains the object's definition. Loading the dictionary really just means parsing those definition statements and adding each object to the hash tables by calling sqlite3EndTable, which is very convenient.
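The schema cookie can be watched in the same way as the change counter, by reading 4 bytes at offset 40 of the file header. This is a minimal sketch with made-up helper names (read_schema_cookie, demo_schema_cookie); it also compares the raw value with what PRAGMA schema_version reports.

```python
import os
import sqlite3
import struct
import tempfile


def read_schema_cookie(path):
    """Read the 4-byte big-endian schema cookie at offset 40 of the header."""
    with open(path, "rb") as f:
        f.seek(40)
        return struct.unpack(">I", f.read(4))[0]


def demo_schema_cookie():
    """Show that DDL changes the schema cookie while plain DML does not."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(path, isolation_level=None)  # autocommit
        conn.execute("PRAGMA journal_mode=DELETE")
        conn.execute("CREATE TABLE t (id INT)")              # DDL
        after_create = read_schema_cookie(path)
        conn.execute("INSERT INTO t VALUES (1)")             # DML only
        after_insert = read_schema_cookie(path)
        conn.execute("CREATE INDEX idx_t_id ON t (id)")      # DDL again
        after_index = read_schema_cookie(path)
        pragma_value = conn.execute("PRAGMA schema_version").fetchone()[0]
        conn.close()
        return after_create, after_insert, after_index, pragma_value


if __name__ == "__main__":
    print(demo_schema_cookie())
```

The insert changes the data pages but leaves the cookie alone, so other connections keep their parsed dictionary; only the two DDL statements invalidate it.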
Summary
As we can see, for both the data page cache and the data dictionary cache, SQLite uses a single version number to track changes. This is simple and practical, but the drawback is that the granularity is very coarse. If the DB is written to frequently, nearly every read leads to physical I/O: modifying table A can force a connection that only accesses table B to clear its cache as well. This may also explain why the page cache is allocated lazily, since that keeps the cost of clearing the cache relatively small. The data dictionary cache is just as coarse: every modification of a table, view, trigger, or other object bumps the data dictionary version. Of course, SQLite is not so naive as to check whether its version is up to date on every SQL execution; it checks only when an object it tries to access does not exist. This reduces the number of reloads to some extent, but it also causes a problem, which brings us back to the original issue.
Back to the problem
We mentioned a SQLite bug at the beginning; here is the full story. Suppose there are two DB handles, A and B, and the following sequence is executed: (1) A: create table t (id int); (2) B: drop table if exists t; (3) A: create table T (id int). The second create reports a "table T already exists" error, even though the table no longer exists. The cause is that in step 3 handle A still finds the table in its cached data dictionary, and because the object is found, the check for whether the data dictionary is up to date is never triggered, which leads to the false error. When reproducing the problem, make sure shared cache mode is turned off, because in shared cache mode all DB handles share one cache. The problem itself is simple, but it took some effort to work out how to reproduce it.
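The three steps can be replayed with two connections from Python's standard sqlite3 module, which does not use shared cache mode unless asked to. This is a sketch rather than the original test case: the function name reproduce_stale_schema is made up, and whether the error actually appears depends on the SQLite version linked into Python, since a newer build may re-check the schema and let the second create succeed. The code therefore reports the outcome instead of assuming one.

```python
import os
import sqlite3
import tempfile


def reproduce_stale_schema():
    """Run A: create, B: drop, A: create again; report what step 3 did."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "repro.db")
        # Two independent handles on one file, each with its own caches.
        a = sqlite3.connect(path, isolation_level=None)
        b = sqlite3.connect(path, isolation_level=None)
        try:
            a.execute("CREATE TABLE t (id INT)")      # step 1: A caches t
            b.execute("DROP TABLE IF EXISTS t")       # step 2: B removes t
            try:
                # Step 3. The different SQL text (T instead of t) also keeps
                # Python's statement cache from reusing the step 1 statement.
                a.execute("CREATE TABLE T (id INT)")
                outcome = "created"
            except sqlite3.OperationalError as exc:
                outcome = "error: %s" % exc
            # Ask handle B which tables exist on disk now.
            tables = b.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            ).fetchall()
        finally:
            a.close()
            b.close()
        return outcome, tables


if __name__ == "__main__":
    print(reproduce_stale_schema())
```

On an affected build the outcome is the "already exists" error together with an empty table list, which is exactly the contradiction described above.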
Exploring SQLite cache consistency, starting from a bug