Detailed description of data types in SQLite, detailed description of sqlite Data Types
Most SQL database engines (all SQL database engines except SQLite as we know) use strict static types. When static type is used, the value type is determined by its container-the specific column of the stored value.
SQLite uses a more general dynamic type system. In sqls, the Data Type of a value is related to the value itself, rather than its container. The dynamic type system of SQLite is backward compatible with the common static type systems of other database engines. In this sense, SQL statements that work on static databases should work in SQLite in the same way. However, the dynamic type in SQLite allows it to do what traditional strict types of databases cannot do.
1.0 storage and data types
Each value stored in the SQLite database (or the value operated by the database engine) has the following Storage types:
- NULL. The value is NULL.
- INTEGER. The value is a signed INTEGER, which is stored in 1, 2, 3, 4, 6, or 8 bytes based on the value size.
- REAL. The value is a floating point number, which is stored in 8-byte IEEE floating point number.
- The TEXT. value is a TEXT string that is stored using database encoding (UTF-8, UTF-16BE or UTF-16LE.
- BLOB. A value is a data block, which is stored as it is input.
Note that the storage type is more general than the data type. Taking the INTEGER storage type as an example, it contains six INTEGER types with different lengths, which are different on disks. However, as long as INTEGER values are read from the disk to the memory for processing, they are converted to a more general data type (8-byte signed INTEGER ). Therefore, in general, there is no difference between the "storage type" and "Data Type". These two terms can be used interchangeably.
Any column in The SQLite Version 3 database, except the integer primary key column, can be used to store values of any storage type.
Any values in SQL statements, whether embedded into the literal volume of SQL statements or bound to parameters in pre-compiled SQL statements, have an implicit storage type. In the following cases, the database engine converts the data storage type (INTEGER and REAL) to TEXT during query execution.
1.1 Boolean Type
SQLite does not have a separate Boolean storage type, but stores the Boolean values as integers 0 (false) and 1 (true ).
1.2 Date and Time types
SQLite does not have another storage type to store date and time. SQLite's built-in date and time functions can save the date and time as TEXT, REAL, or INTEGER values:
- TEXT ISO8601 string ("YYYY-MM-DD HH: MM: SS. SSS ").
- The number of REAL Confucian days (Julian Day Numbers) is the number of days since noon, January 1, November 24, 4714 BC, according to the previous Gregorian calendar.
- INTEGER Unix time, the number of seconds since 00:00:00 UTC.
The application can select any of these formats to store the date and time, and use the built-in date and time functions to freely convert between these formats.
2.0 type affinity
To maximize compatibility between SQLite and other database engines, SQLite supports the concept of "type affinity" for columns. The Type affinity of a column is the recommended type where the index data is stored in the column. The important idea here is that the type is recommended, not necessary. Any column can still store any type of data. This only allows some columns to selectively use a certain storage type. The preferred storage type of a column is called its "affinity ".
Columns in each SQLite 3 database belong to one of the following types of affinity:
- TEXT
- NUMERIC
- INTEGER
- REAL
- NONE
A column with the TEXT affinity uses the storage type NULL, TEXT, or BLOB to store all data. If the value data is inserted into a column with the TEXT affinity, the data is converted to the TEXT format before being stored.
The value-affinity column may contain values that use all five storage classes. When text data is inserted into a numeric column, the storage type of the text is converted to an integer or a real number (sorted by priority) if the conversion is lossless or reversible. For the conversion between the text and the real number type, if the first 15 important decimal digits are retained, SQLite considers the conversion to be lossless and reversible. If the text cannot be converted to an integer or real number without loss, the value is stored as text. Do not try to convert NULL or BLOB values.
A string may look like a floating point text with a decimal point and/or exponent, but as long as the value can be expressed as an integer, the value affinity will convert it into an integer. Therefore, the string '3. 0e + 5' is stored in an integer 300000 instead of a floating point value 30000.0 in a value affinity column.
A column with integer affinity is consistent with a column with numerical affinity. Only in CAST expressions, the differences between them are obvious.
In addition to forcibly converting an integer value to a floating-point representation, a column with a real-number affinity is consistent with a column with a value affinity (as an internal optimization, in order to reduce space usage, the small floating point value stored in the real-number affinity column without decimal points is written to the disk in integer form. It is automatically converted back to the floating point value when being read. At the SQL level, this optimization is completely invisible and can only be detected by checking the original bits of database files ).
A column with a NONE affinity cannot be converted from one storage type to another, and do not try to forcibly convert it.
Determination of affinity of 2.1 Columns
The affinity of a column is determined by its declared type. The rules are shown in the following order:
1. If the declared type contains the string "INT", it is specified as the integer affinity;
2. If the declared type of a column contains any "CHAR", "CLOB", or "TEXT" string, the column has a TEXT affinity. Note: The VARCHAR type contains "CHAR" and is specified as text affinity;
3. If the declared type of the column contains "BLOB" or the type is not specified, the column has a NONE affinity;
4. If the declared type of a column contains any "REAL", "FLOA", or "DOUB" string, the column has a REAL-number affinity;
5. Otherwise, it will have a numerical affinity.
Note: It is important to determine the order of the column affinity rules. A column with the declared "CHARINT" type will match rules 1 and 2, but rule 1 takes precedence over all columns with integer affinity.
2.2 affinity name instance
The following table shows how many common data type names are implemented from more traditional SQL statements. The five rules described in the previous section are converted to various affinity types. This table only shows a small part of data type names acceptable to SQLite. Note: numeric parameters (such as VARCHAR (255) in parentheses following the type name )) will be ignored by SQLite-SQLite does not impose any restrictions on the length of strings, BLOBs, or numeric values (except for the large global SQLITE_MAX_LENGTH limit ).
Note: Because "INT" at the end of "POINT", a "floating point" Declaration type will be given an integer affinity, not a real affinity. In addition, the "STRING" Declaration type has the value affinity, not the text affinity.
2.3 columns of affinity behavior instances
The following SQL shows how SQLite converts data types using column affinity when values are inserted to a table:
CREATE TABLE t1( t TEXT, -- text affinity by rule 2 nu NUMERIC, -- numeric affinity by rule 5 i INTEGER, -- integer affinity by rule 1 r REAL, -- real affinity by rule 4 no BLOB -- no affinity by rule 3);
-- Values stored as TEXT, INTEGER, INTEGER, REAL, TEXT. (VALUES are stored in the form of text, integer, integer, real number, and text respectively.) insert into t1 VALUES ('123. 0', '2017. 0', '2017. 0', '2017. 0', '2017. 0 '); SELECT typeof (t), typeof (nu), typeof (I), typeof (r), typeof (no) FROM t1; text | integer | real | text -- Values stored as TEXT, INTEGER, INTEGER, REAL, REAL. delete from t1; insert into t1 VALUES (500.0, 500.0, 500.0, 500.0, 500.0); SELECT typeof (t), typeof (nu), typeof (I ), typeof (r), typeof (no) FROM t1; text | integer | real -- Values stored as TEXT, INTEGER, INTEGER, REAL, INTEGER. delete from t1; insert into t1 VALUES (500,500,500,500,500); SELECT typeof (t), typeof (nu), typeof (I), typeof (r), typeof (no) FROM t1; text | integer | real | integer -- BLOBs are always stored as BLOBs regardless of column affinity. delete from t1; insert into t1 VALUES (x '000000', x '000000', x '000000', x '000000', x '000000 '); SELECT typeof (t), typeof (nu), typeof (I), typeof (r), typeof (no) FROM t1; blob | blob -- NULLs are also unaffected by affinityDELETE FROM t1; insert into t1 VALUES (NULL, NULL ); SELECT typeof (t), typeof (nu), typeof (I), typeof (r), typeof (no) FROM t1; null | null
3.0 comparison expression
Like standard SQL, SQLite 3 supports the following comparison operators: "=", "=", "<", "<=", ">", "> = ", "! = "," <> "," IN "," not in "," BETWEEN "," IS ", and" is not ".
3.1 sorting rules
The comparison result is related to the storage type of the operands, and is subject to the following rules:
- The NULL value is smaller than any other value (including another NULL value)
- INTEGER or REAL is smaller than the TEXT or BLOB value. If two integers (or REAL) are compared, the actual values are used.
- TEXT is smaller than BLOB. If two texts are compared, the result is determined by the appropriate sorting order.
- If two blods are compared, the results are consistent with those of memcmp ().
3.2 similarity of operands during comparison
Before comparing values, SQLite tries to convert values between storage class INTEGER, REAL, and/or TEXT. Before comparison, it is entirely dependent on the similarity of the operands that you try not to convert. The Determination Rules of operand similarity are as follows:
- Only the expressions that reference the values in a column have identical similarity with the referenced columns. NOTE: If X and Y.Z represent the column name, + X and + Y.Z can be considered as expressions to determine their similarity.
- "CAST (expr AS type)" indicates that columns with the same expression type defined AS "type" have identical similarity.
- In other cases, the expressions have NONE similarity.
3.3 type conversion before comparison
"Apply affinity" means that when and only when the involved conversions are lossless and reversible, converts an operand to a specific storage type. Before comparison, the similarity rules for the operands of comparison operators are as follows:
- If one of the operands has the similarity of INTEGER, REAL, or NUMERIC, And the other operand has the similarity of TEXT or NONE, apply the NUMERIC similarity to the other operand.
- If one of the operands has the TEXT Similarity and the other has the NONE similarity, it is necessary to apply the TEXT similarity to the other operand.
- In other cases, no similarity is applied, and the two operands are compared as they are.
The expression "a BETWEEN B AND c" is considered as two separate binary comparison operations "a> = B AND a <= c". Even so, this may cause a to be applied with different similarity in the two comparisons. Datatype conversions in comparisons of the form IN the comparison of the form "x in (SELECT y...)", the data type conversion is exactly the same as that of "x = y. Expression "a IN (x, y, z,...)" is equivalent to "a = + x OR a = + y OR a = + z OR. IN other words, the value on the right side of the IN operator (IN this example, "x", "y", and "z") is considered as non-similarity, even if they happen to be values of a column or CAST expressions.
3.4 Comparison example
CREATE TABLE t1( a TEXT, -- text affinity b NUMERIC, -- numeric affinity c BLOB, -- no affinity d -- no affinity); -- Values will be stored as TEXT, INTEGER, TEXT, and INTEGER respectivelyINSERT INTO t1 VALUES('500', '500', '500', 500);SELECT typeof(a), typeof(b), typeof(c), typeof(d) FROM t1;text|integer|text|integer -- Because column "a" has text affinity, numeric values on the-- right-hand side of the comparisons are converted to text before-- the comparison occurs.SELECT a < 40, a < 60, a < 600 FROM t1;0|1|1 -- Text affinity is applied to the right-hand operands but since-- they are already TEXT this is a no-op; no conversions occur.SELECT a < '40', a < '60', a < '600' FROM t1;0|1|1 -- Column "b" has numeric affinity and so numeric affinity is applied-- to the operands on the right. Since the operands are already numeric,-- the application of affinity is a no-op; no conversions occur. All-- values are compared numerically.SELECT b < 40, b < 60, b < 600 FROM t1;0|0|1 -- Numeric affinity is applied to operands on the right, converting them-- from text to integers. Then a numeric comparison occurs.SELECT b < '40', b < '60', b < '600' FROM t1;0|0|1 -- No affinity conversions occur. Right-hand side values all have-- storage class INTEGER which are always less than the TEXT values-- on the left.SELECT c < 40, c < 60, c < 600 FROM t1;0|0|0 -- No affinity conversions occur. Values are compared as TEXT.SELECT c < '40', c < '60', c < '600' FROM t1;0|1|1 -- No affinity conversions occur. Right-hand side values all have-- storage class INTEGER which compare numerically with the INTEGER-- values on the left.SELECT d < 40, d < 60, d < 600 FROM t1;0|0|1 -- No affinity conversions occur. INTEGER values on the left are-- always less than TEXT values on the right.SELECT d < '40', d < '60', d < '600' FROM t1;1|1|1
If the comparison in the example is replaced-for example, "a <40" written "40> a"-All results are still the same.
4.0 Operator
All mathematical operators (+,-, *,/, %, <, >>, &, and |) Put the two operands into the NUMERIC storage class before expansion. Even if this process is lossy and irreversible. A null operand produces a NULL result on a mathematical operator. The operand on the number calculation operator is not regarded as a number, and NULL is not converted to 0 or 0.0.
5.0 sorting, grouping, and Combined Query
When the query results are sorted BY the order by clause, the NULL values of the storage type rank first, followed by integer and REAL data distributed in the numerical ORDER, the second is the TEXT value according to the check sequence, and the last is the BLOB value of memcmp () order. there will be no storage type conversion before sorting.
When grouping using group by clauses, different types of values are considered to be different data, except for INTEGER and REAL values. If they are equal, they are considered to be the same data. no affinity applies to any value in the result of the group by clause.
UNION, INTERSECT, and except t are used for combined queries to perform Implicit Comparison between data. no affinity applies to the comparison of the number of implicitly compared operations-data associated with UNION, INTERSECT, or except t.
6.0 sorting Sequence
When SQLite compares two strings, it uses a sort sequence or sort function (one thing and two tables) to determine which string value is higher when the two strings are the same. SQLite has three built-in sorting functions: BINARY, NOCASE, and RTRIM.
- BINARY-use memcmp () to compare strings without text encoding.
- NOCASE-similar to binary, 26 uppercase letters except ASCII are converted into lowercase letters before comparison. Note that only ASCII characters are converted into uppercase and lowercase letters. Because of the table size requirements, SQLite does not try UTF case conversion.
- RTRIM-similar to binary, except for trailing space characters.
The application can use the sqlite3_create_collation () interface to register additional sorting functions.
6.1 set the order in SQL
Each column in each table has a related sorting function. If no sorting function is explicitly defined, BINARY is used by default. The COLLATE clause in the column definition defines an optional sorting function for the column.
For binary comparison operators (=, <, >,< =, >= ,! =, IS, and is not), the rules for determining which sort function to use are listed in order as follows:
- If either of the two operands has a sorting function explicitly defined by the COLLATE operator with the suffix, this function is used for comparison. If both operands have the same value, the sorting function of the left operand is used first.
- If either of the two operands is a column, the sort function of the column is used for comparison. However, when both operands are columns, the sorting function of the column corresponding to the left operand is used first. For this purpose, if the column name contains one or more unary operators "+", it is still processed by the original column name.
- In other cases, the BINARY sorting function is used for comparison.
Compare the operands in the operation. If the suffix COLLATE operator is used in any of its subexpressions, it can be considered as an explicit sorting function (Rule 1 above ). Furthermore, if the COLLATE operator is used anywhere in the comparison expression, the sorting function defined by this operator will be used for string comparison, no matter which column in the table appears in the expression. If two or more COLLATE operator subexpressions appear anywhere in the comparison, no matter how deep the expressions are embedded or how the expressions use parentheses, all use the explicit sorting function that appears on the left.
The expression "x BETWEEN y and z" is logically equivalent to the comparison operation "x> = y AND x <= z, when using sorting functions, they must be processed just like two separate comparison operations. Expression "x IN (SELECT y ...) "The processing method is exactly the same as the expression" x = y ", such as" x IN (y, z ,...) ", the order is exactly the same as the order of X.
As a part of the SELECT statement, the sorting conditions in the order by clause can also be set BY using the COLLATE operator. If sorting is set, sort BY the set sorting function. Otherwise, if the order by clause uses a column, the ORDER of the column is used to determine the ORDER. If the sorting expression is not a column and there is no COLLATE clause, the BINARY sorting order is used.
6.2 sorting sequence example
The following example identifies the sorting sequence and determines the text comparison result of the SQL statement. Note: During text comparison, if it is a number, binary, or Null value, the sorting sequence may not be used.
Create table t1 (x integer primary key, a,/* collating sequence BINARY */B COLLATE BINARY,/* collating sequence BINARY */c COLLATE RTRIM, /* collating sequence RTRIM */d collate nocase/* collating sequence NOCASE */);/* x a B c d */INSERT INTO t1 VALUES (1, 'abc ', 'abc', 'abc', 'abc'); insert into t1 VALUES (2, 'abc', 'abc '); insert into t1 VALUES (3, 'abc'); INSERT I NTO t1 VALUES (4, 'abc', 'abc');/* The text of a = B is compared to BINARY (BINARY) sort the sequence. */SELECT x FROM t1 WHERE a = B order by x; -- result 1 2 3/* a = B: The text comparison uses RTRIM to sort the sequence. */SELECT x FROM t1 WHERE a = B COLLATE RTRIM ORDER BY x; -- result 1 2 3 4/* d = a: The text comparison shows that the sequence is sorted BY NOCASE. */SELECT x FROM t1 WHERE d = a order by x; -- result 1 2 3 4/* a = d for text comparison, the BINARY (BINARY) sorting sequence is used. */SELECT x FROM t1 WHERE a = d order by x; -- result 1 4/* 'abc' = c: Use RTRIM (Binary) to sort the sequence. */SELECT x FROM t1 WHERE 'abc' = c order by x; -- result 1 2 3/* c = 'abc' text comparison shows that the RTRIM sorting sequence is used. */SELECT x FROM t1 WHERE c = 'abc' order by x; -- result 1 2 3/* The group uses NOCASE to sort the sequence (value 'abc ', 'abc' and 'abc' ** are divided into the same group ). */SELECT count (*) FROM t1 group by d order by 1; -- Result 4/* The GROUP uses BINARY to sort the sequence (value 'abc ', 'abc' and 'abc' ** are divided into different groups ). */SELECT count (*) FROM t1 group by (d | '') order by 1; -- result 1 1 2/* column c is sorted by rtrim. */(Note: sorting or column c is suspected to be sorting... BY mistake) SELECT x FROM t1 order by c, x; -- Result 4 1 2 3/* (c | ''): the BINARY sorting sequence is used. */SELECT x FROM t1 order by (c | ''), x; -- Result 4 2 3 1/* column c is sorted by nocase. */SELECT x FROM t1 order by c collate nocase, x; -- result 2 4 3 1