Objective
Previous articles we explained the index of knowledge, this section we continue to explain the following, short content, in-depth understanding.
Data type
SQL Server supports two character data types, one is general and the other is Unicode. General data types include char and varchar,unicode data types including ncahr and nvarchar. Each character of a regular character is stored with 1 bytes, while each character of the Unicode data requires 2 bytes. Regular character columns are restricted to English only, while Unicode is for multiple languages. The text representation of the two character data types is also different, in the case of regular word literals, only single quotes are required, such as ' hello,my name is Jeffckywang,i ' m from Cnblogs ', and for Unicode word literals, You need to specify the character n as the prefix, i.e. n ' hello,my name is Jeffckywang,i ' m from Cnblogs '.
Any data type that does not have a VAR element in the name (CHAR, NCHAR) has a fixed length, that is, SQL Server preserves the row space by the column definition size, rather than by the actual characters in the character. For example, if a column defines a size of char (25), SQL Server retains 25 characters of space in that row, regardless of the length of the stored string.
The data type with the Var element in the name (VARCHAR, NVARCHAR) has a variable length, that is, SQL Server uses as much storage space as possible in the row, plus two additional bytes offset data, depending on the storage needs. For example, if a column is defined as varchar (25), the maximum number of characters supported at this time is 25, but the amount of storage is actually determined by the actual characters in the string. -Excerpt from the SQL Server (t) Basics Tutorial.
Here we need to focus on the Unicode character data types. We are now a table, as follows:
CREATE TABLE Unicodetype (FirstName VARCHAR (5) not NULL, LastName NVARCHAR (5) Not NULL);
At this point we manually insert the data, normal insertion, as follows:
INSERT dbo. Unicodetype (FirstName, LastName) VALUES '11111',--Firstname-varchar ( 5 N' ah's hair good ' --Lastname-nvarchar (5) )
The characters are fully inserted into the table as follows:
At this point we will FirstName, insert five Chinese try the following:
INSERT dbo. Unicodetype (FirstName, LastName) VALUES ' reach get people ',--Firstname-varchar ( 5 ) N' get the Lastname-nvarchar ' --------(5)
The following results appear:
That is, in the general character type as defined in the above Varvhar is five characters, at this time we insert five Chinese characters will be intercepted, of course, also inserted does not go in. Because the above already explicitly said 1 non-English strings equivalent to two bytes, at this time the Chinese occupies 10 bytes, and the varchar is only five characters, so a warning appears. Let's do it again. Insert the FirstName into two Chinese two English or a number to see
INSERT dbo. Unicodetype (FirstName, LastName) VALUES ' up to 1',--Firstname-varchar (5 ) N' get the Lastname-nvarchar ' --------(5)
Now insert into the warning, because at this time two Chinese characters is four bytes plus a number byte is exactly five bytes, so can be inserted normally, we look at LastName, from the know, since the English or the number is treated as a byte, Then we should be able to insert four Chinese characters and two English bytes for LastName to just 10 bytes. Let's see:
INSERT dbo. Unicodetype (FirstName, LastName) VALUES ' up to 1',--Firstname-varchar (5 ) N' up get Jean AB' --Lastname-nvarchar (5) )
Oh,shit, this time unexpectedly error, as follows:
We are not the analysis of the above-mentioned, is it not a byte in English, we insert an English try.
INSERT dbo. Unicodetype (FirstName, LastName) VALUES ' up to 1',--Firstname-varchar (5 N' reach get let B' --Lastname-nvarchar (5) )
The result is correct, practice is the only criterion to test the truth, from here we can see: in the regular character, a Chinese will be used as two bytes, a English will be used as a byte, but in Unicode, a Chinese will be used as two bytes, But an English will also be used as two bytes. At this point we can conclude that the individual has always thought that in Unicode, the English is stored as a byte, the insight is short.
Regular characters and one Chinese character in Unicode are stored in two bytes, while in English, regular characters are stored in one byte, while Unicode is still stored in two bytes.
String functions
Functions for string manipulation are substring, left, right, CHARINDEX, PATINDEX, REPLACE, Repicate, STUFF, UPPER, LOWER, RTRIM, LTRIM, FORMAT. For simple functions we skipped, let's talk about a few points to note.
Comparison between Len and Datalength
We first create the following test table
CREATE TABLE stringfun ( firststr VARCHAR (max) is not null, secondstr TEXT is not null);
We insert the test data
INSERT dbo. Stringfun (Firststr, secondstr) VALUES ' I am jeffckywang, I am from the blog Park, Focus on. NET technology ',--firststr- varchar (max) ' I am jeffckywang, I am from the blog Park, Focus on. NET technology ' --secondstr- text )
We first use the Len function to return the string length size of firststr and SECONDSTR
SELECT LEN (FIRSTSTR) as varcahrfieldsize from dbo. Stringfunselect LEN (SECONDSTR) as textfieldsize from dbo. Stringfun
Great, something went wrong. The Len function is unable to manipulate text. We went on to look down.
SELECT datalength (FIRSTSTR) as varcahrfieldsize from dbo. Stringfunselect datalength (SECONDSTR) as textfieldsize from dbo. Stringfun
The error is not reported at this time and the result is displayed as 47 bytes in size. Since Len is not valid for text, we do not work with text.
SELECT LEN (FIRSTSTR) as varcahrfieldsize from dbo. Stringfunselect datalength (SECONDSTR) as textfieldsize from dbo. Stringfun
At this time the type of Varcahr firststr byte size is 31, why, see here we must understand, in the above we said that the normal character will be a character two byte size in Chinese, but here is actually the actual size of the character, of course, a storage, a retrieval, It's a little different, and we don't store Chinese in varchar. We can draw a conclusion here.
Conclusion: The DATALENGTH function is for text, and Len is for Varvhar, and the text is invalid for error.
Here we have a special value that is not processed, which is null. So the question is, Len and datalength to NULL, what is its length, is it 0 or not 0 Guineas?
Let's test it out:
DECLARE @MyVar VARCHAR (0'LEN of NULL is 0' ' LEN ofnull is null'
The result we get above is that Len of NULL is Null,datalength no longer demonstrates.
Conclusion: The result of Len and datalength for null calculations is NULL.
Comparison between Charindex and Patindex
Both the charindex and PATINDEX string functions are the starting position of the query that returns the specified match string.
Let's start by querying a string that exists in the table as follows:
Use AdventureWorks2012; Goselect CHARINDEX ('Worn'CHARINDEX'55 ; Goselect PATINDEX ('Worn'PATINDEX' ;
Why did the CHARINDEX function find it, and Patindex did not query it? This is the difference between the two, both have two parameters, the second argument is to match the string, but the PATINDEX function must be matched before or after the string to add the percent is the wildcard character, and the CHARINDEX function is not required. as follows:
Use AdventureWorks2012; Goselect CHARINDEX ('Worn'CHARINDEX'55 ; Goselect PATINDEX ('%worn%'PATINDEX' ;
Conclusion: The Patindex matching string must be preceded or followed by a wildcard character, and charindex need not be added.
Summarize
In this section we mainly explain the types of data in SQL and a few areas to be aware of, short content, in-depth understanding, we'll see you next.
SQL Server data type (vii)