Objective
In the previous articles we explained the index of knowledge, this section we continue to explain the following content, short content, in-depth understanding.
Data type
SQL Server supports two types of character data, one is general and the other is Unicode. General data types include char and varchar,unicode data types including ncahr and nvarchar. Each character of a regular character is stored in 1 bytes, whereas Unicode data requires 2 bytes per character. Regular character columns are limited to English only, while Unicode is for multiple languages. The textual representations of the two character data types are also different, and only single quotes, such as ' hello,my name is Jeffckywang,i ' literals, are required to represent a regular character. Cnblogs ', and for Unicode character literals, you need to specify the character n as the prefix, that is, the N ' hello,my name is Jeffckywang,i ' m from Cnblogs '.
Any data type (CHAR, NCHAR) that does not have a VAR element in the name has a fixed length, that is, SQL Server preserves the row space according to the column definition size instead of preserving space according to the actual characters in the character. For example, if a column definition is char (25), SQL Server retains 25 characters of space in the row, regardless of the length of the stored string.
The data type with the Var element in the name (VARCHAR, NVARCHAR) has a variable length, that is, SQL Server uses as much storage space as possible in the row to store the string, plus two additional byte offset data, based on storage needs. For example, if a column is defined as varchar (25), the maximum number of characters supported at this time is 25, but the amount of storage is actually determined by the actual characters in the string. -Excerpt from the SQL Server-T-C basics Tutorial.
Here about Unicode character data types we need to focus on understanding. Let's first create a table, as follows:
CREATE TABLE unicodetype
(
firstname VARCHAR (5) NOT NULL,
LastName NVARCHAR (5) is not null
);
At this point we insert the data manually, insert normally, as follows:
INSERT dbo. Unicodetype
(FirstName, LastName)
VALUES (' 11111 ',--Firstname-varchar (5)
N ' ah's hair good '--lastname-nvarcha R (5)
)
The characters are completely inserted into the table as follows:
At this time we will FirstName, insert five Chinese try the following:
INSERT dbo. Unicodetype
(FirstName, LastName)
VALUES (' Reach get people ',--Firstname-varchar (5)
N ' reach get people '--lastname-nvar CHAR (5)
)
The following results appear:
That is, if the normal character type is defined as five characters in the above Varvhar, then we insert five Chinese characters and we will be intercepted and of course not inserted. Since the 1 non-English strings mentioned above are equivalent to two bytes, the Chinese occupies 10 bytes at this time and the varchar is only five characters, so a warning appears. Let's insert FirstName into two Chinese two English or a number to see
INSERT dbo. Unicodetype
(FirstName, LastName)
VALUES (' Up to 1 ',--Firstname-varchar (5)
N ' reach get people '--Lastname-nvarchar ( 5)
)
At this point inserted into the warning, because at this time two Chinese characters that is four bytes plus a number byte is just five bytes, so can be inserted normally, we look at LastName, by the knowledge, since English or the number is as a byte, So it should be good for us to insert four Chinese characters and two English bytes into lastname for just 10 bytes. Let's see:
INSERT dbo. Unicodetype
(FirstName, LastName)
VALUES (' Up to 1 ',--Firstname-varchar (5)
N ' reach get let ab '--Lastname-nvarchar (5)
)
Oh,shit, at this time actually made a mistake, as follows:
Our analysis of the above is not a reasonable and reasonable, is not the English language here to occupy a byte, we insert an English try.
INSERT dbo. Unicodetype
(FirstName, LastName)
VALUES (' Up to 1 ',--Firstname-varchar (5)
N ' reach get B '--lastname-nvarch AR (5)
)
The result is correct, practice is the only criterion to test the truth, from here we can see: in the regular character, one Chinese will be used as two bytes, an English will be used as a byte, but in Unicode, one Chinese will be used as two bytes, But an English language will also be used as a two byte. At this point we can conclude that the individual has always thought that in Unicode, English is stored as a byte, the experience is short.
In general characters and Unicode, one Chinese character is stored in two bytes, while in English, regular characters are stored in one byte, while Unicode remains stored in two bytes.
String functions
Functions that manipulate strings are substring, left, right, CHARINDEX, PATINDEX, REPLACE, Repicate, STUFF, UPPER, LOWER, RTRIM, LTRIM, FORMAT. For simple functions We skip, let's talk about a couple of places to watch out for.
A comparison between Len and Datalength
We first create the following test table
CREATE TABLE stringfun
(
firststr VARCHAR (max) not NULL,
secondstr TEXT not null
);
We insert test data
INSERT dbo. Stringfun
(Firststr, Secondstr)
VALUES (' I am jeffckywang, I come from the blog park, Focus on. NET technology ',-Firststr-varchar (max)
' I am jeffckywang, I come from the blog park, Focus on. NET technology '--secondstr-text
We first use the Len function to return the string length size of firststr and SECONDSTR
SELECT LEN (FIRSTSTR) as varcahrfieldsize from
dbo. Stringfun
SELECT LEN (SECONDSTR) as textfieldsize from
dbo. Stringfun
Great, there's been a mistake. The Len function was unable to manipulate text. We went on to look down.
SELECT datalength (FIRSTSTR) as varcahrfieldsize from
dbo. Stringfun
SELECT datalength (SECONDSTR) as textfieldsize from
dbo. Stringfun
The error is not reported at this time, and the result is 47 byte size. Since Len is not valid for text, we do not operate on text.
SELECT LEN (FIRSTSTR) as varcahrfieldsize from
dbo. Stringfun
SELECT datalength (SECONDSTR) as textfieldsize from
dbo. Stringfun
At this point the type of Varcahr firststr byte size is 31, why, see here we must have come to a sudden, in the above we talked about the regular characters in Chinese to a character of two byte size storage, but here actually returns the actual character size, of course, one is stored, one is retrieval, It's still a bit different, and we're not going to store Chinese in varchar. We can draw a conclusion here.
Conclusion: The DATALENGTH function is for text, and Len is for varchar, and invalid text is an error.
Here we have a special value that is not processed, that is null. So the question is, Len and Datalength, what's the size of the length, 0 or 0 Guineas?
We are here to test:
DECLARE @MyVar VARCHAR (a)
SET @MyVar = NULL
IF (len (@MyVar) = 0)
print ' Len of NULL is 0 '
ELSE
Print ' LEN of NULL is null '
The result we have above is that Len of NULL is Null,datalength no longer demonstrates.
Conclusion: The result of Len and datalength for null calculation is NULL.
Let's take a look at a little place where the difference is:
Select LEN (' Jeffckywang ') as ' LEN '
SELECT datalength (' Jeffckywang ') as ' datalength '
Conclusion: Len deletes trailing blanks, and Datalength does not
Comparison between Charindex and Patindex
Both CHARINDEX and PATINDEX string functions are queries that return the starting position of the specified matching string.
We first query a string, which exists in the table as follows:
Use AdventureWorks2012;
Go
SELECT CHARINDEX (' Worn ', documentsummary) as ' CHARINDEX ' from
production.document
WHERE changenumber = ;
Go
SELECT PATINDEX (' Worn ', documentsummary) as ' PATINDEX ' from
production.document
WHERE changenumber = 55 ;
Why did the CHARINDEX function find it, and Patindex did not query it? Now say the difference, both have two parameters, the second is a string to match, but the PATINDEX function must add the percent sign, wildcard character, before or after the matching string, and the CHARINDEX function is not required. as follows:
Use AdventureWorks2012;
Go
SELECT CHARINDEX (' Worn ', documentsummary) as ' CHARINDEX ' from
production.document
WHERE ChangeNumber =;
Go
SELECT PATINDEX ('%worn% ', documentsummary) as ' PATINDEX ' from
production.document
WHERE ChangeNumber = 55;
Conclusion: PATINDEX matching strings must be preceded or followed by a wildcard character, and charindex need not be added.
Summarize
This section focuses on data types in SQL and a few areas to be noted, short content, in-depth understanding, and we'll see you in the next section.
Above is the entire content of this article, I hope the content of this article for everyone's study or work can bring certain help, if there are questions you can message exchange, but also hope that a lot of support cloud Habitat community!