Detailed description of the data type in SQL Server, SQL Server

Source: Internet
Author: User

Detailed description of the data type in SQL Server, SQL Server

Preface

In the previous articles, we have explained the index-related knowledge. In this section, we will continue to explain the following content, including brief content and in-depth understanding.

Data Type

SQL Server supports two character data types: Regular and Unicode. Conventional data types include CHAR and VARCHAR, and Unicode data types include NCAHR and NVARCHAR. Each character of a regular character is stored in 1 byte, while each character of Unicode data requires 2 bytes. Regular character columns are limited to English only, while Unicode is for multiple languages. The text representation of the two character data types is also different. When representing the regular character text, you only need to use single quotes, such as 'hello, my name is JeffckyWang, i'm from cnblogs, and for Unicode character text, you need to specify the character n as the prefix, that is, n'hello, my name is JeffckyWang, I'm from cnblogs '.

Any data type (CHAR and NCHAR) with no VAR element in the name has a fixed length, that is, SQL Server retains row space according to the column definition size, rather than the actual characters in the character. For example, if a column is defined as CHAR (25), SQL Server reserves 25 characters in the row regardless of the length of the stored string.

The data type (VARCHAR and NVARCHAR) containing the VAR element in the name has a variable length, that is, the SQL Server uses as many storage strings as possible in the row according to the storage requirements, two additional bytes of offset data are added at the same time. For example, if a column is defined as VARCHAR (25), the maximum number of characters supported is 25, but the storage size is determined based on the actual characters in the string. -Excerpt from SQL Server 2012 T-SQL basics tutorial.

Here we need to understand the Unicode character data type. Create a table first, as shown below:

CREATE TABLE UnicodeType( firstname VARCHAR(5) NOT NULL, lastname NVARCHAR(5) NOT NULL);

In this case, we manually insert the data and insert the data normally as follows:

INSERT dbo. UnicodeType (firstname, lastname) VALUES ('000000', -- firstname-varchar (5) n', send a good response '-- lastname-nvarchar (5 ))

All characters are completely inserted into the table, as shown below:

In this case, insert firstname into five Chinese characters and try the following:

INSERT dbo. UnicodeType (firstname, lastname) VALUES ('get people ', -- firstname-varchar (5) n' to get people' -- lastname-nvarchar (5 ))

The following result is displayed:

That is to say, the regular character type such as VARVHAR is defined as five characters. At this time, if we insert five Chinese characters, it will be truncated, and of course it will not be inserted. Because the preceding statement clearly states that a non-English string is equivalent to two bytes, the Chinese character occupies 10 bytes, and VARCHAR only has five characters, so a warning is reported. Let's Insert the firstname into two Chinese characters and two English letters or numbers.

INSERT dbo. UnicodeType (firstname, lastname) VALUES ('reaching 1', -- firstname-varchar (5) n' to bring people '-- lastname-nvarchar (5 ))

This is a warning, because at this time, two Chinese characters, that is, four bytes plus a numeric byte, are inserted normally. Let's take a look at lastname, since English or numbers are treated as one byte, it is recommended that lastname be inserted with four Chinese characters and two English bytes with exactly 10 bytes. Let's take a look:

INSERT dbo. UnicodeType (firstname, lastname) VALUES ('reaching 1', -- firstname-varchar (5) n' to get AB '-- lastname-nvarchar (5 ))

Oh, shit, an error occurred at this time, as shown below:

Isn't the above analysis justified? Isn't English a byte here? Let's try inserting an English one.

INSERT dbo. UnicodeType (firstname, lastname) VALUES ('reaching 1', -- firstname-varchar (5) n' to achieve B '-- lastname-nvarchar (5 ))

The result is correct. practice is the only criterion for verifying truth. From this we can see that a Chinese character is used as two bytes, an English is used as a byte, but in Unicode, a Chinese character is used as two bytes, but an English character is also used as two bytes. So far, we can draw a conclusion that I have always thought that in Unicode, English is stored as a byte.

A common character and a Chinese character in Unicode are stored in two bytes. For English, a common character is stored in one byte, while Unicode is still stored in two bytes.

String Functions

Functions for string operations include SUBSTRING, LEFT, RIGHT, CHARINDEX, PATINDEX, REPLACE, REPICATE, STUFF, UPPER, LOWER, RTRIM, LTRIM, and FORMAT. For simple functions, let's take a look at some important points.

LEN comparison with DATALENGTH

First, create the following test table.

CREATE TABLE StringFun( firststr VARCHAR(max) NOT NULL, secondstr TEXT NOT NULL);

We insert Test Data

INSERT dbo. stringFun (firststr, secondstr) VALUES ('I am JeffckyWang, I am from the blog Park and focus on it. NET technology ', -- firststr-varchar (max)' I am JeffckyWang, I am from the blog Park, focused on. NET technology '-- secondstr-text)

We first use the LEN function to return the length of the firststr and secondstr strings.

SELECT LEN(firststr) AS VARCAHRFieldSize FROM dbo.StringFunSELECT LEN(secondstr) AS TEXTFieldSize FROM dbo.StringFun

Great. An error occurred. LEN function cannot operate on TEXT. Let's look down.

SELECT DATALENGTH(firststr) AS VARCAHRFieldSize FROM dbo.StringFunSELECT DATALENGTH(secondstr) AS TEXTFieldSize FROM dbo.StringFun

No error is reported. The result is 47 bytes. Since LEN is invalid for text, we do not operate on text.

SELECT LEN(firststr) AS VARCAHRFieldSize FROM dbo.StringFunSELECT DATALENGTH(secondstr) AS TEXTFieldSize FROM dbo.StringFun

At this time, the firststr of the type VARCAHR has 31 bytes. Why? We may suddenly realize that the general characters mentioned above will be stored in one character or two bytes for Chinese characters, however, the actual character size is actually returned here. Of course, the actual character size is stored, and the retrieval is a bit different. At the same time, Chinese characters are not stored in VARCHAR. Here we can draw a conclusion.

Conclusion: The DATALENGTH function is for TEXT and LEN is for VARCHAR. If the function is invalid for TEXT, an error is returned.

Here we still have a special value that is not processed, that is, NULL. So the question is: What is the length of LEN and DATALENGTH for NULL? Is it 0 or not 0?

Let's test:

DECLARE @MyVar VARCHAR(10)SET @MyVar = NULLIF (LEN(@MyVar) = 0)PRINT 'LEN of NULL is 0'ELSEPRINT 'LEN of NULL is NULL'

The above result is LEN of NULL is NULL, and DATALENGTH will not be demonstrated.

Conclusion: LEN and DATALENGTH calculate NULL.

Let's take a look at a small difference between the two:

SELECT LEN('JeffckyWang ') AS 'LEN'SELECT DATALENGTH('JeffckyWang ') AS 'DATALENGTH'

Conclusion: LEN deletes trailing spaces, but DATALENGTH does not.

CHARINDEX and PATINDEX comparison

Both CHARINDEX and PATINDEX string functions query and return the starting position of the specified matched string.

Query a string that exists in the table as follows:

USE AdventureWorks2012;GOSELECT CHARINDEX('Worn', DocumentSummary) AS 'CHARINDEX'FROM Production.DocumentWHERE ChangeNumber = 55;GOSELECT PATINDEX('Worn', DocumentSummary) AS 'PATINDEX'FROM Production.DocumentWHERE ChangeNumber = 55;

Why is the CHARINDEX function not found while the PATINDEX function? Now let's talk about the difference between the two. Both have two parameters. The second parameter is a string to be matched, but the PATINDEX function must add a percent sign (wildcard) before or after the string to be matched, the CHARINDEX function is not required. As follows:

USE AdventureWorks2012;GOSELECT CHARINDEX('Worn', DocumentSummary) AS 'CHARINDEX'FROM Production.DocumentWHERE ChangeNumber = 55;GOSELECT PATINDEX('%Worn%', DocumentSummary) AS 'PATINDEX'FROM Production.DocumentWHERE ChangeNumber = 55;

Conclusion: wildcard characters must be added before, after, or before, the PATINDEX match string. CHARINDEX does not need to be added.

Summary

This section describes the data types in SQL and some notes, including brief content and in-depth understanding.

The above is all the content of this article. I hope the content of this article will help you in your study or work. If you have any questions, you can leave a message and share it with us!

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.