SQL Server intercepts strings and processes Chinese Characters

Source: Internet
Author: User

I. Environment Introduction

SQL ServerPRINT @ VERSIONMicrosoftSQLServer2012-11.0.2100.60 (X64) Feb10201219: 39: 15 Copyright (c) microsoftconfigurationenterpriseedition: Core-basedLicensing (64-bit) onWindowsNT6.1 (Build7601: ServicePack1) operating System ---------------- System Information ------------------ Operating System: Windows 7 Ultimate 64-bit (6.1, Build 7601) Service Pack 1 (7601. win7sp1_gdr.130828-1532) System Model: Aspire E1-471GProcessor: Intel (R) Core (TM) i5-3230M CPU @ 2.60 GHz (4 CPUs ),~ 2.6 GHzMemory: 4096 MB RAM


2. The feature filters out numbers from a bunch of strings containing Chinese characters and numbers.


3. Simulation
First, we prepare the test data. Note that all the data here is analog data with no actual meaning. The statement is as follows:
Create table # temp (name VARCHAR (80); insert into # tempVALUES ('wudaokou shop 123456'); insert into # tempVALUES ('wuyang store 123456 '); insert into # tempVALUES ('yangjiatun 3061 '); insert into # tempVALUES ('shilidi Dian 3062'); insert into # tempVALUES ('zhongguancun Dian 3063 '); insert into # tempVALUES ('beautiful shop 123456'); insert into # tempVALUES ('stone shop 123456'); insert into # tempVALUES ('huangcun shop 123456 '); insert into # tempVALUES ('dongpu Dian 100'); insert into # tempVALUES ('tianhe Dian 100'); insert into # tempVALUES ('People's road Plaza 100 '); insert into # tempVALUES ('Community center 3070 '); insert into # tempVALUES ('zhuhai 3071'); insert into # tempVALUES ('lido 100 '); insert into # tempVALUES ('month 3073 '); insert into # tempVALUES ('old district 3074'); insert into # tempVALUES ('new city 3075 '); insert into # tempVALUES ('shuijinggou 3076 ');


Then, we observe the data and find that the data is regular. The number is a number, which is 4 characters long. The front of the number contains nine characters: store, farm, heart, city, capital, month, district, city, and trench.
We try to use the SQL Server built-in functions Substring, Charindex, Rtrim, and Ltrim to filter out the strings that appear most frequently (store.
The statement is as follows:
SELECT Rtrim (Ltrim (Substring (name, Charindex ('store', name) + 1, Len (name) AS nameINTO # t1FROM # temp

The following describes how to use these functions:


Substring

Returns the part of a character expression that starts at the specified position and has the specified length. The position parameter and the length parameter must evaluate to integers.


Syntax

SUBSTRING (character_expression, position, length)


Arguments

Character_expression

Is a character expression from which to extract characters.

Position

Is an integer that specifies where the substring begins.

Length

Is an integer that specifies the length of the substring as number of characters.


Result Types

DT_WSTR


Charindex
Searches an expression for another expression and returns its starting position if found.

Syntax

CHARINDEX (expressionToFind, expressionToSearch [, start_location])


Arguments
ExpressionToFind
Is a character expression that contains the sequence to be found. expressionToFind is limited to 8000 characters.
ExpressionToSearch
Is a character expression to be searched.
Start_location
Is an integer or bigint expression at which the search starts. If start_location is not specified, is a negative number, or is 0, the search starts at the beginning of expressionToSearch.

Return Types
Bigint if expressionToSearch is of the varchar (max), nvarchar (max), or varbinary (max) data types; otherwise, int.

Rtrim
Returns a character expression after removing trailing spaces.

RTRIM does not remove white space characters such as the tab or line feed characters. unicode provides code points for each different types of spaces, but this function recognizes only the Unicode code point 0x0020. when double-byte character set (DBCS) strings are converted to Unicode they may include space characters other than 0x0020 and the function cannot remove such spaces. to remove all kinds of spaces, you can use the Microsoft Visual Basic. NET RTrim method in a script run from the Script component.

Syntax
RTRIM (character expression)

Arguments
Character_expression
Is a character expression from which to remove spaces.

Result Types
DT_WSTR

Ltrim
Returns a character expression after removing leading spaces.

LTRIM does not remove white-space characters such as the tab or line feed characters. unicode provides code points for each different types of spaces, but this function recognizes only the Unicode code point 0x0020. when double-byte character set (DBCS) strings are converted to Unicode they may include space characters other than 0x0020 and the function cannot remove such spaces. to remove all kinds of spaces, you can use the Microsoft Visual Basic. NET LTrim method in a script run from the Script component.

Syntax
LTRIM (character expression)

Arguments
Character_expression
Is a character expression from which to remove spaces.

Result Types
DT_WSTR

Now, we can see that all the strings containing the store have been filtered out.
SELECT * FROM # t13059306030613062306330643065306630673068 people's road Plaza 3069 community center 3070 Zhuhai City 3071 Lido 3072 Xiaoyue 3073 old district 3074 New City 3075 shuijinggou 3076


Then we process strings that contain fields, hearts, cities, months, districts, cities, and troughs in sequence. The statements and processing results are as follows:
SELECT * FROM # t1WHERE name LIKE n' % [1-region] % 'collate Chinese_PRC_BIN Renmin Road Plaza 3069 community center 3070 Zhuhai 3071 Lido 3072 Xiaoyue 3073 old district 3074 New City 3075 shuijinggou 3076 SELECT Rtrim (ltrim (Substring (name, charindex ('farm ', name) + 1, Len (name )))) AS nameINTO # t2FROM # t1SELECT * FROM # t2WHERE name LIKE n' % [1-region] % 'collate Chinese_PRC_BIN community center 3070 Zhuhai 3071 Lido 3072 Xiaoyue 3073 old district 3074 New City 3075 shuijinggou 3076 SELECT rtrim (Ltrim (Substring (name, charindex ('heart', name) + 1, Len (name )))) AS nameINTO # t3FROM # t2SELECT * FROM # t3WHERE name LIKE n' % [1-Snapshot] % 'collate Chinese_PRC_BIN Zhuhai City 3071 Lido 3072 Xiaoyue 3073 old district 3074 New City 3075 shuijinggou 3076 SELECT Rtrim (Ltrim (Substring (name, charindex ('city', name) + 1, Len (name )))) AS nameINTO # t4FROM # t3SELECT * FROM # t4WHERE name LIKE n' % [1-region] % 'collate Chinese_PRC_BIN Lido 3072 old district 3073 New City 3074 shuijinggou 3075 SELECT Rtrim (Ltrim (Substring (name, charindex ('dual', name) + 1, Len (name )))) AS nameINTO # t5FROM # t4SELECT * FROM # t5WHERE name LIKE n' % [1-region] % 'collate Chinese_PRC_BIN Xiaoyue 3073 old district 3074 New City 3075 shuijinggou 3076 SELECT Rtrim (Ltrim (Substring (name, charindex ('month', name) + 1, Len (name )))) AS nameINTO # t6FROM # t5SELECT * FROM # t6WHERE name LIKE n' % [1-region] % 'collate Chinese_PRC_BIN old district 3074 New City 3075 shuijinggou 3076 SELECT Rtrim (Ltrim (Substring (name, charindex ('region', name) + 1, Len (name )))) AS nameINTO # t7FROM # t6SELECT * FROM # t7WHERE name LIKE n' % [1-Snapshot] % 'collate Chinese_PRC_BIN New City 3075 shuijinggou 3076 SELECT Rtrim (Ltrim (Substring (name, charindex ('city', name) + 1, Len (name )))) AS nameINTO # t8FROM # t7SELECT * FROM # t8WHERE name LIKE n' % [1-Snapshot] % 'collate Chinese_PRC_BIN shuijinggou 3076 SELECT Rtrim (Ltrim (Substring (name, charindex ('channel', name) + 1, Len (name )))) AS nameINTO # t9FROM # t8SELECT * FROM # t9WHERE name LIKE n' % [1-Snapshot] % 'collate Chinese_PRC_BIN -- no record


This is the final processing result. After filtering out the numbers, I can associate them with the database table to obtain the desired data.
SELECT *INTO   #resultFROM   #t9SELECT *FROM   #result name305930603061306230633064306530663067306830693070307130723073307430753076SELECT s.xxx,       s.xxxFROM   xx s       JOIN #result r         ON s.xxx = r.nameWHERE  s.xxx = 0; 


Iv. Summary In this article, the number is actually two core codes. The first one is to use the built-in function of SQL Server to filter out the specified number. The statement is as follows:
SELECT Rtrim (Ltrim (Substring (name, Charindex ('store', name) + 1, Len (name) AS nameINTO # t1FROM # temp


The second is to determine whether it contains Chinese characters. The statement is as follows:
SELECT * FROM # t1WHERE name LIKE n' % [1-Snapshot] % 'collate Chinese_PRC_BIN


Discovering and summarizing these tips at work will help you get twice the result with half the effort.

Good Luck!


Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.