sql-a special string decomposition topic

Source: Internet
Author: User

The subject is not a direct string disassembly,

The scenario is as follows, with a field in the table that represents the set of countries affected by the event, separated by commas, and unfortunately some countries themselves have commas, so how exactly do they find them when normalized?

The following code has certain limitations. But basically enough.

The following code is supported by the analysis function lag and lead as well as the cte,sqlserver2012 and later versions, and Oracle seems to support it more than 10g.

Main ideas:

For the decomposition of a string, you can use the numeric auxiliary table, then cross join brush Copy, and then the root separator where it appears and then cut out the string to the stuff we need. (The recursive CTE I use in the solution to handle finding the corresponding location)

One more step now is to verify and remove parts of the disassembly to the part that does not meet the requirements.

The advantage of using lag and lead is that you do not need to use the self-connection to find the next piece of data.

The principle of the problem is how to connect to the right country, then take long-term, otherwise take short items.

The code is as follows:

 --preparing sample tables and dataDrop Tablemy_countries;Drop TableValid_country; Create TableMy_countries (RIDint, COUNTRY_NAME_CCvarchar( $));Insert  intoMy_countries (RID,COUNTRY_NAME_CC)Values(1,'china,test, public of');Insert  intoMy_countries (RID,COUNTRY_NAME_CC)Values(2,'us, public Of,china,evan, public of'); Create TableValid_country (CIDint, country_namevarchar( -));Insert  intoValid_country (Cid,country_name)Values(1,' China');Insert  intoValid_country (Cid,country_name)Values(2,'test, public of');Insert  intoValid_country (Cid,country_name)Values(3,'Evan, public of');Insert  intoValid_country (Cid,country_name)Values(4,'us, public of');Insert  intoValid_country (Cid,country_name)Values(5,'Evan');--select * from My_countries;--select * from Valid_country;

The correct result is:

 withSplit_country as(SELECTRID,1  asLVL,1  asStartpos,CHARINDEX(',', COUNTRY_NAME_CC+',')-1  asEndpos frommy_countriesUNION  AllSELECTSC. RID,LVL+1  asLvl,endpos+2,CHARINDEX(',', COUNTRY_NAME_CC+',', Endpos+2)-1 frommy_countries CCJOINsplit_country SC onCc. RID=SC. RIDWHERE CHARINDEX(','Cc. Country_name_cc+',', Endpos+2)>0), Cte_country as (SELECTRid,lvl,startpos,endpos,lead (Endpos,1) Over(PARTITION byRIDORDER  byLVL) asNextendpos fromsplit_country), CTE as (SELECTMC. RID,SC.LVL, Case  whenNextendpos is  not NULL  and EXISTS(SELECT *  fromValid_country VCWHEREVc. Country_name= SUBSTRING(Country_name_cc,startpos,nextendpos-Startpos+1)) ThenSUBSTRING(Country_name_cc,startpos,nextendpos-Startpos+1)ELSESUBSTRING(MC. Country_name_cc,startpos,endpos-Startpos+1)END asCountry fromMy_countries MCJOINcte_country SC onMc. RID=SC. RID), Check_valid as (SELECT  Case  when CHARINDEX(',', LAG (Country,1) Over(PARTITION byRIDORDER  byLVL))>0  Then 0 ELSE 1 END  asISVALID,*  fromCTE)SELECTCV. RID,CV. COUNTRY,VC. Cid fromCheck_valid CVJOINvalid_country VC onCV. Country=VC. Country_name andISVALID=1 ORDER  byRID;

Another option, based on the first one, is slightly modified:

 withSplit_country as(SELECTRID,1  asLVL,1  asStartpos,CHARINDEX(',', COUNTRY_NAME_CC+',')-1  asEndpos frommy_countriesUNION  AllSELECTSC. RID,LVL+1  asLvl,endpos+2,CHARINDEX(',', COUNTRY_NAME_CC+',', Endpos+2)-1 frommy_countries CCJOINsplit_country SC onCc. RID=SC. RIDWHERE CHARINDEX(','Cc. Country_name_cc+',', Endpos+2)>0), Cte_country as (SELECTRid,lvl,startpos,endpos,lead (Endpos,1) Over(PARTITION byRIDORDER  byLVL) asNextendpos fromsplit_country), CTE as (SELECTMC. RID,SC.LVL,SUBSTRING(MC. Country_name_cc,startpos,endpos-Startpos+1) asCountry,SUBSTRING(Country_name_cc,startpos,nextendpos-Startpos+1) asCOUNTRY2 fromMy_countries MCJOINcte_country SC onMc. RID=SC. RID)SELECTCTE. RID,VC. COUNTRY_NAME,VC. CID fromCTEJOINvalid_country VC on( Case  when EXISTS(SELECT *  fromValid_country XWHEREX.country_name=Cte. COUNTRY2) ThenCTE. COUNTRY2ELSECte. CountryEND)=VC. country_name;

 

sql-a special string decomposition topic

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.