ASP keyword acquisition (various search engines, GB2312 and UTF-8)

Source: Internet
Author: User
Tags xslt

I don't know why the major search engine encoding is not the same now. Of course not GB2312 is the UTF-8. encoding problem is a headache... headache is not terrible...

We obtain keywords, which are generally analyzed by visiting the url of the page. For example:

Http://www.google.com/search? Hl = zh-CN & q = % E5 % AD % A4 % E7 % 8B % AC & lr =

You must know that this is encoded using urlencode.

We can get the information, which requires two steps. the first step is to perform urldecode. When our common parameters are alive, this is done by ASP, but now we have to perform manual decoding.

There are a lot of online functions, but are aimed at GB2312 page solution GB2312.UTF-8. for this, we can easily first decode, and then judge its encoding according to the search engine, if it is a UTF-8 and then convert to GB2312.

But since my website is UTF-8 page. while the UTF-8 page I found only the urldecode encoding for the characters in the decoding UTF-8. after a long pause, I can only use the worst method to submit the split keywords to an ASP page of GB2312 using xmlhttp, and then garbled code (GB2312) then convert GB2312 to UTF-8.

The main implementation code is as follows.

Public Function GetSearchKeyword (RefererUrl) Search Keyword

If RefererUrl = "" or len (RefererUrl) <1 then exit function

On error resume next

Dim re

Set re = New RegExp

Re. IgnoreCase = True

Re. Global = True

Dim a, B, j

Fuzzy search keyword. This method is fast and has a large range.

Re. pattern = "(word = ([^ &] *) | q = ([^ &] *) | p = ([^ &] *) | query = ([^ &] *) | name = ([^ &] *) | _ searchkey = ([^ &] *) | baidu. *? W = ([^ &] *)"

Set a = re. Execute (RefererUrl)

If a. Count> 0 then

Set B = a (a. Count-1). SubMatches

For j = 1 to B. Count

If Len (B (j)> 0 then

If instr (1, RefererUrl, "google", 1) then

GetSearchKeyword = Trim (U8Decode (B (j )))

Elseif instr (1, refererurl, "yahoo", 1) then

GetSearchKeyword = Trim (U8Decode (B (j )))

Elseif instr (1, refererurl, "yisou", 1) then

GetSearchKeyword = Trim (getkey (B (j )))

Elseif instr (1, refererurl, "3721", 1) then

GetSearchKeyword = Trim (getkey (B (j )))

Else

GetSearchKeyword = Trim (getkey (B (j )))

End if

Exit Function

End if

Next

End If

If err then

Err. clear

GetSearchKeyword = RefererUrl

Else

GetSearchKeyword = ""

End if

End Function

Function URLEncoding (vstrIn)

Dim strReturn, I, thischr

StrReturn = ""

For I = 1 To Len (vstrIn)

ThisChr = Mid (vStrIn, I, 1)

If Abs (Asc (ThisChr) <& HFF Then

StrReturn = strReturn & ThisChr

Else

InnerCode = Asc (ThisChr)

If innerCode <0 Then

InnerCode = innerCode + & H10000

End If

Hight8 = (innerCode And & HFF00) \ & HFF

Low8 = innerCode And & HFF

StrReturn = strReturn & "%" & Hex (Hight8) & "%" & Hex (Low8)

End If

Next

URLEncoding = strReturn

End Function

Function getkey (key)

Dim oreq

Set oreq = CreateObject ("MSXML2.XMLHTTP ")

OReq. open "POST", "http: //" & WebUrl & "/system/ShowGB2312XML. asp? A = "& key, false

OReq. send

Getkey = UTF2GB (oReq. responseText)

End function

Function chinese2unicode (Str)

Dim I

Dim Str_one

Dim Str_unicode

For I = 1 to len (Str)

Str_one = Mid (Str, I, 1)

Str_unicode = Str_unicode & chr (38)

Str_unicode = Str_unicode & chr (35)

Str_unicode = Str_unicode & chr (1, 120)

Str_unicode = Str_unicode & Hex (ascw (Str_one ))

Str_unicode = Str_unicode & chr (59)

Next

Response. Write Str_unicode

End function

Function UTF2GB (UTFStr)

Dim dig, GBSTR

For Dig = 1 to len (UTFStr)

If mid (UTFStr, Dig, 1) = "%" then

If len (UTFStr)> = Dig + 8 then

GBStr = GBStr & ConvChinese (mid (UTFStr, Dig, 9 ))

Dig = Dig + 8

Else

GBStr = GBStr & mid (UTFStr, Dig, 1)

End if

Else

GBStr = GBStr & mid (UTFStr, Dig, 1)

End if

Next

UTF2GB = GBStr

End function

Function ConvChinese (x)

Dim a, I, j, DigS, Unicode

A = split (mid (x, 2), "% ")

I = 0

J = 0

For I = 0 to ubound ()

A (I) = c16to2 (A (I ))

Next

For I = 0 to ubound (A)-1

DigS = instr (A (I), "0 ")

Unicode = ""

For j = 1 to DigS-1

If j = 1 then

A (I) = right (A (I), len (A (I)-DigS)

Unicode = Unicode & A (I)

Else

I = I + 1

A (I) = right (A (I), len (A (I)-2)

Unicode = Unicode & A (I)

End if

Next

If len (c2to16 (Unicode) = 4 then

ConvChinese = ConvChinese & chrw (int ("& H" & c2to16 (Unicode )))

Else

ConvChinese = ConvChinese & chr (int ("& H" & c2to16 (Unicode )))

End if

Next

End function

Function U8Decode (enStr)

Input a heap of strings separated by %, which are first divided into arrays and sorted by UTF-8 rules.

Enter: E5 85 B3 key E9 94 AE word E5 AD 97

Output: Key B9D8, BCFC, D7D6

Dim c, I, i2, v, deStr, WeiS

For I = 1 to len (enStr)

C = Mid (enStr, I, 1)

If c = "%" then

V = c16to2 (Mid (enStr, I + 1, 2 ))

Determine the position where 0 appears for the first time,

It may be 1 (single byte), 3 (3-1 byte), 4, 5, 6, and 7. It cannot be 2 or greater than 7.

Theoretically, 7 is not more than 3.

WeiS = instr (v, "0 ")

V = right (v, len (v)-WeiS) first remove the leftmost WeiS

I = I + 3

For i2 = 2 to WeiS-1

C = c16to2 (Mid (enStr, I + 1, 2 ))

C = right (c, len (c)-2) Remove the two leftmost

V = v & c

I = I + 3

Next

If len (c2to16 (v) = 4 then

DeStr = deStr & chrw (c2to10 (v ))

Else

DeStr = deStr & chr (c2to10 (v ))

End if

I = I-1

Else

If c = "+" then

DeStr = deStr &""

Else

DeStr = deStr & c

End if

End if

Next

U8Decode = deStr

End function

Function c16to2 (x)

This function is used to convert hexadecimal to binary, can be any length, the general conversion of UTF-8 is two length, such as A9

For example, if you enter "C2" and convert it to "11000010", where 1100 is "c" and 12 (1100) in 10 hexadecimal format, then 2 (10) if the number of digits is less than four, it must be merged into (0010 ).

Dim tempstr

Dim I: I = 0 temporary pointer

For I = 1 to len (trim (x ))

Tempstr = c10to2 (cint ("& h" & mid (x, I, 1 ))))

Do while len (tempstr) <4

Tempstr = "0" & tempstr if there are less than four digits, then fill in the four digits

Loop

C16to2 = c16to2 & tempstr

Next

End function

Function c2to16 (x)

The hexadecimal to hexadecimal conversion. Every four 0 or 1 is converted into a hexadecimal letter. The input length cannot be a multiple of 4.

Dim I: I = 1 temporary pointer

For I = 1 to len (x) step 4

C2to16 = c2to16 & hex (c2to10 (mid (x, I, 4 )))

Next

End function

Function c2to10 (x)

The pure binary to 10-hexadecimal conversion does not take into account the four-digit anterior zeros required for the hexadecimal conversion.

This function is useful! It will also be used in the future. People who have worked on communication and hardware should know.

Here, strings are used to represent binary data.

C2to10 = 0

If x = "0" then exit function, if it is 0, it will be done if it is 0.

Dim I: I = 0 temporary pointer

For I = 0 to len (x)-1 otherwise, we will use 8421 code for computation. This will happen when I first learned the computer. I miss Mr. Xie daojian, who taught us how to do this!

If mid (x, len (x)-I, 1) = "1" then c2to10 = c2to10 + 2 ^ (I)

Next

End function

Function c10to2 (x)

10-to-2 Conversion

Dim sign, result

Result = ""

Symbol

Sign = sgn (x)

X = abs (x)

If x = 0 then

C10to2 = 0

Exit function

End if

Do until x = "0"

Result = result & (x mod 2)

X = x \ 2

Loop

Result = strReverse (result)

If sign =-1 then

C10to2 = "-" & result

Else

C10to2 = result

End if

End function

Function URLDecode (enStr)

Dim deStr, strSpecial

Dim c, I, v

DeStr = ""

StrSpecial = "! "" # $ % & () * +,/:; <=>? @ [\] ^ '{| }~ %"

For I = 1 to len (enStr)

C = Mid (enStr, I, 1)

If c = "%" then

V = eval ("& h" + Mid (enStr, I + 1, 2 ))

If inStr (strSpecial, chr (v)> 0 then

DeStr = deStr & chr (v)

I = I + 2

Else

V = eval ("& h" + Mid (enStr, I + 1, 2) + Mid (enStr, I + 4, 2 ))

DeStr = deStr & chr (v)

I = I + 5

End if

Else

If c = "+" then

DeStr = deStr &""

Else

DeStr = deStr & c

End if

End if

Next

URLDecode = deStr

End function

Many codes are online. The author cannot be found.

PS: Now summer vacation is acceptable. for family reasons, I don't want to stay in my city. the senior high school entrance exam has reached the local key points. you do not want to name the city. otherwise, an acquaintance will be recruited. if the school is not in Shandong Province, Can you contact us.

QQ: 32113739

I am very interested in the program, but I have only one or more X members in the Information Olympics. because I think technology should not be reflected in the so-called competition, just as it should not be reflected in meaningless tests. electronic works have also made the provincial level one .. but it is also common to learn... so as long as it is the general focus of the good .. I just don't want to be too close to home.

Now ASP is very skilled, although there are some knowledge defects, such as coding problems (Khan ...), however, the Internet is so large that I don't think I can get the so-called knowledge only in textbooks. I am also reading a book from ASP.net. If your school is a website, you can help me.

I am very enthusiastic about new technologies. Although they are known as people with aesthetic barriers, I want to see that the structure doll program will not vomit blood.

Forget it... paste it again.

Even the development of D Database + ASP-> xml + xslt-> xhtml + css is called CMS.

Http://www.joysou.com

I also used the FCK editor for CSDN. I found that I changed it only today. However, the FCK FIle System occasionally changed it.

This system will be released before the end of the summer vacation. But many friends say there is a problem with ease of use... many people don't know xslt. Khan...

Alas... if you cannot find the school. I may be wandering, maybe it will disappear. of course this is not a threat .. I hate my city, what I see there, and what I have done.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.