Collection Principles: Collection Technology (XMLHTTP)

Source: Internet
Author: User

Content collection has been a hot topic recently, from news "thief" and music "thief" scripts to news collection and Flash collection. Many people are interested in it, so I am also writing a collection program of my own, called the intention collection program. Here I will talk about the technologies used for collection.

XMLHTTP is not a very advanced technology, and I will only touch on the few parts of it that collection needs.
If you want to learn more, search for XMLHTTP at www.google.com, where you will find more help. If you have any questions, post them on the forum.

This article describes only how to fetch data from the Internet. It does not cover processing the data.

First: XMLHTTP technology

http://www.0579.info/study/exploitation/net/58685.htm

The article at the address above describes the basic principles in detail, but we generally do not need to know much about them at the beginning. Being able to use the component is enough, and when that is no longer enough, there is still time to look up the relevant documents.

First, we need to create an XMLHTTP object.
Microsoft has released many versions of the XMLHTTP component. I know of the following:

"Msxml2.ServerXMLHTTP.4.0"
"Msxml2.ServerXMLHTTP.3.0"
"Msxml2.ServerXMLHTTP"
"Msxml2.XMLHTTP.5.0"
"Msxml2.XMLHTTP.4.0"
"Msxml2.XMLHTTP.3.0"
"Msxml2.XMLHTTP"
"Microsoft.XMLHTTP"

With so many components available, we naturally want to use the highest version that is installed. How can that be done?
The code below walks the list from the top and keeps the first component that the server can create.

Dim arrProgId, prog, flag, xmlHttpCom

' Candidate components, listed in order of preference
arrProgId = Array("Msxml2.ServerXMLHTTP.4.0", "Msxml2.ServerXMLHTTP.3.0", "Msxml2.ServerXMLHTTP", "Msxml2.XMLHTTP.5.0", "Msxml2.XMLHTTP.4.0", "Msxml2.XMLHTTP.3.0", "Msxml2.XMLHTTP", "Microsoft.XMLHTTP")

' Keep the first component that can be created on this server
For Each prog In arrProgId
    If IsObjInstalled(prog) = True Then
        xmlHttpCom = prog
        Exit For
    End If
Next

'// <Summary>
'// Check whether a component is installed; returns True or False
'// </Summary>
Public Function IsObjInstalled(strClassString)
    On Error Resume Next

    '// Set the initial value
    IsObjInstalled = False
    Err.Clear

    '// Test code: try to create the object
    Dim xTestObj
    Set xTestObj = Server.CreateObject(strClassString)
    If Err.Number = 0 Then IsObjInstalled = True

    '// Release the test object
    Set xTestObj = Nothing
    Err.Clear
End Function

The code above finds the highest-priority XMLHTTP component that the current server supports.

Next comes the collection function.

' GetFileText is the collection function
Public Function GetFileText(url)
    On Error Resume Next ' when an error occurs, continue executing the code
    Dim http ' declare the variable
    ' Set http = Server.CreateObject(xmlHttpCom) ' create the object from the detected component
    Set http = Server.CreateObject("Microsoft.XMLHTTP") ' use a version that servers generally support
    http.Open "GET", url, False ' open a GET request; False means wait for the server response
    http.Send ' send the request
    If http.readyState <> 4 Then ' exit the function if the server did not respond
        Exit Function
    End If

    GetFileText = Bytes2BStr(http.responseBody, "gb2312") ' convert the binary data to text (gb2312)

    Set http = Nothing ' release the object
    If Err.Number <> 0 Then Err.Clear ' if an error occurred, clear it
End Function

'// <Summary>
'// Use ADODB.Stream to process the collected data and convert the binary data into text
'// </Summary>
Function Bytes2BStr(vIn, cSet)
    Dim bytesStream, stringReturn
    Set bytesStream = Server.CreateObject("ADODB.Stream")
    bytesStream.Type = 2 ' 2 = text stream
    bytesStream.Open
    bytesStream.WriteText vIn
    bytesStream.Position = 0
    bytesStream.Charset = cSet
    bytesStream.Position = 2 ' start reading at byte offset 2
    stringReturn = bytesStream.ReadText
    bytesStream.Close
    Set bytesStream = Nothing
    Bytes2BStr = stringReturn
End Function
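
GetFileText returns whatever body the server sends, which can also be the body of an error page. As a minimal sketch, assuming the same ASP environment as the code above, a separate check of the HTTP status code could look like this. GetHttpStatus is a hypothetical helper name and is not part of the original source file; status is a standard property of the XMLHTTP object.

```vbscript
' Hypothetical helper, not part of the original source file.
' Returns the HTTP status code for a URL, or 0 if the request failed.
Function GetHttpStatus(url)
    On Error Resume Next
    Dim http
    GetHttpStatus = 0
    Set http = Server.CreateObject("Microsoft.XMLHTTP")
    http.Open "GET", url, False ' synchronous GET request
    http.Send
    ' Only read the status if creating, opening and sending all succeeded
    If Err.Number = 0 Then
        If http.readyState = 4 Then GetHttpStatus = http.status
    End If
    Set http = Nothing
    Err.Clear
End Function
```

For example, If GetHttpStatus(url) = 200 Then Response.Write GetFileText(url) displays only pages that answered with 200, at the cost of requesting the page twice. Moving the same check into GetFileText would avoid the second request.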

Next, I define a variable, url, that holds the address.

url = "http://ent.sina.com.cn/star/mainland/more.html"

The above is a web address. To collect the page at that address and display it, we can do this:

url = "http://ent.sina.com.cn/star/mainland/more.html"

Response.Write GetFileText(url)

In this way, the content at the URL above is collected.
Easy, isn't it?

What should we do with the data once it has been collected?
How do we separate the data, extract the parts we want, and store them in a database?
These questions will be analyzed and explained in a later article. Take care when storing data in the database, and use regular expressions to process the data.
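
As a small preview of that processing step, here is a minimal sketch that uses the VBScript RegExp object to pull the link addresses out of a collected page. WriteLinks is a hypothetical helper name and the pattern is only an illustration, not the method of the intention collection program. It handles only double-quoted href values and assumes the ASP Response and Server objects are available.

```vbscript
' Hypothetical helper, not part of the original source file.
' Writes the href value of every <a> tag found in html, one per line.
Sub WriteLinks(html)
    Dim re, matches, m
    Set re = New RegExp
    re.Pattern = "<a\s[^>]*href=""([^""]+)""" ' capture the text between the quotes
    re.IgnoreCase = True
    re.Global = True ' find every match, not only the first one
    Set matches = re.Execute(html)
    For Each m In matches
        ' SubMatches(0) is the first captured group, i.e. the address
        Response.Write Server.HTMLEncode(m.SubMatches(0)) & "<br>"
    Next
    Set re = Nothing
End Sub
```

With the function from earlier in this article, WriteLinks GetFileText(url) would list the links on the collected page.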

The source file for the above code is attached. You can download it and run it to see whether it can collect the data.
