Getting started with ASP thieves (Remote Data Acquisition)

Source: Internet
Author: User

Here, a "thief" is a program that uses the XMLHTTP component (part of Microsoft's XML services) from ASP to capture data (images, web pages, and other files) from a remote website and then, after some processing, displays it on a page or stores it in a database. A thief program lets you do things that once seemed impossible. For example, you can grab a page from another website and present it as your own page, or you can save some of a website's data (articles, images) to a local database for later use.

"Thieves" have the following advantages. They need no content maintenance, because the data comes from other websites and is updated whenever those websites are updated. They also save a lot of server resources: a thief program generally consists of only a few files, and all the page content comes from other websites.

They also have disadvantages. They are unstable: if an error occurs on the target website, the thief program fails too, and if the target website is upgraded or redesigned, the thief program must be modified accordingly. They are slow: because every request is a remote call, it is necessarily slower than reading data from the local server.

Sounds amazing? Now let's learn more about the "thief" program!

Let's look at a simple example: a weather forecast program that takes its data from the QQ website.

The code is as follows:

<%
On Error Resume Next
Server.ScriptTimeout = 9999999

Function gethttppage(path)
    Dim t
    t = getbody(path)
    gethttppage = bytestobstr(t, "gb2312")
End Function

' First, initialize the thief program. The code above ignores all non-fatal errors, sets the script timeout to a very long time (so no run-time timeout error occurs), and decodes the downloaded bytes as gb2312 instead of the default UTF-8. Without that step, a page containing Chinese characters fetched directly through the XMLHTTP component comes out garbled.

Function getbody(url)
    On Error Resume Next
    Dim retrieval
    Set retrieval = CreateObject("Microsoft.XMLHTTP")
    With retrieval
        .Open "GET", url, False, "", ""
        .Send
        getbody = .responseBody
    End With
    Set retrieval = Nothing
End Function

' Then create the XMLHTTP object, send a synchronous GET request, and return the raw response bytes.

Function bytestobstr(body, cset)
    Dim objstream
    Set objstream = Server.CreateObject("ADODB.Stream")
    objstream.Type = 1          ' adTypeBinary
    objstream.Mode = 3          ' adModeReadWrite
    objstream.Open
    objstream.Write body
    objstream.Position = 0
    objstream.Type = 2          ' adTypeText
    objstream.Charset = cset
    bytestobstr = objstream.ReadText
    objstream.Close
    Set objstream = Nothing
End Function

Function newstring(wstr, strng)
    ' Case-insensitive position of strng in wstr; falls back to Len(wstr) when not found
    newstring = InStr(LCase(wstr), LCase(strng))
    If newstring <= 0 Then newstring = Len(wstr)
End Function

' To process the captured data, the ADODB.Stream component is used to convert the raw bytes to text in the given character set.
%>
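
The decoding and marker-search helpers above are not specific to ASP. As a rough illustration only (Python, not the article's code), the sketch below mirrors bytestobstr and newstring. The names `bytes_to_text` and `find_marker` are hypothetical, chosen for this example. One deliberate difference: Python offsets are 0-based, whereas VBScript's InStr is 1-based.

```python
import re


def bytes_to_text(body: bytes, charset: str = "gb2312") -> str:
    """Decode raw response bytes using the page's character set."""
    # errors="replace" substitutes a placeholder for undecodable bytes
    # instead of raising, loosely matching the tolerant ASP version.
    return body.decode(charset, errors="replace")


def find_marker(text: str, marker: str) -> int:
    """Case-insensitive search for marker; returns len(text) if absent."""
    match = re.search(re.escape(marker), text, flags=re.IGNORECASE)
    return match.start() if match else len(text)
```

Decoding with the wrong character set is what produces garbled Chinese text, which is why the charset is passed explicitly rather than left to a default.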

Below is the page display section:

<%
Dim wstr, url, start, over, city, body
' Declare the variables to be used.

city = Request.QueryString("ID")
' Assign the ID parameter from the query string (that is, the selected city) to city.

url = "http://appnews.qq.com/cgi-bin/news_qq_search?City=" & city
' Set the address of the page to fetch. You can also hard-code an address instead of using a variable.

wstr = gethttppage(url)
' Retrieve all data on the specified page.

start = newstring(wstr, "<HTML>")
' Marks the start of the data to be processed. Set this according to the situation; check the page's source code to confirm. Because this program captures the whole page, the marker is set so that everything is captured. Note: the marker must be unique on the page and must not be repeated.

over = newstring(wstr, "...")
' Marks the end of the data to be processed. The end-marker string itself is missing from the source text and is left as "..." here. Likewise, the marker must be unique on the page.

body = Mid(wstr, start, over - start)
' Extract the range of the page to display.

' Now for the sleight of hand. With Replace, you can substitute your own text for specified strings in the data.

body = Replace(body, "skin1", "Weather Forecast-SK network")
body = Replace(body, "http://appnews.qq.com/cgi-bin/news_qq_search?City", "Tianqi.asp?ID")

' That completes the replacements in this program. If you have other requirements, add similar Replace calls.

Response.Write body
%>

Once the content has been replaced, the modified result is written to the page. That is the end of the program.
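
The slice-and-replace step can be sketched in a language-neutral way. The Python below is an illustration under stated assumptions, not the article's implementation: `extract_between` and `rewrite` are hypothetical names. As in the ASP code, the slice includes the start marker, stops before the end marker, and a missing marker falls back to the end of the page.

```python
import re


def extract_between(page: str, start_marker: str, end_marker: str) -> str:
    """Return the text from start_marker up to, but excluding, end_marker."""

    def position(marker: str) -> int:
        # Case-insensitive search; a missing marker maps to the end of the page.
        match = re.search(re.escape(marker), page, flags=re.IGNORECASE)
        return match.start() if match else len(page)

    start = position(start_marker)
    end = position(end_marker)
    # Guard against an empty or inverted range instead of raising.
    return page[start:end] if end > start else ""


def rewrite(fragment: str, replacements: list[tuple[str, str]]) -> str:
    """Apply each (old, new) replacement in order, case-sensitively."""
    for old, new in replacements:
        fragment = fragment.replace(old, new)
    return fragment
```

Because every replacement is a plain string match, any change to the target site's markup silently stops the substitution from applying, which is the instability described at the start of this article.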

Program usage and results: save the preceding code as Tianqi.asp, upload it to a host that supports ASP and XML, and run it in a browser. You can further polish the interface or optimize the program from this starting point.

The above is only a preliminary application of the XMLHTTP component. It can do much more, such as saving remote images to the local server, and together with the ADODB.Stream component it can save the fetched data to a database. Thief programs have a wide range of functions and uses, but you must not use them for anything illegal!
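
Saving a remote file locally follows the same pattern: fetch the raw bytes, then write them out unchanged. The Python sketch below is only an illustration of that idea using the standard library. `fetch_bytes` and `save_bytes` are hypothetical names, and the timeout value is an assumption.

```python
import urllib.request
from pathlib import Path


def fetch_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Download a resource and return its body as raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def save_bytes(data: bytes, destination: str) -> int:
    """Write bytes to destination, creating parent folders; returns the byte count."""
    path = Path(destination)
    path.parent.mkdir(parents=True, exist_ok=True)
    return path.write_bytes(data)
```

Binary data such as images must be written as bytes without any charset conversion. Only text pages need the decoding step shown earlier.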

Someone may ask: is the "thief" program exclusive to ASP? No. PHP can achieve the same effect through the fopen function. Owing to various features of PHP, a thief program written in PHP has obvious advantages over its ASP counterpart in size and execution efficiency, but that is not covered here.
