An Introductory Tutorial on ASP "Thief" (Remote Data Acquisition) Programs

Source: Internet
Author: User
The "thief" here refers to a class of programs that use the powerful XMLHTTP component available to ASP to crawl data (images, web pages, and other files) from remote websites to the local server, then display it on a page or store it in a database after some processing. With this kind of program you can accomplish tasks that once seemed impossible, such as passing off a page from another site as your own, or saving some of a site's data (articles, pictures) to a local database.

The advantages of a "thief" are: there is no site content to maintain, because the program's data comes from other sites and is updated whenever those sites are updated; and it saves a lot of server resources, since a typical thief program consists of only a few files and all the page content comes from other sites. The disadvantages are: instability, because if the target site has an error the program fails too, and if the target site is upgraded or redesigned the thief program must be changed accordingly; and speed, because a remote call is certainly slower than reading data from the local server.

Sounds amazing, doesn't it? Let's start by learning some of the basics of "thief" programs.

Let's study something simple: a weather program that takes its data from the QQ website. The code is as follows:

    <%
    On Error Resume Next
    Server.ScriptTimeout = 9999999

    Function GetHttpPage(Path)
        t = GetBody(Path)
        GetHttpPage = BytesToBstr(t, "GB2312")
    End Function

    ' First come some initialization settings for the thief program.
    ' The code above ignores all non-fatal errors, sets a very long script
    ' timeout (so the program will not fail with a timeout error), and
    ' decodes the fetched page as GB2312 instead of the default UTF-8;
    ' otherwise, a page containing Chinese characters fetched directly
    ' through the XMLHTTP component comes out garbled.

    Function GetBody(url)
        On Error Resume Next
        Set Retrieval = CreateObject("Microsoft.XMLHTTP")
        With Retrieval
            .Open "GET", url, False, "", ""
            .Send
            GetBody = .ResponseBody
        End With
        Set Retrieval = Nothing
    End Function

    ' Next, the XMLHTTP component is called to create an object and
    ' initialize its settings.

    Function BytesToBstr(body, Cset)
        Dim objstream
        Set objstream = Server.CreateObject("ADODB.Stream")
        objstream.Type = 1          ' 1 = binary
        objstream.Mode = 3          ' 3 = read/write
        objstream.Open
        objstream.Write body
        objstream.Position = 0
        objstream.Type = 2          ' 2 = text
        objstream.Charset = Cset
        BytesToBstr = objstream.ReadText
        objstream.Close
        Set objstream = Nothing
    End Function

    Function Newstring(wstr, strng)
        Newstring = InStr(LCase(wstr), LCase(strng))
        If Newstring <= 0 Then Newstring = Len(wstr)
    End Function

    ' Processing the fetched data requires calling the ADODB.Stream
    ' component and initializing its settings.
    %>

Below is the page display section:

    <%
    Dim wstr, str, url, start, over, city
    ' Declare the variables that will be used

    city = Request.QueryString("id")
    ' The id parameter passed to the program (the city the user chose) is assigned to city

    url = "http://appnews.qq.com/cgi-bin/news_qq_search?city=" & city & ""
    ' Set the address of the page to fetch here; you can also specify a fixed address instead of using a variable

    wstr = GetHttpPage(url)
    ' Fetch all the data on the specified page

    start = Newstring(wstr, "
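
The Newstring helper defined above returns the 1-based position of a marker and falls back to the length of the text when the marker is not found. As a minimal sketch of how such positions could be used to cut a fragment out of fetched text, the following standalone script can be run under Windows Script Host (cscript). The sample text, the markers, and the names sample, startMarker, endMarker, startPos, endPos and fragment are hypothetical and are not taken from the original program.

```vbscript
Option Explicit

' Same helper as in the tutorial: case-insensitive search that
' falls back to the length of the text when the marker is missing.
Function Newstring(wstr, strng)
    Newstring = InStr(LCase(wstr), LCase(strng))
    If Newstring <= 0 Then Newstring = Len(wstr)
End Function

Dim sample, startMarker, endMarker, startPos, endPos, fragment

' Hypothetical page text standing in for the fetched data
sample = "<html><BODY><div id=""w"">Sunny, 25C</div></BODY></html>"
startMarker = "<div id=""w"">"
endMarker = "</div>"

startPos = Newstring(sample, startMarker)
endPos = Newstring(sample, endMarker)

' Skip past the start marker, then take everything up to the end marker
startPos = startPos + Len(startMarker)
fragment = Mid(sample, startPos, endPos - startPos)

WScript.Echo fragment   ' prints: Sunny, 25C
```

Because Newstring returns the length of the text rather than 0 when a marker is missing, a missing end marker makes the fragment run on to nearly the end of the text instead of raising an error. That fits the On Error Resume Next style of the program, but it also means a changed target page can silently produce the wrong output.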
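
The introduction notes that a thief program fails whenever the target site fails. One way to soften this, shown here only as a sketch, is to check the request result before decoding it. GetBodySafe is a hypothetical variant of GetBody and the fallback message is an assumption; url and BytesToBstr are the ones defined in the tutorial code.

```vbscript
<%
' Hypothetical variant of GetBody: returns Empty when the request
' raises an error or the HTTP status is not 200.
Function GetBodySafe(url)
    Dim http
    GetBodySafe = Empty
    On Error Resume Next
    Set http = CreateObject("Microsoft.XMLHTTP")
    http.Open "GET", url, False
    http.Send
    If Err.Number = 0 Then
        If http.Status = 200 Then GetBodySafe = http.ResponseBody
    End If
    Err.Clear
    On Error GoTo 0
    Set http = Nothing
End Function

Dim body
body = GetBodySafe(url)   ' url as built in the page display section
If IsEmpty(body) Then
    ' Assumed fallback text; show whatever suits the page
    Response.Write "Weather data is temporarily unavailable."
Else
    Response.Write BytesToBstr(body, "GB2312")
End If
%>
```

Returning Empty keeps the failure visible to the caller, so the page can show a fallback instead of passing an empty response on to the decoding and parsing steps.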