Thief programs and storage thieves: collecting and storing remote content

Source: Internet
Author: User
XMLHTTP Application Reference
I. Steps for use:
1. Create the XMLHTTP object (requires MSXML 4.0 support).
2. Open a connection to the server, specifying the request method, the URL of the service page, and any authentication information. The client opens the connection with the Open method. As with ordinary HTTP requests, either the "GET" or the "POST" method can be used to address the service page.
3. Send the request.
4. Wait for and receive the result returned by the server.
5. Release the XMLHTTP object.

II. XMLHTTP methods:
1. The XMLHTTP object
Note: Clients can use the XMLHTTP object to send arbitrary HTTP requests, receive the HTTP responses, and parse responses that are XML documents.

Open method: Initializes an MSXML2.XMLHTTP request, specifying the HTTP request method, the URL, and the authentication information.
Open (bstrMethod, bstrUrl, varAsync, bstrUser, bstrPassword)
bstrMethod: The data transfer method, i.e. GET or POST.
bstrUrl: The URL of the service page.
varAsync: Whether to execute asynchronously. The default is True, i.e. asynchronous execution, but asynchronous execution can only be used in the DOM (client-side script). In ASP it is usually set to False, i.e. synchronous execution.
bstrUser: User name; can be omitted.
bstrPassword: User password; can be omitted.

Send method: Sends the HTTP request to the server and receives the response.
Syntax:
oXMLHttpRequest.send (varBody)
Description: Whether this method is synchronous depends on the varAsync parameter of the Open method. If varAsync is True, the call returns immediately; if it is False, the call does not return until the entire response has been received.

setRequestHeader (bstrHeader, bstrValue)
bstrHeader: Name of the HTTP header.
bstrValue: Value of the HTTP header.

If the Open method specifies POST, you can declare the body as a form upload:
xmlhttp.setRequestHeader ("Content-Type", "application/x-www-form-urlencoded")
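To make the form upload concrete, here is a minimal JavaScript sketch of what an application/x-www-form-urlencoded body looks like. It uses the standard URLSearchParams class rather than XMLHTTP itself, and the function name and the field names ("user", "city") are made up for illustration. The resulting string is the kind of value that would be passed to the Send method after the header above has been set.

```javascript
// Build a request body in application/x-www-form-urlencoded format.
// buildFormBody and the field names below are hypothetical examples.
function buildFormBody(fields) {
  const params = new URLSearchParams();
  for (const [name, value] of Object.entries(fields)) {
    params.append(name, String(value));
  }
  return params.toString();
}

const body = buildFormBody({ user: "test user", city: "北京" });
console.log(body); // user=test+user&city=%E5%8C%97%E4%BA%AC
```

Spaces become "+" and non-ASCII characters are percent-encoded as UTF-8 bytes. A server that expects GB2312 form data would need a different encoding step.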

III. XMLHTTP properties:
onreadystatechange: Sets the event handler that is called when the result comes back in asynchronous execution mode. Can only be used in the DOM.
responseBody: The result is returned as an array of unsigned bytes.
responseStream: The result is returned as an IStream.
responseText: The result is returned as a string.
responseXML: The result is returned as XML data.

IV. Example:
<script language="JavaScript">
function getdatal(url) {
    // Create the XMLHTTP object; requires MSXML 4.0 ("MSXML2.XMLHTTP.4.0", "MSXML2.DOMDocument.4.0")
    var xmlhttp = new ActiveXObject("MSXML2.XMLHTTP.4.0");
    xmlhttp.open("GET", url, false, "", ""); // initialize a synchronous HTTP GET request
    xmlhttp.send("");                        // send the request and receive the response
    return xmlhttp.responseXML;              // return the response as an XML document
}
</script>


Thief programs are popular online right now: news thieves, music thieves, download thieves. How do they work? Here is a brief introduction that I hope will help webmasters.
(I) Principle
A thief program fetches pages from other websites through the XMLHTTP component of MSXML. A news thief, for example, typically requests Sina's news pages, replaces some of the HTML, and filters out the ads.
Advantages: there is no content to maintain, because the data comes from the other site and updates whenever that site updates; and it saves server resources, since a thief program usually consists of only a few files while all page content comes from elsewhere.
Disadvantages: it is unstable, because if the target site has an error the thief program fails too, and if the target site is upgraded the thief program must be changed to match; and it is slow, because a remote call is necessarily slower than reading data from the local server.
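The replace-and-filter step can be sketched in a few lines of JavaScript. This is only an illustration under assumptions: the function name, the `<div class="ad">` marker, and the two site names are invented, and a real page would need patterns chosen by inspecting its source.

```javascript
// Rewrite a fetched page: drop ad blocks and swap the source site's branding.
// rewritePage, the <div class="ad"> marker, and both site names are hypothetical.
function rewritePage(html) {
  return html
    .replace(/<div class="ad">[\s\S]*?<\/div>/gi, "") // filter out ad blocks
    .replace(/Source Site/g, "My Site");               // replace branding text
}

const fetched = '<h1>Source Site News</h1><div class="ad">Buy now</div><p>Story</p>';
console.log(rewritePage(fetched)); // <h1>My Site News</h1><p>Story</p>
```

The ad pattern stops at the first closing `</div>`, so it would cut a nested block short. That fragility is one reason a thief program breaks when the target site changes its markup.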
(II) Examples

The following briefly shows how XMLHTTP is used in ASP.


Code:
<%
' Common functions

' 1. Takes the URL of the target page; the return value of GetHTTPPage is the HTML source of that page
Function GetHTTPPage(url)
    Dim Http
    Set Http = Server.CreateObject("MSXML2.XMLHTTP")
    Http.Open "GET", url, False
    Http.Send
    If Http.readyState <> 4 Then
        Exit Function
    End If
    GetHTTPPage = BytesToBstr(Http.responseBody, "GB2312")
    Set Http = Nothing
    If Err.Number <> 0 Then Err.Clear
End Function

' 2. Encoding conversion: a page containing Chinese characters comes back garbled if it is
'    read directly through XMLHTTP; the ADODB.Stream component can convert it
Function BytesToBstr(body, Cset)
    Dim objStream
    Set objStream = Server.CreateObject("ADODB.Stream")
    objStream.Type = 1          ' binary
    objStream.Mode = 3          ' read/write
    objStream.Open
    objStream.Write body
    objStream.Position = 0
    objStream.Type = 2          ' text
    objStream.Charset = Cset
    BytesToBstr = objStream.ReadText
    objStream.Close
    Set objStream = Nothing
End Function

' Try fetching the HTML content of http://wmjie.51.net/swords/ below
Dim url, html
url = "http://wmjie.51.net/swords/"
html = GetHTTPPage(url)
Response.Write html
%>
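For comparison, the byte-to-string conversion that BytesToBstr performs can be sketched in JavaScript with the standard TextDecoder class. This is not the ADODB.Stream approach used above, and the function name is made up. The sample uses UTF-8 bytes so that it runs anywhere; passing "gb2312" as the charset should work on runtimes built with full encoding support, but that is an assumption about the environment.

```javascript
// Decode raw response bytes into a string using a named character set,
// which is the job BytesToBstr does with ADODB.Stream.
// decodeBytes is a hypothetical helper name.
function decodeBytes(bytes, charset) {
  const decoder = new TextDecoder(charset);
  return decoder.decode(bytes);
}

// "中文" encoded as UTF-8 bytes.
const utf8Bytes = new Uint8Array([0xe4, 0xb8, 0xad, 0xe6, 0x96, 0x87]);
console.log(decodeBytes(utf8Bytes, "utf-8")); // 中文
```

Decoding with the wrong charset does not raise an error by default; it silently produces garbled text, which is the symptom the article describes.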

------------------------------------------------------
Code:
Reading a remote file with XMLHTTP

<%
Response.Buffer = True
Dim xml
Set xml = Server.CreateObject("Microsoft.XMLHTTP")

' Fetch the remote file synchronously
xml.Open "GET", "http://wmjie.51.net/swords/diary.rar", False
xml.Send

' Add a header to give the download a file name:
Response.AddHeader "Content-Disposition", _
    "attachment;filename=mitchell-pres.zip"

' Specify the content type so the browser knows what it is receiving:
Response.ContentType = "application/zip"

' BinaryWrite the bytes to the browser
Response.BinaryWrite xml.responseBody

Set xml = Nothing
%>



-------------------------------------
How to write a storage thief program in ASP
The principle of a storage thief is also simple: use XMLHTTP to read the content of a remote page, process that content as needed (filtering, replacing, classifying), and finally add the resulting data to a database.
First, read the remote page with XMLHTTP (I introduced this in another article).
Second, filter the content. This is the critical step. For example, how do I extract every URL link from the remote page?
Code:
' Use a regular expression
Set objRegExp = New RegExp          ' create the object
objRegExp.IgnoreCase = True         ' ignore case
objRegExp.Global = True             ' find all matches, not just the first
' Match http:// followed by everything up to whitespace, a quote, or an angle bracket
objRegExp.Pattern = "http://[^\s""'<>]+"
Set mm = objRegExp.Execute(str)     ' run the search; str is the input string
For Each Match In mm                ' loop over the matches
    Response.Write Match.Value      ' output the URL
Next
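The same URL extraction can be sketched in JavaScript. The function name and the sample HTML are invented for illustration. The pattern stops at whitespace, quotes, and angle brackets, which is a simplification and not a full URL grammar.

```javascript
// Collect every http:// URL found in a block of HTML.
// extractUrls is a hypothetical helper name.
function extractUrls(html) {
  // Match "http://" and then run until whitespace, a quote, or an angle bracket.
  const pattern = /http:\/\/[^\s"'<>]+/gi;
  return html.match(pattern) || [];
}

const sample = '<a href="http://example.com/a">A</a> <img src="http://example.com/b.gif">';
console.log(extractUrls(sample));
// [ 'http://example.com/a', 'http://example.com/b.gif' ]
```

Note that a lazy quantifier at the very end of a pattern, such as `.+?`, matches only a single character, so the character class above is what makes the match run to the end of the URL.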


Then replace the data you do not need. This is relatively simple and can be done with the Replace function.
Finally, perform the database operations.
-------------------------------
An example
Code:
<%
' Initialization: ignore all non-fatal errors and set a very long script
' timeout so that the thief program does not fail with a timeout error.
On Error Resume Next
Server.ScriptTimeout = 9999999

' Fetch a page and decode it as GB2312 instead of the default UTF-8. Without the
' conversion, a page containing Chinese characters read through XMLHTTP is garbled.
Function GetHTTPPage(Path)
    t = GetBody(Path)
    GetHTTPPage = BytesToBstr(t, "GB2312")
End Function

' Create the XMLHTTP object, send a synchronous GET request, and return the raw bytes.
Function GetBody(url)
    On Error Resume Next
    Set Retrieval = CreateObject("Microsoft.XMLHTTP")
    With Retrieval
        .Open "GET", url, False, "", ""
        .Send
        GetBody = .responseBody
    End With
    Set Retrieval = Nothing
End Function

' Decode the fetched bytes with the ADODB.Stream component.
Function BytesToBstr(body, Cset)
    Dim objStream
    Set objStream = Server.CreateObject("ADODB.Stream")
    objStream.Type = 1          ' binary
    objStream.Mode = 3          ' read/write
    objStream.Open
    objStream.Write body
    objStream.Position = 0
    objStream.Type = 2          ' text
    objStream.Charset = Cset
    BytesToBstr = objStream.ReadText
    objStream.Close
    Set objStream = Nothing
End Function

' Return the position of strng in wstr (case-insensitive), or the length of wstr
' if strng is not found.
Function NewString(wstr, strng)
    NewString = InStr(LCase(wstr), LCase(strng))
    If NewString <= 0 Then NewString = Len(wstr)
End Function
%>

Below is the page display section

<%
Dim wstr, str, url, start, over, city
' Define the variables that will be used

city = Request.QueryString("id")
' Assign the id value passed to the page (the city the user chose) to city

url = "http://appnews.qq.com/cgi-bin/news_qq_search?city=" & city & ""
' Set the address of the page to crawl; you can also specify an address directly without a variable

wstr = GetHTTPPage(url)
' Get all the data of the specified page

start = NewString(wstr, "<html>")
' Set the head of the data to be processed. NOTE: the marker string is missing from the source
' text; "<html>" is an assumption, made because this program captures the whole page. Choose the
' marker by looking at the source of the page being crawled; it must be unique within that page.

over = NewString(wstr, "</HTML>")
' The counterpart of start: the tail of the data to be processed, which must also be unique in the page.

body = Mid(wstr, start, over - start)
' Set the range of the page to display

' Now replace the specified strings in the captured data.

body = Replace(body, "skin1", "Weather forecast-gram Network")
body = Replace(body, "http://appnews.qq.com/cgi-bin/news_qq_search?city", "Tianqi.asp?id")

' The replacements for this program are done; add similar Replace calls if you need more.

Response.Write body
%>
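The head-and-tail extraction done with newstring and Mid can be sketched in JavaScript as follows. The function names and the sample page are invented for illustration. As in the ASP code, the start marker is included in the result and the end marker is not.

```javascript
// Return the index of marker in text (case-insensitive), or text.length when
// the marker is missing, mirroring the fallback in newstring.
// markerIndex and sliceBetween are hypothetical helper names.
function markerIndex(text, marker) {
  const index = text.toLowerCase().indexOf(marker.toLowerCase());
  return index < 0 ? text.length : index;
}

// Cut out the part of text that starts at startMarker and ends just before endMarker.
function sliceBetween(text, startMarker, endMarker) {
  const start = markerIndex(text, startMarker);
  const end = markerIndex(text, endMarker);
  return end > start ? text.slice(start, end) : "";
}

const page = "<html><body>Sunny, 25C</body></HTML>";
console.log(sliceBetween(page, "<body>", "</body>")); // <body>Sunny, 25C
```

One deliberate difference: when the end marker comes before the start marker this sketch returns an empty string, whereas Mid with a negative length raises an error that On Error Resume Next would hide.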
Reference: fetching content remotely and caching it on the local computer, for any kind of file

<%
' ---------- Fetch content remotely and cache it on the local computer; works for any file ----------
' On Error Resume Next
' Set the content type to the specific type that you are sending.
' Response.ContentType = "image/jpeg"
' (the commented-out line above defines the output format)

Path = Request.QueryString("P")
sPath = Path
If Left(LCase(Path), 7) <> "http://" Then
    ' No http:// prefix: treat it as a local file and hand it to LocalFile
    LocalFile (Path)
Else
    ' Otherwise it is a remote file: hand it to RemoteFile
    RemoteFile (Path)
End If
' Response.Write Err.Description

Sub LocalFile(Path)
    ' A local file: simply redirect to it
    Response.Redirect Path
End Sub

Sub RemoteFile(sPath)
    ' Handle a remote file
    FileName = GetFileName(sPath)
    ' GetFileName converts the address into a usable file name
    FileName = Server.MapPath("/uploadfile/cache/" & FileName)
    Set objFSO = Server.CreateObject("Scripting.FileSystemObject")
    ' Response.Write FileName
    If objFSO.FileExists(FileName) Then
        ' The file has been fetched before: simply redirect to the cached copy
        Response.Redirect "/uploadfile/cache/" & GetFileName(Path)
    Else
        ' Otherwise fetch it with the GetBody function first
        ' Response.Write Path
        t = GetBody(Path)
        ' Write the bytes to the browser in binary form
        Response.BinaryWrite t
        ' Flush the output buffer
        Response.Flush
        ' Cache the file contents locally for the next visit
        SaveFile t, GetFileName(Path)
    End If
    Set objFSO = Nothing
End Sub

Function GetBody(url)
    ' Fetch content from a remote address
    ' On Error Resume Next
    ' Response.Write url
    Set Retrieval = CreateObject("Microsoft.XMLHTTP")
    ' Create the XMLHTTP object
    With Retrieval
        .Open "GET", url, False, "", ""
        ' Send with GET, synchronously
        .Send
        ' GetBody = .responseText
        GetBody = .responseBody
        ' The function returns the fetched content
    End With
    Set Retrieval = Nothing
    ' Response.Write Err.Description
End Function

Function GetFileName(ByVal str)
    ' Convert an address into a usable file name.
    ' ByVal keeps the caller's variable from being overwritten by the changes below.
    str = Replace(LCase(str), "http://", "")
    str = Replace(LCase(str), "//", "/")
    str = Replace(str, "/", "")
    str = Replace(str, vbCrLf, "")
    GetFileName = str
End Function

Sub SaveFile(str, fName)
    ' Save the contents of the stream to a file
    ' On Error Resume Next
    Set objStream = Server.CreateObject("ADODB.Stream")
    ' Create the ADODB.Stream object; requires ADO 2.5 or later
    objStream.Type = 1    ' adTypeBinary: open in binary mode
    objStream.Open
    objStream.Write str
    ' Write the bytes to the buffer
    ' Response.Write fName
    objStream.SaveToFile "C:\inetpub\myweb\uploadfile\cache\" & fName, 2    ' adSaveCreateOverWrite
    ' Write the buffered content to the file
    ' Response.BinaryWrite objStream.Read
    objStream.Close
    Set objStream = Nothing
    ' Close the object and release the resource
    ' Response.Write Err.Description
End Sub
%>
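To see what kind of cache file name the GetFileName function produces, here is a minimal JavaScript sketch of the same steps. The function name and the sample URL are invented for illustration.

```javascript
// Turn a URL into a flat cache file name, following the same steps as GetFileName:
// lowercase, drop the scheme, and remove slashes and line breaks.
// cacheFileName is a hypothetical helper name.
function cacheFileName(url) {
  return url
    .toLowerCase()
    .replace(/http:\/\//g, "")
    .replace(/\//g, "")
    .replace(/\r\n/g, "");
}

console.log(cacheFileName("http://Example.com/img/Logo.GIF")); // example.comimglogo.gif
```

Because the slashes are simply removed, two different URLs can map to the same name, and characters such as "?" or ":" that are not allowed in Windows file names pass through unchanged. A production cache would likely need a stricter mapping.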
