Xml
This article was compiled from material found on the Internet (thanks to the people who shared it). I have tried to make it as systematic and easy to follow as possible.
To steal is to get something without working for it. On the web, say a large, authoritative site publishes some news, and your own small site wants to keep pace and update in sync with it. Doing that by hand is a lot of trouble, so "stealing" is the easiest way. Stealing is immoral. I do not encourage it, but I am not going to preach against it either; blame it on the technology. But I digress.
What is a thief program?
It uses the XMLHTTP object from MSXML to request pages from other web sites. The HTML it receives can then be filtered to keep only the content you need (for example, when pulling from a weather site you would not display the whole site, only the weather part).
In effect it is a parasite. A thief program is written for one specific site, and as long as that site does not change the layout of the relevant content, it keeps stealing. If the site does change, the thief program has to be modified accordingly. BTW, Little Fat's pubcms defines this as a crawler, which I think means much the same thing :P
So how do you steal? If you have read an introduction to XMLHTTP, you should already have some idea.
First define a function. The comments inside it explain each step:
<%
Function GetHttpPage(url)
    Dim objXml
    Set objXml = Server.CreateObject("MSXML2.XMLHTTP") ' create the XMLHTTP object
    objXml.Open "GET", url, False ' open a synchronous request
    objXml.Send ' send it
    If objXml.readyState <> 4 Then ' 4 means the response has been fully received; otherwise give up
        Exit Function
    End If
    GetHttpPage = BytesToBstr(objXml.responseBody) ' return the response, decoded by the BytesToBstr function
    ' GetHttpPage = Bytes2bstr(objXml.responseBody) ' or convert Chinese characters with Bytes2bstr instead
    Set objXml = Nothing ' release the object
    If Err.Number <> 0 Then Err.Clear
End Function
%>
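As a variation on the function above, here is a minimal sketch that also checks the HTTP status code and sets timeouts, so a slow or failing source site does not hang the page. It assumes the server-side MSXML component MSXML2.ServerXMLHTTP is installed. The name GetHttpPageSafe and the timeout values are illustrative and not part of the original article; BytesToBstr is the decoding function defined in this article.

```vbscript
<%
' Hypothetical variant of GetHttpPage; not from the original article.
Function GetHttpPageSafe(url)
    Dim objXml, httpStatus
    GetHttpPageSafe = ""                 ' result when the fetch fails
    On Error Resume Next
    Set objXml = Server.CreateObject("MSXML2.ServerXMLHTTP")
    If Err.Number = 0 Then
        ' resolve, connect, send, receive timeouts in milliseconds (illustrative values)
        objXml.setTimeouts 5000, 5000, 10000, 10000
        objXml.Open "GET", url, False
        objXml.Send
        httpStatus = objXml.Status       ' stays Empty if any call above failed
        If Err.Number = 0 And httpStatus = 200 Then
            GetHttpPageSafe = BytesToBstr(objXml.responseBody)
        End If
    End If
    Set objXml = Nothing
    Err.Clear
    On Error GoTo 0
End Function
%>
```

Returning an empty string on failure lets the calling page decide what to show when the source site is unreachable.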
Next, look at the encoding function BytesToBstr() that it relies on:
<%
Function BytesToBstr(body)
    Dim objStream
    Set objStream = Server.CreateObject("ADODB.Stream")
    objStream.Type = 1 ' binary
    objStream.Mode = 3 ' read/write
    objStream.Open
    objStream.Write body
    objStream.Position = 0
    objStream.Type = 2 ' text
    objStream.Charset = "GB2312"
    ' XMLHTTP's responseText assumes UTF-8 by default, so a GB2312 page containing Chinese characters comes out garbled; decode the raw bytes as GB2312 instead
    BytesToBstr = objStream.ReadText
    objStream.Close
    Set objStream = Nothing
End Function
%>
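The function above hard-codes GB2312. If the source page uses another encoding, the same ADODB.Stream technique should work with a different Charset value. A minimal sketch, assuming the caller already knows the page's encoding; the name BytesToBstrCharset and its second parameter are hypothetical, not from the original article:

```vbscript
<%
' Hypothetical generalisation of BytesToBstr.
Function BytesToBstrCharset(body, charsetName)
    Dim objStream
    Set objStream = Server.CreateObject("ADODB.Stream")
    objStream.Type = 1               ' binary: accept the raw bytes
    objStream.Mode = 3               ' read/write
    objStream.Open
    objStream.Write body
    objStream.Position = 0           ' Type can only be changed at position 0
    objStream.Type = 2               ' text: read the bytes back as a string
    objStream.Charset = charsetName  ' for example "GB2312" or "UTF-8"
    BytesToBstrCharset = objStream.ReadText
    objStream.Close
    Set objStream = Nothing
End Function

' Usage inside a fetch function, for a UTF-8 source page:
' html = BytesToBstrCharset(objXml.responseBody, "UTF-8")
%>
```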
Of course, you can also handle Chinese characters with a dedicated function:
Function Bytes2bstr(vIn)
    Dim strReturn, j, thisCharCode, nextCharCode
    strReturn = ""
    For j = 1 To LenB(vIn)
        thisCharCode = AscB(MidB(vIn, j, 1))
        If thisCharCode < &H80 Then
            strReturn = strReturn & Chr(thisCharCode)
        Else
            nextCharCode = AscB(MidB(vIn, j + 1, 1))
            strReturn = strReturn & Chr(CLng(thisCharCode) * &H100 + CInt(nextCharCode))
            j = j + 1
        End If
    Next
    Bytes2bstr = strReturn
End Function
LenB returns the number of bytes rather than the number of characters, and likewise AscB returns the value of a single byte. A byte value of 80h (128) or above is one half of a Chinese character. The function combines the two halves into a single code and passes it to Chr, which returns the character.
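As a worked example of that arithmetic: in GB2312 the character 中 is stored as the two bytes D6h and D0h. The sketch below only reproduces the calculation that Bytes2bstr performs for such a pair. Whether Chr returns the expected character depends on the server running with a Chinese (double-byte) code page, which is an assumption here.

```vbscript
<%
Dim highByte, lowByte, charCode
highByte = &HD6   ' first byte; it is >= &H80, so it starts a two-byte character
lowByte = &HD0    ' second byte of the pair
charCode = CLng(highByte) * &H100 + lowByte   ' 214 * 256 + 208 = 54992, that is &HD6D0
' Assumption: the server's default code page is GB2312, so Chr maps D6 D0 to its character
Response.Write Chr(charCode)
%>
```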
It is used as follows:
<%
Dim url, html
url = "http://www.cnbruce.com/blog"
html = GetHttpPage(url)
Response.Write(html)
%>
This "steals" the content of http://www.cnbruce.com/blog and writes that site's page into the response.
Copy the following into an ASP file and save it for debugging:
<%
Function GetHttpPage(url)
    Dim objXml
    Set objXml = Server.CreateObject("MSXML2.XMLHTTP") ' create the XMLHTTP object
    objXml.Open "GET", url, False ' open a synchronous request
    objXml.Send ' send it
    If objXml.readyState <> 4 Then ' 4 means the response has been fully received; otherwise give up
        Exit Function
    End If
    GetHttpPage = BytesToBstr(objXml.responseBody) ' return the response, decoded by the BytesToBstr function
    Set objXml = Nothing ' release the object
    If Err.Number <> 0 Then Err.Clear
End Function

Function BytesToBstr(body)
    Dim objStream
    Set objStream = Server.CreateObject("ADODB.Stream")
    objStream.Type = 1
    objStream.Mode = 3
    objStream.Open
    objStream.Write body
    objStream.Position = 0
    objStream.Type = 2
    objStream.Charset = "GB2312"
    ' decode the raw bytes as GB2312; otherwise a page containing Chinese characters comes out garbled
    BytesToBstr = objStream.ReadText
    objStream.Close
    Set objStream = Nothing
End Function

Dim url, html
url = "http://www.cnbruce.com/blog"
html = GetHttpPage(url)
Response.Write(html)
%>
That is how a page gets "stolen". Note, however, that in the returned page some images do not display and the style sheets are not linked, because their relative paths now resolve against your own site instead of the original one. To display the page properly, you need to filter and adjust the returned HTML.
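One simple adjustment for the missing images and styles, sketched under assumptions: rewrite root-relative src and href attributes so that they point back at the source site. The helper name FixRelativeUrls and its baseUrl parameter are hypothetical, not from the original article. Plain Replace only catches attributes written exactly as src="/..." or href="/...", and it would also mangle protocol-relative links that start with "//", so real pages usually need more careful handling.

```vbscript
<%
' Hypothetical helper: turns root-relative links into absolute ones.
Function FixRelativeUrls(html, baseUrl)
    Dim result, q
    q = Chr(34)   ' the double-quote character
    result = html
    ' arguments 1, -1, 1: start at position 1, replace all matches, compare case-insensitively
    result = Replace(result, "src=" & q & "/", "src=" & q & baseUrl & "/", 1, -1, 1)
    result = Replace(result, "href=" & q & "/", "href=" & q & baseUrl & "/", 1, -1, 1)
    FixRelativeUrls = result
End Function

Dim stolenHtml
stolenHtml = GetHttpPage("http://www.cnbruce.com/blog")
' baseUrl is passed without a trailing slash
Response.Write FixRelativeUrls(stolenHtml, "http://www.cnbruce.com")
%>
```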
So how do you extract the useful part of the returned HTML, and how do you filter and adjust it?