This article introduces a simple PHP thief program. A thief program fetches data (images, webpages, and other files) from a remote website to the local machine, processes it, and then displays it.
Regular expressions are used to split, match, search, and replace strings according to a pattern.
Related functions:
int ereg ( string $pattern , string $string [, array &$regs ] )
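If a match is found, ereg() returns the length of the matched string, or 1 if the optional $regs parameter is omitted or the matched string is empty. If no match is found or an error occurs, it returns FALSE.
The corresponding eregi() is case-insensitive. Both functions were deprecated in PHP 5.3 and removed in PHP 7.0.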
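On current PHP versions the same kind of match can be written with preg_match() from the PCRE extension. The following is a minimal sketch, not the original article's code; the sample HTML string and the pattern are made up for illustration.

```php
<?php
// Sample input, invented for this example.
$html = '<img SRC="/upload/logo.png"/>';

// The "i" modifier makes the match case-insensitive, similar to eregi().
// $regs receives the full match in [0] and the first capture group in [1].
$found = preg_match('/src="([^"]+)"/i', $html, $regs);

// preg_match() returns 1 on a match and 0 when nothing matches.
echo $found, "\n";
echo $regs[1], "\n";
```

Unlike ereg(), preg_match() requires delimiters around the pattern (here the two slashes).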
string file_get_contents ( string $filename [, bool $use_include_path = false [, resource $context [, int $offset = 0 [, int $maxlen ]]]] )
This function reads an entire file into a string.
Because $filename can also be a URL, it can be used to fetch the contents of a webpage.
It is the foundation of the thief program.
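Before turning to remote pages, here is a small sketch of how the $offset and $maxlen parameters behave on a local file. The temporary file and its contents are assumptions made only so the example is self-contained.

```php
<?php
// Create a small temporary file to read from.
$path = tempnam(sys_get_temp_dir(), 'fgc');
file_put_contents($path, 'Hello, thief program');

// Read the whole file into a string.
$all = file_get_contents($path);

// Skip the first 7 bytes, then read at most 5 bytes.
$part = file_get_contents($path, false, null, 7, 5);

echo $all, "\n";
echo $part, "\n";

// Clean up the temporary file.
unlink($path);
```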
For example, to fetch and display a remote page:
<?php
$url = file_get_contents("http://www.ubuntu.org.cn/index_kylin");
echo $url;
?>
But with another website:
<?php
$url = file_get_contents("http://www.alangzhong.com/index.html");
echo $url;
?>
Many of the images are missing from the output.
Viewing the source of the webpage, we find image references like this:
src="/upload/201503/b123ec26-bb8f-43be-b5ad-cdf45153d053.png"/>
The image address is a relative path. Since no such file exists on the local server, the image cannot be displayed.
The fix is to use a regular expression to find the image addresses and replace each relative path with the full remote address.
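The replacement step can be sketched as follows. This is one possible approach using preg_replace_callback(), not necessarily the original author's code; the helper name absolutize_src() and the sample markup are hypothetical, and only root-relative src values (those starting with a single slash) are handled.

```php
<?php
// Hypothetical helper: prefix root-relative src attributes with the remote host.
function absolutize_src(string $html, string $base): string
{
    // Avoid a double slash when the base URL ends with "/".
    $base = rtrim($base, '/');

    // Match src="/..." or src='/...', but skip protocol-relative src="//...".
    // Group 1 is the quote character, group 2 is the path after the first slash.
    return preg_replace_callback(
        '/\bsrc\s*=\s*(["\'])\/(?!\/)([^"\']*)\1/i',
        function (array $m) use ($base): string {
            return 'src=' . $m[1] . $base . '/' . $m[2] . $m[1];
        },
        $html
    );
}

// Sample markup, invented for this example.
$page = '<img src="/upload/a.png"/><img src="http://example.com/b.png"/>';
echo absolutize_src($page, 'http://www.alangzhong.com'), "\n";
```

In the thief program, the string returned by file_get_contents() would be passed through such a helper before being echoed. Addresses that are already absolute are left unchanged.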
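Note that this simple approach does not solve the timeout problem: if the remote server is slow to respond, the script waits until PHP's default socket timeout expires.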
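One common way to bound the wait is to pass a stream context with an HTTP timeout as the third argument of file_get_contents(). The sketch below is an assumption about how this could be done, not part of the original program; the helper names make_timeout_context() and fetch_with_timeout() are hypothetical.

```php
<?php
// Hypothetical helper: build a stream context with a read timeout in seconds.
function make_timeout_context(float $seconds)
{
    return stream_context_create([
        'http' => ['timeout' => $seconds],
    ]);
}

// Hypothetical helper: fetch a URL, returning false on failure or timeout.
function fetch_with_timeout(string $url, float $seconds = 5.0)
{
    // "@" suppresses the warning so the caller can check for false instead.
    return @file_get_contents($url, false, make_timeout_context($seconds));
}
```

Calling fetch_with_timeout("http://www.alangzhong.com/index.html", 5.0) and comparing the result with false using === lets the program print an error message instead of hanging.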