PHP Collection Program Overview

Source: Internet
Author: User

I haven't posted anything substantial on the forum for a long time. Today I'll share my collection (scraping) code with you!

Ideas:

The idea behind a collection program is simple and can be divided into the following steps:
1. Fetch the source of the remote page (file_get_contents or fopen).
2. Analyze the source to extract the content you want (regular expression matching is used here; usually the first thing to extract is the pagination).
3. Based on the extracted content, download files, insert records into the database, and perform other operations.

The second step may have to be repeated several times. For example, you may need to parse the pagination addresses first and then parse the content of each inner page to get what you want.
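The three steps can be sketched roughly as follows. This is a minimal sketch, not the author's actual program: the function names `extract_links`, `extract_title`, and `collect`, the link pattern, and the choice of `<title>` as the wanted field are all assumptions for illustration. Relative links are not resolved here.

```php
<?php
// Step 2, first pass: pull inner-page links out of a listing page.
// The href pattern is an assumption; adapt it to the target site's markup.
function extract_links(string $html): array
{
    preg_match_all('/<a\s+href="([^"]+)"/i', $html, $m);
    return array_values(array_unique($m[1]));
}

// Step 2, second pass: pull the wanted field out of an inner page.
// Here the field is the <title> text, chosen only for illustration.
function extract_title(string $html): ?string
{
    return preg_match('/<title>(.*?)<\/title>/is', $html, $m) ? trim($m[1]) : null;
}

// Steps 1 to 3 combined. $fetch is any callable that returns a page's
// source or false (for example 'file_get_contents'); $store receives
// each extracted item and decides what to do with it.
function collect(string $listUrl, callable $fetch, callable $store): int
{
    $listHtml = $fetch($listUrl);
    if ($listHtml === false) {
        return 0;
    }
    $count = 0;
    foreach (extract_links($listHtml) as $link) {
        $pageHtml = $fetch($link);
        if ($pageHtml === false) {
            continue; // skip inner pages that cannot be fetched
        }
        $title = extract_title($pageHtml);
        if ($title !== null) {
            $store($link, $title);
            $count++;
        }
    }
    return $count;
}
```

Passing the fetcher and the storage step in as callables keeps the parsing logic testable without touching the network or a database.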

Code:

I remember posting part of this code before. Today I'll post a simple portion of it here.
PHP code:
$nl = @file_get_contents($rs['url']); // fetch the remote content
preg_match_all("/var url = \"gameswf\/(.*?)\.swf\";/is", $nl, $connect); // regular expression match to extract the desired content
mysql_query("insert..."); // insert into the database

That is all the code needed for collection. Of course, you can also use fopen; I personally prefer file_get_contents.
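For comparison, the fopen route might look like the sketch below. The helper name `fetch_with_fopen` is hypothetical. As with file_get_contents, reading remote URLs this way depends on the `allow_url_fopen` setting being enabled.

```php
<?php
// Read an entire resource through fopen(); this is roughly what
// file_get_contents() does in a single call.
function fetch_with_fopen(string $url)
{
    $fp = @fopen($url, 'rb');
    if ($fp === false) {
        return false; // could not open the URL or file
    }
    $data = stream_get_contents($fp);
    fclose($fp);
    return $data;
}
```

The fopen version takes more lines, which is one reason to prefer file_get_contents for simple fetches.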

Next, here is my method for downloading images and Flash files to the local machine. It is very simple: just two lines of code.
PHP code:
if (@copy($url, $newurl)) {
    echo 'OK';
}
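The two lines leave open where `$newurl` comes from. A minimal sketch, assuming you want to keep the remote file name and save it under a local directory; the helper `local_path_for` is hypothetical and not part of the original code:

```php
<?php
// Build a local path from the file name in a URL, ignoring any query string.
// Returns null when the URL has no usable file name.
function local_path_for(string $url, string $dir): ?string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null; // URL has no path component or is malformed
    }
    $name = basename($path);
    if ($name === '') {
        return null;
    }
    return rtrim($dir, '/') . '/' . $name;
}
```

The result can then be passed as the second argument to copy(), for example `copy($url, local_path_for($url, 'downloads'))` after checking that it is not null.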

I previously posted an image download function on the forum. Here it is as well.
PHP code:
/* Function for saving a remote image to a local file */
function getimg($url, $filename) {
    /* Stop if the image URL is empty */
    if ($url == "") {
        return false;
    }
    /* Get the image extension (including the dot) and store it in $ext */
    $ext = strrchr($url, ".");
    /* Accept only .gif and .jpg files */
    if ($ext != ".gif" && $ext != ".jpg") {
        return false;
    }
    /* Read the image */
    $img = file_get_contents($url);
    if ($img === false) {
        return false;
    }
    /* Open the target file for binary writing */
    $fp = @fopen($filename . $ext, "wb");
    if ($fp === false) {
        return false;
    }
    /* Write the image data to the file */
    fwrite($fp, $img);
    /* Close the file */
    fclose($fp);
    /* Return the new image file name */
    return $filename . $ext;
}

Some lessons learned from collecting:

1. Avoid sites with anti-leech (hotlink) protection. You can get around it by forging the referer, but collecting from such sites costs too much effort.

2. Collect from sites that respond as quickly as possible. It is best to run the collection on a local machine.

3. Often you can store part of the data in the database first and process it further later.

4. Make sure to handle errors during collection. I usually skip an item if collecting it fails three times. Before I did this, the program would often get stuck for a long time on a single piece of content that could not be collected.

5. Validate carefully before inserting into the database: check that the content is valid and filter out unnecessary strings.
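Tips 1, 4, and 5 can be sketched in a few small helpers. This is a minimal sketch under stated assumptions: the names `referer_context`, `fetch_with_retry`, and `clean_content` are hypothetical, the 10-second timeout is an arbitrary value, and stripping tags is only one possible meaning of "filter out unnecessary strings".

```php
<?php
// Tip 1: a stream context that sends a Referer header with the request.
function referer_context(string $referer)
{
    return stream_context_create([
        'http' => [
            'header'  => "Referer: $referer\r\n",
            'timeout' => 10, // seconds; arbitrary value for this sketch
        ],
    ]);
}

// Tip 4: call $fetch up to $maxTries times. Returns null when every
// attempt fails, so the caller can skip the item and move on.
function fetch_with_retry(callable $fetch, string $url, int $maxTries = 3): ?string
{
    for ($try = 1; $try <= $maxTries; $try++) {
        $body = $fetch($url);
        if ($body !== false && $body !== '') {
            return $body;
        }
    }
    return null;
}

// Tip 5: strip tags and surrounding whitespace, and reject content
// that ends up empty, before it goes into the database.
function clean_content(string $raw): ?string
{
    $text = trim(strip_tags($raw));
    return $text === '' ? null : $text;
}
```

To combine them, pass a closure such as `function ($u) use ($ctx) { return @file_get_contents($u, false, $ctx); }` as `$fetch`, where `$ctx` comes from `referer_context()`.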
