PHP collection tips

Source: Internet
Author: User
1. Obtain the source code of the remote file (file_get_contents or fopen).
2. Analyze the code to extract the content you want (regular expression matching is used here, and paging usually has to be handled as well).
3. Based on the extracted content, download files, store the data in the database, and perform any other operations.

The second step may have to be repeated several times. For example, you may first need to analyze the paging addresses and then analyze the content of each inner page to get what you want.
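The two passes described above can be sketched as follows. This is only an illustration under assumptions: the helper names extractPageLinks and extractSwfName, the /item/N.html link format, and the sample strings are hypothetical, and fixed strings stand in for pages that would normally be fetched with file_get_contents.

```php
<?php
// Hypothetical helper: pull the inner-page links out of a listing page.
// The pattern assumes links look like <a href="/item/123.html">; adjust it to the real markup.
function extractPageLinks(string $listHtml): array
{
    preg_match_all('/<a\s+href="(\/item\/\d+\.html)"/i', $listHtml, $matches);
    // Drop duplicate links and reindex the array.
    return array_values(array_unique($matches[1]));
}

// Hypothetical helper: pull the wanted value out of one inner page,
// here the SWF name from a "var url" assignment.
function extractSwfName(string $pageHtml): ?string
{
    if (preg_match('/var url\s*=\s*"gameswf\/(.*?)\.swf";/is', $pageHtml, $m)) {
        return $m[1];
    }
    return null;
}

// Pass 1: listing page -> inner-page addresses.
$listHtml = '<a href="/item/1.html">A</a> <a href="/item/2.html">B</a> <a href="/item/1.html">A again</a>';
$links = extractPageLinks($listHtml);

// Pass 2: each inner page -> content. In real use each page would be fetched
// with file_get_contents(); a fixed string stands in for it here.
$pageHtml = 'var url = "gameswf/tetris.swf";';
$name = extractSwfName($pageHtml);
```

Keeping each pass in its own function makes it easy to repeat the second step for every address found in the first.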
I will use a piece of code I posted earlier as a simple example.
PHP code:
$nl = @file_get_contents($rs['URL']); // capture remote content
preg_match_all('/var url\s*=\s*"gameswf\/(.*?)\.swf";/is', $nl, $connect); // regular expression match to extract the desired content
mysql_query("insert... insert database"); // store the result (the mysql_* functions were removed in PHP 7; use mysqli or PDO there)
The code above is all that is needed for collection. Of course, you can also use fopen; I personally prefer file_get_contents.
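For comparison, here is a rough sketch of the fopen route. The function name fetchWithFopen and the 10-second timeout are assumptions, not part of the original code, and the demo reads a temporary local file so that it runs without network access.

```php
<?php
// Read a whole resource with fopen(); works for local paths and, when
// allow_url_fopen is enabled, for http:// URLs as well.
// Returns the content as a string, or false if the resource cannot be opened.
function fetchWithFopen(string $source, int $timeoutSeconds = 10)
{
    // The 'http' options only apply to HTTP URLs and are ignored for local files.
    $context = stream_context_create(['http' => ['timeout' => $timeoutSeconds]]);
    $fp = @fopen($source, 'rb', false, $context);
    if ($fp === false) {
        return false;
    }
    $data = '';
    // Read in chunks until the end of the stream.
    while (!feof($fp)) {
        $chunk = fread($fp, 8192);
        if ($chunk === false) {
            break;
        }
        $data .= $chunk;
    }
    fclose($fp);
    return $data;
}

// Demo against a temporary local file.
$tmp = tempnam(sys_get_temp_dir(), 'col');
file_put_contents($tmp, 'var url = "gameswf/demo.swf";');
$content = fetchWithFopen($tmp);
unlink($tmp);
```

The fopen version is longer, which is one reason to prefer file_get_contents for simple cases; reading in chunks can still be useful when a page is large.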
Next, I will share my method for downloading images and Flash files to the local machine. It is very simple and takes only a couple of lines of code.
PHP code:
if (@copy($url, $newurl)) {
    echo 'OK';
}


I once posted an image download function on the forum. Here it is for you to use as well.
PHP code:
/* Function for saving images */
function getimg($url, $filename) {
    /* If the image URL is empty, stop the function */
    if ($url == "") {
        return false;
    }
    /* Get the image extension and store it in $ext */
    $ext = strrchr($url, ".");
    /* Check whether the image type is allowed */
    if ($ext != ".gif" && $ext != ".jpg") {
        return false;
    }
    /* Read the image */
    $img = file_get_contents($url);
    if ($img === false) {
        return false;
    }
    /* Open the target file for binary writing (the mode was missing in the source; "wb" is assumed) */
    $fp = @fopen($filename . $ext, "wb");
    if ($fp === false) {
        return false;
    }
    /* Write the image to the file */
    fwrite($fp, $img);
    /* Close the file */
    fclose($fp);
    /* Return the new image file name */
    return $filename . $ext;
}
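The extension check in getimg relies on strrchr, so a URL with a query string or an uppercase extension is rejected. One possible, more tolerant variant is sketched below. The helper name imageExtension and the example.com URLs are hypothetical; this is not part of the original function.

```php
<?php
// Hypothetical helper: return a normalized image extension (with leading dot)
// for a URL, or null if the extension is not in the allowed list.
function imageExtension(string $url, array $allowed = ['gif', 'jpg']): ?string
{
    // Look only at the path part, so "?size=large" and similar are ignored.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    // Lowercase the extension so ".JPG" and ".jpg" are treated alike.
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return in_array($ext, $allowed, true) ? '.' . $ext : null;
}

$a = imageExtension('http://example.com/pics/Logo.JPG?size=large');
$b = imageExtension('http://example.com/pics/anim.gif');
$c = imageExtension('http://example.com/page.php');
```

A check like this only looks at the name in the URL; it does not verify that the downloaded bytes really are an image.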


Some of my collection experience:
1. Do not collect from sites that use anti-leech (hotlink) protection. You can get around it by faking the request source, but the cost of collecting from such sites is too high.
2. Collect from sites that respond as quickly as possible, and preferably run the collection locally.
3. Some data can be stored in the database first and processed further later.
4. Make sure to handle errors during collection. I usually skip an item if collecting it fails three times. In the past, the job often got stuck for a long time because a single piece of content could not be collected.
5. Validate the data before storing it in the database: check that the content is valid and filter out unnecessary strings.
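Tips 4 and 5 can be sketched roughly as follows. The names fetchWithRetry and cleanContent are hypothetical, and the validity rule (strip tags, reject empty text) is only an assumed example. The fetcher is passed in as a callable so that the demo can run with a stub instead of a real network request.

```php
<?php
// Try a fetch up to $maxAttempts times; return null so the caller can skip the item.
// $fetch is any callable returning the content string, or false on failure.
function fetchWithRetry(callable $fetch, string $url, int $maxAttempts = 3): ?string
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $result = $fetch($url);
        if ($result !== false && $result !== '') {
            return $result;
        }
    }
    return null; // give up on this item instead of blocking the whole run
}

// Hypothetical validity check before storing: strip tags and surrounding
// whitespace, and reject content that ends up empty.
function cleanContent(string $raw): ?string
{
    $text = trim(strip_tags($raw));
    return $text === '' ? null : $text;
}

// Demo with a stub fetcher that fails twice, then succeeds.
$calls = 0;
$flaky = function (string $url) use (&$calls) {
    $calls++;
    return $calls < 3 ? false : '<p> hello </p>';
};
$fetched = fetchWithRetry($flaky, 'http://example.com/page');
$cleaned = $fetched === null ? null : cleanContent($fetched);

// A fetcher that always fails is skipped after three attempts.
$failCalls = 0;
$alwaysFails = function (string $url) use (&$failCalls) {
    $failCalls++;
    return false;
};
$skipped = fetchWithRetry($alwaysFails, 'http://example.com/broken');
```

In real use the callable could simply wrap file_get_contents, for example function ($u) { return @file_get_contents($u); }.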
