PHP file_get_contents: data collection and solutions to common problems
In PHP, the file_get_contents function is rarely used for batch data collection. For a small amount of data, however, it is a good choice because it is easy to use and easy to learn. This article introduces the usage of file_get_contents and solutions to its common problems.
Let's look at the common problems first.
file_get_contents cannot fetch a URL that contains a port
For example:
The code is as follows:

    file_get_contents('http://localhost:100');

Nothing is returned.
Solution: disable SELinux.
1. Permanent method (server restart required)
Set SELINUX=disabled in the /etc/selinux/config file, and then restart the server.
2. Temporary method (set a system parameter)
Run the setenforce 0 command. This setting lasts only until the next reboot.
Appendix:
setenforce 1 sets SELinux to enforcing mode
setenforce 0 sets SELinux to permissive mode
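Before changing SELinux settings, it can help to confirm why the request fails. The following is a minimal sketch, assuming a hypothetical helper named fetch_with_diagnostics (it is not part of PHP or of the original article). It relies only on the standard functions file_get_contents, error_clear_last (PHP 7.0 or later), and error_get_last. The warning text it reports usually hints at the cause, such as a refused connection or a denied permission.

```php
<?php
// Hypothetical helper (an assumption for illustration): fetch a URL or file
// and report the last PHP warning when the read fails.
function fetch_with_diagnostics(string $url): array
{
    error_clear_last();                 // forget any earlier error
    $body = @file_get_contents($url);   // @ hides the warning from the output
    if ($body === false) {
        $err = error_get_last();        // still available despite the @
        return array(
            'ok'    => false,
            'body'  => null,
            'error' => $err !== null ? $err['message'] : 'unknown error',
        );
    }
    return array('ok' => true, 'body' => $body, 'error' => null);
}

// Usage with the URL from the example above.
$result = fetch_with_diagnostics('http://localhost:100');
if (!$result['ok']) {
    echo "Request failed: ", $result['error'], "\n";
}
```

Comparing the result with === false matters here, because an empty response body is a valid result and must not be treated as a failure.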
file_get_contents timeout
The code is as follows:

    function _file_get_contents($url) {
        $context = stream_context_create(array(
            'http' => array(
                'timeout' => 180 // timeout, in seconds
            )
        ));
        // The second argument is use_include_path, so pass false
        return @file_get_contents($url, false, $context);
    }
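A timeout only limits how long a single request waits, so a slow server can still make the call fail. One common complement is to retry a few times. The sketch below assumes a hypothetical helper named fetch_with_retry, which is not from the original article. It takes the fetch function as a parameter so that the retry logic can be tried without a network connection.

```php
<?php
// Hypothetical helper (an assumption for illustration): call $fetch up to
// $maxAttempts times and return the first result that is not false.
function fetch_with_retry(callable $fetch, string $url, int $maxAttempts = 3)
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $result = $fetch($url);
        if ($result !== false) {
            return $result;   // success, stop retrying
        }
    }
    return false;             // every attempt failed
}

// A fetch function that applies an HTTP timeout through a stream context.
$fetchWithTimeout = function (string $url) {
    $context = stream_context_create(array(
        'http' => array('timeout' => 5), // seconds, an arbitrary example value
    ));
    return @file_get_contents($url, false, $context);
};

// Usage (not executed here because it needs a reachable server):
// $html = fetch_with_retry($fetchWithTimeout, 'http://localhost:100');
```

With a 5-second timeout and three attempts, the worst case is roughly 15 seconds of waiting, so the two values should be chosen together.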
After the above problems are solved, we can start to collect data.
The code is as follows:

    <?php
    // Country level: REQUEST_URI does not contain ".html"
    if (strpos($_SERVER["REQUEST_URI"], ".html") === false) {
        $page = "http://qq.ip138.com/weather";
        $html = file_get_contents($page);
        // The marker text in each pattern must match the HTML of the target page;
        // this pattern also needs a closing marker after (.*?)
        $pattern = "/Online Query of weather trend forecasts for major cities, counties, and the next five days(.*?)/si";
        // Match the HTML between the markers
        preg_match($pattern, $html, $pg);
        // Replace the remote address with the local address
        $p = preg_replace('/\/weather\/(\w+)\/index\.htm/', 'tq.php/$1.html', $pg[1]);
        echo $p;
    }
    // Province level: REQUEST_URI does not contain "?"
    else if (strpos($_SERVER["REQUEST_URI"], "?") === false) {
        // Use the splitting method recommended by yoyo to get the province name
        $province = explode("/", $_SERVER["REQUEST_URI"]);
        $province = explode(".", $province[count($province) - 1]);
        $province = $province[0];
        // The commented lines are a regular expression I wrote myself. It is not
        // well written, but the effect is equivalent to the above.
        // preg_match('/[^\/]+[.(html)]$/', $_SERVER["REQUEST_URI"], $pro);
        // $province = preg_replace('/.html/', '', $pro[0]);
        $page = "http://qq.ip138.com/weather/" . $province . "/index.htm";
        // Try to open the page before fetching the HTML, to avoid errors
        // caused by a malicious address
        if (!@fopen($page, "r")) {
            die("Sorry, this address does not exist! Click here to return");
        }
        $html = file_get_contents($page);
        $pattern = "/five-day weather trend forecast(.*?)Enter City/si";
        preg_match($pattern, $html, $pg);
        // Capture the province and city for the replacement
        $p = preg_replace('/\/weather\/(\w+)\/(\w+)\.htm/', '$2.html?pro=$1', $pg[1]);
        echo $p;
    }
    else {
        // City level: the province is passed through GET
        $pro = $_REQUEST['pro'];
        $city = explode("/", $_SERVER["REQUEST_URI"]);
        $city = explode(".", $city[count($city) - 1]);
        $city = $city[0];
        // preg_match('/[^\/]+[.(html)]+[?]/', $_SERVER["REQUEST_URI"], $cit);
        // $city = preg_replace('/.html?/', '', $cit[0]);
        $page = "http://qq.ip138.com/weather/" . $pro . "/" . $city . ".htm";
        if (!@fopen($page, "r")) {
            die("Sorry, this address does not exist! Click here to return");
        }
        $html = file_get_contents($page);
        $pattern = "/five-day weather trend forecast(.*?)Enter City/si";
        preg_match($pattern, $html, $pg);
        // Point image paths at the real remote address
        $p = preg_replace('/\/image\//', 'http://qq.ip138.com/image/', $pg[1]);
        echo $p;
    }
    ?>
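The two string operations that the collection script depends on, taking a name from the last segment of REQUEST_URI and rewriting remote links to local ones, can be tried in isolation. This is a minimal sketch. The function names last_segment_name and rewrite_province_links, and the sample inputs, are assumptions made for illustration and do not appear in the original script.

```php
<?php
// Hypothetical helper: return the last path segment up to its first dot,
// e.g. the province or city name in a request URI.
function last_segment_name(string $requestUri): string
{
    $segments = explode('/', $requestUri);
    $last = $segments[count($segments) - 1];
    return explode('.', $last)[0];
}

// Hypothetical helper: rewrite remote province links to local tq.php links.
function rewrite_province_links(string $html): string
{
    return preg_replace(
        '/\/weather\/(\w+)\/index\.htm/', // remote link, province captured as $1
        'tq.php/$1.html',                 // local link built from the capture
        $html
    );
}

echo last_segment_name('/tq.php/guangdong.html'), "\n";
echo rewrite_province_links('<a href="/weather/guangdong/index.htm">'), "\n";
```

Because the name is cut at the first dot, a query string after the file name, such as ?pro=guangdong, does not affect the result.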
If the above method cannot collect the data, we can use cURL instead.
The code is as follows:

    <?php
    $url = "http://www.bKjia.c0m";
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    // For pages that require user authentication, add the following two lines:
    // curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
    // curl_setopt($ch, CURLOPT_USERPWD, US_NAME . ":" . US_PWD);
    $contents = curl_exec($ch);
    curl_close($ch);
    echo $contents;
    ?>
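When cURL fails, curl_exec returns false, and the reason can be read with curl_errno and curl_error. The sketch below wraps this in a hypothetical function named fetch_with_curl, which is an assumption for illustration and not part of the original article. The total timeout of twice the connection timeout is an arbitrary example value.

```php
<?php
// Hypothetical helper (an assumption for illustration): fetch a URL with
// cURL and return the body, or null when the request fails.
function fetch_with_curl(string $url, int $timeout = 5): ?string
{
    if (!function_exists('curl_init')) {
        return null; // the cURL extension is not loaded
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // return the body as a string
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // limit for connecting
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout * 2);    // limit for the whole request
    $contents = curl_exec($ch);
    if ($contents === false) {
        // Log the error number and message instead of echoing them
        error_log('cURL error ' . curl_errno($ch) . ': ' . curl_error($ch));
        return null;
    }
    return $contents; // the handle is released when $ch goes out of scope
}

// Usage (not executed here because it needs a reachable server):
// $contents = fetch_with_curl('http://localhost:100');
// if ($contents !== null) { echo $contents; }
```

Returning null on failure keeps an empty but valid response distinguishable from an error.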
...