Example code 1: Get content with file_get_contents (GET)
The code is as follows:
<?php
// ec(), printhr() and printarr() are simple output helpers (printarr is defined at the end of this article)
$url = 'http://www.baidu.com/';
$html = file_get_contents($url);
print_r($http_response_header);
ec($html);
printhr();
printarr($http_response_header);
printhr();
?>
Example code 2: Open a URL with fopen and get the content (GET)
The code is as follows:
<?php
$url = 'http://www.baidu.com/';
$fp = fopen($url, 'r');
printarr(stream_get_meta_data($fp));
printhr();
$result = '';
while (!feof($fp)) {
    $result .= fgets($fp, 1024);
}
echo "url body: $result";
printhr();
fclose($fp);
?>
Example code 3: Fetch a URL by POST using the file_get_contents function
The code is as follows:
<?php
$data = array('foo' => 'bar');
$data = http_build_query($data);
$opts = array(
    'http' => array(
        'method'  => 'POST',
        'header'  => "Content-Type: application/x-www-form-urlencoded\r\n" .
                     "Content-Length: " . strlen($data) . "\r\n",
        'content' => $data,
    ),
);
$context = stream_context_create($opts);
$html = file_get_contents('http://localhost/e/admin/test.html', false, $context);
echo $html;
?>
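After a file_get_contents() call that goes over HTTP, PHP populates the local variable $http_response_header (already used in example 1), so the response to the POST above can be inspected; a minimal sketch:
$html = file_get_contents('http://localhost/e/admin/test.html', false, $context);
if ($html !== false) {
    print_r($http_response_header); // status line and response headers of the POST
}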
Example code 4: Open a URL with fsockopen and get the full response in GET mode, including headers and body
The code is as follows:
<?php
function get_url($url, $cookie = false) {
    $url = parse_url($url);
    $query = $url['path'] . "?" . $url['query'];
    ec("Query: " . $query);
    $fp = fsockopen($url['host'], isset($url['port']) ? $url['port'] : 80, $errno, $errstr, 30);
    if (!$fp) {
        return false;
    } else {
        $request  = "GET $query HTTP/1.1\r\n";
        $request .= "Host: {$url['host']}\r\n";
        $request .= "Connection: close\r\n";
        if ($cookie) $request .= "Cookie: $cookie\r\n";
        $request .= "\r\n";
        fwrite($fp, $request);
        $result = '';
        while (!@feof($fp)) {
            $result .= @fgets($fp, 1024);
        }
        fclose($fp);
        return $result;
    }
}
// Get the HTML body of the URL, stripping the response headers
function geturlhtml($url, $cookie = false) {
    $rowdata = get_url($url, $cookie);
    if ($rowdata) {
        $body = stristr($rowdata, "\r\n\r\n");
        $body = substr($body, 4, strlen($body));
        return $body;
    }
    return false;
}
?>
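A minimal usage sketch for the two functions above (the URL is just a placeholder):
$body = geturlhtml('http://www.baidu.com/');
if ($body !== false) {
    echo $body;
}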
Example code 5: Open a URL with fsockopen and get the full response in POST mode, including headers and body
The code is as follows:
<?php
function http_post($url, $data, $cookie, $referrer = "") {
    // Parse the given URL
    $url_info = parse_url($url);
    // Build the referrer; if none is given, fall back to a placeholder
    if ($referrer == "")
        $referrer = "111";
    // Build the query string from $data
    foreach ($data as $key => $value)
        $values[] = "$key=" . urlencode($value);
    $data_string = implode("&", $values);
    // Find out which port is needed; if not given, use the standard (80)
    if (!isset($url_info["port"]))
        $url_info["port"] = 80;
    // Build the POST request:
    $request  = "POST " . $url_info["path"] . " HTTP/1.1\r\n";
    $request .= "Host: " . $url_info["host"] . "\r\n";
    $request .= "Referer: $referrer\r\n";
    $request .= "Content-Type: application/x-www-form-urlencoded\r\n";
    $request .= "Content-Length: " . strlen($data_string) . "\r\n";
    $request .= "Connection: close\r\n";
    $request .= "Cookie: $cookie\r\n";
    $request .= "\r\n";
    $request .= $data_string . "\r\n";
    $fp = fsockopen($url_info["host"], $url_info["port"]);
    fputs($fp, $request);
    $result = '';
    while (!feof($fp)) {
        $result .= fgets($fp, 1024);
    }
    fclose($fp);
    return $result;
}
printhr();
?>
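A minimal usage sketch for http_post(); the target URL and form field here are placeholders, not from the original:
$result = http_post('http://localhost/e/admin/test.html', array('foo' => 'bar'), '');
echo $result;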
Example code 6: Using the cURL library. Before using it, you may need to check php.ini to see whether the curl extension has been enabled.
The code is as follows:
<?php
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, 'http://www.baidu.com/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$file_contents = curl_exec($ch);
curl_close($ch);
echo $file_contents;
?>
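Since the curl extension may be disabled, a small guard (a sketch) avoids a fatal undefined-function error before calling curl_init():
if (!function_exists('curl_init')) {
    die('The curl extension is not enabled; check php.ini.');
}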
About the cURL library:
cURL official website: http://curl.haxx.se/
curl is a command-line tool for transferring files with URL syntax, supporting FTP, FTPS, HTTP, HTTPS, SCP, SFTP, TFTP, TELNET, DICT, FILE, and LDAP. curl supports SSL certificates, HTTP POST, HTTP PUT, FTP upload, Kerberos, HTTP form-based upload, proxies, cookies, user+password authentication, file transfer resume, HTTP proxy tunneling, and a number of other useful tricks.
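Since curl supports HTTP POST, the POST from example 3 can also be done through the curl library; a minimal sketch reusing the same placeholder URL and form field:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://localhost/e/admin/test.html');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array('foo' => 'bar')));
$response = curl_exec($ch);
curl_close($ch);
echo $response;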
The code is as follows:
<?php
function printarr(array $arr)
{
    echo "<br> Row field count: " . count($arr) . " <br>";
    foreach ($arr as $key => $value) {
        echo "$key = $value <br>";
    }
}
?>
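The examples above also call ec() and printhr(), which this article never defines; plausible minimal implementations (an assumption, not from the original source) would be:
<?php
// Hypothetical helpers assumed by the examples above
function ec($str)
{
    echo $str . "<br>\n"; // echo with a line break
}
function printhr()
{
    echo "<hr>\n"; // print a horizontal rule
}
?>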
======================================================
Code for crawling remote web site data with PHP
Many programming enthusiasts run into the same question: how to crawl the HTML of other people's web sites, the way a search engine does, and then extract useful data from the collected code. Here are some simple examples.
Ⅰ. Example: crawling a remote page title:
Here is the code snippet:
The code is as follows:
<?php
/*
+-------------------------------------------------------------
+ Code to crawl a page title; copy this snippet directly and save it as a .php file to run.
+-------------------------------------------------------------
*/
error_reporting(7);
$file = fopen("http://www.dnsing.com/", "r");
if (!$file) {
    echo "Unable to open remote file.\n";
    exit;
}
while (!feof($file)) {
    $line = fgets($file, 1024);
    // preg_match replaces the deprecated eregi() used in the original
    if (preg_match("/<title>(.*)<\/title>/i", $line, $out)) {
        $title = $out[1];
        echo $title;
        break;
    }
}
fclose($file);
// End
?>
Ⅱ. Example: crawling a remote web page's HTML code:
Here is the code snippet:
The code is as follows:
<?php
/*
+----------------
+ dnsing Sprider
+----------------
*/
$fp = fsockopen("www.dnsing.com", 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    $out  = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.dnsing.com\r\n";
    $out .= "Connection: close\r\n\r\n";
    fputs($fp, $out);
    while (!feof($fp)) {
        echo fgets($fp, 128);
    }
    fclose($fp);
}
// End
?>
Copy either of the two snippets above and run it to see the effect. These examples are only the embryonic form of crawling web data; to make them fit your own purposes, every situation is different, so it is left to program enthusiasts to study further.
===============================
Slightly more meaningful functions are get_content_by_socket(), get_url(), get_content_url(), and get_content_object(); perhaps they will give you something to think about.
The code is as follows:
<?php
// Fetch all content URLs and save them to a file
function get_index($save_file, $prefix = "index_") {
    $count = 68;
    $i = 1;
    if (file_exists($save_file)) @unlink($save_file);
    $fp = fopen($save_file, "a+") or die("Open " . $save_file . " failed");
    while ($i < $count) {
        $url = $prefix . $i . ".htm";
        echo "Get " . $url . "...";
        // NOTE: get_content_url() below expects ($host_url, $file_contents); the original called it with one argument
        $url_str = get_content_url(get_url($url));
        echo "OK\n";
        fwrite($fp, $url_str);
        ++$i;
    }
    fclose($fp);
}
// Fetch the target multimedia objects
function get_object($url_file, $save_file, $split = "|--:* *:--|") {
    if (!file_exists($url_file)) die($url_file . " does not exist");
    $file_arr = file($url_file);
    if (!is_array($file_arr) || empty($file_arr)) die($url_file . " has no content");
    $url_arr = array_unique($file_arr);
    if (file_exists($save_file)) @unlink($save_file);
    $fp = fopen($save_file, "a+") or die("Open save file " . $save_file . " failed");
    foreach ($url_arr as $url) {
        if (empty($url)) continue;
        echo "Get " . $url . "...";
        $html_str = get_url($url);
        // Debug output left in the original; remove the next three lines to process every URL
        echo $html_str;
        echo $url;
        exit;
        $obj_str = get_content_object($html_str);
        echo "OK\n";
        fwrite($fp, $obj_str);
    }
    fclose($fp);
}
// Traverse a directory and collect objects from the file contents
function get_dir($save_file, $dir) {
    $dp = opendir($dir);
    if (file_exists($save_file)) @unlink($save_file);
    $fp = fopen($save_file, "a+") or die("Open save file " . $save_file . " failed");
    while (($file = readdir($dp)) !== false) {
        if ($file != "." && $file != "..") {
            echo "Read file " . $file . "...";
            $file_content = file_get_contents($dir . $file);
            $obj_str = get_content_object($file_content);
            echo "OK\n";
            fwrite($fp, $obj_str);
        }
    }
    fclose($fp);
}
// Fetch the content of the specified URL
function get_url($url) {
    $reg = '/^http:\/\/[^\/].+$/';
    if (!preg_match($reg, $url)) die($url . " is invalid");
    $fp = fopen($url, "r") or die("Open url: " . $url . " failed.");
    $content = '';
    while ($fc = fread($fp, 8192)) {
        $content .= $fc;
    }
    fclose($fp);
    if (empty($content)) {
        die("Get url: " . $url . " content failed.");
    }
    return $content;
}
// Fetch the specified page over a raw socket
function get_content_by_socket($url, $host) {
    $fp = fsockopen($host, 80) or die("Open " . $url . " failed");
    $header  = "GET /" . $url . " HTTP/1.1\r\n";
    $header .= "Accept: */*\r\n";
    $header .= "Accept-Language: zh-cn\r\n";
    $header .= "Accept-Encoding: gzip, deflate\r\n";
    $header .= "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Maxthon; InfoPath.1; .NET CLR 2.0.50727)\r\n";
    $header .= "Host: " . $host . "\r\n";
    $header .= "Cookie: cnzz02=2; rtime=1; ltime=1148456424859; cnzz_eid=56601755-\r\n";
    $header .= "Connection: close\r\n\r\n";
    fwrite($fp, $header);
    $contents = '';
    while (!feof($fp)) {
        $contents .= fgets($fp, 8192);
    }
    fclose($fp);
    return $contents;
}
// Extract the URLs contained in the given content
function get_content_url($host_url, $file_contents) {
    // Alternative patterns kept from the original (only the last $reg takes effect):
    // $reg = '/^(#|javascript.*?|ftp:\/\/.+|http:\/\/.+|.*?href.*?|play.*?|index.*?|.*?asp)+$/i';
    // $reg = '/^(down.*?\.html|\d+_\d+\.htm.*?)$/i';
    $rex = "/([hH][rR][eE][fF])\s*=\s*['\"]*([^>'\"\s]+)[\"'>]*\s*/i";
    $reg = '/^(down.*?\.html)$/i';
    preg_match_all($rex, $file_contents, $r);
    $result = ""; // array();
    foreach ($r as $c) {
        if (is_array($c)) {
            foreach ($c as $d) {
                if (preg_match($reg, $d)) { $result .= $host_url . $d . "\n"; }
            }
        }
    }
    return $result;
}
// Extract the multimedia files from the given content
function get_content_object($str, $split = "|--:* *:--|") {
    // The <b> tags here were stripped by the original page's formatting and are reconstructed
    $regx = "/href\s*=\s*['\"]*([^>'\"\s]+)[\"'>]*\s*(<b>.*?<\/b>)/i";
    preg_match_all($regx, $str, $result);
    if (count($result) == 3) {
        $result[2] = str_replace("<b>Multimedia: ", "", $result[2]);
        $result[2] = str_replace("</b>", "", $result[2]);
        $result = $result[1][0] . $split . $result[2][0] . "\n";
    }
    return $result;
}
?>
======================================================
PHP: getting remote web page content when the same domain name corresponds to multiple IPs
file_get_contents (FGC for short) simply reads the page, with everything encapsulated for you.
fopen also does some encapsulation, but requires you to loop to read all the data yourself.
fsockopen is the bare socket operation.
If you just need to read an HTML page, file_get_contents is the better choice.
If the company reaches the internet through a firewall, the plain file_get_contents approach generally will not work. It is also possible to write HTTP requests to a proxy directly through socket operations, but it is cumbersome.
If you can confirm the file is small, either of the two approaches works: fopen, or join('', file($file)). For example, if you only handle files smaller than 1 KB, file_get_contents is best.
If the file is large, or its size cannot be determined, it is best to use a file stream. fopen-ing a 1 KB file and fopen-ing a 1 GB file makes no obvious difference; long content simply takes longer to read, rather than killing the script.
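For the large-file case, a chunked stream read (a sketch, using the same placeholder URL as the earlier examples) keeps memory use flat instead of accumulating the whole body:
$fp = fopen('http://www.baidu.com/', 'r');
while (!feof($fp)) {
    $chunk = fread($fp, 8192); // handle 8 KB at a time
    // ... process $chunk here instead of buffering everything ...
}
fclose($fp);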
----------------------------------------------------
Http://www.phpcake.cn/archives/tag/fsockopen
PHP can fetch remote web page content in several ways, for example with the built-in file_get_contents and fopen functions.
The code is as follows:
<?php
echo file_get_contents("http://blog.s135.com/abc.php");
?>
However, under load balancing such as DNS round-robin, the same domain name may correspond to multiple servers and multiple IPs. Suppose blog.s135.com resolves to the three IPs 72.249.146.213, 72.249.146.214 and 72.249.146.215; each time a user visits blog.s135.com, one of the servers is chosen by the load-balancing algorithm.
Last week, while working on a video project, I met a requirement: call a PHP interface program (say, abc.php) on each server in turn and query that server's transfer status.
Accessing http://blog.s135.com/abc.php directly with file_get_contents will not do, because it may keep hitting a single server.
Accessing http://72.249.146.213/abc.php, http://72.249.146.214/abc.php and http://72.249.146.215/abc.php in turn does not work either when the web server on each of the three machines hosts multiple virtual hosts.
Setting the local hosts file does not help either, because hosts cannot map the same domain name to multiple IPs.
The only way is through PHP and the HTTP protocol directly: when accessing abc.php, add the blog.s135.com host name to the request header. So I wrote the following PHP function:
The code is as follows:
<?php
/************************
 * Purpose: when the same domain name corresponds to multiple IPs, fetch the remote page content from the specified server
 * Parameters:
 * $ip   The IP address of the server
 * $host The host name of the server
 * $url  The URL path on the server (excluding the domain name)
 * Return value:
 * The fetched remote page content, or
 * false if the access failed
 ************************/
function httpvisit($ip, $host, $url)
{
    $errstr = '';
    $errno = '';
    $fp = fsockopen($ip, 80, $errno, $errstr, 90);
    if (!$fp)
    {
        return false;
    }
    else
    {
        $out  = "GET {$url} HTTP/1.1\r\n";
        $out .= "Host: {$host}\r\n";
        $out .= "Connection: close\r\n\r\n";
        fputs($fp, $out);
        $response = '';
        while ($line = fread($fp, 4096)) {
            $response .= $line;
        }
        fclose($fp);
        // Strip the response headers
        $pos = strpos($response, "\r\n\r\n");
        $response = substr($response, $pos + 4);
        return $response;
    }
}
// Call method:
$server_info1 = httpvisit("72.249.146.213", "blog.s135.com", "/abc.php");
$server_info2 = httpvisit("72.249.146.214", "blog.s135.com", "/abc.php");
$server_info3 = httpvisit("72.249.146.215", "blog.s135.com", "/abc.php");
?>
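For reference, the same per-IP request can also be made with the curl library from example 6, pointing curl at the IP while overriding the Host header (a sketch, not part of the original article):
$ch = curl_init('http://72.249.146.213/abc.php');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Host: blog.s135.com'));
$server_info1 = curl_exec($ch);
curl_close($ch);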