Help: PHP can't fetch a web page; I asked several people and nobody could solve it

Source: Internet
Author: User
This post was last edited by dz215136304 on 2013-06-11 11:35:47

The URL has to be the one in the code below. In my tests, fetching fails whenever the value of the q parameter contains a space: "&" appears to be converted to "&amp;" automatically, so no data comes back. Entering the same URL directly in a browser does return the content. How can I fix this?
$url = "http://110.75.65.8/search_turn_page_iphone.htm?sort=&q=liz claiborne&page=1&showmode=list";
echo Post($url);

function Post($url, $post = null) // request the web page
{
    $context = array();
    if (is_array($post)) {
        ksort($post);
        $context['http'] = array(
            'timeout' => 60,
            'method'  => 'POST',
            'header'  => '>accept-language:en/r/n',
            'content' => http_build_query($post, '', '&'),
        );
    }
    return file_get_contents($url, false, stream_context_create($context));
}


Error message:
Warning: file_get_contents(http://110.75.65.8/search_turn_page_iphone.htm?sort=&q=liz claiborne&page=1&showmode=list) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.1 505 HTTP Version not supported in F:\wwwroot\getTaobao\test.php on line 25


Replies and discussion (solution)

First, take a look at HTML character entities.


file_get_contents: reads the entire file into a string


Description

string file_get_contents ( string $filename [, bool $use_include_path [, resource $context [, int $offset [, int $maxlen ]]]] )

Identical to file(), except that file_get_contents() returns the file as a string, starting at the position given by offset and reading up to maxlen bytes. On failure, file_get_contents() returns FALSE.

file_get_contents() is the preferred way to read the contents of a file into a string. If the operating system supports it, memory-mapping techniques are used to improve performance.


Note: If you are opening a URL that contains special characters (for example, a space), you need to encode it with urlencode().



Also, what is the part marked in red here?
'header' => '>accept-language:en/r/n'
The ">" is redundant, and /r/n should be \r\n.
With a malformed header, it is normal for the server to return an error (505).
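
As an illustration of the header fix described above, here is a minimal sketch of the context options for a POST request. The helper name build_post_context_options and the Content-Type header are assumptions added for the example, not part of the original code. The point is that the header string is in double quotes, so \r\n becomes a real CRLF, and there is no stray ">".

```php
<?php
// Hypothetical helper: builds the options array for stream_context_create().
function build_post_context_options(array $post): array
{
    ksort($post);
    return array(
        'http' => array(
            'timeout' => 60,
            'method'  => 'POST',
            // Double quotes, so that \r\n is an actual carriage return + line feed.
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n"
                       . "Accept-Language: en\r\n",
            // http_build_query() URL-encodes the values (a space becomes "+").
            'content' => http_build_query($post, '', '&'),
        ),
    );
}

$options = build_post_context_options(array('q' => 'liz claiborne', 'page' => 1));
echo $options['http']['content'], "\n"; // page=1&q=liz+claiborne
```

The array can then be passed to stream_context_create() and on to file_get_contents() as in the original Post() function.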

I still cannot get the data after URL-encoding. The code is as follows:

$url = "http://110.75.65.8/search_turn_page_iphone.htm?sort=&q=lizclaiborne&page=1&showMode=list";
echo Post(urlencode($url));

function Post($url, $post = null) // request the web page
{
    $context = array();
    if (is_array($post)) {
        ksort($post);
        $context['http'] = array(
            'timeout' => 60,
            'method'  => 'post',
            'header'  => 'accept-language:en\r\n',
            'content' => http_build_query($post, '', '&'),
        );
    }
    return file_get_contents($url, false, stream_context_create($context));
}

The actual error is: HTTP/1.1 505 HTTP Version not supported
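
A likely reason the attempt above does not help: urlencode() applied to the whole URL also escapes ":", "/", "?", "&" and "=", so the result is no longer a usable URL. A minimal sketch, using the URL with the space from the original question:

```php
<?php
$url = "http://110.75.65.8/search_turn_page_iphone.htm?sort=&q=liz claiborne&page=1&showmode=list";

// Encoding the entire URL escapes the scheme and query delimiters too.
$whole = urlencode($url);
echo $whole, "\n";
// http%3A%2F%2F110.75.65.8%2Fsearch_turn_page_iphone.htm%3Fsort%3D%26q%3Dliz+claiborne%26page%3D1%26showmode%3Dlist

// The string no longer starts with "http://", so it is not treated as an HTTP URL.
var_dump(strpos($whole, 'http://') === 0); // bool(false)
```

Only the parameter values should be encoded, not the URL as a whole.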

file_get_contents(str_replace(' ', '%20', $url));

It works now. There may have been a problem with his server.

$url = "http://110.75.65.8/search_turn_page_iphone.htm?sort=&q=lizclaiborne&page=1&showMode=list";
echo file_get_contents($url);
{"Result": "true", "totalpage": "+", "Catmap": "", "Ppath": "", "category": "", "AuctionTagFlag1": "", "AuctionTagFlag2": "", "AuctionTagFlag3": "", "ListItem": [
{"Name": "Group purchase price US genuine Liz Claiborne Lai-lai-Ben-female-style wallet Liz Wallet", "img": "http://q.i02.wimg.taobao.com/bao/uploaded/i1/T18ZyyXfXgXXXc8SLa_122312.jpg_90x90.jpg", "Img2": "http://q.i04.wimg.taobao.com/bao/uploaded/i1/T18zyyxfxgxxxc8sla_122312.jpg", "ISWEBP": "", "url": "http://a.m.taobao.com/i2431550873.htm?rn=BWHGEI1-ZCLPEKBBGC1LFJHM45-D1GLR8O-PUG7&SID=8B9C27255C655B1E", "Previewurl": "http://a.m.taobao.com/ajax/pre_VIEW.DO?ITEMID=2431550873&SID=8B9C27255C655B1E", "Favoriteurl": "http://fav.m.taobao.com/favorite/to_COLLECTION.HTM?ITEMNUMID=2431550873&SID=8B9C27255C655B1E",
"Icon": ["0"],
"Price": "39.00", "OriginalPrice": "39.00", "Freight": "Ten", "area": "Tianjin", "act": "Monthly Sale 1", "Itemnumid": "2431550873", "Nick": "The _2007 of the Golden Wisp",
..........

Sorry, I pasted the wrong data.
$url = "http://110.75.65.8/search_turn_page_iphone.htm?sort=&q=liz claiborne&page=1&showmode=list";
This one fails with HTTP/1.1 505 HTTP Version not supported.

These work:
$url = "http://110.75.65.8/search_turn_page_iphone.htm?sort=&q=liz+claiborne&page=1&showmode=list";
$url = "http://110.75.65.8/search_turn_page_iphone.htm?sort=&q=liz%20claiborne&page=1&showmode=list";

I don't know how his server is configured, but it does not accept data that is not URL-encoded.
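
The two working forms correspond to PHP's two built-in encoders. A short sketch (the variable name $keyword is only for illustration):

```php
<?php
$keyword = 'liz claiborne';

// urlencode() follows the form-encoding convention: a space becomes "+".
echo urlencode($keyword), "\n";    // liz+claiborne

// rawurlencode() follows RFC 3986: a space becomes "%20".
echo rawurlencode($keyword), "\n"; // liz%20claiborne
```

Either result can be placed after q= in the query string; per the test above, this server accepts both.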


Can a server accept "non-URL-encoded data" at all?
My understanding is that a server only accepts URL-encoded data.
When we open an address containing a space directly in a browser, the browser URL-encodes it automatically, so it opens normally.
PHP is not a browser, so it does not do this automatically, and the encoding has to be done manually.
Isn't that right?




A space (\x20) is a legal URL character; how it is handled is up to the server.
If you have ever written an HTTP client over a raw socket, you will know that a request whose URL contains a space is generally accepted.
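
For context, here is a sketch of what a raw space does to the first line of the request. An HTTP/1.1 request line has three space-separated fields (method, target, version), so a server that splits on spaces may read the wrong token as the version, which would be one plausible explanation for the 505. This is an assumption about how this particular server behaves, and request_line is a hypothetical helper written for the example.

```php
<?php
// Hypothetical helper: the first line an HTTP/1.1 client sends for a GET request.
function request_line(string $target): string
{
    return "GET " . $target . " HTTP/1.1";
}

$raw     = request_line('/search_turn_page_iphone.htm?sort=&q=liz claiborne&page=1&showmode=list');
$encoded = request_line('/search_turn_page_iphone.htm?sort=&q=liz%20claiborne&page=1&showmode=list');

// A strict server expects exactly three space-separated fields.
echo count(explode(' ', $raw)), "\n";     // 4
echo count(explode(' ', $encoded)), "\n"; // 3
```

Some servers are lenient about the extra field, which may be why a raw space is sometimes accepted.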



So does that mean the server can receive the query string in full, whatever characters it contains?


The correct way to write it is:
$url = "http://110.75.65.8/search_turn_page_iphone.htm?sort=&q=" . urlencode('Liz Claiborne') . "&page=1&showmode=list";
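
An alternative sketch that encodes every parameter in one step with http_build_query(), which applies the same encoding as urlencode() to each value. The variable names $base and $params are assumptions for the example.

```php
<?php
$base   = 'http://110.75.65.8/search_turn_page_iphone.htm';
$params = array(
    'sort'     => '',
    'q'        => 'liz claiborne',
    'page'     => 1,
    'showmode' => 'list',
);

// Build the query string; '&' is passed explicitly as the separator.
$url = $base . '?' . http_build_query($params, '', '&');
echo $url, "\n";
// http://110.75.65.8/search_turn_page_iphone.htm?sort=&q=liz+claiborne&page=1&showmode=list
```

Passing '&' explicitly keeps the separator from being affected by the arg_separator.output ini setting, which some configurations set to "&amp;".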




What about line breaks and the "/" character? Of course it cannot be just any character.

I have run into this problem before. My fix was to keep the "&" on its own. For example, for http://www.123.com?id=123&num=123,
write $url = 'http://www.123.com?id=123' . '&' . 'num=123'; so that the "&" is not converted when the string is evaluated.

For the encoding, you can use urlencode().
