Fetching HTTPS content with PHP
I recently ran into an HTTPS problem while studying the Hacker News API. All Hacker News API endpoints are served over encrypted HTTPS rather than plain HTTP, and calling the PHP function file_get_contents() to fetch the data they provide failed with an error. The code I used:
<?php
$data = file_get_contents("https://hacker-news.firebaseio.com/v0/topstories.json?print=pretty");
......
When you run the above code, you encounter the following error message:
PHP Warning: file_get_contents(): Unable to find the wrapper "https" - did you forget to enable it when you configured PHP?
The details are as follows.
Why does this error occur?
After searching online, I found that many people have hit the same error. The cause is straightforward: the OpenSSL extension is not enabled in the PHP configuration file, which on my machine is /apache/bin/php.ini. The relevant line is commented out:
;extension=php_openssl.dll
Remove the leading semicolon to enable the extension. When PHP runs inside a web server, the server usually has to be restarted before the change takes effect. You can use the following script to check the configuration of your PHP environment:
// List every stream wrapper registered in this PHP build.
$w = stream_get_wrappers();
// The https wrapper is only available when the OpenSSL extension is loaded.
echo 'openssl: ', extension_loaded('openssl') ? 'yes' : 'no', "\n";
echo 'http wrapper: ', in_array('http', $w) ? 'yes' : 'no', "\n";
echo 'https wrapper: ', in_array('https', $w) ? 'yes' : 'no', "\n";
echo 'wrappers: ', var_dump($w);
Running this fragment on my machine gives:
openssl: no
http wrapper: yes
https wrapper: no
wrappers: array(10) {
  [0]=>
  string(3) "php"
  [1]=>
  string(4) "file"
  [2]=>
  string(4) "glob"
  [3]=>
  string(4) "data"
  [4]=>
  string(4) "http"
  [5]=>
  string(3) "ftp"
  [6]=>
  string(3) "zip"
  [7]=>
  string(13) "compress.zlib"
  [8]=>
  string(14) "compress.bzip2"
  [9]=>
  string(4) "phar"
}
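The same check can be folded into a small helper that also reports which php.ini file PHP actually loaded, which is useful when it is unclear which configuration file to edit. This is only a sketch: the function name httpsDiagnostics is made up for illustration and is not part of the original script.

```php
<?php
// Hypothetical helper (not from the original article): gathers the facts
// needed to diagnose a missing "https" stream wrapper.
function httpsDiagnostics(): array
{
    $wrappers = stream_get_wrappers();

    return [
        // Path of the php.ini that was loaded, or false if none was loaded.
        'ini_file'      => php_ini_loaded_file(),
        // True when the OpenSSL extension is loaded.
        'openssl'       => extension_loaded('openssl'),
        // True when file_get_contents() can open https:// URLs.
        'https_wrapper' => in_array('https', $wrappers, true),
    ];
}

// Print one "name: value" line per check.
foreach (httpsDiagnostics() as $name => $value) {
    echo $name, ': ', var_export($value, true), "\n";
}
```

If 'https_wrapper' is false and 'ini_file' points to a file you can edit, enabling the OpenSSL extension in that file is the direct fix.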
Alternative Solutions
Finding the cause and knowing the fix is easy; applying the fix is sometimes not. I wanted to deploy this script on a remote host, but I cannot modify that host's PHP configuration, so enabling the extension was not an option. One dead end is no reason to give up, though, so I looked for another way.
Another tool I often use to fetch content in PHP is cURL. It is more powerful than file_get_contents() and provides many optional parameters. The option we need to set for cURL to access HTTPS content is:
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
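As the name suggests, this option tells cURL to skip verification of the server's SSL certificate. That is not ideal, because the connection is then exposed to man-in-the-middle attacks, but it is enough for ordinary cases such as reading public data.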
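Where skipping verification is not acceptable, a variant that keeps it switched on might look like the sketch below. The function name fetchVerified and the optional $caBundle parameter are assumptions made for illustration; whether a CA bundle path is needed depends on how cURL was built on the host.

```php
<?php
// Hypothetical variant (not from the original article): fetches a URL with
// certificate verification left enabled.
function fetchVerified(string $url, ?string $caBundle = null)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true); // verify the certificate chain
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);    // verify that the host name matches
    if ($caBundle !== null) {
        // Assumption: $caBundle is the path to a PEM file of trusted CA certificates.
        curl_setopt($ch, CURLOPT_CAINFO, $caBundle);
    }

    $result = curl_exec($ch);
    if ($result === false) {
        // curl_error() describes why the transfer failed, e.g. an untrusted certificate.
        error_log('cURL error: ' . curl_error($ch));
    }

    // The handle is released when $ch goes out of scope.
    return $result;
}
```

The function returns the response body as a string, or false when the transfer fails, so callers should compare the result with === false before using it.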
Here is a function that wraps cURL to fetch HTTPS content:
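function getHTTPS($url) {
    $ch = curl_init();
    // Skip verification of the server's SSL certificate.
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    // Do not include response headers in the returned data.
    curl_setopt($ch, CURLOPT_HEADER, false);
    // Follow HTTP redirects.
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_REFERER, $url);
    // Return the response body as a string instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // curl_exec() returns the body, or false on failure.
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}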
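Since curl_exec() returns either the raw body or false, the caller still has to decode the JSON and handle failure. The sketch below shows one way to do that. The helper name parseStoryIds is hypothetical, the sample body uses made-up IDs so the code runs offline, and it assumes the endpoint returns a JSON array of integer item IDs.

```php
<?php
// Hypothetical helper (not from the original article): turns the body
// returned for topstories.json into a list of integer IDs.
function parseStoryIds($body): array
{
    // curl_exec() returns false when the transfer failed.
    if ($body === false) {
        return [];
    }

    // Decode into a PHP array; invalid JSON yields null.
    $ids = json_decode($body, true);
    if (!is_array($ids)) {
        return [];
    }

    // Keep only integer entries and reindex the result from 0.
    return array_values(array_filter($ids, 'is_int'));
}

// Sample body with made-up IDs, standing in for a real response.
$sample = '[ 101, 102, 103 ]';
print_r(parseStoryIds($sample));
```

With the cURL wrapper above, the real call would pass its return value straight in, for example parseStoryIds(getHTTPS($url)).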
That is the whole process of fetching HTTPS content with PHP. It is simple and practical, and I recommend it to anyone with a similar need.