PHP multithreaded crawling: a test example

Source: Internet
Author: User

With PHP 5.3 or later and the pthreads extension, PHP gains real multithreading support. For repetitive, loop-driven tasks such as fetching many URLs, running the work in threads can greatly shorten execution time.

Extension download: https://github.com/krakjoe/pthreads
PHP manual: http://php.net/manual/zh/book.pthreads.php

1. Compile and install the extension (Linux). When building PHP, the configure option --enable-maintainer-zts is required:

cd /data/tgz/php-5.3.8
./configure --prefix=/data/apps/php --with-config-file-path=/data/apps/php/etc --with-mysql=/data/apps/mysql --with-mysqli=/data/apps/mysql/bin/mysql_config --with-iconv-dir --with-freetype-dir=/data/apps/libs --with-jpeg-dir=/data/apps/libs --with-png-dir=/data/apps/libs --with-zlib --with-libxml-dir=/usr --enable-xml --disable-rpath --enable-bcmath --enable-shmop --enable-sysvsem --enable-inline-optimization --enable-mbregex --enable-fpm --enable-mbstring --with-mcrypt=/data/apps/libs --with-gd --enable-gd-native-ttf --with-openssl --with-mhash --enable-pcntl --enable-sockets --with-xmlrpc --enable-zip --with-pdo-mysql --enable-maintainer-zts
make clean
make
make install

unzip pthreads-master.zip
cd pthreads-master
/data/apps/php/bin/phpize
./configure --with-php-config=/data/apps/php/bin/php-config
make
make install

Add the extension to php.ini:

vi /data/apps/php/etc/php.ini

extension = "pthreads.so"
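
Before running threaded code, it can help to confirm that the rebuilt PHP binary is thread safe and that the extension actually loaded. The following is a minimal sketch, not part of the original article; the function name pthreads_environment_report is a hypothetical helper introduced here for illustration. It relies only on the built-in PHP_ZTS constant, extension_loaded() and class_exists().

```php
<?php
// Hypothetical helper (not from the original article): summarize whether
// this PHP build can run pthreads code.
function pthreads_environment_report()
{
    return array(
        'php_version'  => PHP_VERSION,
        // PHP_ZTS is 1 when PHP was built with --enable-maintainer-zts
        'zts'          => (bool) PHP_ZTS,
        // true once extension = "pthreads.so" has been picked up from php.ini
        'pthreads'     => extension_loaded('pthreads'),
        // the Thread class is provided by the extension
        'thread_class' => class_exists('Thread', false),
    );
}

foreach (pthreads_environment_report() as $name => $value) {
    echo $name, ': ', var_export($value, true), "\n";
}
```

If zts or pthreads is reported as false, the examples that extend Thread will fail with a "class not found" error, so the build and php.ini steps are worth rechecking first.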

PHP example: crawling Baidu search pages with multiple threads, compared against a plain for loop:

The code is as follows:
<?php

// Worker thread: fetches one URL and stores the response body in $data.
class test_thread_run extends Thread
{
    public $url;
    public $data;

    public function __construct($url)
    {
        $this->url = $url;
    }

    public function run()
    {
        if (($url = $this->url))
        {
            $this->data = model_http_curl_get($url);
        }
    }
}

// Start one thread per URL, then collect each thread's result.
function model_thread_result_get($urls_array)
{
    $thread_array = array();
    $variable_data = array();

    foreach ($urls_array as $key => $value)
    {
        $thread_array[$key] = new test_thread_run($value["url"]);
        $thread_array[$key]->start();
    }

    foreach ($thread_array as $thread_array_key => $thread_array_value)
    {
        // Poll until the thread finishes; join() below would also block until completion.
        while ($thread_array[$thread_array_key]->isRunning())
        {
            usleep(10);
        }

        if ($thread_array[$thread_array_key]->join())
        {
            $variable_data[$thread_array_key] = $thread_array[$thread_array_key]->data;
        }
    }

    return $variable_data;
}

// Fetch a single URL with cURL and return the response body (false on failure).
function model_http_curl_get($url, $userAgent = "")
{
    $userAgent = $userAgent ? $userAgent : 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2)';
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_TIMEOUT, 5);
    curl_setopt($curl, CURLOPT_USERAGENT, $userAgent);
    $result = curl_exec($curl);
    curl_close($curl);
    return $result;
}

// Build the list of test URLs.
// Assumed value: the original loop bound was lost in the source text.
$url_count = 10;
$urls_array = array();
for ($i = 0; $i < $url_count; $i++)
{
    $urls_array[] = array("name" => "Baidu", "url" => "http://www.baidu.com/s?wd=" . mt_rand(10000, 20000));
}

// Time the multithreaded version.
$t = microtime(true);
$result = model_thread_result_get($urls_array);
$e = microtime(true);
echo "Multithreading: " . ($e - $t) . "\n";

// Time the sequential version.
$t = microtime(true);
$result_new = array();
foreach ($urls_array as $key => $value)
{
    $result_new[$key] = model_http_curl_get($value["url"]);
}
$e = microtime(true);
echo "For loop: " . ($e - $t) . "\n";

?>

Example: collecting data from several URLs concurrently with curl_multi

The code is as follows:

<?php
$urls = array(
    'http://www.111cn.net/',
    'http://www.sohu.com/',
    'http://www.163.com/'
);

$save_to = '/test.txt'; // file the crawled content is written to
$st = fopen($save_to, "a");

$mh = curl_multi_init();
foreach ($urls as $i => $url) {
    $conn[$i] = curl_init($url);
    curl_setopt($conn[$i], CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)");
    curl_setopt($conn[$i], CURLOPT_HEADER, 0);
    curl_setopt($conn[$i], CURLOPT_CONNECTTIMEOUT, 60);
    curl_setopt($conn[$i], CURLOPT_RETURNTRANSFER, true); // return the content as a string instead of printing it
    curl_multi_add_handle($mh, $conn[$i]);
}

// Run all transfers until none is still active
do {
    curl_multi_exec($mh, $active);
} while ($active);

// Read each response and append it to the file
foreach ($urls as $i => $url) {
    $data = curl_multi_getcontent($conn[$i]); // the fetched content as a string
    fwrite($st, $data); // written to a file here; it could go to a database instead
}

// Release the handles
foreach ($urls as $i => $url) {
    curl_multi_remove_handle($mh, $conn[$i]);
    curl_close($conn[$i]);
}

curl_multi_close($mh);
fclose($st);
?>
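
The do/while loop in the listing above calls curl_multi_exec() continuously, which keeps a CPU core busy while waiting on the network. A common variation is to wait in curl_multi_select() between calls. The following is a sketch of that variation under stated assumptions, not the original author's code: the function name fetch_all, its parameters and the 1-second select timeout are hypothetical choices made for illustration.

```php
<?php
// Hypothetical helper (not from the original article): fetch several URLs
// concurrently and return the response bodies keyed like the input array.
function fetch_all(array $urls, $connect_timeout = 10)
{
    $mh = curl_multi_init();
    $handles = array();

    foreach ($urls as $key => $url) {
        $handles[$key] = curl_init($url);
        curl_setopt($handles[$key], CURLOPT_RETURNTRANSFER, true);
        curl_setopt($handles[$key], CURLOPT_CONNECTTIMEOUT, $connect_timeout);
        curl_multi_add_handle($mh, $handles[$key]);
    }

    do {
        // Older libcurl versions may ask to be called again immediately.
        do {
            $status = curl_multi_exec($mh, $active);
        } while ($status === CURLM_CALL_MULTI_PERFORM);

        if ($active && $status === CURLM_OK) {
            // Wait for activity on any transfer, for at most 1 second (assumed value).
            if (curl_multi_select($mh, 1.0) === -1) {
                usleep(1000); // select failed; pause briefly instead of spinning
            }
        }
    } while ($active && $status === CURLM_OK);

    $results = array();
    foreach ($handles as $key => $handle) {
        $results[$key] = curl_multi_getcontent($handle);
        curl_multi_remove_handle($mh, $handle);
        curl_close($handle);
    }
    curl_multi_close($mh);

    return $results;
}
```

A call such as fetch_all(array('sohu' => 'http://www.sohu.com/')) would return an array with the same key, so results can be matched back to their URLs before being written to a file or database.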
