How does Goutte obtain the URL from an <a> tag? Or can anyone recommend a handy PHP crawler library? Thank you.
<?php
// Assumes Goutte is installed with Composer.
require 'vendor/autoload.php';

use Goutte\Client;

class Spider
{
    private $_client;
    private $_crawler;
    public $_news = ['title' => [], 'link' => [], 'content' => [], 'source' => [], 'date' => []];

    public function __construct()
    {
        try {
            $this->_client = new Client();
            $this->_crawler = $this->_client->request('GET', 'http://www.ningshan.gov.cn/Category_90/Index.aspx');
            // $this->_client->getClient()->setDefaultOption('config/curl/' . CURLOPT_TIMEOUT, 10);
        } catch (Exception $e) {
            throw new \Exception($e->getMessage(), 1);
        }
    }

    public function getDate()
    {
        $this->_crawler->filter('p#list > ul > li > span')->each(function ($node) {
            $this->_news['date'][] = $node->text();
        });
    }

    public function getTitle()
    {
        // Disabled debugging attempt from the post: $link = $this->_crawler->selectLink; p($link->getUri); die;
        $this->_crawler->filter('p#list > ul > li > a')->each(function ($node) {
            if ($node->text() != 'Shaanxi news') {
                $this->_news['title'][] = $node->text();
                // link() returns a Link object, not the URL string.
                $this->_news['link'][] = $node->link();
                $this->_news['source'][] = 'Shaanxi news';
            }
        });
    }
}

// -----------------------------------------
try {
    $spider = new Spider();
    $spider->getDate();
    $spider->getTitle();
    echo json_encode($spider->_news, JSON_UNESCAPED_UNICODE);
} catch (Exception $e) {
    echo $e->getMessage();
}
Reply content:
Found it:
$crawler = $client->request('GET', 'http://www.symfony.com/blog/');
$link = $crawler->selectLink('Security Advisories')->link();
print_r($link->getUri());
Manual: http://symfony.com/doc/curren...
GIT: https://github.com/FriendsOfP...
Collection class reference: http://flc.ren/2016/06/528.html
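
The same link()->getUri() call should also work on every node matched by filter(), which is what the loop in the question needs. Below is a minimal sketch rather than a tested fix for the real site: the HTML is hypothetical stand-in markup, the selector #list > ul > li > a is an assumption about the page structure, and it builds a Symfony DomCrawler Crawler directly (the class that Goutte's request() returns) so that it runs without a network call.

```php
<?php
// Assumes symfony/dom-crawler and symfony/css-selector are installed with
// Composer (Goutte depends on both).
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical markup standing in for the real list page.
$html = <<<'HTML'
<div id="list">
  <ul>
    <li><a href="/Item/101.aspx">First headline</a><span>2016-06-01</span></li>
    <li><a href="/Item/102.aspx">Second headline</a><span>2016-06-02</span></li>
  </ul>
</div>
HTML;

// The second argument is the page URI; relative hrefs are resolved against it.
$crawler = new Crawler($html, 'http://www.ningshan.gov.cn/Category_90/Index.aspx');

// each() collects the value returned by the closure for every matched node.
$news = $crawler->filter('#list > ul > li > a')->each(function (Crawler $node) {
    return [
        'title' => $node->text(),
        'href'  => $node->attr('href'),      // raw attribute value
        'link'  => $node->link()->getUri(),  // absolute URL as a string
    ];
});

echo json_encode($news, JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES | JSON_PRETTY_PRINT), PHP_EOL;
```

With Goutte itself, the $crawler returned by $client->request('GET', $url) can be used in place of the hand-built one. Storing the getUri() string instead of the Link object also means json_encode() has a plain value to serialize.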