Brief introduction
Parsing HTML easily in PHP is a problem that nearly every PHP developer runs into sooner or later. With phpQuery, PHP can process HTML almost as easily as jQuery does.
Project Address: https://code.google.com/p/phpquery/
GitHub Address: https://github.com/TobiaszCudnik/phpquery
DEMO
Download the library files: https://code.google.com/p/phpquery/downloads/list
I downloaded the onefile version: phpQuery-0.9.5.386-onefile.zip
Official demo: https://code.google.com/p/phpquery/source/browse/branches/dev/demo.php
Then include the file in your project.
HTML file test.html:
<div class="thumb" id="thumb-13164-3640" style="position:absolute; left:0px; top:0px;">
    <a href="/spiderman-city-drive">
        <span class="Gamename" id="gamename-13164-3640" style="display:none;">spiderman City drive</span>
        <span class="gamerating" id="gamerating-13164-3640" style="display:none;">
            <span style="width:68.14816px;"></span>
        </span>
    </a>
</div>
<div class="thumb" id="thumb-13169-5946" style="position:absolute; left:190px; top:0px;">
    <a href="/spiderman-city-raid">
        <span class="Gamename" id="gamename-13169-5946" style="display:none;">spiderman-city raid</span>
        <span class="gamerating" id="gamerating-13169-5946" style="display:none;">
            <span style="width:67.01152px;"></span>
        </span>
    </a>
</div>
PHP processing:
<?php
// Load the single-file build of phpQuery.
include 'phpQuery-onefile.php';

$filePath = 'test.html';
$fileContent = file_get_contents($filePath);

// Parse the markup and make it the default document for pq().
$doc = phpQuery::newDocumentHTML($fileContent);
phpQuery::selectDocument($doc);

// Base URL prepended to image paths; this value is an assumption.
$domain = 'http://www.gahe.com';

$data = array(
    'name' => array(),
    'href' => array(),
    'img'  => array(),
);

// Iterating a pq() result yields plain DOM nodes, so the DOM API applies.
foreach (pq('a') as $t) {
    $data['href'][] = $t->getAttribute('href');
}
foreach (pq('img') as $img) {
    $data['img'][] = $domain . $img->getAttribute('src');
}
foreach (pq('.Gamename') as $name) {
    $data['name'][] = $name->nodeValue;
}
var_dump($data);
?>
The code above shows how to read an attribute (with getAttribute) and an element's inner text (through nodeValue).
Output:
array (size=3)
  'name' =>
    array (size=2)
      0 => string 'Spiderman City Drive' (length=20)
      1 => string 'spiderman-city Raid' (length=21)
  'href' =>
    array (size=2)
      0 => string 'http://www.gahe.com/Spiderman-City-Drive' (length=40)
      1 => string 'http://www.gahe.com/Spiderman-City-Raid' (length=39)
  'img' =>
    array (size=2)
      0 => string 'yun_qi_img/spiderman-city-drive.jpg' (length=53)
      1 => string 'yun_qi_img/spiderman-city-raid.jpg' (length=52)
The real strength is the pq() selector: its syntax is similar to jQuery's, which makes it very convenient.
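For comparison, here is a minimal sketch of the same attribute and text extraction using only PHP's built-in DOM extension, with no phpQuery involved. The XPath expressions are rough stand-ins for the pq('a') and pq('.Gamename') selectors, and the inline markup is a trimmed copy of test.html; both are illustrative assumptions rather than part of the original demo. It gives a feel for how much typing the selector syntax saves.

```php
<?php
// Trimmed copy of the test.html markup, inlined so the sketch is self-contained.
$html = <<<HTML
<div class="thumb" id="thumb-13164-3640">
    <a href="/spiderman-city-drive">
        <span class="Gamename" id="gamename-13164-3640">spiderman City drive</span>
    </a>
</div>
<div class="thumb" id="thumb-13169-5946">
    <a href="/spiderman-city-raid">
        <span class="Gamename" id="gamename-13169-5946">spiderman-city raid</span>
    </a>
</div>
HTML;

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$data = array('name' => array(), 'href' => array());

// Rough equivalent of pq('a'): every <a> element in the document.
foreach ($xpath->query('//a') as $a) {
    $data['href'][] = $a->getAttribute('href');
}

// Rough equivalent of pq('.Gamename'): elements whose class list contains "Gamename".
$classQuery = '//*[contains(concat(" ", normalize-space(@class), " "), " Gamename ")]';
foreach ($xpath->query($classQuery) as $span) {
    $data['name'][] = trim($span->nodeValue);
}

print_r($data);
```

The getAttribute and nodeValue calls are the same ones used in the phpQuery demo, because iterating a pq() result hands back ordinary DOM nodes. Only the element lookup differs.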