use strict;
use warnings;
use LWP::Simple;
use HTML::LinkExtor;

# Fetch the page; get() returns undef on failure
my $html = get("http://www.baidu.com") or die "Could not fetch the page\n";
# check() is called once for every tag that carries link attributes
my $link = HTML::LinkExtor->new(\&check);
$link->parse($html);

sub check {
    my ($tag, %links) = @_;
    print "$tag\n";
    foreach my $key (keys %links) {
        print "$key-$links{$key}\n";
    }
}

# $tag is the tag type, such as a, link, img, script, etc.
# %links is a hash: the key is the attribute name, the value is the link value
# For example, for an a tag, the key in %links is href, and the value is the URL in the href

# link
# href-/favicon.ico
# link
# href-/content-search.xml
# link
# href-//www.baidu.com/img/baidu.svg
# link
# href-//s1.bdstatic.com
# link
# href-//t1.baidu.com
# link
# href-//t2.baidu.com
# link
# href-//t3.baidu.com
# link
# href-//t10.baidu.com
# link
# href-//t11.baidu.com
# link
# href-//t12.baidu.com
# link
# href-//b1.bdstatic.com
# img
# src-//www.baidu.com/img/bd_logo1.png
This code prints the name of every tag in the page that carries a link, together with the corresponding link addresses.
If we only want to print the img addresses, we can check $tag to determine the tag type and then extract the data we need from %links.
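As a minimal sketch of that idea, the callback below keeps only img tags and collects their src values. The helper name extract_img_srcs and the inline sample markup are assumptions made for illustration (they are not part of the module or the original post); the sample is parsed from a string so the example runs without network access.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Collect the src attribute of every img tag in an HTML string.
# extract_img_srcs is a hypothetical helper name, not part of HTML::LinkExtor.
sub extract_img_srcs {
    my ($html) = @_;
    my @srcs;
    my $parser = HTML::LinkExtor->new(sub {
        my ($tag, %links) = @_;
        return unless $tag eq 'img';    # skip a, link, script, ...
        push @srcs, $links{src} if defined $links{src};
    });
    $parser->parse($html);
    $parser->eof;                       # signal end of input to the parser
    return @srcs;
}

# Inline sample markup (an assumption for illustration).
my $sample = <<'HTML';
<link rel="icon" href="/favicon.ico">
<a href="/more">more</a>
<img src="//www.baidu.com/img/bd_logo1.png">
HTML

print "$_\n" for extract_img_srcs($sample);
```

To run this against a live page, the string passed to extract_img_srcs could be the result of get() from LWP::Simple, as in the listing above.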