Once a regular expression rule has been written, it has to be corrected every time the page changes.
Is there a better way, such as extracting the page's DOM first?
Reply content:
I think what you need is PHP's DOM module ... it is installed by default, so don't worry ...
Since I don't know your actual use case ... let me write you a simple example ...
<?php
/* I heard that you need DOM ... */
$doc = new DOMDocument();

/* I wrote a simple page ... replace it with your curl result ... */
$doc->loadHTML(<<<HTML_SECTION
<html>
<head>
<title>Sunyanzi's test</title>
</head>
<body>
<h1>Hello World</h1>
<a id="onlylink" href="http://segmentfault.com/">Hey welcome</a>
</body>
</html>
HTML_SECTION
);

/* Now let's try to get something ... */
$h1Elements = $doc->getElementsByTagName('h1');

/* This line prints "Hello World" ... */
foreach ($h1Elements as $h1Node)
    echo $h1Node->nodeValue, PHP_EOL;

/* And this line prints "http://segmentfault.com/" ... */
echo $doc->getElementById('onlylink')->getAttribute('href'), PHP_EOL;

/* Now for something more advanced ... using XPath ... */
$xpath = new DOMXPath($doc);

/* Also prints "http://segmentfault.com/" ... located via the h1 ... */
echo $xpath->evaluate('string(//h1[text()="Hello World"]/following-sibling::a/@href)'), PHP_EOL;
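The comment in the example says to swap the inline page for a curl result. A minimal sketch of that step might look like this. The helper names fetchHtml and loadDocument are hypothetical, and the curl options are only one reasonable choice. Real pages are rarely well-formed, so the sketch also uses libxml_use_internal_errors() to keep parse warnings out of the output ...

```php
<?php
/* Hypothetical helper: fetch a page body with curl ... returns null on failure ... */
function fetchHtml(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); /* return the body instead of printing it */
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); /* follow redirects */
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          /* seconds ... an arbitrary choice */
    $body = curl_exec($ch);
    return is_string($body) ? $body : null;
}

/* Hypothetical helper: parse an HTML string while collecting libxml warnings silently ... */
function loadDocument(string $html): DOMDocument
{
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous); /* restore the caller's setting */
    return $doc;
}

/* Offline demo with an inline string ... */
/* For a live page you would write something like: $html = fetchHtml('http://segmentfault.com/'); */
$html = '<h1>Hello World</h1><a id="onlylink" href="http://segmentfault.com/">Hey welcome</a>';
$doc = loadDocument($html);
echo $doc->getElementById('onlylink')->getAttribute('href'), PHP_EOL;
```

Keeping the fetch and the parse in separate functions means the DOM part can be tried against a saved copy of the page before any network code is involved.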
Basically ... once you have mastered XPath ... you'll find DOM is much more flexible than regular expressions ...
PHP's ability to process XML is far beyond what you'd imagine ... reading the manual is not a bad thing.
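To make the comparison with regular expressions a bit more concrete, here is a small sketch. The two markup variants and the variable names are invented for illustration. The second variant swaps the attribute order, uses single quotes and adds whitespace and a class, which is the kind of change that tends to break a hand-written pattern, while the same XPath expression still finds the link ...

```php
<?php
/* Two hypothetical versions of "the same" page ... the second one changed its markup ... */
$versions = [
    '<div><h1>Hello World</h1><a id="onlylink" href="http://segmentfault.com/">Hey welcome</a></div>',
    "<div>\n<h1>Hello World</h1>\n<a  href='http://segmentfault.com/'\n   class=\"btn\" id='onlylink' >Hey welcome</a>\n</div>",
];

$hrefs = [];
foreach ($versions as $html) {
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    /* Same expression for both versions ... select by id, not by how the tag is typed ... */
    $hrefs[] = $xpath->evaluate('string(//a[@id="onlylink"]/@href)');

    /* query() returns a DOMNodeList ... handy when there may be several matches ... */
    foreach ($xpath->query('//a[@href]') as $link)
        echo trim($link->textContent), ' => ', $link->getAttribute('href'), PHP_EOL;
}

/* Both entries are "http://segmentfault.com/" ... */
echo implode(PHP_EOL, $hrefs), PHP_EOL;
```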
You're right about that. I'm going to use XPath now.