How do I write this regular expression? Experts, please help! The code in the web page is as follows:
testtesttesttesttesttestt testtesttest
testtesttesttesttesttesttesttesttest
testtesttesttesttesttestt
testtesttest
testtesttesttesttesttesttesttesttest
testtesttesttesttesttestt
testtesttest
testtesttesttesttesttesttesttesttest
testtesttesttesttesttestt testtesttest
testtesttesttesttesttesttesttesttest
...
Notice that the attributes class="1", class="2", class="3", ... follow a regular pattern, but the tags carrying them do not: sometimes it is a span, sometimes a p. I want to use PHP regular expressions to extract the contents of the class="1", class="2", class="3", ... elements. Regular expressions are hard to learn; I have tried for half a day and still cannot write one!
Reply to discussion (solution)
HTML and XML have a dedicated DOM API. For HTML with nested tags in particular, try not to use regular expressions to extract content, and PHP regular expressions least of all. Nested tags call for recursive matching, and even if PHP had the balancing groups that other regex flavors provide, it would still be best not to use them.
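As a rough sketch of the DOM approach recommended here, PHP's built-in DOMDocument and DOMXPath classes can pick out elements by class attribute no matter which tag carries it. The sample markup and the function name numericClassTexts are hypothetical, since the tags of the original page are not shown in this thread.

```php
<?php
/**
 * Return the text of every element whose class attribute consists only of
 * digits, keyed by that class value. Hypothetical helper for illustration.
 */
function numericClassTexts(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings about imperfect markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $result = [];
    // XPath 1.0 has no regex support, so select every element that has a
    // class attribute and filter the attribute value in PHP.
    foreach ($xpath->query('//*[@class]') as $node) {
        $class = $node->getAttribute('class');
        if (preg_match('/^\d+$/', $class)) {
            $result[$class] = trim($node->textContent);
        }
    }
    return $result;
}

// Assumed sample markup, not the asker's real page.
$sample = '<div>'
    . '<span class="1">first <b>item</b></span>'
    . '<p class="2">second item</p>'
    . '<span class="3">third item</span>'
    . '<p class="intro">not numbered</p>'
    . '</div>';

print_r(numericClassTexts($sample));
```

Because the tag name never appears in the query, it makes no difference whether a numbered block is a span or a p.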
<(span|p)\s+class=\"\d\">\s+(.*?)<\/h3>\s+\s*(.*?)\s*<\/p>\s*<\/(span|p)>
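If a regular expression is used anyway, one possible simplification is to capture the opening tag name and require the same name in the closing tag with a backreference. The following is only a sketch under assumptions: the function name numericClassBlocks and the sample markup are hypothetical, and the pattern breaks as soon as a span or p is nested inside a block of the same tag.

```php
<?php
/**
 * Return the inner HTML of each <span> or <p> whose class is all digits,
 * keyed by the class value. Hypothetical helper for illustration.
 */
function numericClassBlocks(string $html): array
{
    // (span|p)  captures the tag name; \1 requires the same name on close.
    // (\d+)     captures the numeric class value.
    // (.*?)     lazily captures the content; the s flag lets . match newlines.
    $pattern = '~<(span|p)\s+class="(\d+)"\s*>(.*?)</\1>~is';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $result = [];
    foreach ($matches as $m) {
        $result[$m[2]] = trim($m[3]);
    }
    return $result;
}

// Assumed sample markup, not the asker's real page.
$sample = "<span class=\"1\">\n  <b>first</b>\n</span>\n<p class=\"2\">second</p>";

print_r(numericClassBlocks($sample));
```

The lazy quantifier stops at the first matching closing tag, which is exactly why nested tags of the same name defeat this approach.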
Thank you for your answers!
$s = <<<TXT
testtesttesttesttesttestt testtesttest
testtesttesttesttesttesttesttesttest
testtesttesttesttesttestt
testtesttest
testtesttesttesttesttesttesttesttest
testtesttesttesttesttestt
testtesttest
testtesttesttesttesttesttesttesttest
testtesttesttesttesttestt testtesttest
testtesttesttesttesttesttesttesttest
TXT;
Solution 1
include 'phpquery.php';
$doc = phpQuery::newDocument($s);
echo $doc->find('.1')->html();
echo pq('.2')->html();
Output:
testtesttesttesttesttestt testtesttest
testtesttesttesttesttesttesttesttest
testtesttesttesttesttestt testtesttest
testtesttesttesttesttesttesttesttest
Solution 2
include 'html_document.php';
$p = new html_document($s, 0);
foreach ($p->find('.\d') as $v) {
    echo "$v->innerHTML\n";
}
Output:
testtesttesttesttesttestt testtesttest
testtesttesttesttesttesttesttesttest
testtesttesttesttesttestt testtesttest
testtesttesttesttesttesttesttesttest
testtesttesttesttesttestt testtesttest
testtesttesttesttesttesttesttesttest
testtesttesttesttesttestt testtesttest
testtesttesttesttesttesttesttesttest
Thanks, moderator. I have closed the thread.