I've been writing web crawlers for practice lately, but my pattern matches far less data than I expect.
Take cnblogs (博客园) as an example. This is my regular expression:
/http\:\/\/www\.cnblogs\.com\/' . $name . '\/[^\" ]+.html/i
I then run it against this user's page: http://www.cnblogs.com/hoojo/default.html?page=1
It finds only 42 matches, but this user clearly has more than 42 articles. How can I improve my regular expression?
Reply content:
First of all, http://www.cnblogs.com/hoojo/default.html?page=1 is only the first page, and the first page holds only about that many articles, right? http://www.cnblogs.com/hoojo/default.html?page=2 is the second page.
So you first need to determine how many pages his blog has. You can read the total from the pager on the second page, http://www.cnblogs.com/hoojo/default.html?page=2, which shows "共6页: 上一页 1 2 3 4 5 6" (6 pages in total: Previous 1 2 3 4 5 6). Then wrap your original code in a for loop that requests each page in turn:
http://www.cnblogs.com/hoojo/default.html?page={$page_number}
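The paging step could be sketched roughly as follows. This is only an illustration: the function names `parseTotalPages` and `buildPageUrls` are hypothetical, and it assumes the pager text always has the form "共N页". Fetching each URL (for example with `file_get_contents`) and running the regular expression on the result is left out so the sketch runs without network access.

```php
<?php
// Hypothetical helpers; the names are illustrative and not from the original post.

/**
 * Read the total page count from pager text such as "共6页: 上一页 1 2 3 4 5 6".
 * Falls back to 1 when no pager is found (assumed to mean a single page).
 */
function parseTotalPages(string $html): int
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    if (preg_match('/共\s*(\d+)\s*页/u', $html, $m) === 1) {
        return (int) $m[1];
    }
    return 1;
}

/** Build the list-page URLs for pages 1..$totalPages of one blog. */
function buildPageUrls(string $name, int $totalPages): array
{
    $urls = [];
    for ($page = 1; $page <= $totalPages; $page++) {
        $urls[] = "http://www.cnblogs.com/{$name}/default.html?page={$page}";
    }
    return $urls;
}

// Pager text as quoted in the reply above.
$pager = '共6页: 上一页 1 2 3 4 5 6';
$urls  = buildPageUrls('hoojo', parseTotalPages($pager));

echo count($urls), "\n"; // number of list pages to crawl
echo end($urls), "\n";   // URL of the last list page
```

Each URL in `$urls` would then be downloaded and matched with the article-link pattern, with the results of all pages merged into one array.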
I don't quite understand how your regular expression is written.
I counted 50 articles in total on the first page, and then came up with this:
|i", $aa, $m);var_dump($m[1]);
The result is an array of article links.
While testing, I found that authors sometimes put links to other articles in a post's summary, and those links are displayed on the list page. Your pattern will therefore pick up the links inside the summaries as well.
To avoid that, I match only each entry's link to the full original text.
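Since the snippet above is cut off, here is a minimal sketch of the same idea under stated assumptions. It assumes each summary ends with a full-text link that carries a distinguishing class (`c_b_p_desc_readmore` here); that class name, the function name `extractArticleLinks`, and the sample HTML are all assumptions for illustration, so check the real page source before relying on them.

```php
<?php
/**
 * Collect article URLs by matching only the full-text link of each entry.
 * Assumed markup: <a href="URL" class="c_b_p_desc_readmore">阅读全文</a>
 */
function extractArticleLinks(string $html, string $name): array
{
    // preg_quote() escapes regex metacharacters in the user name;
    // "|" is the pattern delimiter, as in the snippet above.
    $pattern = '|<a\s+href="(http://www\.cnblogs\.com/' . preg_quote($name, '|')
             . '/[^"\s]+\.html)"\s+class="c_b_p_desc_readmore"|i';
    preg_match_all($pattern, $html, $m);

    // $m[1] holds the captured URLs; drop duplicates and reindex.
    return array_values(array_unique($m[1]));
}

// Made-up sample data: a title link, a link inside the summary, and the full-text link.
$html = <<<'HTML'
<div class="postTitle"><a href="http://www.cnblogs.com/hoojo/archive/2011/01/01/100.html">Post A</a></div>
<div class="c_b_p_desc">Summary mentioning <a href="http://www.cnblogs.com/hoojo/archive/2010/12/31/99.html">an older post</a>
<a href="http://www.cnblogs.com/hoojo/archive/2011/01/01/100.html" class="c_b_p_desc_readmore">阅读全文</a></div>
HTML;

$links = extractArticleLinks($html, 'hoojo');
var_dump($links);
```

Because the pattern requires the class attribute right after the URL, the link embedded in the summary text and the title link are both skipped, and each article is counted once.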
Hope this helps.