This article shares PHP code that scrapes email addresses from Baidu Post Bar (Baidu Tieba) threads. The program has two functions: collecting the addresses from every page of a thread with one click, and collecting them one page at a time. If you are interested, read on. On Baidu Post Bar you will often see a thread starter share some resource and ask readers to leave an email address so it can be sent to them.
In a popular thread the number of addresses left in the replies is very large. The thread starter has to copy them out one by one and then paste them into outgoing emails, which is somewhere between tedious and exhausting. With some time to kill, I wrote a program that captures the email addresses from a Baidu Post Bar thread. Take it if you need it.
The two modes are selected by a request parameter: "getAll" walks through all pages of the thread, while "getNow" captures only the page you point it at. I did not bother building a proper interface.
As usual, here is the source code:
<?php
// Defaults for the request fields: thread URL and page count.
$url2 = !empty($_GET['url2']) ? $_GET['url2'] : "http://tieba.baidu.com/p/2314539885?pn=1";
$page = !empty($_GET['page']) ? $_GET['page'] : "1";
$type = isset($_GET['type']) ? $_GET['type'] : "";

if ($type != "") {
    $counts = 0;
    if ($type == "getAll") {
        // One-click mode: fetch every page of the thread in turn.
        $pages = (int) $page;
        $url = isset($_GET['url']) ? $_GET['url'] : "";
        for ($i = 0; $i < $pages; $i++) {
            // The published listing requests $url unchanged on each pass.
            // Appending the pn page number (seen in the default URL above)
            // assumes $url is the bare thread address with no query string.
            $dat = getEmail(fetchPage($url . "?pn=" . ($i + 1)));
            // The loop body is cut off in the published listing; printing
            // one address per line is reconstructed from what remains.
            for ($j = 0; $j < count($dat); $j++) {
                echo $dat[$j] . "<br />";
                $counts++;
            }
        }
    } else if ($type == "getNow") {
        // Paged mode: fetch only the page named by url2.
        $dat = getEmail(fetchPage($url2));
        for ($i = 0; $i < count($dat); $i++) {
            echo $dat[$i] . "<br />";
            $counts++;
        }
    }
    echo 'Collected ' . $counts . ' email addresses';
}

// Download one page with cURL and return its HTML ("" on failure).
function fetchPage($url)
{
    $ch2 = curl_init();
    curl_setopt($ch2, CURLOPT_URL, $url);
    curl_setopt($ch2, CURLOPT_FOLLOWLOCATION, true);
    // This option name is garbled in the published listing;
    // CURLOPT_SSL_VERIFYHOST is an assumption.
    curl_setopt($ch2, CURLOPT_SSL_VERIFYHOST, false);
    curl_setopt($ch2, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch2, CURLOPT_RETURNTRANSFER, true);
    $texts = curl_exec($ch2);
    curl_close($ch2);
    return $texts === false ? "" : $texts;
}

// Return every email-like string found in $str.
function getEmail($str)
{
    $pattern = "/([a-z0-9\-_\.]+@[a-z0-9]+\.[a-z0-9\-_\.]+)/";
    preg_match_all($pattern, $str, $emailArr);
    return $emailArr[0];
}
?>