Introduction to PHP web scraping: a tutorial that teaches you how to write a collector

Source: Internet
Author: User
Tags: preg

Our first step is to collect all the links. We are not scraping a single article here; the goal is to collect an entire book and save it to a text file, because MP3 players are everywhere now and can display e-books.
How should the book be saved? Under its title, of course, so it is easy to find. So we collect the book's title first.
Let's take a look at the original tag on the page:
<meta name="description" content="Wu Xian (ii), after Jin Yong martial Arts Bible: 2">
The rule is:
<meta name="description" content="title">
Now let's write the regular expression. Don't tell me you can't; if you can't, come to Hunan, hey hey, there are plenty of experts here.
Regular expression:
<meta name="description" content="(.*?)">
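To try the pattern before touching the network, here is a minimal sketch that runs it against a hard-coded tag. The `$html` sample and the `extract_title` helper name are made up for illustration; the pattern is the one above, wrapped in `/` delimiters with the `is` modifiers (case-insensitive, and `.` also matches newlines).

```php
<?php
// Hypothetical helper (not part of the tutorial's script): returns the content
// of the description meta tag, or null when the tag is not found.
function extract_title(string $html): ?string
{
    // (.*?) is lazy, so it stops at the first closing quote and bracket.
    if (preg_match('/<meta name="description" content="(.*?)">/is', $html, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Sample input, assumed for this sketch.
$html = '<head><meta name="description" content="Sample Book: 2"></head>';

echo extract_title($html), "\n";          // prints: Sample Book: 2
var_dump(extract_title('<p>none</p>'));   // NULL, because no meta tag is present
```

Checking the return value of preg_match() before reading `$m[1]` avoids an undefined-index notice when a page does not contain the tag.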
Here we go! To fetch the page, the first thing we need is a function:
file_get_contents()
Introduction:
Main purpose: reads an entire file into a string
The prototype is: string file_get_contents
(string filename [, bool use_include_path [, resource context [, int offset [, int maxlen]]]])


What does that mean? In short, it reads the resource you name (a local file or, when allow_url_fopen is enabled, a URL) and hands back its entire contents as a string that you can assign to a variable.
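A minimal sketch of the call, using a temporary local file so it runs without network access. The file contents and variable names are assumptions for illustration only; the same call accepts a URL in place of the path.

```php
<?php
// Write a small temporary file so the sketch needs no network access.
$path = tempnam(sys_get_temp_dir(), 'fgc');
file_put_contents($path, "Hello, collector!");

// Read the whole file into a string.
$all = file_get_contents($path);

// Read 9 bytes starting at byte offset 7 (the offset and maxlen parameters).
$part = file_get_contents($path, false, null, 7, 9);

echo $all, "\n";   // Hello, collector!
echo $part, "\n";  // collector

// On failure the function returns false, so compare with === before using it.
// The @ only silences the warning PHP would otherwise print here.
$missing = @file_get_contents($path . '.does-not-exist');
echo $missing === false ? "read failed\n" : "read ok\n";

unlink($path);
```

The strict `=== false` check matters because an empty page is returned as an empty string, which is also falsy.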
That is all we need to begin with. A basic understanding is enough for now; writing the code will deepen it and make it stick. Let me walk through how the program is put together:
We are collecting from an address, and not for just one book, so the address changes. What do we use for something that changes? At this point a huge piece of chalk came flying at me. "Didn't I tell you? A variable!" The strict teacher Wangjianjun put his whole strength into that chalk and threw it at me without mercy. I wanted to cry... The teacher hit me!!! Time to go home and study.
A variable it is, then. We store the address in a variable, like this:
$url = "http://book.sina.com.cn/nzt/lit/zhuxian2/index.shtml";//Book Address
With the above in hand, you should be able to follow the whole thing. Here is the code:
<?php


//****************************************************************

$url = "http://book.sina.com.cn/nzt/lit/zhuxian2/index.shtml"; // book address

$ver = "old"; // page layout: "new" or "old"

// The site uses two different page layouts for books, so we tell them apart here.

//****************************************************************


// Fetch the page source. file_get_contents() reads the file into a string, which is used below.
$r = file_get_contents($url);

// Search that string for the title and store the matches in $booktitle, which is an array.
// The trailing "is" makes the pattern case-insensitive and lets "." match newlines.
preg_match('/<meta name="description" content="(.*?)">/is', $r, $booktitle);

// Capture group 1 holds the title text; assign it to $bookname.
$bookname = $booktitle[1]; // title

// print_r($booktitle); die(); // If this is unclear, uncomment this line to inspect the array.


/*************************************************************************************
 * Prototype: <li><a href=/nzt/lit/zhuxian2/1.shtml target=_blank class=a03>Chapter 45 Pain (1)</a>
 * The rule is: <li><a href=(varies).shtml target=_blank class=a03>(varies)</a>
 * "isU" are the pattern modifiers. U makes the pattern ungreedy, that is, it stops at the first possible match.
 *************************************************************************************/

$preg = '/<li><a href=(.*)\.shtml target=_blank class=a03>/isU';


/********************************************************************************
 * preg_match_all performs a global regular expression match.
 * Prototype:
 *   int preg_match_all(string pattern, string subject, array matches [, int flags])
 * Meaning: search the whole of $r for every match of the pattern in $preg and
 * store the results in $zj, which is an array.
 * The optional flags control how the results are arranged; if that is unclear, read up on arrays!
 * Teacher Wang says: anyone who does not know arrays can go outside, chew through the book, and come back in when they do.
 **********************************************************************************/

preg_match_all($preg, $r, $zj);

// print_r($zj); die(); // If this is unclear, uncomment this line to inspect the array.


// Count the chapters. The final message uses this to report how many chapters there are and how many were collected.
$bookzj = count($zj[1]);


// Decide which page layout you are collecting from, because the content markers differ.
// This could be detected automatically. I wrote that too but am not publishing it, because it is very simple.
// Note: the "new" marker strings below appear translated in this copy; copy the real
// comment markers from the page source before running.
if ($ver == "new") {
    $content_start = "<!--the contents of the text began-->";
    $content_end = "<!--body content end-->";
}

if ($ver == "old") {
    $content_start = "</table><!--newszw_hzh_end-->";
    $content_end = "<br>";
}


// Once a page is fetched it gets processed. First set the encoding. Why this one? Look at the site's page source, hey!
header("Content-Type: text/html; charset=gb2312");

/*****************************************************************************************
 * Stamp a copyright notice first, lest someone infringe. Hey, it seems I am the one infringing!
 * So-and-so will surely want to kill me. This call writes the notice and creates the file.
 *****************************************************************************************/

writer($bookname . " - " . $bookzj . " sections in total\r\nCollected and compiled by Handsome Liu on " . date("D M j G:i:s T Y") . " for a graduation project\r\n", "./ljy/" . $bookname . ".txt", "w+");

/*****************************************************************************************
 * Merge pages 1 through 136 in one go. This is the most fun part...
 *****************************************************************************************/

for ($i = 0; $i < $bookzj; $i++) { // hint: $bookzj was computed above

    // echo "http://book.sina.com.cn" . $zj[1][$i] . ".shtml"; die(); // uncomment to check the chapter URL

    $str = file_get_contents("http://book.sina.com.cn" . $zj[1][$i] . ".shtml");

    preg_match('/(<title>)(.*?)(<\/title>)/is', $str, $title);

    // Strip any tags from the title, then drop the site suffix.
    // Note: the suffix appears translated in this copy; copy the real one from the page's <title>.
    $title = str_replace("_ Reading Channel _ Sina Net", "", preg_replace('/<(.*?)>/s', "", $title[2]));


/***************************************************************************
 * preg_replace performs a regular expression search and replace.
 * str_replace is hard to describe in words, so look at an example! It is simply a replacement.
 * (The two examples are written in JavaScript style to show the idea.)
 * str = "abcabc".replace(/a/g, "D"); the result is DbcDbc
 * str = "abcabc".replace(/a/, "D");  the result is Dbcabc
 ***************************************************************************/


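    // preg_quote() escapes the markers so that characters such as the "/" in "</table>" are matched literally.
    preg_match("/(" . preg_quote($content_start, "/") . ")(.*?)(" . preg_quote($content_end, "/") . ")/is", $str, $content);

    // Turn paragraph ends into line breaks, then strip the remaining tags.
    $content = preg_replace('/<(.*?)>/s', "", str_replace("</p>", "\r\n", $content[2]));

    // Drop a leading blank line, then strip leftover characters.
    // NOTE: the search string on the next line was lost in this copy (it rendered as a
    // line break); "&nbsp;" is an assumption, so check it against the page source.
    $content = str_replace("&nbsp;", "", preg_replace('/^[\s]*\n/is', "", $content));

    // NOTE: "?" stands in for a character that was mis-encoded in this copy. Substitute
    // the real character before running, otherwise this line removes ordinary question marks.
    $content = str_replace("?", "", preg_replace('/^[\s]*\n/is', "", $content));

    $result = "\r\nSection " . ($i + 1) . " -------- " . $title . " _ Mr. Wang is handsome ---------\r\n" . $content;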


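    // var_dump($result); die(); // uncomment to inspect the assembled text

    // Note: the header above was written under ./ljy/ but chapters are appended under ./ailaopo/.
    // Point both calls at the same directory if you want a single file.
    writer($result, "./ailaopo/" . $bookname . ".txt", "a+");

    echo "Novel " . $bookname . ": " . $bookzj . " sections in total, now processed section " . ($i + 1) . " _ " . $title . "<br>";

}
echo "Novel " . $bookname . ": all " . $bookzj . " sections have been processed!";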


// Write $content to the file at $url using the given fopen() mode ("w+" truncates, "a+" appends).
function writer($content, $url, $mode)
{
    $fp = fopen($url, $mode);
    fwrite($fp, $content);
    fclose($fp);
}
?>
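The two extraction steps of the script can be tried offline. The following is a minimal sketch under stated assumptions: the `$index` and `$page` strings are invented samples that mimic the link format and the "old" layout markers shown above, and `between` is a hypothetical helper name, not something from the original script.

```php
<?php
// Sample index page, assumed for this sketch; it mimics the link format above.
$index = "<li><a href=/nzt/lit/zhuxian2/1.shtml target=_blank class=a03>Chapter 1</a>\n"
       . "<li><a href=/nzt/lit/zhuxian2/2.shtml target=_blank class=a03>Chapter 2</a>\n";

// U makes (.*) ungreedy, so each capture stops at the first ".shtml".
preg_match_all('/<li><a href=(.*)\.shtml target=_blank class=a03>/isU', $index, $zj);
print_r($zj[1]); // the captured paths, one per chapter link

// Hypothetical helper: returns the text between two literal markers, or null.
function between(string $html, string $start, string $end): ?string
{
    // preg_quote() escapes regex metacharacters and the "/" delimiter.
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/is';
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

// Sample chapter page using the "old" layout markers from the script.
$page = "<table></table><!--newszw_hzh_end--><p>First line.</p><p>Second line.</p><br>footer";
$body = between($page, "</table><!--newszw_hzh_end-->", "<br>");

// Same cleanup order as the script: </p> becomes a line break, then tags are stripped.
$text = preg_replace('/<(.*?)>/s', '', str_replace('</p>', "\r\n", $body));
echo $text;
```

Quoting the markers with preg_quote() is the design choice worth noting: without it, the `/` inside `</table>` would end the pattern early and preg_match() would fail.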
