Detailed example and explanation of book collection rules for the Jackie novel system

Source: Internet
Author: User

Adding an acquisition rule

Rule description
System default variables: <{articleid}> is the article serial number, <{chapterid}> is the chapter serial number, <{subarticleid}> is the article sub-serial number, and <{subchapterid}> is the chapter sub-serial number.
System label * matches any string.
System label ! matches any string that does not contain < or >.
System label ~ matches any string that does not contain <, >, ' or ".
System label ^ matches any string that does not contain digits, < or >.
System label $ matches a string of digits.
In an acquisition rule, the part of the content that needs to be captured is replaced with four or more system labels, for example !!!!
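
To make the label semantics concrete, here is a minimal Python sketch that turns a rule into a regular expression. It is an illustration only: the function name rule_to_regex, the exact regex chosen for each label, and the decision to treat a run of fewer than four labels as a plain wildcard are assumptions based on the descriptions above, not the system's actual matcher.

```python
import re

# Assumed regex equivalents of the system labels described above.
LABEL_PATTERNS = {
    "*": r"[\s\S]*?",    # any string
    "!": r"[^<>]*?",     # any string without < or >
    "~": "[^<>'\"]*?",   # any string without < > ' "
    "^": r"[^<>\d]*?",   # any string without digits or < >
    "$": r"\d+",         # a string of digits
}


def rule_to_regex(rule: str) -> re.Pattern:
    """Convert a collection rule into a compiled regex (hypothetical helper).

    A run of four or more identical labels becomes a capturing group;
    a shorter run becomes a non-capturing wildcard; everything else is
    matched literally.
    """
    parts = []
    i = 0
    while i < len(rule):
        ch = rule[i]
        if ch in LABEL_PATTERNS:
            j = i
            while j < len(rule) and rule[j] == ch:
                j += 1
            pattern = LABEL_PATTERNS[ch]
            if j - i >= 4:
                parts.append("(" + pattern + ")")
            else:
                parts.append("(?:" + pattern + ")")
            i = j
        else:
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))


# Example with an invented page fragment.
author_rule = rule_to_regex('<a href="/author/wb/$.html">!!!!</a>')
match = author_rule.search('<li><a href="/author/wb/144238.html">Li Xingyu</a></li>')
captured_author = match.group(1) if match else None
```

In this sketch the single $ only has to match, while the !!!! run is what gets captured.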

Basic Settings

Website identifier: the identifier written to Configs\article\collectsite.php. You can fill in anything; it is usually a short form of the collected site's domain name, and it serves to tell this rule apart from other rules. Example: Feiku

Site name: the name of the collected site. Example: Feiku

Website address: the address of the collected site. Example: http://www.feiku.com

Article sub-serial-number calculation: this is optional; I leave it blank.
Arithmetic on the <{articleid}> variable is supported (+ add, - subtract, * multiply, / divide, % remainder).

Chapter sub-serial-number calculation: this is also optional; I leave it blank. (Who knows how many books the site puts in one folder; it follows no fixed rule, and I am not a collector.)
Arithmetic on the <{articleid}> variable is supported (+ add, - subtract, * multiply, / divide, % remainder).
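
As an illustration of what such a calculation could look like, the following Python sketch substitutes the article serial number into an expression and evaluates it with only the five listed operators. The expressions used here are hypothetical, and treating "/" as integer division is an assumption; the text above does not say how the system rounds.

```python
import ast
import operator

# Only the five operators named above are allowed.
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.floordiv,  # assumption: "/" means integer division
    ast.Mod: operator.mod,
}


def eval_sub_ordinal(expression: str, articleid: int) -> int:
    """Evaluate a sub-serial-number expression (hypothetical helper)."""
    text = expression.replace("<{articleid}>", str(articleid))
    tree = ast.parse(text, mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression: " + expression)

    return walk(tree)


# Hypothetical rule: one folder per thousand articles.
folder = eval_sub_ordinal("<{articleid}>/1000", 144238)
```

Walking the syntax tree instead of calling eval keeps anything other than integers and the listed operators out of the calculation.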

Proxy server address: leave this blank if you are not using a proxy server.

Proxy Server Port

Whether to clear the existing chapters and collect them again when they cannot be fully matched (Yes/No): choose according to your own needs.

Whether to mark collected articles as complete by default (Yes/No): choose according to your own needs. If you choose "Yes", every article is shown as complete on your site whether it is still being serialized or already finished, so "No" is recommended.

Send the HTTP_REFERER header to get around anti-collection settings (Yes/No): the default is "Yes". I do not know exactly what it is for, so I choose "Yes" and try to get past the protection first.

Page encoding of the other site (auto-detect, GB2312, UTF8, BIG5): the default is "auto-detect". If the encoding differs from this site's, the system will automatically try to convert it.

Article information page collection rules

Article information page address: the URL of the book information page, with the book ID replaced by <{articleid}>. Example:
http://feiku.com/book/<{articleid}>/index.html

Article title collection rule: you need to be able to read a web page's source for this; if you cannot, you can stop here. View the source of the information page and find where the article title appears (we are using Feiku as the example, so this is the location of the "article title" in the information page's source). Taking "My Beautiful Lady" as the example, the code near the title is
<div id="Crbooktitle"><span class="BookTitle">My Beautiful Lady</span></div>
Copy that code into the title collection rule box, then replace the real title, My Beautiful Lady, with !!!! to get
<div id="Crbooktitle"><span class="BookTitle">!!!!</span></div>
You could use another label such as ****, but it is better to use the label with the narrowest scope that still expresses what you mean. This is a matter of habit: it makes no difference here, where only the title is captured, but in other rules a broad label can pull in things you do not want.

Author collection rule: the code near the author is
<li class="L6"><a href="/author/wb/144238.html">Li Xingyu</a></li>
Here Li Xingyu is the content to collect, so replace it with !!!!. The number 144238 applies only to this article and other articles have other numbers, so replace it with the numeric label $. The author collection rule is therefore
<li class="L6"><a href="/author/wb/$.html">!!!!</a></li>

Article type collection rule: the code near the type is
<li class="L2"><a href="/book/ln/133.html">Urban</a></li>
Following the two rules above, it is easy to see that the rule here is
<li class="L2"><a href="/book/ln/$.html">!!!!</a></li>

Article type correspondence: you write this yourself. Here is the mapping for Feiku for reference:
Fantasy=>1||Fantasy=>1||Martial Arts=>2||Fairy Knight=>2||Romance=>3||City=>3||Sci-Fi=>7||Supernatural=>8||Game=>6||Athletic=>6||Historical=>4||Military=>4||Mei Wen=>10||Doujinshi=>9||Biography=>10||Famous=>10||Notes=>10||Joke=>10||Foreign=>10||Classical=>10||Children=>10||Detective=>5||Operated=>10||Fashion=>10||English=>10||Computer=>10||Learn=>10||Legal=>10||Other=>10
The other site's type name and this site's type number are joined by "=>", and entries are separated by "||". The type name "default" identifies the default mapping.
The types and numbers on this site are as follows:
Fantasy Magic=>1||Martial Arts Cultivation=>2||Urban Romance=>3||Historical Military=>4||Detective Reasoning=>5||Online Games Anime=>6||Science Fiction=>7||Horror Supernatural=>8||Prose Poetry=>9||Other=>10
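
The mapping format can be read mechanically. The sketch below parses a correspondence string into a dictionary and looks up a collected type, falling back to the "default" entry. The function names and the sample string are made up for illustration, and the fallback behaviour is an assumption drawn from the description of "default" above.

```python
def parse_type_map(spec: str) -> dict:
    """Parse 'name=>number||name=>number' into a dict (hypothetical helper)."""
    mapping = {}
    for entry in spec.split("||"):
        entry = entry.strip()
        if not entry:
            continue
        name, separator, number = entry.partition("=>")
        if not separator:
            raise ValueError("missing '=>' in entry: " + entry)
        mapping[name.strip()] = int(number)
    return mapping


def lookup_type(mapping: dict, source_type: str):
    """Return this site's type number, or the 'default' entry if unmapped."""
    return mapping.get(source_type.strip(), mapping.get("default"))


# Invented sample string in the format described above.
type_map = parse_type_map("City=>3||Detective=>5||default=>10")
city_number = lookup_type(type_map, "City")
unknown_number = lookup_type(type_map, "Cooking")
```

If two entries share a name, as the two "Fantasy" entries in the Feiku list do, the later one simply overwrites the earlier one in this sketch.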

Keyword collection rule: find the code near the keywords:
main character search keyword--my beautiful Li Xingyu beautiful city<br/>
Here "my beautiful Li Xingyu beautiful city" is replaced with ****, so the resulting rule is
main character search keyword--****<br/>

Content introduction collection rule: the code near the introduction is
<div id="crbssum">"Big" miss and "little" miss, don't torment me, OK? I beg you!~~!<br>He has billions but is unwilling to live the Zhu Men life of wine and intrigue; giving up the family's big company, he instead chose to be an ordinary white-collar worker in a small company.<br>Playing the hero to rescue a beauty in a restaurant let him meet a great beauty, and this beauty unexpectedly is the daughter of the head of the Shanghai head office of the company where Liu Xing works; in other words, she is his big miss.<br>On the surface she is a very beautiful, seemingly elegant lady, but she has an unknown side that can really cost a man his life!<br>Me, be a nanny? Big miss, you're joking; you won't do anything, and I'm to babysit?<br>The boss has two daughters? So the beauty beside you is the second miss?<br>Huh? What? And you've decided to live here? Ah! Stop messing with me~~! One is enough for me, and now another one. Truly a "big" and a "little" miss!<br>The big miss looks elegant and gentle but is very scatterbrained; the little miss looks icily gorgeous but is a hot-tempered shrew. The two sisters, from small to large, are now going to live in my home. This... is really going to be lively!<br>Wanting to chase beauties only to be "chased" by the "big" and "little" misses! Ah~~! They're not letting anyone live~~!<br/></div>
<div id="Crbsrole">
According to the above, the resulting rule is
<div id="crbssum">****<br/></div>
<div id="Crbsrole">
Note: some code in the source file is broken across lines. When you copy it, change nothing except replacing the content to collect with a label. Do not change the format: do not see a line break and backspace it to join the line with the one before.

Cover picture collection rule: the code near the cover picture starts at <div id="crbtlbookimg"> and ends at </div>, and the resulting rule keeps that wrapper. In the img tag, width="125" height="$" can also be written as width="$" height="$", but there is no need to do so if all cover pictures on the collected site are the same size. To find the cover picture in the source file, go to the information page, view the picture's properties to see its file name, then search the source file for that name.

Filtered cover picture: find an article that has no cover picture, look at what is inside img src="...", and enter that here. For Feiku it is /img/noimg.gif

Catalog page link collection rule: because we did not fill in the sub-serial-number calculation earlier, we use this rule to collect the sub-serial number instead. In the source of the article information page, find the code near the catalog page link (usually near the "click to read" link; on Feiku it is the code near "Click to read"):
<a href="/html/book/168/144238/list.shtm"><font color="#CC0000">Click to read</font></a>
The content to collect is 168, and both 168 and 144238 can be any number, so the resulting rule is
<a href="/html/book/$$$$/$/list.shtm"><font color="#CC0000">Click to read</font></a>
The content collected by this rule is stored as <{indexlink}> (it stands in for the sub-serial number from here on) and can be used in "Article catalog page address" below.

Complete-work flag collection rule: find a work that is finished, and in its information page source find the code near the writing progress (where the progress reads "End"):
<li class="L3">Writing progress</li>
<li class="L4">End</li>
Replace "Writing progress" with !!!!, so the resulting rule is
<li class="L3">!!!!</li>
<li class="L4">End</li>
This rule does not capture content to save. If it matches, the article is treated as complete; if it does not match, the article is treated as serialized.
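
The match-or-no-match behaviour can be sketched in a few lines of Python. The regular expression is a hand-written stand-in for the rule above (the !!!! label becomes "any text without < or >"), the page fragments are invented, and allowing arbitrary whitespace between the two lines is an assumption.

```python
import re

# Hand-written equivalent of the complete-work rule; \s* tolerates the line break.
COMPLETE_RULE = re.compile(
    r'<li class="L3">[^<>]*?</li>\s*<li class="L4">End</li>'
)


def article_status(info_page_html: str) -> str:
    """Return 'complete' if the flag rule matches, otherwise 'serialized'."""
    if COMPLETE_RULE.search(info_page_html):
        return "complete"
    return "serialized"


finished_page = '<li class="L3">Writing progress</li>\n<li class="L4">End</li>'
ongoing_page = '<li class="L3">Writing progress</li>\n<li class="L4">Serializing</li>'
finished_status = article_status(finished_page)
ongoing_status = article_status(ongoing_page)
```

Nothing is captured here; only the presence of a match matters.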

Article catalog page collection rules

Article catalog page address: the address of the catalog page, for example
http://feiku.com/html/book/168/144238/List.shtm
Replace 168, the sub-serial number, with the <{indexlink}> collected above, and replace 144238, the article serial number, with <{articleid}>. The resulting rule is
http://www.feiku.com/Html/Book/<{indexlink}>/<{articleid}>/list.shtm

Volume name collection rule: view the source of the catalog page and find the code near the volume name, which begins with <div id="Nclasstitle">. The text that follows it is what we want to collect, so replace it with !!!!. The resulting rule is <div id="Nclasstitle">!!!!

Chapter name collection rule: find the code near the chapter name:
Update word count: 3402">Chapter One Elephant~~Elephant~~!</a></li>
Here "Chapter One Elephant~~Elephant~~!" is the content to collect, so replace it with !!!! or ****. The number 3402 can be any number, so replace it with $. The resulting rule is
Update word count: $">!!!!</a></li>

Chapter serial number collection rule: find the code near the chapter serial number:
<li><a href="3320510.shtm" title="Update Time:
Here 3320510 is the chapter serial number we want to collect, so replace it with $$$$. The resulting rule is
<li><a href="$$$$.shtm" title="Update Time

Chapter content page collection rules

Chapter content page address: for example
http://feiku.com/html/book/168/144238/3320510.shtm
Replace 168, the sub-serial number, with the <{indexlink}> collected above; replace 144238, the article serial number, with <{articleid}>; and replace 3320510, the chapter serial number, with <{chapterid}>. The resulting rule is
http://www.feiku.com/Html/Book/<{indexlink}>/<{articleid}>/<{chapterid}>.shtm
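
To show how the address templates expand, here is a small Python sketch that fills the <{...}> variables with concrete values. The function name fill_template is hypothetical, and raising an error for an unfilled variable is a choice made for this sketch, not documented behaviour of the system.

```python
def fill_template(template: str, **values) -> str:
    """Replace each <{name}> marker with its value (hypothetical helper)."""
    result = template
    for name, value in values.items():
        result = result.replace("<{" + name + "}>", str(value))
    if "<{" in result:
        raise ValueError("unfilled variable in: " + result)
    return result


chapter_template = (
    "http://www.feiku.com/Html/Book/<{indexlink}>/<{articleid}>/<{chapterid}>.shtm"
)
chapter_url = fill_template(
    chapter_template, indexlink=168, articleid=144238, chapterid=3320510
)
```

With the numbers from the example, the template expands to the chapter address for sub-serial number 168, article 144238 and chapter 3320510.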

Chapter content collection rule: the code near the chapter content is too long to quote in full, so I was lazy and shortened it:
</div>
<div id="Booktext">Chapter content
</div>
In some books the id in the <div id="Booktext"> above is not Booktext but something else, such as <div id="ssmmkkg">, but
</div>
<div id="
is always present, so use that. Replace the chapter content with ****. The resulting rule is as follows; study it for yourselves:
</div>
<div id="****</div>


Chapter content filtering rules: anything in the content captured above that you do not want can be listed here. Here are some of the things I remove; you can add your own.
<a href= "/user/messages.aspx?to=badmin&title=
[Fei ku nethttp://www.feiku.com]
http://www.feiku.com
Flying Warehouse Network
http://www.cmfu.com
Booktext ">
Cmfu.com
To use multiple filter rules, put each rule on its own line. Replacement labels can be used, for example: <div>!</div>
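
A minimal Python sketch of the one-rule-per-line idea follows. It handles only literal filter strings; rules that contain replacement labels such as <div>!</div> would first have to be converted to patterns, which this sketch does not attempt. The function name and the sample text are invented.

```python
def apply_filters(content: str, filter_rules: str) -> str:
    """Remove every literal filter string, one rule per line (hypothetical helper)."""
    for line in filter_rules.splitlines():
        rule = line.strip()
        if rule:  # skip blank lines
            content = content.replace(rule, "")
    return content


rules_text = "http://www.feiku.com\nCmfu.com"
sample = "Hello http://www.feiku.com world Cmfu.com"
cleaned = apply_filters(sample, rules_text)
```

In this sketch the rules are applied from top to bottom, so a longer string that contains a shorter one should be listed first.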
Whether to save picture content locally (Yes/No): choose according to your own needs (decide the rest of the settings below for yourselves; I am exhausted and signing off).
Processing pictures that are saved locally requires GD library support.
Whether to enable picture processing (Yes/No): enabling picture processing has some impact on collection speed.
Whether to add a watermark to collected pictures (Yes/No)
The watermark is configured in this module's parameter settings and works the same way as the watermark for manually uploaded pictures.
Background color of collected pictures
Leave this blank and the system will determine it automatically.
Erase the original picture watermark by region
Erases the contents of a region given by rectangle coordinates within the picture. A rectangle is represented by four numbers separated by ",": the x,y of the rectangle's upper-left corner and the x,y of its lower-right corner. When x,y are greater than 0, they count pixels from the upper-left corner of the picture; when x,y are less than 0, they count pixels back from the lower-right corner of the picture. Separate multiple regions with "|".
For example, setting this item to "1,1,100,50|-100,-50,-1,-1" denotes a 100*50 rectangular region in each of the upper-left and lower-right corners.
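
The coordinate convention can be checked with a short Python sketch that resolves each region to absolute pixel coordinates for a picture of known size. The function names and the 800 by 600 picture are made up, and reading a negative value -n as "n pixels back from the right or bottom edge" (that is, size minus n) is an assumption based on the description above; the behaviour for 0 is not stated, so it is treated here as counting from the upper-left corner.

```python
def resolve(value: int, size: int) -> int:
    """Turn a possibly negative coordinate into an absolute one (assumed convention)."""
    return value if value >= 0 else size + value


def parse_regions(spec: str, width: int, height: int) -> list:
    """Parse 'x1,y1,x2,y2|x1,y1,x2,y2' into absolute rectangles (hypothetical helper)."""
    regions = []
    for part in spec.split("|"):
        part = part.strip()
        if not part:
            continue
        numbers = [int(v) for v in part.split(",")]
        if len(numbers) != 4:
            raise ValueError("a region needs four numbers: " + part)
        x1, y1, x2, y2 = numbers
        regions.append(
            (resolve(x1, width), resolve(y1, height),
             resolve(x2, width), resolve(y2, height))
        )
    return regions


regions = parse_regions("1,1,100,50|-100,-50,-1,-1", width=800, height=600)
```

Under these assumptions the second region of the example lands in the lower-right corner of the picture.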
Erase the original picture watermark by color
The watermark color usually differs from the picture's background and content colors. You can set several watermark colors to erase, separated by "|", for example "#FAFAFA|#FF0000|#00FF00"
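
For reference, a color list in this format can be split into RGB values as in the following Python sketch. The function name is hypothetical, and tolerating spaces around "|" is an assumption.

```python
def parse_colors(spec: str) -> list:
    """Parse '#RRGGBB|#RRGGBB' into a list of (r, g, b) tuples (hypothetical helper)."""
    colors = []
    for part in spec.split("|"):
        text = part.strip().lstrip("#")
        if not text:
            continue
        if len(text) != 6:
            raise ValueError("expected six hex digits: " + part)
        colors.append(tuple(int(text[i:i + 2], 16) for i in (0, 2, 4)))
    return colors


watermark_colors = parse_colors("#FAFAFA|#FF0000|#00FF00")
```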
