Lucene-Based Case Development: Database Design for the Zongheng Novel Collection

When reprinting, please cite the source: http://blog.csdn.net/xiaojimanman/article/details/45694049

http://www.llwjy.com/blogdetail/fa404163e42295646ab6e36e1ddb1037.html

My personal blog site is now online at www.llwjy.com ~ comments and criticism are welcome ~
-------------------------------------------------------------------------------------------------

First, my apologies to everyone: for work and personal reasons, this blog has gone without updates for more than a month. I will gradually resume posting.


The previous posts covered data collection from the Zongheng Chinese novel site; this post covers the database design.


Design ideas

We design four tables to store the data produced by the four collection classes for the Zongheng Chinese novel site. Table crawllist stores the collection entry points: the address to collect, the collection state, and the collection frequency. Table novelinfo stores the URLs found by the update-list page collector; the introduction page collector then fills in the remaining fields. Table novelchapter stores the information obtained by the chapter-list collector, and table novelchapterdetail stores the information obtained by the reading-page collector. The contents of novelchapterdetail could be merged into novelchapter, but the two are deliberately kept separate to allow for future expansion.
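To make the flow between the four tables concrete, here is a minimal sketch of which collector writes to which table. It uses Python's built-in sqlite3 as a stand-in for MySQL and keeps only a few columns per table; the function name build_demo, the ids, the example.com URLs, and the meaning of state '1' (still to be collected) are all assumptions made for illustration, not part of the original project.

```python
import sqlite3

# Simplified stand-ins for the four tables (only a few columns each).
SCHEMA = """
CREATE TABLE crawllist (id INTEGER PRIMARY KEY AUTOINCREMENT,
                        url TEXT NOT NULL, state TEXT NOT NULL);
CREATE TABLE novelinfo (id TEXT PRIMARY KEY, url TEXT NOT NULL, name TEXT,
                        chapterlisturl TEXT, state TEXT NOT NULL);
CREATE TABLE novelchapter (id TEXT PRIMARY KEY, url TEXT NOT NULL, title TEXT,
                           chapterid INTEGER, state TEXT NOT NULL);
CREATE TABLE novelchapterdetail (id TEXT PRIMARY KEY, url TEXT NOT NULL,
                                 title TEXT, content TEXT);
"""

def build_demo():
    """Walk one hypothetical novel through the four collection stages."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Entry point: an update-list page registered in crawllist.
    conn.execute("INSERT INTO crawllist (url, state) VALUES (?, '1')",
                 ("http://example.com/update-list",))
    # Update-list collector: stores each introduction page URL it finds.
    conn.execute("INSERT INTO novelinfo (id, url, state) VALUES (?, ?, '1')",
                 ("n1", "http://example.com/book/1"))
    # Introduction page collector: fills in the remaining novelinfo fields.
    conn.execute("UPDATE novelinfo SET name = ?, chapterlisturl = ? WHERE id = ?",
                 ("Demo Novel", "http://example.com/book/1/chapters", "n1"))
    # Chapter-list collector: one novelchapter row per chapter.
    conn.execute("INSERT INTO novelchapter (id, url, title, chapterid, state) "
                 "VALUES (?, ?, ?, ?, '1')",
                 ("c1", "http://example.com/book/1/1", "Chapter 1", 1))
    # Reading-page collector: stores the chapter text in novelchapterdetail.
    conn.execute("INSERT INTO novelchapterdetail (id, url, title, content) "
                 "VALUES (?, ?, ?, ?)",
                 ("c1", "http://example.com/book/1/1", "Chapter 1", "Chapter text"))
    conn.commit()
    return conn
```

Each stage only needs the rows written by the stage before it, which is why the tables can be filled by separate programs.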

The crawllist, novelinfo, and novelchapter tables each carry a state field that marks whether the URL in that row still needs to be collected; this field is the key to implementing distributed collection.
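One way a state field can support several collectors working in parallel is a conditional update: a worker only keeps a row if its own UPDATE is the one that flipped the state. The sketch below is an illustration under stated assumptions, not the author's implementation: it assumes state '1' means "needs collecting" and '0' means "claimed or done", uses Python's sqlite3 in place of MySQL, and the function name claim_next is hypothetical.

```python
import sqlite3

# Tables that carry a state column in this design.
STATE_TABLES = {"crawllist", "novelinfo", "novelchapter"}

def claim_next(conn, table="novelinfo"):
    """Claim one pending row and return (id, url), or None if nothing is pending.

    Assumption: state '1' = needs collecting, state '0' = claimed or done.
    """
    if table not in STATE_TABLES:
        raise ValueError("table has no state column: " + table)
    while True:
        row = conn.execute(
            "SELECT id, url FROM " + table +
            " WHERE state = '1' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The extra "AND state = '1'" makes the claim conditional: if another
        # worker claimed the row first, no row matches and rowcount is 0.
        cur = conn.execute(
            "UPDATE " + table + " SET state = '0' WHERE id = ? AND state = '1'",
            (row[0],),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row
```

A worker would call claim_next in a loop, collect the returned URL, and stop or sleep when it gets None. Setting state back to '1' later would schedule the URL for collection again.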


Table Design Diagram



SQL text

/*
Navicat MySQL Data Transfer

Source Server         : local database
Source Server Version : 50151
Source Host           : localhost:3306
Source Database       : novel

Target Server Type    : MYSQL
Target Server Version : 50151
File Encoding         : 65001

Date: 2015-05-13 15:37:35
*/

SET FOREIGN_KEY_CHECKS=0;

-- ----------------------------
-- Table structure for `crawllist` (novel collection entry points)
-- ----------------------------
DROP TABLE IF EXISTS `crawllist`;
CREATE TABLE `crawllist` (
  `id` bigint(...) NOT NULL AUTO_INCREMENT,
  `url` varchar(...) NOT NULL, ## collection URL
  `state` enum('1','0') NOT NULL, ## collection state
  `info` varchar(...) DEFAULT NULL, ## description
  `frequency` int(11) DEFAULT '...', ## collection frequency
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- ----------------------------
-- Table structure for `novelchapter` (novel chapter information)
-- ----------------------------
DROP TABLE IF EXISTS `novelchapter`;
CREATE TABLE `novelchapter` (
  `id` varchar(...) NOT NULL,
  `url` varchar(...) NOT NULL, ## reading page URL
  `title` varchar(...) DEFAULT NULL, ## chapter name
  `wordcount` int(11) DEFAULT NULL, ## word count
  `chapterid` int(11) DEFAULT NULL, ## chapter order
  `chaptertime` bigint(...) DEFAULT NULL, ## chapter time
  `createtime` bigint(...) DEFAULT NULL, ## creation time
  `state` enum('1','0') NOT NULL, ## collection state
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- ----------------------------
-- Table structure for `novelchapterdetail` (novel chapter details)
-- ----------------------------
DROP TABLE IF EXISTS `novelchapterdetail`;
CREATE TABLE `novelchapterdetail` (
  `id` varchar(...) NOT NULL,
  `url` varchar(...) NOT NULL, ## reading page URL
  `title` varchar(50) DEFAULT NULL, ## chapter title
  `wordcount` int(11) DEFAULT NULL, ## word count
  `chapterid` int(11) DEFAULT NULL, ## chapter order
  `content` text, ## chapter text
  `chaptertime` bigint(...) DEFAULT NULL, ## chapter time
  `createtime` bigint(...) DEFAULT NULL, ## creation time
  `updatetime` bigint(...) DEFAULT NULL, ## last update time
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- ----------------------------
-- Table structure for `novelinfo` (novel introduction information)
-- ----------------------------
DROP TABLE IF EXISTS `novelinfo`;
CREATE TABLE `novelinfo` (
  `id` varchar(...) NOT NULL,
  `url` varchar(...) NOT NULL, ## introduction page URL
  `name` varchar(...) DEFAULT NULL, ## novel name
  `author` varchar(...) DEFAULT NULL, ## author name
  `description` text, ## novel synopsis
  `type` varchar(...) DEFAULT NULL, ## category
  `lastchapter` varchar(...) DEFAULT NULL, ## latest chapter name
  `chaptercount` int(11) DEFAULT NULL, ## number of chapters
  `chapterlisturl` varchar(...) DEFAULT NULL, ## chapter list page URL
  `wordcount` int(11) DEFAULT NULL, ## word count
  `keywords` varchar(...) DEFAULT NULL, ## keywords
  `createtime` bigint(...) DEFAULT NULL, ## creation time
  `updatetime` bigint(...) DEFAULT NULL, ## last update time
  `state` enum('1','0') NOT NULL, ## collection state
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
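Because the chapter text lives in novelchapterdetail while the chapter metadata lives in novelchapter, reading a full chapter takes a join. The sketch below assumes that both tables use the same id for the same chapter, which the schema suggests but does not state; it uses Python's sqlite3 in place of MySQL, and the function name load_chapter is hypothetical.

```python
import sqlite3

def load_chapter(conn, chapter_id):
    """Return (title, chapterid, content) for one chapter, or None if missing.

    Assumption: novelchapter.id and novelchapterdetail.id identify the
    same chapter, so the two tables can be joined on id.
    """
    return conn.execute(
        "SELECT c.title, c.chapterid, d.content "
        "FROM novelchapter AS c "
        "JOIN novelchapterdetail AS d ON d.id = c.id "
        "WHERE c.id = ?",
        (chapter_id,),
    ).fetchone()
```

A chapter whose reading page has not been collected yet has no row in novelchapterdetail, so the inner join returns None for it.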

----------------------------------------------------------------------------------------------------
PS: I recently found that other sites may be reposting this blog without a link to the source. For more posts on Lucene-based case development, visit http://blog.csdn.net/xiaojimanman/article/category/2841877 or http://www.llwjy.com/blogtype/lucene.html
