This search engine is suited to a medium-sized LAN. Because the pages it finds are stored in a database, it can search not only static HTML pages but also dynamic pages such as PHP and ASP. For a system of about 50,000 pages (with a PII-400 as the server), the search response time is about 2 to 10 seconds, which fully meets the requirements. Because Java, MySQL, and PHP are all cross-platform software, the search engine runs not only on Windows servers but also on other systems such as Linux.
First, create the database and tables that the search engine needs.
Start by creating the database:
c:\mysql\bin\> mysqladmin -uroot -pmypasswd create Spider
Then build the table structure in the database:
c:\mysql\bin\> mysql -uroot -pmypasswd Spider < Spider.mysql
Here Spider.mysql is a text file with the following contents:
CREATE TABLE link (
Id int(10) unsigned NOT NULL auto_increment,
Url varchar(120) NOT NULL,
Class tinyint(3) unsigned NOT NULL default 0,
IsSearchLink tinyint(3) unsigned default 0,
PRIMARY KEY (Url),
UNIQUE Id (Id),
KEY Url (Url),
KEY Class (Class)
);
# Initial home page address of the local area network; the search spider starts from this URL and finds all other pages
INSERT INTO link VALUES ('1', 'http://102.211.69.1/', '0', '0');
# Table webpagelocal stores all of the downloaded pages
CREATE TABLE webpagelocal (
Id int(10) unsigned NOT NULL auto_increment,
Url varchar(120) NOT NULL,
Content text NOT NULL,
PRIMARY KEY (Url),
UNIQUE Id (Id),
KEY Url (Url)
);
# Table webpagefindfast
# makefast.php extracts 512 bytes of retrieval information from each page in webpagelocal and stores it here
CREATE TABLE webpagefindfast (
Id int(10) unsigned NOT NULL,
Url varchar(120) NOT NULL,
Title varchar(64),
Content blob,
PRIMARY KEY (Url),
KEY Url (Url),
KEY Title (Title)
);
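
The comment on webpagefindfast says that makefast.php extracts 512 bytes of retrieval information from each page stored in webpagelocal. That script is not shown here, so the following is only a rough sketch of the step under stated assumptions. It is written in Python with an in-memory SQLite database standing in for PHP and MySQL, the tables are simplified copies of the ones above, and the names make_fast_record, open_demo_db, and build_fast_table are hypothetical. Stripping tags with regular expressions and cutting the text at 512 UTF-8 bytes is one plausible reading of "retrieval information", not necessarily what makefast.php does.

```python
import re
import sqlite3

SUMMARY_BYTES = 512  # size named in the comment on webpagefindfast
TITLE_CHARS = 64     # matches Title varchar(64)


def make_fast_record(html):
    """Return (title, summary) for one page (hypothetical helper)."""
    m = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    title = " ".join(m.group(1).split())[:TITLE_CHARS] if m else ""
    # Drop script/style bodies, then every remaining tag, then collapse whitespace.
    text = re.sub(r"<(script|style)\b.*?</\1>", " ", html,
                  flags=re.IGNORECASE | re.DOTALL)
    text = re.sub(r"<[^>]+>", " ", text)
    text = " ".join(text.split())
    # Cut by bytes; errors="ignore" discards a multi-byte character split by the cut.
    summary = text.encode("utf-8")[:SUMMARY_BYTES].decode("utf-8", errors="ignore")
    return title, summary


def open_demo_db():
    """Create simplified stand-ins for the two tables in an in-memory SQLite DB."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE webpagelocal (
            Id INTEGER, Url VARCHAR(120) PRIMARY KEY, Content TEXT NOT NULL);
        CREATE TABLE webpagefindfast (
            Id INTEGER, Url VARCHAR(120) PRIMARY KEY,
            Title VARCHAR(64), Content BLOB);
    """)
    return conn


def build_fast_table(conn):
    """Fill webpagefindfast with one short record per row of webpagelocal."""
    rows = conn.execute("SELECT Id, Url, Content FROM webpagelocal").fetchall()
    for page_id, url, content in rows:
        title, summary = make_fast_record(content)
        # INSERT OR REPLACE is SQLite syntax; it keeps one row per Url on re-runs.
        conn.execute(
            "INSERT OR REPLACE INTO webpagefindfast (Id, Url, Title, Content) "
            "VALUES (?, ?, ?, ?)",
            (page_id, url, title, summary),
        )
    conn.commit()


if __name__ == "__main__":
    conn = open_demo_db()
    conn.execute(
        "INSERT INTO webpagelocal (Id, Url, Content) VALUES (?, ?, ?)",
        (1, "http://102.211.69.1/", "<title>Index</title><p>Welcome</p>"),
    )
    build_fast_table(conn)
    print(conn.execute("SELECT * FROM webpagefindfast").fetchall())
```

Keeping only a short title and a fixed-size summary per page is presumably what makes the search fast: queries scan the small webpagefindfast rows instead of the full page text in webpagelocal.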