[Author: Zhang Yan; article version: V1.0; last modified: 2008.12.09; when reprinting, please cite the original article link: http://blog.zyan.cc/post/#/] In July, I wrote an article titled "Architecture Design for Full-Text Retrieval (Search Engine) of Tens of Millions of Data Records Based on Sphinx + MySQL". The former company's classification information search was
regularly update the database. The update cycle is usually weeks or months. The larger the index database, the harder it is to update. There is too much information on the Internet; even a powerful collector cannot gather all of it. The collector therefore follows a search policy to traverse the Internet and download documents. For example, the collector gene
XML Sitemaps: I personally think this is an SEO plugin that every WordPress blog needs. Although it is presented as a Google XML sitemap generation tool, the XML sitemap the plugin generates can also be read by search engines such as Ask, MSN, and Yahoo.
For more sitemap tools, see: Sitemap Tools (an introduction to free sitemap generation tools and plugins), Google sitemap generation tool beta
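To make the sitemap format concrete, here is a minimal Python sketch that writes a sitemap in the sitemaps.org XML format. It is not the plugin's own code; the function name `build_sitemap` and the example.com URLs are hypothetical and only illustrate the shape of the file.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Build sitemap XML from (url, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            # lastmod uses the W3C date format, e.g. 2008-12-09
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical pages, for illustration only.
print(build_sitemap([
    ("http://example.com/", "2008-12-09"),
    ("http://example.com/about", None),
]))
```

The resulting file is usually saved as sitemap.xml at the site root so that search engines can discover it.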
Headspace2-this plug-i
Website Archives
Website archives (the Wayback Machine, a "time machine" for websites)
Search engine page indexing statistics
You can query the pages indexed by the major search engines and compare the results with five websites of the same type.
Google Update Check Too
How many pages are indexed on my site?
If you want to know how many pages are indexed on your site, perform a simple test first. Go to Google or another search engine you like and search for your company name. If the company name is a common one (such as AAA Plumbing or Acme Industries), add the region (AAA Plumbing Peoria) or the company's most famous product (Acme Industries sheet metal), check wh
Full-Text Search | index
Content Summary:
Lucene is a Java-based full-text indexing toolkit.
Introduction to Lucene, a Java-based full-text indexing engine: about the author and the history of Lucene
Implementation of full-text search: a comparison of Lucene full-text indexes and database indexes
A brief introduction to the mechanism of Chinese word segmentation: A compar
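The comparison between full-text indexes and database indexes listed above comes down to the inverted index, which maps each term to the documents that contain it. The following Python sketch only illustrates that idea and is not Lucene's implementation; the function names are hypothetical, and whitespace tokenization is a simplifying assumption that would not work for Chinese text, which needs word segmentation.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():  # naive whitespace tokenizer
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

docs = {
    1: "Lucene is a full-text indexing toolkit",
    2: "MySQL uses B-tree indexes",
    3: "full-text search with Lucene",
}
index = build_inverted_index(docs)
print(search(index, "lucene full-text"))
```

A database index on a text column typically supports exact or prefix matches on the whole value, whereas the term-to-documents mapping lets a query find a word anywhere inside a document without scanning every row.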
achieve the page ranking, and I only cover the knowledge needed for blog SEO. Compared with real search engine technology, what this article describes is only superficial, but it is enough for blog SEO. I try to explain things in the most accessible way and do not go into algorithms or esoteric theory. The working process of a search
another page, which not only avoids duplicate content but also reduces the resulting dead links. One thing to be aware of, however: do not chain 301 redirects through multiple hops.
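The advice about multi-hop 301 redirects can be checked mechanically. Here is a small Python sketch that walks a redirect table and reports sources that need more than one hop. The table is a plain dictionary, and the function names and paths are hypothetical; a real site would read its rules from the web server configuration.

```python
def follow_redirects(start, redirects):
    """Return the list of URLs visited from start, following redirect targets."""
    chain = [start]
    seen = {start}
    while chain[-1] in redirects:
        target = redirects[chain[-1]]
        if target in seen:
            raise ValueError(f"redirect loop at {target}")
        chain.append(target)
        seen.add(target)
    return chain

def multi_hop_redirects(redirects):
    """Return {source: chain} for every source that takes more than one hop."""
    result = {}
    for source in redirects:
        chain = follow_redirects(source, redirects)
        if len(chain) > 2:  # source + final target is one hop
            result[source] = chain
    return result

# Hypothetical redirect rules: /old needs two hops to reach /new.
rules = {"/old": "/interim", "/interim": "/new", "/a": "/b"}
print(multi_hop_redirects(rules))
```

Each reported chain can then be collapsed by pointing the source directly at the final target.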
(v) Correct use of Sitemap
If you want your site to be indexed better and to be friendlier to search engines, a Sitemap is a search engine
PHP code example for recording search engine keywords
This article introduces a piece of code that uses PHP to record search engine keywords. It is a good reference for beginners. Use PHP to record search engine
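The PHP code from that article is not reproduced here. As a rough illustration of the technique, the Python sketch below extracts the search keyword from a referrer URL. The parameter names in `KEYWORD_PARAMS` (`q`, `wd`, `word`, `p`, `query`) are assumptions about how each engine encodes the query, the function name is hypothetical, and many engines no longer pass the query in the referrer at all.

```python
from urllib.parse import urlparse, parse_qs

# Assumed query-string parameter names per engine (not taken from the article).
KEYWORD_PARAMS = {
    "google.": ["q"],
    "baidu.": ["wd", "word"],
    "yahoo.": ["p"],
    "bing.": ["q"],
    "sogou.": ["query"],
}

def extract_keyword(referrer):
    """Return (engine, keyword) for a search referrer, or None otherwise."""
    parsed = urlparse(referrer)
    host = (parsed.hostname or "").lower()
    query = parse_qs(parsed.query)  # also decodes %XX escapes and '+'
    for marker, params in KEYWORD_PARAMS.items():
        if marker in host:
            for name in params:
                if name in query:
                    return marker.rstrip("."), query[name][0]
    return None

print(extract_keyword("http://www.baidu.com/s?wd=full+text+search"))
```

In a web application the referrer would come from the request's `Referer` header, and the returned pair could be appended to a log file or a database table.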
Http://blog.csdn.net/sgivee/archive/2009/12/04/4938476.aspx
Site submission entry points for the major search engines
Baidu submission entry: http://www.baidu.com/search/url_submit.html
Google submission entry: http://www.google.com/addurl
Yahoo submission entry: http://search.help.cn.yahoo.com/h4_4.html
Sohu Sogou submission entry: http://www.sog
The spiders of the major search engines continually visit our site to crawl its content, which also consumes a certain amount of site traffic, so sometimes we need to block some spiders from visiting our site. In fact, only a few search engines are in common use, so as long as in the robots file in a few commonly used
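As an illustration of blocking all but a few spiders, the Python sketch below checks a robots.txt with the standard library's `urllib.robotparser`. The rules in `ROBOTS_TXT` are an assumed example that admits Googlebot and Baiduspider and blocks every other crawler; they are not taken from the original article, and the helper name `allowed` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Example rules: an empty Disallow allows everything for that agent,
# while "Disallow: /" blocks the whole site for all other agents.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: Baiduspider
Disallow:

User-agent: *
Disallow: /
"""

def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

for agent in ("Googlebot", "Baiduspider", "SomeOtherBot"):
    print(agent, allowed(agent, "http://example.com/post/1"))
```

Keep in mind that robots.txt is only advisory: well-behaved spiders honor it, but it does not stop a crawler that ignores the file.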
Building a site search engine with FrontPage or auxiliary tools is very simple, but the steps are rather cumbersome, so this approach suits webmasters and site operators. If you use professional code to create a search engine, in addition to the production site
First, look at the spider list:

Search engine | User-agent (contains) | PTR record | Note
Google | Googlebot | √ | Reverse lookup of the host IP gives a domain name under the primary domain googlebot.com
Baidu | Baiduspider | √ | Reverse lookup of the host IP gives a domain name matching *.baidu.com or *.baidu.jp
Yahoo | Yahoo! | √ | H
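The PTR column in the list above suggests a way to tell a genuine spider from a client that merely copies its User-agent: look up the PTR record of the visiting IP and compare the host name with the expected domain. The Python sketch below assumes the domain suffixes from the list. The forward-confirmation step, which resolves the host name back to the IP, is an addition not described in the source, and all function names are hypothetical.

```python
import socket

# Domain suffixes taken from the spider list; only two engines are covered.
SPIDER_DOMAINS = {
    "Googlebot": (".googlebot.com",),
    "Baiduspider": (".baidu.com", ".baidu.jp"),
}

def reverse_lookup(ip):
    """PTR lookup: IP address -> host name."""
    return socket.gethostbyaddr(ip)[0]

def forward_lookup(host):
    """Forward lookup: host name -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]

def is_genuine_spider(user_agent, ip, reverse=reverse_lookup, forward=forward_lookup):
    """Check that the IP's PTR name matches the spider named in the User-agent."""
    for marker, suffixes in SPIDER_DOMAINS.items():
        if marker.lower() in user_agent.lower():
            try:
                host = reverse(ip).lower().rstrip(".")
                if not host.endswith(suffixes):
                    return False
                # Assumed extra step: the name must resolve back to the same IP.
                return ip in forward(host)
            except OSError:  # covers socket.herror and socket.gaierror
                return False
    return False
```

The `reverse` and `forward` parameters exist so the check can be exercised with stub resolvers; in production the defaults perform real DNS queries, so results are worth caching.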
site.
* Link spam: this is now one of the techniques SEO practitioners commonly use. They automatically mass-post to blogs, BBS forums, and similar sites in order to obtain a large number of links quickly. Such practices are generally considered efficient and practical, on the assumption that the search engine has no way to confirm whether a link is valid. But you need to understand that when doing SEO in China, faced with the human intervention of Baidu, Yahoo!