Point Rui SEO Series: How to Attract Crawlers to Your Website

Source: Internet
Author: User
Keywords: SEO

How to attract crawlers to your website

1. How do mainstream search engines discover sites and Web pages?

Search engines use spider programs to traverse the Web: they collect pages, assign each one a unique identifier, scan the text, and hand it off to the indexing program. While scanning a page, the spider extracts the hyperlinks that point to other pages and then crawls those pages in turn. (For example, I ran an experiment with a link on my site, http://www.ushangpin.com: I added a custom page beneath it, and the next time the snapshot was updated, that custom page showed up.)
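As a rough sketch of that link-extraction step, the standard-library Python snippet below fetches one page and collects the hyperlinks it points to; the URL you pass in is a placeholder, and real spiders of course do far more than this.

# A minimal sketch of the link-extraction step a spider performs.
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href targets of every <a> tag on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page URL.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(url):
    """Fetch one page and return the hyperlinks it points to."""
    html = urlopen(url).read().decode("utf-8", errors="ignore")
    parser = LinkExtractor(url)
    parser.feed(html)
    return parser.links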

2. How do search engines find your site?

Broadly speaking, mainstream search engines discover new sites in four ways. The first, and most common, is submitting your site's URL to the search engine. The second is the search engine finding a link to your site on another site it has already indexed and crawling it from there. The third, specific to Google, is registering with Google Webmaster Tools and, once the site is verified, submitting a site map for it. The fourth is redirecting from a page that has already been indexed to a new page (for example a 301 redirect, which we will cover later). A newly registered site should not use bulk URL-submission software, nor should it submit the same URL to the same search engine repeatedly; both can have undesirable consequences.
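To make the fourth method concrete, here is a minimal hedged sketch of a 301 (permanent) redirect using Python's built-in http.server; the old and new paths and the port are placeholders, not anything this article specifies.

# A minimal sketch of a 301 (permanent) redirect: an old, already
# indexed page points crawlers to its new address.
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/old-page.html":
            # Tell crawlers (and browsers) the page has moved permanently.
            self.send_response(301)
            self.send_header("Location", "/new-page.html")
            self.end_headers()
        else:
            self.send_response(404)
            self.end_headers()


if __name__ == "__main__":
    HTTPServer(("localhost", 8000), RedirectHandler).serve_forever()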

3. What does the spider do on your site?

Once the spider has visited your site, it crawls the pages in order. When it finds an internal link, it records it and crawls it later in the same visit or on the next one. Eventually the spider covers the entire site. In the next installment I will explain how spiders index pages against a search query and how each indexed page is ranked. If a site is a tree, the root is the home page, the directories are the branches, and the pages are the leaves at the ends. The spider crawls the way nutrients travel through a tree: it starts from the root and gradually reaches every part, in an order based on the importance calculated from each page's PR value. If the tree has a reasonable structure, the crawl spreads evenly to its branches and leaves (which is why I said at the start that a sensible site template and well-written code help get pages indexed).
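To make the tree metaphor concrete, the following sketch models a small site as a tree and crawls it breadth-first from the home page outward; the structure and the visiting order are illustrative assumptions, not how any particular search engine actually schedules its crawl.

# A toy breadth-first crawl of a site modelled as a tree: home page at
# the root, section pages as branches, content pages as leaves.
# The structure below is invented purely for illustration.
from collections import deque

SITE_TREE = {
    "/": ["/news/", "/products/"],           # home page -> section pages
    "/news/": ["/news/post-1.html"],         # section page -> content pages
    "/products/": ["/products/item-1.html"],
    "/news/post-1.html": [],
    "/products/item-1.html": [],
}


def crawl(start="/"):
    """Visit every reachable page, nearest to the root first."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        page = queue.popleft()
        order.append(page)
        for link in SITE_TREE.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


print(crawl())  # ['/', '/news/', '/products/', '/news/post-1.html', ...]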

4. The role of the site map in getting pages indexed

A site map is an HTML page whose content is an ordered list of links to every page on the site. A good site map helps visitors find what they need and lets search engines use it to guide their crawling. In particular, after several visits the spider may end up indexing all of the site's pages, and it will then check back regularly for updates. The spider also pays attention to the number of levels (the depth) reflected in the site map and, together with other factors, uses it to determine each page's PR value, that is, its weight.
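As a rough illustration, the sketch below generates the kind of HTML site map page described here from a flat list of links; the page URLs and titles are invented placeholders.

# A minimal sketch that builds an HTML site map page from a list of
# page URLs and titles. The entries are placeholders.
PAGES = [
    ("/", "Home"),
    ("/news/", "News"),
    ("/news/post-1.html", "First news post"),
    ("/products/", "Products"),
]


def build_sitemap(pages):
    """Return an HTML page containing a list of links to every page."""
    items = "\n".join(
        f'    <li><a href="{url}">{title}</a></li>' for url, title in pages
    )
    return (
        "<html><head><title>Site Map</title></head><body>\n"
        "  <ul>\n" + items + "\n  </ul>\n"
        "</body></html>"
    )


with open("sitemap.html", "w", encoding="utf-8") as f:
    f.write(build_sitemap(PAGES))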

5. Site structure and navigation

Whether your site is new or old, you need to work on its structure from top to bottom to attract spiders to crawl it, and you need to remember that each page's URL is the first block of text the spider encounters for that page.

5.1 Site Directory Structure

Limit your site's depth to four levels wherever possible:

Home page → Section page → Contents page → Content page

The site must have a clear structure. The following example illustrates a site directory structure.
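Below is a hedged sketch of such a layout, assuming the four-level pattern above; every path and name is a placeholder, and the small check at the end simply enforces the four-level limit mentioned earlier.

# A hedged illustration of a four-level directory layout
# (home page -> section page -> contents/list page -> content page).
# All paths below are placeholders.
EXAMPLE_STRUCTURE = {
    "/": {                                    # level 1: home page
        "/news/": {                           # level 2: section page
            "/news/2024/": {                  # level 3: contents/list page
                "/news/2024/post-1.html": {}  # level 4: content page
            }
        }
    }
}


def max_depth(tree):
    """Return the deepest level in the nested structure."""
    if not tree:
        return 0
    return 1 + max(max_depth(child) for child in tree.values())


# Keep crawl paths short: no page should sit deeper than four levels.
assert max_depth(EXAMPLE_STRUCTURE) <= 4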

5.2 Optimizing file names and extensions

Within the overall site, each page's file name is itself part of the optimization; wherever possible, use static pages with the .html extension.
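As a small hedged sketch of the file-naming idea, the helper below turns a page title into a short, descriptive static file name ending in .html; the specific rules (lowercase letters, hyphens) are my own assumption, not something this article prescribes.

# A minimal sketch: derive a descriptive static file name from a page
# title. The specific rules (lowercase, hyphens) are assumptions.
import re


def to_static_filename(title):
    """Turn 'Summer Product List' into 'summer-product-list.html'."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug + ".html"


print(to_static_filename("Summer Product List"))  # summer-product-list.html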
