Making good use of robots.txt settings for SEO

Source: Internet
Author: User
Keywords: SEO


Last time I shared a case of a new site being hit hard by a Baidu K-station (de-indexing), and many readers added my QQ to learn from me. In fact, I have only just started with SEO and am not very experienced; my career is not in the web industry either, it is purely a hobby. I keep learning from Lou's, Mou Changqing's, and other well-known promotion blogs and websites, and with enough time and patience I test things myself, gaining experience from practice!

All right, let's move on. Today's topic is robots.txt. robots.txt is the first file a search engine looks at when visiting a website. When a search spider visits a site, it first checks whether robots.txt exists in the site's root directory. If it does, the spider determines its crawling scope from the file's contents; if it does not, the spider can access every page on the site that is not password protected!

That introduction makes the purpose of robots.txt clear; here is why it matters for your site. Many webmasters never add this file to their site's root directory and configure it. You can look up its standard format with a search engine, or generate the file with Google Webmaster Tools.

Use robots.txt to tell spiders how your site's weight is distributed

Keep in mind that a website's weight is limited, especially for a grassroots site. Giving the entire site equal weight is, first, unscientific, and second, a complete waste of server resources (search spiders consume more server resources than normal visitors: CPU, IIS connections, bandwidth, and so on). Think about it: if your site structure is unclear and there is no proper weight declaration, spiders cannot tell which content on your site is important and which content is your main content.

Blocking spiders from indexing back-end files also involves other standard page-level techniques, which I will not explain here. Taking my own site as an example, I think the cache, include, js, update, skins, and similar directories can all be blocked. To avoid foolishly telling everyone where the admin directory is, I do not list it here.
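As a sketch of what such a rule set looks like, the directory names above can be written into a robots.txt and sanity-checked with Python's standard-library parser (the page paths below are made up for illustration):

```python
# Sanity-check a robots.txt that blocks internal directories, using
# Python's standard-library robots.txt parser. Directory names follow
# the article's example; the page paths are illustrative placeholders.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /cache/
Disallow: /include/
Disallow: /js/
Disallow: /update/
Disallow: /skins/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Internal directories are blocked for every crawler...
print(parser.can_fetch("*", "/cache/index.html"))    # False
# ...while normal content pages remain crawlable.
print(parser.can_fetch("*", "/article/hello.html"))  # True
```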

User-agent: the crawlers the rules below apply to; usually "*"

Disallow: pages to block; usually written before Allow

Allow: pages not to block; usually "/"

Sitemap: the URL of your site map
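A minimal file combining all four directives can be checked like this (the sitemap URL and paths are placeholders; note that Python's standard-library parser applies rules in order, first match wins, which is one reason to write Disallow before Allow as described above):

```python
# A minimal robots.txt using all four directives, checked with Python's
# standard-library parser. URL and paths are placeholders.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /private/
Allow: /
Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# The Disallow rule matches first for /private/, the Allow rule
# covers everything else.
print(parser.can_fetch("*", "/private/secret.html"))  # False
print(parser.can_fetch("*", "/index.html"))           # True
```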

What if you want to block only certain spiders? Someone asked whether this can be set per spider. Yes; you can write it like this:

User-agent: baiduspider

Disallow: /
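You can verify that this blocks only Baidu's spider with Python's standard-library parser: an agent with no matching entry and no "*" entry falls back to "allow everything" (the page path is a placeholder):

```python
# Check that the rule above blocks only Baidu's spider. With no "*"
# entry present, other crawlers default to full access.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: baiduspider
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(parser.can_fetch("Baiduspider", "/page.html"))  # False
print(parser.can_fetch("Googlebot", "/page.html"))    # True
```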

Use robots.txt to limit indexing by junk search engines and reduce the load on your site. Look at your traffic statistics to see which search engines actually send you visitors, and completely block the spiders that send no traffic. A friend of mine is a virtual host provider, so I know how badly junk spiders can hurt a site's stability: he told me he has seen sites with only a few dozen visitor IPs a day whose spider traffic consumed as much bandwidth as roughly 1,000 IPs of normal visits. The following example assumes only the Baidu and Google spiders are allowed and everything else is blocked:

User-agent: baiduspider

Disallow:

User-agent: googlebot

Disallow:

User-agent: *

Disallow: /

Sitemap:
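The whitelist above can be verified with Python's standard-library parser; Sosospider here just stands in for "any other crawler" (an empty Disallow means "allow everything" for that agent):

```python
# Verify the whitelist: Baidu and Google allowed, everyone else blocked.
# An empty Disallow value means the named agent may crawl everything.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: baiduspider
Disallow:

User-agent: googlebot
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(parser.can_fetch("Baiduspider", "/page.html"))  # True
print(parser.can_fetch("Googlebot", "/page.html"))    # True
print(parser.can_fetch("Sosospider", "/page.html"))   # False
```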

Use robots.txt to tell spiders where your site map is. The Sitemap directive tells spiders which file your map file is; use an absolute URL. For Google's spider, it is recommended to submit the sitemap through Google Webmaster Tools. You can look up advanced robots.txt usage on your own.
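The Sitemap directive can be read back programmatically as well; here is a sketch using Python's standard-library parser (site_maps() requires Python 3.8+; the URL is a placeholder for your site's real absolute sitemap address):

```python
# Read the Sitemap directive back with Python's standard-library parser.
# The sitemap URL is a placeholder.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(parser.site_maps())  # ['https://www.example.com/sitemap.xml']
```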

Resources:

http://baike.baidu.com/view/1011742.htm

Names of some common search spiders:

Each Baidu product uses a different user-agent:

Wireless Search Baiduspider-mobile

Image Search Baiduspider-image

Video Search Baiduspider-video

News search Baiduspider-news

Baidu Soucang (bookmarks) Baiduspider-favo

Baidu Alliance Baiduspider-cpro

Web pages and other search Baiduspider

Soso's spider user-agents:

Sosospider

Sosoimagespider

Google's

Googlebot

Googlebot-image

Googlebot-mobile

Mediapartners (or Mediabot) deserves special note: it is Google's advertising crawler, used to match ads. If you run Google ads and block all spiders, you are in trouble; if you do not run Google ads, this crawler will not visit at all (this is Google's clever move: a dedicated spider just for serving advertising).

I will not list the others; you can search for them yourself. My writing is rough, so if anything is lacking, I hope the experts will point it out!

If you reproduce this article, please indicate it is from www.fuckegg.cc and keep this link. Please respect the original design by Zhao at www.zhaofeng.org!
