What is Robots.txt?

Source: Internet
Author: User

The robots.txt file restricts access by the search engine crawlers (also called bots or robots) that crawl the web. These bots are automated: before accessing a page, they check whether a robots.txt file prevents them from accessing it.

How do I create a robots.txt file?

You can create this file in any text editor. The file should be a plain ASCII-encoded text file, not an HTML file, and the file name should use lowercase letters (robots.txt).

Syntax
The simplest robots.txt file uses two rules:

    • User-agent: the bot the following rules apply to
    • Disallow: the URLs you want to block

Together, these two lines are treated as a single entry in the file. You can include as many entries as you want, and a single entry can contain multiple Disallow lines and multiple User-agent lines.
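For instance, one entry with several rules might look like this (the directory names are illustrative):

```text
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
```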

What should be listed in the User-agent line?
The User-agent line names a specific search engine crawler. The Web Robots Database lists many common bots. An entry can apply to a specific bot (by name) or to all bots (by listing an asterisk). An entry that applies to all bots looks like this:

User-agent: *

Google uses several different bots (user agents). The crawler for Web search is Googlebot. Other bots, such as Googlebot-Mobile and Googlebot-Image, follow the rules you set for Googlebot, and you can also set additional rules for those specific bots.
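As a sketch of that idea (the directory name is illustrative), the following lets Googlebot crawl everything while blocking one directory for image crawling only:

```text
User-agent: Googlebot
Disallow:

User-agent: Googlebot-Image
Disallow: /photos/
```

An empty Disallow line means "block nothing", so the first entry explicitly allows the whole site to the named bot.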

What should be listed in the Disallow line?
The Disallow line lists the pages you want to block. You can list a specific URL or a URL pattern. The entry should begin with a forward slash (/).

    • To block the entire site, use a forward slash.
      Disallow: /
    • To block a directory and all of its contents, add a forward slash after the directory name.
      Disallow: /private_directory/
    • To block a single Web page, list the page.
      Disallow: /private_file.html

URLs are case-sensitive. For example, Disallow: /private_file.html will block http://www.example.com/private_file.html but allow http://www.example.com/Private_File.html.
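You can see this case sensitivity in action with a small sketch using Python's standard urllib.robotparser module (the example.com URLs are illustrative):

```python
import urllib.robotparser

# Parse a rule set equivalent to the example above.
rp = urllib.robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private_file.html",
])

# The lowercase path matches the rule, so it is blocked...
print(rp.can_fetch("*", "http://www.example.com/private_file.html"))  # False
# ...but the capitalized variant does not match, so it is allowed.
print(rp.can_fetch("*", "http://www.example.com/Private_File.html"))  # True
```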

For more information, please visit: http://www.google.com/support/webmasters

Use a robots.txt file only if your Web site contains content that you do not want search engines to index. If you want search engines to index everything on your site, you do not need a robots.txt file (not even an empty one).

Example:

------------------------------------------------------------------------------

#
# robots.txt for Netmao Movie
# Version 2.0.x
#

User-agent: *
Disallow: /admin/
Disallow: /inc/
Disallow: /html/
Disallow: /templates/
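As a quick check, a sketch using Python's standard urllib.robotparser module can confirm what this file blocks (the example.com URLs are illustrative):

```python
import urllib.robotparser

# The same rules as the Netmao Movie example above.
rules = """\
User-agent: *
Disallow: /admin/
Disallow: /inc/
Disallow: /html/
Disallow: /templates/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# /admin/ is blocked for every bot; other paths remain crawlable.
print(rp.can_fetch("Googlebot", "http://www.example.com/admin/index.php"))  # False
print(rp.can_fetch("Googlebot", "http://www.example.com/movie/list.html"))  # True
```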
