A robots.txt file restricts access to your site by the search engine robots (bots) that crawl the web. These bots are automated, and before they access a page on a site they check whether a robots.txt file prevents them from accessing that page.
How do I create a robots.txt file?
You can create this file in any text editor. It should be an ASCII-encoded plain text file, not an HTML file, and the file name should be all lowercase (robots.txt).
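If you prefer to generate the file from a script instead of a text editor, the same requirements can be sketched in Python. This is only an illustration: the helper name write_robots_txt and the temporary directory standing in for the site root are assumptions, not part of the original instructions.

```python
import tempfile
from pathlib import Path


def write_robots_txt(site_root, rules):
    """Write the rules to <site_root>/robots.txt as plain ASCII text."""
    path = Path(site_root) / "robots.txt"  # lowercase file name
    # encoding="ascii" raises UnicodeEncodeError if the rules contain
    # any non-ASCII character, so a bad file is never written silently.
    with open(path, "w", encoding="ascii", newline="\n") as f:
        f.write(rules)
    return path


# A temporary directory stands in for the web server's document root.
with tempfile.TemporaryDirectory() as site_root:
    written = write_robots_txt(site_root, "User-agent: *\nDisallow: /admin/\n")
    file_name = written.name
    content = written.read_bytes()
```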
Syntax
The simplest robots.txt file uses two rules:
- User-agent: the bot that the following rule applies to
- Disallow: the URL you want to block
These two lines are treated as a single entry in the file. You can include as many entries as you want, and an entry can contain multiple Disallow lines and multiple User-agent lines.
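As a rough illustration of entries with several User-agent and Disallow lines, the sketch below feeds a small rule set to Python's standard urllib.robotparser module and asks what each bot may fetch. The bot names (ExampleBotA, ExampleBotB, ExampleBotC) and the paths are made up for the example, and Python's parser is only one implementation; a real search engine may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Two entries separated by a blank line. The first entry has two
# User-agent lines and two Disallow lines; the second blocks everything.
RULES = """\
User-agent: ExampleBotA
User-agent: ExampleBotB
Disallow: /drafts/
Disallow: /tmp/

User-agent: ExampleBotC
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Both bots named in the first entry are subject to both Disallow lines.
a_drafts = parser.can_fetch("ExampleBotA", "/drafts/post.html")
b_tmp = parser.can_fetch("ExampleBotB", "/tmp/cache.html")
# Paths not listed in the entry remain accessible.
a_home = parser.can_fetch("ExampleBotA", "/index.html")
# The second entry blocks the whole site for ExampleBotC.
c_home = parser.can_fetch("ExampleBotC", "/index.html")
```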
What should be listed in the User-agent line?
A user-agent is a specific search engine robot. The Web Robots Database lists many common bots. You can write an entry that applies to a specific bot (by listing its name) or to all bots (by listing an asterisk). An entry that applies to all bots looks like this:
User-agent: *
Google uses several different bots (user-agents). The bot used for web search is Googlebot. Other bots, such as Googlebot-Mobile and Googlebot-Image, follow the rules you set for Googlebot, but you can also set additional rules for those specific bots.
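To see how an entry for all bots and an entry for a named bot interact, here is a minimal sketch using Python's standard urllib.robotparser. The paths and the bot name OtherBot are hypothetical. The results show how Python's parser chooses an entry (a named entry takes precedence over the asterisk entry); they are not a statement of how Google's own crawlers resolve rules.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /no-google/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# A bot with no entry of its own falls back to the "*" entry.
other_private = parser.can_fetch("OtherBot", "/private/report.html")
other_no_google = parser.can_fetch("OtherBot", "/no-google/page.html")

# Googlebot matches its named entry, so only that entry's rules apply.
google_no_google = parser.can_fetch("Googlebot", "/no-google/page.html")
google_private = parser.can_fetch("Googlebot", "/private/report.html")
```

Note that in this parser the asterisk entry is not added on top of the named entry. If Googlebot should also stay out of /private/, that line has to be repeated in the Googlebot entry.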
What should be listed in the Disallow line?
The Disallow line lists the pages you want to block. You can list a specific URL or a URL pattern. The entry should begin with a forward slash (/).
URLs are case-sensitive. For example, Disallow: /private_file.html blocks http://www.example.com/private_file.html but allows http://www.example.com/Private_File.html.
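The case-sensitivity example can be checked with Python's standard urllib.robotparser, as in the short sketch below. The bot name ExampleBot is an assumption made for the illustration.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: *",
    "Disallow: /private_file.html",
])

# Same path, exact case: blocked.
lower_allowed = parser.can_fetch(
    "ExampleBot", "http://www.example.com/private_file.html"
)
# Different capitalisation is a different URL, so the rule does not match.
mixed_allowed = parser.can_fetch(
    "ExampleBot", "http://www.example.com/Private_File.html"
)
```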
For more information, please visit: http://www.google.com/support/webmasters
Use a robots.txt file only if your website contains content that you do not want search engines to index. If you want search engines to index everything on your site, you do not need a robots.txt file (not even an empty one).
Example:
--------------------------------------------------------------------------------------------------------------- ---------------------------
#
# robots.txt for Netmao Movie
# Version 2.0.x
#
User-agent: *
Disallow: /admin/
Disallow: /inc/
Disallow: /html/
Disallow: /templates/
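One way to sanity-check a file like this before uploading it is to run it through Python's standard urllib.robotparser. The sketch below uses the example rules with each path written directly after the slash (no stray spaces). The bot name ExampleBot and the test URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

EXAMPLE = """\
#
# robots.txt for Netmao Movie
# Version 2.0.x
#
User-agent: *
Disallow: /admin/
Disallow: /inc/
Disallow: /html/
Disallow: /templates/
"""

parser = RobotFileParser()
parser.parse(EXAMPLE.splitlines())  # lines starting with "#" are comments

# Anything under a listed directory is blocked for every bot.
admin_allowed = parser.can_fetch("ExampleBot", "/admin/login.php")
inc_allowed = parser.can_fetch("ExampleBot", "/inc/config.php")
templates_allowed = parser.can_fetch("ExampleBot", "/templates/default/index.html")
# Pages outside those directories stay crawlable.
home_allowed = parser.can_fetch("ExampleBot", "/index.html")
```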