Sitemap (Search Engine website map)

Source: Internet
Author: User
Tags website server
What is sitemap (Search Engine website map)-MySearch blog | Meta Search Engine Research-MySearch (metasoo.com) official blog

To borrow a paragraph from Google's description:

In the simplest case, a sitemap is a list of the webpages on your website. Creating and submitting a sitemap helps ensure that Google knows about all the webpages on your website, including URLs that may not be found during Google's normal crawling.

MySearch (really, just me personally) has a more direct definition:

A sitemap is one or more XML files that supply URLs to search engines in a specified format.

The search engine sitemap, which lets the "web crawler" be lazy, was created by Google and has since become an industry standard. (As the saying goes, first-class enterprises sell standards and second-class enterprises sell products.) The current sitemap version is 0.9, and the official website is http://www.sitemaps.org/ (it has often failed to open for me). Companies and websites that currently support the standard include Google, Yahoo, Ask, Live, IBM, and others. Foreign companies are clearly willing to cooperate on a standard, even though this is hardly the greatest invention. Chinese search engines do not currently seem to support it; that is not a technical issue but mainly one of attitude. Judging by the current trend, my guess is that the first search engine in China to support sitemaps will be Youdao (haha, just a guess).

Baidu also has something similar, called the "Internet news open protocol", but its XML format is different. It seems to have attracted few followers, though, and little has been heard of it since its launch.

Role of sitemap:

The following is quoted from Google:

Sitemaps are particularly useful in the following scenarios:

    • The website contains dynamic content.
    • The website has pages that Googlebot cannot easily discover during crawling, such as pages with rich Ajax or Flash content.
    • The website is new and has few links pointing to it. (Googlebot crawls the web by following links from one page to another, so if your website is not well linked, it may be hard to find.)
    • Websites have a large number of content page archives. These content pages do not have good links to each other, or there is no link at all.

You can also use sitemap to provide Google with other information about your webpage, including:

    • How often the webpages on your website change. For example, you may update the product page every day, but update the "my profile" page only once every several months.
    • The last modification date of each webpage.
    • The relative importance of each page on your website. For example, the relative importance of the home page is 1.0, that of a category page is 0.8, and that of a personal blog entry or product page is 0.5. This priority only indicates the importance of a specific page relative to other pages on your website, and does not affect the ranking of your pages in the search results.

Sitemap format:

For the full format specification, see: https://www.google.com/webmasters/tools/docs/zh_CN/protocol.html
A simple sitemap: http://www.metasoo.com/MetaSoositemap.xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.metasoo.com/</loc>
    <lastmod>2008-08-08</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.metasoo.com/MetaSoo/about/</loc>
    <lastmod>2008-08-12</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.metasoo.com/MetaSoo/about/duty.htm</loc>
    <lastmod>2008-08-12</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>http://www.metasoo.com/MetaSoo/about/privacy.htm</loc>
    <lastmod>2008-08-12</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>http://www.metasoo.com/MetaSoo/blog/</loc>
    <lastmod>2008-08-12</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>

Note that for websites with a lot of content, the number of URLs in each XML file is limited. Each sitemap file may contain no more than 50,000 URLs and must not exceed 10 MB (10,485,760 bytes) uncompressed. If either limit is exceeded, you need to generate multiple sitemap files. You can then create an index file so that they can be submitted conveniently.
For example, the sitemap index of 365bull (Daily Bull Market Network): http://www.365bull.com/365bullcnsitemap.xml

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.365bull.com/365bullcnsitemap1.xml</loc>
    <lastmod>2008-08-15T07:45:22+08:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.365bull.com/365bullcnsitemap2.xml</loc>
    <lastmod>2008-08-15T07:45:22+08:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.365bull.com/365bullcnsitemap3.xml</loc>
    <lastmod>2008-08-15T07:45:22+08:00</lastmod>
  </sitemap>
</sitemapindex>

Another simple method is to submit an RSS feed as the sitemap. For example, mxsearch submits its RSS feed as its sitemap, and the search engines index it very well.
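The splitting rule described above (at most 50,000 URLs per file, tied together by an index file) can be sketched in Python roughly as follows. This is only an illustration: the function names `chunk_urls` and `build_index` are made up for this example, and the sketch checks only the URL count, not the 10 MB size limit.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit quoted in the text


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive groups of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(sitemap_urls, lastmod):
    """Return a <sitemapindex> document (as a string) listing each sitemap file."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = loc
        ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(root, encoding="unicode")
```

Each group returned by `chunk_urls` would be written to its own sitemap file, and the addresses of those files are then passed to `build_index`.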

Create a sitemap file:

There are three ways to create a sitemap file: 1. editing it manually; 2. using a tool; 3. writing your own back-end program.

Small websites can generally be edited by hand. Tools generally work in one of two ways: 1. imitating a web crawler and traversing the site from the client side, which does not make much sense; 2. placing a program on the web server and traversing its files to build the sitemap, with the drawback that you must constantly filter out useless files so that they do not end up in the sitemap.
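The second kind of tool, one that walks the files on the server and filters out the useless ones, might look something like the Python sketch below. The function name `collect_page_urls` and the extension whitelist are assumptions made for illustration; a real site would need its own filtering rules.

```python
import os

# Hypothetical filter: only files with these extensions are treated as pages.
PAGE_EXTENSIONS = {".htm", ".html"}


def collect_page_urls(doc_root, base_url):
    """Walk doc_root and return the URL of every file that looks like a page."""
    urls = []
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() not in PAGE_EXTENSIONS:
                continue  # skip images, scripts, and other non-page files
            rel = os.path.relpath(os.path.join(dirpath, name), doc_root)
            urls.append(base_url.rstrip("/") + "/" + rel.replace(os.sep, "/"))
    return sorted(urls)
```

The whitelist is where the constant filtering effort goes: every new kind of file that should or should not be listed means another rule.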

Webmasters who are able to do so are advised to write their own programs to generate the sitemap.
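As a rough idea of what such a self-written generator could look like, here is a minimal Python sketch that produces a sitemap in the format shown earlier. The function name `build_sitemap` and the tuple layout of `pages` are assumptions for this example; in practice the page list would come from the site's own database or content system.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a <urlset> document from (loc, lastmod, changefreq, priority) tuples."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq, priority in pages:
        url = ET.SubElement(root, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Example input, reusing two entries from the sample sitemap above.
pages = [
    ("http://www.metasoo.com/", "2008-08-08", "weekly", 1.0),
    ("http://www.metasoo.com/MetaSoo/blog/", "2008-08-12", "daily", 1.0),
]
print(build_sitemap(pages))
```

Building the document with an XML library rather than string concatenation means characters such as `&` in URLs are escaped automatically.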

Submitting a sitemap:

There are two ways to submit a sitemap:
1. Ping the search engine's receiving address (I will write up the addresses later)
2. Submit it manually to each search engine:
Google: https://www.google.com/webmasters/tools/
Yahoo: http://sitemap.cn.yahoo.com/
Live: http://webmaster.live.com/

I will continue to write about sitemaps and how to use them for search engine promotion, and I hope you will keep following the MySearch blog.
