URL Normalization

Source: Internet
Author: User
Tags: session id
URL normalization (URL canonicalization) is a significant problem that has emerged in Google search results over the past year. It refers to the process by which a search engine picks the best URL from several candidates to serve as the real URL of a page.

For example, the following URLs generally refer to the same file or Web page:

http://www.domainname.com
http://domainname.com
http://www.domainname.com/index.html
http://domainname.com/index.html

Technically, however, these URLs are all different. In most cases they return the same file, your homepage, but the host is free to return completely different content for each of them.
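As a rough illustration of why these count as different URLs, the sketch below parses the four example URLs with Python's standard urllib.parse and collects the distinct host and path pairs a server would receive. The helper names host_and_path and distinct_requests are made up for this example.

```python
from urllib.parse import urlsplit

# The four homepage variants from the example above.
HOMEPAGE_VARIANTS = [
    "http://www.domainname.com",
    "http://domainname.com",
    "http://www.domainname.com/index.html",
    "http://domainname.com/index.html",
]


def host_and_path(url):
    """Return the (hostname, path) pair a server would see for this URL."""
    parts = urlsplit(url)
    # A URL with an empty path is requested as "/" on the wire.
    return parts.hostname, parts.path or "/"


def distinct_requests(urls):
    """Return the set of distinct (hostname, path) pairs among the URLs."""
    return {host_and_path(u) for u in urls}


if __name__ == "__main__":
    for pair in sorted(distinct_requests(HOMEPAGE_VARIANTS)):
        print(pair)
```

Each variant yields its own host and path pair, so a server could in principle answer each one differently, even though most sites return the same homepage for all four.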

When a search engine normalizes URLs, it has to choose the best representative from these candidates. Your homepage should have one fixed URL. On many sites, however, the webmaster links back to the homepage with more than one URL: one link points to http://www.domainname.com and another points to http://www.domainname.com/index.html.

This causes visitors no trouble, because these URLs return the same file, but it leaves Google unsure which one is your real homepage. If different versions of the homepage URL appear in large numbers on your site, both URLs may end up in Google's index, which creates duplicate content pages.

Duplicate content means two or more pages whose content is identical or mostly the same. Duplicate pages are often treated as a possible sign of cheating. Even when no cheating is involved, search engines usually pick just one of the copies to return in search results and rank the others so low that they are effectively never found.
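To make the idea of duplicate pages concrete, here is a minimal sketch that groups URLs whose response bodies are byte-for-byte identical. It is only an illustration, not how any search engine actually detects duplicates: it catches exact copies only, not "mostly the same" pages, and the function name group_exact_duplicates and the sample page bodies are invented for this example.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(pages):
    """Group URLs that returned identical bodies.

    pages maps each URL to the response body (bytes) fetched from it.
    Returns a list of sorted URL lists, one per group of two or more
    URLs with the same content.
    """
    by_digest = defaultdict(list)
    for url, body in pages.items():
        # Identical bodies produce identical SHA-256 digests.
        by_digest[hashlib.sha256(body).hexdigest()].append(url)
    return [sorted(urls) for urls in by_digest.values() if len(urls) > 1]


if __name__ == "__main__":
    # Hypothetical fetched content, for illustration only.
    sample = {
        "http://www.domainname.com/": b"<html>home</html>",
        "http://domainname.com/": b"<html>home</html>",
        "http://www.domainname.com/about.html": b"<html>about</html>",
    }
    print(group_exact_duplicates(sample))
```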

When your site has a URL normalization problem, it can produce suspected duplicate pages and thereby hurt your search rankings.

From the webmaster's point of view, you should do two things:

1 Use only one URL form when linking to pages within your own site, especially the homepage. Whether or not it includes www, use the same version from beginning to end, so that search engines know which one is the normalized homepage URL.

2 You cannot control which URL other sites use to link to your homepage. So, on your server, set up a 301 redirect from every URL that could serve as the homepage to the homepage URL version you have chosen. That is, each of the following URLs

http://domainname.com
http://www.domainname.com/index.html
http://domainname.com/index.html

should 301 redirect to this URL

http://www.domainname.com
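The redirect rule in point 2 can be sketched as a small decision function. This is a hedged illustration, not a server configuration: in practice the rule usually lives in the web server's own settings. The constants and the names redirect_target and response_for are assumptions made for this example, and the target is written with a trailing slash because an empty path is requested as "/".

```python
from urllib.parse import urlsplit

# Assumed values for this sketch, taken from the example URLs above.
CANONICAL_HOST = "www.domainname.com"
ALTERNATE_HOSTS = {"domainname.com"}
INDEX_PATHS = {"/index.html"}
CANONICAL_HOME = f"http://{CANONICAL_HOST}/"


def redirect_target(url):
    """Return the URL to 301-redirect to, or None if no redirect applies."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    if host != CANONICAL_HOST and host not in ALTERNATE_HOSTS:
        return None  # not one of this site's hostnames
    if path != "/" and path not in INDEX_PATHS:
        return None  # not a homepage variant; this sketch leaves it alone
    if host == CANONICAL_HOST and path == "/":
        return None  # already the chosen homepage URL
    return CANONICAL_HOME


def response_for(url):
    """Return a (status, headers) pair as a server might answer the request."""
    target = redirect_target(url)
    if target is None:
        return 200, {}
    return 301, {"Location": target}


if __name__ == "__main__":
    for u in ("http://domainname.com",
              "http://www.domainname.com/index.html",
              "http://domainname.com/index.html",
              "http://www.domainname.com"):
        print(u, "->", response_for(u))
```

Only homepage variants are handled here. A real setup would normally also redirect non-www inner pages to their www counterparts, which this sketch deliberately leaves out.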

Importantly, if your site has a URL normalization problem, do not use Google's URL removal request form to ask for one version of the site to be deleted. For example, suppose the version you want is the one with www

http://www.domainname.com

Do not go to Google and fill in the form asking for the non-www homepage URL

http://domainname.com

to be removed. Doing so may get your entire domain removed for 6 months.

Of course, there are other kinds of URL normalization issues besides the www and non-www versions. For example, a search engine may add or remove the trailing slash at the end of a URL, convert uppercase letters to lowercase, or strip the session ID, and any of these can cause URL normalization issues.
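A minimal sketch of these other normalization steps, using Python's standard urllib.parse, might look like the following. The session parameter names in SESSION_PARAMS are assumptions (sites use many different names), and the function lowercases only the scheme and host, since paths can be case-sensitive on many servers. It also only adds the slash for an empty path and does not touch trailing slashes elsewhere.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; real sites may use others.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid", "jsessionid"}


def normalize(url):
    """Return a normalized form of url.

    Lowercases the scheme and host, turns an empty path into "/",
    removes session ID query parameters, and drops the fragment.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.netloc.lower()  # assumes no user:password@ prefix
    path = parts.path or "/"
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit((scheme, host, path, urlencode(kept), ""))


if __name__ == "__main__":
    print(normalize("http://WWW.DomainName.com?sid=abc123&page=2"))
    print(normalize("http://www.domainname.com"))
```

Two URLs that normalize to the same string under rules like these are candidates for being treated as one page, which is why inconsistent slashes, letter case, or session IDs in links can split a page across several URLs.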

Original: Zac

