URL normalization means that when more than one URL points to pages with the same content, the site uses various methods to steer the search engine toward a single preferred URL and tells it not to include or index the others. From the search engine's perspective, URL normalization reduces the indexing of a website's duplicate pages. Handling duplicate pages is also part of search engine optimization (SEO). In short, URL normalization is the process of standardizing URLs. The official Google blog recommends specifying a canonical URL.
Non-normalized URLs
The following are common cases of URLs that are not normalized:
- Domain names with and without the www prefix are treated as the same site, such as the top-level domain names www.a.com and a.com, or the subdomain names www.a.b.com and a.b.com;
- Dynamic and static URLs for the same page, such as http://www.nowamagic.net/archives/137.html and http://www.nowamagic.net/?p=137;
- URLs that contain redundant parts, such as the default port number 80, default file names such as default.php and index.html, or repeated "/" characters;
- Empty query strings and invalid query variables;
- URLs that use an IP address instead of the domain name;
- Differences in letter case, such as http://www.abc.cn/ABOUT.php and http://www.abc.cn/about.php.
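Several of the cases above can be folded into one canonical form mechanically. The following is a minimal Python sketch, not a description of how any search engine works. The function name normalize_url, the prefer_www flag, and the DEFAULT_FILES list are assumptions made for illustration. The sketch does not handle IP-address hosts or map dynamic URLs to static ones, and it leaves the case of the path untouched because paths are case-sensitive on many servers.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical choices for this sketch, not rules taken from any search engine.
DEFAULT_PORTS = {"http": 80, "https": 443}
DEFAULT_FILES = ("index.html", "index.php", "default.php")


def normalize_url(url: str, prefer_www: bool = True) -> str:
    """Return one canonical spelling for several equivalent URL forms."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()  # host names are case-insensitive

    # Settle on either the www or the non-www form.
    if prefer_www and not host.startswith("www."):
        host = "www." + host
    elif not prefer_www and host.startswith("www."):
        host = host[len("www."):]

    # Keep the port only when it is not the default for the scheme.
    port = parts.port
    if port in (None, DEFAULT_PORTS.get(scheme)):
        netloc = host
    else:
        netloc = f"{host}:{port}"

    # Collapse repeated slashes and drop a trailing default file name.
    path = re.sub(r"/{2,}", "/", parts.path) or "/"
    for name in DEFAULT_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break

    # An empty query string disappears on reassembly; the fragment is dropped.
    return urlunsplit((scheme, netloc, path, parts.query, ""))
```

With these assumptions, HTTP://A.com:80//index.html and http://www.a.com/? both reduce to http://www.a.com/.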
As Google's search engine becomes more intelligent, it automatically analyzes multiple URLs that point to the same duplicate page and gradually selects the one it considers best for indexing. However, this process significantly increases the difficulty and the time it takes for search engines to crawl and index the pages. The internal and external linking practices adopted by the site owner also affect Google's automatic URL normalization. When multiple URLs lead to the same content page, non-malicious duplicates are not penalized by search engines, but they do at least split the weight of the page. It is therefore necessary to normalize the website's URLs deliberately.
URL Normalization Method
- Normalize the URL of the top-level domain name.
Optimize the internal link structure of the website, which includes using the chosen URL format consistently throughout the site architecture. Always use the same URL when adding a hyperlink to an article. When a search engine sees which absolute address is used most often, it will naturally favor that address.
For the Google search engine, you can use Google Webmaster Tools to set the preferred domain, which specifies which domain name prevails. Procedure: log in to your Google account, then choose "Add website" > "Verify ownership" > "Pass verification" > "Manage website" > "Website configuration" > "Settings" > "Preferred domain". Note: when verifying ownership of the website, you must verify both www.domain.com and domain.com. There are two ways to verify ownership: add a meta tag to the home page, or download the HTML file that Google provides and upload it to the root directory of the site.
Take a WordPress blog as an example. If the preferred domain of the blog is nowamagic.net, set both the blog address and the installation address to nowamagic.net in the admin control panel (Control Panel > Settings > General), so that every address generated on the home page uses nowamagic.net. Note: after you change the WordPress blog address, you may be unable to log in to the admin panel. In that case, make the change in the database instead. On a virtual host, you can generally use the phpMyAdmin tool provided in the hosting control panel to manage the database. Find the wp_options table and change siteurl (the blog installation address) and home (the blog address) to nowamagic.net.
- Specify a canonical URL for duplicate pages
Using the rel="canonical" link attribute to specify the canonical URL is another way to solve the duplicate-page problem, and it is a feature that Google introduced and is proud of.
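As an illustration of what a duplicate page would carry in its head section, the following Python sketch builds the link element for a given canonical URL. The helper name canonical_link_tag is hypothetical, and the example treats the static nowamagic.net address mentioned earlier as the preferred one, which is an assumption.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Build the <link> element that a duplicate page places in its <head>."""
    # Escape the URL so characters such as & and " stay valid inside the attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


# Assumption for this example: the static URL is the preferred (canonical) one.
print(canonical_link_tag("http://www.nowamagic.net/archives/137.html"))
```

The same tag would be emitted on every variant of the page, including the dynamic /?p=137 form, so that all variants point to one address.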
- Use the robots.txt file to block the URL formats that search engines do not need to crawl.
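To check which URL forms a given set of rules would block, the rules can be tested locally with Python's standard urllib.robotparser module. The robots.txt content below is a hypothetical example that blocks the dynamic /?p= form and leaves the static form crawlable. The helper name is_crawlable is also an assumption.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the dynamic "/?p=" URL form for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /?p=
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if the given rules allow the agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in-memory rules, no network access
    return parser.can_fetch(agent, url)


print(is_crawlable("http://www.nowamagic.net/?p=137"))
print(is_crawlable("http://www.nowamagic.net/archives/137.html"))
```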
- 301 redirection
A 301 redirect permanently redirects one page to another. It is also one of the most SEO-friendly methods. The search engine records the page that is the target of the permanent redirect, which also solves the URL normalization problem.
For example, on a Linux Apache server, you can implement a 301 redirect by editing the .htaccess file. Add the following code to the .htaccess file. It must be written before the URL rewrite rules.
Redirect 301 /old.htm http://www.domain.com/new.htm

or

Redirect permanent /old.htm http://www.domain.com/new.htm
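If every request for domain.com must be permanently (301) redirected to www.domain.com, mod_rewrite is also required:

# Enable the rewrite engine
RewriteEngine On
# Match requests whose Host header is the bare domain (case-insensitive)
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
# Permanently redirect to the same path on the www host
RewriteRule ^(.*)$ http://www.domain.com/$1 [L,R=301]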
On other hosts, such as Windows hosts, the redirect can be implemented in dynamic scripts such as PHP, ASP, or JSP.
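The script-level approach works the same way in any language: compare the requested host with the preferred one and answer with a 301 status and a Location header. The sketch below uses a Python WSGI middleware rather than PHP, ASP, or JSP, purely for illustration. The names redirect_to_preferred_host and PREFERRED_HOST are hypothetical, and the http scheme is assumed to match the examples above.

```python
# Assumption: www.domain.com is the preferred host, as in the .htaccess example.
PREFERRED_HOST = "www.domain.com"


def redirect_to_preferred_host(app, preferred_host=PREFERRED_HOST):
    """Wrap a WSGI app so requests for any other host get a 301 redirect."""

    def middleware(environ, start_response):
        host = environ.get("HTTP_HOST", "").lower()
        if host and host != preferred_host:
            # Rebuild the same path and query string on the preferred host.
            path = environ.get("PATH_INFO") or "/"
            query = environ.get("QUERY_STRING", "")
            location = f"http://{preferred_host}{path}"
            if query:
                location += "?" + query
            start_response(
                "301 Moved Permanently",
                [("Location", location), ("Content-Length", "0")],
            )
            return [b""]
        # The host is already the preferred one: pass the request through.
        return app(environ, start_response)

    return middleware
```

Wrapping the application once at start-up, for example application = redirect_to_preferred_host(application), keeps the redirect logic out of the individual page handlers.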
Search engine-friendly URL standardization suggestions