In the article "Viewing search engine spiders crawling your site through an HTTP status code," I introduced some commonly encountered HTTP status codes and their meanings. The ones most often discussed, and most relevant to this article, are:
404: The server cannot find the requested resource; the requested page does not exist (for example, the page has been deleted or moved). This does not rule out the URL becoming valid again in the future.
410: The requested page no longer exists. (Note: 410 indicates that the removal is permanent, whereas 404 says nothing about whether the condition is temporary or permanent.)
200: The server successfully returned the requested page.
301: Permanent redirect.
302: Temporary redirect.
Note: Most search engines, Google among them, treat the "404" and "410" statuses in the same way. (See Matt Cutts's note.)
Understanding the HTTP 404 status code
An HTTP 404 error means that the page a link points to does not exist; that is, the original URL is no longer valid. This happens often and is hard to avoid. Typical causes include a change in the site's URL generation rules, a page file being renamed or moved, and misspelled inbound links. When the server receives a request for such a URL, it returns a 404 status code to tell the browser that the requested resource does not exist. However, the default 404 error pages of web servers, both Apache and IIS, are plain and unhelpful: they give the user no clues about where to go next, which inevitably leads to lost visitors.
For this reason, many sites use a custom 404 error page to improve the user experience and reduce user churn. A common practice is to place site navigation links, a search box, and the site's featured services on the custom 404 page, which helps users reach the pages and information they need.
The impact of HTTP 404 on SEO
Customizing the 404 error page is a good way to improve the user experience, but its impact on search engines is often overlooked. For example, a misconfigured server may return a "200" status code for the custom page, or a custom 404 page that relies on meta refresh may return "302". A correctly configured custom 404 error page should not only display properly but also return the "404" status code rather than "200" or "302". Whether the status code is "404" or "200" makes no visible difference to the user, but it matters a great deal to search engines.
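As a rough illustration of "display properly and still return 404", here is a minimal Python sketch built on the standard library's http.server. It is not tied to Apache or IIS; the page table, the error page markup, and the names build_response and SiteHandler are all hypothetical and exist only for this example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content, used only for this sketch.
PAGES = {"/": b"<h1>Home</h1>"}

# A friendly error page with navigation links for the visitor.
NOT_FOUND_BODY = (
    b"<h1>Page not found</h1>"
    b'<p><a href="/">Home</a> | <a href="/sitemap">Site map</a></p>'
)


def build_response(path):
    """Return (status, body) for a request path."""
    body = PAGES.get(path)
    if body is not None:
        return 200, body
    # The visitor sees a helpful page, but the status code stays 404.
    return 404, NOT_FOUND_BODY


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = build_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, one could run HTTPServer(("127.0.0.1", 8000), SiteHandler).serve_forever() and request a path that is not in PAGES; the port number is arbitrary.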
(i) Custom 404 error page returns a "200" status code
When a search engine receives a "404" response for a URL, it knows the URL is no longer valid, stops indexing it, and reports back to its data center so the page is removed from the index database, although the removal may take a long time. When the search engine receives a "200" response, it considers the URL valid and indexes it. The result is different URLs with exactly the same content, namely the content of the custom 404 error page, which creates a duplicate content problem. With search engines, especially Google, this not only makes it hard to earn trust (TrustRank) but can also greatly lower Google's assessment of the site's quality. (Why would the status code come back as "200"? See "The basic principle of customizing 404 error pages" below.)
I have been using Google Sitemap. When you submit an XML sitemap file, Google verifies your identity to make sure you are a legitimate manager of the site. There are two verification methods: upload an HTML page with a specified name to the site root, or add an identifying meta tag to the head section of a page. I usually upload the HTML page, but Google reported that it could not find this page in the root directory, even though I had uploaded it and could open it in a browser. This is a rather alarming problem; see the figure:
(ii) Custom 404 error page uses meta refresh and returns a "302" status code
Many websites implement their custom 404 error page by first displaying an error message and then using meta refresh to send the visitor to the home page, the sitemap, or a similar page. Depending on the implementation, such a 404 page may return a "200" status code or a "302". Either way, it is not a good choice from an SEO point of view.
We have already covered the "200" case, so how does a search engine treat a 404 page that returns "302"? In theory, a "302" response tells the search engine that the page exists but has temporarily changed address, so it will keep indexing the page, and the same duplicate content problem arises as with "200". In addition, the mainstream search engines, Google first among them, are increasingly strict about 302 redirects, so this kind of improper use of 302 carries considerable risk.
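The two problem cases above can be summarized in a small Python sketch that classifies the response a server gives for a URL known not to exist. The function name diagnose, the labels it returns, and the regular expression used to spot a meta refresh tag are assumptions made for illustration; real pages may need a proper HTML parser.

```python
import re

# Loose match for <meta http-equiv="refresh" ...>; a simplification.
META_REFRESH = re.compile(
    r"""<meta[^>]+http-equiv\s*=\s*["']?refresh""", re.IGNORECASE
)


def diagnose(status, body=""):
    """Classify the response returned for a URL that should not exist."""
    if status in (404, 410):
        return "ok"
    if status == 200:
        # The error page looks like a valid page to a crawler.
        if META_REFRESH.search(body):
            return "soft 404 with meta refresh"
        return "soft 404"
    if status in (301, 302, 303, 307, 308):
        # The missing URL is reported as moved rather than gone.
        return "redirect"
    return "unexpected status %d" % status
```

For example, diagnose(200, '<meta http-equiv="refresh" content="3;url=/">') falls into the meta refresh case, while diagnose(302) is reported as a redirect.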
Make sure the custom 404 error page returns a "404" status code
After setting up the custom 404 error page, be sure to check that it actually returns the "404" status code. Use a server header check tool, enter a URL that does not exist on your site, and inspect the returned HTTP headers to confirm that the response is "404 Not Found".
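If no header check tool is at hand, the same check can be sketched in a few lines of Python using the standard library's http.client, which does not follow redirects, so a "302" shows up as a "302" rather than as the final page. The helper names missing_url and fetch_status and the User-Agent string are hypothetical, chosen only for this example.

```python
import http.client
import uuid
from urllib.parse import urlsplit


def missing_url(base_url):
    """Build a URL under base_url that almost certainly does not exist."""
    return base_url.rstrip("/") + "/" + uuid.uuid4().hex + ".html"


def fetch_status(url, timeout=10):
    """Return (status, reason, location) without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": "status-check-sketch"})
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return resp.status, resp.reason, resp.getheader("Location")
    finally:
        conn.close()
```

Calling fetch_status(missing_url("http://www.example.com")) against your own site in place of the placeholder host should give a status of 404; a 200, or a 302 with a Location header, points to one of the misconfigurations described above.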
404 error handling
(i) The basic principle of customizing 404 error pages
The first point to be clear about is that 404 errors should be handled at the server level rather than at the page level. If the custom 404 page is a dynamic page such as a PHP script, you must make sure the server has already sent the "404" status code before the script runs. Otherwise, once processing reaches the script (ISAPI) level, the status code returned will be "200" or a redirect code such as "302", unless the script sets the status itself.
Second, when configuring a custom 404 error page, specify the error page with a relative path rather than an absolute URL, and place the custom 404 page in the site's root directory. Invalid links can take many different forms, but whenever a 404 error occurs the web server automatically serves the custom 404 error page, regardless of the form of the requested URL.
(ii) Setting a 404 error page in Apache
Setting a 404 error page on an Apache server is simple: just add the following line to the .htaccess file.
ErrorDocument 404 /notfound.php
Note:
1. Do not point the 404 error at the site's home page; otherwise the home page may disappear from search engine results.
2. Do not use a full absolute URL (one that includes the scheme and host name); use a local path in the form /notfound.php instead. With an absolute URL, the status codes returned are "302" followed by "200" (tested).
(iii) Setting a 404 error page in IIS/ASP.NET
First, open the "Web.config" file in the application's root directory in an editor and add the following:
Note: In the above example, "error.asp" is the system's default 404 page and "notfound.asp" is the custom 404 page; change the file names to match your own when using it.
Then add the following to the custom 404 page "notfound.asp":
<%
Response.Status = "404 Not Found"
%>
This ensures that IIS correctly returns the "404" status code.
(iv) Setting a static 404 page in IIS/ASP.NET
Setting a static 404 error page is even simpler. In IIS Manager, right-click the website you want to manage, open the custom error messages page in Properties, and set the desired error page for 404. Be sure to select "File" or "Default" under "Message type" rather than "URL"; choosing "URL" causes the "200" status code to be returned.