From: http://www.hbcms.com/cms/33/219.html
In the article "View how search engine spiders crawl your website through HTTP status codes", I introduced some frequently encountered HTTP status codes and their meanings. The ones relevant to this article are:
404: the server cannot find the specified resource; the requested webpage does not exist (for example, the page requested by the browser was deleted or moved, but the link may become valid again in the future);
410: the requested webpage does not exist (Note: 410 indicates a permanent condition, while 404 may be temporary);
200: the server returns the requested webpage;
301: permanent URL redirection;
302: temporary URL redirection.
Note: Most search engines, such as Google, treat "404" the same way as "410". (See the description by Matt Cutts.)
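As a quick reference, the codes listed above can be looked up in Python's standard library registry. This is only an illustrative sketch; the helper name describe is made up for this example and is not part of the original article.

```python
from http import HTTPStatus

# Status codes discussed in the list above.
CODES = [200, 301, 302, 404, 410]


def describe(code: int) -> str:
    """Return the numeric code followed by its standard reason phrase."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


if __name__ == "__main__":
    for code in CODES:
        print(describe(code))
```

The reason phrases come from the HTTP standard, so the same text appears in the status line a crawler receives.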
Understanding the HTTP 404 status code
An HTTP 404 error means that the webpage a link points to does not exist, that is, the URL of the original webpage is no longer valid. This happens often and is hard to avoid. For example, the original URL can become inaccessible because the rules for generating page URLs were changed, the page file was renamed or moved, or an inbound link was misspelled. When the web server receives such a request, it returns a 404 status code to tell the browser that the requested resource does not exist. However, whether on Apache or IIS, the web server's default 404 error page is plain, dull, and unfriendly. It gives users no information or clues about where to go next, which inevitably drives users away.
Therefore, many websites use a custom 404 error page to improve the user experience and avoid losing visitors. A common approach is to place the site's quick navigation links, a search box, and the site's featured services on the page, which helps users stay on the site and find the information they need.
Impact of HTTP 404 on SEO
Custom 404 error pages are good practice for improving the user experience, but their impact on search engines is often overlooked when they are put in place. For example, a misconfigured server may return a "200" status code, or a custom 404 error page that uses Meta Refresh may return a "302" status code. A correctly configured custom 404 error page should not only display properly but also return the "404" status code instead of "200" or "302". Although visitors see no difference between the HTTP status codes "404" and "200", the difference is very important to search engines.
(1) The custom 404 error page returns the "200" status code
When a search engine spider receives a 404 status response for a requested URL, it knows that the URL is no longer valid, stops indexing the page, and reports to the data center that the page at that URL should be deleted from the index database. Of course, the deletion may take a long time. When the spider receives a "200" status, it treats the URL as valid and adds the page to the index database. The result is that different URLs have identical content, namely the content of the custom 404 error page, which can cause a duplicate-content problem. With search engines, especially Google, this not only makes it hard to earn TrustRank but also greatly lowers Google's assessment of the site's quality. (Why would a "200" status code be returned? See "Basic principles for customizing 404 error pages" below.)
I have been using Google Sitemaps. When we submit a sitemap file in XML format, Google verifies our identity to make sure that we are the legitimate administrator of the website. There are two verification methods: upload an HTML page with a specified name to the website's root directory, or add a meta tag containing an identifier to the head section of a page. I usually upload an HTML page, but Google told me that it could not find this page in the root directory of my website (although I had uploaded it and could access it through a browser). This is a serious problem, as shown in the figure below:
[Figure: http://www.hbcms.com/hbcms/upload/image/original/71/7160e23832a1bea28278aa7ad824235a.gif]
(2) The custom 404 error page uses Meta Refresh and returns the "302" status code
The custom 404 error pages of many websites work as follows: first an error message is displayed, and then the page redirects to the homepage, the sitemap, or a similar page of the website via Meta Refresh. Depending on the implementation, this type of 404 page may return a "200" or a "302" status code, but either way it is not a proper choice from an SEO perspective.
We have already discussed the "200" status, so what does a search engine do when the custom 404 page returns "302"? In theory, on a "302" response the search engine assumes that the page exists but that its address has changed temporarily, so it keeps indexing the page, and duplicate content similar to the "200" case may result. In addition, mainstream search engines, led by Google, apply increasingly strict requirements to the use of 302 redirects, so this kind of improper use of 302 redirection carries considerable risk.
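The two failure cases described above can be summarized as a small classifier. This is a minimal sketch, not a description of how any search engine actually works. The function name diagnose_missing_page and its labels are hypothetical, and the input is the status code a site returns for a URL that is known not to exist.

```python
def diagnose_missing_page(status: int) -> str:
    """Classify the status code returned for a URL known not to exist."""
    if status in (404, 410):
        # The desired behavior: the crawler can drop the URL from its index.
        return "ok"
    if status == 200:
        # Case (1): the error page looks like a valid page and may be indexed.
        return "soft-404"
    if status in (301, 302):
        # Case (2): the crawler is told the page has moved, not that it is gone.
        return "redirect"
    return "other"


if __name__ == "__main__":
    for code in (404, 200, 302):
        print(code, diagnose_missing_page(code))
```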
Make sure that the custom 404 error page returns the "404" status code
After setting up the custom 404 error page, make sure that it actually returns the "404" status code. You can use a server header check tool: enter a URL that does not exist on your site, inspect the returned HTTP headers, and make sure the response is "404 Not Found".
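If no online header checker is at hand, the same check can be sketched with Python's standard library. This is an assumption-laden example: fetch_status is a hypothetical helper, it sends a plain GET request, and it deliberately does not follow redirects so that a "302" stays visible instead of being hidden behind the final "200".

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0) -> tuple[int, str]:
    """Request the URL once and return (status code, reason phrase).

    http.client does not follow redirects, so a 301/302 is reported as-is.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.reason
    finally:
        conn.close()


# Usage (the URL is a placeholder for a page that does not exist on your site):
#   status, reason = fetch_status("http://www.example.com/no-such-page.html")
#   print(status, reason)
```

A correctly configured site should report 404 here; 200 or 302 indicates one of the two problems described earlier.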
404 error handling method
(1) Basic principles for customizing 404 error pages
First, it should be clear that the 404 error must be handled at the server level rather than at the webpage level. When the custom 404 page is a dynamic page such as a PHP script, make sure that the server has sent the "404" status code before the PHP script executes. Otherwise, once the request reaches the ISAPI level, the returned status code can only be "200" or a redirection status code such as "302".
Second, when configuring the custom 404 error page for a website, use a relative path rather than an absolute URL for the error page, and place the custom 404 page in the website's root directory. Although invalid links can take many forms, when a 404 error occurs the web server automatically directs the request to the custom 404 error page, regardless of the form of the URL.
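To illustrate the idea that the "404" status must be set before the custom page content is sent, here is a minimal sketch written as a Python WSGI application. It stands in for the PHP or ASP page discussed in this article and is not how those platforms are configured. The names app and NOT_FOUND_BODY and the page content are made up for the example.

```python
# Hypothetical custom error page content with a link back to the homepage.
NOT_FOUND_BODY = b'<h1>Page not found</h1><p><a href="/">Home</a></p>'


def app(environ, start_response):
    """Serve "/" normally and answer every other path with a custom 404 page."""
    if environ.get("PATH_INFO", "/") == "/":
        body = b"<h1>Home</h1>"
        start_response("200 OK", [
            ("Content-Type", "text/html"),
            ("Content-Length", str(len(body))),
        ])
        return [body]
    # The status line is set to 404 before any of the friendly page is sent,
    # so browsers see the custom page and crawlers see "404 Not Found".
    start_response("404 Not Found", [
        ("Content-Type", "text/html"),
        ("Content-Length", str(len(NOT_FOUND_BODY))),
    ])
    return [NOT_FOUND_BODY]
```

The application can be served locally with wsgiref.simple_server.make_server("127.0.0.1", 8000, app).serve_forever() to inspect the headers it returns.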
(2) Set the 404 error page in Apache
Setting the 404 error page on an Apache server is simple. You only need to add the following line to the .htaccess file:
ErrorDocument 404 /notfound.php
Note:
1. Do not redirect the 404 error to the website homepage. Otherwise, the homepage may disappear from the search engine's index.
2. Do not use an absolute URL (for example, http://www.bloghuman.com/nofound.php). If you use an absolute URL, the returned status codes are "302" followed by "200" (tested).
(3) Set the 404 error page under IIS/ASP.NET
First, modify the settings in the application's root directory: open the "Web.config" file and add the following content to it:
<configuration>
  <system.web>
    <customErrors mode="On" defaultRedirect="error.asp">
      <error statusCode="404" redirect="notfound.asp"/>
    </customErrors>
  </system.web>
</configuration>
Note: In the preceding example, "error.asp" is the default error page and "notfound.asp" is the custom 404 page. Change the file names to match your own site.
Then, add the following content to the custom 404 page "notfound.asp":
<%
Response.Status = "404 Not Found"
%>
In this way, you can ensure that IIS returns the "404" status code correctly.
(4) Set a static 404 page under IIS/ASP.NET
Setting a static 404 error page is relatively simple. In IIS Manager, right-click the website you want to manage, open Properties, go to the Custom Errors page, and set the error page for "404". However, under "Message Type" you must select "File" or "Default" instead of "URL". Otherwise, the "200" status code will be returned.
Appendix: Server header check tool
http://www.webrankinfo.com/english/tools/server-header.php