URL rewrite and URL rewriting


A few days ago I saw an article in the community about URL rewriting: "Obtaining the URL after ISAPI_Rewrite has rewritten it". URL rewriting is no longer a new technology, and the topic has been discussed many times; a search for "URL Rewrite" turns up many related articles and components. I have worked with it several times myself, so here are some notes on it. ScottGu has a classic blog post on URL rewriting:
Tip/Trick: Url Rewriting with ASP.NET http://weblogs.asp.net/scottgu/archive/2007/02/26/tip-trick-url-rewriting-with-asp-net.aspx

Why URL rewriting?
ScottGu's blog gives two important reasons:
1. Keeping URLs stable when the structure of the web application changes, so that URLs users have bookmarked do not become dead links when pages are moved.
2. Search engine optimization (SEO).
The original text from ScottGu's blog:
---------------------------------------------------------------------------
Why does URL mapping and rewriting matter?
The most common scenarios where developers want greater flexibility with URLs are:
1) Handling cases where you want to restructure the pages within your web application, and you want to ensure that people who have bookmarked old URLs don't break when you move pages around. URL rewriting enables you to transparently forward requests to the new page location without breaking browsers.
2) Improving the search relevancy of pages on your site with search engines like Google, Yahoo and Live. Specifically, URL rewriting can often make it easier to embed common keywords into the URLs of the pages on your sites, which can often increase the chance of someone clicking your link. Moving from using querystring arguments to instead use fully qualified URLs can also in some cases increase your priority in search engine results. Using techniques that force referring links to use the same case and URL entrypoint (for example: weblogs.asp.net/scottgu instead of weblogs.asp.net/scottgu/default.aspx) can also avoid diluting your PageRank across multiple URLs, and increase your search results.
In a world where search engines increasingly drive traffic to sites, extracting any little improvement in your page ranking can yield very good ROI to your business. Increasingly this is driving developers to use URL rewriting and other SEO (search engine optimization) techniques to optimize sites (note that SEO is a fast moving space, and the recommendations for increasing your search relevancy evolve monthly). For a list of some good search engine optimization suggestions, I'd recommend reading the SSW Rules to Better Google Rankings, as well as MarketPosition's article on how URLs can affect top search engine ranking.
---------------------------------------------------------------------------
The scenario in the first reason comes up often when a website is redesigned. During a redesign, page locations and the structure of querystring parameters are often changed, so links that users bookmarked earlier are likely to become dead links. Here URL rewriting plays the role of an intermediate layer in a software architecture. The URLs published to the outside are the rewritten (public) URLs; these are what users bookmark, and they never change. When the site is reorganized and the internal page locations and real internal URLs change, only the internal rewrite rules are modified so that the original public URL is rewritten to the new internal URL. The external URL stays the same even though the page has moved internally. Although URL rewriting can prevent dead links, most sites do not use it for that purpose when they are redesigned. They usually just customize the 404 "page cannot be found" page: a friendlier message is shown and the browser is sent to the home page after a few seconds.
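The intermediate-layer idea can be pictured with a small lookup table. This is only a sketch in Python, chosen as a neutral language; the table, the function name and the "new" internal location are hypothetical and not taken from any of the components discussed here.

```python
# Hypothetical rewrite table: public URL -> current internal URL.
REWRITE_TABLE = {
    "/PD/book.aspx": "/PD.aspx?CG=books",
}


def resolve(public_url: str) -> str:
    """Return the internal URL for a public URL, or the URL itself if no rule applies."""
    return REWRITE_TABLE.get(public_url, public_url)


# Before the site revision the bookmarked URL resolves to the old internal page.
before = resolve("/PD/book.aspx")

# After the revision only the table entry changes; the public URL stays the same.
REWRITE_TABLE["/PD/book.aspx"] = "/catalog/list.aspx?CG=books"  # hypothetical new location
after = resolve("/PD/book.aspx")
```

The bookmark keeps working in both cases because the browser only ever sees the public key of the table.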

The second reason is SEO. If your website is an internal OA, ERP, or CRM site that only your own staff need to access, SEO is unnecessary: such a site does not need to be indexed by search engines, and nobody needs to find it through them. SEO matters a great deal if the site is a commercial, news, or entertainment site, where more visitors is better; in that case optimizing for search engines through URL rewriting is worthwhile. As search engines become people's preferred tool for finding information and resources, their influence on a site keeps growing. Below are third-party referral statistics for zhangsichu.com for the period 9-1 to 9-10.

By recording the Referer in the HTTP header, you can tell which page the user was on before arriving at the current page, that is, which page led the user here.
Of the 266 unique IP addresses, 200 came from search engines. In other words, 200 users first looked at search results and then came to zhangsichu.com, which is 75.2% of the total. Well over half of the visitors arrived through search, which demonstrates how important SEO is to the site. In this situation, URL rewriting for SEO is a must.
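A rough sketch of how such a statistic could be computed from recorded Referer values is shown here in Python. The host list and function names are assumptions made for illustration; they are not the third-party tool that produced the numbers above.

```python
from urllib.parse import urlparse

# Hypothetical list of search-engine domains; a real list would be longer.
SEARCH_ENGINE_HOSTS = ("google.com", "yahoo.com", "live.com", "baidu.com")


def is_search_referer(referer: str) -> bool:
    """True if the Referer host is one of the listed domains or a subdomain of one."""
    host = urlparse(referer).hostname or ""
    return any(host == s or host.endswith("." + s) for s in SEARCH_ENGINE_HOSTS)


def search_share(referer_by_ip: dict) -> float:
    """Percentage of unique IPs whose recorded Referer is a search engine."""
    if not referer_by_ip:
        return 0.0
    hits = sum(1 for r in referer_by_ip.values() if is_search_referer(r))
    return 100.0 * hits / len(referer_by_ip)
```

With 200 search referers among 266 unique IPs, `search_share` returns a value that rounds to the 75.2% quoted above.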


If your site needs neither URL compatibility to prevent dead links nor SEO, there is no need for URL rewriting. URL rewriting costs performance, because every request has to go through the rewrite step.

Common URL rewriting solutions
URL rewriting can happen either at the web server level (IIS/Apache) or at the web application level (ASP.NET/JSP/PHP/...).


1. Web application-level URL rewriting
For URL rewriting at the web application level, there are three well-known ready-made components:
1) The URL rewriting implementation provided by Microsoft: http://msdn2.microsoft.com/zh-cn/library/ms972974.aspx
2) The open-source UrlRewriter.NET: http://urlrewriter.net/
3) UrlRewriting: http://www.urlrewriting.net/en/Download.aspx

All of these components work on the same principle: an HttpModule is registered in the Web.config of your web application, and this HttpModule performs the rewrite. (You can also derive from System.Web.HttpApplication and do the rewrite in the Application_BeginRequest method.)
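The same hook-every-request idea can be sketched outside ASP.NET. The following Python WSGI middleware is only an analogy to an HttpModule, not the code of any of the components above; the class name, the rule table and the `rewrite.original_path` key are hypothetical.

```python
class RewriteMiddleware:
    """Wraps a WSGI app and rewrites the path before the app sees the request."""

    def __init__(self, app, rules):
        self.app = app
        self.rules = rules  # public path -> (internal path, query string)

    def __call__(self, environ, start_response):
        target = self.rules.get(environ.get("PATH_INFO", ""))
        if target is not None:
            # Keep the public path so later code can still read it.
            environ["rewrite.original_path"] = environ["PATH_INFO"]
            environ["PATH_INFO"], environ["QUERY_STRING"] = target
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Tiny app that echoes the path and query string it actually received."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    body = environ["PATH_INFO"] + "?" + environ.get("QUERY_STRING", "")
    return [body.encode("utf-8")]


app = RewriteMiddleware(demo_app, {"/PD/book.aspx": ("/PD.aspx", "CG=books")})
```

Every request passes through `RewriteMiddleware.__call__` first, which corresponds to the module handling the BeginRequest event before the page runs.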

Core processing code. The following code is extracted from the UrlRewriter.NET component.
1) Implement IHttpModule to get an HttpModule. This HttpModule must be registered in Web.config so that every request passes through it.

    public sealed class RewriterHttpModule : IHttpModule
    {
        /// <summary>
        /// Initialises the module.
        /// </summary>
        /// <param name="context">The application context.</param>
        void IHttpModule.Init(HttpApplication context)
        {
            context.BeginRequest += new EventHandler(BeginRequest);
        }

        ...

        private void BeginRequest(object sender, EventArgs e)
        {
            // Add our PoweredBy header
            HttpContext.Current.Response.AddHeader(Constants.HeaderXPoweredBy, Configuration.XPoweredBy);
            _rewriter.Rewrite();
        }
    }

 

2) Read the rewrite rules, decide whether and how to rewrite, and perform the rewrite.

 

    public void Rewrite()
    {
        string originalUrl = ContextFacade.GetRawUrl().Replace("+", " ");
        RawUrl = originalUrl;

        // Create the context
        RewriteContext context = new RewriteContext(this, originalUrl,
            ContextFacade.GetHttpMethod(), ContextFacade.MapPath,
            ContextFacade.GetServerVariables(),
            ContextFacade.GetHeaders(),
            ContextFacade.GetCookies());

        // Process each rule.
        ProcessRules(context);

        // Append any headers defined.
        AppendHeaders(context);

        // Append any cookies defined.
        AppendCookies(context);

        // Rewrite the path if the location has changed.
        ContextFacade.SetStatusCode((int)context.StatusCode);
        if ((context.Location != originalUrl) && ((int)context.StatusCode < 400))
        {
            if ((int)context.StatusCode < 300)
            {
                // Successful status if less than 300
                _configuration.Logger.Info(MessageProvider.FormatString(
                    Message.RewritingXtoY, ContextFacade.GetRawUrl(), context.Location));

                // Verify that the URL exists on this server.
                HandleDefaultDocument(context); // VerifyResultExists(context);

                ContextFacade.RewritePath(context.Location);
            }
            else
            {
                // Redirection
                _configuration.Logger.Info(MessageProvider.FormatString(
                    Message.RedirectingXtoY, ContextFacade.GetRawUrl(), context.Location));

                ContextFacade.SetRedirectLocation(context.Location);
            }
        }
        else if ((int)context.StatusCode >= 400)
        {
            HandleError(context);
        }
        else if (HandleDefaultDocument(context))
        {
            ContextFacade.RewritePath(context.Location);
        }

        // Sets the context items.
        SetContextItems(context);
    }
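The branch structure at the end of Rewrite() is easier to see in isolation. The following Python function is a simplified paraphrase of that structure, not part of UrlRewriter.NET; the function name and the returned labels are made up, and the default-document handling is left out.

```python
def decide(original_url: str, location: str, status: int) -> tuple:
    """Paraphrase of the rewrite / redirect / error decision in Rewrite()."""
    if location != original_url and status < 400:
        if status < 300:
            return ("rewrite", location)   # internal rewrite, invisible to the client
        return ("redirect", location)      # 3xx: the browser is told to go elsewhere
    if status >= 400:
        return ("error", status)           # hand over to error handling
    return ("pass", original_url)          # no rule changed the location
```

For example, `decide("/PD/book.aspx", "/PD.aspx?CG=books", 200)` takes the rewrite branch, while the same change of location with status 302 takes the redirect branch.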

 

This is a rewrite at the ASP.NET pipeline level; it can rewrite every request that ASP.NET handles.

Here the /PD/book.aspx request is rewritten to /PD.aspx?CG=books.
Application-level URL rewriting can only rewrite requests that are handled by the web application. It cannot rewrite requests for .js or .jpg files. The reason is that when such requests arrive at IIS, IIS does not hand them to ASP.NET at all, so they never reach the rewrite code. In IIS you can configure which file extensions are passed to ASP.NET.

 

 

If you must rewrite .js requests at the .NET level, you can map the .js extension to ASP.NET in that IIS setting, but then you have to produce the .js response yourself. Server-level URL rewriting handles this case better.
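The extension mapping described above amounts to a lookup on the file extension. The following Python sketch only illustrates that idea; the set of extensions and the function name are assumptions, and this is not how IIS is actually configured.

```python
import posixpath

# Hypothetical script map: extensions the web server hands to the application.
APP_EXTENSIONS = {".aspx", ".ashx", ".asmx"}


def reaches_application(path: str) -> bool:
    """True if a request for this path would be passed on to the application."""
    path_only = path.split("?", 1)[0]          # ignore the query string
    ext = posixpath.splitext(path_only)[1].lower()
    return ext in APP_EXTENSIONS
```

Adding ".js" to `APP_EXTENSIONS` corresponds to mapping .js requests to ASP.NET: the application can then rewrite them, but it also becomes responsible for serving them.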

2. Web server-level URL-Rewrite

 

Apache server
The Apache server supports URL rewriting through mod_rewrite. Enable LoadModule rewrite_module modules/mod_rewrite.so in the configuration file, then write the rewrite rules as regular expressions. For example:

Extracted from the Apache 2.2 reference manual (Chinese translation), URL Rewriting Guide:

-------------------------------------------
Description: The goal of this rule is to force the use of one particular host name in preference to other names that reach the same site. For example, to force the use of www.example.com instead of example.com, adapt the following recipe.

For sites running on a port other than 80:

    RewriteCond %{HTTP_HOST}   !^fully\.qualified\.domain\.name [NC]
    RewriteCond %{HTTP_HOST}   !^$
    RewriteCond %{SERVER_PORT} !^80$
    RewriteRule ^/(.*)         http://fully.qualified.domain.name:%{SERVER_PORT}/$1 [L,R]

For a site running on port 80:

    RewriteCond %{HTTP_HOST}   !^fully\.qualified\.domain\.name [NC]
    RewriteCond %{HTTP_HOST}   !^$
    RewriteRule ^/(.*)         http://fully.qualified.domain.name/$1 [L,R]
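To make the conditions in this recipe easier to follow, here is the same decision written as a plain Python function. It is a reading aid only, not a replacement for mod_rewrite; the function name and signature are invented for this sketch.

```python
from typing import Optional

CANONICAL_HOST = "fully.qualified.domain.name"


def canonical_redirect(host: str, port: int, path: str) -> Optional[str]:
    """Return the redirect target, or None when no redirect is needed."""
    # RewriteCond %{HTTP_HOST} !^$  -> leave requests without a Host header alone
    if not host:
        return None
    # RewriteCond %{HTTP_HOST} !^fully\.qualified\.domain\.name [NC]
    if host.lower().startswith(CANONICAL_HOST):
        return None
    # RewriteRule ^/(.*) http://fully.qualified.domain.name[:port]/$1 [L,R]
    port_part = "" if port == 80 else f":{port}"
    return f"http://{CANONICAL_HOST}{port_part}{path}"
```

A request for `/a/b` on host `example.com` and port 80 yields `http://fully.qualified.domain.name/a/b`; a request that already uses the canonical host yields `None`.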

 

IIS6/IIS7 web server
IIS7's new integrated pipeline mode is essentially a deeper integration of some ASP.NET concepts into IIS. Mike Volodarsky, a program manager on the IIS7 team, analyzes this in a blog post:
Breaking Changes for ASP.NET 2.0 applications running in Integrated mode on IIS 7.0

 

The "Classic mode" of IIS7 works basically the same way as IIS6.

With IIS6 and application-level URL rewriting in ASP.NET, the rewrite can only happen after the request has been handed to the ASP.NET engine. IIS7 changes this: it can rewrite requests that have no file extension. Because ASP.NET and IIS7 are deeply integrated, IIS7 can execute an HttpModule at any point in the IIS request pipeline. The following is an ASP.NET rewrite configuration for IIS7:

Extracted from ScottGu's blog:

 

    <?xml version="1.0" encoding="UTF-8"?>
    <configuration>
      <configSections>
        <section name="rewriter"
                 requirePermission="false"
                 type="Intelligencia.UrlRewriter.Configuration.RewriterConfigurationSectionHandler, Intelligencia.UrlRewriter" />
      </configSections>
      <system.web>
      ...

 

Where: <rewrite url="~/products/(.+)" to="~/products.aspx?category=$1" /> In this rule, the regular expression ~/products/(.+) matches every link under /products/.
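The effect of this rule can be tried out with an ordinary regular expression. The Python sketch below assumes that "~" stands for the application root "/" and that the rule must match the whole path; it imitates the rule and is not UrlRewriter.NET's own matching code.

```python
import re

# The rule from the configuration above, with "~" taken as the application root "/".
RULE_PATTERN = re.compile(r"^/products/(.+)$")
RULE_TARGET = r"/products.aspx?category=\1"


def apply_rule(path: str) -> str:
    """Return the rewritten path, or the path unchanged if the rule does not match."""
    match = RULE_PATTERN.match(path)
    return match.expand(RULE_TARGET) if match else path
```

`apply_rule("/products/books")` gives `/products.aspx?category=books`, while a path outside /products/ is returned unchanged.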
To rewrite at the server level on IIS6, you must use an ISAPI filter.

Two well-known ISAPI filter projects:
1) Helicon Tech's ISAPI_Rewrite: http://www.isapirewrite.com/ offers a full version of the ISAPI URL rewrite product for $99 (30-day free trial) and a free lite version.
2) Ionic's ISAPI rewrite filter: http://cheeso.members.winisp.net/IIRF.aspx is a completely free, open-source component.
See also: Rewriting URLs with ISAPI filter programming.

The biggest difference between server-level and application-level rewriting is when the rewrite happens. At the server level, /PD/book.aspx is rewritten to /PD.aspx?CG=books before the request has been handed to the ASP.NET engine.

3. Some details of ASP.NET-level rewriting (part of the content is from ScottGu's blog)
If a page contains a form with runat="server", the action rendered for that form is the real URL after rewriting, not the clean URL the browser requested. For example, /PD/book.aspx is rewritten to /PD.aspx?CG=books. The address the user's browser actually visits is /PD/book.aspx; after the rewrite the request becomes /PD.aspx?CG=books, so the form action is rendered as /PD.aspx?CG=books. What you want instead is for the action to be rendered as /PD/book.aspx so that the page posts back to the same location. In some cases an action of /PD.aspx?CG=books does no harm: as long as /PD.aspx?CG=books is not matched by another rewrite rule, the postback reaches the ASP.NET engine correctly. However, the address in the browser's address bar changes and the real address is exposed. And if this URL is matched by another rule, the form action must be rendered as /PD/book.aspx, the single public URL.

Solutions:
1) Wrap the form control yourself. Write the URL into a hidden field so that it comes back with the postback, and set the action to the URL in the hidden field during render.
2) Use JavaScript to modify the action before the form is submitted, for example document.forms[0].action = window.location;
3) Use an ASP.NET 2.0 control adapter (from ScottGu's blog).
When the rewrite is done inside the ASP.NET application, use Context.Request.RawUrl as the form action. When the rewrite is done by IIS at the server level, the clean URL is stored in Request.ServerVariables["HTTP_X_REWRITE_URL"], so use that value as the form action. The control adapter extends the form control: it overrides the WriteAttribute method used when the form is rendered and sets the form action again during render.

The source code of the control adapter is in ScottGu's blog post.
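The control adapter code itself is not reproduced in this copy. As a rough illustration of the choice it has to make, here is a Python sketch (not ScottGu's C# code): the function name and the plain dictionary standing in for the server variables are assumptions, while the variable name HTTP_X_REWRITE_URL comes from the description above.

```python
def form_action(server_variables: dict, raw_url: str) -> str:
    """Pick the URL that should be rendered as the form action."""
    # Server-level rewrite: the filter stores the clean URL in a server variable.
    clean = server_variables.get("HTTP_X_REWRITE_URL")
    if clean:
        return clean
    # Application-level rewrite: the raw URL is still the one the browser requested.
    return raw_url
```

In both cases the form ends up posting back to the public URL, so the internal address never appears in the browser.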

 

Path compatibility after rewriting
When /PD/book.aspx is rewritten to /PD.aspx?CG=books, relative references on the page, such as an img with src="../logo.gif" or src="logo.gif", can break. The browser resolves them against /PD/: src="../logo.gif" requests /logo.gif, and src="logo.gif" requests /PD/logo.gif. But the page was written with / as its reference location, because the real URL is /PD.aspx?CG=books. As a result, the resource may not be found.
1) Use a server-side img control with a ~ path (from ScottGu's blog).
2) Use the <base href="http://xxx/"></base> tag, which must be placed in the head. It tells the browser that every relative path on the page is resolved against http://xxx/, which fixes the broken paths after rewriting.
Description of the base tag: http://www.w3school.com.cn/tags/tag_base.asp
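The path problem can be reproduced with the URL resolution in Python's standard library, which follows the same relative-reference rules a browser uses. The host example.com and the helper name are placeholders chosen for this sketch.

```python
from urllib.parse import urljoin

public_url = "http://example.com/PD/book.aspx"        # what the browser sees
internal_url = "http://example.com/PD.aspx?CG=books"  # what the page was written for
base_href = "http://example.com/"                     # value of a <base href> tag


def resolve_all(base: str, refs: list) -> dict:
    """Resolve each relative reference against the given base URL."""
    return {ref: urljoin(base, ref) for ref in refs}


seen_by_browser = resolve_all(public_url, ["logo.gif", "../logo.gif"])
intended = resolve_all(internal_url, ["logo.gif"])
with_base_tag = resolve_all(base_href, ["logo.gif"])
```

Against the public URL, "logo.gif" resolves under /PD/, whereas the page intended it to resolve under /. Resolving against the base href gives the intended location again.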

That concludes the discussion of URL rewriting. Real projects will certainly run into all kinds of problems, but the solutions are likely to be combinations and extensions of the techniques above. I hope this discussion will help when new problems come up.

 

 

Author: Zhang Sichu, GrapeCity control technical team

Title: Web solution expert
