Apache URL Rewrite rules

Source: Internet
Author: User

1. Introduction

Apache's rewrite feature is provided by the mod_rewrite module. It is very powerful and can manipulate every part of a URL.

This lets us give users a friendly URL, which mod_rewrite converts to the real resource path when the user requests it. Many things can be done with mod_rewrite, such as hiding real addresses, URL redirects, domain name redirects, hotlink protection, restricting access by resource type, and more.

2. Work Flow

The mod_rewrite module uses two API hooks at run time.

The first is the URL-to-filename translation hook. When a request reaches the Apache server and the server has determined the appropriate host (or virtual host), mod_rewrite starts to work: it first processes the mod_rewrite directives from the global server configuration and then rewrites the URL according to the directives supplied by the user.

The second is the Fixup hook. mod_rewrite handles the non-global settings at this stage, for example the settings in a directory's .htaccess file. By this point the URL translation is already complete (the URL has been converted to a file name), so the directory-level rules can no longer operate on the URL directly. Instead, mod_rewrite converts the translated file name back into a URL and then continues with the directory-level URL rewriting. (mod_rewrite restarts the request processing loop using the callback of the post-read-request phase.)

Ruleset processing in mod_rewrite

When mod_rewrite is invoked in these two API phases, it reads the set of rules from its configuration structure (created either at the server level at startup, or at the directory level while Apache walks the directories) and then starts the URL rewriting engine to process the rule set (one or more rules, each with optional conditions). Server-level and directory-level rule sets are handled by the same URL rewriting engine; only the handling of the final result differs.

The order of rules in a rule set is important because the rewrite engine processes them in a special order: it loops through the rule set rule by rule (RewriteRule directives), and when a rule matches, it then loops through the conditions attached to that rule (RewriteCond directives). For historical reasons the conditions are written before the rule, so the control flow is a little long-winded; the details are shown in Figure 1.

The URL is first matched against the Pattern of each rule. If the match fails, mod_rewrite immediately stops processing that rule and moves on to the next one. If the match succeeds, mod_rewrite looks for rule conditions belonging to the rule. If there are none, it simply replaces the URL with the new value constructed from Substitution and goes on with the next rule. If conditions exist, an inner loop processes them one after the other in the order in which they are listed. Conditions are handled differently from rules: the URL is not matched against a pattern. Instead, a TestString is first built by expanding variables, back-references, map lookups and so on, and this string is then matched against CondPattern. If the match fails, the whole set of conditions and the corresponding rule fail. If the match succeeds, the next condition is processed, until no conditions remain. If all conditions match, the URL is replaced with Substitution and processing continues. (This section draws on the translation by Jin Bu.)
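To make the evaluation order concrete, here is a minimal sketch for a server-level or virtual-host configuration. The paths /docs/ and /manual/ and the host name old.example.com are hypothetical and do not come from the original article.

```apache
RewriteEngine on

# Written first, but evaluated second: these conditions are only checked
# after the Pattern of the RewriteRule below has matched the URL.
# Both must be true (implicit AND).
RewriteCond %{HTTP_HOST} ^old\.example\.com$ [NC]
RewriteCond %{QUERY_STRING} !^raw=1$

# Evaluated first: if ^/docs/(.*)$ does not match, the conditions above
# are never checked and processing moves on to the next rule.
RewriteRule ^/docs/(.*)$ /manual/$1 [L]
```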

[Figure 1: control flow of ruleset processing in mod_rewrite; image not included]

3. URL rewrite instructions

The simplest rewrite setup is simpler than you might imagine!

It takes just two steps. First, use RewriteEngine to turn on mod_rewrite; second, define URL rewrite rules with RewriteRule.
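The two steps can be sketched as follows. The file names old.html and new.html are hypothetical placeholders.

```apache
# Step 1: turn on the rewrite engine
RewriteEngine on

# Step 2: define a rule; a request for old.html is served internally
# from new.html, and [L] stops further rule processing
RewriteRule ^/?old\.html$ /new.html [L]
```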

1) URL rewrite directive template

# Turn on mod_rewrite
RewriteEngine on
# Base URL for per-directory rules (for example when the directory is mapped with Alias)
RewriteBase url-path
# Rewrite condition (there can be several)
RewriteCond TestString CondPattern [flags]
# Rewrite rule
RewriteRule Pattern Substitution [flags]

The RewriteCond and RewriteRule lines can be repeated. RewriteRule directives are executed in order, one at a time (unless a flag ends processing). These are the commonly used directives; there are also some rarely used ones that you will need to look up yourself.

2) RewriteRule Pattern Substitution [flags]

1. Pattern is a Perl-compatible regular expression applied to the current URL. The current URL is the value of the URL at the time the rule is applied. It may differ from the originally requested URL, because earlier RewriteRule or Alias directives may already have modified it.

2. Substitution is the string that replaces the URL when the URL matches Pattern. It can contain:

    • Back-references $N (N = 0 to 9) to the Pattern, representing the contents of the Nth parenthesized group in the regular expression
    • Back-references %N (N = 0 to 9) to the last matched RewriteCond, representing the contents of the Nth parenthesized group in that RewriteCond's pattern
    • Server variables %{VARNAME}
    • Mapping-function calls ${mapname:key|default} (the map is defined with the RewriteMap directive)
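The following sketch shows these constructs together. The domain example.com, the /sites/ and /profile/ paths and the map name lowercase are hypothetical; int:tolower is one of mod_rewrite's built-in map functions.

```apache
RewriteEngine on

# %{HTTP_HOST} is a server variable. The parentheses in the CondPattern
# capture the subdomain, which is available as %1 in the Substitution.
RewriteCond %{HTTP_HOST} ^([a-z0-9-]+)\.example\.com$ [NC]

# $1 is the first parenthesized group of the RewriteRule Pattern.
RewriteRule ^/?page/(.+)$ /sites/%1/$1 [L]

# ${...} calls a map declared with RewriteMap. RewriteMap itself is only
# allowed in the server or virtual-host configuration, not in .htaccess.
RewriteMap lowercase int:tolower
RewriteRule ^/?user/(.+)$ /profile/${lowercase:$1} [L]
```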

3. [flags] are markers; multiple flags are separated by commas.

Flags (excerpted from online documentation):

redirect|R[=code] (force redirect)

Prefixing Substitution with http://thishost[:thisport]/ (which makes the new URL a URI) forces an external redirect. If no code is given, an HTTP response code of 302 (Moved Temporarily) is used. To use another response code in the range 300-400, give it here as a number or as one of the following symbolic names: temp (default), permanent, seeother. Use this flag for rules that canonicalize the URL and send it back to the client, such as rewriting "/~" to "/u/" or always appending a slash to /u/user.

Note: When using this flag, make sure that the substitution field is a valid URL! Otherwise you are redirecting to an invalid location! Also keep in mind that this flag by itself only prefixes the URL with http://thishost[:thisport]/; rewriting still continues. Usually you want to stop rewriting and redirect immediately, in which case you also need the 'L' flag.
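As a minimal sketch of combining R with L (the paths old-page and new-page are hypothetical):

```apache
RewriteEngine on

# Send a permanent (301) external redirect and, because of L, stop
# processing further rules for this request.
RewriteRule ^/?old-page/?$ /new-page [R=301,L]
```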

forbidden|F (force URL to be forbidden)

Forces the current URL to be forbidden, that is, immediately sends back an HTTP response code of 403 (Forbidden). Combined with suitable RewriteCond directives, this flag lets you block certain URLs conditionally.

gone|G (force URL to be gone)

Forces the current URL to be gone, that is, immediately sends back an HTTP response code of 410 (Gone). Use this flag to mark pages that no longer exist as gone.

proxy|P (force proxy)

This flag forces the substitution part to be treated internally as a proxy request and immediately (that is, rule processing stops here) hands it over to the proxy module. You must make sure that the substitution string is a valid URI (typically one starting with http://hostname) that the Apache proxy module can handle. With this flag you can map remote content into the namespace of the local server, which gives you a more powerful version of the ProxyPass directive.

Note: To use this feature, the proxy module must be compiled into the Apache server. If you are not sure, check whether mod_proxy.c appears in the output of "httpd -l". If it does, mod_rewrite can use this feature; if not, you must enable mod_proxy and rebuild the httpd program.
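A minimal sketch of the P flag, assuming mod_proxy and its HTTP support (mod_proxy_http on Apache 2.x) are loaded. The backend host name backend.example.com and the /app/ prefix are hypothetical.

```apache
RewriteEngine on

# Hand every request below /app/ to a backend server through mod_proxy.
# The Substitution must be a full URI that the proxy module can handle.
RewriteRule ^/?app/(.*)$ http://backend.example.com/$1 [P]
```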

last|L (last rule)

Stops the rewriting process immediately and applies no further rewrite rules. It corresponds to the last command in Perl or the break command in C. Use this flag to prevent the currently rewritten URL from being rewritten again by the rules that follow. For example, use it to rewrite the root path URL ('/') to one that actually exists, such as '/e/www/'.

next|N (next round)

Re-runs the rewriting process, starting again with the first rule. The URL that is processed in the new round is not the original URL but the URL produced by the last rewrite rule. It corresponds to the next command in Perl or the continue command in C. Use this flag to restart the rewriting process, that is, to jump immediately back to the top of the loop.
But be careful not to create an infinite loop!
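A small sketch of the N flag: the rule below replaces one underscore with a hyphen per round and restarts the rule set each time it matches. The loop ends once the URL contains no more underscores, because the Pattern then stops matching. The choice of characters is only an illustration.

```apache
RewriteEngine on

# Each match rewrites the last underscore in the URL to a hyphen,
# then [N] restarts processing with the rewritten URL.
RewriteRule ^(.*)_(.*)$ $1-$2 [N]
```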

chain|C (chained with next rule)

This flag chains the current rule to the next rule (which can itself be chained to the rule after it, and so on). The effect is as follows: if the rule matches, processing continues as usual, so the flag has no effect; if the rule does not match, all the chained rules that follow it are skipped. For example, you can use it to remove the ".www" part inside a directory-level rule set when performing an external redirect (where the ".www" part should not appear).

type|T=MIME-type (force MIME type)

Forces the MIME type of the target file to be MIME-type. For example, this can be used to simulate the ScriptAlias directive of mod_alias, which internally forces the MIME type of all files in the mapped directory to "application/x-httpd-cgi".

nosubreq|NS (used only if not an internal sub-request)

This flag forces the rewrite engine to skip the rule when the current request is an internal sub-request. For example, Apache generates sub-requests internally when mod_include tries to find out about possible directory default files (index.xxx). Applying the complete rule set to a sub-request is not always useful and may even cause an error. You can use this flag to exclude certain rules in that case.

Use the following guideline: if you prefix URLs with a CGI script to force them to be processed by that script, the chance of running into problems (or extra overhead) on sub-requests is high. In such cases, use this flag.

nocase|NC (no case)

This makes the Pattern case-insensitive, that is, there is no difference between 'A-Z' and 'a-z' when Pattern is matched against the current URL.

qsappend|QSA (query string append)

This flag forces the rewrite engine to append the query string part of the substitution string to the existing query string instead of replacing it. Use this flag when you want to add more data to the query string through a rewrite rule.
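A minimal sketch of QSA. The script name product.php and its id parameter are hypothetical.

```apache
RewriteEngine on

# A request for /product/42?color=red is rewritten to
# /product.php?id=42&color=red. Without QSA the original
# query string (color=red) would be discarded.
RewriteRule ^/?product/([0-9]+)$ /product.php?id=$1 [QSA,L]
```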

noescape|NE (no URI escaping of output)

This flag keeps mod_rewrite from applying the usual URI escaping rules to the result of a rewrite. Ordinarily, special characters (such as '%', '$', ';' and so on) are escaped into their hexadecimal equivalents. This flag prevents that, so that characters such as percent signs can appear in the output, for example:

The rule RewriteRule /foo/(.*) /bar?arg=P1\%3d$1 [R,NE] turns '/foo/zed' into a safe request for '/bar?arg=P1=zed'.

passthrough|PT (pass through to next handler)

This flag forces the rewrite engine to set the uri field of the internal request_rec structure to the value of the filename field. It is just a small hack that allows the output of RewriteRule directives to be post-processed by Alias, ScriptAlias, Redirect and similar directives from other URI-to-filename translators. An example to show what this means: if you want to rewrite /abc to /def with mod_rewrite and then /def to /ghi with mod_alias, you can use:

RewriteRule ^/abc(.*) /def$1 [PT]

Alias /def /ghi
If the PT flag is omitted, mod_rewrite still does its job correctly: acting as an API-compliant URI-to-filename translator, it rewrites uri=/abc/... to filename=/def/.... However, mod_alias then fails when it tries to perform its own URI-to-filename translation.

Note: You must use this flag whenever you mix directives of different modules that contain URI-to-filename translators. Mixing mod_alias and mod_rewrite is the typical example.

For Apache hackers

If the Apache API had a filename-to-filename hook in addition to the URI-to-filename hook, this flag would not be needed! Without such a hook, however, this flag is the only solution. The Apache Group has discussed this problem and will add such a hook in Apache version 2.0.

skip|S=num (skip next rules)

This flag forces the rewrite engine to skip the next num rules when the current rule matches. It can be used to build a pseudo if-then-else construct: the last rule of the then clause gets skip=N, where N is the number of rules in the else clause. (This is not the same as the 'chain|C' flag!)
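The pseudo if-then-else construct can be sketched like this for a .htaccess file. The script name index.php and its path parameter are hypothetical.

```apache
RewriteEngine on

# "if": the request maps to an existing regular file.
RewriteCond %{REQUEST_FILENAME} -f
# "then": leave the URL unchanged ('-') and skip the next 1 rule.
RewriteRule ^ - [S=1]

# "else": only reached when the rule above did not match.
RewriteRule ^/?(.*)$ /index.php?path=$1 [L]
```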

env|E=VAR:VAL (set environment variable)

This flag sets the environment variable VAR to the value VAL, where VAL can contain the regular-expression back-references $N and %N, which are expanded. The flag can be used more than once to set several variables. The variables can be referenced later in many situations, most commonly from XSSI (via <!--#echo var="VAR"-->) or CGI (for example $ENV{'VAR'}), or through %{ENV:VAR} in a subsequent RewriteCond. Use it to strip information from the URL and remember it.
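A minimal sketch for a server-level configuration: the first rule strips a language prefix and remembers it, and a later condition reads it back. The variable name LANG_CODE and the /german/ path are hypothetical.

```apache
RewriteEngine on

# Strip the language prefix from the URL and store it in the
# environment variable LANG_CODE.
RewriteRule ^/(en|de|fr)/(.*)$ /$2 [E=LANG_CODE:$1]

# A later condition reads the stored value back through %{ENV:...}.
RewriteCond %{ENV:LANG_CODE} =de
RewriteRule ^/(.*)$ /german/$1 [L]
```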

cookie|CO=NAME:VAL:domain[:lifetime[:path]] (set cookie)

This sets a cookie in the client's browser. The cookie's name is NAME and its value is VAL. The domain field is the domain of the cookie, such as '.apache.org'; the optional lifetime is the cookie's lifetime in minutes; and the optional path is the path of the cookie.
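A minimal sketch of the CO flag. The cookie name visited, the domain .example.com and the page welcome.html are hypothetical.

```apache
RewriteEngine on

# Set the cookie "visited" to "yes" for .example.com, valid for
# 1440 minutes (one day) on path /. The '-' leaves the URL unchanged.
RewriteRule ^/?welcome\.html$ - [CO=visited:yes:.example.com:1440:/]
```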

3) RewriteCond TestString CondPattern [flags]

The RewriteCond directive defines a rule condition. A RewriteRule directive can be preceded by one or more RewriteCond directives. The rule is then applied to the current URL only if its Pattern matches and these conditions are met as well.

1. TestString is a plain text string that can also contain:

    • Back-references $N (N = 0 to 9) to the Pattern of the RewriteRule, representing the contents of the Nth parenthesized group in the regular expression of the RewriteRule that follows the RewriteCond directives
    • Back-references %N (N = 0 to 9), representing the contents of the Nth parenthesized group in the CondPattern of the last matched RewriteCond
    • Server variables %{VARNAME}

2. CondPattern is the condition pattern, a regular expression applied to the current instance of TestString. That is, TestString is expanded and then matched against CondPattern. If it matches, the RewriteCond evaluates to true; otherwise it evaluates to false.

You can also use the following special variants of CondPattern (each can be prefixed with '!' to negate the result):

'>CondPattern' (greater than) treats CondPattern as a plain string and compares it with TestString. True if TestString is lexically greater than CondPattern.

'=CondPattern' (equal) treats CondPattern as a plain string and compares it with TestString. True if TestString is exactly the same as CondPattern. If CondPattern is just "" (two quotation marks), TestString is compared with the empty string.

'-d' (is directory) treats TestString as a path name and checks whether it exists and is a directory.

'-f' (is regular file) treats TestString as a path name and checks whether it exists and is a regular file.

'-s' (is regular file with size) treats TestString as a path name and checks whether it exists and is a regular file with a size greater than zero.

'-l' (is symbolic link) treats TestString as a path name and checks whether it exists and is a symbolic link.

'-F' (is existing file via sub-request) checks whether TestString is a valid file that is accessible under all of the server's currently configured access controls for that path. The check uses an internal sub-request, so use it with care because it reduces server performance.

'-U' (is existing URL via sub-request) checks whether TestString is a valid URL that is accessible under all of the server's currently configured access controls for that path. The check uses an internal sub-request, so use it with care because it reduces server performance.
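A short sketch for a .htaccess file that combines a string comparison with a negated file test. The port number and the paths status and /internal/status.html are hypothetical.

```apache
RewriteEngine on

# String comparison: true only if the server port is exactly "8080".
RewriteCond %{SERVER_PORT} =8080
# Negated file test: true if the requested path is not an existing directory.
RewriteCond %{REQUEST_FILENAME} !-d
# Both conditions must hold for the rule to be applied.
RewriteRule ^/?status$ /internal/status.html [L]
```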

3. [flags] is the third argument; multiple flags are separated by commas.

'nocase|NC' (no case) makes the test case-insensitive: upper and lower case are not distinguished in either the expanded TestString or the CondPattern. Note that this flag has no effect on file system and sub-request checks.

'ornext|OR' (or next condition) combines this condition with the next one using OR. By default, consecutive conditions are combined with AND.
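A minimal sketch of NC and OR together. All host names are hypothetical.

```apache
RewriteEngine on

# The rule applies when the host is either of the two names below,
# compared without regard to case.
RewriteCond %{HTTP_HOST} ^www\.example\.org$ [NC,OR]
RewriteCond %{HTTP_HOST} ^example\.org$ [NC]
RewriteRule ^/?(.*)$ http://www.example.com/$1 [R=301,L]
```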

4) Rewrite server variables (only a few are listed)

HTTP headers: HTTP_USER_AGENT, HTTP_REFERER, HTTP_COOKIE, HTTP_HOST, HTTP_ACCEPT

Connection & request: REMOTE_ADDR, QUERY_STRING

Server internals: DOCUMENT_ROOT, SERVER_PORT, SERVER_PROTOCOL

System stuff: TIME_YEAR, TIME_MON, TIME_DAY

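5) Simple regular expression rules

. Matches any single character

[chars] Character class: matches any one of the characters chars

[^chars] Negated character class: matches any character that is not one of chars

text1|text2 Alternative: text1 or text2

? Matches 0 or 1 occurrence of the preceding item

* Matches 0 or more occurrences of the preceding item

+ Matches 1 or more occurrences of the preceding item

^ Anchor: start of the string

$ Anchor: end of the string

\char Escapes the given character, so that it is matched literally

Note: Whether the URL being matched starts with a slash depends on the Apache version and on the context (in server-level rules the leading slash is present, in per-directory .htaccess rules it is stripped), so use ^/? to match both cases.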

4. Example Analysis

Example 1 (simple example):

(in .htaccess, for per-directory rewriting)

RewriteEngine on
RewriteRule ^user/(\w+)/?$ user.php?id=$1

^user/: the requested address must begin with user/

(\w+): captures one or more word characters and passes them on as $1

/?: an optional slash

$: end of the input

Replaced with: user.php?id=$1

Note: Some Apache versions (I forget which) do not support the shorthand \w. In that case write out the character class, for example [a-zA-Z0-9_]+ instead of \w+.

Example 2 (block access from IE and Opera browsers):

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^MSIE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Opera [NC]
# '-' means there is no replacement URL
RewriteRule ^.* - [F,L]

Example 3 (illegal path back to home page):

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L]

Example 4 (anti-theft chain):

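RewriteEngine on
# Check whether the request comes from our own domain
RewriteCond %{HTTP_REFERER} !^http://(.+\.)?mysite\.com/ [NC]
RewriteCond %{HTTP_REFERER} !^$
# Return a warning image instead
RewriteRule .*\.(jpe?g|gif|bmp|png)$ /images/nohotlink.jpg [L]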

Example 5 (Change access URL directory name):

That is, the directory name in the requested URL differs from the real directory name.

# new_dir is the real directory
RewriteRule ^/?old_dir/([a-z\.]+)$ /new_dir/$1 [R=301,L]

Example 6 (create no file suffix link):

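RewriteEngine on
# Check whether a file with this suffix exists (the suffix .php is an assumption; it is not legible in the source)
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^/?([a-zA-Z0-9]+)$ $1.php [L]
# Check whether a file with this suffix exists
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^/?([a-zA-Z0-9]+)$ $1.html [L]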

Example 7 (limit to show pictures only):

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !^.*\.(gif|jpg|jpeg|png)$
RewriteRule .*$ - [F,L]

Example 8 (file does not exist redirect 404):

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .? /404.php [L]

(These are just my own views and summary; if anything is missing or wrong, please point it out.)

That leaves with the wind

Statement: The above represents only the views and conclusions I reached at a certain point in my work and study. When reproducing this article, please include a link to the original on the article page.
