Apache URL Rewriting Rules
1. Introduction
Apache's rewrite capability is provided by mod_rewrite, an Apache module. It is very powerful and can manipulate every part of a URL.
With it we can rewrite URLs so that users see short, clean addresses; when a user requests such a URL, mod_rewrite converts it to the real resource path. mod_rewrite can implement many features, such as hiding real addresses, URL redirection, domain-name redirection, hotlink protection (anti-leeching), and restricting access by resource type.
2. Workflow
The mod_rewrite module hooks into two phases of Apache's request processing.
The first is the URL-to-filename translation hook. When a request arrives, Apache first determines the corresponding server (or virtual host). mod_rewrite then starts working: in this phase it processes the mod_rewrite directives from the per-server (global or virtual-host) configuration and rewrites the URL accordingly.
The second is the Fixup hook. In this phase mod_rewrite processes the per-directory settings, for example those in a directory's .htaccess file. By this point URL translation (URL to filename) has already finished, so strictly speaking a URL can no longer be rewritten at directory level. To get around this, mod_rewrite converts the translated filename back into a URL, applies the directory-level rules, and then re-processes the result as a new internal request.
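One practical consequence of the two phases is that the same rewrite is written slightly differently depending on where it lives. The following is a minimal sketch, assuming hypothetical paths /old and /new and a document root of /var/www/html; the two halves belong in two different files.

```apache
# --- Server or virtual-host context (e.g. httpd.conf) ---
# The pattern is matched against the full URL path, including the leading slash.
RewriteEngine On
RewriteRule ^/old/(.*)$ /new/$1 [L]

# --- Per-directory context (/var/www/html/.htaccess, hypothetical location) ---
# The directory prefix, including the leading slash, is stripped before matching.
RewriteEngine On
RewriteRule ^old/(.*)$ new/$1 [L]
```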
Rewrite module ruleset processing
When mod_rewrite is triggered in these two API phases, it reads the configured ruleset from its configuration structure (built at server startup for the per-server context, or collected during the directory walk for the per-directory context). The URL rewriting engine is then started on that ruleset (one or more rules together with their conditions). Server-level and directory-level rulesets are processed by the same rewriting engine; only the handling of the final result differs.
The order of rules in a ruleset matters, because the rewriting engine processes them in a particular order: it walks through the rules (RewriteRule directives) one by one, and when a rule's pattern matches, it then checks any conditions attached to that rule (RewriteCond directives). For historical reasons the conditions are written before the rule, so the control flow is somewhat roundabout. See Figure 1 for details.
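The evaluation order described above can be illustrated with a short sketch for a .htaccess file. The file names viewer.php and index.html are hypothetical and only serve the illustration.

```apache
RewriteEngine On

# Step 1: the pattern ^docs/(.*)$ of the RewriteRule is tested against the URL.
# Step 2: only if it matches are the two conditions written above it checked.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^docs/(.*)$ viewer.php?page=$1 [L]

# Reached only when the rule above was not applied
# (its pattern or one of its conditions did not match).
RewriteRule ^docs/?$ index.html [L]
```

Because the conditions belong to the rule that follows them, moving a RewriteCond below its RewriteRule would attach it to the next rule instead.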
    # Enable the mod_rewrite engine
    RewriteEngine on
    # Base URL for per-directory rewrites (required when the directory is reached through an Alias)
    RewriteBase path
    # Rewrite condition (several conditions may precede one rule)
    RewriteCond TestString CondPattern [flags]
    # Rewrite rule
    RewriteRule Pattern Substitution [flags]
The RewriteCond and RewriteRule lines can be repeated. RewriteRule directives are executed one by one, in order, unless a flag ends processing. These are the most commonly used directives; for the rarer ones, consult the documentation.
2) RewriteRule Pattern Substitution [flags]
1. Pattern is a Perl-compatible regular expression applied to the current URL. "Current" means the value of the URL at the moment the rule is applied, which may differ from the requested URL because earlier RewriteRule or Alias directives may already have modified it.
2. Substitution is the string that replaces the URL when Pattern matches. It may contain:
- Backreferences $N (N = 0-9): the text matched by the Nth pair of parentheses in the RewriteRule Pattern.
- Backreferences %N (N = 0-9): the text matched by the Nth pair of parentheses in the last matched RewriteCond pattern.
- Server variables: %{VARNAME}
- Mapping-function calls: ${mapname:key|default} (maps are defined with the RewriteMap directive)
3. [flags] is an optional third argument; multiple flags are separated by commas.
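The four kinds of expansion that Substitution accepts can be combined in one rule. The sketch below is for server or virtual-host context, since RewriteMap is not allowed in .htaccess; the map name usermap, its file path, and the example.com host names are assumptions made for illustration.

```apache
RewriteEngine On

# Hypothetical text map: each line of the file has the form "key value".
RewriteMap usermap txt:/etc/apache2/usermap.txt

# %1 will refer to the first group of this (last matched) RewriteCond pattern.
RewriteCond %{HTTP_HOST} ^(\w+)\.example\.com$ [NC]

# $1                    first group of the RewriteRule pattern
# ${usermap:%1|guest}   looks up %1 in the map, falling back to "guest"
# %{HTTP_HOST}          server variable
RewriteRule ^/profile/(.*)$ /users/${usermap:%1|guest}/$1?host=%{HTTP_HOST} [L]
```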
Flags (excerpted from online sources):
redirect|R[=code] (force redirect)
Prefixes Substitution with http://thishost[:thisport]/ (which makes the new URL a URI) to force an external redirect. If no code is given, an HTTP 302 (Moved Temporarily) response is sent. To use another response code in the 3xx range, specify it as a number, or use one of the symbolic names temp (the default), permanent, or seeother. Use this flag for rules that canonicalize the URL and send it back to the client, for example translating "/~" into "/u/", or always appending a slash to /u/user.
Note: When you use this flag, make sure the substitution is a valid URL; otherwise the redirect will point to an invalid location. Also remember that the flag by itself only prefixes the URL with http://thishost[:thisport]/; rewriting continues. Usually you want to stop rewriting and redirect immediately, so also add the 'L' flag.
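A minimal sketch of the R flag in a .htaccess file, with hypothetical page names, might look like this:

```apache
RewriteEngine On

# Permanent (301) redirect; L stops further rule processing for this request.
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]

# Without an explicit code, R sends a 302 (temporary) redirect.
RewriteRule ^promo$ /sale/summer.html [R,L]
```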
forbidden|F (force the URL to be forbidden)
Forces the current URL to be forbidden, i.e. an HTTP 403 (Forbidden) response is sent immediately. Combine this flag with suitable RewriteCond directives to block URLs conditionally.
gone|G (force the URL to be gone)
Forces the current URL to be gone, i.e. an HTTP 410 (Gone) response is sent immediately. Use this flag to mark pages that no longer exist.
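Both flags are normally used with "-" as the substitution, because no rewritten URL is needed. A short sketch for .htaccess, using hypothetical file names:

```apache
RewriteEngine On

# 403 Forbidden for any request whose path ends in .bak.
RewriteRule \.bak$ - [F]

# 410 Gone for a page that has been retired.
RewriteRule ^retired\.html$ - [G]
```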
proxy|P (force proxy)
Forces the substitution to be handled internally as a proxy request and immediately passes it to the proxy module (rewrite processing stops here). Make sure the substitution string is a valid URI (typically one starting with http://hostname) that the Apache proxy module can handle. With this flag you can map remote content into the local server's namespace, a more powerful form of the ProxyPass directive.
Note: To use this flag, the proxy module must be available in your Apache server. If you are not sure, check whether mod_proxy.c appears in the output of "httpd -l". If it does, mod_rewrite can use this flag. If not, you must rebuild "httpd" with mod_proxy enabled.
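As a rough sketch, a proxying rule in .htaccess could look like the following. The backend address 127.0.0.1:8080 and the app/ prefix are assumptions; the rule also assumes that mod_proxy and mod_proxy_http are loaded.

```apache
RewriteEngine On

# Forward /app/... to a hypothetical backend and relay its response to the client.
RewriteRule ^app/(.*)$ http://127.0.0.1:8080/$1 [P]
```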
last|L (last rule)
Stops rewriting immediately; no further rules are applied. This corresponds to the last command in Perl or the break statement in C. Use it to prevent an already rewritten URL from being rewritten again by later rules. For example, use it to rewrite the root URL ('/') to a real one such as '/e/www/'.
next|N (next round)
Re-runs the rewriting process, starting again with the first rule. The URL processed this time is not the original URL but the one produced by the last rewrite rule. This corresponds to the next command in Perl or the continue statement in C. Use this flag to restart rewriting, i.e. to jump immediately back to the top of the loop.
But be careful not to create an endless loop!
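A classic use of N is to repeat a single-character replacement until nothing is left to replace. The sketch below follows the pattern shown in the Apache documentation; the loop ends by itself because the rule stops matching once no "A" remains.

```apache
RewriteEngine On

# Replace every "A" in the URL path with "B", one occurrence per round.
RewriteRule (.*)A(.*) $1B$2 [N]
```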
chain|C (chained with next rule)
Chains the current rule to the next one (which may itself be chained to its successor, and so on). The effect is this: if the rule matches, processing continues as usual and the flag has no effect; if the rule does not match, all following chained rules are skipped. For example, in a per-directory ruleset you can use it to remove the ".www" part when performing an external redirect (where ".www" should not appear).
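A small sketch of chaining in .htaccess, with hypothetical paths, shows which rules depend on each other:

```apache
RewriteEngine On

# If this rule does not match, the chained rule below it is skipped as well.
RewriteRule ^archive/(\d{4})/(.*)$ $2?year=$1 [C]
RewriteRule ^(.*)\.html$ $1.php [L]

# Not part of the chain: evaluated regardless of the two rules above,
# unless L has already ended processing.
RewriteRule ^about$ about.php [L]
```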
type|T=MIME-type (force MIME type)
Forces the MIME type of the target file to be MIME-type. For example, this can simulate the ScriptAlias directive of mod_alias, which internally forces all files in the mapped directory to have the MIME type "application/x-httpd-cgi".
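A simple sketch of the T flag, modeled on an example in the Apache documentation, serves Perl source files as plain text instead of executing them:

```apache
RewriteEngine On

# "-" leaves the URL unchanged; only the MIME type of the response is forced.
RewriteRule \.pl$ - [T=text/plain]
```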
nosubreq|NS (not for internal sub-requests)
Makes the rewriting engine skip the rule when the current request is an internal sub-request. Sub-requests occur, for example, when mod_include tries to find a possible directory default file (index.xxx). Applying the complete ruleset to sub-requests is not always useful and can even cause errors, so use this flag to exclude certain rules.
A rule of thumb: if you prefix URLs with a CGI script so that the script processes them, you are likely to run into problems (or at least overhead) on sub-requests; in that case, use this flag.
nocase|NC (case-insensitive)
Makes Pattern case-insensitive: when Pattern is matched against the current URL, 'A-Z' and 'a-z' are treated the same.
qsappend|QSA (query string append)
Makes the rewriting engine append the query-string part of the substitution to the existing query string instead of replacing it. Use this when you want a rewrite rule to add data to the query string.
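The effect of QSA is easiest to see on a rule whose substitution already contains a query string. The sketch below follows an example from the Apache documentation; page.php is a placeholder script name.

```apache
RewriteEngine On

# With QSA:    /pages/123?one=two  is mapped to  /page.php?page=123&one=two
# Without QSA: the original query string (one=two) would be discarded.
RewriteRule ^pages/(.+)$ /page.php?page=$1 [QSA]
```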
noescape|NE (no URI escaping of output)
Prevents mod_rewrite from applying the usual URI escaping rules to the rewrite result. Normally special characters (such as '%', '$', ';') are escaped into their hex-code equivalents; this flag prevents that, so that characters such as percent signs can appear in the output. For example:
    RewriteRule /foo/(.*) /bar?arg=P1\%3d$1 [R,NE]
redirects '/foo/zed' to a safe request for '/bar?arg=P1=zed'.
passthrough|PT (pass through to next handler)
Makes the rewriting engine set the uri field of the internal request_rec structure to the value of the filename field. This small change makes it possible to post-process the output of RewriteRule with Alias, ScriptAlias, Redirect, and other directives of URI-to-filename translators. For example, to rewrite /abc to /def with mod_rewrite and then /def to /ghi with mod_alias:
    RewriteRule ^/abc(.*) /def$1 [PT]
    Alias /def /ghi
If you omit the PT flag, mod_rewrite still does its job: as an API-compliant URI-to-filename translator it rewrites uri=/abc/... to filename=/def/.... But the URI-to-filename translation that mod_alias attempts afterwards will then fail.
Note: You must use this flag whenever you mix directives from different modules that contain URI-to-filename translators. Mixing mod_alias and mod_rewrite is the typical example.
For Apache hackers
If the Apache API had a filename-to-filename hook in addition to the URI-to-filename hook, this flag would not be needed. Without such a hook, it is the only solution. The Apache Group discussed this problem and planned to add such a hook in Apache 2.0.
skip|S=num (skip next rules)
Makes the rewriting engine skip the next num rules if the current rule matches. Use it to build a pseudo if-then-else construct: the last rule of the then-clause gets skip=N, where N is the number of rules in the else-clause. (This is not the same as the 'chain|C' flag!)
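The skip flag can also act as a simple "if": when the conditions hold, a block of rules is jumped over. The sketch below is adapted from the Apache documentation; images.php and docs.php are placeholder script names.

```apache
RewriteEngine On

# Is the request for something that is neither an existing file nor a directory?
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# If so, skip the next two rules.
RewriteRule .? - [S=2]

# Applied only to requests for existing files or directories.
RewriteRule (.*\.gif) images.php?$1
RewriteRule (.*\.html) docs.php?$1
```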
env|E=VAR:VAL (set environment variable)
Sets the environment variable VAR to the value VAL, where VAL may contain the regular-expression backreferences $N and %N, which are expanded. The flag can be used more than once to set several variables. The variables can later be dereferenced in many situations, most commonly from XSSI (via <!--#echo var="VAR"-->) or CGI (for example $ENV{'VAR'}); you can also reference them in a later RewriteCond via %{ENV:VAR}. Use this to strip information from a URL while still remembering it.
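One common use of an environment variable set this way is to exclude certain requests from the access log. The sketch below follows an example in the Apache documentation and belongs in server or virtual-host context, because CustomLog is not allowed in .htaccess; the variable name image and the log path are illustrative.

```apache
RewriteEngine On

# Mark image requests; "-" leaves the URL unchanged.
RewriteRule \.(png|gif|jpg)$ - [E=image:1]

# Log every request except those for which the variable is set.
CustomLog logs/access_log combined env=!image
```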
cookie|CO=NAME:VAL:domain[:lifetime[:path]] (set cookie)
Sets a cookie in the client's browser. The cookie's name is NAME and its value is VAL. The domain field is the cookie's domain, such as '.apache.org'. The optional lifetime is the cookie's lifetime in minutes, and the optional path is the cookie's path.
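A brief sketch of the CO flag, modeled on the Apache documentation, for server or virtual-host context; the cookie name frontdoor and the domain .example.com are placeholders.

```apache
RewriteEngine On

# Set cookie "frontdoor=yes" for .example.com, valid for 1440 minutes (24 hours)
# on all paths; "-" leaves the requested URL unchanged.
RewriteRule ^/index\.html - [CO=frontdoor:yes:.example.com:1440:/]
```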
3) RewriteCond TestString CondPattern [flags]
The RewriteCond directive defines a rule condition. One or more RewriteCond directives can precede a RewriteRule directive. The rule that follows is applied only if its Pattern matches the current URL and these conditions are also met.
1. TestString is a plain-text string that may also contain the following constructs:
- Backreferences $N (N = 0-9): the text matched by the Nth pair of parentheses in the Pattern of the RewriteRule that follows the RewriteCond.
- Backreferences %N (N = 0-9): the text matched by the Nth pair of parentheses in the CondPattern of the last matched RewriteCond.
- Server variables: %{VARNAME}
2. CondPattern is the condition pattern, a regular expression applied to the current TestString: TestString is expanded first and then matched against CondPattern. If it matches, the RewriteCond is true; otherwise it is false.
CondPattern can also take the following special forms (any of them can be prefixed with '!' to negate the result):
'>CondPattern' (greater than): treats CondPattern as a plain string and compares it lexicographically with TestString. True if TestString is greater than CondPattern.
'=CondPattern' (equal to): treats CondPattern as a plain string and compares it with TestString. True if TestString is identical to CondPattern. If CondPattern is just "" (two quotation marks), TestString must be the empty string for the condition to be true.
'-d' (is directory): treats TestString as a pathname and tests whether it exists and is a directory.
'-f' (is regular file): treats TestString as a pathname and tests whether it exists and is a regular file.
'-s' (is regular file with size): treats TestString as a pathname and tests whether it exists and is a regular file with a size greater than 0.
'-l' (is symbolic link): treats TestString as a pathname and tests whether it exists and is a symbolic link.
'-F' (is existing file, via sub-request): tests whether TestString is a valid file that is accessible under all of the server's currently configured access controls. This check uses an internal sub-request, so use it with care because it can reduce server performance.
'-U' (is existing URL, via sub-request): tests whether TestString is a valid URL that is accessible under all of the server's currently configured access controls. This check uses an internal sub-request, so use it with care because it can reduce server performance.
3. [flags] is the third argument; multiple flags are separated by commas.
'nocase|NC' (case-insensitive): the comparison between the expanded TestString and CondPattern ignores case. This flag has no effect on filesystem and sub-request checks.
'ornext|OR' (or next condition): conditions are combined with AND by default; this flag combines the current condition with the next one using OR instead.
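The following sketch for .htaccess combines the pieces described in this section: a file test, the NC and OR flags, and the default AND between conditions. The host names are placeholders.

```apache
RewriteEngine On

# Either host name may match (OR); both comparisons ignore case (NC).
RewriteCond %{HTTP_HOST} ^example\.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^www\.example\.org$ [NC]
# Combined with AND (the default): the request must also not be an existing file.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

The conditions are read as (first OR second) AND third, because OR only links a condition to the one directly after it.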
4) Rewrite server variables (only a few are listed)
HTTP headers: HTTP_USER_AGENT, HTTP_REFERER, HTTP_COOKIE, HTTP_HOST, HTTP_ACCEPT
Connection & request: REMOTE_ADDR, QUERY_STRING
Server internals: DOCUMENT_ROOT, SERVER_PORT, SERVER_PROTOCOL
System stuff: TIME_YEAR, TIME_MON, TIME_DAY
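Server variables are most often used as the TestString of a RewriteCond. A small sketch for .htaccess using two of the variables listed above; the client address (from a documentation range) and the item paths are hypothetical.

```apache
RewriteEngine On

# Refuse requests from one specific client address.
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.10$
RewriteRule .* - [F]

# Move a query-string parameter into the path: %1 is the captured id,
# and the trailing "?" drops the original query string.
RewriteCond %{QUERY_STRING} (?:^|&)id=(\d+)(?:&|$)
RewriteRule ^item$ /items/%1? [R=302,L]
```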
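5) Simple regular expression rules
.            matches any single character
[chars]      character class: matches any one of chars
[^chars]     negated class: matches any character not in chars
text1|text2  alternation: matches text1 or text2
?            matches 0 or 1 of the preceding element
*            matches 0 or more of the preceding element
+            matches 1 or more of the preceding element
^            start-of-string anchor
$            end-of-string anchor
\char        escapes the given character (for example, "\." matches a literal dot)
[Note]: Apache 1 is commonly said to require a leading slash in the pattern and Apache 2 to disallow it. In practice the difference comes from context: server-level rules see the URL path with its leading "/", while .htaccess rules see it with the slash stripped. Starting a pattern with ^/? works in both cases.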
4. Example Analysis
Example 1 (a simple example):
(Rewrite rules in .htaccess)
    RewriteEngine On
    RewriteRule ^user/(\w+)/?$ user.php?id=$1
^user/: the request path must begin with user/
(\w+): captures one or more word characters (letters, digits, underscore) into $1
/?: an optional trailing slash
$: end of the string
Rewritten to: user.php?id=$1, where $1 is the captured value
Note: some Apache versions do not support the \w shorthand; in that case use an explicit character class such as [a-zA-Z0-9_]+ instead.
Example 2 (block the Internet Explorer and Opera browsers):
    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} ^MSIE [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Opera [NC]
    # "-" means the URL is not substituted
    RewriteRule ^.* - [F,L]
Example 3 (route paths that do not exist to the front page script):
    RewriteEngine On
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ index.php/$1 [L]
Example 4 (hotlink protection):
    RewriteEngine On
    # The request does not come from your own domain
    RewriteCond %{HTTP_REFERER} !^http://(.+\.)?mysite\.com/ [NC]
    # and the Referer header is not empty
    RewriteCond %{HTTP_REFERER} !^$
    # Return a warning image instead
    RewriteRule .*\.(jpe?g|gif|bmp|png)$ /images/nohotlink.jpg [L]
Example 5 (change the directory name in the access URL):
Hiding the real directory name
    RewriteEngine On
    # new_dir is the real directory
    RewriteRule ^/?old_dir/([a-z\.]+)$ new_dir/$1 [R=301,L]
Example 6 (create links without a file suffix):
    RewriteEngine On
    # Check whether a file with the .php suffix exists
    RewriteCond %{REQUEST_FILENAME}.php -f
    RewriteRule ^/?([a-zA-Z0-9]+)$ $1.php [L]
    # Check whether a file with the .html suffix exists
    RewriteCond %{REQUEST_FILENAME}.html -f
    RewriteRule ^/?([a-zA-Z0-9]+)$ $1.html [L]
Example 7 (allow access to image files only):
    RewriteEngine On
    RewriteCond %{REQUEST_FILENAME} !^.*\.(gif|jpg|jpeg|png|swf)$
    RewriteRule .*$ - [F,L]
Example 8 (send requests for files that do not exist to a 404 page):
    RewriteEngine On
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule .? /404.php [L]
(The above are my own views and conclusions. If you find any shortcomings or errors, please point them out.)
Author: The leaf goes with the wind
Statement: The above only represents views and conclusions I reached at a particular time in my work and study. When reprinting, please clearly provide a link to the original article on the article page.