Document retrieved from a search engine cache.
Original document address: http://neptun.sai.msu.su/manual/mod/mod_rewrite.html
Last modified: Thu Nov 20 23:24:50 2003. Indexed: Mon Oct 1 20:06:34 2012.
Apache HTTP Server Version 1.3
Module mod_rewrite
URL Rewriting Engine
This module provides a rule-based rewriting engine to rewrite requested URLs on the fly.
Status: Extension
Source File: mod_rewrite.c
Module Identifier: rewrite_module
Compatibility: Available in Apache 1.2 and later.
Summary
``The great thing about mod_rewrite is it gives you all the configurability and flexibility of Sendmail. The downside to mod_rewrite is that it gives you all the configurability and flexibility of Sendmail.''
-- Brian Behlendorf, Apache Group

``Despite the tons of examples and docs, mod_rewrite is voodoo. Damned cool voodoo, but still voodoo.''
-- Brian Moore, bem@news.cmc.net

Welcome to mod_rewrite, the Swiss Army Knife of URL manipulation!

This module uses a rule-based rewriting engine (based on a regular-expression parser) to rewrite requested URLs on the fly. It supports an unlimited number of rules and an unlimited number of attached rule conditions for each rule, which provides a really flexible and powerful URL manipulation mechanism. The URL manipulations can depend on various tests: server variables, environment variables, HTTP headers, time stamps and even external database lookups in various formats can all be used to achieve really granular URL matching.
This module operates on the full URLs (including the path-info part) both in per-server context (httpd.conf) and per-directory context (.htaccess) and can even generate query-string parts on result. The rewritten result can lead to internal sub-processing, external request redirection or even to an internal proxy throughput.

But all this functionality and flexibility has its drawback: complexity. So don't expect to understand this entire module in just one day.
This module was invented and originally written in April 1996
and gifted exclusively to The Apache Group in July 1997 by
Ralf S. Engelschall
rse@engelschall.com
www.engelschall.com
Table Of Contents
Internal Processing
Configuration Directives
- RewriteEngine
- RewriteOptions
- RewriteLog
- RewriteLogLevel
- RewriteLock
- RewriteMap
- RewriteBase
- RewriteCond
- RewriteRule
Miscellaneous
Internal Processing
The internal processing of this module is very complex but needs to be explained once even to the average user to avoid common mistakes and to let you exploit its full functionality.
API Phases
First you have to understand that when Apache processes an HTTP request it does this in phases. A hook for each of these phases is provided by the Apache API. Mod_rewrite uses two of these hooks: the URL-to-filename translation hook, which is used after the HTTP request has been read but before any authorization starts, and the Fixup hook, which is triggered after the authorization phases and after the per-directory config files (.htaccess) have been read, but before the content handler is activated.

So, after a request comes in and Apache has determined the corresponding server (or virtual server), the rewriting engine starts processing all mod_rewrite directives from the per-server configuration in the URL-to-filename phase. A few steps later, when the final data directories are found, the per-directory configuration directives of mod_rewrite are triggered in the Fixup phase. In both situations mod_rewrite rewrites URLs either to new URLs or to filenames, although there is no obvious distinction between them. This is a usage of the API which was not intended when the API was designed, but as of Apache 1.x this is the only way mod_rewrite can operate. To make this clearer, remember the following two points:
- Although mod_rewrite rewrites URLs to URLs, URLs to filenames and even filenames to filenames, the API currently provides only a URL-to-filename hook. In Apache 2.0 the two missing hooks will be added to make the processing more clear. But this point has no drawbacks for the user, it is just a fact which should be remembered: Apache does more in the URL-to-filename hook than the API intends for it.
- Unbelievably mod_rewrite provides URL manipulations in per-directory context, i.e., within .htaccess files, although these are reached a very long time after the URLs have been translated to filenames. It has to be this way because .htaccess files live in the filesystem, so processing has already reached this stage. In other words: According to the API phases at this time it is too late for any URL manipulations. To overcome this chicken and egg problem mod_rewrite uses a trick: When you manipulate a URL/filename in per-directory context mod_rewrite first rewrites the filename back to its corresponding URL (which is usually impossible, but see the RewriteBase directive below for the trick to achieve this) and then initiates a new internal sub-request with the new URL. This restarts processing of the API phases.

  Again mod_rewrite tries hard to make this complicated step totally transparent to the user, but you should remember here: While URL manipulations in per-server context are really fast and efficient, per-directory rewrites are slow and inefficient due to this chicken and egg problem. But on the other hand this is the only way mod_rewrite can provide (locally restricted) URL manipulations to the average user.
Don't forget these two points!
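As a rough illustration of the two contexts, the same rewrite might be written as shown below. This is only a sketch: the directory /old and the file names are hypothetical, and it assumes the .htaccess file lives in the directory that is served as /old/.

```apache
# Per-server context (httpd.conf): the Pattern sees the full URL path.
RewriteEngine on
RewriteRule   ^/old/foo\.html$  /old/bar.html

# Per-directory context (.htaccess inside the hypothetical directory
# served as /old/): the directory prefix is removed before matching,
# and RewriteBase names the URL prefix to put back before the
# internal sub-request is started.
RewriteEngine on
RewriteBase   /old
RewriteRule   ^foo\.html$  bar.html
```

The second form is the one that pays the sub-request cost described above, so the per-server form is preferable whenever you have access to httpd.conf.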
Ruleset Processing
Now when mod_rewrite is triggered in these two API phases, it reads the configured rulesets from its configuration structure (which itself was either created on startup for per-server context or during the directory walk of the Apache kernel for per-directory context). Then the URL rewriting engine is started with the contained ruleset (one or more rules together with their conditions). The operation of the URL rewriting engine itself is exactly the same for both configuration contexts. Only the final result processing is different.

The order of rules in the ruleset is important because the rewriting engine processes them in a special (and not very obvious) order. The rule is this: The rewriting engine loops through the ruleset rule by rule (RewriteRule directives) and when a particular rule matches it optionally loops through existing corresponding conditions (RewriteCond directives). For historical reasons the conditions are given first, and so the control flow is a little bit long-winded. See Figure 1 for more details.
Figure 1: The control flow through the rewriting ruleset

As you can see, first the URL is matched against the Pattern of each rule. When it fails mod_rewrite immediately stops processing this rule and continues with the next rule. If the Pattern matches, mod_rewrite looks for corresponding rule conditions. If none are present, it just substitutes the URL with a new value which is constructed from the string Substitution and goes on with its rule-looping. But if conditions exist, it starts an inner loop for processing them in the order that they are listed. For conditions the logic is different: we don't match a pattern against the current URL. Instead we first create a string TestString by expanding variables, back-references, map lookups, etc. and then we try to match CondPattern against it. If the pattern doesn't match, the complete set of conditions and the corresponding rule fails. If the pattern matches, then the next condition is processed until no more conditions are available. If all conditions match, processing is continued with the substitution of the URL with Substitution.
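The control flow described above can be sketched with a small per-server ruleset. The host name and URL paths are made up for this illustration; only the directive syntax and the two server variables are taken from mod_rewrite itself.

```apache
RewriteEngine on

# 1. The Pattern ^/docs/(.+)$ of the rule is matched against the URL first,
#    even though the conditions are written above it.
# 2. Only if the Pattern matches are the conditions tested, top to bottom.
# 3. Only if every condition matches is the Substitution applied.
RewriteCond  %{HTTP_USER_AGENT}  ^Mozilla
RewriteCond  %{REMOTE_HOST}      \.example\.com$
RewriteRule  ^/docs/(.+)$        /docs-new/$1

# A rule without conditions: the substitution happens as soon as the
# Pattern matches. It sees the URL as left behind by the rule above.
RewriteRule  ^/docs-new/index$   /docs-new/index.html
```

Because rule-looping continues after a substitution, the order of the two rules matters: swapping them would mean the second rule never sees a URL rewritten by the first.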
Quoting Special Characters
As of Apache 1.3.20, special characters in TestString and Substitution strings can be escaped (that is, treated as normal characters without their usual special meaning) by prefixing them with a backslash ('\') character. In other words, you can include an actual dollar-sign character in a Substitution string by using '\$'; this keeps mod_rewrite from trying to treat it as a backreference.
Regex Back-Reference Availability
One important thing here has to be remembered: Whenever you use parentheses in Pattern or in one of the CondPattern, back-references are internally created which can be used with the strings $N and %N (see below). These are available for creating the strings Substitution and TestString. Figure 2 shows to which locations the back-references are transferred for expansion.
Figure 2: The back-reference flow through a rule

We know this was a crash course on mod_rewrite's internal processing. But you will benefit from this knowledge when reading the following documentation of the available directives.
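To make the back-reference flow and the escaping rule more concrete, here is a minimal sketch in per-server context. The host names, the /sites and /cgi-bin/show paths and the query parameter are assumptions chosen for the example, not part of the original documentation.

```apache
RewriteEngine on

# %1 is the first parenthesized group of the matched CondPattern
# (the middle label of the host name); $1 is the first group of the
# rule's Pattern. Both are expanded inside the Substitution.
RewriteCond  %{HTTP_HOST}  ^www\.([^.]+)\.example\.com$
RewriteRule  ^/(.*)$       /sites/%1/$1

# $1 from the rule's Pattern can also be used to build a TestString.
RewriteCond  $1            ^public/
RewriteRule  ^/files/(.*)$ /archive/$1

# Apache 1.3.20 and later: \$ yields a literal dollar sign instead of
# starting a back-reference, so the query string becomes text=$100.
RewriteRule  ^/cost$       /cgi-bin/show?text=\$100
```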
Configuration Directives
RewriteEngine
Syntax: RewriteEngine on|off
Default: RewriteEngine off
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite.c
Compatibility: Apache 1.2
The RewriteEngine directive enables or disables the runtime rewriting engine. If it is set to off this module does no runtime processing at all. It does not even update the SCRIPT_URx environment variables.

Use this directive to disable the module instead of commenting out all the RewriteRule directives!

Note that, by default, rewrite configurations are not inherited. This means that you need to have a RewriteEngine on directive for each virtual host in which you wish to use it.
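A short sketch of what this means in practice follows. The addresses, server names and the rule itself are placeholders invented for the example.

```apache
# Main server: rewriting is enabled here only.
RewriteEngine on
RewriteRule   ^/old\.html$  /new.html

<VirtualHost 192.0.2.10>
    ServerName www.example.com
    # Not inherited from the main server: without this line the
    # rule below would be ignored for this virtual host.
    RewriteEngine on
    RewriteRule   ^/old\.html$  /new.html
</VirtualHost>

<VirtualHost 192.0.2.11>
    ServerName test.example.com
    # The rule stays in place, but the engine is switched off
    # instead of commenting the rule out.
    RewriteEngine off
    RewriteRule   ^/old\.html$  /new.html
</VirtualHost>
```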
RewriteOptions
Syntax: RewriteOptions Option
Default: RewriteOptions MaxRedirects=10
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite.c
Compatibility: Apache 1.2; MaxRedirects is available in Apache 1.3.28 and later
The RewriteOptions directive sets some special options for the current per-server or per-directory configuration. The Option strings can be one of the following:
- inherit
  This forces the current configuration to inherit the configuration of the parent. In per-virtual-server context this means that the maps, conditions and rules of the main server are inherited. In per-directory context this means that conditions and rules of the parent directory's .htaccess configuration are inherited.
- MaxRedirects=number
  In order to prevent endless loops of internal redirects issued by per-directory RewriteRules, mod_rewrite aborts the request after reaching a maximum number of such redirects and responds with a 500 Internal Server Error. If you really need more than 10 internal redirects per request, you may increase the default to the desired value.
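Both options might be used as in the following sketch. The virtual host address, server name, rule and the limit of 20 are arbitrary values picked for illustration; MaxRedirects requires Apache 1.3.28 or later.

```apache
# Main server configuration.
RewriteEngine on
RewriteRule   ^/old\.html$  /new.html

<VirtualHost 192.0.2.10>
    ServerName www.example.com
    RewriteEngine  on
    # Pull in the maps, conditions and rules of the main server.
    RewriteOptions inherit
</VirtualHost>

# In a .htaccess file whose rules legitimately need more than the
# default of 10 internal redirects per request:
RewriteEngine  on
RewriteOptions MaxRedirects=20
```

Raising the limit only hides a looping ruleset, so it is usually worth checking the rules first.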
RewriteLog
Syntax: RewriteLog file-path
Default: None
Context: server config, virtual host
Override: Not applicable
Status: Extension
Module: mod_rewrite.c
Compatibility: Apache 1.2
The RewriteLog directive sets the name of the file to which the server logs any rewriting actions it performs. If the name does not begin with a slash ('/') then it is assumed to be relative to the Server Root. The directive should occur only once per server config.

Note: To disable the logging of rewriting actions it is not recommended to set file-path to /dev/null, because although the rewriting engine does not then output to a logfile it still creates the logfile output internally. This will slow down the server with no advantage to the administrator! To disable logging either remove or comment out the RewriteLog directive or use RewriteLogLevel 0!

Security: See the Apache Security Tips document for details on why your security could be compromised if the directory where logfiles are stored is writable by anyone other than the user that starts the server.

Example:
RewriteLog "/usr/local/var/apache/logs/rewrite.log"
RewriteLogLevel
Syntax: RewriteLogLevel Level
Default: RewriteLogLevel 0
Context: server config, virtual host
Override: Not applicable
Status: Extension
Module: mod_rewrite.c
Compatibility: Apache 1.2
The RewriteLogLevel directive sets the verbosity level of the rewriting logfile. The default level 0 means no logging, while 9 or more means that practically all actions are logged.

To disable the logging of rewriting actions simply set Level to 0. This disables all rewrite action logs.

Notice: Using a high value for Level will slow down your Apache server dramatically! Use the rewriting logfile at a Level greater than 2 only for debugging!

Example:
RewriteLogLevel 3
RewriteLock
Syntax: RewriteLock file-path
Default: None
Context: server config
Override: Not applicable
Status: Extension
Module: mod_rewrite.c
Compatibility: Apache 1.3
This directive sets the filename for a synchronization lockfile which mod_rewrite needs to communicate with RewriteMap programs. Set this lockfile to a local path (not on an NFS-mounted device) when you want to use a rewriting map-program. It is not required for other types of rewriting maps.
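A minimal sketch of how the lockfile is typically paired with a map program is shown below. Both paths and the map name are hypothetical; the RewriteMap directive itself is described in the next section.

```apache
# Lock file on a local disk (hypothetical path), declared once in the
# server config. It is needed only because a prg: map is used below.
RewriteLock  /var/lock/apache/rewrite.lock

# Hypothetical external map program that relies on the lock.
RewriteMap   lookup  prg:/usr/local/apache/bin/lookup.pl
```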
RewriteMap
Syntax: RewriteMap MapName MapType:MapSource
Default: not used by default
Context: server config, virtual host
Override: Not applicable
Status: Extension
Module: mod_rewrite.c
Compatibility: Apache 1.2 (partially), Apache 1.3
The RewriteMap directive defines a Rewriting Map which can be used inside rule substitution strings by the mapping-functions to insert/substitute fields through a key lookup. The source of this lookup can be of various types.

The MapName is the name of the map and will be used to specify a mapping-function for the substitution strings of a rewriting rule via one of the following constructs:

    ${MapName:LookupKey}
    ${MapName:LookupKey|DefaultValue}

When such a construct occurs the map MapName is consulted and the key LookupKey is looked-up. If the key is found, the map-function construct is substituted by SubstValue. If the key is not found then it is substituted by DefaultValue or by the empty string if no DefaultValue was specified.

The following combinations for MapType and MapSource can be used:
- Standard Plain Text
  MapType: txt, MapSource: Unix filesystem path to valid regular file

  This is the standard rewriting map feature where the MapSource is a plain ASCII file containing either blank lines, comment lines (starting with a '#' character) or pairs like the following, one per line:

      MatchingKey SubstValue

  Example:

      ##
      ##  map.txt -- rewriting map
      ##

      Ralf.S.Engelschall    rse   # Bastard Operator From Hell
      Mr.Joe.Average        joe   # Mr. Average

      RewriteMap real-to-user txt:/path/to/file/map.txt

- Randomized Plain Text
  MapType: rnd, MapSource: Unix filesystem path to valid regular file

  This is identical to the Standard Plain Text variant above but with a special post-processing feature: After looking up a value it is parsed according to contained ``|'' characters which have the meaning of ``or''. In other words they indicate a set of alternatives from which the actual returned value is chosen randomly. Although this sounds crazy and useless, it was actually designed for load balancing in a reverse proxy situation where the looked up values are server names.

  Example:

      ##
      ##  map.txt -- rewriting map
      ##

      static   www1|www2|www3|www4
      dynamic  www5|www6

      RewriteMap servers rnd:/path/to/file/map.txt

- Hash File
  MapType: dbm, MapSource: Unix filesystem path to valid regular file

  Here the source is a binary NDBM format file containing the same contents as a Plain Text format file, but in a special representation which is optimized for really fast lookups. You can create such a file with any NDBM tool or with the following Perl script:

      #!/path/to/bin/perl
      ##
      ##  txt2dbm -- convert txt map to dbm format
      ##

      use NDBM_File;
      use Fcntl;

      ($txtmap, $dbmmap) = @ARGV;

      open(TXT, "<$txtmap") or die "Couldn't open $txtmap!\n";
      tie (%DB, 'NDBM_File', $dbmmap, O_RDWR|O_TRUNC|O_CREAT, 0644)
          or die "Couldn't create $dbmmap!\n";

      while (<TXT>) {
          next if (/^\s*#/ or /^\s*$/);
          $DB{$1} = $2 if (/^\s*(\S+)\s+(\S+)/);
      }

      untie %DB;
      close(TXT);

      $ txt2dbm map.txt map.db

- Internal Function
  MapType: int, MapSource: Internal Apache function

  Here the source is an internal Apache function. Currently you cannot create your own, but the following functions already exist:
  - toupper:
    Converts the looked up key to all upper case.
  - tolower:
    Converts the looked up key to all lower case.
  - escape:
    Translates special characters in the looked up key to hex-encodings.
  - unescape:
    Translates hex-encodings in the looked up key back to special characters.

- External Rewriting Program
  MapType: prg, MapSource: Unix filesystem path to valid regular file

  Here the source is a program, not a map file. To create it you can use the language of your choice, but the result has to be an executable (i.e., either object-code or a script with the magic cookie trick '#!/path/to/interpreter' as the first line).

  This program is started once at startup of the Apache servers and then communicates with the rewriting engine over its stdin and stdout file-handles. For each map-function lookup it will receive the key to lo