Apache 2.2 + mod_encoding: solving the URL encoding problem for Chinese file names

Source: Internet
Author: User
Tags: file, url

A plea for help like this shows up on forums all the time: why can't I open files with Chinese names on my website? Some kind soul then replies: in IE6, go to Tools, Internet Options, Advanced, and uncheck "Always send URLs as UTF-8". And the problem goes away.
Why is that?

Consider an example.
Suppose you enter a path like http://hi.baidu.com/uroot/中文.mp3 in the browser. Because the URL contains Chinese characters,
the browser encodes "中文" in %hh form (percent-encoding). So does the HTTP client encode the bytes as GBK or as UTF-8?
IE has an option, "Always send URLs as UTF-8", which is on by default. But not all HTTP clients behave this way. Firefox, for example, encodes directly in GBK (assuming the operating system is the Simplified Chinese version of Windows).
As a result, the Apache server may receive requests in different encodings even though they all ask for the same file.
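
To make the difference concrete, here is a minimal Python sketch of the two encodings a client might send. The file name 中文.mp3 and the helper encoded_path are illustrative assumptions, not something taken from a real server or from any browser's source.

```python
from urllib.parse import quote

FILE_NAME = "中文.mp3"  # illustrative file name (assumption)


def encoded_path(name: str, charset: str) -> str:
    """Percent-encode the bytes that `name` has in the given charset."""
    return quote(name, encoding=charset)


utf8_path = encoded_path(FILE_NAME, "utf-8")
gbk_path = encoded_path(FILE_NAME, "gbk")

print(utf8_path)  # %E4%B8%AD%E6%96%87.mp3
print(gbk_path)   # %D6%D0%CE%C4.mp3
```

The two strings name the same file, but nothing in either of them says which charset produced it.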
Here is how Apache handles these two requests.
The frustrating part for Apache is that the URL carries no information about its encoding, so the simplest thing it can do is take the file name as received and ask the operating system to read a file with that name.
The file system, however, stores names in one particular encoding, for example UTF-8. In that case the GBK-encoded file name is simply not found, and Apache answers the HTTP client with an apologetic 404.
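
The lookup can be pictured with a small Python sketch. It is only a model of the behaviour described above, not Apache code: STORED_NAMES stands in for a directory on a UTF-8 file system, and lookup is a hypothetical helper.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical directory contents: names stored as UTF-8 bytes (assumption).
STORED_NAMES = {"中文.mp3".encode("utf-8")}


def lookup(request_path: str) -> int:
    """Undo the percent-encoding and compare raw bytes, with no charset conversion."""
    raw = unquote_to_bytes(request_path)
    return 200 if raw in STORED_NAMES else 404


print(lookup("%E4%B8%AD%E6%96%87.mp3"))  # 200: UTF-8 request matches the stored bytes
print(lookup("%D6%D0%CE%C4.mp3"))        # 404: GBK request is a different byte string
```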


From the user's side something very odd happens: accessing (downloading) the MP3 URL with IE (with the UTF-8 option checked) works fine, but accessing the same file with a download tool such as FlashGet, or with Firefox, returns a 404 "file not found" error.

The same reasoning works in reverse: if IE gets a 404 for a Chinese file name on a website, and the request succeeds once "Always send URLs as UTF-8" is unchecked, we can infer that the server-side file system uses GBK.

So is there a way to make Apache accept the request regardless of whether it is UTF-8 or GBK? After all, there will always be cases that call for Chinese in URLs, however hard we try to avoid it.
Tank Factory (hi.baidu.com/uroot)
Here is a way to solve the problem with mod_encoding.
Requirements: a download server whose files have Chinese names, so that users can tell at a glance what they are downloading.
Both IE and other download tools must be able to download these files in their default configuration, with no extra settings
(that is, the server adapts automatically whether the URL is encoded as UTF-8 or GBK).
Server setup: CentOS 5, GBK encoding, Apache 2.2.x.
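
The "adapt automatically" requirement can be sketched in Python as follows. This is a conceptual illustration under stated assumptions, not mod_encoding's actual implementation: to_server_encoding is a hypothetical helper, and the candidate list simply mirrors the UTF-8, GBK, GB2312 order used in the configuration later in this article.

```python
from urllib.parse import unquote_to_bytes


def to_server_encoding(request_path: str,
                       client_encodings=("utf-8", "gbk", "gb2312"),
                       server_encoding="gbk") -> bytes:
    """Try each candidate client charset in order and re-encode for the server."""
    raw = unquote_to_bytes(request_path)
    for charset in client_encodings:
        try:
            # The first charset that decodes (and re-encodes) cleanly wins.
            return raw.decode(charset).encode(server_encoding)
        except UnicodeError:
            continue
    return raw  # nothing fit: pass the bytes through unchanged


from_ie = to_server_encoding("%E4%B8%AD%E6%96%87.mp3")  # UTF-8 request
from_flashget = to_server_encoding("%D6%D0%CE%C4.mp3")  # GBK request
print(from_ie == from_flashget)  # True
```

UTF-8 is tried first because it is the stricter check: GBK bytes are rarely valid UTF-8, so a GBK request falls through to the second candidate instead of being misread.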

1. Download and patch:
# wget http://webdav.todo.gr.jp/download/mod_encoding-20021209.tar.gz
# wget http://webdav.todo.gr.jp/download/experimental/mod_encoding.c.apache2.20040616
Unpack the archive, then overwrite mod_encoding.c with the newer version:
# tar xzf mod_encoding-20021209.tar.gz
# cp mod_encoding.c.apache2.20040616 mod_encoding-20021209/mod_encoding.c

An Apache 2.2 patch must also be applied here; otherwise make fails with errors such as "apxs rc=65536".
# wget http://www.aconus.com/~oyaji/faq/mod_encoding.c-apache2.2-20060520.patch
# cd mod_encoding-20021209
# patch -p0 < ../mod_encoding.c-apache2.2-20060520.patch

2. Install iconv_hook
# cd mod_encoding-20021209/lib
# ./configure --prefix=/usr
# make
# make install
# ldconfig

3. Build mod_encoding
Run these commands inside the mod_encoding-20021209 directory:

# cd mod_encoding-20021209
# ./configure --with-apxs=/opt/apache2.2/bin/apxs --with-iconv-hook=/usr/include
# make
# gcc -shared -o mod_encoding.so mod_encoding.o -Wc,-Wall -Llib -liconv_hook

# cp mod_encoding.so /opt/apache2.2/modules

4. Configure Apache 2.2

LoadModule headers_module modules/mod_headers.so
LoadModule encoding_module modules/mod_encoding.so

Header add MS-Author-Via "DAV"


EncodingEngine on
NormalizeUsername on
SetServerEncoding GBK
DefaultClientEncoding UTF-8 GBK GB2312
AddClientEncoding "(Microsoft .* DAV $)" UTF-8 GBK GB2312
AddClientEncoding "Microsoft .* DAV" UTF-8 GBK GB2312
AddClientEncoding "Microsoft-WebDAV*" UTF-8 GBK GB2312


Test results: IE (with "Always send URLs as UTF-8" on), FlashGet (GBK), and Firefox 2.0.x can all download files with Chinese names normally.
