Character encoding techniques for the filename field in Content-Disposition [repost]

Source: Internet
Author: User

This article is another discussion of the question "How to encode the filename field in the Content-Disposition header of an HTTP response?". The question was raised a long time ago, but there is still no satisfactory answer, at least in my view, so today I am raising it again, together with my solution.

I wrote a C++-based CGI application that serves files whose names contain special characters, such as: weird # € = { } ; filename.txt

There does not seem to be a single way to write the Content-Disposition header in HTTP that works in every browser. I tested with the following browsers:

    • Internet Explorer
    • Firefox
    • Chrome
    • Opera
    • Safari

I am willing, however, to use a different encoding method for each browser.
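
A per-browser strategy first needs to know which browser is asking. The following is a minimal sketch, assuming the CGI program reads the User-Agent request header (exposed to CGI programs as the HTTP_USER_AGENT environment variable). The Browser enum and the detect_browser function are hypothetical names, and the substring checks are a heuristic that fits the browser versions tested in this article, not a robust parser:

```cpp
#include <string>

// Hypothetical classification of the browsers discussed in this article.
enum class Browser { InternetExplorer, Firefox, Chrome, Opera, Safari, Unknown };

// Guess the browser from a User-Agent string by substring matching.
// Order matters: Chrome's User-Agent also contains "Safari", and older
// Opera versions could mention "MSIE", so the more specific tokens are
// checked first.
Browser detect_browser(const std::string& user_agent) {
    auto has = [&user_agent](const char* token) {
        return user_agent.find(token) != std::string::npos;
    };
    if (has("Opera"))   return Browser::Opera;
    if (has("MSIE"))    return Browser::InternetExplorer;
    if (has("Firefox")) return Browser::Firefox;
    if (has("Chrome"))  return Browser::Chrome;
    if (has("Safari"))  return Browser::Safari;
    return Browser::Unknown;
}
```

In a CGI program the input would typically come from std::getenv("HTTP_USER_AGENT"), which returns a null pointer when the header is absent, so that case needs a check before constructing the string.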

Let's talk about the approach I'm using:

Internet Explorer (add double quotes and replace the # and ; symbols with %23 and %3B):

Content-Disposition: attachment; filename="weird %23 € = { } %3B filename.txt"
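
The Internet Explorer rule above can be sketched in C++ as follows. The function name ie_disposition is hypothetical, and the sketch assumes the file name contains no double quote or backslash, which would need extra handling inside a quoted string:

```cpp
#include <string>

// Build the quoted form described above for Internet Explorer:
// '#' becomes %23, ';' becomes %3B, every other byte is sent unchanged.
// Assumption: the name contains no '"' or '\' characters.
std::string ie_disposition(const std::string& name) {
    std::string escaped;
    for (char c : name) {
        if (c == '#')      escaped += "%23";
        else if (c == ';') escaped += "%3B";
        else               escaped += c;
    }
    return "Content-Disposition: attachment; filename=\"" + escaped + "\"";
}
```

Passing the example name weird # € = { } ; filename.txt (as UTF-8 bytes) yields the header line shown above.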

Firefox (double quotes are still useful, no other changes required):

Content-Disposition: attachment; filename="weird # € = { } ; filename.txt"

There is also a viable alternative:

Content-Disposition: attachment; filename*=UTF-8''weird%20%23%20%e2%82%ac%20%3D%20%7B%20%7D%20%3B%20filename.txt
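
The escaped value in this form is the UTF-8 byte sequence of the file name with every unsafe byte written as %XX. A minimal sketch, assuming the input is already valid UTF-8; the function names are hypothetical. It leaves only letters, digits and - . _ ~ unescaped, which is stricter than RFC 5987 requires but matches the encoded strings in this article (hex digits are case-insensitive, so %E2 and %e2 are equivalent):

```cpp
#include <string>

// Percent-encode every byte except ASCII letters, digits and - . _ ~
std::string percent_encode(const std::string& utf8) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : utf8) {
        bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                          (c >= '0' && c <= '9') ||
                          c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];    // high nibble
            out += hex[c & 0x0F];  // low nibble
        }
    }
    return out;
}

// Build the extended form: filename*=UTF-8''<percent-encoded name>
std::string extended_disposition(const std::string& utf8_name) {
    return "Content-Disposition: attachment; filename*=UTF-8''" +
           percent_encode(utf8_name);
}
```

Note that the value after filename*= is not wrapped in double quotes, and the two characters after UTF-8 are plain ASCII apostrophes.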

Chrome:

The following problems occur when you use double quotes only:

    1. The = symbol in the file name is lost
    2. The € symbol is replaced by a - symbol

The following form works:

Content-Disposition: attachment; filename*=UTF-8''weird%20%23%20%e2%82%ac%20%3D%20%7B%20%7D%20%3B%20filename.txt

Opera:

Using double quotes, or the filename*=UTF-8'' syntax, produces the following problems:

    1. Multiple contiguous spaces in the file name are collapsed into one
    2. An empty pair of braces {} is lost: "Ab{}cd.txt" becomes "Abcd.txt"
    3. A semicolon in the file name cuts off everything after it: "ABC; Def.txt" becomes "ABC"

This is due to a limit on the file name length. The following example works in Opera:

Content-Disposition: attachment; filename*=UTF-8''weird%20%23%20%e2%82%ac%20%3D%20%7B%20%7D%20%3B%20filename.txt

Safari:

With double quotes, the € symbol is replaced by invisible characters. Unfortunately, I found no suitable way to solve this small problem.

The original question mentioned above does suggest one approach:

Content-Disposition: attachment; filename*=UTF-8''weird%20%23%20%80%20%3D%20%7B%20%7D%20%3B%20filename.txt

But this scheme does not work for me: the escaped characters are not decoded, so the browser tries to save the file under the name of the CGI application. One reason is that the encoding above is incorrect. The € is written as %80, which is not its UTF-8 encoding (%E2%82%AC), so the value does not follow RFC 5987. But Safari does not use the RFC 5987 encoding method either. So for now, encoding the € character for Safari remains unsolved.
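
To see why %80 is not valid here: 0x80 is the byte for € in the Windows-1252 code page, while UTF-8 encodes € (code point U+20AC) as three bytes. A minimal sketch of the UTF-8 rule, assuming a valid code point as input; utf8_percent is a hypothetical helper name, and unlike a real header encoder it escapes every byte, including plain ASCII letters:

```cpp
#include <string>

// Percent-encode one Unicode code point as its UTF-8 byte sequence.
// Assumption: cp is a valid scalar value (no surrogate or range checks).
std::string utf8_percent(char32_t cp) {
    unsigned char bytes[4];
    int n = 0;
    if (cp < 0x80) {                       // 1 byte:  0xxxxxxx
        bytes[n++] = static_cast<unsigned char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        bytes[n++] = static_cast<unsigned char>(0xC0 | (cp >> 6));
        bytes[n++] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        bytes[n++] = static_cast<unsigned char>(0xE0 | (cp >> 12));
        bytes[n++] = static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F));
        bytes[n++] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx then three 10xxxxxx
        bytes[n++] = static_cast<unsigned char>(0xF0 | (cp >> 18));
        bytes[n++] = static_cast<unsigned char>(0x80 | ((cp >> 12) & 0x3F));
        bytes[n++] = static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F));
        bytes[n++] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
    }
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (int i = 0; i < n; ++i) {
        out += '%';
        out += hex[bytes[i] >> 4];
        out += hex[bytes[i] & 0x0F];
    }
    return out;
}
```

Calling utf8_percent(0x20AC) gives %E2%82%AC, the sequence used in the working headers earlier in this article.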

By the way, here is a UTF-8 encoding converter: http://www.rishida.net/tools/conversion/

All the tests mentioned above used the latest version of each browser at the time of writing:

    • Firefox 7
    • Internet Explorer 9
    • Chrome 15
    • Opera 11.5
    • Safari 5.1

PS: I tried all the special characters on the keyboard, but I only discuss the characters that cause problems.

I also tried a file name that includes all possible special characters, and the test results differ from those mentioned above:

The complete test string:

0 ! § $ % & ( ) = ` ´ { }    [ ] ² ³ @ € µ ^ ° ~ + ' # - _ . , ; ü ä ö ß 9.jpg

After encoding:

0%20%21%20%C2%A7%20%24%20%25%20%26%20%28%20%29%20%3D%20%60%20%C2%B4%20%7B%20%7D%20%20%20%20%5B%20%5D%20%C2%B2%20%C2%B3%20%40%20%E2%82%AC%20%C2%B5%20%5E%20%C2%B0%20~%20%2B%20%27%20%23%20-%20_%20.%20%2C%20%3B%20%C3%BC%20%C3%A4%20%C3%B6%20%C3%9F%209.jpg

The Content-Disposition header is then written like this:

Content-Disposition: attachment; filename*=UTF-8''0%20%21%20%C2%A7%20%24%20%25%20%26%20%28%20%29%20%3D%20%60%20%C2%B4%20%7B%20%7D%20%20%20%20%5B%20%5D%20%C2%B2%20%C2%B3%20%40%20%E2%82%AC%20%C2%B5%20%5E%20%C2%B0%20~%20%2B%20%27%20%23%20-%20_%20.%20%2C%20%3B%20%C3%BC%20%C3%A4%20%C3%B6%20%C3%9F%209.jpg
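
A long escaped value like this is easy to get wrong by hand, so it helps to decode it again and compare the result with the original name. A minimal decoder sketch, with hypothetical function names; it is a checking aid for the server side, not a description of how any browser decodes the header. Malformed escapes are copied through unchanged:

```cpp
#include <cstddef>
#include <string>

// Return 0-15 for a hexadecimal digit, -1 for anything else.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Turn each %XX escape back into the byte it stands for.
std::string percent_decode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            int hi = hex_value(in[i + 1]);
            int lo = hex_value(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;  // skip the two hex digits just consumed
                continue;
            }
        }
        out += in[i];  // ordinary character or malformed escape
    }
    return out;
}
```

Decoding the value after UTF-8'' in the header above should give back the UTF-8 bytes of the complete test string.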

The following test results were obtained:

Firefox: works properly

Chrome: works fine

IE displays: $ % & ( ) = ` ´ { }    [ ] ² ³ @ € µ ^ ° ~ + ' # - _ . , ; ü ä ö ß 9.jpg (the first 6 characters are lost)

Explanation: This happens because the browser limits the length of the file name and discards characters from the beginning of the string. I did not dig into this, but a normal file name can be about 200 characters long, and a file name containing many escape sequences can be even longer, though not more than 250 characters. So this is not really a problem.

Opera displays: 0 ! § $ % & ( ) = ` ´ [ ] ² ³ @ € µ ^ ° ~ + ' # - _ . , ; ü ä ö ß 9.jpg (like IE, it also loses some characters)

Note: I shortened my test string because I suspect Opera has a length limitation like the one in IE.

This encoding does not work correctly in Safari.

These tests show that the filename*=UTF-8'' syntax, followed by the escaped file name, works in every browser except Safari. In Safari, however, the double-quote method only replaces the € symbol, so the problem is not big.
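
One way to cover a browser that ignores filename*= is to send a plain ASCII filename= parameter in the same header as a fallback. RFC 6266 permits both parameters together and advises recipients that understand filename*= to prefer it. The sketch below is an assumption, not something tested in the browsers above; the function names and the choice of _ as the replacement character are hypothetical:

```cpp
#include <string>

// Reduce a UTF-8 name to conservative ASCII: keep letters, digits, space,
// '-', '.' and '_'; replace every other character with a single '_'.
std::string ascii_fallback(const std::string& utf8) {
    std::string out;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) == 0x80) continue;  // continuation byte: its lead byte
                                           // already produced one '_'
        bool safe = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                    (c >= '0' && c <= '9') ||
                    c == ' ' || c == '-' || c == '.' || c == '_';
        out += safe ? static_cast<char>(c) : '_';
    }
    return out;
}

// Combine the quoted fallback with an already percent-encoded UTF-8 value.
std::string combined_disposition(const std::string& utf8_name,
                                 const std::string& encoded_name) {
    return "Content-Disposition: attachment; filename=\"" +
           ascii_fallback(utf8_name) + "\"; filename*=UTF-8''" + encoded_name;
}
```

The fallback loses the special characters, but it avoids the characters that caused trouble in the quoted form (# ; = { } and non-ASCII symbols such as €).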

About file name length:

The tests also revealed some problems with file name length.

In Internet Explorer, a file name can be up to 147 characters long. If the string contains no escape sequences, this is the total length of the file name. If it does contain escape sequences, the situation changes somewhat: the final file name is shorter than 147 characters. The rule is a little strange and I could not pin it down exactly. With two escape sequences the file name is shortened by 5 characters, but with many escape sequences it ends up only two characters shorter.

Other browsers do not have this length limitation. As long as the operating system can handle the file name, the file is saved successfully. I tested a file name 250 characters long: Chrome prompted me to shorten the file name, Opera automatically shortened it to 220 characters, and Firefox automatically shortened it to 210 characters. Opera shortens from the end of the file name. Safari tried to save the file under the long name, but saving failed and the file name in the download list was shown as -1.
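
Since each browser shortens long names differently, one option is to shorten the name on the server before sending it. A minimal sketch, assuming that "characters" means Unicode code points in a UTF-8 string and that keeping the extension matters more than keeping the end of the name; the function names are hypothetical, and 147 is only the Internet Explorer limit observed above:

```cpp
#include <cstddef>
#include <string>

// Count code points in a UTF-8 string (every byte that is not 10xxxxxx).
std::size_t utf8_length(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Shorten name to at most max_chars code points, cutting the end of the
// stem and keeping the extension (the part from the last '.' onwards).
std::string shorten_filename(const std::string& name, std::size_t max_chars) {
    if (utf8_length(name) <= max_chars) return name;

    std::size_t dot = name.rfind('.');
    std::string ext;
    if (dot != std::string::npos && dot != 0) ext = name.substr(dot);
    if (utf8_length(ext) >= max_chars) ext.clear();  // extension too long to keep

    const std::string stem = name.substr(0, name.size() - ext.size());
    const std::size_t keep = max_chars - utf8_length(ext);

    // Advance byte by byte, but only stop on a character boundary so that
    // a multi-byte sequence is never split.
    std::size_t chars = 0, end = 0;
    while (end < stem.size()) {
        bool starts_char = (static_cast<unsigned char>(stem[end]) & 0xC0) != 0x80;
        if (starts_char) {
            if (chars == keep) break;
            ++chars;
        }
        ++end;
    }
    return stem.substr(0, end) + ext;
}
```

For example, shorten_filename(name, 147) would be applied to the raw UTF-8 name before it is escaped. Because escape sequences changed the effective limit in Internet Explorer in ways that could not be pinned down, a lower value may be safer in practice.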
