Solving URL Encoding Problems in Java


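First, look at the difference between the encodeURI and encodeURIComponent functions in JavaScript.

encodeURI: does not encode ASCII letters and digits, nor these ASCII punctuation marks: - _ . ! ~ * ' ( ). It also does not escape the following ASCII punctuation marks, which have special meaning in a URI: ; / ? : @ & = + $ , #

encodeURIComponent: does not encode ASCII letters and digits, nor these ASCII punctuation marks: - _ . ! ~ * ' ( ). Unlike encodeURI, it does escape the characters that have special meaning in a URI (; / ? : @ & = + $ , #).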

 

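In Java, by contrast, the URLEncoder.encode(String s, String enc) method:

  does not encode ASCII letters and digits, nor these ASCII punctuation marks: - _ . * (a space is converted to +).

The relevant code from URLEncoder is as follows: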

        dontNeedEncoding = new BitSet(256);
        int i;
        for (i = 'a'; i <= 'z'; i++) {
            dontNeedEncoding.set(i);
        }
        for (i = 'A'; i <= 'Z'; i++) {
            dontNeedEncoding.set(i);
        }
        for (i = '0'; i <= '9'; i++) {
            dontNeedEncoding.set(i);
        }
        dontNeedEncoding.set(' '); /* encoding a space to a + is done
                                    * in the encode() method */
        dontNeedEncoding.set('-');
        dontNeedEncoding.set('_');
        dontNeedEncoding.set('.');
        dontNeedEncoding.set('*');
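To make the effect of that character set concrete, here is a minimal sketch that runs a few strings through the standard URLEncoder.encode. The class name UrlEncoderDemo, the helper encodeUtf8 and the example URL are made up for illustration; only URLEncoder.encode itself comes from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class UrlEncoderDemo {
    // Hypothetical helper: wraps the checked exception, since UTF-8 is always available.
    static String encodeUtf8(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u4e2d is a single CJK character; written as an escape to avoid source-encoding issues.
        // Reserved characters such as : / ? = & are all escaped, and the space becomes +.
        System.out.println(encodeUtf8("http://example.com/a b?q=1&r=\u4e2d"));
        // prints http%3A%2F%2Fexample.com%2Fa+b%3Fq%3D1%26r%3D%E4%B8%AD

        // The four punctuation marks in dontNeedEncoding pass through unchanged.
        System.out.println(encodeUtf8("-_.*"));   // prints -_.*

        // Marks that encodeURI would keep are escaped by URLEncoder.
        System.out.println(encodeUtf8("!~'()"));  // prints %21%7E%27%28%29
    }
}
```

Passing a whole URL to URLEncoder.encode therefore destroys its structure, which is the problem the rest of this article works around.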

 

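If you want to encode a URL in Java without encoding the ASCII punctuation marks that have special meaning in a URI, those characters have to be added to dontNeedEncoding. To do that, create your own encoder class, MyURIEncoder, adapted from the URLEncoder source: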

  

package com.sitech.solr.util;

import java.io.CharArrayWriter;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;
import java.util.BitSet;

public class MyURIEncoder {

    static BitSet dontNeedEncoding;
    static final int caseDiff = ('a' - 'A');
    static String dfltEncName = null;

    static {
        /* The list of characters that are not encoded has been
         * determined as follows:
         *
         * RFC 2396 states:
         * -----
         * Data characters that are allowed in a URI but do not have a
         * reserved purpose are called unreserved.  These include upper
         * and lower case letters, decimal digits, and a limited set of
         * punctuation marks and symbols.
         *
         * unreserved  = alphanum | mark
         *
         * mark        = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
         *
         * Unreserved characters can be escaped without changing the
         * semantics of the URI, but this should not be done unless the
         * URI is being used in a context that does not allow the
         * unescaped character to appear.
         * -----
         *
         * It appears that both Netscape and Internet Explorer escape
         * all special characters from this list with the exception
         * of "-", "_", ".", "*". While it is not clear why they are
         * escaping the other characters, perhaps it is safest to
         * assume that there might be contexts in which the others
         * are unsafe if not escaped. Therefore, we will use the same
         * list. It is also noteworthy that this is consistent with
         * O'Reilly's "HTML: The Definitive Guide" (page 164).
         *
         * As a last note, Internet Explorer does not encode the "@"
         * character which is clearly not unreserved according to the
         * RFC. We are being consistent with the RFC in this matter,
         * as is Netscape.
         *
         */
        dontNeedEncoding = new BitSet(256);
        int i;
        for (i = 'a'; i <= 'z'; i++) {
            dontNeedEncoding.set(i);
        }
        for (i = 'A'; i <= 'Z'; i++) {
            dontNeedEncoding.set(i);
        }
        for (i = '0'; i <= '9'; i++) {
            dontNeedEncoding.set(i);
        }
        dontNeedEncoding.set(' '); /* encoding a space to a + is done
                                    * in the encode() method */
        dontNeedEncoding.set('-');
        dontNeedEncoding.set('_');
        dontNeedEncoding.set('.');
        dontNeedEncoding.set('*');

        // The following ASCII punctuation marks, which have special meaning
        // in a URI, are not escaped: ; / ? : @ & = + $ , #
        dontNeedEncoding.set(';');
        dontNeedEncoding.set('/');
        dontNeedEncoding.set('?');
        dontNeedEncoding.set(':');
        dontNeedEncoding.set('@');
        dontNeedEncoding.set('&');
        dontNeedEncoding.set('=');
        dontNeedEncoding.set('+');
        dontNeedEncoding.set('$');
        dontNeedEncoding.set(',');
        dontNeedEncoding.set('#');

        // Default platform encoding. Read with System.getProperty because
        // sun.security.action.GetPropertyAction is a JDK-internal class
        // that may not be accessible on newer JDKs.
        dfltEncName = System.getProperty("file.encoding");
    }

    /**
     * You can't call the constructor.
     */
    private MyURIEncoder() { }

    public static String encode(String s, String enc)
        throws UnsupportedEncodingException {

        boolean needToChange = false;
        StringBuffer out = new StringBuffer(s.length());
        Charset charset;
        CharArrayWriter charArrayWriter = new CharArrayWriter();

        if (enc == null)
            throw new NullPointerException("charsetName");

        try {
            charset = Charset.forName(enc);
        } catch (IllegalCharsetNameException e) {
            throw new UnsupportedEncodingException(enc);
        } catch (UnsupportedCharsetException e) {
            throw new UnsupportedEncodingException(enc);
        }

        for (int i = 0; i < s.length();) {
            int c = (int) s.charAt(i);
            //System.out.println("Examining character: " + c);
            if (dontNeedEncoding.get(c)) {
                if (c == ' ') {
                    c = '+';
                    needToChange = true;
                }
                //System.out.println("Storing: " + c);
                out.append((char)c);
                i++;
            } else {
                // convert to external encoding before hex conversion
                do {
                    charArrayWriter.write(c);
                    /*
                     * If this character represents the start of a Unicode
                     * surrogate pair, then pass in two characters. It's not
                     * clear what should be done if a byte reserved in the
                     * surrogate pairs range occurs outside of a legal
                     * surrogate pair. For now, just treat it as if it were
                     * any other character.
                     */
                    if (c >= 0xD800 && c <= 0xDBFF) {
                        /*
                          System.out.println(Integer.toHexString(c)
                          + " is high surrogate");
                        */
                        if ( (i+1) < s.length()) {
                            int d = (int) s.charAt(i+1);
                            /*
                              System.out.println("\tExamining "
                              + Integer.toHexString(d));
                            */
                            if (d >= 0xDC00 && d <= 0xDFFF) {
                                /*
                                  System.out.println("\t"
                                  + Integer.toHexString(d)
                                  + " is low surrogate");
                                */
                                charArrayWriter.write(d);
                                i++;
                            }
                        }
                    }
                    i++;
                } while (i < s.length() && !dontNeedEncoding.get((c = (int) s.charAt(i))));

                charArrayWriter.flush();
                String str = new String(charArrayWriter.toCharArray());
                byte[] ba = str.getBytes(charset);
                for (int j = 0; j < ba.length; j++) {
                    out.append('%');
                    char ch = Character.forDigit((ba[j] >> 4) & 0xF, 16);
                    // converting to use uppercase letter as part of
                    // the hex value if ch is a letter.
                    if (Character.isLetter(ch)) {
                        ch -= caseDiff;
                    }
                    out.append(ch);
                    ch = Character.forDigit(ba[j] & 0xF, 16);
                    if (Character.isLetter(ch)) {
                        ch -= caseDiff;
                    }
                    out.append(ch);
                }
                charArrayWriter.reset();
                needToChange = true;
            }
        }

        return (needToChange? out.toString() : s);
    }
}
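For comparison, the same idea can be sketched far more compactly if the goal is simply to mirror the encodeURI character set described at the top of this article. The sketch below is not the class above: the name EncodeUriSketch is hypothetical, it always uses UTF-8, it also leaves ! ~ ' ( ) untouched, and it writes a space as %20 (as encodeURI does) rather than as +.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

public class EncodeUriSketch {
    // Characters copied through unchanged: ASCII letters, digits,
    // the marks - _ . ! ~ * ' ( ) and the reserved set ; / ? : @ & = + $ , #
    private static final BitSet SAFE = new BitSet(128);

    static {
        for (char c = 'a'; c <= 'z'; c++) SAFE.set(c);
        for (char c = 'A'; c <= 'Z'; c++) SAFE.set(c);
        for (char c = '0'; c <= '9'; c++) SAFE.set(c);
        for (char c : "-_.!~*'();/?:@&=+$,#".toCharArray()) SAFE.set(c);
    }

    public static String encode(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            // Work on whole code points so surrogate pairs stay together.
            int cp = s.codePointAt(i);
            int len = Character.charCount(cp);
            if (SAFE.get(cp)) {
                out.append((char) cp);
            } else {
                // Percent-encode each UTF-8 byte as two uppercase hex digits.
                byte[] bytes = s.substring(i, i + len).getBytes(StandardCharsets.UTF_8);
                for (byte b : bytes) {
                    out.append('%');
                    out.append(Character.toUpperCase(Character.forDigit((b >> 4) & 0xF, 16)));
                    out.append(Character.toUpperCase(Character.forDigit(b & 0xF, 16)));
                }
            }
            i += len;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u4e2d is a single CJK character, written as an escape for portability.
        System.out.println(encode("http://example.com/a b?q=\u4e2d&x=1#top"));
        // prints http://example.com/a%20b?q=%E4%B8%AD&x=1#top
    }
}
```

Note that % is not in the safe set in either version, so input that is already percent-encoded will be encoded a second time (%20 becomes %2520); such encoders are best applied only to raw, unescaped URLs.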

 
