URL explanation and URL encoding

Source: Internet
Author: User

As a front-end developer, dealing with URLs is a daily necessity. But everyday use tends to be simple and the understanding superficial. Over time I found that in packet-capture debugging, interface calls, browser compatibility and many other areas, you fall into a lot of pits if you do not understand URLs and URL encoding in depth. So I wrote this article to explain URLs.

URL and URI

Many people confuse these two terms.

URL: Uniform Resource Locator.

URI: Uniform Resource Identifier.

Relationship:

A URI is the more general, higher-level abstraction: a standard for strings that identify a resource.

That is, URI is the parent class and URL is a subclass of it. URLs are a subset of URIs.

The difference is that a URI only identifies a resource, while a URL also shows how to locate and access it, for example through the scheme (http://).

Port and URL standard format

What is a port? A port is the equivalent of a data transmission channel. It accepts data and passes it to the appropriate service, and the computer then sends the corresponding reply back to the other party through the same open port.

The role of the port: one IP address can host many network services (a one-to-many relationship), so on the Internet a service is actually identified by the IP address plus a port number.

Ports are identified by port numbers, which are integers ranging from 0 to 65535.

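URL standard format

In general, the common definition format for URLs that we are familiar with is:

scheme://host[:port#]/path/.../[;url-params][?query-string][#anchor]

    scheme        // the familiar http, https, ftp, as well as the well-known ed2k, Xunlei's thunder, and so on
    host          // IP address or domain name of the HTTP server
    port#         // the default port of an HTTP server is 80, in which case the port number can be omitted; any other port must be given explicitly, e.g. Tomcat's default port 8080: http://localhost:8080/
    path          // path to the resource being accessed
    url-params    // parameters carried
    query-string  // data sent to the HTTP server
    anchor        // anchor position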
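To make the format concrete, here is a minimal sketch that splits a sample URL into these parts. It assumes an environment that provides the standard URL constructor (modern browsers, Node.js); the sample URL and the variable names are made up for illustration. Note that the standard API does not separate ;url-params, which simply stay inside the path.

```javascript
// Decompose a sample URL with the standard URL constructor.
const sample = new URL('http://localhost:8080/dir/index.html?id=255&m=hello#top');

const parts = {
  scheme: sample.protocol, // 'http:' (includes the trailing colon)
  host: sample.hostname,   // 'localhost'
  port: sample.port,       // '8080'
  path: sample.pathname,   // '/dir/index.html'
  query: sample.search,    // '?id=255&m=hello'
  anchor: sample.hash,     // '#top'
};
console.log(parts);

// The default port of the scheme (80 for http) is dropped during parsing.
const withDefaultPort = new URL('http://localhost:80/dir/');
console.log(withDefaultPort.port === ''); // true
console.log(withDefaultPort.href);        // 'http://localhost/dir/'
```

The second part shows the omission rule in action: port 80 is the default for http, so it disappears from the parsed result.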

 

Automatically parse URLs with <a> tags

A common scenario in development is the need to extract some required elements from the URL, such as host, request parameters, and so on.

The usual practice is to write regular expressions that match the corresponding fields. Here, though, I want to recommend the following method, from James's blog. The principle is to dynamically create an <a> tag and use some of the browser's native properties plus a few regular expressions (still needed for robustness) to parse the URL completely and get any part of it we want.

The code is as follows:

    // This function creates a new anchor element and uses location
    // properties (inherent) to get the desired URL data. Some String
    // operations are used (to normalize results across browsers).
    function parseURL(url) {
        var a = document.createElement('a');
        a.href = url;
        return {
            source: url,
            protocol: a.protocol.replace(':', ''),
            host: a.hostname,
            port: a.port,
            query: a.search,
            params: (function () {
                var ret = {},
                    seg = a.search.replace(/^\?/, '').split('&'),
                    len = seg.length, i = 0, s;
                for (; i < len; i++) {
                    if (!seg[i]) { continue; }
                    s = seg[i].split('=');
                    ret[s[0]] = s[1];
                }
                return ret;
            })(),
            file: (a.pathname.match(/([^\/?#]+)$/i) || [, ''])[1],
            hash: a.hash.replace('#', ''),
            path: a.pathname.replace(/^([^\/])/, '/$1'),
            relative: (a.href.match(/tps?:\/\/[^\/]+(.+)/) || [, ''])[1],
            segments: a.pathname.replace(/^\//, '').split('/')
        };
    }

Usage:

    var myURL = parseURL('http://abc.com:8080/dir/index.html?id=255&m=hello#top');

    myURL.file;     // = 'index.html'
    myURL.hash;     // = 'top'
    myURL.host;     // = 'abc.com'
    myURL.query;    // = '?id=255&m=hello'
    myURL.params;   // = Object = { id: '255', m: 'hello' }
    myURL.path;     // = '/dir/index.html'
    myURL.segments; // = Array = ['dir', 'index.html']
    myURL.port;     // = '8080'
    myURL.protocol; // = 'http'
    myURL.source;   // = 'http://abc.com:8080/dir/index.html?id=255&m=hello#top'

Using the above method, any part of the URL can be parsed.
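Where no DOM is available (for example in Node.js), a similar result can probably be obtained with the standard URL and URLSearchParams APIs. The following is only a sketch under that assumption, not the method from the blog: the name parseURLStandard is hypothetical, the constructor throws on relative URLs unless a base is supplied, and parameter values come back already decoded.

```javascript
// Hypothetical counterpart of parseURL built on the standard URL API
// instead of a dynamically created <a> element.
function parseURLStandard(url) {
  const u = new URL(url);
  const params = {};
  // searchParams yields already-decoded [key, value] pairs.
  for (const [key, value] of u.searchParams) {
    params[key] = value;
  }
  const segments = u.pathname.replace(/^\//, '').split('/');
  return {
    source: url,
    protocol: u.protocol.replace(':', ''),
    host: u.hostname,
    port: u.port,
    query: u.search,
    params: params,
    file: segments[segments.length - 1],
    hash: u.hash.replace('#', ''),
    path: u.pathname,
    segments: segments,
  };
}

const parsed = parseURLStandard('http://abc.com:8080/dir/index.html?id=255&m=hello#top');
console.log(parsed.host, parsed.port, parsed.file); // abc.com 8080 index.html
console.log(parsed.params);                         // { id: '255', m: 'hello' }
```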

URL Encoding

Why URL encoding? Usually, if something needs to be encoded, it means that it is not suitable for direct transmission.

1. It can cause ambiguity. For example, a URL query string passes parameters as key=value pairs separated by the & symbol, such as postid=5038412&t=1450591802326. The server splits the query string on & and = to parse the parameters. If a value itself contains = or &, such as "P&G", the abbreviation of Procter & Gamble, and it is passed as a parameter, the URL ends up with a query string like ?name=P&G&t=1450591802326. The extra & inevitably makes the server parse the received URL incorrectly, so the ambiguous & and = symbols must be escaped, that is, encoded.

2. Illegal characters. URLs are written in ASCII, not Unicode, which means a URL cannot directly contain non-ASCII characters such as Chinese. Otherwise, Chinese characters can cause problems if the client browser and the server support different character sets.
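Both reasons can be observed directly. The sketch below assumes the standard URLSearchParams and URL APIs as a stand-in for whatever parser the server really uses; example.com is a placeholder host.

```javascript
// Reason 1: an unescaped & inside a value changes how the query string is split.
const ambiguous = new URLSearchParams('name=P&G&t=1450591802326');
console.log(ambiguous.get('name')); // 'P'  (the value was cut at the &)
console.log(ambiguous.has('G'));    // true (a stray key appeared)

// With the & written as %26, the value survives intact.
const escaped = new URLSearchParams('name=P%26G&t=1450591802326');
console.log(escaped.get('name'));   // 'P&G'

// Reason 2: non-ASCII characters are serialized as percent-encoded UTF-8 bytes.
const chinese = new URL('http://example.com/中文');
console.log(chinese.pathname);      // '/%E4%B8%AD%E6%96%87'
```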

So how do we encode? As follows:

escape, encodeURI, encodeURIComponent

escape()

The first thing to state is that this function is deprecated. As a front-end developer, you should not be using it.

escape simply encodes a string (while the other two encode URLs) and has nothing to do with URL encoding. The encoded result takes the form %XX or %uXXXX. It does not encode ASCII letters, digits, or @ * _ + - . /

According to MDN, escape should be replaced with encodeURI or encodeURIComponent, and unescape with decodeURI or decodeURIComponent. escape should be avoided. Examples are as follows:

    encodeURI('https://www.baidu.com/ a b c')
    // "https://www.baidu.com/%20a%20b%20c"
    encodeURIComponent('https://www.baidu.com/ a b c')
    // "https%3A%2F%2Fwww.baidu.com%2F%20a%20b%20c"

    // escape encodes it as follows: the colon is encoded but the slash is not, which is very strange, hence obsolete
    escape('https://www.baidu.com/ a b c')
    // "https%3A//www.baidu.com/%20a%20b%20c"
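The %uXXXX form is another reason to avoid escape: it is not part of URL percent-encoding, so the URI decoders cannot read it. A small sketch, assuming an engine that still ships the legacy escape and unescape globals (browsers and Node.js do):

```javascript
// escape() works on UTF-16 code units and emits the non-standard %uXXXX form.
console.log(escape('中'));             // '%u4E2D'

// encodeURIComponent() emits the UTF-8 bytes, each as %XX.
console.log(encodeURIComponent('中')); // '%E4%B8%AD'

// Only the matching decoder reverses each form.
console.log(unescape('%u4E2D'));              // '中'
console.log(decodeURIComponent('%E4%B8%AD')); // '中'

// Feeding the %uXXXX form to the URI decoder fails with a URIError.
let mixedDecodeFailed = false;
try {
  decodeURIComponent('%u4E2D');
} catch (e) {
  mixedDecodeFailed = e instanceof URIError;
}
console.log(mixedDecodeFailed); // true
```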

encodeURI()

encodeURI() is the JavaScript function that is actually meant for encoding URLs. It is aimed at encoding an entire URL.

    encodeURI("http://www.cnblogs.com/season-huang/some other thing");
    // "http://www.cnblogs.com/season-huang/some%20other%20thing"

In the encoded result above, you can see that the space is encoded as %20, while the slash / and the colon : are not encoded.

That is because it is meant to encode a whole URL directly: it does not encode ASCII letters, digits, or ~ ! @ # $ & * ( ) = : / , ; ? + ' - _ .

    encodeURI("~!@#$&*()=:/,;?+'")
    // ~!@#$&*()=:/,;?+'

encodeURIComponent()

Sometimes our URL looks like this, with another URL inside a request parameter:

    var URL = "http://www.a.com?foo=http://www.b.com?t=123&s=456";

Obviously, encoding it directly with encodeURI will not work. Because encodeURI does not escape the colon :, the slash /, or the ? & = separators, the server will hit the ambiguity described above when it parses the URL.

    encodeURI(URL)
    // "http://www.a.com?foo=http://www.b.com?t=123&s=456"

This is where encodeURIComponent() should be used. Its role is to encode the parameters in a URL; remember, the parameters, not the entire URL.

That is because the only characters it leaves unencoded are ASCII letters, digits, and ~ ! * ( ) ' - _ .
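The practical difference between the two functions comes down to a handful of reserved characters. The loop below is a small sketch (the variable names are illustrative) that finds every printable ASCII character which encodeURI leaves alone but encodeURIComponent escapes:

```javascript
// Collect the printable ASCII characters that the two functions treat differently.
let differing = '';
for (let code = 0x20; code <= 0x7e; code++) {
  const ch = String.fromCharCode(code);
  if (encodeURI(ch) !== encodeURIComponent(ch)) {
    differing += ch;
  }
}
console.log(differing); // '#$&+,/:;=?@'

// Each of these is left alone by encodeURI and escaped by encodeURIComponent.
for (const ch of differing) {
  console.log(ch, encodeURI(ch), encodeURIComponent(ch));
}
```

These are exactly the characters that carry structural meaning inside a URL, which is why they have to be escaped when they appear inside a parameter value.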

Incorrect usage:

    var URL = "http://www.a.com?foo=http://www.b.com?t=123&s=456";
    encodeURIComponent(URL);
    // "http%3A%2F%2Fwww.a.com%3Ffoo%3Dhttp%3A%2F%2Fwww.b.com%3Ft%3D123%26s%3D456"
    // Incorrect usage: the colon and slashes of the first http are encoded as well

Correct usage: encodeURIComponent() focuses on encoding individual parameters:

    var param = "http://www.b.com?t=123&s=456"; // the parameter to be encoded
    URL = "http://www.a.com?foo=" + encodeURIComponent(param);
    // "http://www.a.com?foo=http%3A%2F%2Fwww.b.com%3Ft%3D123%26s%3D456"
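The same idea extends to building a whole query string by hand: encode every key and every value separately, then join them. A minimal sketch follows; buildQueryString is a hypothetical helper name, and the round trip at the end assumes the standard URL API as the receiver.

```javascript
// Hypothetical helper: encode every key and value, then join with = and &.
function buildQueryString(params) {
  return Object.keys(params)
    .map((key) => encodeURIComponent(key) + '=' + encodeURIComponent(String(params[key])))
    .join('&');
}

const target = 'http://www.a.com?' + buildQueryString({
  foo: 'http://www.b.com?t=123&s=456',
  name: 'P&G',
});
console.log(target);
// 'http://www.a.com?foo=http%3A%2F%2Fwww.b.com%3Ft%3D123%26s%3D456&name=P%26G'

// Round trip: the receiver gets the original values back.
const received = new URL(target).searchParams;
console.log(received.get('foo'));  // 'http://www.b.com?t=123&s=456'
console.log(received.get('name')); // 'P&G'
```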

By using the <a> tag approach above to parse URLs, and choosing between encodeURI() and encodeURIComponent() according to the business scenario, you will be able to handle URL encoding issues well.

One of the most common application scenarios is stitching a URL together by hand: each key and value is escaped with encodeURIComponent before the URL is sent.

Original address: http://www.cnblogs.com/coco1s/p/5038412.html
