Java Magic Hall: URI, URL (with URL Protocol Handler) and URN

Source: Internet
Author: User

I. Preface

I used to be confused about what a URI is and what a URL is, so it is time to understand them properly. This article is a study note for later reference; if you find any mistakes, please correct me!

II. Starting from the URI

1. Concept

A URI (Uniform Resource Identifier) is a string that uniformly identifies a resource.

The format is: [scheme:]scheme-specific-part[#fragment]

[scheme:] component: the namespace identifier of the URI.

scheme-specific-part component: identifies the resource; its internal format is determined by the specific scheme.

[#fragment] component: the pound sign (#) marks the start of the fragment component, which points at a portion of the resource.
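A minimal sketch of how these three components can be pulled apart with java.net.URI. The address http://example.com/docs/index.html#section2 is a placeholder chosen for illustration, and the class name UriParts is hypothetical.

```java
import java.net.URI;

public class UriParts {
    /** Returns {scheme, scheme-specific-part, fragment}; an absent component is null. */
    static String[] parts(String text) {
        URI uri = URI.create(text);
        return new String[] {
            uri.getScheme(),
            uri.getSchemeSpecificPart(),
            uri.getFragment()
        };
    }

    public static void main(String[] args) {
        // Placeholder address, not taken from the article.
        String[] p = parts("http://example.com/docs/index.html#section2");
        System.out.println("scheme               = " + p[0]); // http
        System.out.println("scheme-specific-part = " + p[1]); // //example.com/docs/index.html
        System.out.println("fragment             = " + p[2]); // section2
    }
}
```

Note that the scheme-specific-part excludes both the leading "scheme:" and the trailing "#fragment".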

2. Absolute URI and relative URI

Absolute URI: the full form, which starts with the scheme component. It references the resource in a way that does not depend on the context in which the identifier appears.

Relative URI: an incomplete form that does not start with the scheme component. It references the resource in a way that depends on the context in which the identifier appears.

Example: The current page address is

    // HTML snippet
    <a id="test" href=""></a>

    // JS snippet
    <script>
      var href = document.getElementById("test").href
      console.log(href) //
    </script>

3. Opaque URIs and hierarchical URIs

Opaque URI: an absolute URI whose scheme-specific-part component does not start with a forward slash (/), such as mailto:[email protected]. An opaque URI is not decomposed any further, so its scheme-specific-part component is not parsed or validated in detail.

Hierarchical URI: a URI whose scheme-specific-part component starts with a forward slash (/), or a relative URI.

Scheme-specific-part component format: [//authority][path][?query]

[//authority]: the authority component. It begins with a pair of forward slashes (//) and ends at the next forward slash, question mark, or the end of the URI. It can be server-based (host) or registry-based; registry-based authorities are far less common than server-based ones. A server-based authority has the format [user-info@]host[:port].

[user-info@]: the user account.

host: an IP address or domain name.

[:port]: the communication port number; if omitted, the default port of the corresponding scheme is used.

Example: http://[email protected]:80/

[path]: the path component gives the location of the resource relative to the authority component. It consists of a series of path segments separated by forward slashes (/). If the path starts with a forward slash (/), it is an absolute path; otherwise it is a relative path.

[?query]: the query component carries data passed to the resource and influences how the resource responds.
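The opaque/hierarchical distinction and the individual components can be checked with java.net.URI, roughly as sketched here. The account alice, the host example.com, and the class name UriLayers are assumptions made up for the example.

```java
import java.net.URI;

public class UriLayers {
    /** Describes a URI as opaque or hierarchical and lists the hierarchical components. */
    static String describe(String text) {
        URI uri = URI.create(text);
        if (uri.isOpaque()) {
            // Opaque URIs are not decomposed beyond the scheme-specific-part.
            return "opaque: " + uri.getSchemeSpecificPart();
        }
        return "hierarchical: userInfo=" + uri.getUserInfo()
                + " host=" + uri.getHost()
                + " port=" + uri.getPort()   // -1 when the port is omitted
                + " path=" + uri.getPath()
                + " query=" + uri.getQuery();
    }

    public static void main(String[] args) {
        // Placeholder values, not taken from the article.
        System.out.println(describe("http://alice@example.com:8080/docs/a.html?lang=en"));
        System.out.println(describe("mailto:alice@example.com"));
    }
}
```

When the port is left out, getPort() returns -1 rather than the scheme's default port; applying the default is left to whatever handles the scheme.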

4. Normalization, resolution, and relativization

Normalization: removes the redundant current-level (.) and parent-level (..) segments from the path component. For example, z/../y normalizes to y.

Resolution: resolves one URI against another, the base URI, to produce a new normalized URI. For example, resolving the relative URI z/../y against a base URI gives a URI whose path ends in y.

Relativization: the inverse of resolution. Given a base URI and a URI located beneath it, it returns the relative URI (for example z) that resolves back to the original.
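A minimal sketch of the three operations using java.net.URI. The base address http://example.com/a/ is a placeholder chosen for illustration, and the helper names are hypothetical.

```java
import java.net.URI;

public class UriOps {
    static String normalize(String uri) {
        return URI.create(uri).normalize().toString();
    }

    /** Resolves 'relative' against 'base'. */
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    /** Expresses 'target' relative to 'base'. */
    static String relativize(String base, String target) {
        return URI.create(base).relativize(URI.create(target)).toString();
    }

    public static void main(String[] args) {
        String base = "http://example.com/a/"; // placeholder base URI
        System.out.println(normalize("z/../y"));                            // y
        System.out.println(resolve(base, "z/../y"));                        // http://example.com/a/y
        System.out.println(relativize(base, "http://example.com/a/y"));     // y
    }
}
```

If the target does not lie beneath the base (different scheme, authority, or path prefix), relativize simply returns the target unchanged.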

At this point you may be thinking: isn't this just an ordinary website address? Why do people call a website address a URL rather than a URI?

Tim Berners-Lee, the inventor of the World Wide Web, introduced three ways to identify, locate, and name Internet resources: URI, URL, and URN. The three are related: URI is the top-level category, and URL and URN are subcategories beneath it.

    • URI: Uniform Resource Identifier;
    • URL: Uniform Resource Locator;
    • URN: Uniform Resource Name.

Every URL and every URN is a URI, but a URI is not necessarily a URL or a URN.

A URI is only the name of a resource. Knowing the URI tells us that a resource with that name exists, but gives no clue about how to obtain or interact with it (it cannot by itself locate, read, or write the resource), and nothing specifies whether the name is permanent or temporary. Hence the two subsets, URL and URN.

URLs and URNs first inherit every component of the URI format and then extend it.


URL = URI (restricted to the subset of URIs whose scheme component names a known network protocol) + a protocol handler (URL Protocol Handler) matching the network protocol identified by the scheme component

1. The scheme component of a URI is called the protocol component in a URL; common values are http, https, ftp, file, data, jar, and so on.

2. A URL Protocol Handler is the mechanism for locating, reading, and writing a resource: it implements the protocol's rules for locating the resource and communicating with it.

For example: once Thunder (Xunlei) is installed, clicking an ed2k link automatically opens Thunder's download window. Why?

The ed2k link is a resource, ed2k is the protocol component of the resource's URL, and Thunder is the URL Protocol Handler. The mapping between protocol components and URL Protocol Handlers is stored in the registry on Windows, and in .desktop files under /usr/share/applications/ on Ubuntu.

On Windows 7

①. Press the Windows key + R to open the Run dialog, then enter regedit to open the Registry Editor;

②. Go to the HKEY_CURRENT_USER\Software\Classes key;

③. The ed2k key contains the shell\open\command subkeys. In the right-hand pane, the ed2k key has a REG_SZ value named URL Protocol, which marks it as a URL protocol entry (the handler still works without this value);

④. Select the command subkey. The right-hand pane shows a REG_SZ value whose data is "C:\Program Files (x86)\Thunder Network\Thunder\Program\ThunderNewTask.exe" "%1". This means ThunderNewTask.exe is the URL Protocol Handler, and %1 is the URL passed to the handler for processing;

⑤. The ed2k key here is missing a DefaultIcon subkey, which would hold a REG_SZ value specifying the icon for files of this protocol type.

On Ubuntu

Add the following to a .desktop file under /usr/share/applications/:

    [Desktop Entry]
    Encoding=UTF-8
    Version=1.0
    Type=Application
    Terminal=false
    Exec=/usr/bin/cloudjerun -c %u
    Name=tunesview
    Comment=Small, easy-to-use program to access iTunesU media
    Icon=/usr/share/icons/hicolor/scalable/apps/tunesview.svg
    Categories=Application;Network;
    MimeType=x-scheme-handler/cloudje;

Placeholders for the Exec key value:

    Add...  Accepts...
    %f      a single filename.
    %F      multiple filenames.
    %u      a single URL.
    %U      multiple URLs.
    %d      in conjunction with %f to locate a file.
    %D      in conjunction with %F to locate files.
    %n      a single filename without a path.
    %N      multiple filenames without paths.
    %k      a URI or local filename of the location of the desktop file.
    %v      the name of the Device entry.

3. A URL is tied to the address of the resource; when the resource's location changes, the URL must be changed as well.


URN = URI (restricted to the subset of URIs whose scheme component names a known network protocol) + a protocol handler (URL Protocol Handler) matching the network protocol identified by the scheme component + persistence/address independence

A URN persistently identifies an Internet resource, even when the resource no longer exists or is unavailable. When the resource's location changes, the persistence strategy in use absorbs the change, so the URN itself does not need to be modified (address independence). A persistence strategy can also map one URN to n URIs, as magnet links in BitTorrent do (Magnet URI scheme).

For example: magnet:?xt=urn:btih:4d9fa761d69964b00df0b3b0c9c1f968ea6c47d0&xt=urn:ed2k:7655dbacff9395e579c4c9cb49cbec0e&dn=bbb_sunflower_2160p_30fps_stereo_abl.mp4
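As a rough illustration, the URNs embedded in the magnet link above can be extracted with java.net.URI. This is only a sketch: the class and method names are hypothetical, and the parameter splitting is deliberately naive (no percent-decoding).

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class MagnetUrns {
    /** Collects the values of all "xt" parameters of a magnet link. */
    static List<String> exactTopics(String magnet) {
        URI uri = URI.create(magnet);
        // "magnet:?..." does not start with a slash after the scheme, so the URI
        // is opaque and getQuery() is null; read the scheme-specific-part instead.
        String ssp = uri.getSchemeSpecificPart();
        String params = ssp.startsWith("?") ? ssp.substring(1) : ssp;
        List<String> urns = new ArrayList<>();
        for (String pair : params.split("&")) {
            if (pair.startsWith("xt=")) {
                urns.add(pair.substring("xt=".length()));
            }
        }
        return urns;
    }

    public static void main(String[] args) {
        String magnet = "magnet:?xt=urn:btih:4d9fa761d69964b00df0b3b0c9c1f968ea6c47d0"
                + "&xt=urn:ed2k:7655dbacff9395e579c4c9cb49cbec0e"
                + "&dn=bbb_sunflower_2160p_30fps_stereo_abl.mp4";
        for (String urn : exactTopics(magnet)) {
            URI parsed = URI.create(urn);
            // Each URN is itself an opaque URI whose scheme is "urn".
            System.out.println(parsed.getScheme() + " -> " + parsed.getSchemeSpecificPart());
        }
    }
}
```

Both URNs name the same content by hash rather than by location, which is what lets one magnet link correspond to many download sources.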

It is time to sum up the relationships and differences between URI, URL, and URN!

1. The URI is the foundation; both URLs and URNs are URIs;

2. URL = URI (restricted to the subset of URIs whose scheme component names a known network protocol) + a protocol handler (URL Protocol Handler) matching the network protocol identified by the scheme component;

3. The defining trait of a URN is persistence; address independence is achieved through a specific persistence strategy. URN = URI (restricted to the subset of URIs whose scheme component names a known network protocol) + a protocol handler (URL Protocol Handler) matching the network protocol identified by the scheme component + persistence/address independence.

IV. The java.net.URI and java.net.URL classes

Java provides two classes, java.net.URI and java.net.URL, for working with URIs and URLs respectively.

java.net.URI mainly provides the following features:

1. Verifying the URI format

The constructor URI(String str) throws the checked URISyntaxException if the string is malformed. URI.create(String str) throws the unchecked IllegalArgumentException if the string is malformed.
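The difference between the two entry points can be seen in a short sketch. The sample strings and the class name UriValidation are made up for illustration; a raw space is used as the illegal character.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriValidation {
    /** Validates via the constructor, which reports problems as a checked exception. */
    static boolean isValid(String text) {
        try {
            new URI(text);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    /** Returns true if URI.create rejects the text with an unchecked exception. */
    static boolean createRejects(String text) {
        try {
            URI.create(text);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("http://example.com/a%20b"));     // true
        System.out.println(isValid("http://example.com/a b"));       // false: raw space is illegal
        System.out.println(createRejects("http://example.com/a b")); // true
    }
}
```

URI.create is convenient for constants known to be well-formed, while the constructor suits untrusted input because the compiler forces the error to be handled.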

2. Extracting the URI components

getAuthority(), getFragment(), getHost(), getPath(), getPort(), getQuery(), getScheme(), getSchemeSpecificPart(), getUserInfo()

3. Normalization, resolution, and relativization

normalize() returns a new, normalized URI object; for example, 'x/y/../z/./q' becomes 'x/z/q'.

resolve(String/URI uri) performs resolution: the argument is taken as the relative URI and the object on which resolve is called as the base URI, producing a new normalized URI object.

relativize(URI uri) performs relativization and returns the relative URI.

Example:

    URI uriBase = new URI("");
    URI uriRelative = new URI("x/../y");
    URI uriResolve = uriBase.resolve(uriRelative); //
    URI uriRelativized = uriBase.relativize(uriResolve); // y

4. Converting a URI to a URL

URI#toURL() converts the URI to a URL.

Note: the URI class contains no operations for locating, reading, or writing resources.
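The conversion fails in two distinct ways, which the sketch below separates. The sample URIs and the class name UriToUrl are assumptions for illustration; the urn case relies on a default JDK having no protocol handler registered for "urn".

```java
import java.net.MalformedURLException;
import java.net.URI;

public class UriToUrl {
    /** Tries to convert a URI to a URL and reports which outcome occurred. */
    static String convert(String text) {
        URI uri = URI.create(text);
        try {
            return "URL: " + uri.toURL();
        } catch (IllegalArgumentException e) {
            // toURL() requires an absolute URI, i.e. one with a scheme.
            return "not absolute";
        } catch (MalformedURLException e) {
            // No URL Protocol Handler is available for the scheme.
            return "no protocol handler";
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("http://example.com/index.html"));
        System.out.println(convert("docs/index.html"));
        System.out.println(convert("urn:isbn:0451450523"));
    }
}
```

The last case shows the URI/URL split in practice: the URN is a perfectly valid URI, yet it cannot become a URL because nothing knows how to locate it.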

java.net.URL mainly provides the following features:

The URL class processes URL strings through URL Protocol Handlers, and throws MalformedURLException if no handler exists for the protocol.

Built-in URL Protocol Handlers are provided for the http, https, ftp, file, and jar protocols. For any other protocol, developers must write a handler themselves by extending URLStreamHandler. The lookup proceeds as follows:

1. Check the handler cache (Hashtable handlers) and return the cached entry if there is one;

2. If the cache has no entry, check whether a URLStreamHandlerFactory instance has been set and, if so, call its createURLStreamHandler(String protocol). By default the URLStreamHandlerFactory instance is null;

3. If step 2 yields null, read the list of package names separated by | from the system property java.protocol.handler.pkgs, and check each package in turn for a class <package>.<protocol>.Handler that extends URLStreamHandler. Return it if found, otherwise continue with the next package;

4. If the traversal in step 3 fails, check for a built-in class <system default package>.<protocol>.Handler that extends URLStreamHandler;

5. If all of the above fail, throw MalformedURLException.
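To make the handler idea concrete, here is a minimal sketch of a custom URLStreamHandler for an invented "demo" scheme whose "resource" is simply the text after "demo:". The scheme, class names, and behavior are all hypothetical. The handler is passed directly to the URL constructor, which bypasses the lookup steps above; that constructor is deprecated in recent JDK releases but still available.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

public class DemoHandlerExample {
    /** Hypothetical handler: serves the URL's path back as the resource content. */
    static class DemoHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) {
            return new URLConnection(u) {
                @Override
                public void connect() {
                    connected = true; // nothing to connect to in this toy protocol
                }

                @Override
                public InputStream getInputStream() {
                    byte[] body = getURL().getPath().getBytes(StandardCharsets.UTF_8);
                    return new ByteArrayInputStream(body);
                }
            };
        }
    }

    /** Opens the given "demo:" URL with the custom handler and reads it fully. */
    @SuppressWarnings("deprecation")
    static String read(String spec) throws IOException {
        URL url = new URL(null, spec, new DemoHandler());
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(read("demo:hello"));
    }
}
```

To have plain new URL("demo:hello") find such a handler through the lookup, one would instead install a URLStreamHandlerFactory or place a class named Handler in a package listed in java.protocol.handler.pkgs.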

Besides methods for getting the components, the URL class provides methods for reading and writing the resource, such as InputStream openStream(). Below we read the contents of the text file t.txt through the URL class.

    import java.io.FileReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.MalformedURLException;
    import java.net.URL;

    class Main {
        public static void main(String[] args) throws IOException, MalformedURLException {
            String path1 = "D:\\t.txt", path2 = "file:/D:/t.txt";
            Main main = new Main();
            main.readByFR(path1);
            main.readByURL(path2);
        }

        // Read the file with FileReader.
        void readByFR(String path) throws IOException {
            FileReader fr = new FileReader(path);
            try {
                int buf;
                // read() returns -1 at end of stream.
                while (-1 != (buf = fr.read())) {
                    System.out.print((char) buf);
                }
            } finally {
                fr.close();
            }
        }

        // Read the file through a URL.
        void readByURL(String path) throws MalformedURLException, IOException {
            URL url = new URL(path);
            InputStreamReader reader = new InputStreamReader(url.openStream());
            try {
                int buf;
                while (-1 != (buf = reader.read())) {
                    System.out.print((char) buf);
                }
            } finally {
                reader.close();
            }
        }
    }

V. Summary

If you find any mistakes in the above, please point them out. Thank you!

Please respect the original work; when reprinting, credit the source: ^_^ Fat Boy John
