URLs are the standardized names for Internet resources. A URL points to a piece of electronic information, telling you where it is located and how to interact with it. These notes cover:
- URL syntax, and what the various URL components mean and do.
- The URL shortcuts that many web clients support, including relative URLs and auto-expanding URLs.
- URL encoding and character rules.
- Common URL schemes that support a variety of Internet information systems.
- The future of URLs, including URNs, a framework that keeps a stable access name for an object even when it moves from one location to another.
2.1 Navigating Internet Resources
A URL is the resource location a browser needs in order to find information. With URLs, people and applications can find, use, and share the huge number of data resources on the Internet.
URIs are a more general class of resource identifier, made up of URLs and URNs. URLs identify resources by describing where they are located, whereas URNs identify resources by name, regardless of where they currently reside.
URLs are the subset of URIs that HTTP applications deal with. A URL such as http://www.joes-hardware.com/tools.html is divided into three parts:
- The first part (http) is the URL scheme. The scheme tells a web client how to access the resource.
- The second part (www.joes-hardware.com) is the server location. It tells the web client where the resource is hosted.
- The third part (/tools.html) is the resource path. It says which particular local resource on the server is being requested.
URLs can also access resources through protocols other than HTTP. They can point to any resource on the Internet, such as a person's e-mail account:
mailto:[email protected]
or to files available through other protocols, such as the File Transfer Protocol (FTP):
ftp://ftp.lots-o-books.com/pub/complete-price-list.xls
or to a movie hosted on a streaming video server:
rtsp://www.joes-hardware.com:554/interview/cto_video
URLs provide a uniform way of naming resources.
A URL gives users and their browsers everything they need to find a piece of information: it identifies the particular resource wanted, where it is located, and how to get it.
2.2 Syntax for URLs
URLs provide a means of locating any resource on the Internet, but those resources can be accessed in different ways depending on the scheme.
Most URLs follow a general syntax, and the URL syntax of most schemes is based on this nine-part general format:
<scheme>://<user>:<password>@<host>:<port>/<path>;<params>?<query>#<frag>
The nine components are scheme, user, password, host, port, path, params, query, and frag (fragment).
Almost no URL contains all of these components. The three most important parts of a URL are the scheme, the host, and the path.
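As a rough illustration of the nine-part format, the sketch below splits a URL with Python's standard `urllib.parse.urlparse`. The URL itself (including the user `joe`, the password `secret`, and port 8080) is made up for this example so that every component is present at once.

```python
from urllib.parse import urlparse

# Hypothetical URL that exercises all nine components at once.
url = "http://joe:secret@www.joes-hardware.com:8080/tools/index.html;type=d?item=12731#drills"

parts = urlparse(url)

# Map the nine component names used in these notes onto the parsed result.
components = {
    "scheme": parts.scheme,
    "user": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,        # an int, or None when the URL has no port
    "path": parts.path,
    "params": parts.params,    # urlparse only splits params off the last path segment
    "query": parts.query,
    "frag": parts.fragment,
}

for name, value in components.items():
    print(f"{name:>8}: {value}")
```

Most real URLs would leave several of these fields empty or `None`.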
2.2.1 Scheme: what protocol to use
The scheme is the primary identifier of how to access a given resource: it tells the application interpreting the URL which protocol to use.
Syntax: the scheme component must start with an alphabetic character, and it is separated from the rest of the URL by the first ':' character. Scheme names are case-insensitive.
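The scheme rule can be sketched with a small regular expression. The helper name `extract_scheme` is hypothetical; the character set after the first letter (letters, digits, "+", "-", ".") follows the generic URI grammar in RFC 3986, which goes slightly beyond what these notes state.

```python
import re

# A letter first, then letters, digits, "+", "-", or ".", ended by the first ":".
SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")

def extract_scheme(url):
    """Return the lowercased scheme, or None if the URL has no valid scheme."""
    match = SCHEME_RE.match(url)
    # Scheme names are case-insensitive, so normalize to lowercase.
    return match.group(1).lower() if match else None

print(extract_scheme("HTTP://www.joes-hardware.com/"))   # http
print(extract_scheme("//www.joes-hardware.com/tools"))   # None
```

Lowercasing the result is one simple way to honor case-insensitivity when comparing schemes.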
2.2.2 Host and Port
The host and port components of a URL identify the machine that hosts the resource and the specific server on that machine that can serve it. The host component identifies the host machine on the Internet that has access to the resource; it can be given either as a hostname or as an IP address. The port component identifies the network port on which the server is listening. For HTTP, which runs over TCP, the default port is 80.
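A minimal sketch of how a client might apply the default port when a URL omits it. The `DEFAULT_PORTS` table and the `host_and_port` helper are illustrative names, and the IP address used below is a documentation-only placeholder.

```python
from urllib.parse import urlsplit

# A small illustrative table of well-known default ports (not exhaustive).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "rtsp": 554}

def host_and_port(url):
    """Return (host, port), falling back to the scheme's default port."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return parts.hostname, port

print(host_and_port("http://www.joes-hardware.com/index.html"))
print(host_and_port("http://192.0.2.10:8080/index.html"))
```

The first call has no explicit port, so the HTTP default of 80 is used; the second keeps its explicit port 8080.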
2.2.3 User name and password
Many servers require a username and password before they allow users to access data; the user and password components supply them. FTP servers are a common example:
ftp://anonymous:[email protected]/pub/gnu
A ':' separates the user from the password, and the '@' character separates the user and password components from the rest of the URL.
2.2.4 Path
The path component of a URL specifies where on the server the resource lives. The path often resembles a hierarchical filesystem path. The path component of an HTTP URL can be divided into path segments separated by '/' characters, and each path segment can have its own params component.
2.2.5 Parameters
To give applications the input parameters they need to interact with the server properly, URLs have a params component. This component is a list of name/value pairs, separated from the rest of the URL (and from one another) by ';' characters:
http://www.joes-hardware.com/hammers;sale=false/index.html;graphics=true
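Python's `urlparse` only separates params from the last path segment, so the sketch below splits every segment by hand, assuming the standard ';' separator. The helper name `split_path_params` is hypothetical.

```python
from urllib.parse import urlsplit

def split_path_params(url):
    """Return a list of (segment, params_dict) tuples, one per path segment."""
    result = []
    # The path starts with "/", so the first split item is empty and is skipped.
    for segment in urlsplit(url).path.split("/")[1:]:
        name, *raw_params = segment.split(";")
        # Each param is name=value; a bare name gets an empty value.
        params = dict(p.split("=", 1) if "=" in p else (p, "") for p in raw_params)
        result.append((name, params))
    return result

url = "http://www.joes-hardware.com/hammers;sale=false/index.html;graphics=true"
for segment, params in split_path_params(url):
    print(segment, params)
```

For this URL, the `hammers` segment carries `sale=false` and the `index.html` segment carries `graphics=true`.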
2.2.6 Query string
Many resources, such as database services, can be asked questions, or queried, to narrow down the type of resource being requested:
http://www.joes-hardware.com/inventory-check.cgi?item=12731
Everything to the right of the '?' is the query component. It is passed along to the gateway resource, with the path component of the URL identifying that gateway resource.
There is no required format for the query component. By convention, many gateways expect the query string to be formatted as a series of name=value pairs separated by '&' characters:
http://www.joes-hardware.com/inventory-check.cgi?item=12731&color=blue
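The name=value convention can be read and written with the standard library. This is only a sketch of the common convention; a gateway is free to expect some other query format.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

url = "http://www.joes-hardware.com/inventory-check.cgi?item=12731&color=blue"

# Pull out the query component and decode it into name -> list of values.
query = urlsplit(url).query
pairs = parse_qs(query)
print(pairs)

# Going the other way: build a query string from name/value pairs.
rebuilt = urlencode({"item": 12731, "color": "blue"})
print(rebuilt)
```

`parse_qs` maps each name to a list because a name may legitimately appear more than once in a query string.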
2.2.7 Fragment
Some resources can be divided into smaller pieces. To reference part of a resource, URLs support a fragment (frag) component that identifies a piece within the resource:
http://www.joes-hardware.com/tools.html#drills
HTTP servers generally deal only with entire objects, not with fragments of objects, so clients do not pass fragments along to servers. After the browser gets the entire resource from the server, it uses the fragment to display the part of the resource you are interested in.
(This works much like an anchor in HTML.)
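A small sketch of what a client might do with a fragment before making a request: `urllib.parse.urldefrag` separates the part that is requested from the server from the part that stays on the client side.

```python
from urllib.parse import urldefrag

url = "http://www.joes-hardware.com/tools.html#drills"

# request_url is what the client would ask the server for;
# fragment is kept locally to locate the section after the download.
request_url, fragment = urldefrag(url)

print(request_url)
print(fragment)
```

Here the request would be for `tools.html` as a whole, and `drills` would only be used afterwards to scroll to the right place.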
2.3 URL Shortcuts
Web clients understand and use a few URL shortcuts. Relative URLs are a convenient shorthand for specifying a resource from within another resource. Many browsers also support 'automatic expansion' of URLs: the user types in a key part of a URL, and the browser fills in the rest.
2.3.1 Relative URLs
URLs come in two flavors: absolute and relative. An absolute URL contains all the information needed to access a resource. A relative URL is incomplete; to get all the information needed to access the resource, it must be interpreted relative to another URL, called its base.
A relative URL is just a fragment, or small piece, of a URL. Applications that process URLs need to be able to convert between relative and absolute URLs. Relative URLs also provide a convenient way to keep a set of resources portable. If you use relative URLs, links keep working when you move a set of documents, because they are interpreted relative to the new base. This makes it possible, for example, to serve mirrored content from other servers.
- 1. Base URLs
The first step in the conversion process is to find the base URL, which serves as the point of reference for the relative URL. It can come from one of the following places:
• Explicitly provided in the resource (for example, an HTML <BASE> tag)
• The URL of the encapsulating resource
• No base URL
- 2. Resolving relative references
To convert a relative URL into an absolute one, the next step is to break both the relative URL and the base URL into their component pieces.
In effect, this is just parsing the URL; splitting a URL into its components is often called decomposing the URL. Once the base and relative URLs have been decomposed, an algorithm can be applied to complete the conversion.
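Python's `urllib.parse.urljoin` implements this kind of resolution, so it makes a convenient way to see the conversion in action. The base URL and the relative references below are made-up examples.

```python
from urllib.parse import urljoin

# Hypothetical base URL: the document in which the relative links appear.
base = "http://www.joes-hardware.com/tools/index.html"

relative_refs = ["hammers.html", "../hammers.html", "/images/logo.gif", "#drills"]

# Resolve each relative reference against the base URL.
resolved = {ref: urljoin(base, ref) for ref in relative_refs}

for ref, absolute in resolved.items():
    print(f"{ref:>18} -> {absolute}")
```

A sibling file stays in `/tools/`, `..` climbs one directory, a leading `/` restarts from the server root, and a bare fragment keeps the base document.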
2.3.2 Automatic URL expansion
Some browsers try to expand URLs automatically, either after the user submits the URL or while the user is still typing. There are two forms of this:
- Hostname expansion
Example: if you type baidu, the browser automatically inserts www. and .com into the hostname, giving www.baidu.com.
- History expansion
The browser stores a history of the URLs the user has visited, and as you type the start of a previously visited URL it offers the complete URL for selection.
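Real browsers use their own, more elaborate heuristics, but the two kinds of expansion can be sketched roughly as follows. Both function names, the `http://` default, and the sample history list are assumptions made for illustration only.

```python
def expand_hostname(text):
    """Expand a bare word such as 'baidu' into a full URL (hypothetical heuristic)."""
    text = text.strip()
    if "://" in text:
        return text                   # already a full URL, leave it alone
    if "." not in text and "/" not in text:
        text = f"www.{text}.com"      # bare word: wrap it with www. and .com
    return f"http://{text}"           # assume HTTP when no scheme was typed

def expand_from_history(prefix, history):
    """Return previously visited URLs that start with what the user has typed."""
    return [url for url in history if url.startswith(prefix)]

history = ["http://www.joes-hardware.com/tools.html", "http://www.baidu.com/"]
print(expand_hostname("baidu"))
print(expand_from_history("http://www.j", history))
```

A prefix match is the simplest possible form of history expansion; ranking by visit frequency or recency would be a natural refinement.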
2.4 Special Characters
URLs were designed to be portable. They are meant to uniformly name all the resources on the Internet, so it is important that URLs be designed so that they can be transmitted safely through any Internet protocol.
2.4.1 URL Character Set
Default computer system character sets often have an English-centric bias. Historically, many computer applications have used the US-ASCII character set. US-ASCII uses 7 bits to represent most of the keys found on an English typewriter, plus a few nonprinting control characters for text formatting and hardware signaling.
The designers of URLs therefore built in escape sequences. Escape sequences allow arbitrary character values or data to be encoded using a restricted subset of the US-ASCII character set, which gives both portability and completeness.
2.4.2 Encoding mechanism
To get around the limitations of a safe character set representation, an encoding scheme was devised to represent unsafe characters in a URL. The encoding represents an unsafe character with an 'escape' notation: a percent sign (%) followed by two hexadecimal digits that give the ASCII code of the character.
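The escape notation can be tried out with `urllib.parse.quote` and `unquote`. The path below is a made-up example containing spaces and an ampersand; the manual formatting line simply spells out the "percent sign plus two hex digits" rule for a single character.

```python
from urllib.parse import quote, unquote

# Escaping one character by hand: "%" followed by its ASCII code in two hex digits.
space_escape = "%{:02X}".format(ord(" "))
print(space_escape)

# Hypothetical path containing characters that are unsafe in a URL.
raw_path = "/tools/power drills & bits.html"

encoded = quote(raw_path)     # "/" is left alone by default
decoded = unquote(encoded)

print(encoded)
print(decoded)
```

Decoding the escaped path gives back the original string, which is what makes the encoding safe to apply before transmission.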
2.4.3 Character restrictions
Several characters are reserved because they have special meaning inside a URL. Others are not in the defined US-ASCII printable set. Still others are known to confuse some Internet gateways and protocols, so their use is discouraged.
2.4.4 One more note
Applications should behave consistently. Before a client application sends a URL to any other application, it is best for it to convert any unsafe or restricted characters. Once all the unsafe characters have been encoded, the URL is in a canonical form that can be shared between applications, with no need to worry about another application being confused by a character's special meaning.
2.5 Common Schemes
- http
- https (uses Netscape's SSL, which provides end-to-end encryption of HTTP connections; default port: 443)
- mailto
- ftp
- rtsp, rtspu
- file
- news
- telnet
2.6 Future Prospects
URLs are a powerful tool. They can name all existing objects, they can easily accommodate new formats, and they provide a uniform naming mechanism that can be shared among the various Internet protocols.
But URLs are not perfect: they represent an actual address rather than a true name.
A URN, by contrast, provides a stable name for an object no matter where the object moves.
Persistent uniform resource locators (PURLs) are an example of how URN functionality can be achieved using URLs.
HTTP Learning 1-2: Chapter 2, URLs and Resources