Most people are familiar with the term URL; the other two terms are less familiar. URIs, URLs, and URNs are standard ways to identify, locate, and name resources on the Internet. In 1989 Tim Berners-Lee invented the World Wide Web (WWW). The WWW is regarded as a globally interconnected collection of real and abstract resources that provides information entities on demand and is accessed over the Internet. Real resources range from files to people, and abstract resources include database queries.
Because a resource can be identified in a variety of ways (several people may share the same name, whereas a computer file can only be accessed through a unique combination of path and name), a standard way to identify WWW resources is required. To meet this need, Tim Berners-Lee introduced standard ways of identifying, locating, and naming resources: URIs, URLs, and URNs.
- URI: Uniform Resource Identifier;
- URL: Uniform Resource Locator;
- URN: Uniform Resource Name.
URIs, URLs, and URNs are related to one another. The URI category sits at the top of the hierarchy, with the URL and URN categories below it. This arrangement shows that both URLs and URNs are sub-categories of URIs.
Of the three, URLs and URIs are particularly easy to confuse.
URLs are strings used on the Internet to describe information resources, mainly in WWW client and server programs. A URL describes many kinds of information resources in a unified format, including files, server addresses, and directories.
The format of the URL consists of the following three parts:
- Protocol (or service mode);
- The name or IP address of the host that holds the resource (sometimes including the port number);
- The specific address of the resource on the host, such as directories and file names.
The first and second parts are separated by "://", and the second and third parts are separated by "/". The first and second parts are required; the third part can sometimes be omitted.
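The three parts can be pulled apart with Python's standard urllib.parse module. This is only a small sketch: the helper name split_url and the example.com address are made up for illustration.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Return the (protocol, host, port, path) parts of a URL."""
    parts = urlsplit(url)
    # port is None when the URL does not name one; path is "" when omitted
    return parts.scheme, parts.hostname, parts.port, parts.path

# Hypothetical URL used only for illustration.
protocol, host, port, path = split_url("http://www.example.com:8080/docs/index.html")
# protocol -> "http", host -> "www.example.com", port -> 8080, path -> "/docs/index.html"
```

When the third part is left out, as in http://www.example.com, the path comes back as an empty string and the port as None.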
At present, the biggest disadvantage of URLs is that when the storage location of an information resource changes, the URL must change accordingly. For this reason, people are looking into new ways of identifying information resources.
A URI is a simple string that identifies a resource in a uniform (standardized) way. It typically consists of three parts:
- The naming mechanism used to access the resource.
- The name of the host that stores the resource.
- The name of the resource itself, represented by the path.
Typically, this string begins with a scheme and has the following syntax:
[scheme:]scheme-specific-part
For example, in http://www.google.com, http is the scheme and //www.google.com is the scheme-specific part. The scheme is separated from the scheme-specific part by a colon.
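The split at the first colon can be sketched in a few lines of Python. The function name split_scheme is invented for this example. The check that a scheme starts with a letter and then contains only letters, digits, "+", "-", or "." follows the generic URI syntax rather than anything stated in this article.

```python
import re

# A scheme is a letter followed by letters, digits, "+", "-", or ".".
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*$")

def split_scheme(uri):
    """Split a URI into (scheme, scheme_specific_part); scheme is None if absent."""
    scheme, colon, rest = uri.partition(":")
    if colon and SCHEME_RE.match(scheme):
        return scheme, rest
    # No valid scheme in front of a colon: treat the whole string as relative.
    return None, uri

print(split_scheme("http://www.google.com"))  # ('http', '//www.google.com')
print(split_scheme("../images/logo.png"))     # (None, '../images/logo.png')
```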
Some URIs point to a location inside a resource. Such a URI ends with "#" followed by an anchor identifier (called a fragment identifier).
A relative URI does not contain any naming scheme information. Its path usually refers to a resource on the same machine. Relative URIs may contain relative path segments (for example, ".." refers to the parent level of the path) and can also contain fragment identifiers.
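To see how a relative URI with ".." and a fragment is interpreted, here is a short sketch using urljoin and urldefrag from Python's standard library. The base address and file names are hypothetical.

```python
from urllib.parse import urljoin, urldefrag

# Hypothetical base URI: the document in which the relative URI appears.
base = "http://www.example.com/docs/guide/intro.html"

# ".." climbs one level above /docs/guide/; "#top" is the fragment.
resolved = urljoin(base, "../images/logo.png#top")
# resolved -> "http://www.example.com/docs/images/logo.png#top"

# Separate the fragment from the rest of the URI.
uri_without_fragment, fragment = urldefrag(resolved)
# fragment -> "top"
```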
Common problems with URIs
- Difficult to type: the URI is unnecessarily verbose.
- Confusing use of capital letters.
- Uncommon punctuation.
- Difficult to show on paper: some characters are not easily recognizable when printed.
- Host and port issues: in addition to the scheme-specific part, the domain and port can also confuse users.
Rules to follow when designing URIs (see the previous article: Good URIs do not change)
URIs are part of the Web site UI, so a usable site should meet these URI requirements:
- A simple, easy-to-remember domain name
- Short URIs
- Easy-to-type URIs
- URIs that reflect the structure of the site
- URIs that users can guess and "hack" (and users are encouraged to do so)
- Permanent links: cool URIs don't change
Choose URIs wisely
Keep URIs short: so that a URI is easy to type, write down, spell, and remember, it should be as short as possible. According to the reference material, a URI should preferably be no longer than 80 bytes (this is not a technical limit but a figure drawn from experience and statistics), including the scheme, host, port, and so on.
Case policy: the case policy of a URI should be consistent, either all lowercase or with the first letter capitalized. Confusing combinations of upper and lower case should be avoided. In the UNIX world, file paths are case-sensitive, while in the Windows world they are not.
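Both recommendations are easy to check automatically. The sketch below is one possible reading of them: the function name uri_warnings and the exact interpretation of "first letter capitalized" (applied per path segment) are assumptions, not part of the original guidelines.

```python
from urllib.parse import urlsplit

MAX_URI_BYTES = 80  # guideline from the text, not a technical limit

def uri_warnings(uri):
    """Return a list of style warnings for a URI (empty if none apply)."""
    warnings = []
    if len(uri.encode("utf-8")) > MAX_URI_BYTES:
        warnings.append("longer than 80 bytes")
    for segment in urlsplit(uri).path.split("/"):
        all_lower = segment == segment.lower()
        capitalized = segment == segment[:1].upper() + segment[1:].lower()
        # Assumption: a segment is fine if it is all lowercase or only
        # its first letter is uppercase; anything else counts as mixed case.
        if not (all_lower or capitalized):
            warnings.append(f"mixed case in path segment {segment!r}")
    return warnings

print(uri_warnings("http://www.example.com/docs/intro"))  # []
```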
Allow URI management
URI mapping: administrators should be able to reorganize the file system structure on the server without altering URIs. This requires a mapping mechanism between URIs and the real server file system structure, rather than a rigid one-to-one correspondence. This mapping mechanism can be implemented by the following technical means:
- Aliases: directory aliases on Apache, virtual directories on IIS
- Symbolic links: symbolic links in the Unix world
- Mapping tables or databases: the correspondence between URIs and the file system structure is stored in a database.
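The table-based approach can be sketched with a plain dictionary standing in for the database. All paths and names below are invented for illustration; a real site would load the table from its own storage.

```python
# Hypothetical mapping table: public URI path -> file on the server.
URI_TO_FILE = {
    "/about": "/srv/www/site/company/about.html",
    "/products": "/srv/www/site/catalog/index.html",
}

def resolve(uri_path, table=URI_TO_FILE):
    """Return the file that serves a URI path, or None if it is unmapped."""
    return table.get(uri_path)

# Reorganizing the file system only changes the table, never the URI.
URI_TO_FILE["/about"] = "/srv/www/site-v2/pages/about.html"
print(resolve("/about"))  # /srv/www/site-v2/pages/about.html
```

The point of the design is that visitors and bookmarks keep using /about while the file behind it moves freely.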
Standard redirects: an administrator can use HTTP redirect status codes to keep old URIs working after the server file system structure changes. The HTTP status codes that can be used are:
- 301 Moved Permanently ([RFC2616] section 10.3.2)
- 302 Found (undefined redirect scheme, [RFC2616] section 10.3.3)
- 307 Temporary Redirect ([RFC2616] section 10.3.8)
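As a rough illustration of choosing between these codes, the sketch below picks 301 for a permanent move and 307 for a temporary one, using the status constants from Python's standard http module. The table of moved paths and the function name redirect_for are hypothetical.

```python
from http import HTTPStatus

# Hypothetical table: old path -> (new path, is the move permanent?)
MOVED = {
    "/old/report.html": ("/reports/annual.html", True),
    "/today": ("/news/latest.html", False),
}

def redirect_for(path):
    """Return (status, headers) for a moved path, or None if it has not moved."""
    if path not in MOVED:
        return None
    target, permanent = MOVED[path]
    status = (HTTPStatus.MOVED_PERMANENTLY if permanent
              else HTTPStatus.TEMPORARY_REDIRECT)
    # The Location header tells the client where the resource now lives.
    return status, {"Location": target}

status, headers = redirect_for("/old/report.html")
print(int(status), status.phrase, headers["Location"])
```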
Use independent URIs
Technology-independent URIs
- When you provide a dynamic content service, you should use technology-independent URIs. That is, the URI does not expose the server-side scripting language, platform, or engine, so changes to these languages, platforms, and engines do not result in URI changes. Therefore, words such as servlet and cgi-bin should not appear in the URI.
- When you provide a static content service, file extensions should be hidden. The technologies available for this are content negotiation, proxies, and URI mapping.
Identity and session mechanisms
- Use a standard authentication mechanism instead of a specific URI per user
- Use the standard session mechanism instead of placing the session ID in the URI.
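One way to keep session IDs out of published links is to strip them before a URI is shown or stored. The sketch below is only an illustration: the parameter names sessionid and sid are assumptions, and the session itself would then be carried by a standard mechanism such as a cookie.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical names of query parameters that carry a session ID.
SESSION_PARAMS = {"sessionid", "sid"}

def strip_session_id(uri):
    """Return the URI with any session-ID query parameters removed."""
    parts = urlsplit(uri)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in SESSION_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(strip_session_id("http://www.example.com/cart?item=42&sessionid=abc123"))
# http://www.example.com/cart?item=42
```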
Use standard redirects when content changes
- Use standard redirects for content that has changed
- Use HTTP 410 for deleted resources
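The difference between a deleted resource and one that never existed can be sketched as follows. The sets of paths and the function name status_for are made up for the example; 410 Gone and 404 Not Found are the standard HTTP status codes.

```python
from http import HTTPStatus

# Hypothetical site state.
EXISTING = {"/", "/about"}
DELETED = {"/campaigns/summer-sale"}  # removed on purpose, will not return

def status_for(path):
    """Pick the status code for a request path."""
    if path in EXISTING:
        return HTTPStatus.OK
    if path in DELETED:
        # 410 tells clients and indexers the resource is gone for good.
        return HTTPStatus.GONE
    return HTTPStatus.NOT_FOUND

print(int(status_for("/campaigns/summer-sale")))  # 410
```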
Provide for indexing agents
Indexing policy
- Content-Location
- Content-MD5
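For reference, the value of a Content-MD5 header is the base64 encoding of the MD5 digest of the body. A minimal sketch with Python's standard library follows; the function name content_md5 and the sample body are made up for illustration.

```python
import base64
import hashlib

def content_md5(body: bytes) -> str:
    """Return the Content-MD5 header value for a response body."""
    digest = hashlib.md5(body).digest()  # 16 raw bytes, not the hex string
    return base64.b64encode(digest).decode("ascii")

# Hypothetical response body.
headers = {"Content-MD5": content_md5(b"<html>hello</html>")}
```

An indexing agent can compare this value across visits to detect whether the content has changed.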
Provide the appropriate cache information
- Cache-Related HTTP headers
- Caching policies
- Caching of generated content, HTTP HEAD and HTTP GET
Summary
- A URI is part of the Web UI and should be treated like a website logo and a company brand
- A URI is the only interface between a Web site and an ordinary user, and should be treated like your business phone number.
Read and remember the two sentences above, and the next time you design a URI you will give it due attention.
- The URI should be user-friendly
- The URI should be readable
- The URI should be predictable
- The URI should be uniform
Read and remember the four sentences above, and you will know what kind of URI to design.
The difference between a URI, a URL, and a URN