REST in Layman's Terms
You may or may not be aware that a debate is going on about the "right" way to implement heterogeneous application-to-application communication: while the current mainstream clearly focuses on Web services based on SOAP, WSDL, and the WS-* specifications, a small but very vocal minority claims that there is a better way: REST, short for REpresentational State Transfer. In this article, I will not take sides in that debate; instead, I will try to give a pragmatic introduction to REST and RESTful HTTP application integration. In my experience, some topics trigger a lot of discussion once people first encounter them, and I will go into those in more depth.
Key REST Principles
Most introductions to REST start with its formal definition and background. I will defer that for a while and offer a simple, pragmatic definition first: REST is a set of principles that define how Web standards such as HTTP and URIs are supposed to be used (which often differs quite a bit from what many people actually do). The promise is that if you adhere to REST principles while designing your application, you will end up with a system that exploits the Web's architecture to your benefit. In summary, the five key principles are:
- Give every "thing" an ID
- Link all things together
- Use standard methods
- Multiple representations of resources
- Communicate statelessly
Let's take a closer look at these principles.
Give every "thing" an ID
Here I use "things" instead of the more formal and accurate term "resources", because this is such a simple principle that it should not be buried in terminology. If you think about the systems people build, there is usually a set of key abstractions that merit being identified. Everything that should be identifiable should obviously get an ID, and on the Web there is a unified concept for IDs: the URI. URIs make up a global namespace, so using URIs to identify your key resources gives each of them a unique, global ID.
The main benefit of a consistent naming scheme for things is that you do not have to come up with your own scheme; you can rely on one that has already been defined, works pretty well on a global scale, and is understood by practically everybody. If you consider an arbitrary high-level object in the last application you built (assuming it was not built in a RESTful way), it is quite likely that there are many use cases where you would have profited from it having a unique ID. If your application included a Customer abstraction, for instance, I am reasonably sure that users would have liked to be able to email a link to a specific customer to a colleague, bookmark it in their browser, or even write it down on a piece of paper. To drive the point home: imagine what an awfully bad business decision it would be if an online store such as Amazon.com did not identify every one of its products with a unique ID (a URI).
When confronted with this idea, many people wonder whether it means they should expose their database entries (or their IDs) directly, and they are often appalled by the mere thought, since years of object-oriented practice have taught us to hide persistence aspects as implementation details. But this is not a conflict at all: usually the things that merit being identified with a URI (the resources) are far more abstract than a database entry. For example, an Order resource might be composed of order items, an address, and many other aspects that you might not want to expose as individually identifiable resources. Taking the idea of identifying everything that is worth identifying further leads to resources you usually do not see in a typical application design: a process or process step, a sale, a negotiation, a request for a quote are all examples of things that merit identification. This, in turn, can lead to more persistent entities than in a non-RESTful design.
Here are some examples of URIs you might think of:
http://example.com/customers/1234
http://example.com/orders/2007/10/776654
http://example.com/products/4554
http://example.com/processes/salary-increase-234
Since I have chosen to create human-readable URIs (a useful concept, even though it is not a prerequisite for a RESTful design), it should be quite easy to guess their meaning: each one obviously identifies an individual "item". But take a look at these:
http://example.com/orders/2007/11
http://example.com/products?color=green
At first, these look somewhat different; after all, they do not identify a single thing but a collection of things (assuming the first URI identifies all orders submitted in November 2007, and the second one the set of green products). But these collections are themselves things (resources), and they deserve to be identified as well.
Note that the benefits of a single, globally unified naming scheme apply both to the use of the Web in your browser and to machine-to-machine (M2M) communication.
To summarize the first principle: use URIs to identify everything that merits being identifiable, specifically all of the "high-level" resources your application provides, whether they represent individual items, collections of items, virtual or physical objects, or computation results.
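As a small illustration of this principle, here is a minimal Python sketch of how an application might mint URIs for single items and for collections. The function names are hypothetical and chosen only for this example; the host and the path layout are taken from the example URIs above.

```python
from urllib.parse import urlencode

BASE = "http://example.com"  # host used in the article's example URIs


def customer_uri(customer_id: int) -> str:
    """URI of a single item: one customer."""
    return f"{BASE}/customers/{customer_id}"


def orders_by_month_uri(year: int, month: int) -> str:
    """URI of a collection: all orders submitted in one month."""
    return f"{BASE}/orders/{year}/{month:02d}"


def products_by_color_uri(color: str) -> str:
    """URI of a collection: all products of a given color."""
    return f"{BASE}/products?{urlencode({'color': color})}"


print(customer_uri(1234))              # http://example.com/customers/1234
print(orders_by_month_uri(2007, 11))   # http://example.com/orders/2007/11
print(products_by_color_uri("green"))  # http://example.com/products?color=green
```

Note that the collection URIs are built exactly like the item URIs: a collection is just another resource with its own ID.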
Link all things together
The next principle has a somewhat scary formal description: "Hypermedia as the engine of application state", sometimes abbreviated as HATEOAS. (Seriously, I am not making this up.) At its core is the concept of hypermedia, or in other words: the idea of links. Links are something we are all familiar with from HTML, but they are in no way restricted to human consumption (people browsing the Web). Consider the following made-up XML fragment:
<order self="http://example.com/customers/1234">
  <amount>23</amount>
  <product ref="http://example.com/products/4554" />
  <customer ref="http://example.com/customers/1234" />
</order>
If you look at the product and customer links in this document, you can easily imagine how an application that has retrieved it can "follow" the links to retrieve more information. Of course, this would also be possible with a simple "id" attribute that adheres to some application-specific naming scheme, but only within that application's context. The beauty of using URIs for links is that they can point to resources provided by a different application, a different server, or even a server on another continent: because the naming scheme is a global standard, all of the resources that make up the Web can be linked to each other.
There is an even more important aspect to the hypermedia principle: the "state" part of the application. In short, the fact that the server (or service provider, if you prefer) provides a set of links to the client (the service consumer) enables the client to move the application from one state to the next by following a link. We will look at the effects of this in another article; for now, just keep in mind that links are an extremely useful way to make an application dynamic.
To summarize this principle: use links to refer to identifiable things (resources) wherever possible. Hyperlinking is what makes the Web the Web.
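To make "following links" concrete, here is a minimal Python sketch that parses a cleaned-up copy of the order fragment and collects the URIs a client could retrieve next. The helper name extract_links is an assumption made for this example, and the sketch only reads the links; it does not perform any HTTP requests.

```python
import xml.etree.ElementTree as ET

# Cleaned-up copy of the article's made-up order document.
ORDER_XML = """
<order self="http://example.com/customers/1234">
  <amount>23</amount>
  <product ref="http://example.com/products/4554" />
  <customer ref="http://example.com/customers/1234" />
</order>
"""


def extract_links(xml_text: str) -> dict:
    """Map each child element that carries a 'ref' attribute to its URI."""
    root = ET.fromstring(xml_text.strip())
    return {
        child.tag: child.get("ref").strip()
        for child in root
        if child.get("ref")
    }


for name, uri in extract_links(ORDER_XML).items():
    # A real client would now issue a GET for each URI it cares about.
    print(f"{name}: {uri}")
```

Because the links are plain URIs, the client needs no knowledge of how the order application names its products or customers internally.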
Use standard methods
There was an implicit assumption in the discussion of the first two principles: that the consuming application can actually do something meaningful with a URI. If you see a URI written on the side of a bus, you can enter it into your browser's address bar and hit return, but how does your browser know what to do with the URI?
It knows what to do because every resource supports the same interface, the same set of methods (or operations, if you prefer). HTTP calls these verbs, and in addition to the two everyone knows (GET and POST), the set of standard methods includes PUT, DELETE, HEAD, and OPTIONS. The meaning of these methods is defined in the HTTP specification, along with some guarantees about their behavior. If you are an OO developer, you can imagine that every resource in a RESTful HTTP scenario extends a class like this (in Java/C#-style pseudo-syntax, concentrating on the key methods):
class Resource {
    Resource(URI u);
    Response get();
    Response post(Request r);
    Response put(Request r);
    Response delete();
}
Because the same interface is used for every resource, you can rely on being able to retrieve a representation (that is, some rendering of the resource) using GET. Because GET's semantics are defined in the specification, you can be sure that you incur no obligations when you call it, which is why the method can be called "safely". GET supports very efficient and sophisticated caching, so in many cases you do not even have to send a request to the server. You can also be sure that GET is idempotent (repeating the same request has the same effect as issuing it once): if you issue a GET request and do not get a result, you may not know whether your request never reached its destination or the response got lost on its way back to you. The idempotence guarantee means you can simply issue the request again. Idempotence also holds for PUT (which basically means "update this resource with this data, or create it at this URI if it does not exist yet") and for DELETE (which you can simply try again and again until you get a result; deleting something that is not there is not a problem). The POST method, which usually means "create a new resource", can also be used to invoke arbitrary processing and is therefore neither safe nor idempotent.
If you expose your application's functionality (or your service's functionality, if you prefer) in a RESTful way, this principle and its restrictions apply to you as well. This is hard to accept if you are used to a different design approach; after all, you are quite likely convinced that your application has much more logic than can be expressed with a handful of operations. Let me spend some time trying to convince you that this is not the case.
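The difference between safe, idempotent, and non-idempotent methods can be sketched with a toy in-memory store. This is only an illustration under stated assumptions: the class InMemoryStore and its method names are hypothetical stand-ins for a server, not part of any HTTP library.

```python
import itertools


class InMemoryStore:
    """Toy stand-in for a server holding resources (hypothetical)."""

    def __init__(self):
        self._resources = {}
        self._ids = itertools.count(1)

    def get(self, uri):
        # Safe: reads state and never changes it.
        return self._resources.get(uri)

    def put(self, uri, representation):
        # Idempotent: repeating the same PUT leaves the same state.
        self._resources[uri] = representation

    def delete(self, uri):
        # Idempotent: deleting something already gone changes nothing.
        self._resources.pop(uri, None)

    def post(self, collection_uri, representation):
        # Neither safe nor idempotent: each call creates a new resource.
        uri = f"{collection_uri}/{next(self._ids)}"
        self._resources[uri] = representation
        return uri

    def count(self):
        return len(self._resources)


store = InMemoryStore()
store.put("/customers/1234", {"name": "Example Customer"})
store.put("/customers/1234", {"name": "Example Customer"})  # retried PUT
print(store.count())  # still one resource

first = store.post("/orders", {"amount": 23})
second = store.post("/orders", {"amount": 23})  # retried POST
print(first, second)  # two different URIs: the retry created a duplicate
```

This is why a client can blindly retry a GET, PUT, or DELETE after a lost response, but has to be more careful with POST.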
Consider the following example of a simple procurement scenario:
The example defines two services (without implying any particular implementation technology). The interface to these services is specific to the task: it is an OrderManagement and a CustomerManagement service we are talking about. If a client wants to consume these services, it needs to be coded against these particular interfaces; there is no way to use a client that was built before these interfaces were specified to interact with them in a meaningful way. The interfaces define the services' application protocol.
In a RESTful HTTP approach, you would have to get by with the generic interface that makes up the HTTP application protocol. You might come up with something like this:
Specific operations of the services have been mapped to the standard HTTP methods, and to disambiguate, I have created a whole universe of new resources. "That's cheating!", I hear you cry. No, it is not. A GET on a URI that identifies a customer is just as meaningful as a getCustomerDetails operation. Some people have used a triangle to visualize this:
Imagine the three vertices as knobs that you can turn. In the first approach, you have many operations, many kinds of data, and a fixed number of "instances" (essentially, as many as you have services). In the second, you have a fixed number of operations, many kinds of data, and many objects on which to invoke those fixed methods. The point is that you can express basically anything you like with either approach.
Why is it so important to use the standard methods? Essentially, it makes your application part of the Web: its contribution to what has turned the Web into the most successful application of the Internet is proportional to the number of resources it adds to it. In a RESTful approach, an application might add a few million customer URIs to the Web; if it is designed the same way applications were designed in CORBA times, its contribution usually is a single "endpoint", comparable to a very small door that provides entry to a universe of resources only to those who have the key.
The uniform interface also enables every component that understands the HTTP application protocol to interact with your application. Examples of components that benefit from this are generic clients such as curl and wget, proxies, caches, HTTP servers, gateways, and even Google, Yahoo!, MSN, and many others.
To summarize: for clients to be able to interact with your resources, the resources should implement the default application protocol (HTTP) correctly, that is, make use of the standard methods GET, PUT, POST, and DELETE.
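The mapping from service-specific operations to standard methods on resources can be written down as a simple table. The Python sketch below is an assumption-laden illustration: only getCustomerDetails is named in the text, and every other operation name and every URI template is a hypothetical example.

```python
# Standard HTTP methods named in the article.
STANDARD_METHODS = {"GET", "PUT", "POST", "DELETE", "HEAD", "OPTIONS"}

# Service-specific operation -> (HTTP method, URI template).
# Only getCustomerDetails appears in the article; all other operation
# names and all URI templates are hypothetical examples.
OPERATION_TO_HTTP = {
    "getCustomers": ("GET", "/customers"),
    "addCustomer": ("POST", "/customers"),
    "getCustomerDetails": ("GET", "/customers/{id}"),
    "updateCustomer": ("PUT", "/customers/{id}"),
    "deleteCustomer": ("DELETE", "/customers/{id}"),
    "getOrders": ("GET", "/orders"),
    "submitOrder": ("POST", "/orders"),
    "getOrderDetails": ("GET", "/orders/{id}"),
    "cancelOrder": ("DELETE", "/orders/{id}"),
}


def to_request_line(operation: str, **params) -> str:
    """Translate an operation call into a method plus a concrete path."""
    method, template = OPERATION_TO_HTTP[operation]
    return f"{method} {template.format(**params)}"


print(to_request_line("getCustomerDetails", id=1234))  # GET /customers/1234
```

The table has many rows but only a handful of distinct methods: the variety has moved from the operation names into the URIs.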
Multiple representations of resources
So far we have ignored a slight complication: how does a client know how to deal with the data it retrieves, for example as the result of a GET or POST request? The approach taken by HTTP is to allow for a separation of concerns between handling the data and invoking operations. In other words, a client that knows how to handle a particular data format can interact with all resources that can provide a representation in this format. Let's illustrate this with an example again. Using HTTP content negotiation, a client can ask for a representation in a particular format.
The result of such a request might be some customer information in a company-specific XML format. Suppose the client sends a different request, like this one:
GET /customers/1234 HTTP/1.1
Host: example.com
Accept: text/x-vcard
The result could be the customer's address in vCard format. (I have not shown the responses, which would contain metadata about the type of data in the HTTP Content-Type header.) This illustrates why, ideally, the representations of a resource should be in standard formats: if a client "knows" both the HTTP application protocol and a set of data formats, it can interact with any RESTful HTTP application in the world in a very meaningful way. Unfortunately, we do not have standard formats for everything, but you can probably imagine how one could create a smaller ecosystem within a company or a set of collaborating partners by relying on standard formats. Of course, all of this applies not only to the data sent from the server to the client, but also in the reverse direction: a server that can consume data in specific formats does not care about the particular type of client, provided the client follows the application protocol.
There is another significant benefit of having multiple representations of a resource in practice: if you provide both an HTML and an XML representation of your resources, they can be consumed not only by your application but also by every standard Web browser. In other words, information in your application becomes available to everyone who knows how to use the Web.
There is another way to exploit this: you can turn your application's Web UI into its Web API. After all, API design is often driven by the idea that everything that can be done via the UI should also be doable via the API. Conflating the two tasks into one is an amazingly useful way to get a better Web interface for both humans and other applications.
Summary: Provide multiple representations of resources for different needs.
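Server-side content negotiation can be sketched in a few lines of Python. This is a deliberately simplified sketch, not a full implementation of the HTTP rules: it honors q-values and the */* wildcard but ignores partial wildcards such as text/*. The function names and the sample representations are assumptions made for this example.

```python
def parse_accept(header: str) -> list:
    """Return acceptable media types from an Accept header, best first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            entries.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(entries)]


def choose_representation(accept_header: str, available: dict):
    """Pick the first acceptable (media type, body) pair, or None."""
    for media_type in parse_accept(accept_header):
        if media_type in available:
            return media_type, available[media_type]
        if media_type == "*/*" and available:
            first = next(iter(available))
            return first, available[first]
    return None


# Hypothetical representations of the resource /customers/1234.
CUSTOMER_1234 = {
    "application/xml": "<customer><name>Example Customer</name></customer>",
    "text/x-vcard": "BEGIN:VCARD\nVERSION:3.0\nFN:Example Customer\nEND:VCARD",
}

print(choose_representation("text/x-vcard", CUSTOMER_1234)[0])
```

The resource and its URI stay the same in both cases; only the representation handed to the client differs.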
Communicate statelessly
Stateless communication is the last principle I want to address. First of all, it is important to stress that although REST includes the idea of statelessness, this does not mean that an application that exposes its functionality cannot have state; in fact, that would render the whole approach pretty useless in most scenarios. REST mandates that state is either turned into resource state or kept on the client. In other words, a server should not have to retain any communication state for any of the clients it communicates with beyond a single request. The most obvious reason for this is scalability: the number of clients interacting would seriously affect the server's memory footprint if it had to keep client state. (Note that this usually requires some redesign: you cannot simply stick a URI onto some session state and call it RESTful.)
But there are other aspects that might be even more important: the statelessness constraint isolates the client from changes on the server, because the client does not depend on talking to the same server in two consecutive requests. A client could receive a document containing links from the server, and while it does some processing, the server could be shut down, its hard disk could be ripped out and replaced, or the software could be updated and restarted; if the client then follows one of the links it received, it will not notice that anything changed behind the scenes.
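The following Python sketch illustrates the point about consecutive requests under some loud assumptions: StatelessServer, the /products paging parameters, and the product list are all hypothetical. Neither server instance remembers anything about the client; everything needed to continue is carried in the link the client received.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical resource state, shared by all server instances.
PRODUCTS = [f"product-{n}" for n in range(1, 8)]


class StatelessServer:
    """Handles each request in isolation; keeps no per-client session."""

    def __init__(self, name: str):
        self.name = name

    def handle_get(self, uri: str) -> dict:
        parts = urlsplit(uri)
        query = parse_qs(parts.query)
        offset = int(query.get("offset", ["0"])[0])
        limit = int(query.get("limit", ["3"])[0])
        next_offset = offset + limit
        next_link = None
        if next_offset < len(PRODUCTS):
            # The link tells the client where to go next.
            next_query = urlencode({"offset": next_offset, "limit": limit})
            next_link = f"{parts.path}?{next_query}"
        return {
            "items": PRODUCTS[offset:next_offset],
            "next": next_link,
            "served_by": self.name,
        }


server_a = StatelessServer("a")
server_b = StatelessServer("b")

page1 = server_a.handle_get("/products?offset=0&limit=3")
# Server "a" may be gone by now; any other instance can serve the link.
page2 = server_b.handle_get(page1["next"])
print(page2["items"], page2["served_by"])
```

Because the second request is self-contained, the client cannot tell, and does not need to know, that a different server answered it.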
REST in Theory
I have a confession to make: what I explained above is not really REST, and I may have gotten a bit carried away with simplifying things. But because I wanted to start things a little differently than usual, I did not provide the formal definition and background at the beginning. Let me make up for this now, if only briefly.
First of all, I have so far avoided clearly separating HTTP, RESTful HTTP, and REST. To understand the relationship between these, we first have to look at the history of REST.
The term REST was defined by Roy T. Fielding in his doctoral dissertation (you should actually read it; for an academic paper, at least, it is quite readable, and it has been translated into Chinese). Roy was one of the primary designers of many essential Web protocols, including HTTP and URIs, and he formalized many of the ideas behind them in the dissertation. (The dissertation is considered "the REST bible", and rightfully so: after all, the author invented the term, so by definition anything he writes about it must be considered authoritative.) In the dissertation, Roy first defines a methodology for talking about architectural styles: high-level, abstract patterns that express the core ideas behind an architectural approach. Each architectural style comes with a set of constraints that define it. Examples of architectural styles include the "null style" (which has no constraints at all), pipe and filter, client/server, distributed objects, and, you guessed it, REST.
If all of this sounds quite abstract to you, you are right: REST in itself is a high-level style that could be implemented using many different technologies and instantiated using different values for its abstract properties. For example, REST includes the concepts of resources and a uniform interface, that is, the idea that every resource should respond to the same methods. But REST does not say which methods these should be, or how many of them there should be.
One "incarnation" of the REST style is HTTP (together with a set of related standards, such as URIs), or, slightly more abstractly, the Web's architecture itself. To continue the example above, HTTP "instantiates" the REST uniform interface with a particular one, consisting of the HTTP verbs. Since Fielding defined the REST style after the Web (or at least most of it) was already "done", one might argue about whether the two are a 100% match. But in any case, the Web, HTTP, and URIs as a whole are the one major implementation of the REST style. And since Roy Fielding is both the author of the REST dissertation and a strong influence on the design of the Web's architecture, the similarity should not come as a surprise.
Finally, I have used the term "RESTful HTTP" over and over for a simple reason: many applications that use HTTP do not follow the REST principles, and one could say with some justification that using HTTP without following the REST principles amounts to abusing HTTP. Of course this sounds a little zealous; in fact, there are often good reasons to violate a REST constraint, simply because every constraint induces a trade-off that might not be acceptable in a particular situation. But in general, REST constraints are violated because of a simple lack of understanding of their benefits. One particularly nasty example: using HTTP GET to invoke an operation such as deleting an object violates REST's safety constraint and plain common sense (the client cannot be held accountable for the result, which is probably not what the server developer intended). I will say more about this and other notable abuses of HTTP in a follow-up article.
Summary
In this article, I have attempted to provide a quick introduction to the concepts behind REST, the architecture of the Web. A RESTful HTTP approach to exposing functionality differs from RPC, distributed objects, and Web services; really understanding the difference takes a change of mindset. Whether you are building applications that only expose a Web UI or want to turn your application's API into a good Web citizen, it pays to be aware of the REST principles.