"Distributed operating System" knowledge point (22~28) four

Source: Internet
Author: User
Tags: stub, domain name server
Note:

A label such as "(4) 8 P160" means that the question belongs to Chapter 4, that it is the 8th question of that chapter, and that P160 is the page in the textbook where the question appears.


(2) 22 Describe the main steps of RPC; explain what input parameters, output parameters, and input/output parameters mean as parameter-passing forms and why this classification is made. If the server is stateless, why does the procedure that reads a file need to be given a position parameter? P48, P51, P56

A: The main steps of RPC are as follows:

Input parameters are created by the client process and passed to the server process. Output parameters are created by the server process and passed back to the client process. Input/output parameters are created by the client process, passed to the server process, modified by the server process, and then passed back to the client process. The classification exists so that the stubs know in which direction each parameter has to be marshalled: input parameters travel only from client to server, output parameters only from server to client, and input/output parameters travel in both directions.

The formal specification of the procedure and its parameters is written out and given as input to the stub generator, which produces the client stub and the server stub and places them in the appropriate stub libraries, from which they are linked in when the procedure is called.
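As a rough illustration (not code from the textbook), the following minimal Python sketch shows the roles the two stubs play and how the parameter direction decides what is marshalled each way. The names (marshal, server_stub_dispatch, the increment procedure, and the JSON transport) are all assumptions made for the example.

```python
import json

# Hypothetical marshalling helpers; in a real RPC system the stub generator
# emits code like this from the procedure's formal specification.
def marshal(values):
    return json.dumps(values).encode()

def unmarshal(data):
    return json.loads(data.decode())

# The actual server procedure, called locally by the server stub.
def increment(x, counter):
    counter += x                  # counter is an input/output parameter
    return counter * 2, counter   # the first value is the output (result)

# Server stub: unpacks the request, makes the local call, packs the reply.
def server_stub_dispatch(request):
    args = unmarshal(request)
    if args["proc"] == "increment":
        result, counter = increment(args["x"], args["counter"])
        return marshal({"result": result, "counter": counter})

# Stand-in for the kernels and the network between the two stubs.
def fake_network_call(request):
    return server_stub_dispatch(request)

# Client stub: x is an input parameter, counter is an input/output parameter.
# Both travel from client to server; only the result and counter come back.
def client_stub_increment(x, counter):
    request = marshal({"proc": "increment", "x": x, "counter": counter})
    reply = fake_network_call(request)   # send the request, block for the reply
    out = unmarshal(reply)
    return out["result"], out["counter"]

print(client_stub_increment(3, counter=10))   # -> (26, 13)
```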

If the server is stateless, it deletes all information about a request as soon as the reply has been sent. So if a user opens a file, does some work on it, and later wants to continue reading from where the previous operation stopped, the client must supply the position parameter with every read request, because the server does not remember where the last operation left off.
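A minimal sketch of this point, assuming a read(path, position, nbytes) interface in the style of a stateless file server (the interface and the file name are invented for illustration):

```python
# Minimal sketch: the server is stateless, so the client must carry the
# file position itself and pass it with every read request.
def server_read(path, position, nbytes):
    # The server keeps no per-client state; each request is self-contained.
    with open(path, "rb") as f:
        f.seek(position)
        return f.read(nbytes)

class ClientFileCursor:
    """Client-side bookkeeping of where the previous read stopped."""
    def __init__(self, path):
        self.path = path
        self.position = 0            # remembered by the client, not the server

    def read(self, nbytes):
        data = server_read(self.path, self.position, nbytes)
        self.position += len(data)   # advance the locally stored offset
        return data

cursor = ClientFileCursor("example.txt")   # hypothetical file name
first = cursor.read(100)                   # bytes 0..99
second = cursor.read(100)                  # bytes 100..199, because the client supplied position=100
```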

(2) 23 Explain the basic idea of RPC. After the client sends a request but receives no response, what could be the causes? Also explain which methods can be used to handle a server crash. P57, P59

A: The basic idea of RPC is that calling a remote procedure should look just like calling a local procedure.

Possible causes:

① The client cannot locate the server;

② The request message from the client to the server is lost;

③ The reply message from the server to the client is lost;

④ The server crashes after receiving a request;

⑤ The client crashes after sending a request.

Ways of handling a server crash:

① At-least-once semantics. Wait for the server to restart and then resend the request, retrying until a reply arrives and is passed to the client. This guarantees that the RPC is executed at least once, but it may be executed more than once (a small client-side sketch follows this list).

② At-most-once semantics. Give up immediately and report the failure. This guarantees that the RPC is executed at most once, but it may not be executed at all.

③ No guarantees. When the server crashes, the client gets no help and no promises: the RPC may not have been executed at all, or it may have been executed many times. The great advantage of this approach is that it is easy to implement.

④ Exactly-once semantics. This is what we would really like, but it is hard to implement, and no fully mature method for it exists.
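The sketch below contrasts the client-side behaviour of at-least-once and at-most-once semantics. It is only an illustration: the UDP transport, the timeout value, and the function names are assumptions, not the book's mechanism.

```python
import socket

REQUEST_TIMEOUT = 2.0   # seconds; illustrative value

class RpcTimeout(Exception):
    pass

def send_and_wait(request, addr, timeout=REQUEST_TIMEOUT):
    """One attempt: send the request over UDP and wait for a reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(request, addr)
        try:
            reply, _ = s.recvfrom(65535)
            return reply
        except socket.timeout:
            raise RpcTimeout()

def call_at_least_once(request, addr, max_tries=None):
    """Keep retrying until a reply arrives: the call runs at least once,
    but a lost reply means the server may have executed it several times."""
    tries = 0
    while max_tries is None or tries < max_tries:
        tries += 1
        try:
            return send_and_wait(request, addr)
        except RpcTimeout:
            continue    # resend after the server (hopefully) restarts

def call_at_most_once(request, addr):
    """Give up immediately and report failure: the call runs at most once,
    but possibly not at all."""
    try:
        return send_and_wait(request, addr)
    except RpcTimeout:
        raise RuntimeError("RPC failed; not retried")
```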

 

(2) 24 Describe the main idea of the client/server model and show how the system works with blocking, buffered, reliable send and receive primitives. P36, P42, P44, P45

A: The basic idea is to structure the operating system as a group of cooperating processes: the processes that use services are called clients, and the processes that serve them are called servers. Clients and servers all run on top of the same microkernel, as user processes. A machine may run a single process, or it may run multiple clients, multiple servers, or a mixture of the two.

The sequence is as follows. The client process calls the send primitive, which traps into the client kernel; the client process is blocked until the message has been delivered. The server-side kernel keeps a mailbox for the receiving process and places messages arriving from the network for that process into it. The server calls the receive primitive, which takes a message out of the mailbox; if the mailbox is empty, the receiving process blocks. Once the server-side kernel has put the request into the mailbox, it sends an acknowledgement to the client kernel; on receiving this acknowledgement, the client kernel knows the message has been delivered, unblocks the client process, and returns control to it so that the client can continue.

The server then removes the message from the mailbox, processes it, and sends the result back to the client process. When the reply arrives at the client, the client kernel sends an acknowledgement to the server kernel, and on receiving it the server kernel unblocks the server's send.
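The threaded sketch below imitates this sequence on a single machine: a queue plays the role of the server-side mailbox, another queue stands in for the network, and an event stands in for the kernel-to-kernel acknowledgement. All the names and messages are invented for the illustration.

```python
import queue
import threading

network = queue.Queue()          # the "wire" between client kernel and server kernel
mailbox = queue.Queue()          # mailbox kept by the server-side kernel
request_ack = threading.Event()  # stands in for the kernel-to-kernel acknowledgement
reply_box = queue.Queue()        # where the reply to the client is buffered

def client_process():
    network.put("request: read block 7")     # send traps into the kernel
    request_ack.wait()                       # client blocked until delivery is acknowledged
    print("client: request delivered, continuing")
    print("client: got", reply_box.get())    # blocking receive of the reply

def server_kernel():
    msg = network.get()                      # message arrives from the network
    mailbox.put(msg)                         # buffer it in the receiving process's mailbox
    request_ack.set()                        # acknowledge delivery to the client kernel

def server_process():
    msg = mailbox.get()                      # blocking receive; an empty mailbox would block
    reply_box.put(f"reply to ({msg})")       # send the result back to the client

threads = [threading.Thread(target=f) for f in (server_kernel, server_process, client_process)]
for t in threads: t.start()
for t in threads: t.join()
```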

(2) 25 In order to send a message to the server, the client must know the server's address. Give the basic principles of the three addressing mechanisms and explain the problem with each of them. P42

A: 1. Machine.process addressing: the address consists of a machine number and a process number. The machine number allows the kernel to deliver the message to the correct machine; the process number allows that machine's kernel to decide which process the message is intended for.

2. Process addressing with broadcast (which places an extra load on the system): each process picks its own identifier from a large, sparse address space. The sending kernel broadcasts a special locate packet containing the address of the destination process; every kernel checks whether that address is its own, and the one that owns it replies with an "I am here" message giving its network address. The sending kernel uses this address and "remembers" it for later use.

3. Address lookup via a name server (which requires an intermediate component, the name server): the server is referred to by an ASCII name in the client program. Each time the client runs, on its first attempt to use the server it sends a query to a special mapping server (commonly called a name server), asking for the machine number on which the server is currently located. Once it has this address, requests can be sent directly.

Drawbacks: each of these methods has a problem. The first is not transparent (the machine number is built into the address), the second places an extra load on the system, and the third requires an intermediate component, the name server. The name server can of course be replicated, but doing so introduces the problem of keeping its copies consistent.
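A toy sketch of the third mechanism, looking a server up by its ASCII name through a name server and then caching the answer so that later requests go directly. The table, names, and addresses below are invented for the illustration.

```python
# Toy name server: maps ASCII service names to (machine, process) addresses.
NAME_SERVER_TABLE = {
    "file_server": ("machine-17", 243),
    "print_server": ("machine-4", 199),
}

def name_server_lookup(service_name):
    """Simulates the request sent to the mapping (name) server."""
    return NAME_SERVER_TABLE[service_name]

class Client:
    def __init__(self):
        self.address_cache = {}   # "remember" answers so later calls go direct

    def send_request(self, service_name, payload):
        if service_name not in self.address_cache:
            # First use: ask the name server where the service currently lives.
            self.address_cache[service_name] = name_server_lookup(service_name)
        machine, process = self.address_cache[service_name]
        print(f"sending {payload!r} directly to process {process} on {machine}")

Client().send_request("file_server", "read /etc/motd")
```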

(2) 26 What basic types of packets are needed when implementing the client-server protocol? Explain the source, destination, and role of each packet type, and explain what the following diagram means. P47

A: The packet types in the client-server protocol:
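The answer above stops here. As a rough illustration only, the sketch below lists the kind of packet types such a client-server protocol commonly distinguishes (request, reply, acknowledgement, liveness probes and their answers, and back-pressure or error indications). The abbreviations and the from/to comments are illustrative conventions, not a reproduction of the table referenced on P47.

```python
from enum import Enum

class PacketType(Enum):
    """Typical packet types in a simple client-server protocol (illustrative)."""
    REQ = "request"          # client -> server: the client wants work done
    REP = "reply"            # server -> client: the result of the work
    ACK = "acknowledgement"  # either direction: the previous packet arrived
    AYA = "are you alive?"   # client -> server: probe a silent server
    IAA = "i am alive"       # server -> client: answer to the probe
    TA  = "try again"        # server -> client: no room to accept the request now
    AU  = "address unknown"  # server -> client: no process is using that address
```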

(1) 27 The goal of a distributed system is to give users the illusion that they are working with a single computer, which requires support for transparency. Describe the various kinds of transparency supported by distributed systems. P4

Answer:

Type and meaning:

Location transparency: the user does not know where the resource is located.

Migration transparency: resources can be moved without being renamed.

Replication transparency: the user does not know how many copies exist.

Concurrency transparency: multiple users can share resources automatically.

Parallelism transparency: system activities can take place in parallel without the user noticing.

Location transparency: in a true distributed system, users cannot tell where hardware and software resources such as CPUs, printers, files, and databases are located. The name of a resource must not contain any information about its location; names such as machine1:prog.c or /machine1/prog.c are not acceptable.

Migration transparency: resources are free to move from one place to another without having their names change.

Replication transparency: the system is free to make additional copies of files and other resources without the users noticing.

Concurrency transparency: when two users try to update the same file at the same time, neither of them notices the presence of the other. One mechanism for achieving this kind of transparency is for the system to lock a resource automatically once a user starts to use it and unlock it when the user is finished; in this way resources are used only serially, never concurrently (a small sketch of this locking mechanism follows these definitions).

Parallelism transparency: in principle, a distributed system should appear to its users like a traditional, uniprocessor timesharing system. System activities can take place in parallel without the user noticing, but the current state of the art is not yet at this level. In fact, once parallelism transparency is achieved the job of hiding the distribution is essentially complete, and reaching it remains an open field of work.

Access transparency: hides differences in data representation and in the way resources are accessed.

Relocation transparency: a resource can be relocated while it is being accessed, without the user or the application noticing.

Failure transparency: the user does not notice that a resource (perhaps one he has never even heard of) has failed, nor that the system subsequently recovers from the failure.

Persistence transparency: hides whether a resource resides in volatile memory or on disk.
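A minimal sketch of the lock-then-serialize mechanism mentioned under concurrency transparency, using an in-process lock as a stand-in for whatever locking a real distributed system would perform (the class and user names are invented):

```python
import threading

class SharedFile:
    """Concurrency transparency via automatic locking: the 'system' locks the
    resource when a user starts using it and unlocks it when the user is done,
    so updates are serialized and neither user notices the other."""
    def __init__(self):
        self._lock = threading.Lock()
        self.contents = ""

    def update(self, user, text):
        with self._lock:              # acquired automatically when use begins ...
            self.contents += f"{user}: {text}\n"
        # ... and released automatically when the user's operation finishes

shared = SharedFile()
threads = [threading.Thread(target=shared.update, args=(u, "edit"))
           for u in ("alice", "bob")]
for t in threads: t.start()
for t in threads: t.join()
print(shared.contents)
```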

28 Analyse in detail the three factors that limit the scalability of a distributed system in size: centralized services, centralized data, and centralized algorithms. Give examples of how distribution and replication techniques can improve scalability. (Distributed Systems: Principles and Paradigms, P8, P10)

Answer: Examples of scalability limitations:

Concept and example:

Centralized services: a single server for all users to access.

Centralized data: a single online telephone book.

Centralized algorithms: routing based on complete information.

Many services are implemented centrally: they are provided by a single server running on one particular machine in the distributed system. The problem with this scheme is obvious: as the number of users grows, the server becomes a bottleneck. Even if it had unlimited processing power and storage capacity, beyond a certain system size the communication with that one server alone would prevent the system from growing any further.

Centralized data is just as bad as centralized services. Suppose we want to keep the telephone numbers and addresses of 50 million people. At 50 characters per record, a single 2.5 GB disk would provide enough storage. However, using only one database would certainly saturate the communication lines leading into and out of it, and the resulting congestion would sharply reduce efficiency and overall performance.
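The storage figure in the example checks out:

```python
records = 50_000_000        # 50 million people
bytes_per_record = 50       # characters (bytes) per record
total = records * bytes_per_record
print(total, "bytes =", total / 10**9, "GB")   # 2500000000 bytes = 2.5 GB
```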

Centralized algorithms also have drawbacks. In a large distributed system, enormous numbers of messages have to be routed over many lines. In theory, the best way to do this is to collect complete information about the load on all machines and lines, use a graph-theory algorithm to compute the optimal routes, and then apply the result throughout the system to improve routing performance. The trouble is that collecting and transporting all of this input and output information is itself a bad idea, because the messages involved would overload part of the network.

Distribution technique: split a component into smaller parts and then spread those parts across the system. A good example of distribution is the Internet DNS. The DNS name space is organized hierarchically into a tree of domains, which is divided into non-overlapping zones; the names in each zone are handled by a single name server. Fundamentally, resolving a name means returning the network address of the host associated with that name. Spreading the naming service provided by DNS over several machines avoids the situation where a single server has to handle all name-resolution requests.
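A toy sketch in the DNS spirit: the name space is split into non-overlapping zones, each handled by its own server object, so no single server sees every resolution request. The zone layout, names, and addresses below are invented for illustration.

```python
class ZoneServer:
    """Handles the names in exactly one zone of the hierarchical name space."""
    def __init__(self, zone, records):
        self.zone = zone
        self.records = records          # name -> network address

    def resolve(self, name):
        return self.records.get(name)

# Non-overlapping zones, each handled by a single (different) server.
zone_servers = {
    "cs.example.edu":  ZoneServer("cs.example.edu",  {"www.cs.example.edu": "10.1.0.7"}),
    "eng.example.edu": ZoneServer("eng.example.edu", {"mail.eng.example.edu": "10.2.0.9"}),
}

def resolve(name):
    # Only the zone responsible for this name is asked, so resolution load
    # is spread over many machines instead of concentrating on one server.
    for zone, server in zone_servers.items():
        if name.endswith(zone):
            return server.resolve(name)

print(resolve("www.cs.example.edu"))    # -> 10.1.0.7
```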

Replication technique: distributing replicas across the system is usually a good idea. Replication not only increases availability, it also helps balance the load among components, which improves performance. Similarly, for geographically dispersed systems, having a copy near the requester can hide much of the communication latency mentioned earlier. For example, a medium or large company is often made up of geographically dispersed departments that need to share data. Data replication copies this shared data into databases at several sites, so that data can be accessed locally, which reduces network load and improves access performance; the databases are synchronized periodically (usually at night), so that all users work with the same, up-to-date data.
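A sketch of the replication idea: reads are served from a local copy, and the copies are brought up to date by a periodic synchronization. The site names and the nightly-sync routine are invented for the illustration.

```python
import copy

class ReplicaSite:
    def __init__(self, name, data=None):
        self.name = name
        self.data = dict(data or {})    # local copy of the shared data

    def read(self, key):
        return self.data.get(key)       # served locally: no wide-area round trip

primary = ReplicaSite("headquarters", {"price_list": "v1"})
branches = [ReplicaSite("tokyo"), ReplicaSite("berlin")]

def nightly_sync():
    """Periodic synchronization so every site ends up with the same data."""
    for site in branches:
        site.data = copy.deepcopy(primary.data)

primary.data["price_list"] = "v2"       # an update made at headquarters
nightly_sync()
print(branches[0].read("price_list"))   # -> v2, read locally in Tokyo
```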
