Process Identifiers in Distributed Systems

Chen Shuo (giantchen_at_gmail)

Blog.csdn.net/solstice

Yesterday I chatted with some friends about process identifiers in distributed systems. This post briefly summarizes my views.

This article assumes that each machine (host) has exactly one IP address; multihomed hosts are not considered. It also assumes that NTP runs correctly on every machine in the distributed system, so that the machines' clocks are roughly synchronized.

A process is one of the two basic concepts of an operating system: a program in execution. In everyday conversation, however, the word "process" is often used more loosely. When we say "the httpd process" or "the mysqld process", we usually mean the program, not one specific process, that is, the product of one particular fork() system call. After an "httpd process" restarts, it is still "an httpd process". This article discusses how to give the process created by each run of a program a unique identifier. For example, when the httpd program runs for the first time its process is httpd_1; when it is restarted in place, the new process is httpd_2.

The process identifier discussed in this article uniquely identifies one run of a program. Every time a process starts, it should be assigned an identifier that differs from those of all currently running processes and from those of all processes that ran in the past and have since died (a direct corollary of these two conditions is that it also differs from the identifier of any process that may run in the future). Naming every process this way has great practical value in a distributed system, especially when failover is considered. A restarted program's new process usually does not have the same state as its predecessor, so it is best for the process identifier to change as well. The other processes that deal with it can then easily tell that the program has restarted and take the necessary recovery measures to avoid errors.
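The restart check described above can be sketched as follows. This is a minimal illustration and not code from the original post: the class `PeerTracker`, its `observe` method, and the identifier strings are hypothetical, and the identifier is treated as an opaque string.

```python
from typing import Dict


class PeerTracker:
    """Remembers the last identifier seen for each peer (hypothetical helper)."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, str] = {}

    def observe(self, peer_name: str, identifier: str) -> bool:
        """Record the peer's identifier; return True if the peer has restarted."""
        previous = self._last_seen.get(peer_name)
        self._last_seen[peer_name] = identifier
        # A changed identifier means we are now talking to a different run
        # of the same program, so any cached state about it is suspect.
        return previous is not None and previous != identifier


tracker = PeerTracker()
first_contact = tracker.observe("httpd", "httpd_1")   # nothing to compare with yet
same_run = tracker.observe("httpd", "httpd_1")        # identifier unchanged
restarted = tracker.observe("httpd", "httpd_2")       # identifier changed
```

Here only the third call reports a restart. What the caller does next (resynchronizing state, rejecting stale requests) depends on the application.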

This article assumes that the port of each server program is statically allocated, and that the company keeps a shared wiki recording which port belongs to which program (the mapping is then published through NIS or DNS). For example, port 11211 always belongs to memcached and no other program uses it; port 3306 is always reserved for mysqld, and port 3690 for svnserve. This is common practice in the early stages of a distributed system. More mature systems mostly use dynamically allocated ports, because there are only just over 60,000 port numbers (65,536); they are a scarce resource, and a large enough company will eventually run out. This article considers only TCP, not UDP, so "port" always means a TCP port.

We further assume that on any one machine a listening port is used by only one process at a time; the old listen() + fork() model, in which several processes accept connections on the same port, is not considered. Chen Shuo has written about this at length elsewhere; see "Inspiration from the New Linux System Calls" and "Applicable Scenarios for Multithreaded Servers".

Incorrect practice

In a distributed system, how should we refer to a process, and how does a process obtain its own global process identifier (GPID)? Two approaches come to mind easily:

    • ip:port (where port is the port on which the process provides its network service, usually its TCP listening port)
    • host:pid

Both methods have problems. Why?

If the process is stateless, or if it does not matter that it restarts, then using ip:port to identify a "service" is fine. Common services such as httpd and memcached can be identified by their usual ports (80 and 11211). Other programs can safely refer to "the HTTP server running on 10.0.0.5:80" or "the memcached at 10.0.0.6:11211"; even if these services restart, nothing bad happens. The client simply retries or automatically switches to a backup address.

If the service is stateful, identifying it by ip:port is a serious problem, because the client cannot tell whether it has been dealing with one process the whole time or with several successive processes. Server programs usually set SO_REUSEADDR so that they can restart quickly. As a result, the process behind 10.0.0.7:8888 one second and the process occupying 10.0.0.7:8888 the next second may be different processes: the server program has restarted quickly.

For example, consider the master of a GFS-like distributed file system. If it identifies itself only by ip:port and issues a synchronization command to its shadows (not to the chunk servers), how does a shadow know whether the master has restarted? Did the command come from the master's "previous life" or its "current life"? Should commands from the previous life be rejected?

Would host:pid be better? I don't think so, because the PID space is small and the chance of a repeat is high. For example, the default maximum PID on Linux is 32768 (/proc/sys/kernel/pid_max), so after a restart a program gets the same PID as in its "previous life" with a probability of about 1/32768. Some readers may doubt that a PID can repeat after a restart, since PIDs are handed out in increasing order and wrap around to the smallest free PID only when the upper limit is reached. Consider a server program A with PID 1234 that has been running stably for several days. During this time the PID counter has wrapped around several times (because the machine frequently starts scripts to do auxiliary work). Just before A crashes, the most recently used PID has come back around to 1232. After A crashes, a daemon starts a script (PID 1233) to clean up logs and then restarts program A. The restarted A thus gets exactly the same PID as before, 1234. In other words, host:pid cannot uniquely identify a process.
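The scenario above can be replayed with a toy model. The sketch below is not the kernel's allocator: `PidAllocator` is a hypothetical class that simply hands out the next free number after the last one used and wraps around at `pid_max`, which is enough to reproduce the collision.

```python
from typing import Set


class PidAllocator:
    """Toy model of sequential PID allocation with wrap-around."""

    def __init__(self, pid_max: int = 32768, first_pid: int = 1) -> None:
        self.pid_max = pid_max
        self.first_pid = first_pid
        self.last_pid = first_pid - 1
        self.in_use: Set[int] = set()

    def fork(self) -> int:
        """Return the next free PID after last_pid, wrapping at pid_max."""
        candidate = self.last_pid
        for _ in range(self.pid_max):
            candidate += 1
            if candidate >= self.pid_max:
                candidate = self.first_pid
            if candidate not in self.in_use:
                self.in_use.add(candidate)
                self.last_pid = candidate
                return candidate
        raise RuntimeError("no free PID")

    def exit(self, pid: int) -> None:
        """Release a PID when its process terminates."""
        self.in_use.discard(pid)


alloc = PidAllocator()
alloc.last_pid = 1233            # assume 1233 was the most recently used PID
pid_a = alloc.fork()             # server program A starts

# Days pass: short-lived scripts come and go until the counter
# wraps around and the most recently used PID is 1232 again.
while alloc.last_pid != 1232:
    alloc.exit(alloc.fork())

alloc.exit(pid_a)                # A crashes
script_pid = alloc.fork()        # the daemon runs its log-cleaning script
new_pid_a = alloc.fork()         # the daemon restarts A

print(pid_a, script_pid, new_pid_a)
```

In this model both runs of A end up with the same PID, so host:pid alone cannot tell them apart.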

What about combining them as ip:port:pid? That is not unique either, for the same reason as host:pid: the ip:port does not change across a restart, and the PID may be reincarnated.

I expect someone will suggest setting up a central server that assigns GPIDs, which every process asks for its GPID at startup. This goes even further astray. Who assigns the GPID of the global GPID allocator itself? How does the allocator guarantee that the GPIDs it hands out never repeat (it may itself restart unexpectedly)? Does it not become a single point of failure for the whole system? And if you make the GPID allocator fault tolerant, are you not facing the fundamental problem of distributed systems, state migration?

Another approach is to use a sufficiently strong random number as the GPID. In practice it will not repeat, but such a GPID carries no further meaning and is inconvenient to manage and maintain (for example, you cannot find the process on its machine from the GPID).

Correct practice

Correct practice: use the four-tuple ip:port:start_time:pid as the GPID of a process in a distributed system. Here start_time is a 64-bit integer giving the time at which the process started (in UTC; see muduo::Timestamp). The reasons are as follows:

    • Uniqueness is easy to guarantee. If the program restarts after a short time, the PIDs of the two processes cannot be the same, because the PID counter has not yet completed a cycle: even at 1,000 new processes per second a cycle takes more than 30 seconds, and a server creating processes at that rate is essentially paralyzed anyway. If the program runs for a long time and then restarts, the start_time values of the two runs cannot be the same. (See the discussion of repeated times below.)
    • Generating this GPID is very cheap (a few inexpensive system calls), needs no global server, and introduces no single point of failure.
    • The GPID is meaningful in itself. From a GPID you can tell at once which program the process is (port), which machine it runs on (ip), when it started, and where to find it under /proc (/proc/pid). Its resource usage can also be reported by the monitoring program running on that machine.
    • The GPID also has historical value for later tracing. For example, if a process has crashed and I know its GPID, I can look up its CPU and memory load before the crash in the historical records.
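A minimal sketch of building such a GPID follows. The names `Gpid` and `make_gpid` are hypothetical, the IP address and port are passed in by the caller (this article assumes one IP per host and a statically allocated port), and recording start_time in microseconds since the Unix epoch is an assumption of this sketch. Note that start_time here is sampled when `make_gpid` is called, so it should be called once, early in startup, and the result cached.

```python
import os
import time
from typing import NamedTuple


class Gpid(NamedTuple):
    ip: str
    port: int
    start_time: int  # microseconds since the Unix epoch, UTC (assumed unit)
    pid: int

    def __str__(self) -> str:
        return f"{self.ip}:{self.port}:{self.start_time}:{self.pid}"


def make_gpid(ip: str, port: int) -> Gpid:
    """Build this process's GPID from values that are cheap to obtain locally."""
    start_time = time.time_ns() // 1000   # wall-clock time in microseconds
    return Gpid(ip, port, start_time, os.getpid())


# Example values taken from the article's own illustration.
gpid = make_gpid("10.0.0.7", 8888)
print(gpid)
```

No central server is involved: everything comes from the local clock, the local PID, and configuration the process already has.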

Using only ip:port:start_time as the GPID does not guarantee uniqueness. If the program restarts after a short interval (one second or a few seconds), start_time may jump backwards (NTP is adjusting the clock) or stand still (during a leap second). The problem of time jumps is left to the next post, Chapter 2 of "Dates and Times in Programs": "Timekeeping and Timing". In short, the clock on a computer does not necessarily increase monotonically.
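The two failure cases can be checked side by side. In this sketch the `Gpid` tuple and all timestamps and PIDs are made-up illustrative values, not measurements.

```python
from typing import NamedTuple


class Gpid(NamedTuple):
    ip: str
    port: int
    start_time: int  # illustrative microsecond timestamps
    pid: int


before = Gpid("10.0.0.7", 8888, 1_300_000_000_000_000, 1234)

# Quick restart while the clock stood still or was stepped back:
# start_time repeats, but the PID counter has not wrapped yet.
after_quick = Gpid("10.0.0.7", 8888, 1_300_000_000_000_000, 1240)

# Restart after a long run: the PID has wrapped around to the same
# value, but start_time has moved on.
after_long = Gpid("10.0.0.7", 8888, 1_300_000_600_000_000, 1234)

three_tuple_collides = before[:3] == after_quick[:3]      # ip:port:start_time
ip_port_pid_collides = (before.ip, before.port, before.pid) == (
    after_long.ip, after_long.port, after_long.pid)       # ip:port:pid
all_distinct = len({before, after_quick, after_long}) == 3
```

Each three-field identifier collides in one of the cases, while the full four-tuple remains distinct in both.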

What if there is no port? A network service normally listens on a port, but a pure client only initiates connections and never listens. How should its GPID be assigned? As Chen Shuo argues in "Engineering Development Methods for Distributed Systems", every long-running process in a distributed system that interacts with other machines should expose a management interface, a maintenance probe through which the complete state of the process can be inspected from outside. This management interface is a TCP server listening on a port.

Such a management port has an extra benefit: it automatically prevents a program from being started twice. When a second instance tries to bind() the management port, the call fails because the port is already in use, and the program exits immediately. Better still, there is no need to worry about a crashed process failing to release a lock (a real risk with cross-process mutexes): when a process exits, the operating system automatically closes every port it had open, so the next process can start without trouble.
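The duplicate-start guard can be sketched like this. The function name `open_admin_port` is hypothetical, and the demo simulates two instances inside one process by opening two sockets; a real program would exit when the function returns None.

```python
import errno
import socket
from typing import Optional


def open_admin_port(port: int, host: str = "127.0.0.1") -> Optional[socket.socket]:
    """Listen on the management port; return None if another instance holds it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # SO_REUSEADDR is deliberately left unset in this sketch so that
        # the check behaves the same way across platforms.
        sock.bind((host, port))
        sock.listen()
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return None          # someone else is already listening here
        raise
    return sock


first = open_admin_port(0)               # port 0: let the OS pick a free port
assert first is not None
port = first.getsockname()[1]

second = open_admin_port(port)           # a second "instance": must fail

first.close()                            # the first instance exits
third = open_admin_port(port)            # the next instance starts normally
third_started = third is not None
if third is not None:
    third.close()
```

Unlike a lock file, nothing needs to be cleaned up after a crash, because the listening socket disappears together with the process.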

In addition, the program name and version number can be made part of the GPID; this is icing on the cake.

Inspiration from the TCP protocol

In "Engineering Development Methods for Distributed Systems" I asked, "What can we learn from the TCP protocol?" The GPID described here was in fact inspired by TCP. TCP/IP uses ip:port to denote an endpoint, and two endpoints identify a connection (a socket pair). At first sight this resembles the ip:port method of identifying a process mentioned at the beginning, but it is not the same. When a TCP connection is opened, it must be protected from interference by segments of an earlier connection with the same addresses (the same local_ip:local_port:remote_ip:remote_port), known as wandering duplicates, that is, stray packets. TCP uses sequence numbers for this: the sequence number sent in the first SYN segment, the initial sequence number (ISN), distinguishes this connection from earlier ones. This is much like preventing a process's previous life from interfering with its current one. Each time the kernel creates a TCP connection, it advances the ISN so that it differs from the sequence numbers used the last time the connection was established. This is equivalent to adding a start_time to the endpoint, which is very close to the "correct" GPID described above. (The original ISN generation algorithm in 4.4BSD had a security flaw that enabled TCP sequence prediction attacks; the Linux kernel has since adopted a safer way to generate ISNs.)
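To illustrate how a clock-derived ISN plays the role of start_time, here is a sketch of the clock-driven scheme described in RFC 793, in which the ISN comes from a 32-bit counter that ticks roughly every 4 microseconds. It is neither the 4.4BSD nor the Linux algorithm mentioned above; the function name `rfc793_isn`, the addresses, and the timestamps are hypothetical.

```python
def rfc793_isn(now_us: int) -> int:
    """ISN taken from a 32-bit clock that ticks about every 4 microseconds."""
    return (now_us // 4) % 2**32


# Two connections that reuse exactly the same four-tuple, one second apart.
four_tuple = ("10.0.0.7", 8888, "10.0.0.9", 51000)
old_connection = (four_tuple, rfc793_isn(1_000_000))
new_connection = (four_tuple, rfc793_isn(2_000_000))

same_addresses = old_connection[0] == new_connection[0]
distinguishable = old_connection[1] != new_connection[1]
```

The addresses alone cannot separate the two connections, but the time-derived ISN can, just as start_time separates two runs of a program behind the same ip:port. Because a purely clock-based ISN is predictable, RFC 6528 adds a keyed hash of the four-tuple to it.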
