ZeroMQ, the fastest message queue in history: ZMQ learning and research (PHP source code)

Source: Internet
Author: User

I. Background of ZeroMQ

To quote the official description: "ZeroMQ (ZMQ for short) is a simple, easy-to-use transport layer: a socket library that works like a framework and makes socket programming simpler, more concise, and more efficient. It is a message queue library that scales elastically across threads, cores, and hosts. ZMQ's stated goal is to 'become part of the standard network protocol stack and then enter the Linux kernel'. It has not succeeded yet, but it is undoubtedly very promising, and it is the layer of encapsulation on top of 'traditional' BSD sockets that people need. ZMQ makes writing high-performance network applications extremely simple and fun."

In recent years, message queue projects have emerged one after another, and more than a dozen of them are well known. The main reason is that distributed processing has become mainstream in the post-Moore's-law era, so the industry needs a set of standards for message communication between the nodes of a distributed computing environment. After several years of competition, RabbitMQ, which implements the AMQP standard, has been widely recognized and has become a leading MQ project.

Compared with RabbitMQ, ZMQ is not a message queue server in the traditional sense. In fact, ZMQ is not a server at all. It is closer to a low-level network communication library that wraps the Socket API and abstracts network communication, inter-process communication, and inter-thread communication behind one unified API.

II. What is ZMQ?

After reading the ZMQ Guide, my understanding is that ZMQ is a set of interfaces similar to sockets. The difference is that ordinary sockets are end-to-end (a 1:1 relationship), while ZMQ connections can be N:M. Most people know BSD sockets through point-to-point connections, which require explicitly establishing and tearing down connections, choosing a protocol (TCP/UDP), and handling errors. ZMQ hides these details and makes network programming easier. ZMQ is used for communication between nodes, where a node can be a host or a process.

III. Purpose of this article

When a cluster provides services to the outside, many configuration items must be updated on demand at any time. How can this information be pushed to every node while keeping it consistent and reliable? After introducing the basic theory of ZMQ, this article attempts to use ZMQ to implement a configuration distribution center: one node delivers the information to every server node, and the information arrives correct and consistent.

IV. Three basic models of ZMQ

ZMQ provides three basic communication models: Request-Reply, Publisher-Subscriber, and Parallel Pipeline. Let's look at ZMQ through these three models.

Hello world of ZMQ!

The Client sends a request and waits for the Server to respond. When the requester sends a simple "Hello" message, the server replies with "World". Requester and server can form a 1:N relationship, where the 1 is usually the Server and the N the Clients. ZMQ also supports routing well (the component that implements routing is called a Device), so 1:N can be extended to N:M simply by adding a few routing nodes, as shown in Figure 1:

Figure 1: Request-Reply communication of ZMQ

The PHP program on the server side is as follows:
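
The original server listing is missing from this copy. What follows is a minimal sketch of such a server, assuming the php-zmq extension is installed. Port 5555, the limit of 10 requests, and the variable names are illustrative assumptions, not the author's code:

```php
<?php
// Hello World server sketch: binds a REP socket and answers "World".
// Assumed for illustration: port 5555 and a limit of 10 requests.
$context = new ZMQContext();
$responder = new ZMQSocket($context, ZMQ::SOCKET_REP);
$responder->bind("tcp://*:5555");

$handled = 0;
while ($handled < 10) {
    // recv() blocks until a client request arrives
    $request = $responder->recv();
    printf("Received request: [%s]%s", $request, PHP_EOL);

    // A REP socket must answer each request before it can receive the next one
    $responder->send("World");
    $handled++;
}
```

A long-running service would loop with while (true) instead; the limit only keeps the sketch finite.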

 

The Client program is as follows:
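
The original client listing is also missing. A minimal sketch, assuming the php-zmq extension and a server reachable at tcp://localhost:5555 (the address, the request count, and the variable names are assumptions made for illustration):

```php
<?php
// Hello World client sketch: sends "Hello" and prints each reply.
// Assumed for illustration: a server listening on tcp://localhost:5555.
$context = new ZMQContext();
$requester = new ZMQSocket($context, ZMQ::SOCKET_REQ);
$requester->connect("tcp://localhost:5555");

$replies = array();
for ($i = 0; $i < 10; $i++) {
    $requester->send("Hello");
    // recv() blocks until the server answers this request
    $replies[] = $requester->recv();
    printf("Received reply %d: [%s]%s", $i, $replies[$i], PHP_EOL);
}
```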

  

The process above shows how to write basic programs with ZMQ. Note the following:

A) It does not matter whether the server or the client starts first; the result is the same. This differs from plain sockets.

B) Until the server receives a message, its program blocks and waits for a client to connect.

C) After the server receives a message, it sends "World" back to the client. Note the order: the client connects and sends a message to the Server, and the Server receives it and only then replies. If the server calls send first, or the client calls recv first, an error is raised.

D) The unit of communication in ZMQ is the message. Apart from its size in bytes, ZMQ does not care about the message format, so you can use whatever data format suits you: XML, Protocol Buffers, Thrift, JSON, and so on.

E) Although ZMQ could be used to implement the HTTP protocol, that is not what it is good at.
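
The ordering rule in note C can be observed in a single process. The sketch below assumes the php-zmq extension, which reports such errors by throwing ZMQSocketException; the inproc endpoint name is an arbitrary choice for the demo:

```php
<?php
// Sketch: a REP socket that tries to send before it has received anything.
// The inproc endpoint name is an arbitrary choice for this demo.
$context = new ZMQContext();
$responder = new ZMQSocket($context, ZMQ::SOCKET_REP);
$responder->bind("inproc://alternation-demo");

$rejected = false;
try {
    // Out of order: REP must recv() a request before it may send()
    $responder->send("World");
} catch (ZMQSocketException $e) {
    $rejected = true;
    echo "send() rejected: ", $e->getMessage(), PHP_EOL;
}
```

Calling recv() on a REQ socket that has not yet sent a request fails for the same reason.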

Publish-subscribe mode of ZMQ

Consider a weather forecast subscription: one node publishes the information and other nodes receive it, as shown in Figure 2:

Figure 2: Publish-subscribe of ZMQ

The sample code is as follows:

Publisher:
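
The publisher listing did not survive in this copy. A minimal sketch of what such a weather server could look like, assuming the php-zmq extension. Port 5556 matches the address the subscriber connects to, while the value ranges are assumptions:

```php
<?php
// Weather update server sketch: publishes random "zipcode temperature humidity" lines.
// Assumed for illustration: the value ranges; port 5556 is the one the subscriber uses.
$context = new ZMQContext();
$publisher = new ZMQSocket($context, ZMQ::SOCKET_PUB);
$publisher->bind("tcp://*:5556");

while (true) {
    $zipcode = mt_rand(0, 99999);
    $temperature = mt_rand(-80, 135);
    $relhumidity = mt_rand(10, 60);

    // The zipcode comes first so that subscribers can filter on it as a prefix
    $publisher->send(sprintf("%05d %d %d", $zipcode, $temperature, $relhumidity));
}
```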

Subscriber:

<?php
/*
 * Weather update client
 * Connects SUB socket to tcp://localhost:5556
 * Collects weather updates and finds avg temp in zipcode
 * @author Ian Barber <ian (dot) barber (at) gmail (dot) com>
 */
$context = new ZMQContext();

// Socket to talk to server
echo "Collecting updates from weather server…", PHP_EOL;
$subscriber = new ZMQSocket($context, ZMQ::SOCKET_SUB);
$subscriber->connect("tcp://localhost:5556");

// Subscribe to zipcode, default is NYC, 10001
$filter = $_SERVER['argc'] > 1 ? $_SERVER['argv'][1] : "10001";
$subscriber->setSockOpt(ZMQ::SOCKOPT_SUBSCRIBE, $filter);

// Process 100 updates
$total_temp = 0;
for ($update_nbr = 0; $update_nbr < 100; $update_nbr++) {
    $string = $subscriber->recv();
    sscanf($string, "%d %d %d", $zipcode, $temperature, $relhumidity);
    $total_temp += $temperature;
}

// Average over the number of updates received
printf("Average temperature for zipcode '%s' was %dF\n", $filter, (int) ($total_temp / $update_nbr));

In this code, the server generates random values for zipcode, temperature, and relhumidity, which represent the city code, the temperature, and the relative humidity, and broadcasts them continuously. The client sets a filter so that it only receives the updates for a specific city code, and after collecting them it computes the average temperature.

A) Unlike Hello World, the socket types are now SOCKET_PUB and SOCKET_SUB.

B) The client must set a filter with $subscriber->setSockOpt(ZMQ::SOCKOPT_SUBSCRIBE, $filter);. Setting a filter value is equivalent to choosing a subscription channel; without it, nothing is received.

C) The server keeps broadcasting. If a Subscriber exits midway, the broadcast is not affected, and when the Subscriber connects again it receives only the messages sent from then on. A subscriber that joins late or leaves midway therefore inevitably misses some messages. This is a known issue of this mode, the so-called slow joiner problem, which is addressed later.

D) If the Publisher leaves midway, however, all Subscribers block and resume receiving when the Publisher comes back online.

Pipeline model of ZMQ

Imagine the following scenario: we need to collect statistics from the logs of every machine, so we distribute the statistics tasks to the node machines and finally gather the results for a summary. The pipeline model suits this scenario; its structure is shown in Figure 3.

Figure 3: Pipeline model of ZMQ

Parallel task ventilator in PHP
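
The ventilator listing is missing from this copy. A minimal sketch, assuming the php-zmq extension; ports 5557 (tasks) and 5558 (sink), the batch size of 100, and the workload range are assumptions made for illustration:

```php
<?php
// Task ventilator sketch: pushes 100 tasks to whichever workers are connected.
// Assumed for illustration: ports 5557 (tasks) and 5558 (sink) and the workload range.
$context = new ZMQContext();

$sender = new ZMQSocket($context, ZMQ::SOCKET_PUSH);
$sender->bind("tcp://*:5557");

// Separate socket, used once to tell the sink that a batch is starting
$sink = new ZMQSocket($context, ZMQ::SOCKET_PUSH);
$sink->connect("tcp://localhost:5558");

echo "Press Enter when the workers are ready: ";
fgets(STDIN);
$sink->send("0");

$total_msec = 0;
for ($task_nbr = 0; $task_nbr < 100; $task_nbr++) {
    // Each task is just a number of milliseconds the worker should sleep
    $workload = mt_rand(1, 100);
    $total_msec += $workload;
    $sender->send((string) $workload);
}
printf("Total expected cost: %d msec%s", $total_msec, PHP_EOL);

// Give ZMQ time to deliver the tasks before the process exits
sleep(1);
```

Waiting for Enter gives all workers time to connect first; otherwise the first worker to connect would receive most of the tasks.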

    

Parallel task worker in PHP
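
The worker listing is missing as well. A minimal sketch, assuming the php-zmq extension, a ventilator on port 5557, and a sink on port 5558 (both ports are assumptions):

```php
<?php
// Task worker sketch: pulls tasks, simulates the work, and pushes a result to the sink.
// Assumed for illustration: ventilator on port 5557 and sink on port 5558.
$context = new ZMQContext();

$receiver = new ZMQSocket($context, ZMQ::SOCKET_PULL);
$receiver->connect("tcp://localhost:5557");

$sender = new ZMQSocket($context, ZMQ::SOCKET_PUSH);
$sender->connect("tcp://localhost:5558");

while (true) {
    // The task payload is a workload in milliseconds
    $msec = (int) $receiver->recv();
    usleep($msec * 1000);

    // An empty message is enough to tell the sink that one task finished
    $sender->send("");
    echo ".";
}
```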

     

Parallel task sink in PHP
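
The sink listing did not survive either. A minimal sketch, assuming the php-zmq extension, port 5558, and a batch of 100 results preceded by one start-of-batch message (all of these are assumptions):

```php
<?php
// Task sink sketch: waits for the start-of-batch signal, then collects 100 results.
// Assumed for illustration: port 5558 and a batch size of 100.
$context = new ZMQContext();
$receiver = new ZMQSocket($context, ZMQ::SOCKET_PULL);
$receiver->bind("tcp://*:5558");

// The first message marks the start of the batch
$receiver->recv();
$started = microtime(true);

for ($task_nbr = 0; $task_nbr < 100; $task_nbr++) {
    $receiver->recv();
    // Print a progress marker for every result
    echo ($task_nbr % 10 === 0) ? ":" : ".";
}

$elapsed_msec = (int) round((microtime(true) - $started) * 1000);
printf("%sTotal elapsed time: %d msec%s", PHP_EOL, $elapsed_msec, PHP_EOL);
```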

      


The programs show that the task ventilator uses SOCKET_PUSH to distribute tasks to the Worker nodes. Each Worker uses SOCKET_PULL to receive tasks from upstream and SOCKET_PUSH to send its results to the Sink. Note that task distribution includes load balancing: workers can be added freely at any time, and the task ventilator spreads tasks evenly across them.

V. Other extension modes

In general, a node can act as both a Server and a Client. The Worker in the pipeline model is an example: upstream it connects to the task distributor, and downstream it connects to the Sink that collects the results. We can use this property to extend the three basic models. For example, a proxy Publisher can act as a Subscriber on the intranet, receive information there, and forward it to the external network, as shown in Figure 4.

Figure: N:M connection

We use an intermediate node (the Broker) for load balancing. The code shows that the Client is the same as the Hello World Client, while the Server differs in that it does not listen on a port; instead it connects to the Broker's port and receives the requests to process. We should therefore focus on the Broker code:
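
The Broker listing is missing from this copy. A minimal sketch, assuming the php-zmq extension; ports 5559 and 5560 are assumptions, and the sketch uses the ROUTER and DEALER socket types, the current names for what older ZMQ releases called XREP and XREQ:

```php
<?php
// Request-reply broker sketch: ROUTER faces the clients, DEALER faces the servers.
// Assumed for illustration: ports 5559 (clients) and 5560 (servers).
$context = new ZMQContext();
$frontend = new ZMQSocket($context, ZMQ::SOCKET_ROUTER);
$backend = new ZMQSocket($context, ZMQ::SOCKET_DEALER);
$frontend->bind("tcp://*:5559");
$backend->bind("tcp://*:5560");

// Watch both sockets for incoming messages
$poll = new ZMQPoll();
$poll->add($frontend, ZMQ::POLL_IN);
$poll->add($backend, ZMQ::POLL_IN);
$readable = $writable = array();

while (true) {
    // Blocks until at least one socket has a message waiting
    $poll->poll($readable, $writable);
    foreach ($readable as $socket) {
        // Forward every frame, including the routing envelope, to the other side
        $target = ($socket === $frontend) ? $backend : $frontend;
        $target->sendmulti($socket->recvMulti());
    }
}
```

With this layout, clients would connect their REQ sockets to port 5559 and servers would connect their REP sockets to port 5560.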

       


The Broker listens on two ports: it receives data from multiple clients and forwards it to the Servers. Because the Broker listens on two ports, it uses two sockets, and with multiple sockets we inevitably have to poll them. Previously we might have used libevent to process and forward messages asynchronously. Now ZMQ's $poll->poll is enough to handle multiple sockets asynchronously.

VII. Inter-process communication

ZMQ can communicate between nodes not only over TCP but also through socket files. As shown in Figure 7, we fork three PHP processes and send the data of process 1 to process 3 through socket files.

Figure 7: Inter-process communication
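
The original listing is missing. The sketch below shows one way to relay a message through three processes over ipc:// endpoints, assuming the php-zmq and pcntl extensions on a Unix-like system. The endpoint paths, the function names, and the choice to let the parent act as process 3 (with two forked children as processes 1 and 2) are assumptions made for illustration:

```php
<?php
// Sketch: relay one message from process 1 to process 3 over two ipc:// socket files.
// Assumed for illustration: the endpoint paths and the use of the parent as process 3.
const STEP1 = "ipc:///tmp/zmq-demo-step1.ipc";
const STEP2 = "ipc:///tmp/zmq-demo-step2.ipc";

// Process 1: produce the data
function produce(): void
{
    $context = new ZMQContext(1, false);
    $out = new ZMQSocket($context, ZMQ::SOCKET_PUSH);
    $out->connect(STEP1);
    $out->send("data from process 1");
}

// Process 2: pass the data along
function relay(): void
{
    $context = new ZMQContext(1, false);
    $in = new ZMQSocket($context, ZMQ::SOCKET_PULL);
    $in->bind(STEP1);
    $out = new ZMQSocket($context, ZMQ::SOCKET_PUSH);
    $out->connect(STEP2);
    $out->send($in->recv());
}

// Fork the two helper processes; each child creates its own context after the fork
$children = array();
foreach (array('produce', 'relay') as $stage) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        exit("fork failed" . PHP_EOL);
    }
    if ($pid === 0) {
        $stage();
        exit(0);
    }
    $children[] = $pid;
}

// Process 3 (the parent): receive the relayed data
$context = new ZMQContext(1, false);
$in = new ZMQSocket($context, ZMQ::SOCKET_PULL);
$in->bind(STEP2);
$received = $in->recv();
echo "Process 3 received: ", $received, PHP_EOL;

// Reap the children
foreach ($children as $pid) {
    pcntl_waitpid($pid, $status);
}
```

Each bind() on an ipc:// endpoint creates one socket file, so this layout produces two files under /tmp while it runs.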

        

While the program runs, two additional files appear, as shown in Figure 8.

Figure 9: ZMQ subscription mode extension

We use $context->getSocket(ZMQ::SOCKET_REQ); to set up an additional Request-Reply connection through which each Subscriber reports its identity to the Publisher. The Publisher then publishes its information once all Subscribers are connected.

The Subscriber program is as follows:
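
The Subscriber listing is missing from this copy. A minimal sketch of the idea, assuming the php-zmq extension and the ports named in this article (5561 for publishing, 5562 for checking in). The publisher host, the one-second settle delay, and the "END" marker are assumptions made for illustration:

```php
<?php
// Synchronized subscriber sketch: subscribe first, then report in over REQ.
// Assumed for illustration: publisher host "localhost", a one-second settle delay,
// and an "END" message that marks the end of the configuration stream.
$context = new ZMQContext();
$hostname = gethostname();

// Subscribe to everything on the configuration channel (port 5561)
$subscriber = $context->getSocket(ZMQ::SOCKET_SUB);
$subscriber->connect("tcp://localhost:5561");
$subscriber->setSockOpt(ZMQ::SOCKOPT_SUBSCRIBE, "");

// Crude guard against the slow joiner problem: let the subscription settle
sleep(1);

// Report this machine's name to the publisher (port 5562) and wait for the acknowledgement
$sync = $context->getSocket(ZMQ::SOCKET_REQ);
$sync->connect("tcp://localhost:5562");
$sync->send($hostname);
$sync->recv();

// Collect configuration messages until the end marker arrives
$configs = array();
while (($message = $subscriber->recv()) !== "END") {
    $configs[] = $message;
    echo "Received configuration: ", $message, PHP_EOL;
}
```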

         


The Publisher program is as follows:
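
The Publisher listing is missing as well. A minimal sketch, assuming the php-zmq extension and the ports named in this article. The machine names in the whitelist, the JSON payload, and the "END" marker are hypothetical values chosen for illustration:

```php
<?php
// Synchronized publisher sketch: wait for every whitelisted node, then publish.
// Assumed for illustration: the machine names, the JSON payload, and the "END" marker.
$whitelist = array("web01", "web02");
$pending = array_flip($whitelist);

$context = new ZMQContext();
$publisher = $context->getSocket(ZMQ::SOCKET_PUB);
$publisher->bind("tcp://*:5561");

$sync = $context->getSocket(ZMQ::SOCKET_REP);
$sync->bind("tcp://*:5562");

// Each subscriber reports its machine name; names outside the whitelist are ignored
while (count($pending) > 0) {
    $machine = $sync->recv();
    $sync->send("");
    unset($pending[$machine]);
    printf("%s checked in, %d still missing%s", $machine, count($pending), PHP_EOL);
}

// Everyone is connected: publish the latest configuration, then the end marker
$publisher->send(json_encode(array("cache_ttl" => 300)));
$publisher->send("END");
```

Keeping the pending machines as array keys makes removal a single unset() and makes the "everyone has connected" check a simple count.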

          


Each node connects to the Publisher on port 5562 in Request-Reply mode and tells the Publisher its machine name. The Publisher maintains a machine list as a whitelist, and once every machine in the list has connected, it sends the latest configuration information through port 5561.

For subsequent processing, the Subscriber can write the configuration information to the APC cache, and the application always reads its configuration from that cache. The Subscriber can also update its status information and report it to the Publisher in real time through port 5562.

This example does not show it, but what if the amount of published information is large and a Subscriber suddenly loses its network connection (or its program crashes) while receiving? Will it have lost part of the information when it reconnects? ZMQ takes this into account: the Subscriber sets an id with $subscriber->setSockOpt(ZMQ::SOCKOPT_IDENTITY, $hostname);, and when the Subscriber with this id reconnects, it continues receiving from the point of interruption. The interruption of one node does not affect delivery to the other nodes.

So how does ZMQ deliver messages after a reconnection? It keeps the messages that the disconnected Subscriber should have received in memory, waits for the Subscriber to come back online, and then sends it the cached messages. Memory is limited, of course, and caching too much can exhaust it. ZMQ therefore offers setSockOpt(ZMQ::SOCKOPT_SWAP, 250000) to set the size of a swap space, which prevents the process from running out of memory and crashing. Both mechanisms belong to ZeroMQ 2.x; durable subscriptions and the swap option were removed in ZeroMQ 3.x. Finally, the program's running result is shown in Figure 10.

Figure 10: running result of the configuration center

Of course, this is only a general idea. Applying it in a real production environment requires considering more issues, including stability and fault tolerance. Still, thanks to its high concurrency, stability, and ease of use, ZMQ has a bright future. Its goal is to enter the Linux kernel, and we look forward to that day.
