Distributed Message Queuing for large web site architectures
Reproduced January 26, 2016 08:48:40
This share gives an overview of message queuing, covering common message queue scenarios and message middleware examples (an e-commerce system and a log system).
Outline of this share
- Message Queuing Overview
- Message Queuing scenarios
- Message Middleware Sample
- JMS Messaging Service
- Common Message Queuing
- Reference (recommended) information
- Summary of this share
I. Overview of Message Queuing
Message queue middleware is an important component of distributed systems. It mainly addresses application coupling, asynchronous messaging, and traffic peak shaving, and helps build architectures with high performance, high availability, scalability, and eventual consistency. It is indispensable middleware for large-scale distributed systems.
The message queues most widely used in production today include ActiveMQ, RabbitMQ, ZeroMQ, Kafka, MetaMQ, and RocketMQ.
II. Message Queuing scenarios
The following describes the scenarios in which message queues are commonly used in practice: asynchronous processing, application decoupling, traffic peak shaving, log processing, and message communication.
2.1 Asynchronous processing
Scenario: after a user registers, the system needs to send a registration email and a registration SMS. There are two traditional approaches: 1. serial mode; 2. parallel mode.
(1) Serial mode: after the registration information is successfully written to the database, send the registration email, then send the registration SMS. The response is returned to the client only after all three tasks are complete.
(2) Parallel mode: after the registration information is successfully written to the database, send the registration email and the registration SMS at the same time. The response is returned to the client after all three tasks are complete. The difference from serial mode is that the parallel approach shortens the processing time.
Assume each of the three business steps takes 50 milliseconds. Ignoring other overhead such as the network, serial mode takes 150 milliseconds and parallel mode may take 100 milliseconds.
Because the number of requests a CPU can process per unit of time is fixed, assume the CPU's throughput is 100 per second. In serial mode the CPU can then process about 7 requests in 1 second (1000/150); in parallel mode it can process 10 (1000/100).
Summary: as the case above shows, the traditional approaches hit bottlenecks in system performance (concurrency, throughput, response time). How can this be solved?
Introduce a message queue and process the business logic that is not essential asynchronously.
Under the assumptions above, the user's response time is the time needed to write the registration information to the database, that is, 50 milliseconds. The email and SMS tasks are written to the message queue and the call returns immediately; writing to the queue is fast enough to be ignored, so the response time stays at about 50 milliseconds. After the change, system throughput therefore rises to about 20 QPS (1000/50), 3 times that of serial mode and twice that of parallel mode.
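The comparison between the three modes is plain arithmetic, and a small sketch makes it easy to check. It uses only the 50 millisecond step time assumed in the text; the function and variable names are illustrative.

```python
# Illustrative timings taken from the text: each step costs 50 ms.
STEP_MS = 50

def requests_per_second(response_ms):
    """Requests that fit into 1000 ms at the given response time."""
    return 1000 / response_ms

serial_ms = 3 * STEP_MS      # database write, then email, then SMS
parallel_ms = 2 * STEP_MS    # database write, then email and SMS together
queued_ms = 1 * STEP_MS      # database write only; the queue write is ignored

for name, ms in [("serial", serial_ms), ("parallel", parallel_ms), ("queued", queued_ms)]:
    print(f"{name}: {ms} ms per request, about {requests_per_second(ms):.1f} requests/s")
```

The ratios match the text: 20 requests per second is twice the parallel figure and roughly three times the serial one.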
2.2 Application Decoupling
Scenario: after a user places an order, the order system needs to notify the inventory system. The traditional approach is for the order system to call the inventory system's interface.
Disadvantages of the traditional model:
1) If the inventory system is unreachable, the inventory deduction fails, which causes the order to fail;
2) The order system and the inventory system are tightly coupled.
How can these problems be solved? After a message queue is introduced, the scheme is:
- Order system: after the user places an order, the order system persists it, writes a message to the message queue, and returns success to the user.
- Inventory system: subscribes to the order messages, obtains the order information by pull or push, and performs the inventory operations based on it.
- If the inventory system is unavailable when the order is placed, ordering is not affected, because once the order system has written to the message queue it no longer cares about the subsequent operations. This decouples the order system from the inventory system.
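The decoupling described above can be sketched in a single process. This is a minimal illustration, not a real deployment: Python's standard queue.Queue stands in for the message queue, and the names place_order, run_inventory_consumer, orders_db, and stock are hypothetical.

```python
import queue

order_queue = queue.Queue()   # stands in for the message queue
orders_db = []                # stands in for the order system's database
stock = {"book": 10}          # stands in for the inventory system's data

def place_order(item, quantity):
    """Order system: persist the order, publish a message, return at once."""
    order = {"item": item, "quantity": quantity}
    orders_db.append(order)
    order_queue.put(order)
    return "order accepted"

def run_inventory_consumer():
    """Inventory system: pull pending order messages and deduct stock."""
    handled = 0
    while True:
        try:
            order = order_queue.get_nowait()
        except queue.Empty:
            return handled
        stock[order["item"]] -= order["quantity"]
        handled += 1

# The inventory system is "down": orders still succeed and messages wait.
print(place_order("book", 2))
print(place_order("book", 3))
print(order_queue.qsize(), stock["book"])
# Once the inventory system is back, it catches up from the queue.
print(run_inventory_consumer(), stock["book"])
```

The order path never calls the inventory code directly, so an inventory outage only delays the stock update instead of failing the order.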
2.3 Traffic peak shaving
Traffic peak shaving is another common message queue scenario, widely used in flash sales (seckill) and group-buying rushes.
Scenario: in a flash sale, traffic typically surges so sharply that the application goes down. To solve this, a message queue is generally placed in front of the application.
- It controls the number of people who can take part in the activity;
- It relieves the application from being crushed by high traffic in a short time;
- When a user request arrives, the server first writes it to the message queue. If the queue length exceeds the maximum, the request is discarded or the user is redirected to an error page;
- The flash-sale service then processes the requests in the message queue.
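The admission rule in the list above (enqueue first, reject when the queue is full) can be sketched with a bounded in-process queue. This is only an illustration under assumed names: MAX_PENDING, accept, and process_pending are hypothetical, and a real system would use a message broker rather than queue.Queue.

```python
import queue

MAX_PENDING = 3                       # hypothetical cap on waiting requests
requests = queue.Queue(maxsize=MAX_PENDING)

def accept(user_id):
    """Front end: enqueue the request, or reject it when the queue is full."""
    try:
        requests.put_nowait(user_id)
        return "queued"
    except queue.Full:
        return "rejected"

def process_pending():
    """Flash-sale service: work through queued requests at its own pace."""
    served = []
    while not requests.empty():
        served.append(requests.get_nowait())
    return served

results = [accept(user) for user in range(5)]   # a burst of five requests
served = process_pending()
print(results)
print(served)
```

Only the first three requests of the burst reach the service; the rest are turned away immediately, which is what keeps the back end from being overwhelmed.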
2.4 Log processing
Log processing refers to using a message queue, such as Kafka, in log handling to solve the problem of transferring large volumes of logs. The simplified architecture is:
- Log collection client: collects log data and periodically writes it to the Kafka queue;
- Kafka message queue: receives, stores, and forwards log data;
- Log processing application: subscribes to and consumes the log data in the Kafka queue.
The following is Sina's Kafka log processing case:
Reproduced from (http://cloud.51cto.com/art/201507/484338.htm)
(1) Kafka: the message queue that receives user logs.
(2) Logstash: parses the logs and outputs them to Elasticsearch in a unified JSON format.
(3) Elasticsearch: the core technology of the real-time log analysis service; a schemaless, real-time data store that organizes data through indexes and offers powerful search and statistics.
(4) Kibana: a data visualization component built on Elasticsearch. Its strong visualization capability is an important reason many companies choose the ELK Stack.
2.5 Message Communication
Message communication means that, because message queues generally have an efficient communication mechanism built in, they can also be used for pure messaging, for example to implement point-to-point messaging or chat rooms.
Point-to-point communication:
Client A and client B use the same queue for message communication.
Chat room communication:
Client A, client B, and client N subscribe to the same topic to publish and receive messages, achieving a chat-room-like effect.
These are in fact the two message patterns of message queues: point-to-point and publish-subscribe.
III. Message Middleware Examples
3.1 E-commerce system
The message queue uses highly available, persistent message middleware, such as ActiveMQ, RabbitMQ, or RocketMQ.
(1) After the application completes its core logic, it writes to the message queue. To confirm that a message was sent successfully, the queue's acknowledgment mode can be enabled. (The application returns only after the message queue reports that the message was received successfully, which guarantees message integrity.)
(2) Extension processes (SMS sending, delivery processing) subscribe to the queue's messages and fetch them by push or pull for processing.
(3) While messages decouple the applications, they also introduce a data consistency problem, which can be solved with eventual consistency. For example, the master data is written to the database, and the extension applications carry out their follow-up processing based on the message queue combined with the database.
3.2 Log Collection system
The system consists of four parts: the Zookeeper registry, the log collection client, the Kafka cluster, and the Storm cluster (OtherApp).
- Zookeeper registry: provides load balancing and address lookup services;
- Log collection client: collects the application system's logs and pushes the data to the Kafka queue;
- Kafka cluster: receives, routes, stores, and forwards messages;
- Storm cluster: at the same level as OtherApp, consumes the data in the queue by pulling.
IV. JMS Messaging Service
No discussion of message queues is complete without JMS. The JMS (Java Message Service) API is a standard/specification for messaging services that lets application components on the Java EE platform create, send, receive, and read messages. It makes distributed communication less coupled, and messaging more reliable and asynchronous.
In the EJB architecture, message-driven beans integrate seamlessly with the JMS message service. Among the Java EE architecture patterns there is a message service pattern, used to decouple messages from the application directly.
4.1 Message Model
The JMS standard has two message models: P2P (Point-to-Point) and Publish/Subscribe (Pub/Sub).
4.1.1 P2P mode
The P2P model has three roles: the message queue (Queue), the sender (Sender), and the receiver (Receiver). Each message is sent to a specific queue, and the receiver fetches messages from the queue. The queue retains messages until they are consumed or time out.
Characteristics of P2P
- Each message has only one consumer (Consumer); once consumed, the message is no longer in the message queue
- There is no timing dependency between sender and receiver: once the sender has sent a message, it is delivered to the queue whether or not the receiver is running
- After successfully receiving a message, the receiver must acknowledge it to the queue
If you want every message sent to be processed successfully, use the P2P model.
4.1.2 Pub/sub Mode
The Pub/Sub model has three roles: the topic (Topic), the publisher (Publisher), and the subscriber (Subscriber). Multiple publishers send messages to a topic, and the system delivers those messages to multiple subscribers.
Characteristics of Pub/Sub
- Each message can have multiple consumers
- There is a timing dependency between publisher and subscriber: a subscriber to a topic (Topic) must create its subscription before it can consume the publisher's messages
- To consume messages, the subscriber must remain running
To relax this strict timing dependency, JMS allows subscribers to create a durable subscription. That way, even if the subscriber is not active (running), it can still receive the publisher's messages.
Use the Pub/Sub model if the messages you send may go unprocessed, may be processed by a single consumer, or may be processed by multiple consumers.
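The difference between the two models can be shown with a small in-memory sketch. It is not the JMS API: the classes PointToPointQueue and Topic and their methods are hypothetical stand-ins that model only the delivery rules described above (one consumer per message for P2P; every subscriber registered at publish time for Pub/Sub, without durable subscriptions).

```python
from collections import deque

class PointToPointQueue:
    """Each message is delivered to exactly one receiver."""
    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        return self._messages.popleft() if self._messages else None

class Topic:
    """Each message goes to every subscriber registered at publish time."""
    def __init__(self):
        self._inboxes = {}

    def subscribe(self, name):
        self._inboxes[name] = []
        return self._inboxes[name]

    def publish(self, message):
        for inbox in self._inboxes.values():
            inbox.append(message)

q = PointToPointQueue()
q.send("order-1")      # sent before any receiver exists: no timing dependency
first = q.receive()    # one receiver takes the message
second = q.receive()   # nothing is left for anyone else

t = Topic()
inbox_a = t.subscribe("A")
t.publish("hello")     # B has not subscribed yet, so B misses this one
inbox_b = t.subscribe("B")
t.publish("world")
print(first, second, inbox_a, inbox_b)
```

Subscriber B missing "hello" is the timing dependency that a durable subscription is meant to relax.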
4.2 Message consumption
In JMS, message production and consumption are asynchronous. For consumption, a JMS consumer can receive messages in two ways.
(1) Synchronous
The subscriber or receiver receives messages by calling the receive method, which blocks until a message arrives (or until it times out);
(2) Asynchronous
The subscriber or receiver registers a message listener. When a message arrives, the system automatically calls the listener's onMessage method.
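The two consumption styles can be contrasted with a short sketch. It does not use JMS; Python's standard queue and threading modules stand in for the provider, and receive, start_listener, and the None stop token are assumptions made for the illustration.

```python
import queue
import threading

def receive(q, timeout_s):
    """Synchronous style: block until a message arrives or the timeout expires."""
    try:
        return q.get(timeout=timeout_s)
    except queue.Empty:
        return None

def start_listener(q, on_message, stop_token=None):
    """Asynchronous style: a background thread calls on_message per message."""
    def loop():
        while True:
            message = q.get()
            if message is stop_token:
                break
            on_message(message)
    worker = threading.Thread(target=loop)
    worker.start()
    return worker

sync_q = queue.Queue()
sync_q.put("m1")
got = receive(sync_q, timeout_s=0.1)      # returns the waiting message
missed = receive(sync_q, timeout_s=0.1)   # nothing arrives, so it times out

async_q = queue.Queue()
received = []
worker = start_listener(async_q, received.append)
async_q.put("m2")
async_q.put("m3")
async_q.put(None)   # the stop token ends the listener thread
worker.join()
print(got, missed, received)
```

In the synchronous case the caller decides when to wait; in the asynchronous case the callback runs whenever a message shows up, which is the role onMessage plays in JMS.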
JNDI: the Java Naming and Directory Interface is a standard Java naming system interface for finding and accessing services on the network. Given a resource name, it looks up the corresponding record in the database or naming service and returns the information needed to establish a connection to the resource.
In JMS, JNDI is used to find and access the destination that messages are sent to or the source they come from.
4.3 JMS programming model
(1) ConnectionFactory
The factory for creating Connection objects. For the two JMS message models there are two kinds, QueueConnectionFactory and TopicConnectionFactory. ConnectionFactory objects can be looked up through JNDI.
(2) Destination
A Destination is the target a message producer sends messages to, or the source a message consumer receives them from. For a producer, the Destination is a queue (Queue) or a topic (Topic); for a consumer, the Destination is likewise a queue or a topic (that is, the message source).
So a Destination is really one of two types of object, Queue or Topic. Destinations can be looked up through JNDI.
(3) Connection
A Connection represents the link established between the client and the JMS system (a wrapper around a TCP/IP socket). A Connection can create one or more Sessions. Like ConnectionFactory, Connection comes in two types: QueueConnection and TopicConnection.
(4) Session
A Session is the interface for operating on messages. Producers, consumers, messages, and so on are created through a Session. A Session also provides transactions: when several messages need to be sent or received through a Session, those send/receive actions can be placed in one transaction. Likewise, there are QueueSession and TopicSession.
(5) Message producer
A message producer is created by a Session and sends messages to a Destination. It also comes in two types: QueueSender and TopicPublisher. Call the producer's method (send or publish) to send a message.
(6) Message consumer
A message consumer is created by a Session and receives the messages sent to a Destination. There are two types: QueueReceiver and TopicSubscriber, created with the Session's createReceiver(Queue) and createSubscriber(Topic) respectively. A durable subscriber can also be created with the Session's createDurableSubscriber method.
(7) MessageListener
The message listener. If a message listener is registered, its onMessage method is called automatically once a message arrives. An MDB (Message-Driven Bean) in EJB is a kind of MessageListener.
Studying JMS in depth is a great help in mastering Java architecture, and message middleware is an essential component of large distributed systems. This share is mainly an overall introduction; going deeper requires your own study, practice, summarizing, and understanding.
V. Common message queues
Commercial containers such as WebLogic and JBoss generally support the JMS standard, which makes development convenient. Free containers such as Tomcat and Jetty, however, require third-party message middleware. This section introduces common message middleware (ActiveMQ, RabbitMQ, ZeroMQ, Kafka) and their features.
5.1 ActiveMQ
ActiveMQ is Apache's most popular and powerful open source message bus. It is a JMS provider implementation that fully supports the JMS 1.1 and J2EE 1.4 specifications. Although the JMS specification has been around for a long time, JMS still plays a special role in today's Java EE applications.
ActiveMQ's features are as follows:
⒈ Clients can be written in multiple languages and protocols. Languages: Java, C, C++, C#, Ruby, Perl, Python, PHP. Application protocols: OpenWire, STOMP, REST, WS-Notification, XMPP, AMQP
⒉ Full support for the JMS 1.1 and J2EE 1.4 specifications (persistence, XA messages, transactions)
⒊ Spring support: ActiveMQ can easily be embedded in systems that use Spring, and it supports Spring 2.0 features
⒋ Tested on common J2EE servers (such as Geronimo, JBoss 4, GlassFish, WebLogic); with JCA 1.5 resource adaptors configured, ActiveMQ can be deployed automatically to any J2EE 1.4 compatible commercial server
⒌ Support for multiple transport protocols: in-VM, TCP, SSL, NIO, UDP, JGroups, JXTA
⒍ Fast message persistence through JDBC and the journal
⒎ Designed for high-performance clustering, client-server, and point-to-point use
⒏ Ajax support
⒐ Integration with Axis
⒑ The embedded JMS provider can easily be invoked for testing
5.2 RabbitMQ
RabbitMQ is a popular open source message queue system written in Erlang. It is the standard implementation of AMQP (Advanced Message Queuing Protocol). It supports many clients, such as Python, Ruby, .NET, Java, JMS, C, PHP, ActionScript, XMPP, and STOMP, and supports Ajax and persistence. It is used to store and forward messages in distributed systems and performs well in ease of use, scalability, and high availability.
Several important concepts:
Broker: simply put, the message queue server entity.
Exchange: the message exchange, which specifies the rules by which messages are routed and to which queue.
Queue: the message queue carrier; each message is placed in one or more queues.
Binding: binds an exchange to a queue according to routing rules.
Routing key: the routing keyword; the exchange delivers messages based on this key.
Vhost: a virtual host. One broker can host multiple vhosts, used to separate the permissions of different users.
Producer: the message producer, the program that delivers messages.
Consumer: the message consumer, the program that receives messages.
Channel: the message channel. Multiple channels can be established within each client connection, and each channel represents one session task.
The process of using the message queue is roughly as follows:
(1) The client connects to the message queue server and opens a channel.
(2) The client declares an exchange and sets its properties.
(3) The client declares a queue and sets its properties.
(4) The client uses a routing key to establish the binding between the exchange and the queue.
(5) The client publishes messages to the exchange.
When the exchange receives a message, it routes it to one or more queues based on the message's key and the bindings already set up.
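The routing step can be illustrated with a tiny in-memory model. It is not a RabbitMQ client and involves no broker or channel; the DirectExchange class, its bind and publish methods, and the routing keys are hypothetical, and the model only covers exact-match routing on the key.

```python
from collections import defaultdict

class DirectExchange:
    """Routes a message to every queue bound with a matching routing key."""
    def __init__(self):
        self._bindings = defaultdict(list)   # routing key -> bound queues

    def bind(self, queue_messages, routing_key):
        self._bindings[routing_key].append(queue_messages)

    def publish(self, routing_key, body):
        targets = self._bindings.get(routing_key, [])
        for queue_messages in targets:
            queue_messages.append(body)
        return len(targets)

exchange = DirectExchange()          # step 2: declare an exchange
email_queue, audit_queue = [], []    # step 3: declare queues
exchange.bind(email_queue, "user.registered")   # step 4: bind with routing keys
exchange.bind(audit_queue, "user.registered")
exchange.bind(audit_queue, "user.deleted")

routed = exchange.publish("user.registered", "alice")   # step 5: publish
unrouted = exchange.publish("user.unknown", "bob")      # no binding matches
print(routed, unrouted, email_queue, audit_queue)
```

One published message lands in both queues bound to its key, while a message whose key has no binding reaches no queue in this model.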
5.3 ZeroMQ
Billed as the fastest message queue in history, ZeroMQ is actually a set of socket-like interfaces. The difference from ordinary sockets is that an ordinary socket is end-to-end (a 1:1 relationship), while ZMQ can be N:M. Most people understand BSD sockets as point-to-point connections, which require explicitly establishing and tearing down the connection, choosing a protocol (TCP/UDP), and handling errors. ZMQ hides these details and makes network programming simpler. ZMQ is used for communication between nodes, where a node can be a host or a process.
To quote the official description: "ZMQ (ZeroMQ is referred to as ZMQ below) is a simple, easy-to-use transport layer, a socket library that works like a framework. It makes socket programming simpler, more concise, and higher-performing. It is a message processing queue library that scales elastically across multiple threads, cores, and hosts. ZMQ's stated goal is to 'become part of the standard network protocol stack and later enter the Linux kernel.' It has not yet succeeded at that. It is, however, a very promising layer of encapsulation over 'traditional' BSD sockets, and one that people need very much. ZMQ makes writing high-performance network applications extremely simple and fun."
Features:
- High performance, non-persistent;
- Cross-platform: supports Linux, Windows, OS X, and more;
- Multi-language support: C, C++, Java, .NET, Python, and more than 30 other development languages;
- Can be deployed standalone or integrated into an application;
- Can be used as a socket communication library.
Compared with RabbitMQ, ZMQ is not a message queue server in the traditional sense. In fact it is not a server at all; it is more like a low-level network communication library that adds a layer of encapsulation on top of the socket API, abstracting network communication, inter-process communication, and inter-thread communication into a unified API. It supports three basic models, "Request-Reply", "Publisher-Subscriber", and "Parallel Pipeline", plus extended models.
Key points of ZeroMQ's high-performance design:
1. Lock-free queue model
For the pipe, the data exchange channel used in cross-thread interaction (between the user side and the session), ZeroMQ uses a lock-free queue algorithm based on CAS. Asynchronous events are registered at both ends of the pipe, so read and write events are triggered automatically when messages are read from or written to the pipe.
2. Batch processing algorithm
In traditional message processing, every message sent or received requires a system call, so the system overhead is high for large numbers of messages. ZeroMQ adaptively optimizes for bulk messages and can receive and send messages in batches.
3. Thread binding on multiple cores, without CPU switching
Unlike the traditional multithreaded concurrency model with semaphores or critical sections, ZeroMQ takes full advantage of multiple cores: each core is bound to one worker thread, avoiding CPU switching overhead between threads.
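The batching idea in point 2 can be sketched in general terms. This is not ZeroMQ's actual algorithm or API, only an illustration of why batching reduces per-message overhead; drain_batch, send_all, and the sent_calls list (which stands in for expensive system calls) are hypothetical.

```python
import queue

def drain_batch(q, max_batch):
    """Block for the first message, then take whatever else is already waiting."""
    batch = [q.get()]
    while len(batch) < max_batch:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch

sent_calls = []   # records each simulated "system call" and its payload

def send_all(q, max_batch):
    """Flush the queue, issuing one simulated call per batch."""
    while not q.empty():
        sent_calls.append(drain_batch(q, max_batch))

outbox = queue.Queue()
for i in range(7):
    outbox.put(i)
send_all(outbox, max_batch=3)
print(sent_calls)
```

Seven messages go out in three calls instead of seven. The batch size adapts to load: when only one message is waiting, it is sent alone without delay.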
5.4 Kafka
Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all the activity stream data of a consumer-scale website. Such activity (page views, searches, and other user actions) is a key factor in many social features of the modern web. Because of throughput requirements, this data is usually handled through log processing and log aggregation. For systems that, like Hadoop, work on log data and offline analysis but also need real-time processing, Kafka is a viable solution. Its purpose is to unify online and offline message processing through Hadoop's parallel loading mechanism, and also to provide real-time consumption across a cluster of machines.
Kafka has the following features:
- Message persistence through an O(1) disk data structure that keeps performance stable over time even with terabytes of stored messages. (Data is appended to files, and expired data is deleted periodically.)
- High throughput: even on very ordinary hardware, Kafka can support millions of messages per second.
- Support for partitioning messages across Kafka servers and consumer clusters.
- Support for Hadoop parallel data loading.
Kafka-related concepts
Broker: a Kafka cluster contains one or more servers, each of which is called a broker.
Topic: every message published to a Kafka cluster has a category, called its topic. (Physically, messages of different topics are stored separately. Logically, a topic's messages are stored on one or more brokers, but users only need to specify the topic to produce or consume data, without caring where the data is stored.)
Partition: a partition is a physical concept; each topic contains one or more partitions.
Producer: responsible for publishing messages to a Kafka broker.
Consumer: the message consumer, the client that reads messages from a Kafka broker.
Consumer Group: each consumer belongs to a specific consumer group (a group name can be specified for each consumer; a consumer without one belongs to the default group).
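How topics, partitions, and consumer groups relate can be sketched with an in-memory model. It is not the Kafka client API: the Topic and ConsumerGroup classes are hypothetical, the CRC32-based key hashing is an assumption chosen for simplicity rather than Kafka's own partitioner, and brokers, replication, and persistence are left out.

```python
import zlib

class Topic:
    """An append-only log split into partitions."""
    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key, value):
        # The same key always maps to the same partition.
        index = zlib.crc32(key.encode()) % len(self.partitions)
        self.partitions[index].append(value)
        return index

class ConsumerGroup:
    """Tracks one read offset per partition for the whole group."""
    def __init__(self, topic):
        self.topic = topic
        self.offsets = [0] * len(topic.partitions)

    def poll(self, partition):
        log = self.topic.partitions[partition]
        new_messages = log[self.offsets[partition]:]
        self.offsets[partition] = len(log)
        return new_messages

topic = Topic(num_partitions=2)
p1 = topic.produce("user-1", "login")
p2 = topic.produce("user-1", "search")

analytics = ConsumerGroup(topic)
alerts = ConsumerGroup(topic)
first_read = analytics.poll(p1)    # both messages
second_read = analytics.poll(p1)   # nothing new for this group
other_group = alerts.poll(p1)      # an independent group sees everything again
print(p1 == p2, first_read, second_read, other_group)
```

Because reading only advances an offset and does not remove data, each group consumes the same log independently, which is how one topic can feed both real-time and offline consumers.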
Kafka is generally used for big data log processing, or in scenarios that can tolerate slightly lower real-time performance (a small delay) and reliability (a small amount of lost data).
VI. References
The following are the materials consulted for this share and materials recommended for your reference.
Reference materials (available for reference):
(1) JMS
http://blog.sina.com.cn/s/blog_3fba24680100r777.html
http://blog.csdn.net/jiuqiyuliang/article/details/46701559 (JMS (i)--JMS basic concepts)
(2) RabbitMQ
http://baike.baidu.com/link?url=s2cU-QgOsXan7j0AM5qxxlmruz6WEeBQXX-Bbk0O3F5jt9Qts2uYQARxQxl7CBT2SO2NF2VkzX_XZLqU-CTaPa
http://blog.csdn.net/sun305355024sun/article/details/41913105
(3) ZeroMQ
http://www.searchtb.com/2012/08/zeromq-primer.html
http://blog.csdn.net/yangbutao/article/details/8498790
http://wenku.baidu.com/link?url=yYoiZ_ Pypcuuxesgqvmmley08bcptzvwf3imho2w1i-ti66yxxpplljbgxboddwggbnoehhiudslfhtz7rgzykrtmqq02dv5sv9jff4lznk
(4) Kafka
http://baike.baidu.com/link?url=qQXyqvPQ1MVrw9WkOGSGEfSX1NHy4unsgc4ezzJwU94SrPuVnrKf2tbm4SllVaN3ArGGxV_N5hw8JTT2-lw4QK
http://www.infoq.com/cn/articles/apache-kafka/
http://www.mincoder.com/article/3942.shtml
Shared electronic materials (in the group files)
(1) ActiveMQ
(2) Kafka
(3) Notify
VII. Summary of this share
This week's share covered an overview of message queues, common message queue scenarios (asynchronous processing, application decoupling, traffic peak shaving, log processing, and message communication), the JMS Java message service, and an introduction to several popular message queues. Two architectures that use message middleware were also demonstrated.
Because time was limited, some explanations are not detailed; for more, search Baidu or Google. I hope this share is helpful to everyone.
This is the last share before the Spring Festival; sharing will resume after the holiday. Next year the "Large Web Site Architecture series" will continue, and a "Step by Step Learning Architecture series" will be added. The specific times and topics will be announced through QQ group notices. Thank you for your attention.
Sharing is a joy and also a process of personal growth. The articles are generally my own study summaries and work experience, so shortcomings are unavoidable; corrections are welcome so we can improve together. I have set up an architecture-focused QQ group, 466097527; you are welcome to join. It focuses on large-scale distributed web site architecture, big data, architectural patterns, and design patterns.
Original: http://www.cnblogs.com/itfly8/p/5155983.html