VI. Message Consumer
I. Configuring consumers
Every Java consumer needs a ConsumerConfig configuration instance.
II. Consumer groups
In MetaQ, consumers are treated as a cluster: a group of machines jointly consumes a topic. The most important setting in ConsumerConfig is therefore group. Each consumer must tell MetaQ which group it belongs to; MetaQ then finds all consumers registered under that group and balances the load among them, so that together they consume one or more topics. Note that different groups can be regarded as different consumers: even when they consume the same topic, each group has its own consumption progress.
For example, suppose you have a topic business-logs that holds the logs of all business systems, and you want to do two things with these logs: store them in a distributed file system such as HDFS for later analysis, and feed them into a real-time analytics system such as Twitter Storm for real-time analysis, alerting, and presentation. Obviously you need two groups here. One group, hdfs-writer, has three machines that consume business-logs at the same time and store the logs in the HDFS cluster. Another group, storm-spouts, has five machines that feed data to the Storm cluster. The two groups are isolated: although they consume the same topic, their consumption progress (how many messages have been consumed and how many are still waiting) differs. Within a single group, however, the members consume messages jointly. Of the three hdfs-writer machines, only one processes any given business-logs message, but that same message is also consumed by one machine in the storm-spouts group.
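The group semantics above can be illustrated with a small in-memory model. This is only a sketch, not the MetaQ client API: the class GroupDemo, its nested Group class, and the per-message round-robin are hypothetical stand-ins for whatever load balancing the real client performs. It shows that each group keeps its own offset and that, within a group, each message goes to exactly one member.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory model of consumer-group semantics; not the MetaQ client API. */
class GroupDemo {
    /** One consumer group: a shared offset plus a fixed number of member machines. */
    static final class Group {
        final String name;
        final int members;
        int offset = 0; // consumption progress, private to this group
        final Map<Integer, List<String>> received = new HashMap<>();

        Group(String name, int members) {
            this.name = name;
            this.members = members;
        }

        /** Consumes every message not yet seen, handing each one to exactly one member. */
        void consume(List<String> topic) {
            while (offset < topic.size()) {
                int member = offset % members; // round-robin stand-in for load balancing
                received.computeIfAbsent(member, k -> new ArrayList<>()).add(topic.get(offset));
                offset++;
            }
        }

        /** Total number of messages delivered to all members of this group. */
        int totalReceived() {
            int total = 0;
            for (List<String> msgs : received.values()) {
                total += msgs.size();
            }
            return total;
        }
    }

    public static void main(String[] args) {
        List<String> businessLogs = new ArrayList<>(
                Arrays.asList("log-1", "log-2", "log-3", "log-4", "log-5", "log-6"));
        Group hdfsWriter = new Group("hdfs-writer", 3);
        Group stormSpouts = new Group("storm-spouts", 5);

        hdfsWriter.consume(businessLogs);  // hdfs-writer catches up first
        businessLogs.add("log-7");         // a new message arrives
        stormSpouts.consume(businessLogs); // storm-spouts consumes all messages independently

        System.out.println(hdfsWriter.name + " offset=" + hdfsWriter.offset);
        System.out.println(stormSpouts.name + " offset=" + stormSpouts.offset);
    }
}
```

After the run, the two groups report different offsets for the same topic, which is the isolation the example describes.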
III. Creating a ConsumerConfig
Create a ConsumerConfig and pass in the group name:
String group = ...;
ConsumerConfig consumerConfig = new ConsumerConfig(group);
Other important ConsumerConfig options include the following:
- fetchRunnerCount: MetaQ consumers use a pull model, fetching data from the server and then consuming it. This parameter sets the number of threads that fetch in parallel; the default is the number of CPUs. For the concurrency model of consumption, see the Concurrency Processing section below.
- fetchTimeoutInMills: the timeout for a synchronous fetch request. The default is 10 seconds; you usually do not need to change it.
- maxDelayFetchTimeInMills: the maximum time, in milliseconds, that a fetch thread sleeps after a fetch returns no messages; the default is 5 seconds. When a fetch returns no messages, the fetch thread sleeps for 1/10 of maxDelayFetchTimeInMills. If the next fetch is also empty, it sleeps for 2/10 of maxDelayFetchTimeInMills, and so on, up to a maximum sleep of maxDelayFetchTimeInMills. If any fetch along the way returns data, the count is reset and starts again from 1/10. If you are particularly sensitive to message latency, lower this parameter and also lower the unflushInterval parameter on the server side.
- consumerId: the ID of a single consumer. It must be globally unique and is typically used to identify an individual consumer within a group. If it is not set, the system generates one automatically from the IP address and a timestamp.
- offset: the offset at which the first consumption starts. By default, consumption starts from the earliest data on the server.
- commitOffsetPeriodInMills: the interval, in milliseconds, at which the offsets of the data the consumer has consumed are saved; the default is 5 seconds. A larger interval may cause more messages to be consumed again after a failure or restart; a smaller interval may put pressure on the offset storage.
- maxFetchRetries: the maximum number of times consumption of the same message is retried after a failure; the default is 5. Once the limit is exceeded, the message is skipped and RejectConsumptionHandler is called to handle it. For RejectConsumptionHandler, see the Reject handling section below.
All of these parameters have corresponding getter and setter methods.
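The sleep schedule described for maxDelayFetchTimeInMills can be sketched as follows. This is an illustration of the rule as stated in the text, not MetaQ's actual source; the class FetchBackoff and its method names are hypothetical.

```java
/** Sketch of the idle-fetch backoff rule; not MetaQ's actual implementation. */
class FetchBackoff {
    private final long maxDelayFetchTimeInMills;
    private int emptyFetches = 0; // consecutive fetches that returned no messages

    FetchBackoff(long maxDelayFetchTimeInMills) {
        this.maxDelayFetchTimeInMills = maxDelayFetchTimeInMills;
    }

    /** Records an empty fetch and returns how long the fetch thread should sleep, in ms. */
    long onEmptyFetch() {
        if (emptyFetches < 10) {
            emptyFetches++; // 1/10, 2/10, ... capped at 10/10 of the maximum
        }
        return maxDelayFetchTimeInMills * emptyFetches / 10;
    }

    /** Records a fetch that returned data: the count goes back to zero. */
    void onDataFetched() {
        emptyFetches = 0;
    }

    public static void main(String[] args) {
        FetchBackoff backoff = new FetchBackoff(5000); // the documented 5-second default
        for (int i = 1; i <= 12; i++) {
            System.out.println("empty fetch " + i + " -> sleep " + backoff.onEmptyFetch() + " ms");
        }
        backoff.onDataFetched();
        System.out.println("after data -> sleep " + backoff.onEmptyFetch() + " ms");
    }
}
```

With the default of 5 seconds, the sleep grows in 500 ms steps, which is why lowering the parameter shortens the worst-case delay before a newly arrived message is fetched.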
IV. Creating consumers
final MessageConsumer consumer = sessionFactory.createConsumer(consumerConfig);
V. Offset storage
MetaQ uses a pull consumption model: the consumer pulls data from the server's data files starting at the absolute offset of the last data it consumed, then continues consuming. This offset information is therefore critical and must be stored reliably. By default, MetaQ stores offsets on the ZooKeeper cluster you are using; this is done by ZkOffsetStorage, which implements the OffsetStorage interface. This approach is usually reliable and safe, but sometimes you may need other options. Two other OffsetStorage implementations are available:
- LocalOffsetStorage: uses a local file on the consumer as the offset store; by default offsets are saved in the file ${HOME}/.meta_offsets. It suits the case where a consumer group has only one consumer and offsets do not need to be shared. Broadcast-type consumers are a particularly good fit.
- MysqlOffsetStorage: uses MySQL as the offset store. You need to create the table before use:
CREATE TABLE ... ( ... auto_increment, ..., KEY (...) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
You can also implement your own OffsetStorage. To use an offset store other than ZooKeeper, pass it in when creating the consumer:
MessageConsumer consumer = sessionFactory.createConsumer(consumerConfig, new MysqlOffsetStorage(dataSource));
MysqlOffsetStorage requires a JDBC data source to be passed in.
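To make the idea of a local-file offset store concrete, here is a minimal sketch. It is not MetaQ's LocalOffsetStorage and does not implement the real OffsetStorage interface, whose methods are not shown in this text. The class FileOffsetStore, its load and commit methods, the NONE marker, and the group/topic/partition key layout are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Illustrative local-file offset store; NOT MetaQ's LocalOffsetStorage. */
class FileOffsetStore {
    static final long NONE = -1L; // hypothetical marker: no offset saved yet
    private final Path file;

    FileOffsetStore(Path file) {
        this.file = file;
    }

    /** Demo helper: creates a fresh temporary file to back a store. */
    static Path newTempFile() {
        try {
            return Files.createTempFile("offsets", ".properties");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumed key layout: one entry per group, topic, and partition.
    private static String key(String group, String topic, int partition) {
        return group + "/" + topic + "/" + partition;
    }

    private Properties read() {
        Properties props = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return props;
    }

    /** Returns the saved offset, or NONE if nothing has been committed for this key. */
    synchronized long load(String group, String topic, int partition) {
        String value = read().getProperty(key(group, topic, partition));
        return value == null ? NONE : Long.parseLong(value);
    }

    /** Saves the offset by rewriting the whole file. */
    synchronized void commit(String group, String topic, int partition, long offset) {
        Properties props = read();
        props.setProperty(key(group, topic, partition), Long.toString(offset));
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "consumer offsets");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = newTempFile();
        new FileOffsetStore(file).commit("hdfs-writer", "business-logs", 0, 1024L);
        // A new instance on the same file sees the committed offset, as after a restart.
        System.out.println(new FileOffsetStore(file).load("hdfs-writer", "business-logs", 0));
        file.toFile().delete();
    }
}
```

Because the file lives on one machine, a store like this only works when no other consumer needs to read the same offsets, which matches the single-consumer and broadcast cases mentioned above.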
Initial offset for the first consumption
As mentioned earlier, ConsumerConfig has an offset parameter that sets the absolute offset at which the first consumption begins. By default this parameter is 0, so consumption starts from the minimum offset of the messages currently on the server and covers all messages from the beginning.
In general, however, a new consumer group is expected to start from the latest messages. ConsumerConfig provides the method setConsumeFromMaxOffset(boolean always) to start consumption from the latest position. The always parameter indicates whether consumption starts from the latest position every time the consumer starts, thereby ignoring messages published while the consumer was stopped. always is usually set to true only for testing, so that each test run sees only the latest messages. Do not set always to true unless you really do not need the messages published while the consumer is stopped, for example during a restart.
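The interaction between the stored offset, the offset parameter, and setConsumeFromMaxOffset(boolean always) can be summarized in a small decision function. This is a sketch of the behavior as described in the text, not MetaQ's source; the class StartOffsetDemo, the resolve method, and the NO_STORED_OFFSET marker are hypothetical.

```java
/** Sketch of how a starting offset could be chosen; an illustration, not MetaQ's source. */
class StartOffsetDemo {
    static final long NO_STORED_OFFSET = -1L; // hypothetical marker: the group never committed

    /**
     * @param storedOffset  offset saved for this group, or NO_STORED_OFFSET
     * @param configOffset  the ConsumerConfig offset value (0 by default)
     * @param fromMaxOffset whether setConsumeFromMaxOffset was called
     * @param always        the argument that was passed to setConsumeFromMaxOffset
     * @param maxOffset     the latest offset currently on the server
     */
    static long resolve(long storedOffset, long configOffset,
                        boolean fromMaxOffset, boolean always, long maxOffset) {
        boolean firstConsumption = storedOffset == NO_STORED_OFFSET;
        if (fromMaxOffset && (always || firstConsumption)) {
            return maxOffset; // skip everything published before this start
        }
        return firstConsumption ? configOffset : storedOffset;
    }

    public static void main(String[] args) {
        long max = 900L;
        System.out.println("new group, defaults:      " + resolve(NO_STORED_OFFSET, 0, false, false, max));
        System.out.println("new group, from max:      " + resolve(NO_STORED_OFFSET, 0, true, false, max));
        System.out.println("restart, always = false:  " + resolve(400, 0, true, false, max));
        System.out.println("restart, always = true:   " + resolve(400, 0, true, true, max));
    }
}
```

The restart cases show why always = true is risky outside of testing: the stored progress is ignored and the messages between the stored offset and the latest offset are never consumed.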
VI. Reference: https://github.com/killme2008/Metamorphosis/wiki/Java-MessageConsumer