Configuring a Flume cluster
Reference: https://www.cnblogs.com/jifengblog/p/9277793.html
Load-balance (load balancing) introduction
Load balancing is a technique for spreading requests across several machines (or processes) when a single one cannot handle them all.
The Load Balancing Sink Processor implements this in Flume. For example, agent1 acts as a routing node that distributes the Events staged in its Channel across several Sink components, and each Sink is connected to a separate downstream Agent.
In everyday terms, load balancing (load_balance) answers the question of how to divide the work when one worker cannot handle everything alone.
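The routing idea described above can be sketched in a few lines of Python. This is only an illustration of round-robin selection that skips unavailable sinks, not Flume's actual implementation; the function `round_robin_dispatch` and the `is_up` callback are hypothetical names introduced for this example.

```python
from itertools import cycle

def round_robin_dispatch(events, sinks, is_up):
    """Assign each event to the next available sink in round-robin order."""
    assignments = []
    order = cycle(sinks)
    for event in events:
        # Try each sink at most once per event, starting where we left off.
        for _ in range(len(sinks)):
            sink = next(order)
            if is_up(sink):
                assignments.append((event, sink))
                break
        else:
            raise RuntimeError("no sink available for " + repr(event))
    return assignments

events = ["e1", "e2", "e3", "e4"]
# Both sinks healthy: events alternate between k1 and k2.
all_up = round_robin_dispatch(events, ["k1", "k2"], lambda s: True)
# k2 unavailable: every event falls through to k1.
k2_down = round_robin_dispatch(events, ["k1", "k2"], lambda s: s != "k2")
print(all_up)
print(k2_down)
```

With both sinks up, the events alternate between k1 and k2; with k2 down, all of them go to k1.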
Configuration
Agent1
cd /export/servers/flume/conf
vi exec-avro.conf
# agent1 name
agent1.channels = c1
agent1.sources = r1
agent1.sinks = k1 k2

# set group
agent1.sinkgroups = g1

# set channel
agent1.channels.c1.type = memory
agent1.channels.c1.capacity = 1000
agent1.channels.c1.transactionCapacity = 100

agent1.sources.r1.channels = c1
agent1.sources.r1.type = exec
agent1.sources.r1.command = tail -f /root/logs/123.log

# set sink1
agent1.sinks.k1.channel = c1
agent1.sinks.k1.type = avro
agent1.sinks.k1.hostname = node-2
agent1.sinks.k1.port = 52020

# set sink2
agent1.sinks.k2.channel = c1
agent1.sinks.k2.type = avro
agent1.sinks.k2.hostname = node-3
agent1.sinks.k2.port = 52020

# set sink group
agent1.sinkgroups.g1.sinks = k1 k2

# set load balancing
agent1.sinkgroups.g1.processor.type = load_balance
# if enabled, failed sinks are temporarily blacklisted
agent1.sinkgroups.g1.processor.backoff = true
# round-robin selection
agent1.sinkgroups.g1.processor.selector = round_robin
# maximum time (ms) a failed sink stays blacklisted; the blacklist period
# grows exponentially with repeated failures, up to this limit
agent1.sinkgroups.g1.processor.selector.maxTimeOut = 10000
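To make the backoff settings concrete, here is a rough Python sketch of an exponentially growing blacklist period capped at maxTimeOut. The 10000 ms cap comes from the configuration above; the 1000 ms starting delay (`base_ms`) and the function name `backoff_ms` are assumptions made for illustration and are not taken from Flume's source.

```python
def backoff_ms(consecutive_failures, max_timeout_ms=10000, base_ms=1000):
    """Blacklist duration after N consecutive failures: doubles each time, capped."""
    # base_ms is an assumed starting delay, used only for illustration.
    return min(max_timeout_ms, base_ms * 2 ** (consecutive_failures - 1))

# Blacklist periods for the 1st through 6th consecutive failure.
schedule = [backoff_ms(n) for n in range(1, 7)]
print(schedule)  # [1000, 2000, 4000, 8000, 10000, 10000]
```

Under these assumptions the delay stops growing at the fifth failure, which is the effect the cap is meant to have.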
Agent2
cd /export/servers/flume/conf
vi avro-logger.conf
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind = node-2
a1.sources.r1.port = 52020

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
The remaining agents use the same configuration; only host-specific values differ, such as the bind address (node-3 for the agent behind sink k2).
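Because the downstream agents differ only in host-specific values, their configuration files could be generated from one template. The following Python sketch assumes the avro-logger.conf layout shown in this article; the helper name `render_avro_logger_conf` is hypothetical, and the output would still need to be written to each node by hand or by a deployment tool.

```python
TEMPLATE = """\
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = avro
a1.sources.r1.bind = {host}
a1.sources.r1.port = {port}

a1.sinks.k1.type = logger

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
"""

def render_avro_logger_conf(host, port=52020):
    """Return the text of avro-logger.conf for one downstream agent."""
    return TEMPLATE.format(host=host, port=port)

# Configuration text for the agent running on node-3.
conf_node3 = render_avro_logger_conf("node-3")
print(conf_node3)
```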
Start command for agent2 through agentN
In a multi-tier Flume setup, start the agents farthest from the data source first, so that the downstream Avro sources are already listening when agent1 begins sending.
bin/flume-ng agent -c conf -f conf/avro-logger.conf -n a1 -Dflume.root.logger=INFO,console
Start command for agent1
bin/flume-ng agent -c conf -f conf/exec-avro.conf -n agent1 -Dflume.root.logger=INFO,console
Write test data to /root/logs/123.log with a small script:
while true; do date >>/root/logs/123.log;sleep 0.5;done
Failover (fault tolerance) introduction
The Failover Sink Processor provides failover. The setup looks similar to load-balance, but the internal mechanism is completely different.
The Failover Sink Processor maintains a prioritized list of Sink components and guarantees that Events keep being delivered as long as at least one Sink is available. A failed Sink is moved to a pool where it is assigned a cooldown period before it is retried, and the cooldown grows with each consecutive failure. Once the Sink successfully sends an Event, it returns to the active pool. Each Sink has a priority: the larger the number, the higher the priority.
For example, a Sink with priority 100 is activated before a Sink with priority 80. If a Sink fails while sending an Event, the available Sink with the next-highest priority is tried. If no priority is specified, the priority is determined by the order in which the Sinks are listed in the configuration.
In everyday terms, failover addresses the single point of failure: if one worker goes down and nothing can replace it, the whole service becomes unavailable, whereas a standby can take over the work.
The most common approach to fault tolerance is HA (high availability).
Only one Sink is active at any given time.
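The selection rule described above can be sketched as follows. This is a simplified illustration, not Flume's implementation: the function `pick_sink`, the `cooling_until` dictionary, and the abstract time units are all assumptions, and the priorities 10 and 1 mirror the ones used in this article's failover configuration.

```python
def pick_sink(priorities, cooling_until, now):
    """Return the highest-priority sink that is not in cooldown, or None."""
    live = [s for s in priorities if cooling_until.get(s, 0) <= now]
    return max(live, key=priorities.get) if live else None

priorities = {"k1": 10, "k2": 1}
cooling_until = {}

# All sinks healthy: the highest priority wins.
first = pick_sink(priorities, cooling_until, now=0)
# k1 fails at t=0 and is put in cooldown until t=5 (illustrative units).
cooling_until["k1"] = 5
during = pick_sink(priorities, cooling_until, now=3)
# Once the cooldown has elapsed, k1 is eligible again and takes over.
after = pick_sink(priorities, cooling_until, now=5)
print(first, during, after)  # k1 k2 k1
```

Only one sink is chosen at a time, which is the key difference from the round-robin behavior of load_balance.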
Configuration
Only exec-avro.conf differs from the load-balancing setup; everything else stays the same.
Modify agent1's exec-avro.conf:
# agent1 name
agent1.channels = c1
agent1.sources = r1
agent1.sinks = k1 k2

# set group
agent1.sinkgroups = g1

# set channel
agent1.channels.c1.type = memory
agent1.channels.c1.capacity = 1000
agent1.channels.c1.transactionCapacity = 100

agent1.sources.r1.channels = c1
agent1.sources.r1.type = exec
agent1.sources.r1.command = tail -f /root/logs/456.log

# set sink1
agent1.sinks.k1.channel = c1
agent1.sinks.k1.type = avro
agent1.sinks.k1.hostname = node-2
agent1.sinks.k1.port = 52020

# set sink2
agent1.sinks.k2.channel = c1
agent1.sinks.k2.type = avro
agent1.sinks.k2.hostname = node-3
agent1.sinks.k2.port = 52020

# set sink group
agent1.sinkgroups.g1.sinks = k1 k2

# set failover
agent1.sinkgroups.g1.processor.type = failover
agent1.sinkgroups.g1.processor.priority.k1 = 10
agent1.sinkgroups.g1.processor.priority.k2 = 1
agent1.sinkgroups.g1.processor.maxpenalty = 10000
Start command for agent2 through agentN
In a multi-tier Flume setup, start the agents farthest from the data source first, so that the downstream Avro sources are already listening when agent1 begins sending.
bin/flume-ng agent -c conf -f conf/avro-logger.conf -n a1 -Dflume.root.logger=INFO,console
Start command for agent1
bin/flume-ng agent -c conf -f conf/exec-avro.conf -n agent1 -Dflume.root.logger=INFO,console
Write test data to /root/logs/456.log with a small script:
while true; do date >> /root/logs/456.log; sleep 0.5; done