1. Practice Scenario
Simulate how an upstream Flume agent fails over between downstream collectors when sending events (failover).
1) Initial state: the upstream agent delivers events to the active downstream node, Collector1
2) Collector1 failure: kill its process to simulate a fault; events are then sent to Collector2, completing the failover
3) Collector1 recovery: restart the process; after the maximum penalty time (maxpenalty) elapses, events are delivered to Collector1 again
2. Configuration files
Agent configuration file
# flume-failover-client
# agent name: a1
# source: exec with a given command; the command's output is monitored and each line becomes an event
# channel: memory
# sinks: k1 and k2, each of Avro type, linking to a next-level collector

# 1. define source, channel, sink names
a1.sources = r1
a1.channels = c1
a1.sinks = k1 k2

# 2. define source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -f /root/flume_test/server.log

# define sinks; each connects to a next-level collector via hostname and port
# (the upstream agent's Avro sink is bound to the downstream host over RPC)
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = slave1
a1.sinks.k1.port = 4444
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = slave2
a1.sinks.k2.port = 4444

# define sink group; only one sink is selected as active, based on priority and online status
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
# the highest-priority online sink is selected as active: k1 is used while it is online, otherwise k2.
# Sinks with equal priority are selected in the order in which they are declared (k1, k2).
a1.sinkgroups.g1.processor.priority.k1 = 10
a1.sinkgroups.g1.processor.priority.k2 = 1
# failback time in milliseconds: if k1 goes down and comes back up,
# it is selected as active again after 1 second
a1.sinkgroups.g1.processor.maxpenalty = 1000

# 3. define channel
a1.channels.c1.type = memory
# number of events the memory queue can hold
a1.channels.c1.capacity = 1000
# number of events committed to the memory queue per transaction
a1.channels.c1.transactionCapacity = 100

# 4. bind source and sinks to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1
Collector1 configuration file
# flume-failover-server (Collector1)
# 1. specify source, sink, channel names for agent a1
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# 2. Avro source, listening on local port 4444
# the downstream Avro source is bound to this host; the port must match the value configured on the upstream agent
a1.sources.r1.type = avro
a1.sources.r1.bind = slave1
a1.sources.r1.port = 4444

# 3. logger sink
a1.sinks.k1.type = logger

# 4. memory channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# 5. bind source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Collector2 configuration file
# flume-failover-server (Collector2)
# 1. specify source, sink, channel names for agent a1
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# 2. Avro source, listening on local port 4444
# the downstream Avro source is bound to this host; the port must match the value configured on the upstream agent
a1.sources.r1.type = avro
a1.sources.r1.bind = slave2
a1.sources.r1.port = 4444

# 3. logger sink
a1.sinks.k1.type = logger

# 4. memory channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# 5. bind source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
3. Start Collector1, Collector2, and the Agent
Start Collector1
./bin/flume-ng agent --conf conf --conf-file ./conf/flume-failover-server.properties --name a1 -Dflume.root.logger=INFO,console
Interpretation: start a Flume agent using the flume-failover-server.properties configuration file in the conf directory under the current directory; the agent is named a1;
Flume log output is written to the terminal at INFO level and above
Start Collector2
./bin/flume-ng agent --conf conf --conf-file ./conf/flume-failover-server.properties --name a1 -Dflume.root.logger=INFO,console
Start Agent
./bin/flume-ng agent --conf conf --conf-file ./conf/flume-failover-client.properties --name a1 -Dflume.root.logger=INFO,console
Attention:
1) Start the downstream collectors first, then the agent; otherwise the agent will try to select an active downstream sink at startup, and if the collectors are not yet running it reports connection errors
2) After all three processes have started normally, the agent establishes a connection to every downstream collector, going through the connected, bound, and open stages
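3) One more practical note, based on the agent configuration above: the exec source runs tail -f /root/flume_test/server.log, so that file should exist on the agent host before the agent is started (tail -f fails immediately on a missing file). A minimal preparation step, assuming that path:
mkdir -p /root/flume_test
touch /root/flume_test/server.log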
4. Fault simulation and recovery
1) Before the failure: append data to the monitored log file and check whether the event is printed on the Collector1 terminal
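For example, a minimal way to append a test event, assuming the file path from the agent configuration above:
echo "test event before failover" >> /root/flume_test/server.log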
The slave1 node, where Collector1 runs, receives the event and prints it to the terminal
2) Failure simulation: kill the Collector1 process
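One way to do this on slave1 (a sketch; it assumes Collector1 is the only Flume JVM on that host, and <pid> stands for the process id that jps prints):
jps -ml | grep org.apache.flume.node.Application
kill -9 <pid>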
3) Try sending the data again
The slave2 node, where Collector2 runs, now receives the event and prints it to the terminal
Meanwhile, the agent keeps trying to reestablish its connection to Collector1
4) Restart the Collector1 process to simulate failback
./bin/flume-ng agent --conf conf --conf-file ./conf/flume-failover-server.properties --name a1 -Dflume.root.logger=INFO,console
5) Append data to the log again to check whether the event is sent to Collector1 once more and printed to its terminal
At this point Collector1 receives and prints the event (the failback time, maxpenalty, is set to 1 second in the agent's configuration)
6) What happens if all downstream nodes go down and then recover: which node finally receives the data?
Because Flume uses an event-level transaction mechanism, when all downstream nodes are down the agent keeps the events in its channel
When a downstream node recovers, the agent selects an active node again and resends the events
Once the downstream node has received the events, the agent removes them from its channel
So if Collector2 recovers first, the events are sent to Collector2; when Collector1 comes back afterwards, no data is resent to it, because the events have already been removed from the agent's channel (see the sketch below)
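A brief illustrative walkthrough of this case, reusing the hosts and paths configured above:
# 1) kill both collectors, then append an event while nothing downstream is running
echo "event while all collectors are down" >> /root/flume_test/server.log
# 2) restart only Collector2 on slave2; the buffered event is delivered there and printed
./bin/flume-ng agent --conf conf --conf-file ./conf/flume-failover-server.properties --name a1 -Dflume.root.logger=INFO,console
# 3) restarting Collector1 afterwards does not replay that event, because it was already removed from the agent's channel once Collector2 acknowledged it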