Source Code Analysis of the FailoverSinkProcessor Fault-Tolerance Mechanism in Flume

Source: Internet
Author: User
Tags failover

FailoverSinkProcessor, as the name implies, is the fault-tolerant sink processor in Flume.

It extends AbstractSinkProcessor.

First, look at the complete source code:

/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.flume.sink;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.PriorityQueue;
import java.util.Queue;
import java.util.SortedMap;
import java.util.TreeMap;

import org.apache.flume.Context;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Sink;
import org.apache.flume.Sink.Status;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * FailoverSinkProcessor maintains a prioritized list of sinks,
 * guaranteeing that so long as one is available events will be processed.
 *
 * The failover mechanism works by relegating failed sinks to a pool
 * where they are assigned a cooldown period, increasing with sequential
 * failures before they are retried. Once a sink successfully sends an
 * event it is restored to the live pool.
 *
 * FailoverSinkProcessor is in no way thread safe and expects to be run via
 * SinkRunner. Additionally, setSinks must be called before configure, and
 * additional sinks cannot be added while running.
 *
 * To configure, set a sink group's processor to "failover" and set priorities
 * for individual sinks; all priorities must be unique. Furthermore, an
 * upper limit to failover time can be set (in milliseconds) using maxpenalty.
 *
 * Ex)
 *
 * host1.sinkgroups = group1
 *
 * host1.sinkgroups.group1.sinks = sink1 sink2
 * host1.sinkgroups.group1.processor.type = failover
 * host1.sinkgroups.group1.processor.priority.sink1 = 5
 * host1.sinkgroups.group1.processor.priority.sink2 = 10
 * host1.sinkgroups.group1.processor.maxpenalty = 10000
 *
 */
public class FailoverSinkProcessor extends AbstractSinkProcessor {
  private static final int FAILURE_PENALTY = 1000;
  private static final int DEFAULT_MAX_PENALTY = 30000;

  private class FailedSink implements Comparable<FailedSink> {
    private Long refresh;
    private Integer priority;
    private Sink sink;
    private Integer sequentialFailures;

    public FailedSink(Integer priority, Sink sink, int seqFailures) {
      this.sink = sink;
      this.priority = priority;
      this.sequentialFailures = seqFailures;
      adjustRefresh();
    }

    @Override
    public int compareTo(FailedSink arg0) {
      return refresh.compareTo(arg0.refresh);
    }

    public Long getRefresh() {
      return refresh;
    }

    public Sink getSink() {
      return sink;
    }

    public Integer getPriority() {
      return priority;
    }

    public void incFails() {
      sequentialFailures++;
      adjustRefresh();
      logger.debug("Sink {} failed again, new refresh is at {}, " +
          "current time {}", new Object[] {
              sink.getName(), refresh, System.currentTimeMillis()});
    }

    private void adjustRefresh() {
      refresh = System.currentTimeMillis()
          + Math.min(maxPenalty, (1 << sequentialFailures) * FAILURE_PENALTY);
    }
  }

  private static final Logger logger = LoggerFactory
      .getLogger(FailoverSinkProcessor.class);

  private static final String PRIORITY_PREFIX = "priority.";
  private static final String MAX_PENALTY_PREFIX = "maxpenalty";
  private Map<String, Sink> sinks;
  private Sink activeSink;
  private SortedMap<Integer, Sink> liveSinks;
  private Queue<FailedSink> failedSinks;
  private int maxPenalty;

  @Override
  public void configure(Context context) {
    liveSinks = new TreeMap<Integer, Sink>();
    failedSinks = new PriorityQueue<FailedSink>();
    Integer nextPrio = 0;
    String maxPenaltyStr = context.getString(MAX_PENALTY_PREFIX);
    if (maxPenaltyStr == null) {
      maxPenalty = DEFAULT_MAX_PENALTY;
    } else {
      try {
        maxPenalty = Integer.parseInt(maxPenaltyStr);
      } catch (NumberFormatException e) {
        logger.warn("{} is not a valid value for {}",
            new Object[] { maxPenaltyStr, MAX_PENALTY_PREFIX });
        maxPenalty = DEFAULT_MAX_PENALTY;
      }
    }
    for (Entry<String, Sink> entry : sinks.entrySet()) {
      String priStr = PRIORITY_PREFIX + entry.getKey();
      Integer priority;
      try {
        priority = Integer.parseInt(context.getString(priStr));
      } catch (Exception e) {
        priority = --nextPrio;
      }
      if (!liveSinks.containsKey(priority)) {
        liveSinks.put(priority, sinks.get(entry.getKey()));
      } else {
        logger.warn("Sink {} not added to FailverSinkProcessor as priority" +
            "duplicates that of sink {}", entry.getKey(),
            liveSinks.get(priority));
      }
    }
    activeSink = liveSinks.get(liveSinks.lastKey());
  }

  @Override
  public Status process() throws EventDeliveryException {
    // Retry any failed sinks that have gone through their "cooldown" period
    Long now = System.currentTimeMillis();
    while (!failedSinks.isEmpty() && failedSinks.peek().getRefresh() < now) {
      FailedSink cur = failedSinks.poll();
      Status s;
      try {
        s = cur.getSink().process();
        if (s == Status.READY) {
          liveSinks.put(cur.getPriority(), cur.getSink());
          activeSink = liveSinks.get(liveSinks.lastKey());
          logger.debug("Sink {} was recovered from the fail list",
              cur.getSink().getName());
        } else {
          // if it's a backoff it needn't be penalized.
          failedSinks.add(cur);
        }
        return s;
      } catch (Exception e) {
        cur.incFails();
        failedSinks.add(cur);
      }
    }

    Status ret = null;
    while (activeSink != null) {
      try {
        ret = activeSink.process();
        return ret;
      } catch (Exception e) {
        logger.warn("Sink {} failed and has been sent to failover list",
            activeSink.getName(), e);
        activeSink = moveActiveToDeadAndGetNext();
      }
    }

    throw new EventDeliveryException("All sinks failed to process, " +
        "nothing left to failover to");
  }

  private Sink moveActiveToDeadAndGetNext() {
    Integer key = liveSinks.lastKey();
    failedSinks.add(new FailedSink(key, activeSink, 1));
    liveSinks.remove(key);
    if (liveSinks.isEmpty()) return null;
    if (liveSinks.lastKey() != null) {
      return liveSinks.get(liveSinks.lastKey());
    } else {
      return null;
    }
  }

  @Override
  public void setSinks(List<Sink> sinks) {
    // needed to implement the start/stop functionality
    super.setSinks(sinks);

    this.sinks = new HashMap<String, Sink>();
    for (Sink sink : sinks) {
      this.sinks.put(sink.getName(), sink);
    }
  }
}

The class contains an inner class, FailedSink, which represents a sink that has failed:

    private Long refresh;
    private Integer priority;
    private Sink sink;
    private Integer sequentialFailures;
These are its fields:

1. refresh: the system time (in milliseconds) at which the failed sink becomes eligible for a retry

2. priority: the sink's priority

3. sink: the sink itself

4. sequentialFailures: the number of consecutive failures
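Because FailedSink compares instances by refresh, a PriorityQueue of failed sinks always has the sink with the earliest retry time at its head. The following is a minimal sketch of that ordering, not Flume code: the Entry class, the sink names, and the timestamps are hypothetical stand-ins chosen for illustration.

```java
import java.util.PriorityQueue;

public class FailedSinkOrderDemo {
  // Hypothetical stand-in for FailedSink: only a name and the retry timestamp.
  static final class Entry implements Comparable<Entry> {
    final String name;
    final long refresh; // earliest time (ms) at which a retry is allowed

    Entry(String name, long refresh) {
      this.name = name;
      this.refresh = refresh;
    }

    @Override
    public int compareTo(Entry other) {
      // Same idea as FailedSink.compareTo: order by retry time only.
      return Long.compare(refresh, other.refresh);
    }
  }

  // Returns the name at the head of the queue, i.e. the next retry candidate.
  static String nextCandidate(PriorityQueue<Entry> queue) {
    Entry head = queue.peek();
    return head == null ? null : head.name;
  }

  // Three made-up failed sinks, inserted in arbitrary order.
  static PriorityQueue<Entry> sampleQueue() {
    PriorityQueue<Entry> queue = new PriorityQueue<>();
    queue.add(new Entry("sinkA", 5000L));
    queue.add(new Entry("sinkB", 2000L));
    queue.add(new Entry("sinkC", 9000L));
    return queue;
  }

  public static void main(String[] args) {
    // sinkB has the smallest refresh value, so it is at the head.
    System.out.println(nextCandidate(sampleQueue()));
  }
}
```

Note that the priority field plays no part in this ordering; it is only used to put a recovered sink back into the live map under its original key.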

    public void incFails() {
      sequentialFailures++;
      adjustRefresh();
      logger.debug("Sink {} failed again, new refresh is at {}, " +
          "current time {}", new Object[] {
              sink.getName(), refresh, System.currentTimeMillis()});
    }
This method is called when an already failed sink fails again on retry: it increments the failure count and pushes the next retry time further out.
Now look at the main logic:

  private static final String PRIORITY_PREFIX = "priority.";
  private static final String MAX_PENALTY_PREFIX = "maxpenalty";
  private Map<String, Sink> sinks;
  private Sink activeSink;
  private SortedMap<Integer, Sink> liveSinks;
  private Queue<FailedSink> failedSinks;
  private int maxPenalty;
These are the processor's fields.

  public void configure(Context context) {
    liveSinks = new TreeMap<Integer, Sink>();
    failedSinks = new PriorityQueue<FailedSink>();
    Integer nextPrio = 0;
    String maxPenaltyStr = context.getString(MAX_PENALTY_PREFIX);
    if (maxPenaltyStr == null) {
      maxPenalty = DEFAULT_MAX_PENALTY;
    } else {
      try {
        maxPenalty = Integer.parseInt(maxPenaltyStr);
      } catch (NumberFormatException e) {
        logger.warn("{} is not a valid value for {}",
            new Object[] { maxPenaltyStr, MAX_PENALTY_PREFIX });
        maxPenalty = DEFAULT_MAX_PENALTY;
      }
    }
    for (Entry<String, Sink> entry : sinks.entrySet()) {
      String priStr = PRIORITY_PREFIX + entry.getKey();
      Integer priority;
      try {
        priority = Integer.parseInt(context.getString(priStr));
      } catch (Exception e) {
        priority = --nextPrio;
      }
      if (!liveSinks.containsKey(priority)) {
        liveSinks.put(priority, sinks.get(entry.getKey()));
      } else {
        logger.warn("Sink {} not added to FailverSinkProcessor as priority" +
            "duplicates that of sink {}", entry.getKey(),
            liveSinks.get(priority));
      }
    }
    activeSink = liveSinks.get(liveSinks.lastKey());
  }
This method mainly reads the configuration and initializes several fields:

1. liveSinks and failedSinks are initialized to an empty map and an empty queue.

2. maxPenalty is read from the configuration; if it is missing or not a valid integer, DEFAULT_MAX_PENALTY is used.

3. sinks must already be populated. It is set by the setSinks method, which is called before configure; the sinks themselves are built from the conf configuration file. To follow that process step by step, see AbstractConfigurationProvider.getConfiguration() and FlumeConfiguration.getConfigurationFor() in the source code.

4. liveSinks is filled: every configured sink is added under its priority. A sink without a valid priority gets the next negative value (-1, -2, ...), and a sink whose priority duplicates an existing one is skipped with a warning.

5. The sink under the last (highest) key of liveSinks is selected as the active sink that handles the output.
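The priority handling in steps 4 and 5 can be sketched with plain collections. This is a simplified illustration, not the Flume implementation: sinks are reduced to their names, the configuration is a hypothetical map from sink name to priority string, and the class and method names are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class PriorityAssignmentDemo {
  // configured maps a sink name to its priority string (null when not set).
  // The return value mirrors liveSinks, with sink names in place of Sink objects.
  static SortedMap<Integer, String> assign(Map<String, String> configured) {
    SortedMap<Integer, String> live = new TreeMap<>();
    int nextPrio = 0;
    for (Map.Entry<String, String> e : configured.entrySet()) {
      int priority;
      try {
        priority = Integer.parseInt(e.getValue());
      } catch (NumberFormatException ex) {
        // Missing or unparsable priority: fall back to -1, -2, ...
        priority = --nextPrio;
      }
      if (!live.containsKey(priority)) {
        live.put(priority, e.getKey());
      }
      // A duplicate priority is skipped, as in configure().
    }
    return live;
  }

  // The active sink is the one stored under the highest key.
  static String activeSink(SortedMap<Integer, String> live) {
    return live.get(live.lastKey());
  }

  // Made-up configuration: sink3 has no priority, sink4 duplicates sink2.
  static Map<String, String> sampleConfig() {
    Map<String, String> cfg = new LinkedHashMap<>();
    cfg.put("sink1", "5");
    cfg.put("sink2", "10");
    cfg.put("sink3", null);
    cfg.put("sink4", "10");
    return cfg;
  }

  public static void main(String[] args) {
    SortedMap<Integer, String> live = assign(sampleConfig());
    System.out.println(live);             // {-1=sink3, 5=sink1, 10=sink2}
    System.out.println(activeSink(live)); // sink2
  }
}
```

Because a skipped sink never enters liveSinks, it is never used for failover, which is why the class comment asks for unique priorities.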

Next, look at the processing logic itself:

  public Status process() throws EventDeliveryException {
    // Retry any failed sinks that have gone through their "cooldown" period
    Long now = System.currentTimeMillis();
    while (!failedSinks.isEmpty() && failedSinks.peek().getRefresh() < now) {
      FailedSink cur = failedSinks.poll();
      Status s;
      try {
        s = cur.getSink().process();
        if (s == Status.READY) {
          liveSinks.put(cur.getPriority(), cur.getSink());
          activeSink = liveSinks.get(liveSinks.lastKey());
          logger.debug("Sink {} was recovered from the fail list",
              cur.getSink().getName());
        } else {
          // if it's a backoff it needn't be penalized.
          failedSinks.add(cur);
        }
        return s;
      } catch (Exception e) {
        cur.incFails();
        failedSinks.add(cur);
      }
    }

    Status ret = null;
    while (activeSink != null) {
      try {
        ret = activeSink.process();
        return ret;
      } catch (Exception e) {
        logger.warn("Sink {} failed and has been sent to failover list",
            activeSink.getName(), e);
        activeSink = moveActiveToDeadAndGetNext();
      }
    }

    throw new EventDeliveryException("All sinks failed to process, " +
        "nothing left to failover to");
  }
Right after configuration, failedSinks is still empty, so the second half of the method is what runs first:

    Status ret = null;
    while (activeSink != null) {
      try {
        ret = activeSink.process();
        return ret;
      } catch (Exception e) {
        logger.warn("Sink {} failed and has been sent to failover list",
            activeSink.getName(), e);
        activeSink = moveActiveToDeadAndGetNext();
      }
    }
1. The loop runs as long as the currently active sink is not null.

2. The active sink's process() is called and its status is returned.

3. If process() throws an exception, the current sink is added to failedSinks and removed from liveSinks:

  private Sink moveActiveToDeadAndGetNext() {
    Integer key = liveSinks.lastKey();
    failedSinks.add(new FailedSink(key, activeSink, 1));
    liveSinks.remove(key);
    if (liveSinks.isEmpty()) return null;
    if (liveSinks.lastKey() != null) {
      return liveSinks.get(liveSinks.lastKey());
    } else {
      return null;
    }
  }

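4. The next usable sink is returned: the highest-priority sink still in liveSinks, or null if none is left, in which case process() throws EventDeliveryException.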
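The order in which sinks are tried when each one fails follows directly from repeatedly taking the last key of the sorted map. A minimal sketch of that order, assuming sinks are represented only by hypothetical names and that every sink fails:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class FailoverOrderDemo {
  // Returns the order in which sinks would be tried if every one of them failed.
  static List<String> failoverOrder(SortedMap<Integer, String> liveSinks) {
    SortedMap<Integer, String> live = new TreeMap<>(liveSinks); // work on a copy
    List<String> order = new ArrayList<>();
    // lastKey() throws NoSuchElementException on an empty map,
    // so emptiness is checked first, as moveActiveToDeadAndGetNext() does.
    while (!live.isEmpty()) {
      Integer key = live.lastKey(); // highest remaining priority
      order.add(live.remove(key));
    }
    return order;
  }

  // Made-up priorities for three sinks.
  static SortedMap<Integer, String> sample() {
    SortedMap<Integer, String> live = new TreeMap<>();
    live.put(5, "sink1");
    live.put(10, "sink2");
    live.put(1, "sink3");
    return live;
  }

  public static void main(String[] args) {
    System.out.println(failoverOrder(sample())); // [sink2, sink1, sink3]
  }
}
```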

Once a failure has occurred, the first half of process() comes into play:

    Long now = System.currentTimeMillis();
    while (!failedSinks.isEmpty() && failedSinks.peek().getRefresh() < now) {
      FailedSink cur = failedSinks.poll();
      Status s;
      try {
        s = cur.getSink().process();
        if (s == Status.READY) {
          liveSinks.put(cur.getPriority(), cur.getSink());
          activeSink = liveSinks.get(liveSinks.lastKey());
          logger.debug("Sink {} was recovered from the fail list",
              cur.getSink().getName());
        } else {
          // if it's a backoff it needn't be penalized.
          failedSinks.add(cur);
        }
        return s;
      } catch (Exception e) {
        cur.incFails();
        failedSinks.add(cur);
      }
    }

Precondition: failedSinks is not empty and the refresh time of the sink at the head of the queue is earlier than the current time.

1. The first FailedSink is polled from the queue.

2. Its sink's process() is called. If it returns Status.READY, the sink is put back into liveSinks and activeSink is reset to the highest-priority sink in liveSinks (which is not necessarily the sink that just recovered).

3. If it returns any other status (a backoff), the sink is re-added to the failedSinks queue without any extra penalty. In both this case and the previous one, the status is returned immediately.

4. If an exception is thrown, incFails() is called and the sink is likewise re-added to the failedSinks queue.

The logic above is the core of the class: a backoff mechanism. A sink in the failedSinks queue that can process events again is put back into use, and a sink that merely backs off is not penalized.
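To see both halves of process() working together, here is a heavily simplified simulation. It is a sketch under several assumptions and not the Flume implementation: sinks are plain names with a healthy flag, the current time is passed in as a parameter instead of being read from the system clock, the cooldown is a fixed PENALTY_MS instead of growing exponentially, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

public class MiniFailoverDemo {
  static final long PENALTY_MS = 1000; // fixed cooldown, a simplification

  // Hypothetical stand-in for FailedSink.
  static final class Failed implements Comparable<Failed> {
    final int priority;
    final String name;
    final long refresh;

    Failed(int priority, String name, long refresh) {
      this.priority = priority;
      this.name = name;
      this.refresh = refresh;
    }

    @Override
    public int compareTo(Failed o) {
      return Long.compare(refresh, o.refresh);
    }
  }

  private final SortedMap<Integer, String> live = new TreeMap<>();
  private final PriorityQueue<Failed> failed = new PriorityQueue<>();
  private final Set<String> healthy = new HashSet<>();

  void addSink(int priority, String name) {
    live.put(priority, name);
    healthy.add(name);
  }

  void setHealthy(String name, boolean ok) {
    if (ok) healthy.add(name); else healthy.remove(name);
  }

  // Returns the name of the sink that handled the event at time now.
  String process(long now) {
    // First half: retry failed sinks whose cooldown has expired.
    while (!failed.isEmpty() && failed.peek().refresh < now) {
      Failed cur = failed.poll();
      if (healthy.contains(cur.name)) {
        live.put(cur.priority, cur.name); // recovered: back to the live map
        return cur.name;
      }
      failed.add(new Failed(cur.priority, cur.name, now + PENALTY_MS));
    }
    // Second half: use the highest-priority live sink, failing over as needed.
    while (!live.isEmpty()) {
      Integer key = live.lastKey();
      String name = live.get(key);
      if (healthy.contains(name)) return name;
      live.remove(key);
      failed.add(new Failed(key, name, now + PENALTY_MS));
    }
    throw new IllegalStateException("all sinks failed");
  }

  // sink2 (priority 10) fails at t=100 and is healthy again before t=1200.
  static List<String> scenario() {
    MiniFailoverDemo p = new MiniFailoverDemo();
    p.addSink(5, "sink1");
    p.addSink(10, "sink2");
    List<String> used = new ArrayList<>();
    used.add(p.process(0));    // sink2: highest priority
    p.setHealthy("sink2", false);
    used.add(p.process(100));  // sink1: sink2 failed, cooldown until 1100
    used.add(p.process(500));  // sink1: cooldown not over yet
    p.setHealthy("sink2", true);
    used.add(p.process(1200)); // sink2: retried and recovered
    used.add(p.process(1300)); // sink2: active again
    return used;
  }

  public static void main(String[] args) {
    System.out.println(scenario()); // [sink2, sink1, sink1, sink2, sink2]
  }
}
```

The point of the sketch is the ordering of the two loops: expired entries in the failed queue are always tried before the active sink, which is how a recovered high-priority sink takes over again.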


    private void adjustRefresh() {
      refresh = System.currentTimeMillis()
          + Math.min(maxPenalty, (1 << sequentialFailures) * FAILURE_PENALTY);
    }
Whether a failed sink is retried depends on the condition refresh < now seen earlier.
In other words, after a failure the sink must wait Math.min(maxPenalty, (1 << sequentialFailures) * FAILURE_PENALTY) milliseconds before it is checked again.
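To make the cooldown formula concrete, the sketch below evaluates the same expression for a growing failure count, using the two constants from the source (FAILURE_PENALTY = 1000 and DEFAULT_MAX_PENALTY = 30000). The class and method names are hypothetical; only the expression itself comes from adjustRefresh().

```java
public class BackoffPenaltyDemo {
  static final int FAILURE_PENALTY = 1000;      // ms, as in the source
  static final int DEFAULT_MAX_PENALTY = 30000; // ms, as in the source

  // Same expression as in adjustRefresh(), without the current-time offset.
  static int penaltyMillis(int sequentialFailures, int maxPenalty) {
    return Math.min(maxPenalty, (1 << sequentialFailures) * FAILURE_PENALTY);
  }

  public static void main(String[] args) {
    // Prints 2000, 4000, 8000, 16000, 30000, 30000 for 1 to 6 failures.
    for (int failures = 1; failures <= 6; failures++) {
      System.out.println(failures + " failure(s): wait "
          + penaltyMillis(failures, DEFAULT_MAX_PENALTY) + " ms");
    }
  }
}
```

A FailedSink is created with sequentialFailures = 1, so the first cooldown is 2 seconds, and with the default cap the wait stops growing at 30 seconds from the fifth consecutive failure on. One caveat that follows from the int arithmetic in this expression: at 22 consecutive failures the product (1 << 22) * 1000 no longer fits in an int and wraps to a negative number, so Math.min would return a negative penalty in the version of the code shown here.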





