[Flume] How sources and sinks interact with the channel, analyzed from the Application entry point


Tags: flume, source code

The command you type to start Flume already reveals its entry point:

    [[email protected] apache-flume-1.5.2-bin]# sh bin/flume-ng agent -c conf -f conf/server.conf -n a1
    Info: Sourcing environment configuration script /home/flume/apache-flume-1.5.2-bin/conf/flume-env.sh
    + exec /home/flume/jdk1.7.0_71/bin/java -server -Xms2048m -Xmx2048m -Xss256K -XX:PermSize=32M -XX:MaxPermSize=512M -XX:+UseConcMarkSweepGC -XX:+DisableExplicitGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+UseFastAccessorMethods -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCompressedOops -XX:CMSInitiatingOccupancyFraction=70 -XX:+HeapDumpOnOutOfMemoryError -XX:SurvivorRatio=8 -cp '/home/flume/apache-flume-1.5.2-bin/conf:/home/flume/apache-flume-1.5.2-bin/lib/*' -Djava.library.path= org.apache.flume.node.Application -f conf/server.conf -n a1

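From this output we can see that Flume's entry point is org.apache.flume.node.Application.

Let's look at how this entry program starts up and runs.

Find the main function.

Note: every time Flume starts a configuration, it first checks whether components with the same names as the three kinds of components in the current configuration already exist. If they do, it stops them first, in the order source, sink, channel.

It then starts all currently configured components, in the order channel, sink, source.

This stop and start ordering shows that Flume takes care to preserve data consistency: the channel is the last to stop and the first to start, so a source never writes to, and a sink never reads from, a channel that is not running.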
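The ordering described above can be illustrated with a small sketch. This is not Flume code: `Component` and `restart()` are hypothetical stand-ins that only record the order of calls, assuming one source, one sink and one channel.

```java
import java.util.ArrayList;
import java.util.List;

class ComponentOrderSketch {
    // Hypothetical stand-in for a Flume component; it only logs its lifecycle calls.
    static final class Component {
        private final String kind;
        private final List<String> log;

        Component(String kind, List<String> log) {
            this.kind = kind;
            this.log = log;
        }

        void start() { log.add("start " + kind); }
        void stop()  { log.add("stop " + kind); }
    }

    // Stops the old components, then starts the new ones, in the order the article describes.
    static List<String> restart() {
        List<String> log = new ArrayList<>();
        Component source = new Component("source", log);
        Component sink = new Component("sink", log);
        Component channel = new Component("channel", log);

        // Stop the producer and the consumer first, the buffer between them last.
        source.stop();
        sink.stop();
        channel.stop();

        // Start the buffer first, then the consumer, then the producer.
        channel.start();
        sink.start();
        source.start();
        return log;
    }

    public static void main(String[] args) {
        restart().forEach(System.out::println);
    }
}
```

The point of the ordering is that the channel is always running whenever a source or a sink is running.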

    if(reload) {
      EventBus eventBus = new EventBus(agentName + "-event-bus");
      PollingPropertiesFileConfigurationProvider configurationProvider =
          new PollingPropertiesFileConfigurationProvider(agentName,
              configurationFile, eventBus, 30);
      components.add(configurationProvider);
      application = new Application(components);
      eventBus.register(application);
    } else {
      PropertiesFileConfigurationProvider configurationProvider =
          new PropertiesFileConfigurationProvider(agentName,
              configurationFile);
      application = new Application();
      application.handleConfigurationEvent(configurationProvider.getConfiguration());
    }
This if decides whether the configuration file is re-read every 30 seconds to check for updates.
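As a rough illustration of the polling branch, the sketch below shows one way a periodic "has the file changed?" check can work. It is an assumption-laden simplification, not the actual PollingPropertiesFileConfigurationProvider: `Watcher`, `simulate()` and the fake clock are hypothetical names, and the timestamp source is injected so the example runs without a real file.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongSupplier;

class ConfigPollSketch {
    // Hypothetical watcher: runs onChange whenever the observed timestamp advances.
    static final class Watcher implements Runnable {
        private final LongSupplier lastModified;
        private final Runnable onChange;
        private long lastSeen = 0L;

        Watcher(LongSupplier lastModified, Runnable onChange) {
            this.lastModified = lastModified;
            this.onChange = onChange;
        }

        @Override
        public void run() {
            long current = lastModified.getAsLong();
            if (current > lastSeen) {
                lastSeen = current;
                onChange.run(); // in Flume this is where a new configuration would be published
            }
        }
    }

    // Simulates three polls: initial load, no change, then a changed file.
    static int simulate() {
        long[] fakeTimestamp = {100L};
        AtomicInteger reloads = new AtomicInteger();
        Watcher watcher = new Watcher(() -> fakeTimestamp[0], reloads::incrementAndGet);

        watcher.run();            // first poll: configuration is loaded
        watcher.run();            // second poll: nothing changed, no reload
        fakeTimestamp[0] = 200L;  // the file was edited
        watcher.run();            // third poll: reload
        return reloads.get();
    }

    public static void main(String[] args) {
        System.out.println("reloads: " + simulate());
    }
}
```

In a real agent the watcher would be handed to a ScheduledExecutorService with a 30-second delay and would read the timestamp from File.lastModified(); here run() is called by hand to keep the example deterministic.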

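The main thing to look at is how the configuration content is handled. The two branches look different in code, but the processing logic is the same.

Let's follow the else branch.

Look at configurationProvider.getConfiguration():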
    public MaterializedConfiguration getConfiguration() {
      MaterializedConfiguration conf = new SimpleMaterializedConfiguration();
      FlumeConfiguration fconfig = getFlumeConfiguration();
      AgentConfiguration agentConf = fconfig.getConfigurationFor(getAgentName());
      if (agentConf != null) {
        Map<String, ChannelComponent> channelComponentMap = Maps.newHashMap();
        Map<String, SourceRunner> sourceRunnerMap = Maps.newHashMap();
        Map<String, SinkRunner> sinkRunnerMap = Maps.newHashMap();
        try {
          loadChannels(agentConf, channelComponentMap);
          loadSources(agentConf, channelComponentMap, sourceRunnerMap);
          loadSinks(agentConf, channelComponentMap, sinkRunnerMap);
          Set<String> channelNames =
              new HashSet<String>(channelComponentMap.keySet());
          for(String channelName : channelNames) {
            ChannelComponent channelComponent = channelComponentMap.
                get(channelName);
            if(channelComponent.components.isEmpty()) {
              LOGGER.warn(String.format("Channel %s has no components connected" +
                  " and has been removed.", channelName));
              channelComponentMap.remove(channelName);
              Map<String, Channel> nameChannelMap = channelCache.
                  get(channelComponent.channel.getClass());
              if(nameChannelMap != null) {
                nameChannelMap.remove(channelName);
              }
            } else {
              LOGGER.info(String.format("Channel %s connected to %s",
                  channelName, channelComponent.components.toString()));
              conf.addChannel(channelName, channelComponent.channel);
            }
          }
          for(Map.Entry<String, SourceRunner> entry : sourceRunnerMap.entrySet()) {
            conf.addSourceRunner(entry.getKey(), entry.getValue());
          }
          for(Map.Entry<String, SinkRunner> entry : sinkRunnerMap.entrySet()) {
            conf.addSinkRunner(entry.getKey(), entry.getValue());
          }
        } catch (InstantiationException ex) {
          LOGGER.error("Failed to instantiate component", ex);
        } finally {
          channelComponentMap.clear();
          sourceRunnerMap.clear();
          sinkRunnerMap.clear();
        }
      } else {
        LOGGER.warn("No configuration found for this host:{}", getAgentName());
      }
      return conf;
    }
When the source components are loaded, there is a call to one method worth noting: SourceRunner.forSource(source)

    public static SourceRunner forSource(Source source) {
      SourceRunner runner = null;
      if (source instanceof PollableSource) {
        runner = new PollableSourceRunner();
        ((PollableSourceRunner) runner).setSource((PollableSource) source);
      } else if (source instanceof EventDrivenSource) {
        runner = new EventDrivenSourceRunner();
        ((EventDrivenSourceRunner) runner).setSource((EventDrivenSource) source);
      } else {
        throw new IllegalArgumentException("No known runner type for source "
            + source);
      }
      return runner;
    }
This method chooses which SourceRunner to use by checking the type of the source.
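The type-based selection can be tried out in isolation with a minimal sketch. The interfaces below are hypothetical stand-ins declared locally (they only mirror the names of the Flume types), and `runnerFor` returns the runner's name as a string instead of constructing a real runner.

```java
class RunnerSelectionSketch {
    // Local stand-ins that mirror the names of the Flume interfaces; not the real types.
    interface Source {}
    interface PollableSource extends Source {}
    interface EventDrivenSource extends Source {}

    // Picks a runner by the source's type, in the same order as forSource checks them.
    static String runnerFor(Source source) {
        if (source instanceof PollableSource) {
            return "PollableSourceRunner";
        } else if (source instanceof EventDrivenSource) {
            return "EventDrivenSourceRunner";
        }
        throw new IllegalArgumentException("No known runner type for source " + source);
    }

    public static void main(String[] args) {
        Source eventDriven = new EventDrivenSource() {};
        Source pollable = new PollableSource() {};
        System.out.println(runnerFor(eventDriven));
        System.out.println(runnerFor(pollable));
    }
}
```

A source that implements neither interface is rejected with an IllegalArgumentException, which matches the else branch in the method quoted above.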

Let's take a concrete example, AvroSource. It is an event-driven source, so it is run by an EventDrivenSourceRunner, whose start method looks like this:

    public void start() {
      Source source = getSource();
      ChannelProcessor cp = source.getChannelProcessor();
      cp.initialize();
      source.start();
      lifecycleState = LifecycleState.START;
    }
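This method is called by the MonitorRunnable class, which is responsible for monitoring all of Flume's components.

So when is it called? Once this method has been called, the interaction between the source and the channel begins.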

    switch (supervisoree.status.desiredState) {
      case START:
        try {
          lifecycleAware.start();
The code above appears in the run method of MonitorRunnable, a static nested class of LifecycleSupervisor. Next question: who schedules this runnable?

    MonitorRunnable monitorRunnable = new MonitorRunnable();
    monitorRunnable.lifecycleAware = lifecycleAware;
    monitorRunnable.supervisoree = process;
    monitorRunnable.monitorService = monitorService;
    supervisedProcesses.put(lifecycleAware, process);
    ScheduledFuture<?> future = monitorService.scheduleWithFixedDelay(
        monitorRunnable, 0, 3, TimeUnit.SECONDS);
    monitorFutures.put(lifecycleAware, future);
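This is in the supervise method of LifecycleSupervisor.

Here we finally reach the core of the matter: the monitor runs every 3 seconds. Each run compares the component's current lifecycle state with its desired state and calls start() only when the two differ, so the 3 seconds is the supervision interval, not the rate at which events move from the source into the channel.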

    Supervisoree process = new Supervisoree();
    process.status = new Status();
    process.policy = policy;
    process.status.desiredState = desiredState;
    process.status.error = false;
So who calls the method that contains the code above?

It is Application:

    public synchronized void start() {
      for(LifecycleAware component : components) {
        supervisor.supervise(component,
            new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
      }
    }
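With that, the whole chain is connected: Application calls supervise(), supervise() schedules a MonitorRunnable, and the MonitorRunnable calls start() on the component.
So what happens every 3 seconds is the supervisor's state check. The source's start() is invoked when its state does not yet match the desired START state; after that, the source hands events to the channel at its own pace rather than once every 3 seconds.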
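To see what a periodic monitor of this kind does over several ticks, here is a simplified sketch. It assumes, as my reading of MonitorRunnable suggests, that the monitor acts only when the current state differs from the desired one. `Component`, `Monitor` and `startCallsAfter` are hypothetical names, and each call to run() stands in for one 3-second tick instead of using a real scheduler.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SupervisorSketch {
    enum State { IDLE, START, STOP }

    // Hypothetical supervised component that counts how often start() is called.
    static final class Component {
        State state = State.IDLE;
        final AtomicInteger startCalls = new AtomicInteger();

        void start() {
            startCalls.incrementAndGet();
            state = State.START;
        }

        void stop() {
            state = State.STOP;
        }
    }

    // Simplified monitor: acts only when the component is not in the desired state.
    static final class Monitor implements Runnable {
        private final Component component;
        private final State desired;

        Monitor(Component component, State desired) {
            this.component = component;
            this.desired = desired;
        }

        @Override
        public void run() {
            if (component.state != desired) {
                switch (desired) {
                    case START:
                        component.start();
                        break;
                    case STOP:
                        component.stop();
                        break;
                    default:
                        break;
                }
            }
        }
    }

    // Runs the monitor for the given number of ticks and reports how often start() ran.
    static int startCallsAfter(int ticks) {
        Component component = new Component();
        Monitor monitor = new Monitor(component, State.START);
        for (int i = 0; i < ticks; i++) {
            monitor.run(); // one call stands for one scheduled 3-second tick
        }
        return component.startCalls.get();
    }

    public static void main(String[] args) {
        System.out.println("start() calls after 5 ticks: " + startCallsAfter(5));
    }
}
```

In this sketch start() runs once no matter how many ticks pass, because after the first tick the component is already in the desired state. The later ticks only matter if the component falls out of that state, for example after a failure.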


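Having looked at the source and the channel, let's now look at how the sink interacts with the channel.

At this point the sink is easy to follow, because all three kinds of Flume components implement the LifecycleAware interface.

So, starting from Flume's entry point Application, every start eventually ends up in the supervise method of LifecycleSupervisor, and that method runs the same code: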

    MonitorRunnable monitorRunnable = new MonitorRunnable();
    monitorRunnable.lifecycleAware = lifecycleAware;
    monitorRunnable.supervisoree = process;
    monitorRunnable.monitorService = monitorService;
    supervisedProcesses.put(lifecycleAware, process);
    ScheduledFuture<?> future = monitorService.scheduleWithFixedDelay(
        monitorRunnable, 0, 3, TimeUnit.SECONDS);
    monitorFutures.put(lifecycleAware, future);
This logic does not distinguish between sources and sinks: the same check runs every 3 seconds for each supervised component.
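It may help to separate the 3-second supervision check from the actual movement of events. As far as I can tell, once a sink runner has been started it drains the channel on its own thread. The sketch below mimics that idea under stated assumptions: a BlockingQueue plays the channel, `runOnce` is a hypothetical helper, and the loop exits after the queue stays empty for 100 ms only so that the demo terminates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class SinkLoopSketch {
    // Starts one consumer thread that keeps taking events from the "channel".
    static List<String> runOnce(List<String> events) {
        BlockingQueue<String> channel = new LinkedBlockingQueue<>(events);
        List<String> delivered = new ArrayList<>();

        Thread runner = new Thread(() -> {
            try {
                String event;
                // A real sink loop would keep running; this one ends when the queue stays empty.
                while ((event = channel.poll(100, TimeUnit.MILLISECONDS)) != null) {
                    delivered.add(event);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        runner.start(); // called once, like a lifecycle start()
        try {
            runner.join(); // wait so the result can be read safely
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the runner", e);
        }
        return delivered;
    }

    public static void main(String[] args) {
        System.out.println(runOnce(Arrays.asList("e1", "e2", "e3")));
    }
}
```

All queued events are consumed right after the single start, without waiting for any periodic tick.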


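That wraps up how Flume's three kinds of components are started so that they can interact, and how often they are checked. Corrections and comments are welcome!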




