When you start Flume, its startup entry point is visible in the output of the command you ran.
    [[email protected] apache-flume-1.5.2-bin]# sh bin/flume-ng agent -c conf -f conf/server.conf -n a1
    Info: Sourcing environment configuration script /home/flume/apache-flume-1.5.2-bin/conf/flume-env.sh
    + exec /home/flume/jdk1.7.0_71/bin/java -server -Xms2048m -Xmx2048m -Xss256k -XX:PermSize=32m -XX:MaxPermSize=512m -XX:+UseConcMarkSweepGC -XX:+DisableExplicitGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+UseFastAccessorMethods -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCompressedOops -XX:CMSInitiatingOccupancyFraction=70 -XX:+HeapDumpOnOutOfMemoryError -XX:SurvivorRatio=8 -cp '/home/flume/apache-flume-1.5.2-bin/conf:/home/flume/apache-flume-1.5.2-bin/lib/*' -Djava.library.path= org.apache.flume.node.Application -f conf/server.conf -n a1
From this output we can see that Flume's entry point is: org.apache.flume.node.Application
Let's see how this entry-point program works.
Find the main function.
Note: each time Flume loads a configuration, it first checks whether components from a previously loaded configuration (of the three component types) are still running. If they are, it stops them first, in the order source, sink, channel.
It then starts all currently configured components, in the order channel, sink, source.
This start and stop ordering shows that Flume takes care to preserve data consistency: the channel is the last to stop and the first to start.
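The ordering described above can be sketched in a few lines. This is only an illustration of the order, not Flume's code: the class ReloadOrderSketch and its reload method are hypothetical names, and the components are reduced to strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Flume's code: it models only the ordering.
class ReloadOrderSketch {
    // Returns the lifecycle actions for one configuration load, in order.
    static List<String> reload(boolean previousConfigRunning) {
        List<String> actions = new ArrayList<>();
        if (previousConfigRunning) {
            // Stop the producer first, then the consumer, then the buffer.
            actions.add("stop source");
            actions.add("stop sink");
            actions.add("stop channel");
        }
        // Start the buffer first so it exists before anything uses it.
        actions.add("start channel");
        actions.add("start sink");
        actions.add("start source");
        return actions;
    }

    public static void main(String[] args) {
        System.out.println(reload(true));
    }
}
```

Because the channel sits between source and sink, keeping it alive for the longest span means neither side ever writes to or reads from a channel that is not running.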
    if (reload) {
      EventBus eventBus = new EventBus(agentName + "-event-bus");
      PollingPropertiesFileConfigurationProvider configurationProvider =
          new PollingPropertiesFileConfigurationProvider(agentName,
              configurationFile, eventBus, 30);
      components.add(configurationProvider);
      application = new Application(components);
      eventBus.register(application);
    } else {
      PropertiesFileConfigurationProvider configurationProvider =
          new PropertiesFileConfigurationProvider(agentName, configurationFile);
      application = new Application();
      application.handleConfigurationEvent(configurationProvider.getConfiguration());
    }
This if decides whether the configuration file is re-read every 30 seconds to check for updates.
What matters here is how the configuration content is processed. The two branches differ in code, but the processing logic is the same.
Let's look at the code of the else branch:
See configurationProvider.getConfiguration()
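To make the 30-second check concrete, here is a minimal sketch of one common way to poll a file for changes: compare its last-modified timestamp with the one seen at the previous load. ConfigPoller, pollOnce and schedule are hypothetical names for illustration, and comparing timestamps is an assumption about the mechanism, not a quote of Flume's provider.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical sketch: ConfigPoller is an illustrative name, not a Flume class.
class ConfigPoller {
    private final LongSupplier lastModified; // e.g. a File's lastModified method
    private long lastSeen = 0L;              // timestamp at the last load

    ConfigPoller(LongSupplier lastModified) {
        this.lastModified = lastModified;
    }

    // One polling pass: returns true when the file is newer than the last load.
    boolean pollOnce() {
        long modified = lastModified.getAsLong();
        if (modified > lastSeen) {
            lastSeen = modified;
            return true;
        }
        return false;
    }

    // Runs pollOnce every periodSeconds and fires onChange when it returns true.
    static ScheduledExecutorService schedule(ConfigPoller poller, Runnable onChange,
                                             long periodSeconds) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleWithFixedDelay(() -> {
            if (poller.pollOnce()) {
                onChange.run();
            }
        }, 0, periodSeconds, TimeUnit.SECONDS);
        return timer;
    }
}
```

With a period of 30, the onChange callback plays the role that the event bus plays in the reload branch: it tells the application that a new configuration is available.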
    public MaterializedConfiguration getConfiguration() {
      MaterializedConfiguration conf = new SimpleMaterializedConfiguration();
      FlumeConfiguration fconfig = getFlumeConfiguration();
      AgentConfiguration agentConf = fconfig.getConfigurationFor(getAgentName());
      if (agentConf != null) {
        Map<String, ChannelComponent> channelComponentMap = Maps.newHashMap();
        Map<String, SourceRunner> sourceRunnerMap = Maps.newHashMap();
        Map<String, SinkRunner> sinkRunnerMap = Maps.newHashMap();
        try {
          loadChannels(agentConf, channelComponentMap);
          loadSources(agentConf, channelComponentMap, sourceRunnerMap);
          loadSinks(agentConf, channelComponentMap, sinkRunnerMap);
          Set<String> channelNames =
              new HashSet<String>(channelComponentMap.keySet());
          for (String channelName : channelNames) {
            ChannelComponent channelComponent =
                channelComponentMap.get(channelName);
            if (channelComponent.components.isEmpty()) {
              LOGGER.warn(String.format("Channel %s has no components connected"
                  + " and has been removed.", channelName));
              channelComponentMap.remove(channelName);
              Map<String, Channel> nameChannelMap =
                  channelCache.get(channelComponent.channel.getClass());
              if (nameChannelMap != null) {
                nameChannelMap.remove(channelName);
              }
            } else {
              LOGGER.info(String.format("Channel %s connected to %s",
                  channelName, channelComponent.components.toString()));
              conf.addChannel(channelName, channelComponent.channel);
            }
          }
          for (Map.Entry<String, SourceRunner> entry : sourceRunnerMap.entrySet()) {
            conf.addSourceRunner(entry.getKey(), entry.getValue());
          }
          for (Map.Entry<String, SinkRunner> entry : sinkRunnerMap.entrySet()) {
            conf.addSinkRunner(entry.getKey(), entry.getValue());
          }
        } catch (InstantiationException ex) {
          LOGGER.error("Failed to instantiate component", ex);
        } finally {
          channelComponentMap.clear();
          sourceRunnerMap.clear();
          sinkRunnerMap.clear();
        }
      } else {
        LOGGER.warn("No configuration found for this host:{}", getAgentName());
      }
      return conf;
    }
When the source components are loaded, each one is wrapped by this method: SourceRunner.forSource(source)
    public static SourceRunner forSource(Source source) {
      SourceRunner runner = null;

      if (source instanceof PollableSource) {
        runner = new PollableSourceRunner();
        ((PollableSourceRunner) runner).setSource((PollableSource) source);
      } else if (source instanceof EventDrivenSource) {
        runner = new EventDrivenSourceRunner();
        ((EventDrivenSourceRunner) runner).setSource((EventDrivenSource) source);
      } else {
        throw new IllegalArgumentException("No known runner type for source "
            + source);
      }

      return runner;
    }
This method chooses which SourceRunner to use based on the type of the source.
Let's take a concrete example: AvroSource is an event-driven source, so it is run by EventDrivenSourceRunner.
    public void start() {
      Source source = getSource();
      ChannelProcessor cp = source.getChannelProcessor();
      cp.initialize();
      source.start();
      lifecycleState = LifecycleState.START;
    }
This method is called by the MonitorRunnable class, which is responsible for monitoring all of Flume's components.
So when is it called? Once this method has been called, the interaction between the source and the channel begins.
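The idea that start() is what opens the path from source to channel can be shown with a small model. QueueChannel and CallbackSource below are hypothetical stand-ins, not Flume classes, and the sketch assumes an event-driven source simply hands each incoming event to its channel once it has been started.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical stand-in for a channel: a plain in-memory queue.
class QueueChannel {
    final Queue<String> events = new ArrayDeque<>();

    void put(String event) {
        events.add(event);
    }
}

// Hypothetical stand-in for an event-driven source.
class CallbackSource {
    private final QueueChannel channel;
    private boolean started = false;

    CallbackSource(QueueChannel channel) {
        this.channel = channel;
    }

    // After this call, incoming events are forwarded to the channel.
    void start() {
        started = true;
    }

    // Called by whatever delivers data; returns false while not started.
    boolean onEvent(String event) {
        if (!started) {
            return false;
        }
        channel.put(event);
        return true;
    }
}
```

In this model nothing reaches the channel before start() runs, which is why the question of who calls start(), and when, matters for the rest of the analysis.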
    switch (supervisoree.status.desiredState) {
      case START:
        try {
          lifecycleAware.start();
The code above is in the run method of MonitorRunnable, a static inner class of LifecycleSupervisor. Next, let's see who schedules this thread class.
    MonitorRunnable monitorRunnable = new MonitorRunnable();
    monitorRunnable.lifecycleAware = lifecycleAware;
    monitorRunnable.supervisoree = process;
    monitorRunnable.monitorService = monitorService;

    supervisedProcesses.put(lifecycleAware, process);

    ScheduledFuture<?> future = monitorService.scheduleWithFixedDelay(
        monitorRunnable, 0, 3, TimeUnit.SECONDS);
    monitorFutures.put(lifecycleAware, future);
This is the supervise method of the LifecycleSupervisor class.
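Here we finally reach the core of the core: each MonitorRunnable is scheduled with a fixed delay of 3 seconds, so every 3 seconds the supervisor checks the source and, when needed, calls its start method to bring up the interaction with the channel.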
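The scheduling call can be tried in isolation. The following is a minimal sketch with hypothetical classes (SupervisedComponent, MonitorSketch), not Flume's code; it assumes the monitor calls start() only while the component is not yet in the desired state, and it reuses the same scheduleWithFixedDelay arguments (0, 3, TimeUnit.SECONDS).

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical component that records how often start() was called.
class SupervisedComponent {
    boolean started = false;
    int startCalls = 0;

    void start() {
        started = true;
        startCalls++;
    }
}

// Hypothetical monitor; the name echoes the article but this is not Flume's class.
class MonitorSketch implements Runnable {
    private final SupervisedComponent component;

    MonitorSketch(SupervisedComponent component) {
        this.component = component;
    }

    @Override
    public void run() {
        // Assumption: act only when the component is not in the desired state.
        if (!component.started) {
            component.start();
        }
    }

    // First run immediately, then again 3 seconds after each run finishes.
    static ScheduledFuture<?> supervise(ScheduledExecutorService service,
                                        SupervisedComponent component) {
        return service.scheduleWithFixedDelay(
            new MonitorSketch(component), 0, 3, TimeUnit.SECONDS);
    }
}
```

Under that assumption the monitor wakes up every 3 seconds, but start() itself runs once; later passes only confirm that the component is still in the desired state.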
    Supervisoree process = new Supervisoree();
    process.status = new Status();

    process.policy = policy;
    process.status.desiredState = desiredState;
    process.status.error = false;
So who calls the method that contains the code above?
It is Application.
    public synchronized void start() {
      for (LifecycleAware component : components) {
        supervisor.supervise(component,
            new SupervisorPolicy.AlwaysRestartPolicy(), LifecycleState.START);
      }
    }
With that, the whole call chain is connected.
So from here we can see that the source, and through it the source-to-channel interaction, is checked on a 3-second cycle.
Having looked at the interaction between the source and the channel, let's now look at the sink and the channel.
The sink side is easy to follow, because all three kinds of Flume component implement the same interface, LifecycleAware.
So, starting from start in Flume's entry point Application, the sink also ends up in the supervise method of the LifecycleSupervisor class, in the same code:
    MonitorRunnable monitorRunnable = new MonitorRunnable();
    monitorRunnable.lifecycleAware = lifecycleAware;
    monitorRunnable.supervisoree = process;
    monitorRunnable.monitorService = monitorService;

    supervisedProcesses.put(lifecycleAware, process);

    ScheduledFuture<?> future = monitorService.scheduleWithFixedDelay(
        monitorRunnable, 0, 3, TimeUnit.SECONDS);
    monitorFutures.put(lifecycleAware, future);
This logic does not distinguish between source and sink: the sink is likewise monitored every 3 seconds.
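Why one code path can serve every component follows from the shared interface. Here is a minimal sketch with hypothetical names (Lifecycle, NamedComponent, UniformSupervisor); it stands in for the idea of a common lifecycle interface and is not Flume's implementation.

```java
import java.util.List;

// Hypothetical stand-in for a shared lifecycle interface.
interface Lifecycle {
    void start();
}

// Hypothetical component that records its start in a shared log.
class NamedComponent implements Lifecycle {
    private final String name;
    private final List<String> log;

    NamedComponent(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    @Override
    public void start() {
        log.add("start " + name);
    }
}

class UniformSupervisor {
    // One code path for every component; the concrete type is never inspected.
    static void superviseAll(List<Lifecycle> components) {
        for (Lifecycle component : components) {
            component.start();
        }
    }
}
```

Since the supervisor only ever sees the interface, adding a new kind of component requires no change to the supervising code.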
That covers how Flume's three components are brought into interaction and at what interval. Comments and corrections from readers are welcome!
[Flume] Analyzing how Flume's source and sink interact with the channel, starting from the Application entry point.