Design Background
Spark Thriftserver currently has 10 instances in production. In the past, liveness was judged by monitoring the service port, which is not accurate: in many failure scenarios the process does not exit, and manually inspecting the logs and restarting the service is very inefficient. This design therefore uses Spark Streaming to collect the Spark Thriftserver logs in real time, decides from the log content whether the service has stopped serving, and triggers the corresponding automatic restart. The scheme achieves second-level, 7*24h uninterrupted monitoring and maintenance of the service.
Design Architecture
- Deploy a Flume agent on every Spark Thriftserver node that needs to be monitored to tail the log stream (Flume uses a host interceptor to add host information to each log event)
- Flume delivers the collected log stream into Kafka
- Spark Streaming consumes the Kafka log stream and checks the log content against custom keywords; if a keyword is hit, the service is considered unavailable and the host information carried in the log event is written into MySQL
- A shell script reads the host information from MySQL and performs the service restart operation
Software Versions and Configuration
Spark 2.0.1, Kafka 0.10, Flume 1.7
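For reference, a possible set of build dependencies for the monitoring job is sketched below. Only the Spark version comes from this post; the other artifact versions, the project name and the use of sbt are assumptions. flume-ng-sdk provides the AvroFlumeEvent class needed to decode events written by the Kafka sink when useFlumeEventFormat=true, and a MySQL JDBC driver is needed for writing the alert records.

// build.sbt (sketch); versions other than Spark's are assumptions
name := "spark-ts-log-monitor"
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming"            % "2.0.1" % "provided",
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.0.1",
  // AvroFlumeEvent, used to decode Flume events written with useFlumeEventFormat=true
  "org.apache.flume"  % "flume-ng-sdk"               % "1.7.0",
  // JDBC driver for writing hit records into MySQL
  "mysql"             % "mysql-connector-java"       % "5.1.39"
)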
1) Flume Configuration and command:
Modify conf/flume-conf.properties:
agent.sources = sparkTS070
agent.channels = c
agent.sinks = kafkaSink

# For each one of the sources, the type is defined
agent.sources.sparkTS070.type = TAILDIR
agent.sources.sparkTS070.interceptors = i1
agent.sources.sparkTS070.interceptors.i1.type = host
agent.sources.sparkTS070.interceptors.i1.useIP = false
agent.sources.sparkTS070.interceptors.i1.hostHeader = agenthost

# The channel can be defined as follows
agent.sources.sparkTS070.channels = c
agent.sources.sparkTS070.positionFile = /home/hadoop/xu.wenchun/apache-flume-1.7.0-bin/taildir_position.json
agent.sources.sparkTS070.filegroups = f1
agent.sources.sparkTS070.filegroups.f1 = /data1/spark/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-hadoop070.dx.com.out

# Each sink's type must be defined
agent.sinks.kafkaSink.type = org.apache.flume.sink.kafka.KafkaSink
agent.sinks.kafkaSink.kafka.topic = mytest-topic1
agent.sinks.kafkaSink.kafka.bootstrap.servers = 10.87.202.51:9092
agent.sinks.kafkaSink.useFlumeEventFormat = true

# Specify the channel the sink should use
agent.sinks.kafkaSink.channel = c

# Each channel's type is defined
agent.channels.c.type = memory
Run the command:
nohup bin/flume-ng agent -n agent -c conf -f conf/flume-conf.properties -Dflume.root.logger=INFO,LOGFILE &
2) Kafka Configuration and execution command:
Modify config/server.properties:
broker.id=1
listeners=PLAINTEXT://10.87.202.51:9092
log.dirs=/home/hadoop/xu.wenchun/kafka_2.11-0.10.0.1/kafka.log
zookeeper.connect=10.87.202.44:2181,10.87.202.51:2181,10.87.202.52:2181
Run the command:
nohup bin/kafka-server-start.sh config/server.properties &
Spark Streaming submit command:
/opt/spark-2.0.1-bin-2.6.0/bin/spark-submit --master yarn-cluster --num-executors 3 --class Sparktslogmonito
3) Shell script
Write a shell script that reads the host information from MySQL and performs the service restart operation; a minimal sketch follows.
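A minimal sketch of such a script, under the following assumptions (none of which are spelled out in the original post): the monitoring job writes into a table monitor(id, hostname), the MySQL database name, credentials and the Spark installation path are as shown, and passwordless SSH to the Thriftserver nodes is available.

#!/bin/bash
# Hypothetical restart script: poll MySQL for hosts flagged by the monitoring job
# and restart the Spark Thriftserver on each of them. Database name, credentials
# and paths below are assumptions, not taken from the original post.

MYSQL="mysql -h 127.0.0.1 -u monitor -pxxxx monitor_db -N -s -e"
SPARK_HOME=/opt/spark-2.0.1-bin-2.6.0

# Hosts recorded by the Spark Streaming job and not yet handled
HOSTS=$($MYSQL "select distinct hostname from monitor")

for host in $HOSTS; do
    echo "$(date) restarting Spark Thriftserver on $host"
    # Restart the Thriftserver on the remote node (requires passwordless SSH)
    ssh "$host" "$SPARK_HOME/sbin/stop-thriftserver.sh; $SPARK_HOME/sbin/start-thriftserver.sh"
    # Remove the handled records so the host is not restarted repeatedly
    $MYSQL "delete from monitor where hostname = '$host'"
done

Such a script would typically be run periodically, for example from cron, on an operations node.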
Spark Streaming core code of the monitoring job
The Spark Streaming code is shared below; it was arrived at after working through a few pitfalls and has been verified to work.
import java.text.SimpleDateFormat
import java.util.Date

import org.apache.avro.io.DecoderFactory
import org.apache.avro.specific.SpecificDatumReader
import org.apache.avro.util.Utf8
import org.apache.flume.source.avro.AvroFlumeEvent

stream.foreachRDD { rdd =>
  rdd.foreachPartition { rddOfPartition =>
    val conn = ConnectPool.getConnection
    println("conn: " + conn)
    conn.setAutoCommit(false) // commit manually
    val stmt = conn.createStatement()
    rddOfPartition.foreach { event =>
      // The Kafka record value is the Avro-serialized Flume event (useFlumeEventFormat=true)
      val body = event.value()
      val decoder = DecoderFactory.get().binaryDecoder(body, null)
      val result = new SpecificDatumReader[AvroFlumeEvent](classOf[AvroFlumeEvent]).read(null, decoder)
      // Host name added by the Flume host interceptor
      val hostname = result.getHeaders.get(new Utf8("agenthost"))
      val text = new String(result.getBody.array())
      // Keywords that indicate the Thriftserver can no longer serve requests
      if (text.contains("Broken pipe") || text.contains("No active SparkContext")) {
        val dateFormat: SimpleDateFormat = new SimpleDateFormat("yyyyMMddHHmmssSSS")
        val id = dateFormat.format(new Date()) + "_" + new scala.util.Random().nextInt(999)
        stmt.addBatch("insert into monitor (id, hostname) values ('" + id + "', '" + hostname + "')")
        println("insert into monitor (id, hostname) values ('" + id + "', '" + hostname + "')")
      }
    }
    stmt.executeBatch()
    conn.commit()
    conn.close()
  }
}
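The snippet above assumes a Kafka direct stream named stream and a ConnectPool helper, neither of which appears in the original post. Below is a minimal sketch of that surrounding setup: the broker and topic come from the Flume/Kafka configuration above, while the object names, batch interval, consumer group id, and the JDBC URL and credentials are assumptions. Because the Kafka sink is configured with useFlumeEventFormat=true, each record value carries an Avro-serialized Flume event, which is why a plain byte-array deserializer is used.

import java.sql.{Connection, DriverManager}

import org.apache.kafka.common.serialization.{ByteArrayDeserializer, StringDeserializer}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

// A trivial stand-in for the ConnectPool helper used above; a real connection pool
// (DBCP, HikariCP, ...) would be preferable. JDBC URL and credentials are assumptions.
object ConnectPool {
  Class.forName("com.mysql.jdbc.Driver")
  def getConnection: Connection =
    DriverManager.getConnection("jdbc:mysql://127.0.0.1:3306/monitor_db", "monitor", "xxxx")
}

// Illustrative object name; use whatever class name is passed to spark-submit
object ThriftserverLogMonitorJob {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ThriftserverLogMonitor")
    val ssc = new StreamingContext(conf, Seconds(10)) // 10s batch interval is an assumption

    // Broker and topic from the configuration above; values are read as raw bytes
    // because they contain Avro-serialized Flume events
    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "10.87.202.51:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[ByteArrayDeserializer],
      "group.id"           -> "spark-ts-monitor",
      "auto.offset.reset"  -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    // The "stream" used in the foreachRDD snippet above
    val stream = KafkaUtils.createDirectStream[String, Array[Byte]](
      ssc, PreferConsistent, Subscribe[String, Array[Byte]](Array("mytest-topic1"), kafkaParams))

    // ... the foreachRDD logic shown above goes here ...

    ssc.start()
    ssc.awaitTermination()
  }
}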
The above is a typical entry-level real-time processing application. It happened to fit this kind of monitoring and operations problem well, and handling it with this scheme has worked out nicely.
Reposted from: http://blog.csdn.net/xwc35047/article/details/75309350