I recently installed Spark in standalone mode on my computer. Spark itself started without any problems, but when I started the Spark History Server, I got the following error:
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Spark Command: /usr/local/java/jdk1.7.0_67/bin/java -cp ::/usr/local/spark/conf:/usr/local/spark/lib/spark-assembly-1.1.0-hadoop2.4.0.jar:/usr/local/spark/lib/datanucleus-core-3.2.2.jar:/usr/local/spark/lib/datanucleus-api-jdo-3.2.1.jar:/usr/local/spark/lib/datanucleus-rdbms-3.2.1.jar:/usr/local/hadoop/etc/hadoop -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m org.apache.spark.deploy.history.HistoryServer
========================================
15/01/07 15:08:55 INFO history.HistoryServer: Registered signal handlers for [TERM, HUP, INT]
15/01/07 15:08:55 INFO spark.SecurityManager: Changing view acls to: root,
15/01/07 15:08:55 INFO spark.SecurityManager: Changing modify acls to: root,
15/01/07 15:08:55 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root, ); users with modify permissions: Set(root, )
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:187)
        at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.lang.IllegalArgumentException: Logging directory must be specified.
        at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$2.apply(FsHistoryProvider.scala:41)
        at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$2.apply(FsHistoryProvider.scala:41)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:41)
        ... 6 more
/usr/local/spark/sbin/../logs/spark-root-org.apache.spark.deploy.history.HistoryServer-1-macor.out (END)
The error says that the logging directory is not specified. I searched for half a day without figuring out how to set it. The article "Spark History Server Process Analysis" contains the corresponding source-code analysis, and from it I learned that the spark.history.fs.logDirectory parameter was not specified, but I still did not know how to set it! Later, from the article "Spark History Server Configuration and Use", I learned that there are two ways to solve this:
1. Specify the value of spark.history.fs.logDirectory when starting the Spark History Server process, as follows:
start-history-server.sh hdfs://localhost:9000/sparkhistorylogs
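The log directory has to exist before the History Server starts (see the note on spark.eventLog.dir below). A minimal sketch, assuming HDFS is running at localhost:9000 and Spark lives under /usr/local/spark, which are the paths used in this setup; adjust them to yours:
# Create the directory first; the History Server will not create it:
hdfs dfs -mkdir -p /sparkhistorylogs
# Then start the History Server with the directory as an argument:
/usr/local/spark/sbin/start-history-server.sh hdfs://localhost:9000/sparkhistorylogs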
2. Configure it in conf/spark-defaults.conf and conf/spark-env.sh, as described below.
Description of the History Server-related configuration parameters:
1) spark.history.updateInterval
Default value: 10
The interval, in seconds, at which the information displayed by the History Server is updated
2) spark.history.retainedApplications
Default value: 50
The number of application histories kept in memory. If this number is exceeded, the oldest application information is removed, and the page has to be rebuilt when a removed application is accessed again.
3) spark.history.ui.port
Default value: 18080
The History Server's web UI port
4) spark.history.kerberos.enabled
Default value: false
Whether the History Server should use Kerberos to log in. This is useful if the History Server is accessing HDFS files on a secure cluster. If set to true, configure the following two properties:
5) spark.history.kerberos.principal
The Kerberos principal name of the History Server
6) spark.history.kerberos.keytab
The location of the Kerberos keytab file for the History Server
7) spark.history.ui.acls.enable
Default value: false
Whether ACLs are checked to authorize users viewing application information. If enabled, only the application's owner and the users listed in spark.ui.view.acls can view the application information.
8) spark.eventLog.enabled
Default value: false
Whether Spark events are logged, so that the application's web UI can be reconstructed after it finishes (see the spark-submit sketch after this list)
9) spark.eventLog.dir
Default value: file:///tmp/spark-events
The path where event log information is stored; it can be an HDFS path beginning with hdfs:// or a local path beginning with file://, and the directory must be created in advance
10) spark.eventLog.compress
Default value: false
Whether to compress the logged Spark events; if spark.eventLog.enabled is true, snappy compression is used by default
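As a quick usage sketch of the two spark.eventLog properties: an application only appears in the History Server if it was submitted with event logging enabled. The SparkPi class, master URL, and examples jar path below are just illustrative values from a standard Spark 1.1 standalone setup; adjust them to your own:
# Submit an example job with event logging turned on (these --conf
# flags can instead come from spark-defaults.conf, as shown below):
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://localhost:7077 \
  --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.dir=hdfs://localhost:9000/eventLogs \
  /usr/local/spark/lib/spark-examples-1.1.0-hadoop2.4.0.jar 10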
Properties beginning with spark.history are set via SPARK_HISTORY_OPTS in spark-env.sh; properties beginning with spark.eventLog go into spark-defaults.conf.
spark-defaults.conf:
spark.eventLog.enabled true
spark.eventLog.dir hdfs://localhost:9000/eventLogs
spark.eventLog.compress true
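Note that spark.eventLog.dir is where running applications write their event logs, while spark.history.fs.logDirectory (configured below) is where the History Server reads from, so in practice the two should point at the same directory, and it must be created in advance. A minimal sketch, again assuming HDFS at localhost:9000:
# Create the event log directory before submitting any application:
hdfs dfs -mkdir -p /eventLogs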
spark-env.sh:
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=7777 -Dspark.history.retainedApplications=3 -Dspark.history.fs.logDirectory=hdfs://localhost:9000/sparkhistorylogs"
Parameter description:
spark.history.ui.port=7777 changes the web UI port to 7777; the port can be modified to suit your environment.
spark.history.fs.logDirectory=hdfs://localhost:9000/sparkhistorylogs: once this property is configured, the log path no longer needs to be specified explicitly when running start-history-server.sh; modify it according to your actual setup.
spark.history.retainedApplications=3 specifies the number of application histories to keep; if this number is exceeded, the oldest application information is removed.
After these parameters are adjusted, start-history-server.sh can be started without any arguments:
start-history-server.sh
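To confirm the History Server actually came up, a quick sanity check (port 7777 matches the SPARK_HISTORY_OPTS above; adjust it if you picked a different one):
# The HistoryServer JVM should be listed:
jps | grep HistoryServer
# And the web UI should respond on the configured port:
curl http://localhost:7777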
This article is meant as a memo and study note; if anything is wrong, please point it out!
References:
http://www.cnblogs.com/luogankun/p/4089767.html
http://www.cnblogs.com/luogankun/p/3981645.html