Spark: ValueError: Cannot run multiple SparkContexts at once (solution)

Source: Internet
Author: User
Tags: jupyter notebook, pyspark

Yesterday I spent an afternoon installing Spark and hooking the PySpark shell up to the Jupyter Notebook editing interface, then worked through the book "Learning Spark: Lightning-Fast Big Data Analysis" to get a first taste of Spark's power.

My system is Windows 7, with Spark 1.6, Anaconda 3, and Python 3. The code is as follows:

lines = sc.textFile("D://program files//spark//spark-1.6.0-bin-hadoop2.6//README.md")
print("Number of lines of text:", lines.count())

from pyspark import SparkContext

logFile = "D://program files//spark//spark-1.6.0-bin-hadoop2.6//README.md"  # Should be some file on your system
sc = SparkContext("local", "Simple App")
logData = sc.textFile(logFile).cache()

numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()
pythonLines = lines.filter(lambda line: "Python" in line)
print("Lines with a: %i, lines with b: %i" % (numAs, numBs))

Running this produces the following ValueError:

Number of lines of text
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-70ecab39b7ea> in <module>()
      5
      6 logFile = "D://program files//spark//spark-1.6.0-bin-hadoop2.6//README.md"  # Should be some file on your system
----> 7 sc = SparkContext("local", "Simple App")
      8 logData = sc.textFile(logFile).cache()
      9

D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    111         self._callsite = first_spark_call() or CallSite(None, None, None)
--> 112         SparkContext._ensure_initialized(self, gateway=gateway)
    113         try:
    114             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\context.py in _ensure_initialized(cls, instance, gateway)
    259                         " created by %s at %s:%s"
    260                         % (currentAppName, currentMaster,
--> 261                            callsite.function, callsite.file, callsite.linenum))
    262                 else:
    263                     SparkContext._active_spark_context = instance

ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(app=PySparkShell, master=local[*]) created by <module> at D:\Program files\anaconda3\lib\site-packages\ipython\utils\py3compat.py:186

I Googled the problem and finally found the answer on Stack Overflow. The message "ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(app=PySparkShell, master=local[*]) created by <module> at D:\Program Files\anaconda3\lib\site-packages\ipython\utils\py3compat.py:186" means that you cannot have more than one SparkContext (sc) open at a time: the PySpark shell already created one when it started, so constructing a new one raises an error. The solution is to shut down the existing context before creating a new one, and that is exactly what the sc.stop() method is for.
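To see why the constructor refuses, it helps to know that PySpark enforces one active SparkContext per process with a class-level guard. The following toy class is my own illustration of that pattern (not Spark's actual code); names like DemoContext are made up for the example:

```python
class DemoContext:
    """Toy stand-in for SparkContext's one-active-context-per-process rule."""
    _active = None  # class-level slot holding the single live context

    def __init__(self, master, app_name):
        if DemoContext._active is not None:
            # Mirrors the error the blog hit: a context already exists.
            raise ValueError(
                "Cannot run multiple contexts at once; existing context "
                "(app=%s)" % DemoContext._active.app_name)
        self.master = master
        self.app_name = app_name
        DemoContext._active = self

    def stop(self):
        # Releasing the slot is what makes a new context legal again.
        DemoContext._active = None


first = DemoContext("local", "PySparkShell")
try:
    DemoContext("local", "Simple App")       # second context -> ValueError
except ValueError as e:
    print("refused:", e)

first.stop()                                  # same fix as sc.stop()
second = DemoContext("local", "Simple App")  # now succeeds
print("active app:", second.app_name)
```

This is why calling sc.stop() on the shell's context clears the way for our own SparkContext("local", "Simple App").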

Let's change the code and run it to see the result:

lines = sc.textFile("D://program files//spark//spark-1.6.0-bin-hadoop2.6//README.md")
print("Number of lines of text:", lines.count())

sc.stop()  # stop the existing SparkContext

from pyspark import SparkContext

logFile = "D://program files//spark//spark-1.6.0-bin-hadoop2.6//README.md"  # Should be some file on your system
sc = SparkContext("local", "Simple App")
logData = sc.textFile(logFile).cache()

numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()
pythonLines = lines.filter(lambda line: "Python" in line)
print("Lines with a: %i, lines with b: %i" % (numAs, numBs))

The results are as follows:

Number of lines of text
Lines with a: 58, lines with b: 26

The problem turned out to be easy to fix. I'm noting it down here so it's on record the next time it comes up.
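One more note: later PySpark versions also offer SparkContext.getOrCreate(), which returns the already-running context instead of raising, so the stop-and-recreate dance can often be avoided. The idea, sketched again with a toy class of my own rather than Spark's implementation:

```python
class ToyContext:
    """Toy illustration (not Spark's code) of the get-or-create pattern."""
    _active = None  # the single live context, if any

    def __init__(self, app_name):
        if ToyContext._active is not None:
            raise ValueError("Cannot run multiple contexts at once")
        self.app_name = app_name
        ToyContext._active = self

    @classmethod
    def get_or_create(cls, app_name):
        # Reuse the live context instead of failing on a second constructor call.
        if cls._active is None:
            cls(app_name)  # the constructor registers itself as active
        return cls._active


a = ToyContext.get_or_create("PySparkShell")
b = ToyContext.get_or_create("Simple App")  # reuses the shell's context
print(a is b)  # True: no second context was created
```

Note that the reused context keeps its original app name, so if you specifically need a fresh context with new settings, stopping the old one as above is still the way to go.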
