import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Create a local StreamingContext with two execution threads and a 1-second batch
// interval (that is, the incoming data stream is split into 1-second batches).
val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
val ssc = new StreamingContext(conf, Seconds(1))

// Create a DStream that represents the streaming data received from the TCP source
// (host localhost, port 9999). The lines variable is a DStream representing the
// stream of data that will arrive from the data server.
val lines = ssc.socketTextStream("localhost", 9999)

// Each record in this DStream is a line of text. Split each line into words.
// flatMap is a one-to-many DStream operation that creates a new DStream by
// generating multiple new records for each record in the source DStream.
val words = lines.flatMap(_.split(" "))

// Count each word in each batch.
val pairs = words.map(word => (word, 1))
val wordCounts = pairs.reduceByKey(_ + _)
wordCounts.print()

// Start the computation and wait for it to terminate.
ssc.start()
ssc.awaitTermination()
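The StorageLevel import above is only needed if you pass an explicit storage level to socketTextStream instead of relying on the default. A minimal sketch of that variant follows; the choice of MEMORY_AND_DISK_SER here is an assumption for illustration, not something the original snippet specifies.

// Same TCP source, but with an explicit storage level for the received data
// (MEMORY_AND_DISK_SER is an assumed choice; the default is MEMORY_AND_DISK_SER_2).
val lines = ssc.socketTextStream("localhost", 9999, StorageLevel.MEMORY_AND_DISK_SER)

To try the example locally, start a simple data server with netcat (nc -lk 9999) in one terminal, run the application in another, and type some lines of text; the word counts for each one-second batch are printed to the console.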