Spark Streaming has an abstraction dedicated to transformation operations, called DStream (discretized stream). Incoming data is first divided by time into a sequence of RDDs: RDD@time1, RDD@time2, RDD@time3…
Note that these RDDs do not all exist at the same moment: by the time a new time-sliced RDD arrives, the earlier batches have already been handed to the transformation operations. This is what a stream transformation means. For wordcount, each time-sliced batch is a set of lines, and flatMap is applied to the lines of every batch. The result is a DStream of words, but underneath the operation still runs on RDDs. Compared with the earlier batch wordcount, the same transformation is simply applied repeatedly, once per time slice.
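The wordcount described above can be sketched as follows. This is a minimal example under stated assumptions, not the author's original program: the local[2] master, the socket source on localhost:9999, the one-second batch interval, and the object name StreamingWordCount are all chosen only for illustration.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingWordCount {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode with 2 threads (one for the receiver, one for processing)
    val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingWordCount")
    // Assumption: 1-second batches, so each second of input becomes one RDD
    val ssc = new StreamingContext(conf, Seconds(1))

    // Assumption: text lines arrive on a local socket; each batch is an RDD of lines
    val lines = ssc.socketTextStream("localhost", 9999)

    // flatMap runs on the RDD of every batch and yields a DStream of words
    val words = lines.flatMap(_.split(" "))

    // Count the words within each batch
    val counts = words.map(word => (word, 1)).reduceByKey(_ + _)

    // Output operation: print the first few counts of every batch
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Each call such as flatMap or map returns a new DStream, and Spark applies the function to the RDD of each batch as it is produced.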
Spark Streaming back pressure mechanism
When data is received faster than it can be processed, RDDs pile up, and this backlog triggers the back pressure mechanism.
If upstream data is produced too quickly, a bucket is used in which tokens are generated. Only data that obtains a token can be encapsulated as an RDD. By controlling the rate at which tokens are generated, the problem can be alleviated, because this effectively controls the rate at which rdd@time batches are produced. When no token is available, the caller blocks and waits for the next token to be generated.
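The token bucket idea can be sketched in plain Scala. This is only an illustration of the technique and not Spark's internal implementation; the class name TokenBucket and its parameters capacity and refillPerSecond are hypothetical.

```scala
// Illustrative token bucket; NOT Spark's internal code.
// capacity: maximum number of tokens the bucket can hold (hypothetical parameter)
// refillPerSecond: number of tokens generated per second (hypothetical parameter)
final class TokenBucket(capacity: Long, refillPerSecond: Long) {
  private var tokens: Long = capacity
  private var lastRefillNanos: Long = System.nanoTime()

  // Add the tokens generated since the last refill, capped at capacity
  private def refill(): Unit = {
    val now = System.nanoTime()
    val newTokens = (now - lastRefillNanos) * refillPerSecond / 1000000000L
    if (newTokens > 0) {
      tokens = math.min(capacity, tokens + newTokens)
      lastRefillNanos = now
    }
  }

  /** Takes one token if one is available; never blocks. */
  def tryAcquire(): Boolean = synchronized {
    refill()
    if (tokens > 0) { tokens -= 1; true } else false
  }

  /** Blocks until a token is available, then takes it. */
  def acquire(): Unit = {
    while (!tryAcquire()) Thread.sleep(1)
  }
}
```

A producer would call acquire() before handing each record on, so the token rate caps the ingestion rate. In Spark itself, back pressure is switched on with the spark.streaming.backpressure.enabled setting, and spark.streaming.receiver.maxRate sets an upper bound on the per-receiver rate.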
The entry point to Spark Streaming: StreamingContext
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}
val conf = new SparkConf().setMaster(master).setAppName(appName)
val ssc = new StreamingContext(conf, Seconds(1))
// You can access the SparkContext through ssc.sparkContext
// Alternatively, create the StreamingContext directly from a SparkContext:
val ssc = new StreamingContext(new SparkContext(), Seconds(1))
After initializing the context:
1. Define the input sources by creating DStreams.
2. Define the transformation and output operations on the DStreams.
3. Start receiving and processing data with streamingContext.start().
4. Wait for processing to terminate with streamingContext.awaitTermination().
5. Stop processing manually with streamingContext.stop().
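The five steps above can be sketched end to end. To stay self-contained, this example assumes an in-memory queue of RDDs as the input source, a local[2] master, and a five-second timeout. The object name LifecycleSketch and these values are illustrative, not from the original article.

```scala
import scala.collection.mutable
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object LifecycleSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode and a 1-second batch interval
    val conf = new SparkConf().setMaster("local[2]").setAppName("LifecycleSketch")
    val ssc = new StreamingContext(conf, Seconds(1))

    // 1. Input source: a queue of RDDs, consumed one RDD per batch
    val queue = mutable.Queue[RDD[Int]]()
    queue += ssc.sparkContext.makeRDD(1 to 5)
    val numbers = ssc.queueStream(queue)

    // 2. Transformation (map) and output operation (print)
    numbers.map(_ * 2).print()

    // 3. Start receiving and processing
    ssc.start()

    // 4. Wait for termination, here for at most 5 seconds (assumed value)
    ssc.awaitTerminationOrTimeout(5000)

    // 5. Stop manually; also stop the SparkContext and finish pending batches
    ssc.stop(stopSparkContext = true, stopGracefully = true)
  }
}
```

Transformations and output operations have to be declared before start() is called; once a context has been stopped it cannot be restarted.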