Window operator

WindowOperator is the lowest-level implementation of the window mechanism. It touches almost every concept related to windows, which makes it relatively complex. This article analyzes the implementation of WindowOperator by starting from the overall picture and then moving to the details. First, let's look at how it executes for the most common case, time windows (covering both processing time and event time):
In the figure, events flow from left to right. Each box represents an event, and each vertical dashed line in the event stream represents a watermark. Flink injects watermarks into the event stream through the watermark assigners (the two operators TimestampsAndPeriodicWatermarksOperator and TimestampsAndPunctuatedWatermarksOperator). When elements reach WindowOperator in the streaming dataflow engine, they fall into two groups: ordinary events and watermarks.
For an ordinary event, the processElement method (one of the three circles in the dashed box) is called. processElement first uses the window assigner to assign windows to the received element, and then invokes the trigger's onElement method for that element. Time-based triggers usually register event-time or processing-time timers at this point; these are stored in WindowOperator's processing-time timer queue and watermark timer queue (the top and bottom cylinders in the dashed box in the figure). If the trigger result is FIRE, the window is evaluated.
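The element path described above can be sketched in a few lines. This is a toy model, not Flink's code: the class ElementPathSketch, its Window type, and the timestamp-only timer queue are all hypothetical simplifications. It assumes non-negative timestamps, tumbling windows, and a trigger that registers one event-time timer at the window's maximum timestamp.

```java
import java.util.PriorityQueue;

/** Toy model of the element path: assign a window, then register a timer. */
final class ElementPathSketch {

    /** Hypothetical half-open window [start, end). */
    static final class Window {
        final long start;
        final long end;

        Window(long start, long end) {
            this.start = start;
            this.end = end;
        }

        /** Largest timestamp that still belongs to this window. */
        long maxTimestamp() {
            return end - 1;
        }
    }

    private final long windowSize;

    /** Stand-in for the watermark timer queue; holds bare timestamps only. */
    final PriorityQueue<Long> eventTimeTimers = new PriorityQueue<>();

    ElementPathSketch(long windowSize) {
        this.windowSize = windowSize;
    }

    /** Step 1: the "window assigner" aligns the timestamp down to a window start. */
    Window assignWindow(long timestamp) {
        long start = timestamp - (timestamp % windowSize);
        return new Window(start, start + windowSize);
    }

    /** Step 2: the "trigger" registers an event-time timer for the assigned window. */
    Window processElement(long timestamp) {
        Window window = assignWindow(timestamp);
        long fireAt = window.maxTimestamp();
        if (!eventTimeTimers.contains(fireAt)) {
            eventTimeTimers.add(fireAt);
        }
        return window;
    }

    public static void main(String[] args) {
        ElementPathSketch operator = new ElementPathSketch(1000);
        Window w = operator.processElement(1500);
        System.out.println("window = [" + w.start + ", " + w.end + ")");
        System.out.println("timers = " + operator.eventTimeTimers);
    }
}
```

Two elements that fall into the same window end up sharing a single timer in this sketch, which is why the timer queue stays small even under a high element rate.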
For a watermark (the event-time scenario), the processWatermark method is called, which processes the timers in the watermark timer queue. For every timer whose timestamp satisfies the condition, the trigger's onEventTime method is invoked.
In the processing-time scenario, WindowOperator itself implements the processing-time callback interface (Triggerable): its trigger method is called to consume the timers in the processing-time timer queue that satisfy the condition, and it invokes the window trigger's onProcessingTime. The trigger result then determines whether the window is evaluated.
The above is the simplest view of WindowOperator's regular flow; the actual logic is considerably more complex. Let's first break down a few of its core internal objects. The figure shows two queues: the watermark timer queue and the processing-time timer queue. What is a timer here, and what role does it play? Let's look at its definition, the Timer inner class of WindowOperator. A timer is the foundation of all time-window execution. It is essentially a context object that encapsulates three properties:
- timestamp: the timestamp at which the trigger should fire;
- key: the key of the group to which the current element belongs;
- window: the window to which the current element belongs;
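To make the three properties concrete, here is a minimal sketch of such a timer and of how a priority queue orders it. The class name TimerSketch is hypothetical, and ordering by timestamp with value-based equality is an assumption about what a timer queue needs, not a copy of Flink's Timer class.

```java
import java.util.Objects;
import java.util.PriorityQueue;

/** Hypothetical timer holding the three properties listed above. */
final class TimerSketch<K, W> implements Comparable<TimerSketch<K, W>> {
    final long timestamp; // when the trigger should fire
    final K key;          // key of the group the element belongs to
    final W window;       // window the element belongs to

    TimerSketch(long timestamp, K key, W window) {
        this.timestamp = timestamp;
        this.key = key;
        this.window = window;
    }

    /** Earlier timestamps sort first, so the queue head is the next timer due. */
    @Override
    public int compareTo(TimerSketch<K, W> other) {
        return Long.compare(timestamp, other.timestamp);
    }

    /** Two timers are equal when timestamp, key, and window all match. */
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof TimerSketch)) return false;
        TimerSketch<?, ?> that = (TimerSketch<?, ?>) o;
        return timestamp == that.timestamp
                && Objects.equals(key, that.key)
                && Objects.equals(window, that.window);
    }

    @Override
    public int hashCode() {
        return Objects.hash(timestamp, key, window);
    }

    public static void main(String[] args) {
        PriorityQueue<TimerSketch<String, String>> queue = new PriorityQueue<>();
        queue.add(new TimerSketch<>(30L, "user-a", "w3"));
        queue.add(new TimerSketch<>(10L, "user-a", "w1"));
        queue.add(new TimerSketch<>(20L, "user-b", "w2"));
        // The head of the queue is the timer with the smallest timestamp.
        System.out.println(queue.peek().timestamp);
    }
}
```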
When we discussed window triggers, we mentioned the trigger context object, which is passed as a parameter to the trigger's processing methods (onElement, onEventTime, onProcessingTime). Inside WindowOperator we finally see the implementation of that context interface, Context, which provides three kinds of methods:
- state storage and access;
- registration and deletion of timers;
- wrappers around the window trigger's processing methods;
Registering a timer creates a Timer object and adds it to the timer queue. When a time-related processing method (processWatermark or trigger) is invoked, Timer objects are consumed from the queue and the window trigger is called; the trigger result determines whether the window is evaluated. We pick the event-time method processWatermark for analysis (the processing-time trigger method is similar):
    public void processWatermark(Watermark mark) throws Exception {
        // Flag indicating whether a timer still satisfies the trigger condition
        boolean fire;
        do {
            // Look at the timer at the head of the watermark timer queue; peek does
            // not dequeue it (note the difference from remove)
            Timer<K, W> timer = watermarkTimersQueue.peek();
            // The timer exists and its timestamp is no greater than the watermark's.
            // A watermark signals that elements with smaller timestamps have arrived,
            // so every timer whose timestamp is not greater than it must fire
            if (timer != null && timer.timestamp <= mark.getTimestamp()) {
                // A timer that satisfies the trigger condition was found
                fire = true;
                // Dequeue the timer first
                watermarkTimers.remove(timer);
                watermarkTimersQueue.remove();
                // Set up the context for this timer
                context.key = timer.key;
                context.window = timer.window;
                setKeyContext(timer.key);
                // The window state is an appending state
                AppendingState<IN, ACC> windowState;
                MergingWindowSet<W> mergingWindows = null;
                // The assigner is a merging assigner (e.g. session windows)
                if (windowAssigner instanceof MergingWindowAssigner) {
                    // Get the MergingWindowSet helper instance
                    mergingWindows = getMergingWindowSet();
                    // Get the state window for the current window (the state window is
                    // the namespace under which the state backend stores the state)
                    W stateWindow = mergingWindows.getStateWindow(context.window);
                    // No corresponding state window: skip this iteration
                    if (stateWindow == null) {
                        continue;
                    }
                    // Get the state of the current window
                    windowState = getPartitionedState(stateWindow, windowSerializer, windowStateDescriptor);
                } else {
                    // Not a merging assigner: get the window's state directly
                    windowState = getPartitionedState(context.window, windowSerializer, windowStateDescriptor);
                }
                // Get all elements of the window from its state
                ACC contents = windowState.get();
                if (contents == null) {
                    // If there is no state, there is nothing to do
                    continue;
                }
                // Invoke the trigger's event-time method through the context object
                TriggerResult triggerResult = context.onEventTime(timer.timestamp);
                // If the result is FIRE, call fire to evaluate the window
                if (triggerResult.isFire()) {
                    fire(context.window, contents);
                }
                // If the result is PURGE, or the timer's time equals the window's cleanup
                // time (usually the window's maxTimestamp), clean up the window
                if (triggerResult.isPurge()
                        || (windowAssigner.isEventTime() && isCleanupTime(context.window, timer.timestamp))) {
                    // Clean up the window and its elements
                    cleanup(context.window, windowState, mergingWindows);
                }
            } else {
                // No qualifying timer in the queue: clear the flag to end the loop
                fire = false;
            }
        } while (fire);
        // Forward the watermark downstream
        output.emitWatermark(mark);
        // Update the operator's current watermark with the new watermark's timestamp
        this.currentWatermark = mark.getTimestamp();
    }
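Stripped of state handling and trigger calls, the loop above reduces to draining a priority queue up to the watermark. The following sketch assumes timers are bare timestamps and that "firing" simply records them; WatermarkDrainSketch and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Toy version of the peek-then-remove loop driven by a watermark. */
final class WatermarkDrainSketch {
    private final PriorityQueue<Long> timers = new PriorityQueue<>();
    long currentWatermark = Long.MIN_VALUE;

    void registerTimer(long timestamp) {
        timers.add(timestamp);
    }

    int pendingTimers() {
        return timers.size();
    }

    /** Fires every timer whose timestamp is not greater than the watermark. */
    List<Long> processWatermark(long watermark) {
        List<Long> fired = new ArrayList<>();
        // peek() inspects the head without removing it; poll() dequeues it.
        while (!timers.isEmpty() && timers.peek() <= watermark) {
            fired.add(timers.poll());
        }
        // Remember the watermark, as the operator does after draining.
        currentWatermark = watermark;
        return fired;
    }

    public static void main(String[] args) {
        WatermarkDrainSketch operator = new WatermarkDrainSketch();
        operator.registerTimer(15);
        operator.registerTimer(5);
        operator.registerTimer(10);
        System.out.println("fired   = " + operator.processWatermark(10));
        System.out.println("pending = " + operator.pendingTimers());
    }
}
```

Because the queue is ordered by timestamp, the loop can stop at the first timer that lies beyond the watermark; nothing behind it can be due.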
Although the method above is long, its flow is clear. The fire method evaluates the window: it calls the apply method of the internal window function (InternalWindowFunction, which wraps the WindowFunction).
The isCleanupTime and cleanup methods deal with window cleanup. If the current window is a time window and it has reached its cleanup time, the window is cleaned up. How is the cleanup time determined? Flink computes it by adding the allowed lateness to the window's maximum timestamp:
    private long cleanupTime(W window) {
        // The cleanup time defaults to the window's maximum timestamp plus the allowed lateness
        long cleanupTime = window.maxTimestamp() + allowedLateness;
        // For a non-time window (whose maxTimestamp is Long.MAX_VALUE), adding the allowed
        // lateness overflows long and yields a negative number, so
        // cleanupTime < window.maxTimestamp() holds; in that case the cleanup time
        // is set to Long.MAX_VALUE directly
        return cleanupTime >= window.maxTimestamp() ? cleanupTime : Long.MAX_VALUE;
    }
Once the cleanup time is computed, it is compared with the timestamp of the timer that fired: the result is true if the two are equal and false otherwise:
    protected final boolean isCleanupTime(W window, long time) {
        long cleanupTime = cleanupTime(window);
        return cleanupTime == time;
    }
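The overflow guard is easy to check in isolation. The sketch below restates the two methods over plain long values instead of a window object; the class name and the static signatures are assumptions made for illustration.

```java
/** Cleanup-time arithmetic on plain longs (hypothetical helper, not Flink code). */
final class CleanupTimeSketch {

    /** maxTimestamp + allowedLateness, clamped to Long.MAX_VALUE on overflow. */
    static long cleanupTime(long maxTimestamp, long allowedLateness) {
        long cleanupTime = maxTimestamp + allowedLateness;
        // A wrapped-around sum is smaller than maxTimestamp, which exposes the overflow.
        return cleanupTime >= maxTimestamp ? cleanupTime : Long.MAX_VALUE;
    }

    /** True when the timer's timestamp is exactly the window's cleanup time. */
    static boolean isCleanupTime(long maxTimestamp, long allowedLateness, long timerTimestamp) {
        return cleanupTime(maxTimestamp, allowedLateness) == timerTimestamp;
    }

    public static void main(String[] args) {
        // A time window ending at 1999 with 500 ms of allowed lateness.
        System.out.println(cleanupTime(1999L, 500L));
        // A non-time window: the raw sum would overflow, so the guard clamps it.
        System.out.println(cleanupTime(Long.MAX_VALUE, 500L) == Long.MAX_VALUE);
    }
}
```

The guard relies on allowedLateness being non-negative: only then does a sum smaller than maxTimestamp unambiguously mean overflow.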
Let's take a look at the main things the cleanup method does:
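    private void cleanup(W window,
                         AppendingState<IN, ACC> windowState,
                         MergingWindowSet<W> mergingWindows) throws Exception {
        // Clear the state held in the state backend for this window
        windowState.clear();
        // If window merging is supported, remove this window's record from the merging window set
        if (mergingWindows != null) {
            mergingWindows.retireWindow(window);
        }
        // Clear the context object's state
        context.clear();
    }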
Regarding window cleanup, all three processing methods (processElement, processWatermark, trigger) check for it and clean up when the conditions are met. The logic that actually registers the cleanup timer lives in processElement, which calls the registerCleanupTimer method:
    protected void registerCleanupTimer(W window) {
        // The time registered here is the computed cleanup time
        long cleanupTime = cleanupTime(window);
        // Call the registration method that matches the time characteristic
        if (windowAssigner.isEventTime()) {
            context.registerEventTimeTimer(cleanupTime);
        } else {
            context.registerProcessingTimeTimer(cleanupTime);
        }
    }
As the code snippet shows, a cleanup timer is registered in the same way as any ordinary timer.
When no lateness is allowed, window cleanup for event time and processing time is not necessarily triggered by the cleanup timer. The event-time and processing-time triggers register their own timers at the window's maximum timestamp, and those timers generally sit in the queue ahead of the cleanup timer, so they are executed first and trigger the window cleanup before the cleanup timer gets the chance. The registerCleanupTimer method is a general-purpose cleanup mechanism that applies to all window types and guarantees that every window is eventually cleaned up. In this case, the duplicate cleanup timer has no negative effect.
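One way to see why the extra timer is harmless: with zero allowed lateness the cleanup time equals the window's maximum timestamp, so the cleanup timer is identical to the one the trigger registered. The sketch below assumes that registration deduplicates through a set kept beside the queue. The processWatermark method shown earlier removes from both watermarkTimers and watermarkTimersQueue, which suggests such a pairing, but the registration logic here is a guess. Timers are reduced to bare timestamps, and all names are hypothetical.

```java
import java.util.HashSet;
import java.util.PriorityQueue;
import java.util.Set;

/** Set-plus-queue timer registry in which duplicate registrations are no-ops. */
final class TimerDedupSketch {
    private final Set<Long> timers = new HashSet<>();
    private final PriorityQueue<Long> queue = new PriorityQueue<>();

    /** Returns true only when the timer was not registered before. */
    boolean register(long timestamp) {
        if (timers.add(timestamp)) {
            queue.add(timestamp);
            return true;
        }
        return false;
    }

    int pending() {
        return queue.size();
    }

    public static void main(String[] args) {
        long maxTimestamp = 1999L;
        TimerDedupSketch registry = new TimerDedupSketch();
        // The trigger registers its timer at the window's maximum timestamp.
        System.out.println(registry.register(maxTimestamp));
        // Cleanup timer with zero lateness: same timestamp, so nothing is added.
        System.out.println(registry.register(maxTimestamp + 0L));
        // Cleanup timer with 500 ms lateness: a distinct, later timer.
        System.out.println(registry.register(maxTimestamp + 500L));
    }
}
```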
WindowOperator also has a subclass, EvictingWindowOperator, which adds support for an evictor on top of the regular window operator (see the small dashed rectangle inside the large dashed box). What makes EvictingWindowOperator special is mainly its implementation of fire: before the window is evaluated, the elements that meet the eviction condition are filtered out, as the following code shows:
    private void fire(W window, Iterable<StreamRecord<IN>> contents) throws Exception {
        timestampedCollector.setAbsoluteTimestamp(window.maxTimestamp());
        // Compute the number of elements to evict
        int toEvict = evictor.evict((Iterable) contents, Iterables.size(contents), context.window);
        FluentIterable<IN> projectedContents = FluentIterable
                .from(contents)
                .skip(toEvict)
                .transform(new Function<StreamRecord<IN>, IN>() {
                    @Override
                    public IN apply(StreamRecord<IN> input) {
                        return input.getValue();
                    }
                });
        userFunction.apply(context.key, context.window, projectedContents, timestampedCollector);
    }
Before the final call to the window function's apply method, the number of elements to evict is computed, and that many consecutive elements, counted from the first element, are skipped (as mentioned earlier in our analysis of window element eviction).
This code uses the FluentIterable helper class from the Guava library, which extends the Iterable interface and provides a rich fluent API.
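The skip-then-project step can also be written without Guava. The sketch below swaps FluentIterable for java.util.stream and uses a count-based policy (keep at most maxCount of the newest elements) as a stand-in evictor; the class and method names are hypothetical and this is not the operator's actual code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Eviction as "skip the first N elements", using streams instead of Guava. */
final class EvictSketch {

    /** Stand-in evictor: how many of the oldest elements must go to keep maxCount. */
    static int toEvict(int size, int maxCount) {
        return Math.max(0, size - maxCount);
    }

    /** Drops the first toEvict elements and keeps the rest in order. */
    static <T> List<T> afterEviction(List<T> contents, int toEvict) {
        return contents.stream().skip(toEvict).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> contents = Arrays.asList(1, 2, 3, 4, 5);
        int evict = toEvict(contents.size(), 3);
        // Only the surviving elements would be handed to the window function.
        System.out.println(afterEviction(contents, evict));
    }
}
```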
Window operator analysis of Flink stream processing