The previous post covered filtering, one of the basic query operations in StreamInsight. This post covers another basic query operation: aggregation.
Test data preparation
To make the queries easy to test, we first prepare a static test data source:
var weatherData = new[]
{
    new { Timestamp = new DateTime(2010, 1, 1, 0, 00, 00, DateTimeKind.Utc), Temperature = -9.0, StationCode = 71395, WindSpeed = 4 },
    new { Timestamp = new DateTime(2010, 1, 1, 0, 30, 00, DateTimeKind.Utc), Temperature = -4.5, StationCode = 71801, WindSpeed = 41 },
    new { Timestamp = new DateTime(2010, 1, 1, 1, 00, 00, DateTimeKind.Utc), Temperature = -8.8, StationCode = 71395, WindSpeed = 6 },
    new { Timestamp = new DateTime(2010, 1, 1, 1, 30, 00, DateTimeKind.Utc), Temperature = -4.4, StationCode = 71801, WindSpeed = 39 },
    new { Timestamp = new DateTime(2010, 1, 1, 2, 00, 00, DateTimeKind.Utc), Temperature = -9.7, StationCode = 71395, WindSpeed = 9 },
    new { Timestamp = new DateTime(2010, 1, 1, 2, 30, 00, DateTimeKind.Utc), Temperature = -4.6, StationCode = 71801, WindSpeed = 59 },
    new { Timestamp = new DateTime(2010, 1, 1, 3, 00, 00, DateTimeKind.Utc), Temperature = -9.6, StationCode = 71395, WindSpeed = 9 },
};
weatherData represents a series of weather readings (timestamp, temperature, weather station code, and wind speed).
Next, we turn weatherData into a stream of point events:
var weatherStream = weatherData.ToPointStream(
    Application,
    t => PointEvent.CreateInsert(t.Timestamp, t),
    AdvanceTimeSettings.IncreasingStartTime);
Basic Aggregation
Question 1: How do we calculate the averages over all events (average temperature and average wind speed) for every three-hour period?
var averageQuery =
    from win in weatherStream.TumblingWindow(
        TimeSpan.FromHours(3),
        HoppingWindowOutputPolicy.ClipToWindowEnd)
    select new
    {
        AverageTemperature = win.Avg(e => e.Temperature),
        AverageWindspeed = win.Avg(e => e.WindSpeed)
    };
Use the following statement to write the result to the LINQPad results window:
(from p in averageQuery.ToPointEnumerable()
 where p.EventKind == EventKind.Insert
 select p).Dump();
The final result contains the following two events:
The first event in the result is the average over the window [2010/1/1 0:00:00, 2010/1/1 3:00:00). Note that time intervals in StreamInsight are closed on the left and open on the right, so this window covers the first six events: AverageTemperature = -(9.0 + 4.5 + 8.8 + 4.4 + 9.7 + 4.6)/6 = -6.83333333 and AverageWindspeed = (4 + 41 + 6 + 39 + 9 + 59)/6 = 26.3333333.
The second event in the result is the average over the window [2010/1/1 3:00:00, 2010/1/1 6:00:00). Only the last input event falls into this window, so AverageTemperature and AverageWindspeed are simply the temperature and wind speed of that event.
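The two averages above can be cross-checked without StreamInsight. The following is a minimal sketch in plain LINQ to Objects, not the StreamInsight API: the class name TumblingCheck and the method Averages are hypothetical, and the sketch assumes that windows are aligned to multiples of the window size counted from DateTime tick zero, which matches the [0:00, 3:00) and [3:00, 6:00) windows described above.

```csharp
using System;
using System.Linq;

// Hypothetical helper, for illustration only: emulates a tumbling window with GroupBy.
static class TumblingCheck
{
    // Same readings as weatherData, reduced to the fields the averages need.
    static readonly (DateTime Timestamp, double Temperature, int WindSpeed)[] Readings =
    {
        (new DateTime(2010, 1, 1, 0, 0, 0, DateTimeKind.Utc), -9.0, 4),
        (new DateTime(2010, 1, 1, 0, 30, 0, DateTimeKind.Utc), -4.5, 41),
        (new DateTime(2010, 1, 1, 1, 0, 0, DateTimeKind.Utc), -8.8, 6),
        (new DateTime(2010, 1, 1, 1, 30, 0, DateTimeKind.Utc), -4.4, 39),
        (new DateTime(2010, 1, 1, 2, 0, 0, DateTimeKind.Utc), -9.7, 9),
        (new DateTime(2010, 1, 1, 2, 30, 0, DateTimeKind.Utc), -4.6, 59),
        (new DateTime(2010, 1, 1, 3, 0, 0, DateTimeKind.Utc), -9.6, 9),
    };

    // Puts each reading into the bucket [start, start + size) and averages every bucket.
    // Assumption: bucket starts are multiples of size counted from tick zero.
    public static (DateTime WindowStart, double AvgTemperature, double AvgWindSpeed)[] Averages(TimeSpan size)
    {
        return Readings
            .GroupBy(r => new DateTime(r.Timestamp.Ticks - r.Timestamp.Ticks % size.Ticks, DateTimeKind.Utc))
            .OrderBy(g => g.Key)
            .Select(g => (WindowStart: g.Key,
                          AvgTemperature: g.Average(r => r.Temperature),
                          AvgWindSpeed: g.Average(r => r.WindSpeed)))
            .ToArray();
    }
}
```

In LINQPad, TumblingCheck.Averages(TimeSpan.FromHours(3)).Dump(); lists one row per non-empty three-hour bucket, which should line up with the two events discussed above.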
Question 2: How do we calculate, every hour, the average of all events in the past three hours?
var averageQuery2 =
    from win in weatherStream.HoppingWindow(
        TimeSpan.FromHours(3),
        TimeSpan.FromHours(1),
        HoppingWindowOutputPolicy.ClipToWindowEnd)
    select new
    {
        AverageTemperature = win.Avg(e => e.Temperature),
        AverageWindspeed = win.Avg(e => e.WindSpeed)
    };
Unlike the query in question 1, this query uses HoppingWindow, with a window size of three hours and a hop size of one hour, to express "the past three hours, evaluated every hour". The result is as follows:
Note that the first window does not start at 0:00:00. Hopping windows are laid out every hour along the timeline, so the two earlier windows [2009/12/31 22:00:00, 2010/1/1 1:00:00) and [2009/12/31 23:00:00, 2010/1/1 2:00:00) also cover some of the input events. The output therefore contains results for these two windows as well.
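To see why windows starting before 0:00:00 show up, the hopping layout can be enumerated by hand. The following is a rough sketch in plain C#, not the StreamInsight API: HoppingCheck, Windows and Floor are hypothetical names, and the sketch assumes window starts are aligned to multiples of the hop size counted from tick zero.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only: lists the hopping windows that contain events.
static class HoppingCheck
{
    // Same readings as weatherData, reduced to the fields the averages need.
    static readonly (DateTime Timestamp, double Temperature, int WindSpeed)[] Readings =
    {
        (new DateTime(2010, 1, 1, 0, 0, 0, DateTimeKind.Utc), -9.0, 4),
        (new DateTime(2010, 1, 1, 0, 30, 0, DateTimeKind.Utc), -4.5, 41),
        (new DateTime(2010, 1, 1, 1, 0, 0, DateTimeKind.Utc), -8.8, 6),
        (new DateTime(2010, 1, 1, 1, 30, 0, DateTimeKind.Utc), -4.4, 39),
        (new DateTime(2010, 1, 1, 2, 0, 0, DateTimeKind.Utc), -9.7, 9),
        (new DateTime(2010, 1, 1, 2, 30, 0, DateTimeKind.Utc), -4.6, 59),
        (new DateTime(2010, 1, 1, 3, 0, 0, DateTimeKind.Utc), -9.6, 9),
    };

    // Rounds ticks down to a multiple of step.
    static long Floor(long ticks, long step) => ticks - ticks % step;

    // Walks every window [start, start + size) whose start is a multiple of hop
    // and keeps the ones that contain at least one reading.
    public static List<(DateTime Start, DateTime End, int Count, double AvgTemperature, double AvgWindSpeed)>
        Windows(TimeSpan size, TimeSpan hop)
    {
        var result = new List<(DateTime, DateTime, int, double, double)>();
        long first = Floor(Readings.Min(r => r.Timestamp.Ticks) - size.Ticks, hop.Ticks);
        long last = Floor(Readings.Max(r => r.Timestamp.Ticks), hop.Ticks);
        for (long start = first; start <= last; start += hop.Ticks)
        {
            long end = start + size.Ticks;
            var inWindow = Readings
                .Where(r => r.Timestamp.Ticks >= start && r.Timestamp.Ticks < end)
                .ToList();
            if (inWindow.Count == 0) continue; // empty windows yield no result row
            result.Add((new DateTime(start, DateTimeKind.Utc),
                        new DateTime(end, DateTimeKind.Utc),
                        inWindow.Count,
                        inWindow.Average(r => r.Temperature),
                        inWindow.Average(r => r.WindSpeed)));
        }
        return result;
    }
}
```

Calling HoppingCheck.Windows(TimeSpan.FromHours(3), TimeSpan.FromHours(1)) in this sketch returns windows starting at 2009/12/31 22:00:00 and 23:00:00 before the one starting at 2010/1/1 0:00:00, because both already contain the earliest readings.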
Question 3: How do we calculate the average over the past hour whenever a new event arrives?
var averageQuery3 =
    from win in weatherStream
        .AlterEventDuration(e => TimeSpan.FromHours(1))
        .SnapshotWindow(SnapshotWindowOutputPolicy.Clip)
    select new
    {
        AverageTemperature = win.Avg(e => e.Temperature),
        AverageWindspeed = win.Avg(e => e.WindSpeed),
        EventCount = win.Count()
    };
A snapshot window is bounded by event start and end times, so SnapshotWindow produces a new result whenever the set of current events changes, for example when a new event arrives. AlterEventDuration stretches each point event into a one-hour interval, so events that lie less than an hour apart overlap and can be averaged together.
The output result is as follows:
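As a rough cross-check of how the snapshots are cut, the same idea can be sketched in plain C#. This is not the StreamInsight API: SnapshotCheck and Snapshots are hypothetical names, and the sketch simply assumes that every event start or end time is a snapshot boundary and that each event lives on [Timestamp, Timestamp + duration).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only: cuts the timeline at every event edge.
static class SnapshotCheck
{
    // Same readings as weatherData, reduced to the fields the averages need.
    static readonly (DateTime Timestamp, double Temperature, int WindSpeed)[] Readings =
    {
        (new DateTime(2010, 1, 1, 0, 0, 0, DateTimeKind.Utc), -9.0, 4),
        (new DateTime(2010, 1, 1, 0, 30, 0, DateTimeKind.Utc), -4.5, 41),
        (new DateTime(2010, 1, 1, 1, 0, 0, DateTimeKind.Utc), -8.8, 6),
        (new DateTime(2010, 1, 1, 1, 30, 0, DateTimeKind.Utc), -4.4, 39),
        (new DateTime(2010, 1, 1, 2, 0, 0, DateTimeKind.Utc), -9.7, 9),
        (new DateTime(2010, 1, 1, 2, 30, 0, DateTimeKind.Utc), -4.6, 59),
        (new DateTime(2010, 1, 1, 3, 0, 0, DateTimeKind.Utc), -9.6, 9),
    };

    public static List<(DateTime Start, DateTime End, int EventCount, double AvgTemperature, double AvgWindSpeed)>
        Snapshots(TimeSpan duration)
    {
        // Stretch each point reading into the interval [Start, End).
        var intervals = Readings
            .Select(r => (Start: r.Timestamp, End: r.Timestamp + duration, Reading: r))
            .ToList();

        // Every start or end time is a boundary between two snapshots.
        var edges = intervals
            .SelectMany(i => new[] { i.Start, i.End })
            .Distinct()
            .OrderBy(t => t)
            .ToList();

        var result = new List<(DateTime, DateTime, int, double, double)>();
        for (int k = 0; k + 1 < edges.Count; k++)
        {
            DateTime a = edges[k], b = edges[k + 1];
            // Intervals that are alive during the whole snapshot [a, b).
            var live = intervals.Where(i => i.Start <= a && a < i.End).ToList();
            if (live.Count == 0) continue; // gaps without events yield no result row
            result.Add((a, b, live.Count,
                        live.Average(i => i.Reading.Temperature),
                        live.Average(i => i.Reading.WindSpeed)));
        }
        return result;
    }
}
```

With the readings spaced 30 minutes apart and a one-hour duration, SnapshotCheck.Snapshots(TimeSpan.FromHours(1)) in this sketch yields half-hour snapshots: the first and last hold a single reading, and every snapshot in between holds two.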
Question 4: How do we count the events in each one-hour period?
This is similar to question 1, except that the aggregate function is Count rather than Avg.
var countQuery =
    from win in weatherStream.TumblingWindow(
        TimeSpan.FromHours(1),
        HoppingWindowOutputPolicy.ClipToWindowEnd)
    select new { EventCount = win.Count() };
The result is as follows:
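The hourly counts can also be cross-checked with a plain LINQ GroupBy. This is a minimal sketch outside StreamInsight: CountCheck and CountsPerHour are hypothetical names, and the sketch assumes one-hour windows that start on the full hour.

```csharp
using System;
using System.Linq;

// Hypothetical helper, for illustration only: counts readings per clock hour.
static class CountCheck
{
    // Timestamps of the seven readings in weatherData.
    static readonly DateTime[] Timestamps =
    {
        new DateTime(2010, 1, 1, 0, 0, 0, DateTimeKind.Utc),
        new DateTime(2010, 1, 1, 0, 30, 0, DateTimeKind.Utc),
        new DateTime(2010, 1, 1, 1, 0, 0, DateTimeKind.Utc),
        new DateTime(2010, 1, 1, 1, 30, 0, DateTimeKind.Utc),
        new DateTime(2010, 1, 1, 2, 0, 0, DateTimeKind.Utc),
        new DateTime(2010, 1, 1, 2, 30, 0, DateTimeKind.Utc),
        new DateTime(2010, 1, 1, 3, 0, 0, DateTimeKind.Utc),
    };

    // Truncates every timestamp to its hour and counts the members of each group.
    public static (DateTime WindowStart, int EventCount)[] CountsPerHour()
    {
        return Timestamps
            .GroupBy(t => new DateTime(t.Year, t.Month, t.Day, t.Hour, 0, 0, DateTimeKind.Utc))
            .OrderBy(g => g.Key)
            .Select(g => (WindowStart: g.Key, EventCount: g.Count()))
            .ToArray();
    }
}
```

For the test data this gives two events in each of the hours starting at 0:00, 1:00 and 2:00, and one event in the hour starting at 3:00.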
The next post will cover user-defined aggregates, another of the basic query operations in StreamInsight.