The previous post covered aggregation in StreamInsight's basic query operations. This post describes how to use user-defined (custom) aggregates in StreamInsight queries.
Test data preparation
To make the queries easy to test, we first prepare a static test data source:
var weatherData = new[]
{
    new { Timestamp = new DateTime(2010, 1, 1, 0, 00, 00, DateTimeKind.Utc), Temperature = -9.0, StationCode = 71395, WindSpeed = 4 },
    new { Timestamp = new DateTime(2010, 1, 1, 0, 30, 00, DateTimeKind.Utc), Temperature = -4.5, StationCode = 71801, WindSpeed = 41 },
    new { Timestamp = new DateTime(2010, 1, 1, 1, 00, 00, DateTimeKind.Utc), Temperature = -8.8, StationCode = 71395, WindSpeed = 6 },
    new { Timestamp = new DateTime(2010, 1, 1, 1, 30, 00, DateTimeKind.Utc), Temperature = -4.4, StationCode = 71801, WindSpeed = 39 },
    new { Timestamp = new DateTime(2010, 1, 1, 2, 00, 00, DateTimeKind.Utc), Temperature = -9.7, StationCode = 71395, WindSpeed = 9 },
    new { Timestamp = new DateTime(2010, 1, 1, 2, 30, 00, DateTimeKind.Utc), Temperature = -4.6, StationCode = 71801, WindSpeed = 59 },
    new { Timestamp = new DateTime(2010, 1, 1, 3, 00, 00, DateTimeKind.Utc), Temperature = -9.6, StationCode = 71395, WindSpeed = 9 },
};
weatherData represents a series of weather readings (timestamp, temperature, weather station code, and wind speed).
Next, we turn weatherData into a stream of point events:
var weatherStream = weatherData.ToPointStream(
    Application,
    t => PointEvent.CreateInsert(t.Timestamp, t),
    AdvanceTimeSettings.IncreasingStartTime);
User-Defined Aggregation
Question 1: how do we calculate the average over the most recent five events each time a new event arrives?
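Before turning to StreamInsight, the intended semantics can be sketched in plain LINQ to Objects. This is only an illustration under simplifying assumptions: CountWindowSketch and SlidingAverages are hypothetical names, not part of StreamInsight, and the sketch assumes every event has a distinct timestamp, so that a window of five start times is simply five consecutive events.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of StreamInsight: emulates a count window
// over an in-memory list whose items all have distinct timestamps.
public static class CountWindowSketch
{
    // Returns the average of every run of windowSize consecutive values.
    public static List<double> SlidingAverages(IReadOnlyList<double> values, int windowSize)
    {
        if (windowSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(windowSize));

        var result = new List<double>();
        // One window per start position that still has windowSize items left.
        for (int start = 0; start + windowSize <= values.Count; start++)
        {
            result.Add(values.Skip(start).Take(windowSize).Average());
        }
        return result;
    }
}
```

Called with the seven temperatures from the test data and a window size of 5, this yields three averages (approximately -7.28, -6.40 and -7.42), one per window of five consecutive events.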
To implement the semantics in question 1, we need to use CountByStartTimeWindow. A possible syntax is as follows:
var averageQuery = from win in weatherStream
                       .CountByStartTimeWindow(5, CountWindowOutputPolicy.PointAlignToWindowEnd)
                   select new
                   {
                       AverageTemperature = win.Avg(e => e.Temperature),
                       AverageWindspeed = win.Avg(e => e.WindSpeed)
                   };
(from p in averageQuery.ToIntervalEnumerable()
 where p.EventKind == EventKind.Insert
 select p).Dump();
However, when we run the above program, we get the following runtime exception:
Unhandled Exception: Microsoft.ComplexEventProcessing.InvalidDefinitionException: Microsoft.ComplexEventProcessing.Compiler.CompilerException: The window-based operator 'Aggregate.1.1' contains one or more aggregates that are not supported in count-based windows.
The error message says that count windows do not support the built-in aggregate functions used here. Since the built-in aggregates cannot be used, we have to supply our own, and this is where StreamInsight's user-defined aggregates come in.
To define a custom aggregate in StreamInsight, you derive your own aggregate class from the CepAggregate class (time-insensitive) or the CepTimeSensitiveAggregate class (time-sensitive). In this example, we use CepAggregate, the time-insensitive variant:
// SimpleAggregate is a user-defined aggregate that returns the average of its input values.
public class SimpleAggregate : CepAggregate<double, double>
{
    public override double GenerateOutput(IEnumerable<double> payloads)
    {
        return payloads.Average();
    }
}
Although SimpleAggregate is now defined, we cannot use it in a query yet. We must also provide a LINQ extension method that exposes the user-defined aggregate to query writers, as shown below:
// Extension method UserDefinedAggregate: makes the custom aggregate
// (SimpleAggregate.GenerateOutput) available to LINQ queries.
public static class UdaExtension
{
    [CepUserDefinedAggregate(typeof(SimpleAggregate))]
    public static double UserDefinedAggregate<TInput>(
        this CepWindow<TInput> window,
        Expression<Func<TInput, double>> map)
    {
        // Never executed: the method only serves as a signature for the query compiler.
        throw CepUtility.DoNotCall();
    }
}
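The extension method only provides a signature: it lets the query writer refer to the user-defined aggregate and lets the query compile, but its body is never executed. With it in place, the win.Avg(...) calls in the query above can presumably be replaced by win.UserDefinedAggregate(...) calls with the same lambda expressions.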
The final result contains the following three events:
The next post will cover grouping and aggregation in StreamInsight's basic query operations.