Link: http://blogs.msdn.com/b/masimms/archive/2010/09/27/streaminsight-synchronizing-slow-moving-reference-streams-with-fast-moving-data-streams-time-import.aspx
A common task in StreamInsight is using a reference stream to integrate metadata or reference data from a relatively static data source (such as a SQL Server table). The difficulty in integrating a reference stream is dealing with liveliness, that is:
In this figure, two streams (a data stream and a reference stream) are joined to create a joined stream. Expressed in LINQ syntax, this looks like:
CepStream<SensorReading> dataStream = ...;
CepStream<SensorMetadata> metadataStream = ...;

var joinedQuery = from e1 in dataStream
                  join e2 in metadataStream
                  on e1.SensorId equals e2.SensorId
                  select e1;
We can see that there are only two output events. This is because the StreamInsight engine works in application time (which is advanced by CTIs). In this example, the StreamInsight engine sees the following:
- The data stream is at time T5.
- The reference stream is at time T0.
- Because the two streams are joined, output can be generated only for the period in which all events from both streams have been received. Since the data stream is joined to the reference stream, events can be output only at the pace of the reference stream.
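The gating effect described in this list can be modeled outside StreamInsight. The following is a minimal sketch in plain C#, not StreamInsight API: `JoinWatermark.OutputUpTo` is a hypothetical helper that represents each input only by the application time of its latest CTI and lets the join finalize output up to the earlier of the two.

```csharp
using System;

// Toy model, NOT the StreamInsight API: each input stream is represented
// only by the application time of its most recent CTI.
public static class JoinWatermark
{
    // A join can finalize output only up to the point in application time
    // that BOTH inputs have declared complete, i.e. the earlier CTI.
    public static DateTime OutputUpTo(DateTime dataCti, DateTime referenceCti)
    {
        return dataCti < referenceCti ? dataCti : referenceCti;
    }
}
```

With the data stream at T5 and the reference stream at T0, `JoinWatermark.OutputUpTo(t5, t0)` returns T0: nothing after T0 can be finalized until the reference stream's CTI advances.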
In general, this behavior is correct. But sometimes what we want differs from what the StreamInsight engine does: for example, when the reference stream changes slowly and we don't want to keep waiting for it. That is, we want results to be produced at the pace of the data stream. So what should we do?
Import the CTIs from the data stream into the reference stream:
You can do this with an overload of the CepStream<>.Create() method. The code below uses one CSV file as the sample data stream and another CSV file as the sample reference stream. You can download the entire project from here.
////////////////////////////////////////////////////////////////////
// Create a time import setting that specifies that the stream will
// import its CTIs from dataStream
//
var timeImportSettings = new AdvanceTimeSettings(null,
    new AdvanceTimeImportSettings("dataStream"),
    AdvanceTimePolicy.Adjust);

////////////////////////////////////////////////////////////////////
// Create a reference data stream from the refStream.csv file; use the
// CTIs imported from dataStream, as defined in timeImportSettings
//
CepStream<SensorMetadata> metadataStream = CepStream<SensorMetadata>.Create(
    "refStream", typeof(TextFileReaderFactory),
    new TextFileReaderConfig()
    {
        CtiFrequency = 1,
        CultureName = CultureInfo.CurrentCulture.Name,
        Delimiter = ',',
        InputFileName = "refStream.csv"
    },
    EventShape.Point, timeImportSettings);
Note that the preceding syntax assumes that a stream named "dataStream" exists. Now, if we join the data stream and the reference stream, we get a steady output stream. However, if we simply look at the raw output of the metadata stream:
var rawData = metadataStream.ToQuery(cepApp, "metadataStream", "",
    typeof(TracerFactory), traceConfig,
    EventShape.Interval, StreamEventOrder.FullyOrdered);
The following error is returned:
Error in query: Microsoft.ComplexEventProcessing.ManagementException: Advance time import stream 'dataStream' does not exist. ---> Microsoft.ComplexEventProcessing.Compiler.CompilerException: Advance time import stream 'dataStream' does not exist.
Why? What does it mean that the import stream does not exist? I have already defined it! The reason is that the import stream has not been physically connected to another stream. To solve this problem, you need to join the two streams before binding the output adapter.
////////////////////////////////////////////////////////////////////
// Create a join between the two streams and bind the result to the console
//
var joinedQuery = from e1 in dataStream
                  join e2 in metadataStream
                  on e1.SensorId equals e2.SensorId
                  select e1;

var query = joinedQuery.ToQuery(cepApp, "joinedOutput", "",
    typeof(TracerFactory), traceConfig,
    EventShape.Interval, StreamEventOrder.FullyOrdered);
Now let's look at the output:
ref, interval from 06/25/2009 00:00:00 +00:00 to 06/25/2009 00:00:00 +00:00:, mysensor_1001, 1001, 14
ref: CTI at 06/25/2009 00:00:00 +00:00
ref: CTI at 06/25/2009 00:00:09 +00:00
ref: CTI at 12/31/9999 23:59:59 +
Why? Where is the rest of the output? Note that the reference data is only a sequence of point events. To use it as a reference stream, we need to transform this series of point events into edge events. You can use the AlterEventDuration and ClipEventDuration operators to do this:
// Convert the point events in the reference stream to edge events
var edgeEvents = from e in metadataStream
                     .AlterEventDuration(e => TimeSpan.MaxValue)
                     .ClipEventDuration(metadataStream, (e1, e2) => (e1.SensorId == e2.SensorId))
                 select e;
This code does the following:
- Extends the duration of each point event to infinity.
- Clips an event when a later event arrives with the same SensorId. For example, given the value (1001, sensorid_1001), if another event with the value (1001, mysensor) arrives later, the initial event is clipped and the current value becomes mysensor.
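The two steps in this list can be illustrated without StreamInsight. The following is a rough sketch in plain C# LINQ, not the engine's operators: `RefPoint`, `RefInterval`, and `ReferenceShaping.ToIntervals` are hypothetical names, `DateTime.MaxValue` stands in for the infinite duration, and the sketch assumes that no two events for the same sensor share a timestamp.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model, NOT the StreamInsight API.
public sealed class RefPoint
{
    public DateTime Time { get; set; }
    public int SensorId { get; set; }
    public string Name { get; set; }
}

public sealed class RefInterval
{
    public DateTime Start { get; set; }   // inclusive
    public DateTime End { get; set; }     // exclusive
    public int SensorId { get; set; }
    public string Name { get; set; }
}

public static class ReferenceShaping
{
    // Turns point events into intervals: each event lasts "forever"
    // (DateTime.MaxValue) unless a later event with the same SensorId
    // arrives, in which case it ends where that later event starts.
    public static List<RefInterval> ToIntervals(IEnumerable<RefPoint> points)
    {
        return points
            .GroupBy(p => p.SensorId)
            .SelectMany(g =>
            {
                var ordered = g.OrderBy(p => p.Time).ToList();
                return ordered.Select((p, i) => new RefInterval
                {
                    Start = p.Time,
                    End = i + 1 < ordered.Count ? ordered[i + 1].Time : DateTime.MaxValue,
                    SensorId = p.SensorId,
                    Name = p.Name
                });
            })
            .OrderBy(r => r.Start)
            .ThenBy(r => r.SensorId)
            .ToList();
    }
}
```

For the example in the list, two points for sensor 1001 (first sensorid_1001, later mysensor) become two intervals: the first ends exactly where the second starts, and the second stays open until `DateTime.MaxValue`.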
Putting it all together:
// Convert the point events in the reference stream to edge events
var edgeEvents = from e in metadataStream
                     .AlterEventDuration(e => TimeSpan.MaxValue)
                     .ClipEventDuration(metadataStream, (e1, e2) => (e1.SensorId == e2.SensorId))
                 select e;

////////////////////////////////////////////////////////////////////
// Create a join between the two streams and bind the result to the console
//
var joinedQuery = from e1 in dataStream
                  join e2 in edgeEvents
                  on e1.SensorId equals e2.SensorId
                  select new
                  {
                      SensorId = e1.SensorId,
                      Name = e2.Name,
                      Value = e1.Value
                  };
The final result is as follows:
ref, interval, 12:00:00.000, 12:00:00.000, 1001, mysensor_1001, 14
ref: CTI at 06/25/2009 00:00:00 +00:00
ref, interval, 12:00:01.000, 12:00:01.000, 1001, mysensor_1001, 4
ref, interval, 12:00:02.000, 12:00:02.000, 1001, mysensor_1001, 77
ref, interval, 12:00:03.000, 12:00:03.000, 1001, mysensor_1001, 44
ref, interval, 12:00:04.000, 12:00:04.000, 1001, mysensor_1001, 22
ref, interval, 12:00:05.000, 12:00:05.000, 1001, mysensor_1001, 51
ref, interval, 12:00:06.000, 12:00:06.000, 1001, mysensor_1001, 46
ref, interval, 12:00:07.000, 12:00:07.000, 1001, mysensor_1001, 71
ref, interval, 12:00:08.000, 12:00:08.000, 1001, mysensor_1001, 37
ref, interval, 12:00:09.000, 12:00:09.000, 1001, mysensor_1001, 45
ref: CTI at 12/31/9999 23:59:59 +
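Why the join now emits one result per reading can be checked with a small model. This is a sketch in plain C# LINQ under simplifying assumptions, not StreamInsight itself: `Reading`, `MetadataInterval`, and `TemporalJoinModel.Join` are hypothetical names, and a reading is treated as matching a metadata event only when the SensorId is equal and the reading's timestamp falls inside the metadata event's lifetime.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model, NOT the StreamInsight API.
public sealed class Reading
{
    public DateTime Time { get; set; }
    public int SensorId { get; set; }
    public int Value { get; set; }
}

public sealed class MetadataInterval
{
    public DateTime Start { get; set; }   // inclusive
    public DateTime End { get; set; }     // exclusive
    public int SensorId { get; set; }
    public string Name { get; set; }
}

public static class TemporalJoinModel
{
    // Pairs each reading with the metadata interval that has the same
    // SensorId and whose lifetime [Start, End) contains the reading's time.
    public static List<string> Join(
        IEnumerable<Reading> readings,
        IEnumerable<MetadataInterval> metadata)
    {
        var query =
            from r in readings
            join m in metadata on r.SensorId equals m.SensorId
            where m.Start <= r.Time && r.Time < m.End
            orderby r.Time
            select string.Format("{0}, {1}, {2}", r.SensorId, m.Name, r.Value);
        return query.ToList();
    }
}
```

If the metadata for sensor 1001 is modeled as a point (a lifetime of one tick starting at the first reading), only the first reading matches, which mirrors the single-event output seen earlier. If its lifetime is extended to `DateTime.MaxValue`, every reading for sensor 1001 finds a match.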