Basic StreamInsight concepts

Source: Internet
Author: User

Microsoft StreamInsight is a platform for developing and deploying applications that process temporal event streams. StreamInsight includes a temporal data stream model that provides a unified set of query language capabilities. It processes events and guarantees consistent output. Thanks to its real-time, low-latency output, StreamInsight can monitor, analyze, and correlate data streams from multiple sources to extract meaningful patterns and trends.

Traditional database technology has matured rapidly and is widely used. However, it is poorly suited to a newer kind of data produced by applications such as network routing, sensor networks, and stock analysis: stream data. Stream data arrives continuously, at high speed, and in large volumes. Sensor readings and similar information are generated as data sequences (streams) and must be processed continuously in real time. This calls for a new kind of system, the Data Stream Management System (DSMS), to store, manage, and process such data.

Most applications on data streams are monitoring applications, and their monitoring tasks are generally combinations of simple events. A basic DSMS, however, can only filter and aggregate simple events; it does not support the concept of complex events. A Complex Event Processing (CEP) system therefore has to be built on top of the data stream to meet these requirements.

Because domain expertise is the result of years of experience within a specific line of business, a single stream engine used out of the box, with no additional development, can hardly be expected to address the needs of every domain. Instead, a data stream system is expected to provide an extensibility mechanism that integrates domain-specific logic seamlessly into the query pipeline. For this reason, StreamInsight is designed as an extensible system that accepts and executes user-defined modules (UDMs).

Streams and events

A physical stream is a sequence of events. An event e_i = ⟨p, c⟩ is a notification from the outside world that consists of (1) a payload p = ⟨p_1, ..., p_k⟩ and (2) a control parameter c that provides metadata about the event. The control parameter captures the time at which the event was generated and the period during which the event can affect the output. It is defined as c = ⟨LE, RE⟩, where the half-open interval [LE, RE) is the event's lifetime. The left endpoint (LE) of the interval is called the start time; it is the time at which the event occurs and also serves as the event's timestamp. The right endpoint (RE) is called the end time. If the event lasts for x time units, then RE = LE + x.
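The event definition above can be sketched in a few lines of Python. This is only an illustration of the model, not StreamInsight's API: the `Event` class, the `make_event` helper, and the sample payload are hypothetical names chosen for this example, and times are plain integers standing in for application time.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Event:
    """An event e = <p, c> with payload p and control parameter c = <LE, RE>."""
    payload: Tuple  # p = <p_1, ..., p_k>
    le: int         # left endpoint: start time, also the event timestamp
    re: int         # right endpoint: end time (exclusive)

    def is_alive_at(self, t: int) -> bool:
        # The lifetime is the half-open interval [LE, RE).
        return self.le <= t < self.re


def make_event(payload: Tuple, le: int, duration: int) -> Event:
    # RE = LE + x, where x is the duration in time units.
    return Event(payload=payload, le=le, re=le + duration)


e = make_event(("sensor-1", 21.5), le=10, duration=5)
print(e.re)               # 15
print(e.is_alive_at(14))  # True
print(e.is_alive_at(15))  # False, because RE is exclusive
```

Treating RE as exclusive means that two back-to-back events, [10, 15) and [15, 20), never overlap.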

Compensations: StreamInsight allows compensations (or corrections) for events that were reported earlier. A compensation uses an optional third control parameter, RE_new, which gives the new right endpoint of the affected event. Deleting an event (a full retraction) is expressed as RE_new = LE (i.e., zero lifetime).
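A minimal sketch of how a compensation could be applied to a single event, assuming integer timestamps. The `Event` class and the `apply_retraction` function are hypothetical and exist only to illustrate the RE_new rule described above; they are not part of StreamInsight.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Event:
    payload: str
    le: int  # left endpoint (start time)
    re: int  # right endpoint (end time, exclusive)


def apply_retraction(event: Event, re_new: int) -> Optional[Event]:
    """Return the event with its right endpoint moved to re_new.

    A full retraction (re_new == LE) leaves a zero lifetime, so the
    event is removed and None is returned.
    """
    if re_new < event.le:
        raise ValueError("RE_new must not be earlier than LE")
    if re_new == event.le:
        return None
    return replace(event, re=re_new)


original = Event("A", le=1, re=10)
shortened = apply_retraction(original, re_new=5)  # lifetime becomes [1, 5)
deleted = apply_retraction(original, re_new=1)    # full retraction
print(shortened)  # Event(payload='A', le=1, re=5)
print(deleted)    # None
```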

Canonical History Table: The Canonical History Table (CHT) is the logical representation of an event stream. Each CHT entry consists of a lifetime (LE and RE) and a payload. All times are application times, not system times. The StreamInsight model therefore treats a data stream as a time-varying relation. A CHT is derived from the actual physical events (whether new insertions or retractions), whose control parameter is c = ⟨LE, RE, RE_new⟩.
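The derivation of a CHT from physical events can be illustrated with a small fold over the stream. This is a sketch under assumptions: the `event_id` field used to match a retraction to its earlier insertion, the `PhysicalEvent` and `ChtRow` types, and the sample data are all made up for this example and do not come from the source or from StreamInsight.

```python
from typing import Dict, List, NamedTuple, Optional


class PhysicalEvent(NamedTuple):
    event_id: str          # hypothetical key for matching retractions
    le: int
    re: int
    re_new: Optional[int]  # None for an insertion, set for a retraction
    payload: str


class ChtRow(NamedTuple):
    event_id: str
    le: int
    re: int
    payload: str


def build_cht(stream: List[PhysicalEvent]) -> List[ChtRow]:
    rows: Dict[str, ChtRow] = {}
    for ev in stream:
        if ev.re_new is None:
            # New insertion: add a row with lifetime [LE, RE).
            rows[ev.event_id] = ChtRow(ev.event_id, ev.le, ev.re, ev.payload)
        elif ev.event_id not in rows:
            continue  # retraction for an unknown event: ignored in this sketch
        elif ev.re_new == ev.le:
            del rows[ev.event_id]  # full retraction: zero lifetime
        else:
            rows[ev.event_id] = rows[ev.event_id]._replace(re=ev.re_new)
    return sorted(rows.values(), key=lambda row: (row.le, row.event_id))


stream = [
    PhysicalEvent("e0", 1, 10, None, "P1"),  # insert e0 with [1, 10)
    PhysicalEvent("e1", 4, 9, None, "P2"),   # insert e1 with [4, 9)
    PhysicalEvent("e0", 1, 10, 5, "P1"),     # shorten e0 to [1, 5)
    PhysicalEvent("e2", 6, 8, None, "P3"),   # insert e2 with [6, 8)
    PhysicalEvent("e2", 6, 8, 6, "P3"),      # fully retract e2
]
cht = build_cht(stream)
for row in cht:
    print(row)
# ChtRow(event_id='e0', le=1, re=5, payload='P1')
# ChtRow(event_id='e1', le=4, re=9, payload='P2')
```

Five physical events collapse into two logical rows, which is the sense in which the CHT is a logical rather than physical view of the stream.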

Detecting time progress: The system needs to ensure that events do not arrive arbitrarily out of order. This is implemented with time-based punctuation. A time-based punctuation is a special event that signals the progress of time; in StreamInsight these punctuations are called current time increments (CTIs).
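One way to picture the role of a CTI is as a promise about the rest of the stream. The sketch below assumes the usual reading that, after a CTI with timestamp t, no later event may have a start time earlier than t; the text above does not spell this rule out, so treat it as an assumption. The tuple encoding and the `find_cti_violations` function are hypothetical.

```python
from typing import List, Optional, Tuple

# Each item is ("event", LE) or ("cti", timestamp).
Item = Tuple[str, int]


def find_cti_violations(stream: List[Item]) -> List[Item]:
    """Return the events that start before an already seen CTI."""
    latest_cti: Optional[int] = None
    violations: List[Item] = []
    for kind, t in stream:
        if kind == "cti":
            latest_cti = t if latest_cti is None else max(latest_cti, t)
        elif latest_cti is not None and t < latest_cti:
            violations.append((kind, t))
    return violations


stream = [
    ("event", 3),
    ("event", 1),  # out of order, but no CTI has been seen yet: allowed
    ("cti", 4),    # from here on, no event may start before time 4
    ("event", 5),
    ("event", 2),  # starts before the CTI: violation
]
print(find_cti_violations(stream))  # [('event', 2)]
```

Events may still arrive out of order between punctuations; the CTI only bounds how far back a late event can reach.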

Stream queries and operators

A continuous query (CQ) is composed of a tree of operators. Each operator consumes an input stream and produces an output stream. Queries in StreamInsight are expressed in LINQ. StreamInsight operators are well behaved, and their semantics are defined in terms of the CHT. This makes the underlying temporal operator algebra deterministic, even when data arrives out of order. Events enter the system through input adapters and leave it through output adapters.

There are two main classes of operators: span-based and window-based. A span-based operator takes each event from its input, performs some computation on it, and produces output events. Span-based operators include filter, project, and temporal join. A window-based operator, such as Count, Top-K, or Sum, reports a result (or a set of results) for each distinct window; the result is computed over all events that belong to that window. StreamInsight supports several types of windows: snapshot (sliding), hopping, tumbling, and count-based windows.
In addition, a stream can be fed to multiple operators, which is called multicast, while the Union operator merges multiple streams. StreamInsight also lets users express custom computations through user-defined operators.
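The difference between the two operator classes can be sketched as follows. This is plain Python, not LINQ and not StreamInsight's API: `filter_events` stands in for a span-based filter and `tumbling_count` for a window-based Count over tumbling windows. The sketch assumes that window k covers [k * size, (k + 1) * size) and that an event belongs to every window its lifetime overlaps.

```python
from collections import Counter
from typing import Callable, Dict, List, NamedTuple


class Event(NamedTuple):
    le: int       # start time
    re: int       # end time (exclusive)
    value: float  # payload


def filter_events(events: List[Event],
                  predicate: Callable[[Event], bool]) -> List[Event]:
    # Span-based: each qualifying input event yields one output event
    # with the same lifetime.
    return [e for e in events if predicate(e)]


def tumbling_count(events: List[Event], size: int) -> Dict[int, int]:
    # Window-based: one count per tumbling window, keyed by window start.
    counts: Counter = Counter()
    for e in events:
        first = e.le // size
        last = (e.re - 1) // size  # RE is exclusive
        for k in range(first, last + 1):
            counts[k * size] += 1
    return dict(sorted(counts.items()))


events = [
    Event(0, 1, 5.0),
    Event(3, 4, 12.0),
    Event(4, 7, 20.0),    # overlaps windows [0, 5) and [5, 10)
    Event(11, 12, 30.0),
]
print(tumbling_count(events, size=5))  # {0: 3, 5: 1, 10: 1}

# Composing the two mimics a small operator tree: filter, then count.
large = filter_events(events, lambda e: e.value > 10)
print(tumbling_count(large, size=5))   # {0: 2, 5: 1, 10: 1}
```

A hopping window would differ only in that consecutive windows overlap, so a single point event could be counted in more than one window.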

Http://www.cnblogs.com/StreamInsight/

Http://streaminsight.codeplex.com/
