Flink Basic Principles and Application Scenario Analysis

Source: Internet
Author: User
Tags: data structures, apache flink, hadoop ecosystem

Apache Flink is an open-source, distributed, high-performance, highly available, and accurate stream processing framework. It supports both real-time stream processing and batch processing.


Flink characteristics

  • Supports both batch processing and data stream processing programs
  • Elegant and fluent APIs in both Java and Scala
  • Supports high throughput and low latency at the same time
  • Supports event-time processing and out-of-order processing in the DataStream API, based on the Dataflow model
  • Different time semantics (event time, processing time)
  • Flexible windows (time, count, session, custom triggers)
  • Exactly-once fault-tolerance guarantees
  • Automatic back pressure
  • Graph processing (batch), machine learning (batch), and complex event processing (streaming)
  • Built-in support for iterative programs (BSP) in the DataSet (batch) API
  • Efficient custom memory management, with robust switching between in-memory and out-of-core processing
  • Compatible with Hadoop MapReduce and Storm
  • Integrates with YARN, HDFS, HBase, and other components of the Hadoop ecosystem
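The difference between event time and processing time in the list above can be illustrated with a small sketch. This is plain Python, not Flink's API; the event records, field names, and the 5-second window size are assumptions made up for illustration. It assigns each event to a tumbling window twice, once by when the event happened and once by when the system saw it.

```python
from collections import Counter


def window_start(timestamp_ms, size_ms):
    """Return the start of the tumbling window that contains timestamp_ms."""
    return timestamp_ms - (timestamp_ms % size_ms)


def count_per_window(events, size_ms, time_field):
    """Count events per tumbling window, using the chosen time field."""
    counts = Counter()
    for event in events:
        counts[window_start(event[time_field], size_ms)] += 1
    return dict(counts)


# Hypothetical events: 'event_time' is when the event occurred,
# 'processing_time' is when the system received it (milliseconds).
events = [
    {"event_time": 1_000, "processing_time": 1_200},
    {"event_time": 4_000, "processing_time": 4_100},
    {"event_time": 2_000, "processing_time": 6_500},  # delayed in transit
    {"event_time": 7_000, "processing_time": 7_050},
]

by_event_time = count_per_window(events, 5_000, "event_time")
by_processing_time = count_per_window(events, 5_000, "processing_time")

print(by_event_time)       # {0: 3, 5000: 1}
print(by_processing_time)  # {0: 2, 5000: 2}
```

With event time, the delayed event is still counted in the window in which it actually occurred, so the result does not depend on arrival order. With processing time, the same event shifts into a later window.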


Flink application scenarios

  • Optimizing real-time search results for e-commerce: Alibaba's infrastructure teams use Flink to update product details and inventory information in real time, giving users more relevant results.
  • Real-time stream analytics services for data analysis teams: King provides real-time data analysis through a Flink-powered analytics platform, dramatically reducing the time needed to gain insight from game data.
  • Network/sensor monitoring and error detection: Bouygues Telecom, one of the largest telecommunications providers in France, uses Flink to monitor its wired and wireless networks for fast fault response.
  • ETL for business intelligence analytics: Zalando uses Flink to transform data so that it is easier to load into the data warehouse, turning complex conversion operations into relatively simple ones and ensuring that analytics end users can access data faster.

Based on the above case studies, Flink is ideally suited for:

  • Multiple data sources (sometimes unreliable): When data is generated by millions of different users or devices, it is not safe to assume that it will arrive in the order in which the events occurred. If an upstream system fails, some events may arrive hours late. Late data still has to be taken into account so that the results are accurate.
  • Application state management: Once programs become more complex than simple filtering or data enrichment, managing their state becomes difficult (for example: counters, windows of past data, state machines, embedded databases). Flink provides tools that make state efficient, fault-tolerant, and manageable, so you do not need to build these features yourself.
  • Fast data processing: Real-time or near-real-time use cases require results to be available almost as soon as the data is generated. Flink is fully capable of meeting these latency requirements when necessary.
  • Massive data processing: These programs need to be distributed across many nodes to support the required scale. Flink runs on large clusters as seamlessly as on small ones.
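The late-data and state points above can be sketched in a few lines. Flink addresses them with watermarks and managed window state; the code below is only a conceptual plain-Python illustration under simplifying assumptions, not Flink's API. The class name, the bounded out-of-order tolerance of 2 seconds, and the sample timestamps are all hypothetical. A window is finalized once the watermark (largest event time seen minus the tolerance) passes its end, and events arriving for an already finalized window are recorded as late.

```python
from collections import defaultdict


class EventTimeWindowCounter:
    """Counts events per tumbling event-time window, tolerating bounded disorder."""

    def __init__(self, window_ms, max_out_of_order_ms):
        self.window_ms = window_ms
        self.max_out_of_order_ms = max_out_of_order_ms
        self.open_windows = defaultdict(int)  # window start -> running count (state)
        self.results = {}                     # window start -> final count
        self.dropped = []                     # events that arrived too late
        self.max_seen = None                  # largest event time seen so far

    def watermark(self):
        """Event time up to which the stream is assumed to be complete."""
        return self.max_seen - self.max_out_of_order_ms

    def on_event(self, event_time_ms):
        start = event_time_ms - (event_time_ms % self.window_ms)
        end = start + self.window_ms
        # An event is late if the watermark has already passed its window's end.
        if self.max_seen is not None and end <= self.watermark():
            self.dropped.append(event_time_ms)
            return
        self.open_windows[start] += 1
        if self.max_seen is None or event_time_ms > self.max_seen:
            self.max_seen = event_time_ms
        # Finalize every open window whose end the watermark has reached.
        for s in sorted(self.open_windows):
            if s + self.window_ms <= self.watermark():
                self.results[s] = self.open_windows.pop(s)


counter = EventTimeWindowCounter(window_ms=5_000, max_out_of_order_ms=2_000)
# Arrival order differs from event order; 3_000 arrives after its window closed.
for ts in [1_000, 4_000, 2_000, 6_500, 7_500, 3_000, 12_500]:
    counter.on_event(ts)

print(counter.results)             # {0: 3, 5000: 2}
print(counter.dropped)             # [3000]
print(dict(counter.open_windows))  # {10000: 1}
```

The event at 2_000 arrives out of order but within the tolerance, so it is still counted. The event at 3_000 arrives after its window was finalized and is set aside. A real system would also need to persist `open_windows` so that the counts survive a failure, which is the kind of state management the list above refers to.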





