About Apache Flink

Source: Internet
Author: User
Tags: apache flink

Apache Flink is a scalable, open source platform for batch and stream processing. Its core is a distributed streaming dataflow engine that provides data distribution, communication, and fault tolerance for computations over data streams.

The following APIs are built on top of the engine:
1. The DataSet API for static (bounded) data, embedded in Java, Scala, and Python
2. The DataStream API for unbounded streams, embedded in Java and Scala
3. The Table API, a SQL-like expression language embedded in Java and Scala
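
The article has no sample for the DataStream API, so here is a minimal sketch of a streaming word count. It assumes a Flink 1.x release that still ships the Scala DataStream API (the flink-streaming-scala dependency). The object name, the tokenize helper, and the sample input are hypothetical and exist only for illustration.

```scala
import org.apache.flink.streaming.api.scala._

object StreamingWordCountSketch {
  // Hypothetical helper: lower-case a line and split it on non-word characters.
  def tokenize(line: String): List[String] =
    line.toLowerCase.split("\\W+").filter(_.nonEmpty).toList

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // A small bounded stand-in for an unbounded source such as a socket or message queue.
    val lines: DataStream[String] = env.fromElements("to be or not to be", "that is the question")

    val counts = lines
      .flatMap(line => tokenize(line)) // one record per word
      .map(word => (word, 1))          // pair each word with a count of 1
      .keyBy(_._1)                     // partition the stream by word
      .sum(1)                          // running sum of the count field per word

    counts.print()

    // DataStream programs are lazy; nothing runs until execute() is called.
    env.execute("Streaming word count sketch")
  }
}
```

Unlike a batch job, the keyed sum emits an updated count each time a word arrives, so the same word can appear several times in the output with increasing counts.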

Flink also includes several domain-specific libraries:
1. A machine learning library
2. Gelly, a graph processing API and library

Flink System Overview

Flink offers data processing APIs in Java and Scala, backed by an optimized distributed runtime with custom memory management.

Flink Features

1. Fast. Flink uses in-memory data streaming and integrates iterative processing into its runtime, which makes data-intensive and iterative computations very fast.

2. Reliable and flexible. Flink ships its own memory management, serialization, and type inference components.

3. Elegant and beautiful API design

WordCount Scala Sample

case class Word (word: String, frequency: Int)

// text is a DataSet[String] holding the input lines
val counts = text
  .flatMap { line => line.split(" ").map(word => Word(word, 1)) }
  .groupBy("word")
  .sum("frequency")
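
The sample leaves out where text comes from and how the job is started. A minimal sketch of a complete program could look like the following, assuming a Flink 1.x release that still ships the Scala DataSet API (the flink-scala dependency). The object name, the countWords helper, and the sample input are hypothetical.

```scala
import org.apache.flink.api.scala._

// Same shape as the case class in the sample.
case class Word(word: String, frequency: Int)

object WordCountSketch {
  // The pipeline from the sample, wrapped in a helper so it can be reused.
  def countWords(text: DataSet[String]): DataSet[Word] =
    text
      .flatMap { line => line.split(" ").map(word => Word(word, 1)) } // one Word per token
      .groupBy("word")   // group by the case class field name
      .sum("frequency")  // add up the frequency field within each group

  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Hypothetical in-memory input; a real job would more likely use env.readTextFile(path).
    val text: DataSet[String] = env.fromElements("to be or not to be", "that is the question")

    // In the DataSet API, print() triggers execution and writes the result to stdout.
    countWords(text).print()
  }
}
```

The wildcard import matters: it brings in the implicit type information that the Scala API needs for String and for the Word case class.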

Transitive Closure Scala Sample

case class Path (from: Long, to: Long)

val tc = edges.iterate(10) { paths: DataSet[Path] =>
  val next = paths
    .join(edges).where("to").equalTo("from") {
      (path, edge) => Path(path.from, edge.to)
    }
    .union(paths)
    .distinct()
  next
}
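
To make it easier to see what each iteration computes, here is a plain-Scala sketch of the same logic on in-memory sets, with no Flink dependency. It is an illustration of the semantics, not Flink API code: the object name and the step and closure helpers are hypothetical, and a Scala Set stands in for a DataSet followed by distinct().

```scala
case class Path(from: Long, to: Long)

object TransitiveClosureSketch {
  // One round: extend every known path by one edge (the join on to == from),
  // then merge the result with the paths already known (the union).
  def step(paths: Set[Path], edges: Set[Path]): Set[Path] = {
    val extended = for {
      path <- paths
      edge <- edges
      if path.to == edge.from
    } yield Path(path.from, edge.to)
    extended union paths
  }

  // Apply step a fixed number of times, mirroring iterate(10) in the Flink sample.
  def closure(edges: Set[Path], maxIterations: Int): Set[Path] =
    (1 to maxIterations).foldLeft(edges)((paths, _) => step(paths, edges))

  def main(args: Array[String]): Unit = {
    val edges = Set(Path(1, 2), Path(2, 3), Path(3, 4))
    closure(edges, 10).toSeq.sortBy(p => (p.from, p.to)).foreach(println)
  }
}
```

Each round lengthens the longest discovered path by one edge, so a fixed count of 10 iterations finds every path of up to 11 edges. For graphs with longer paths the iteration count would need to grow.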

4. Compatible with Hadoop. Flink can run on YARN.
