Pinot Architecture Introduction


1. High Level Architecture
1. Purpose: provide analytics services over a given data set
2. Input data: Hadoop & Kafka
3. Indexing technology: to serve fast queries, Pinot uses columnar storage and various indexing techniques (bitmap indexes, inverted indexes)
2. Data Flow
2.1 Hadoop (historical)
1. Input data: AVRO, CSV, JSON, etc.;
2. Process flow: files on HDFS are transformed into indexed segments by MapReduce jobs and then pushed to the historical nodes of the Pinot cluster, where they become queryable;
3. Data expiry: indexed segments carry a configurable retention period and are automatically deleted once the pre-configured expiration date passes;
2.2 Realtime
1. Input data: Kafka stream
2. Process flow: real-time nodes consume data from Kafka, build indexed segments in memory, periodically flush them to disk, and serve queries from them;
3. Data expiry: real-time nodes keep data for a relatively short retention period (e.g., 3 days); before it ages out, real-time data is moved to the historical nodes;
2.3 Query Routing: SELECT COUNT(*) FROM table WHERE time > T is converted into the following two queries:
1. Historical node: SELECT COUNT(*) FROM table WHERE time > T AND time < T1
2. Realtime node: SELECT COUNT(*) FROM table WHERE time > T1
Note:
1. All user queries are sent to the Pinot Broker;
2. Users do not need to care whether a query is served by real-time or historical nodes;
3. The Pinot Broker automatically splits the request based on the query and sends it to real-time and historical nodes as needed;
4. Finally, the partial results are automatically merged.
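The split above can be sketched as follows. This is an illustration only, not Pinot's implementation; `t1` stands for the time boundary between historical and real-time data:

```python
# Sketch: a broker splitting a time-filtered COUNT(*) query across
# historical and real-time nodes at a boundary t1 (illustrative only).

def route_query(t, t1):
    """Return the per-node predicates for `... WHERE time > t`."""
    historical = f"time > {t} AND time < {t1}"
    realtime = f"time > {t1}"
    return historical, realtime

def merged_count(historical_count, realtime_count):
    # The broker merges the partial results; for COUNT(*) the merge is a sum.
    return historical_count + realtime_count
```

For an aggregation like COUNT(*), the merge step is trivial; more complex aggregations require merge functions of their own.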
3. Pinot Components Architecture
Note:
1. The whole system uses Apache Helix for cluster management;
2. Zookeeper stores the cluster state as well as the Helix and Pinot configuration;
3. Pinot uses NFS to push segments generated by MapReduce on HDFS to the Pinot Servers.
3.1 Historical Node
3.1.1 Data Preparation
1. Indexed segments are created in Hadoop
2. The Pinot team provides a library to generate segments
3. The input data format can be Avro, CSV, or JSON
3.1.2 Segment creation on Hadoop
1. The data in HDFS is divided into 256 MB/512 MB shards
2. Each mapper creates a new indexed segment from one shard
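The sharding step can be sketched as below, with the shard size as the only parameter; this is an illustration of the idea, not the MapReduce job's actual input-split logic:

```python
# Sketch: dividing an input file into fixed-size byte ranges, one per
# mapper; each mapper then builds one indexed segment from its range.

SHARD_BYTES = 256 * 1024 * 1024  # 256 MB (512 MB is also used)

def shard_offsets(file_size, shard_bytes=SHARD_BYTES):
    """Yield (start, end) byte ranges covering the whole file."""
    return [(start, min(start + shard_bytes, file_size))
            for start in range(0, file_size, shard_bytes)]
```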
3.1.3 Segment move from HDFS to NFS
1. Segments are read from HDFS and sent to the Pinot Controller node via HTTP POST
2. The Pinot Controller stores the segments on an NFS volume mounted on the controller node
3. The Pinot Controller then assigns each segment to a Pinot Server
4. The assignment information is maintained and managed by Helix
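The push in step 1 is a plain HTTP POST of the segment bytes. The sketch below illustrates the shape of such an upload; the `/segments` path and the header are assumptions for illustration, not Pinot's exact upload API:

```python
# Sketch: uploading a segment tarball to the controller over HTTP POST.
# Endpoint path and headers are invented for illustration.
import urllib.request

def build_push_request(controller_url, segment_bytes):
    """Build the HTTP POST used to upload one segment tarball."""
    return urllib.request.Request(
        url=controller_url + "/segments",
        data=segment_bytes,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )

def push_segment(controller_url, segment_path):
    with open(segment_path, "rb") as f:
        req = build_push_request(controller_url, f.read())
    with urllib.request.urlopen(req) as resp:  # controller writes it to NFS
        return resp.status
```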
3.1.4 Segment move from NFS to historical node
1. Helix monitors the liveness of each Pinot Server
2. When a server starts, Helix notifies it of the segments assigned to it
3. The Pinot Server downloads the segments from the controller and loads them onto its local disk
3.1.5 Segment Loading
1. The extracted segments contain the metadata and the forward and inverted indexes for each column
2. Depending on the load mode (memory, mmap), a segment is either loaded into heap memory or memory-mapped on the server
3. After loading completes, Helix notifies the broker nodes that the segment is available on that server, and brokers then route queries to it
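The difference between the two load modes can be sketched with a single column file; the file layout here is invented, since real Pinot segments have their own binary format:

```python
# Sketch of the two load modes: reading a column file fully into memory
# ("memory"/heap mode) vs. memory-mapping it ("mmap" mode).
import mmap

def load_column(path, mode="mmap"):
    f = open(path, "rb")
    if mode == "memory":
        data = f.read()        # whole column buffered on the heap
        f.close()
        return data
    # mmap: the OS pages data in lazily, so large segments
    # need not fit entirely in RAM
    return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

Heap mode trades memory footprint for predictable access latency; mmap mode lets the OS manage residency, which suits servers hosting more segment data than RAM.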
3.1.6 Segment Expiry
1. The Pinot Controller service has a background cleanup thread that removes expired segments based on their metadata
2. Deleting a segment cleans up its data on the controller's NFS volume and its metadata in Helix
3. Helix notifies the Pinot Server to take the segment offline, and the server then deletes the data from its local disk
Note:
1. hadoop jar pinot-hadoop-0.016.jar SegmentCreation job.properties
2. hadoop jar pinot-hadoop-0.016.jar SegmentTarPush job.properties
3. The segment loading process is an offline-to-online state switch triggered by Helix
3.2 Realtime Node
3.2.1 Kafka Consumption
1. Pinot creates a resource, and Pinot assigns a set of instances to consume data from the Kafka topic
2. If a Pinot Server fails, its consumption is redistributed to other nodes
3.2.2 Segment Creation
1. When a Pinot Server has consumed a pre-configured number of events, it converts the in-memory data into an offline segment
2. When the segment is created successfully, Pinot commits the offset to Kafka; if creation fails, Pinot resumes from the checkpoint of the last successful segment
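The consume-flush-commit cycle can be sketched as below, with a stubbed-out event stream and a made-up flush threshold; real Pinot ties this to its Kafka consumer and configuration:

```python
# Sketch of the real-time flush loop: buffer events, flush a segment once
# the threshold is hit, and only then advance the committed offset.

FLUSH_THRESHOLD = 3  # events per segment; configurable in practice

def consume(events, flush_threshold=FLUSH_THRESHOLD):
    """Group a stream of (offset, event) pairs into in-memory segments.
    Returns (segments, committed_offset); the offset is advanced only
    after a segment is created successfully, so a crash mid-segment
    replays from the last commit."""
    segments, buffer, committed = [], [], None
    for offset, event in events:
        buffer.append(event)
        if len(buffer) >= flush_threshold:
            segments.append(list(buffer))   # "flush" the segment to disk
            committed = offset              # commit offset to Kafka
            buffer.clear()
    return segments, committed
```

Because the offset only moves after a successful flush, events in a partially built segment are re-consumed after a failure rather than lost.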
3.2.3 Segment Expiry
1. Retention can only be configured in days; after expiration, segments move from the real-time node to the historical node
Note: the segment format generated by real-time nodes is identical to the one generated for historical nodes, which makes it easy to relocate segments from real-time nodes to historical nodes.
3.3 Pinot Cluster Management
3.3.1 Overview
1. All administrative commands go through the Pinot Controller, e.g., allocating Pinot Servers and Brokers, creating new tables, and uploading new segments
2. All Pinot admin commands are internally translated via the Helix Admin API into Helix commands, which in turn modify the metadata in Zookeeper
3. The Helix Controller, as the brain of the system, translates metadata changes into a set of actions and executes them on the corresponding participants
4. The Helix Controller also monitors the Pinot Servers; when a server starts or fails, it detects the change and updates the external view, and Pinot Brokers observe these changes and dynamically adjust the table's query routing
3.3.2 Terminology mapping
1. Pinot Segment: corresponds to a Helix Partition; each segment can have multiple replicas
2. Pinot Table: composed of multiple segments; segments belonging to the same table share the same schema
3. Pinot Server: corresponds to a Helix Participant and mainly stores segments
4. Pinot Broker: corresponds to a Helix Spectator, observing changes in the state of segments and Pinot Servers; to support multi-tenancy, the Pinot Broker also acts as a Helix Participant
3.3.3 Zookeeper is primarily used to store the cluster state, as well as some Helix and Pinot configuration
3.3.4 Broker node is primarily responsible for routing client queries to the Pinot Server instances, collecting the results the servers return, and merging them into a final result that is returned to the client. Its features are:
1. Service discovery: discovers servers, tables, segments, and time ranges, and computes the query execution route
2. Scatter-gather: distributes the query to the relevant servers, merges the results each server returns, and sends the final result back to the client
The Pinot Broker implements several strategies for choosing Pinot Servers: uniform segment distribution, a greedy algorithm (maximizing/minimizing Pinot Server participation), and random selection of segment servers. If a Pinot Server fails or times out, the Pinot Broker can currently only return partial results; in the future this may be improved by other means, such as retrying or sending the same execution plan to multiple segment replicas.
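Scatter-gather with partial results can be sketched as follows; server "APIs" here are plain callables standing in for network calls, purely for illustration:

```python
# Sketch of scatter-gather for COUNT(*): fan the query out to the servers
# on the computed route, merge whatever responds, and flag the result as
# partial when a server fails or times out.

def scatter_gather(servers, query):
    total, partial = 0, False
    for server in servers:
        try:
            total += server(query)   # each server returns its partial count
        except Exception:
            partial = True           # failed/timed-out server: partial result
    return total, partial
```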
4. Pinot Index Segment
4.1 Row Storage vs. Column Storage
4.1.1 Row storage features
1. OLTP
2. A whole row of data is stored together
3. Inserts/updates are easy
4.1.2 Column storage features
1. OLAP
2. Only the columns involved in a query are read
3. Any column can serve as an index: fixed-length index, sparse index, etc.
4. Easy to compress
5. Bitmaps can improve query execution performance
4.2 Anatomy of an Index Segment
4.2.1 Segment Entities
1. Segment metadata: defines the metadata of the segment, including:
segment.name
segment.table.name
segment.dimension.column.names
segment.metric.column.names
segment.time.column.name
segment.time.interval
segment.start.time/segment.end.time
segment.time.unit
......
2. Column metadata, including:
column.cardinality
column.totalDocs
column.dataType: int/float/string
column.lengthOfEachEntry
column.columnType: dimension/metric/time
column.isSorted
column.hasDictionary
column.isSingleValue
......
3. Creation metadata (creation.meta), including:
Dictionary (.dict): a per-column encoded dictionary
Forward index (.sv.sorted.fwd): single-value sorted forward index with prefix compression
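Assembled from the keys above, a segment metadata file might look like the following fragment; the key names come from the list above, while every value is invented for illustration:

```
segment.name = webEvents_2014-01-01_2014-01-07_0
segment.table.name = webEvents
segment.dimension.column.names = country,browser
segment.metric.column.names = clicks,impressions
segment.time.column.name = daysSinceEpoch
segment.start.time = 2014-01-01
segment.end.time = 2014-01-07
segment.time.unit = DAYS
```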
5. Query Processing

5.1 Query Execution Phases
5.1.1 Query parsing: uses ANTLR as the parser to convert PQL into a query parse tree
5.1.2 Logical plan phase: transforms the query parse tree into a logical plan tree using the metadata
5.1.3 Physical plan phase: further optimization; the concrete execution plan is based on segment information
5.1.4 Executor service: executes the physical operator tree on the corresponding segments
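A toy sketch of these four phases, with stub functions standing in for ANTLR parsing, planning, and execution; none of this is Pinot's actual code, it only shows how the stages chain together:

```python
# Toy pipeline mirroring the four query-execution phases.

def parse(pql):
    # Stand-in for the ANTLR parser producing a parse tree.
    return {"kind": "parse_tree", "query": pql}

def logical_plan(tree, table_metadata):
    # Stand-in for building the logical plan from metadata.
    return {"kind": "logical_plan", "tree": tree, "meta": table_metadata}

def physical_plan(lplan, segments):
    # One physical scan operator per segment the query must touch.
    return [{"op": "scan", "segment": s, "plan": lplan} for s in segments]

def execute(pplan):
    # Stand-in for the executor service running each operator.
    return [op["segment"] for op in pplan]

result = execute(
    physical_plan(
        logical_plan(parse("select count(*) from t"), {"table": "t"}),
        ["seg_0", "seg_1"],
    )
)
```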
5.2 PQL
PQL is a subset of SQL; it supports neither joins nor subqueries.

Ref: https://github.com/linkedin/pinot/wiki/architecture

Copyright notice: this is an original article by the blogger; do not reproduce without permission.
