Double 11, the Singles' Day shopping festival, has just drawn to a close, but the excitement still lingers in that moment:
The instant the clock struck midnight on November 11, millions of shoppers from online and offline poured into the annual event all at once, moving from entering the venue to clicking through product detail pages to paying for their orders in one go.
While everyone celebrated at the front end, backend data traffic surged in a flood peak that broke all historical records:
Payment success peaked at 256,000 transactions per second.
Real-time data processing peaked at 472 million records per second.
As the most important group-wide data public layer among the real-time data processing tasks (it backs real-time business data, the media big screen, and other core workloads), its total data processing on the day peaked at a record 180 million records per second. Imagine a million people entering the Double 11 venue within a single second and still enjoying a smooth experience.
Stream computing exists because of stringent demands on data-processing timeliness:
Because the business value of data decays rapidly over time, data must be computed and processed as soon as it is generated so that the business can act on it immediately. This year's Double 11 stream computing once again faced the test of real-time data peaks.
First, let's compare this year's (2017) data peaks with last year's (2016):
2016: payment success peaked at 120,000 transactions per second, and total data processing peaked at 93 million records per second.
2017: payment success peaked at 256,000 transactions per second, real-time data processing peaked at 472 million records per second, and Alibaba Group's data public layer peaked at 180 million records per second.
Even under this year's Double 11 traffic peaks, the real-time data update frequency remained stable: from the first second a shopper placed and paid for an order, the full path through real-time computation to the media big screen completed with second-level response. Facing ever-growing traffic, real-time data keeps getting faster. Behind holding these data peaks is a comprehensive upgrade of Alibaba's stream computing technology.
Stream computing application scenarios
The Data Technology and Products Department sits within Alibaba's data middle platform. In addition to offline data, the real-time data it produces serves multiple data scenarios within the group, including this year's (and every previous year's) Double 11 media big screen, real-time data used by operations staff to run the business, and various live dashboard products for internal executives and front-line staff, covering big data business units across the entire Alibaba Group.
Meanwhile, as the business continues to grow, daily real-time processing now peaks at over 40 million records per second, the number of records processed per day has reached the trillion level, and the total data volume processed per day has reached the petabyte level.
Facing real-time processing of such massive data, we have kept data latency within seconds and achieved high-precision, zero-error computation. For example, on Double 11 this year, the very first record on the media big screen went from the transaction table, through stream computation, to the big screen with second-level response.
Stream computing practice in the data link
After years of handling data floods, our stream computing team has accumulated rich experience in engine selection, performance optimization, and building stream computing development platforms, and has formed a stable, efficient data link architecture. The following figure shows the entire data link:
Business data comes from many sources; incremental data is captured in real time by two tools (DRC and the middleware logagent) and synchronized to DataHub (a pub/sub service).
Flink jobs on the real-time compute engine subscribe to this incremental data and process it in real time. After ETL, the detail layer is written back to DataHub; each business party then defines multi-dimensional aggregations over the real-time data, and the aggregated results are written to a distributed or relational database (HBase, MySQL) and exposed as real-time data services through a public data service layer product (OneService).
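To make this link concrete, here is a minimal, hedged sketch using the open-source Flink DataStream API. The socket source and print sink are stand-ins for the internal DataHub and HBase/OneService connectors, and the record layout is invented purely for illustration:

```java
// Sketch of the link above: subscribe, ETL, aggregate by key, write out.
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class PublicLayerJobSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.socketTextStream("localhost", 9999)        // stand-in for the DataHub subscription
           .map(new MapFunction<String, Tuple2<String, Long>>() {  // ETL: parse a raw order record
               @Override
               public Tuple2<String, Long> map(String raw) {
                   String[] fields = raw.split(",");
                   return Tuple2.of(fields[0] /* dimension key */, Long.parseLong(fields[1]) /* amount */);
               }
           })
           .keyBy(t -> t.f0)                           // multi-dimensional aggregation per business key
           .sum(1)
           .print();                                   // stand-in for the HBase/MySQL sink behind OneService

        env.execute("public-layer-aggregation-sketch");
    }
}
```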
Over the past year, we have done a great deal of work on the compute engine and on computation optimization, improving both computing power and development efficiency.
Compute engine upgrade and optimization
In 2017, we comprehensively upgraded the real-time computing architecture from Storm to Blink and carried out extensive optimization on the new stack: real-time peak processing capacity more than doubled, and steady-state throughput improved by more than 5x.
Optimized state management
Real-time computation generates a large amount of state. It used to be stored in HBase and is now stored in RocksDB. Local storage removes network overhead and greatly improves performance, and it supports fine-grained data statistics (the number of keys can now reach the billion level).
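As an illustration of fine-grained per-key statistics kept in keyed state (which the RocksDB backend stores on local disk), here is a minimal sketch using the open-source Flink API; the state name and types are illustrative, not those of the production jobs:

```java
// Per-key running total kept in Flink keyed state.
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class PerKeyCounter extends KeyedProcessFunction<String, Long, Long> {
    private transient ValueState<Long> runningSum;

    @Override
    public void open(Configuration parameters) {
        runningSum = getRuntimeContext().getState(
                new ValueStateDescriptor<>("running-sum", Long.class));
    }

    @Override
    public void processElement(Long amount, Context ctx, Collector<Long> out) throws Exception {
        Long current = runningSum.value();
        long updated = (current == null ? 0L : current) + amount;
        runningSum.update(updated);          // stored locally in RocksDB when that backend is used
        out.collect(updated);                // emit the new per-key total downstream
    }
}
```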
Optimized checkpointing (snapshots) and compaction (merging)
State grows over time, and taking a full checkpoint every time puts heavy pressure on the network and disks. For data statistics scenarios, tuning the RocksDB configuration and using incremental checkpoints greatly reduces network transfer and disk I/O.
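A minimal sketch of how incremental checkpointing can be switched on with the RocksDB backend in open-source Flink (Blink's internals may differ); the checkpoint URI and interval are placeholders:

```java
// Enable RocksDB with incremental checkpoints: each checkpoint only ships
// newly created SST files instead of the full state.
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointConfigSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000);  // checkpoint every 60 s (placeholder interval)
        // second argument = enableIncrementalCheckpointing
        env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints", true));
        // ... build and execute the job as usual
    }
}
```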
Asynchronous Sink
Making the sink asynchronous maximizes CPU utilization and delivers much higher TPS.
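The exact sink implementation is internal; the sketch below only illustrates the general idea of handing writes to a small thread pool so the task thread is not blocked. The external-store client call is a hypothetical placeholder:

```java
// Asynchronous sink sketch: submit writes to an executor and return immediately.
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncSinkSketch extends RichSinkFunction<String> {
    private transient ExecutorService executor;

    @Override
    public void open(Configuration parameters) {
        executor = Executors.newFixedThreadPool(4);   // degree of asynchronous write parallelism
    }

    @Override
    public void invoke(String record, Context context) {
        // Hand the write off; the task thread keeps processing incoming records.
        executor.submit(() -> writeToExternalStore(record));
    }

    private void writeToExternalStore(String record) {
        // placeholder for the real client call (e.g. an HBase put)
    }

    @Override
    public void close() throws Exception {
        executor.shutdown();
        executor.awaitTermination(30, TimeUnit.SECONDS);  // let pending writes drain on shutdown
    }
}
```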
Abstracting common components
Beyond engine-level optimization, the data platform team has also built its own aggregation component on top of Blink (currently all real-time public layer tasks are implemented with it). The component provides the functionality commonly used in data statistics and abstracts the topology and business logic into a JSON file, so development becomes pure configuration controlled by parameters in that file. This greatly lowers the development threshold and shortens the development cycle. To give an example: a job that used to take 10 person-days to develop now takes 0.5 person-days thanks to componentization, which is good news for both the demand side and developers, while the standardized components also improve job performance.
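The component's real configuration schema is not public; the JSON below is purely a hypothetical illustration of how a statistics topology might be declared through parameters alone (all field names are invented):

```json
{
  "source":  { "type": "datahub", "topic": "order_detail" },
  "keys":    ["category_id", "province"],
  "metrics": [
    { "field": "pay_amount", "agg": "sum" },
    { "field": "order_id",   "agg": "count_distinct" }
  ],
  "windows": ["1h", "1d"],
  "sink":    { "type": "hbase", "table": "rt_category_province_stats" }
}
```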
Building on these ideas and the accumulated functionality, we finally polished the stream computing development platform [Red Rabbit].
The platform generates real-time tasks through simple drag-and-drop, without writing a single line of code, provides common data statistics components, and integrates metadata management, report system access, and other functions. As the team supporting the Group's real-time computing business, the features we have accumulated in the [Red Rabbit] platform, battle-tested through Double 11, have become its unique highlights:
First, dimension (granularity) merging
For example, many real-time statistics jobs need both day-granularity and hour-granularity results, which used to be computed by two separate tasks. The aggregation component merges them and shares the intermediate state, reducing network transfer by more than 50%, streamlining the computation logic, and saving CPU.
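A hedged sketch of this merging idea: one operator keeps hourly buckets in a single MapState and derives the daily total from that same state, rather than running two separate jobs. All names and types are illustrative, not the component's actual implementation:

```java
// Hour and day granularities computed from one shared piece of keyed state.
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class HourAndDayMerge
        extends KeyedProcessFunction<String, Tuple2<Integer, Long>, String> {
    // hour of day (0-23) -> accumulated amount for that hour
    private transient MapState<Integer, Long> hourlySums;

    @Override
    public void open(Configuration parameters) {
        hourlySums = getRuntimeContext().getMapState(
                new MapStateDescriptor<>("hourly-sums", Integer.class, Long.class));
    }

    @Override
    public void processElement(Tuple2<Integer, Long> event, Context ctx, Collector<String> out)
            throws Exception {
        int hour = event.f0;
        Long hourSum = hourlySums.get(hour);
        long newHourSum = (hourSum == null ? 0L : hourSum) + event.f1;
        hourlySums.put(hour, newHourSum);

        long daySum = 0L;                   // daily total derived from the same hourly state
        for (Long v : hourlySums.values()) {
            daySum += v;
        }
        out.collect(ctx.getCurrentKey() + " hour " + hour + ": " + newHourSum + ", day: " + daySum);
    }
}
```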
Second, streamlined storage
For the key-value pairs stored in RocksDB, we designed an encoding mechanism that uses indexes to cut state storage by more than half, which effectively reduces pressure on the network, CPU, and disks.
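The actual encoding mechanism is internal; the snippet below only illustrates the general dictionary-encoding idea, mapping long dimension strings to compact integer indexes so that far fewer bytes end up in the hot keyed state:

```java
// Simplified in-memory stand-in for an index-based key encoding.
import java.util.HashMap;
import java.util.Map;

public class KeyDictionary {
    private final Map<String, Integer> toIndex = new HashMap<>();
    private int nextIndex = 0;

    /** Returns a stable small integer for a (possibly very long) dimension value. */
    public synchronized int encode(String dimensionValue) {
        return toIndex.computeIfAbsent(dimensionValue, k -> nextIndex++);
    }
}
```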
Third, high-performance sorting
Sorting is a very common scenario in real-time computing. The top-N component combines an in-memory PriorityQueue with Blink's new MapState feature, significantly reducing the number of serializations and improving performance by roughly 10x.
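A hedged sketch of that top-N approach, shown with the open-source Flink API (Blink's MapState may differ in detail): per-item counts live in MapState so they survive checkpoints, while the current ranking is kept in an in-memory PriorityQueue so it is not re-serialized on every record. N, types, and the failure-recovery path are simplified:

```java
// Top-N sketch: durable counts in MapState, volatile ranking in a min-heap.
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

import java.util.PriorityQueue;

public class TopNSketch extends KeyedProcessFunction<String, Tuple2<String, Long>, String> {
    private static final int N = 10;

    private transient MapState<String, Long> counts;            // itemId -> running count (checkpointed)
    private transient PriorityQueue<Tuple2<String, Long>> topN; // min-heap on count (memory only)

    @Override
    public void open(Configuration parameters) {
        counts = getRuntimeContext().getMapState(
                new MapStateDescriptor<>("item-counts", String.class, Long.class));
        topN = new PriorityQueue<>((a, b) -> Long.compare(a.f1, b.f1));
    }

    @Override
    public void processElement(Tuple2<String, Long> event, Context ctx, Collector<String> out)
            throws Exception {
        Long old = counts.get(event.f0);
        long updated = (old == null ? 0L : old) + event.f1;
        counts.put(event.f0, updated);

        // Maintain the in-memory ranking; rebuilding it from MapState after a
        // failure is omitted here for brevity.
        topN.removeIf(t -> t.f0.equals(event.f0));
        topN.add(Tuple2.of(event.f0, updated));
        if (topN.size() > N) {
            topN.poll();                                        // drop the smallest entry
        }
        out.collect("current top-N threshold: " + topN.peek());
    }
}
```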
Fourth, batched transfer and write operations
When writing the final result tables to HBase and DataHub, writing to the store once per record would severely limit throughput. Our components use a mechanism triggered by time or by record count (a timer function) to implement batched transfer and mini-batch sinks, and the thresholds can be configured flexibly according to the service's latency requirements, improving task throughput while reducing pressure on the serving side.
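The production sink is internal; the sketch below only illustrates the batching idea, flushing either when a record-count threshold is reached or when a periodic timer fires. The bulk-write call is a placeholder:

```java
// Batching sink sketch: buffer records, flush by count or by timer.
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

import java.util.ArrayList;
import java.util.List;
import java.util.Timer;
import java.util.TimerTask;

public class BatchingSinkSketch extends RichSinkFunction<String> {
    private static final int BATCH_SIZE = 500;            // record-count trigger (placeholder)
    private static final long FLUSH_INTERVAL_MS = 1000;   // time trigger (placeholder)

    private final List<String> buffer = new ArrayList<>();
    private transient Timer flushTimer;

    @Override
    public void open(Configuration parameters) {
        flushTimer = new Timer(true);
        flushTimer.scheduleAtFixedRate(new TimerTask() {
            @Override
            public void run() {
                flush();
            }
        }, FLUSH_INTERVAL_MS, FLUSH_INTERVAL_MS);
    }

    @Override
    public void invoke(String record, Context context) {
        synchronized (buffer) {
            buffer.add(record);
            if (buffer.size() >= BATCH_SIZE) {
                flush();
            }
        }
    }

    private void flush() {
        synchronized (buffer) {
            if (buffer.isEmpty()) {
                return;
            }
            // placeholder: one bulk write (e.g. a list of HBase puts) per batch
            buffer.clear();
        }
    }

    @Override
    public void close() {
        flush();                // write any remaining buffered records
        flushTimer.cancel();
    }
}
```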
Data security
For high-priority applications (24-hour uninterrupted service), disaster recovery across data centers is required: when one link has a problem, traffic can switch to another link within seconds. The following figure shows the security architecture of the entire real-time public layer link:
From data collection through data synchronization, computation, storage, and data services, the links are fully independent end to end. Through dynamic configuration in OneService, link switching can be performed so that data services are never interrupted.
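As a purely hypothetical illustration of dynamic link switching, the sketch below routes queries to whichever data-center link a dynamic configuration marks as active; names and endpoints are invented and do not reflect OneService's actual implementation:

```java
// Hypothetical link router: a dynamic config update flips the active link,
// so queries keep being served even if one data-center link fails.
public class LinkRouter {
    public enum Link { PRIMARY_DC, BACKUP_DC }

    private volatile Link activeLink = Link.PRIMARY_DC;

    /** Called when the dynamic configuration (e.g. pushed by operations) changes. */
    public void onConfigUpdate(String activeLinkName) {
        activeLink = Link.valueOf(activeLinkName);
    }

    /** Every query is routed to whichever link is currently marked active. */
    public String endpointForQuery() {
        return activeLink == Link.PRIMARY_DC
                ? "hbase-primary.example.internal"
                : "hbase-backup.example.internal";
    }
}
```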
The above is the secret weapon of stream computing technology that carried this year's Double 11 traffic peaks. Beyond innovating, we also hope to consolidate, reuse, and keep optimizing these techniques.
As stream computing technology keeps evolving, we will continue to optimize and upgrade it based on Alibaba's rich business scenarios:
Platformization and servitization: Stream Processing as a Service.
Unification of the semantic layer: Apache Beam, Flink's Table API, and ultimately Stream SQL are all very active projects.
Real-time intelligence: today's hot deep learning may well spark new possibilities here in the future.
Real-time and offline unification: this is also a major trend. Compared with today's practice of one real-time system and one offline system, unifying the two is what all the major engines are working toward.
Finally, everyone is welcome to discuss and make progress together with us in the comments section.