Boot Camp Series-the foundation for massive data storage


August 12, 2015 09:24

As the foundational data and business support department of Weibo, the Weibo platform has been through five years of development. As data and business grew explosively, we ran into many challenges in storing massive data and accumulated a great deal of experience along the way.

This boot camp is aimed at fresh graduates. Its purpose is to give new hires a systematic, targeted understanding of the platform's core technologies and core business, so that by the end of the boot camp they have a working understanding of the platform's underlying architecture and business.

This article introduces new hires to one of the platform's core technologies, massive data storage, and focuses on how storage architecture evolves and is designed in a large-scale distributed system.

Course Outline:

1. Course Objectives

2. Storage Services Overview

3. MySQL and MySQL distributed architecture design

4. Redis and Redis distributed architecture design

5. Thinking and discussion

I. Objectives of the course

1. Understand the overview of storage services and the differences between RDBMS and NoSQL

2. Understand the basic implementation mechanisms, features, and applicable scenarios of MySQL, Redis, and HBase

3. Understand the large-scale distributed service solutions for several storage products

4. Learn how to use the platform's MySQL and Redis client components

5. Understand the issues to watch for in MySQL and Redis distributed system design

6. Understand several typical cases from the platform

7. Understand the platform's custom modifications to several storage products and the related terminology

II. Overview of storage services

1. A relational database is a data service based on the entity-relationship model, with the following characteristics.

Suitable for storing structured data

Query language: SQL (INSERT, DELETE, UPDATE, SELECT)
Mainstream relational databases are persistent storage systems, so system performance is closely tied to machine performance
Several mainstream relational databases
- MySQL
- Oracle
- DB2
- SQL Server
Performance
- limited by server performance, mainly disk performance
- limited by the complexity of the data
- on a common SSD server, single-machine read performance can reach the million/s level

Large Internet services mostly use MySQL as their relational database, and the core business of the Weibo platform (such as Weibo content and users' Weibo lists) does the same.

This training will also focus on MySQL and its distributed architecture solutions.

2. NoSQL (Not Only SQL) databases are non-relational databases. They emerged because traditional relational databases have limited capacity for large-scale, high-concurrency workloads, and the strengths of NoSQL make up for what relational databases lack in this area.

Stores unstructured and semi-structured data
Performance

The industry mostly uses NoSQL as a memory-centric service, constrained by I/O and network. Request response time is typically in milliseconds, and single-machine QPS is at the 100,000 level (depending on data size and storage complexity)

Some common types of NoSQL products

K-V (Memcached, Redis): this kind of NoSQL product is the most widely used in the Internet industry. Memcached provides K-V in-memory storage with an LRU eviction policy, while Redis provides in-memory and persistent storage that supports complex structures (list, hash, etc.)
Column (HBase, Cassandra): HBase is a distributed database cluster system based on column storage
Document (MongoDB)
Graph (Neo4j): the largest and most complex graph model is the network of human relationships. In theory it is best described as a graph and stored in a graph database, but at current data scales system performance still needs to be optimized

In the Web 2.0 era, NoSQL products have become increasingly important as the Internet and mobile Internet develop and large-scale Internet applications proliferate. To handle large-scale, highly concurrent access, most of these applications introduce NoSQL products, among which Memcached and Redis are widely used for their maturity, performance, and stability. The Weibo platform also runs NoSQL clusters of many sizes, and Weibo's core feed and relationship businesses rely on Memcached and Redis to provide high-performance service.

This session will focus on Redis and its distributed architecture.

III. MySQL

The core business data of the Weibo platform is stored in MySQL. There are currently thousands of large-scale clusters. The data volume of a single core business alone is enormous, a single core business peaks at the 100,000-per-second level, and writes are reported at the million-per-second level.

In the context of massive and growing data, we have built up experience designing distributed MySQL systems that deliver high read/write concurrency, low latency (10 ms level), and high availability (99.99%), and we continue to work on this problem. This part of the course focuses on MySQL as a massive data store.

1. About MySQL

MySQL is a relational database management system (RDBMS)
Uses SQL as the query language
Open source
Storage engines

InnoDB: supports transactions and row locks; write performance is slightly worse
MyISAM: does not support transactions; read and write performance is slightly better

Meets ACID properties
Primary key, unique key, foreign key (generally not used in large-scale systems)
Transaction: a series of operations that is either executed completely or not executed at all
Service, port, instance: all refer to a MySQL database started on a server
Performance

As disk performance increases, read and write performance increases, but so does cost
Database write performance: write TPS rises as concurrency increases, but growth slows once a bottleneck is reached, and beyond a certain concurrency level TPS declines sharply

Thinking: what if higher performance is needed (beyond the concurrency levels of the three storage media)?
- Custom storage: build storage tailored to the characteristics of the service and your business scenario. Mature general-purpose products have to consider versatility and sacrifice some performance for it
- Introduce NoSQL
- Introducing NoSQL

2. Architecture changes from single-machine to cluster

In the early days of a business, Web services are small and generally have the following characteristics
- the service is in its prototype stage, the user base is small, multiple businesses share resources, and daily writes average in the millions and reads in the tens of millions
- data size is small, and single-machine performance meets demand
- the user base is small, and development focuses on iteration speed

Given these characteristics of a small business, multiple businesses can be deployed together on shared resources to save resource and development costs

When the number of users, data volume, and traffic increase (by less than 2x) and put pressure on the database, how can MySQL throughput be raised by a limited amount?
- SQL optimization
- Hardware upgrades
Because the pressure grows only within a limited range, simple, low-cost optimizations can improve service performance to a limited degree
As the business keeps growing, read performance hits bottlenecks and businesses affect one another by competing for shared resources. How can this contention be resolved quickly to improve service performance?
- Vertical split: split data by business
Splitting by business isolates the businesses from each other, so increased pressure on the timeline no longer affects the performance of the content database service. After the split, resources increase and service performance improves accordingly.
As the business continues to grow, reads hit performance bottlenecks and reads and writes interfere with each other. How do we ensure that growing read volume does not affect write performance, and that growing write volume does not affect read performance? (Write performance problems can result in data loss)

Read-write separation: writes go only to the master, the master and slave synchronize automatically, and reads use only the slave as their source

After read-write separation, the slave handles only read requests, so read performance is optimized, and the standalone master's write performance is optimized as well.

The performance of one or two master-slave servers is ultimately limited. How do we optimize when a single instance cannot carry the online request volume?
- upgrade to a one-master, multi-slave architecture
- one master handles all write requests, so in theory master performance does not change
- multiple slaves share the read requests, and read performance increases N times
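
The read-write separation and one-master, multi-slave steps above can be sketched as a small router. This is a minimal illustration rather than the platform's client: the class name `ReadWriteRouter`, the host strings, and the rule that any statement not starting with SELECT counts as a write are all assumptions made for the example.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the single master and spread reads over the slaves."""

    def __init__(self, master, slaves):
        if not slaves:
            raise ValueError("at least one slave is required")
        self.master = master
        # Round-robin over the slaves so read load is shared evenly.
        self._slaves = itertools.cycle(slaves)

    def pick(self, sql):
        # Assumption for this sketch: anything that is not a SELECT is a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._slaves) if is_read else self.master


# Hypothetical host names, for illustration only.
router = ReadWriteRouter("master:3306", ["slave1:3306", "slave2:3306"])
write_target = router.pick("INSERT INTO status (uid, text) VALUES (1, 'hi')")
read_target = router.pick("SELECT text FROM status WHERE uid = 1")
```

Adding a slave to the list raises read capacity without touching the write path, which is why the master remains the limit for write volume.
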
As the business volume grows, the service sees the following changes:

Data volume growth means the original storage space runs short
Write volume growth means master write performance becomes a bottleneck
Read volume growth means slave read performance becomes a bottleneck, but adding slaves has limits: on the one hand master-slave replication performance is at risk, and on the other hand adding slaves is costly

How do we optimize to solve these problems?

Split horizontally

After growth in data volume and read-write request volume, the database service has evolved into a distributed architecture. How to distribute one business's data sensibly across this complex distributed database is the next problem to solve

3. Given the evolution to this final architecture, how do you design the database?

Distributed database design

Hash splitting: data read and write requests are scattered across multiple instances according to a hash rule (see the horizontal split above)
Time splitting: data is stored across multiple instances by time period, according to a fixed time-division rule

Data is distributed within the distributed database so that each instance stores 1/N of the data, and one database per instance is enough to meet functional requirements.

After several years of growth, the data volume will have multiplied. When another round of horizontal scaling is needed (4 → 8 servers), a program has to split the data in two, which makes data migration costly and requires developer intervention.

If each instance is designed with 2 databases created in advance, each storing 1/(2N) of the data, then a whole database can be migrated when scaling out, with no developer intervention.

Multiple databases created on one DB instance are called logical libraries.

Logical library design
- A logical library is defined relative to a physical library: the physical library is one instance of the database service, and logical libraries are the multiple databases created on that DB instance.
- The purpose of defining logical libraries is to make expansion easy. With 4 database servers whose physical libraries each contain 8 logical libraries, when the system hits a capacity or write-volume bottleneck you can add another 4 servers and synchronize whole databases directly, without writing an application to re-import the data
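
The 4-server, 8-logical-library example can be made concrete with a short sketch. The modulo placement rule and the function names below are assumptions chosen for illustration; the article does not say how the platform assigns logical libraries to servers. The point it shows is that the uid-to-logical-library mapping never changes, so scaling out only moves whole logical libraries.

```python
LOGICAL_DBS = 32  # 4 servers x 8 logical libraries, fixed once data is written


def logical_db(uid):
    """Map a uid to its logical library; this never changes after launch."""
    return uid % LOGICAL_DBS


def server_for(db, num_servers):
    """Assumed placement rule: logical library db lives on server db % num_servers."""
    return db % num_servers


def placement(num_servers):
    return {db: server_for(db, num_servers) for db in range(LOGICAL_DBS)}


before = placement(4)   # 8 logical libraries per server
after = placement(8)    # 4 logical libraries per server
# Logical libraries whose server changed: each one is copied as a whole.
moved = [db for db in range(LOGICAL_DBS) if before[db] != after[db]]
```

With this rule, half of the logical libraries on each old server move to exactly one new server, so expansion is a per-database synchronization followed by a routing change, and no row-level re-import is needed.
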

4. Table splitting patterns on top of the distributed database above

Hash split: hash the data within a database and scatter it across multiple tables
- for datasets of limited size
- for datasets whose growth rate is under control
Combined with the database hash model
- hash by UID to database db_1, then hash to table tb_5 under db_1
Time split: data for the same period is stored in one table, and multiple periods are stored in multiple tables. For example, when splitting by month, each monthly table stores one month of data, and fetching all the data means spanning multiple monthly tables
- suitable for fast-growing datasets
- but queries have to span tables for multiple time periods
Combined with the database hash model
- hash by UID to database db_1, then look up tables 201507 and 201506 to get two months of data
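
Both table-splitting patterns can be sketched as routing functions. The database and table counts, the `db_N.tb_N` and `db_N.tb_YYYYMM` naming, and the choice to derive the table index from `uid // NUM_DBS` are assumptions for this example, not the platform's actual rules.

```python
from datetime import date

NUM_DBS = 4          # assumed number of databases
TABLES_PER_DB = 8    # assumed number of hash tables per database


def hash_table(uid):
    """Hash split: uid -> database, then uid -> table inside that database."""
    db = uid % NUM_DBS
    # Use the quotient for the table index so it is independent of the db index.
    tb = (uid // NUM_DBS) % TABLES_PER_DB
    return f"db_{db}.tb_{tb}"


def month_tables(uid, start, end):
    """Time split: monthly tables covering start..end, newest month first."""
    db = uid % NUM_DBS
    names = []
    year, month = end.year, end.month
    while (year, month) >= (start.year, start.month):
        names.append(f"db_{db}.tb_{year}{month:02d}")
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return names


two_months = month_tables(21, date(2015, 6, 1), date(2015, 7, 31))
```

The time-split helper makes the cost of that pattern visible: a query over a date range touches one table per month in the range.

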
Think about this question: how can you quickly locate a person's 1000th to 1100th records?

A secondary index quickly locates the position in the (first-level) index
- describes how the data is distributed
- used for fast positioning and for narrowing the query scope
- typical field list: uid, date_time, min_id, count
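
One way such an index answers the question is to walk the per-month counts and skip whole months without reading them. This is a sketch under assumptions: the index rows for one uid are taken to be `(date_time, count)` pairs ordered newest first, `min_id` is left out, and the function name `locate` is hypothetical.

```python
def locate(index_rows, offset, limit):
    """Plan which monthly tables to read for records offset .. offset+limit-1.

    index_rows: [(date_time, count), ...] for one uid, newest month first.
    Returns [(date_time, skip, take), ...]: in each listed month, skip `skip`
    records and then read `take` records.
    """
    plan = []
    for date_time, count in index_rows:
        if limit == 0:
            break
        if offset >= count:
            offset -= count          # the whole month lies before the window
            continue
        take = min(count - offset, limit)
        plan.append((date_time, offset, take))
        limit -= take
        offset = 0                   # later months are read from their start
    return plan


# Hypothetical counts for one user; skip 1000 records, then read 100.
rows = [("201507", 400), ("201506", 650), ("201505", 300)]
plan = locate(rows, offset=1000, limit=100)
```

Only the months that appear in the plan need to be queried, which is the "narrowing the query scope" role of the index.
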

5. What happens when a server goes down?

A slave goes down (one master, multiple slaves)?
- if the remaining healthy slaves are not at risk, no urgent action is needed; repair it routinely
- switch traffic to the disaster-recovery data center (if one is available)
- emergency expansion [priority], restart, or replacement
- lossy degradation of some requests
The master goes down?
Because master data is unique, a master failure directly causes data writes to fail
- take the failed master offline quickly
- take one slave out of read service (if slave performance is at risk, expand rapidly)
- promote that slave to master
- the synchronization between the new master and the slaves takes effect

6. With such a complex combination of distributed databases, database splitting, and table splitting, how can the client side be kept easy to use?

Most teams that use distributed database services have their own database client components. The Weibo platform performs distributed database operations through the following layers

Get the TableContainer, which holds all table definition rules
Get the specified TableItem from the TableContainer by table name
A TableItem is associated with multiple JdbcTemplate-DataSource pairs
The TableItem combines uid, id, and date in a hash calculation to obtain the correct JdbcTemplate and SQL
Use the JdbcTemplate to perform the SQL operation
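
The layering above can be sketched in a few lines. The platform's real components are Java classes built on JdbcTemplate; this Python version only mirrors their shape, and the method names `register`, `get`, and `route`, the string data sources, and the hash rule are all hypothetical.

```python
class TableItem:
    """Routing rules for one logical table."""

    def __init__(self, name, datasources, tables_per_db):
        self.name = name
        self.datasources = datasources      # stand-ins for JdbcTemplate/DataSource pairs
        self.tables_per_db = tables_per_db

    def route(self, uid):
        """Return (datasource, physical table name) for this uid."""
        db = uid % len(self.datasources)
        tb = (uid // len(self.datasources)) % self.tables_per_db
        return self.datasources[db], f"{self.name}_{tb}"


class TableContainer:
    """Holds every TableItem, looked up by table name."""

    def __init__(self):
        self._items = {}

    def register(self, item):
        self._items[item.name] = item

    def get(self, table_name):
        return self._items[table_name]


container = TableContainer()
container.register(TableItem("user", ["ds_0", "ds_1"], tables_per_db=2))
ds, table = container.get("user").route(uid=7)
# The caller would now run this SQL through the selected data source.
sql = f"SELECT uid, name, age, gender FROM {table} WHERE uid = %s"
```

Business code only names the logical table and passes a uid; the choice of instance, database, and physical table stays inside the component.
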

7. Precautions

Issues to watch in MySQL design
- choose UTF8 as the table character set
- use InnoDB as the storage engine
- use VARCHAR/VARBINARY to store variable-length strings
- do not store pictures, files, and similar content in the database
- keep each table below 20000W (200 million) records
- split the business vertically ahead of time
Issues to watch in MySQL queries
- avoid stored procedures, triggers, functions, and the like
  - let the database do what it does best
  - reduce business coupling and avoid server-side bugs
- avoid joins on large tables
  - MySQL is best at single-table primary key/index queries
  - joins consume more memory and produce temporary tables
- avoid mathematical operations in the database
  - MySQL is not good at arithmetic
  - the index cannot be used
- reduce the number of interactions with the database
  - make SELECT conditions take advantage of indexes
  - for multiple conditions on the same field, use IN instead of OR

8. MySQL exercises

Design a database for storing basic user information that handles 2,000 QPS and 1 billion records. Complete the database design, the database setup, and a Web service for writes and queries.
Define the user information structure: uid, name, age, gender
Given 2 MySQL instances, create 2 databases per instance
Create 2 tables per database
Write code that reads and writes the databases and tables by hash

IV. Redis

As a representative SNS service of the Web 2.0 era, Weibo has a large user base with hundreds of millions of active users, and it bears the performance pressure of serving high concurrency at low latency.

As a typical NoSQL product, Redis is used to relieve the relational database performance bottlenecks of the Web 2.0 era thanks to its maturity, availability, and performance. For example, requests to a Weibo service can reach millions per second. It would take hundreds of relational database instances to cope with such high QPS, and request latency would be long and volatile. With a NoSQL product such as Redis, a cluster on the order of ten instances is enough, and the average request takes less than 5 ms.

This chapter introduces Redis and its large-scale cluster architecture.

1. Introduction to Redis

Redis is a K-V storage system that supports in-memory storage and persistent storage
Supports complex data structures: unlike Memcached, which supports only simple key-value storage, Redis natively supports several commonly used storage structures, such as
- Hash: stores hash-structured data
- List: stores list data
Single-threaded

High performance: avoids the overhead of concurrency handling, locks, and context switching
Good data consistency: for example, concurrent operations on a counter do not run into the reader-writer problem
A single thread cannot take advantage of multiple cores; start multiple instances to make full use of them

Native master-slave support
Expiration mechanism

Passive expiration: when a client accesses a key, Redis checks the key's expiration time to decide whether it has expired
Active expiration: volatile-lru is used by default
- volatile-lru: evicts the least recently used data among keys that have an expiration time
- volatile-ttl: evicts the data closest to expiring among keys that have an expiration time
- volatile-random: evicts randomly chosen data among keys that have an expiration time
- allkeys-lru: evicts the least recently used data among all keys
- allkeys-random: evicts randomly chosen data among all keys
- noeviction: never evicts data
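
To make passive expiration and one eviction policy concrete, here is a toy in-memory model. It is not how Redis is implemented: the class `ExpiringStore` is hypothetical, and it scans every key with a TTL to find the victim, whereas real Redis samples a few keys and approximates the choice.

```python
import time


class ExpiringStore:
    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._expire = {}     # key -> absolute deadline, kept apart from the values
        self._clock = clock

    def set(self, key, value, ttl=None):
        self._data[key] = value
        if ttl is None:
            self._expire.pop(key, None)
        else:
            self._expire[key] = self._clock() + ttl

    def get(self, key):
        # Passive expiration: the deadline is checked only when the key is accessed.
        deadline = self._expire.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._data.pop(key, None)
            del self._expire[key]
            return None
        return self._data.get(key)

    def evict_volatile_ttl(self):
        """volatile-ttl: evict the key closest to expiring among keys with a TTL."""
        if not self._expire:
            return None       # nothing is evictable under a volatile-* policy
        victim = min(self._expire, key=self._expire.get)
        del self._expire[victim]
        del self._data[victim]
        return victim
```

Keys stored without a TTL are never candidates under the volatile-* policies, which is why an instance holding mostly non-expiring keys can still run out of memory.
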

Redis dictionary table structure
- Key dictionary table: a hash table. A hash structure means rehashing is needed; rehash runs on demand, and for a period of time memory overhead doubles
- Value structure: stores the value
- Expire table: stores the key and the key's expiration time
- Extra overhead: 60 B+
- Persistence
  - AOF
Differences from MC (Memcached)
Platform customization: CounterService
- Change the hash table to incremental hash tables: for example, store every 100 million keys in one table, and open the next table once the data exceeds 100 million (or some critical size)
- Drop the expire table: unlike MC's LRU policy, Redis's active expiration policy cannot guarantee that hot data stays in memory while cold data is removed from the cache. Most of our scenarios need to control the amount of data in Redis so that it does not exceed the memory limit

2. Redis's main data structure

String (Key-value)
Hash (Key-field-value)
List (key-values)
Set (Key-members)
SortedSet (Key-member-score)

3. What is the distributed deployment scenario for Redis? What are the similarities and differences with MySQL

Redis, like MySQL, has master-slave characteristics, so its distributed deployment scenarios are equivalent to MySQL's
Single instance: small or early-stage business
Master-slave: HA, read/write separation
One master, multiple slaves: read performance bottleneck
Horizontal data split: insufficient capacity or write performance bottleneck
Common distributed Deployment scenarios

4. How does a distributed Redis architecture achieve high availability (HA)?

Use a master-slave (M-S) high-availability scheme, again because of Redis's master-slave characteristics
A service domain name is necessary; most large-scale Redis cluster deployments today are accessed by domain name

5. Basic Capacity Planning

Space = number of keys × space per key (K-V size + extra overhead)
- User space = 500 million users × 200 B (average) = 100 GB
- Weibo counters = (50 billion + 30 billion expected new in 2 years) × 10 B = 800 GB
Traffic = service traffic × number of resources hit per access
- Weibo counter feed traffic = 10,000/s × 20 = 200,000/s
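
The two formulas can be checked with a few lines of arithmetic. The function names are made up for this sketch, and 1 GB is taken as 10^9 bytes, which is the convention the article's round numbers imply.

```python
GB = 10**9  # decimal gigabyte, matching the article's round figures


def space_bytes(num_keys, bytes_per_key):
    """Space = number of keys x space per key."""
    return num_keys * bytes_per_key


def backend_qps(service_qps, hits_per_access):
    """Traffic = service traffic x resources hit per access."""
    return service_qps * hits_per_access


user_space = space_bytes(500_000_000, 200)
counter_space = space_bytes(50_000_000_000 + 30_000_000_000, 10)
feed_counter_qps = backend_qps(10_000, 20)

print(user_space // GB, "GB for user data")
print(counter_space // GB, "GB for Weibo counters")
print(feed_counter_qps, "counter reads per second from the feed")
```
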

6. CounterService

Weibo has a huge data base, so the amount of data that needs to be stored is extremely large.

For example, if the Weibo counters were all stored in Redis, the capacity involved would require TB-level space, which is costly

So we customized Redis for data that is mostly small and fixed-size

Optimized storage space
Data is stored in segmented hash buckets to avoid rehash (segmented storage requires keys to be incrementing)
Resulting space usage

key: 8 B
value: custom
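
The segmented-bucket idea can be illustrated with a toy counter store. This is not the platform's customized Redis: the class names, the linear-probing scheme, and the segment sizes are assumptions for the sketch. What it demonstrates is that each segment is allocated once at a fixed size and never rehashed, and that incrementing keys let a new segment be opened when the key range moves on.

```python
from array import array


class Segment:
    """Fixed-size open-addressing table: allocated once, never rehashed."""

    def __init__(self, slots):
        self.slots = slots
        self.keys = array("q", [-1]) * slots    # 8-byte keys, -1 marks an empty slot
        self.values = array("I", [0]) * slots   # small fixed-size counters

    def _find(self, key):
        i = key % self.slots
        for _ in range(self.slots):
            if self.keys[i] == key or self.keys[i] == -1:
                return i
            i = (i + 1) % self.slots            # linear probing
        return -1                               # segment full and key absent

    def incr(self, key, delta):
        i = self._find(key)
        if i < 0:
            raise MemoryError("segment full")
        self.keys[i] = key
        self.values[i] += delta
        return self.values[i]

    def get(self, key):
        i = self._find(key)
        return self.values[i] if i >= 0 and self.keys[i] == key else 0


class CounterStore:
    def __init__(self, keys_per_segment, slots):
        self.keys_per_segment = keys_per_segment
        self.slots = slots
        self.segments = []

    def incr(self, key, delta=1):
        seg = key // self.keys_per_segment
        while len(self.segments) <= seg:        # open the next table on demand
            self.segments.append(Segment(self.slots))
        return self.segments[seg].incr(key, delta)

    def get(self, key):
        seg = key // self.keys_per_segment
        return self.segments[seg].get(key) if seg < len(self.segments) else 0
```

Because keys and values sit in flat typed arrays, the per-entry cost is close to the raw key and value sizes, with no per-key object overhead and no expire table.
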

7. How is client access supported under the distributed architecture above? (Redis 3.0+ supports Redis Cluster)

Redis has multiple open-source clients, and we use Jedis
In addition to the client itself, an operation wrapper and a master-slave (M-S) component are provided on top of Jedis
The Redis series components we use are as follows:

8. Redis Exercises

Use Redis to implement a user's likes list and likes count feature
Launch two Redis instances in a test environment
Use Redis to store each user's likes list [{uid, time} ...] and likes count (uid-count)
Implement the business logic for likes, including like, cancel like, view likes list, and view likes count

V. Thinking and discussion

1. When Memcached reaches its capacity limit, it truncates the LRU chain to free up space. Here are some questions to consider about the key expiration mechanism of Redis:

What happens when Redis is full? How can we avoid these problems?
Why does our customized Redis drop the expire table?

2. Which scenarios suit MySQL and which suit Redis?

Hot and cold data?
Data size?
Data volume level?
Data growth?
Is persistence required?
Traffic (read/write)?
Request performance requirements?

------------------Boot Camp Introduction ------------------

The Weibo platform boot camp is a training course that the Weibo platform organizes internally to help new hires integrate into the team. Its goal is team integration, covering people, atmosphere, and technology. Four sessions have been held so far, and many participants have quickly grown into the technical backbone of the platform.

The Weibo platform cares a great deal about helping members integrate into and grow with the team. Here someone will help you settle in and someone will grow alongside you. We welcome you to join the Weibo platform, and you are welcome to ask questions by private message.

------------------Instructor Profile ------------------

Bi Jiankun (@bijiankun) is a system development engineer in platform R&D at the Weibo Platform and Big Data Department. He graduated from Harbin University of Technology in July 2012 and joined Weibo through campus recruitment, where he has worked ever since. He has been responsible for the development and design review of underlying services such as Weibo feed, likes, and comments. He focuses on the architecture design and optimization of large-scale systems and on guaranteeing service stability at scale. He took part in the first session of the boot camp.

This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
