1. Overview of Cloud Database
1.1. Cloud computing is the foundation of the rise of cloud databases
1.2. Cloud database concept
A cloud database is a database deployed and virtualized in a cloud computing environment. It is an emerging way of sharing infrastructure that developed against the background of cloud computing. It greatly enhances the storage capacity of the database, eliminates the duplicated provisioning of personnel, hardware, and software, and makes software and hardware upgrades easier. Cloud databases are characterized by high scalability, high availability, multi-tenancy, and effective distribution of supporting resources.
1.3. Cloud database is an ideal choice for personalized data storage needs
Different types of enterprises have different storage requirements, and cloud databases can well meet the personalized storage requirements of different enterprises:
First of all, cloud database can meet the massive data storage needs of large enterprises
Second, the cloud database can meet the low-cost data storage needs of small and medium-sized enterprises
In addition, the cloud database can meet the dynamically changing data storage needs of enterprises
Whether to choose self-built database or cloud database depends on the specific needs of the enterprise
For some large enterprises, self-built databases are usually used at present
Small and medium-sized enterprises with limited financial resources have a relatively limited IT budget. A cloud database, which requires no up-front investment and no maintenance later on, can meet their needs well
1.4. The relationship between cloud databases and other databases
From the point of view of data model, cloud database is not a brand new database technology, but only provides database functions as a service
A cloud database does not have its own data model. It can use the relational model of a relational database (Microsoft SQL Azure and Alibaba Cloud RDS both use the relational model), or the non-relational model of a NoSQL database (the Amazon Dynamo cloud database uses "key/value" storage)
The same company may also provide multiple cloud database services with different data models
When many companies develop cloud databases, the back-end databases directly use various existing relational databases or NoSQL database products
2. Cloud database system architecture
2.1. Overview of UMP System
UMP system is a low-cost and high-performance MySQL cloud database solution
In general, the UMP system architecture design follows these principles:
Maintain a single system external entrance and maintain a single resource pool for the system
Eliminate single points of failure and ensure high service availability
Ensure that the system has good scalability, and can dynamically add or delete computing and storage nodes
Ensure that the resources allocated to users are also elastic and scalable, and resources are isolated from each other to ensure application and data security
2.2. UMP system architecture
The roles in the UMP system include:
Controller server
Proxy server
Agent server
Web console
Log analysis server
Information Statistics Server
Yugong System
The open source components that the UMP system depends on include:
Mnesia
LVS
RabbitMQ
ZooKeeper
Mnesia
Mnesia is a distributed database management system
Mnesia supports transactions and transparent data sharding, uses two-phase locking to implement distributed transactions, and can scale linearly to at least 50 nodes
Mnesia's database schema can be dynamically reconfigured at runtime, and tables can be migrated or replicated to multiple nodes to improve fault tolerance
These features make Mnesia well suited to providing distributed database services when developing cloud databases
RabbitMQ
RabbitMQ is an industrial-grade message queuing product (similar in function to IBM's message queuing product IBM WebSphere MQ), used as message transmission middleware to achieve reliable message transmission
The nodes in the UMP cluster do not need to establish dedicated connections with each other. They communicate by reading and writing queue messages
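The queue-based communication pattern can be sketched as follows. This is only an in-process illustration that uses Python's standard `queue` module as a stand-in for RabbitMQ. The `Broker` class, the queue name `agent-1`, and the message fields are hypothetical and do not come from UMP.

```python
import queue


class Broker:
    """In-process stand-in for a message broker: one named queue per node."""

    def __init__(self):
        self.queues = {}

    def declare(self, name):
        # Create the queue if it does not exist yet.
        self.queues.setdefault(name, queue.Queue())

    def publish(self, name, message):
        self.queues[name].put(message)

    def consume(self, name):
        # Return the next message, or None if the queue is empty.
        try:
            return self.queues[name].get_nowait()
        except queue.Empty:
            return None


broker = Broker()
broker.declare("agent-1")
# The sender never connects to the receiver; it only writes to its queue.
broker.publish("agent-1", {"op": "backup", "instance": "mysql-3306"})
received = broker.consume("agent-1")
```

Because sender and receiver only share a queue name, either side can be restarted or replaced without the other knowing its address.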
ZooKeeper
In the UMP system, ZooKeeper mainly plays three roles:
As a global configuration server
Provide distributed locks (elect a cluster "master")
Monitor all MySQL instances
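The idea of electing a "master" through a distributed lock can be sketched as follows. This is a toy, single-process model and does not use the ZooKeeper API. The `LockService` class, the lock path `/ump/master`, and the node names are assumptions made for illustration.

```python
class LockService:
    """Toy stand-in for a coordination service holding one lock per path."""

    def __init__(self):
        self.holders = {}

    def try_acquire(self, path, owner):
        # The first caller wins; later callers see the existing holder.
        return self.holders.setdefault(path, owner) == owner

    def release_owner(self, owner):
        # Models session expiry: drop every lock the failed owner held.
        self.holders = {p: o for p, o in self.holders.items() if o != owner}


def elect(service, candidates, path="/ump/master"):
    # Each candidate tries to take the lock; whoever holds it is the master.
    for node in candidates:
        if service.try_acquire(path, node):
            return node
    return None


locks = LockService()
first = elect(locks, ["controller-a", "controller-b"])
locks.release_owner(first)  # the current master fails
second = elect(locks, ["controller-b"])
```

When the lock holder fails and its lock is released, a surviving candidate can acquire the lock and take over as master.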
LVS
LVS (Linux Virtual Server) is a virtual server cluster system for Linux
UMP system uses LVS to achieve load balancing within the cluster
LVS cluster adopts IP load balancing technology and content-based request distribution technology
The scheduler is the only entry point of the LVS cluster system. It has good throughput and distributes requests evenly across the servers for execution. It also automatically masks server failures, so that a group of servers forms a single high-performance, highly available virtual server
The structure of the entire server cluster is transparent to the client, and there is no need to modify the client and server programs
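How a scheduler can balance requests while masking failed servers can be sketched as follows. Round-robin is used here only as one simple scheduling policy, and nothing in this sketch is LVS code. The `Scheduler` class and the server names `rs1` to `rs3` are hypothetical.

```python
import itertools


class Scheduler:
    """Round-robin dispatcher that skips servers marked as failed."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        self.healthy.discard(server)

    def pick(self):
        # Look at each server at most once per call.
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers")


lvs = Scheduler(["rs1", "rs2", "rs3"])
lvs.mark_down("rs2")
picks = [lvs.pick() for _ in range(4)]
```

The client only ever talks to the scheduler, so the failure of `rs2` is invisible to it: requests simply alternate between the remaining servers.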
Controller server
The Controller server provides various management services to the UMP cluster to realize cluster member management, metadata storage, MySQL instance management, fault recovery, backup, migration, capacity expansion, etc.
A set of Mnesia distributed database services runs on the Controller server and stores various system metadata, including cluster members, user configuration and status information, and the mapping between user names and back-end MySQL instance addresses (the "routing table")
When other server components need to obtain user data, they can send a request to the Controller server to obtain data
To avoid a single point of failure and ensure high availability, multiple Controller servers are deployed in the UMP system. The distributed lock function of ZooKeeper is then used to elect a "master", which is responsible for scheduling and monitoring the various system tasks
Web console
Web console provides users with system management interface
Proxy server
The Proxy server gives users access to the MySQL database. It fully implements the MySQL protocol, so users can connect to it with an existing MySQL client. The Proxy server obtains the user's authentication information, resource quota limits (such as QPS, IOPS (I/O operations per second), and the maximum number of connections), and the address of the back-end MySQL instance. It then forwards the user's SQL query requests to the corresponding MySQL instance. In addition to basic data routing, the Proxy server also implements many important functions, mainly masking MySQL instance failures, read/write splitting, database and table sharding, resource isolation, and user access logging.
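The routing, quota, and read/write splitting steps described above can be sketched as follows. This is a minimal illustration, not the UMP implementation. The `Route` and `Proxy` classes, the addresses, the quota value, and the rule that any statement starting with `SELECT` counts as a read are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Route:
    master: str
    replicas: list
    max_connections: int


# Hypothetical routing table; in UMP this mapping is stored by the Controller.
ROUTES = {"alice": Route("10.0.0.1:3306", ["10.0.0.2:3306"], max_connections=2)}


class Proxy:
    def __init__(self, routes):
        self.routes = routes
        self.open_connections = {}

    def connect(self, user):
        # Enforce the per-user connection quota before admitting the client.
        used = self.open_connections.get(user, 0)
        if used >= self.routes[user].max_connections:
            raise RuntimeError("connection quota exceeded for " + user)
        self.open_connections[user] = used + 1

    def backend_for(self, user, sql):
        # Naive read/write split: SELECTs go to a replica, the rest to the master.
        route = self.routes[user]
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and route.replicas:
            return route.replicas[0]
        return route.master


proxy = Proxy(ROUTES)
proxy.connect("alice")
read_target = proxy.backend_for("alice", "SELECT * FROM t")
write_target = proxy.backend_for("alice", "UPDATE t SET x = 1")
```

A production proxy would parse statements properly (for example, a `SELECT ... FOR UPDATE` must go to the master) and would also track QPS and IOPS, which this sketch leaves out.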
Agent server
The Agent server is deployed on each machine that runs MySQL processes. It manages the MySQL instances on that physical machine, performing operations such as master-slave switchover, creation, deletion, backup, and migration. It is also responsible for collecting and analyzing statistics from the MySQL processes, the slow query log, and the bin-log
Log analysis server
The log analysis server stores and analyzes user access logs from the Proxy server, and supports real-time query of slow logs and statistical reports over a period of time
Information Statistics Server
The information statistics server periodically collects the number of user connections, QPS values, and the process status of MySQL instances, and uses RRDtool to compute statistics. The results can be displayed visually on the web interface and can also serve as the basis for future resource allocation and automated migration of MySQL instances
Yugong System
Yugong System is a tool that combines full replication with incremental replication based on bin-log analysis, which allows dynamic scale-out, scale-in, and migration without downtime
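The combination of full and incremental replication can be sketched as follows. This is a simplified model, not Yugong's actual code. The `migrate` function, the event tuple layout `(position, op, key, value)`, and the use of a dictionary as the table are assumptions made for illustration.

```python
def migrate(source_rows, binlog, snapshot_position):
    """Copy a snapshot, then replay change events recorded after it."""
    # Step 1: full replication of the data as of the snapshot.
    target = dict(source_rows)
    # Step 2: incremental replication from the change log.
    for position, op, key, value in binlog:
        if position <= snapshot_position:
            continue  # already reflected in the snapshot
        if op == "delete":
            target.pop(key, None)
        else:  # "insert" or "update"
            target[key] = value
    return target


snapshot = {1: "a", 2: "b"}
events = [
    (5, "update", 1, "a0"),   # recorded before the snapshot, skipped
    (11, "update", 2, "b2"),
    (12, "insert", 3, "c"),
    (13, "delete", 1, None),
]
migrated = migrate(snapshot, events, snapshot_position=10)
```

Because the source keeps serving writes while the snapshot is copied, replaying the later log events is what lets the target catch up without stopping the service.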