Windows Azure Cloud Service Fundamentals: RDBMS Partitioning


Editor's note: This article was written by Shaun Tinline-Jones and Chris Clayton, senior project managers in the AzureCAT Cloud and Enterprise Engineering Group.

The Cloud Service Fundamentals application, also known as "CSFundamentals," shows how to build database-backed Azure services. It includes usage scenarios, implementation architectures, and reusable components that cover logging, configuration, and data access. The code base is designed as a hands-on exploration of best practices for delivering scalable, available services on Azure, based on production deployments by the Windows Azure customer consulting team.

Most companies are currently working on their cloud plans, although the business drivers for specific solutions vary, such as reducing costs or dramatically increasing agility and scale. When a solution aims for "cloud scale," the "vertical scalability" (scale-up) strategy gives way to "horizontal scalability" (scale-out). The former improves capacity by upgrading hardware; the latter increases the number of computers that collectively complete a particular task. A good example of this trade-off is choosing whether to create a Web farm with many servers that serve the same Web site content, or to have a single large machine try to handle the load.

Most people begin implementing horizontal scalability at the compute tier, but overlook the more complex and potentially more important state tiers, such as the relational database management system (RDBMS) and caching. These services are typically I/O-intensive and run as a single instance. One method of implementing horizontal scalability in the state tier is called partitioning (also known as sharding), which divides the RDBMS data logically across multiple databases, each of which usually has the same table structure. For example, an employee information table can be split across three databases, each storing information about employees in different departments.
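The employee example can be sketched in a few lines. This is only an illustration, not code from CSF: in-memory SQLite databases stand in for the three partitions, and the department names, the DEPARTMENT_TO_PARTITION mapping, and the helper functions are all hypothetical.

```python
import sqlite3

# Hypothetical department-to-partition assignment (not taken from CSF).
DEPARTMENT_TO_PARTITION = {"sales": 0, "engineering": 1, "finance": 2}

# Every partition uses the same table structure.
SCHEMA = "CREATE TABLE employee (id TEXT PRIMARY KEY, name TEXT, department TEXT)"


def create_partitions(count):
    """Create `count` independent databases that share one schema."""
    partitions = []
    for _ in range(count):
        conn = sqlite3.connect(":memory:")
        conn.execute(SCHEMA)
        partitions.append(conn)
    return partitions


def insert_employee(partitions, emp_id, name, department):
    """Write the row only to the database that owns the department."""
    conn = partitions[DEPARTMENT_TO_PARTITION[department]]
    conn.execute("INSERT INTO employee VALUES (?, ?, ?)", (emp_id, name, department))
    conn.commit()


def employees_in(partitions, department):
    """Read from a single partition; no other database is touched."""
    conn = partitions[DEPARTMENT_TO_PARTITION[department]]
    rows = conn.execute(
        "SELECT name FROM employee WHERE department = ? ORDER BY name", (department,)
    )
    return [name for (name,) in rows]


partitions = create_partitions(3)
insert_employee(partitions, "e1", "Ana", "sales")
insert_employee(partitions, "e2", "Ben", "engineering")
insert_employee(partitions, "e3", "Cy", "sales")
print(employees_in(partitions, "sales"))  # ['Ana', 'Cy']
```

Because every read and write for a department goes to exactly one database, the load for the table as a whole is spread across all three.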

The advantages of partitioning go well beyond capacity. This article focuses on RDBMS partitioning implemented on the Azure SQL Database platform, primarily for OLTP scenarios. Situations in which a partitioned database structure is advantageous include:

Throttling thresholds or throughput limits are hit too frequently.
The amount of data becomes too large to manage (index rebuilds, backups, and so on).
An unavailable database affects all users, whereas an unavailable partition affects only some of them.
The database is difficult to scale up and down as needed.
Certain business models, such as multi-tenant or software as a service (SaaS), call for it.

When you use a multi-tenant database-as-a-service solution, such as Windows Azure SQL Database, quality of service (QoS) controls typically throttle clients under various conditions. Throttling usually occurs when resource pressure climbs. Partitioning is a key strategy for reducing resource pressure, because it spreads load that would otherwise hit a single server across multiple servers, each containing one partition. For example, assuming the load is evenly distributed, creating five partitions reduces the load on each database to about 20% of the total.

But anything that adds power comes at a cost. Partitioning adds complexity in a number of key areas and therefore requires more careful planning. These key areas include:

Identity columns should be globally unique across all partitions, in case future business requirements call for fewer partitions. If identities are not unique across partitions, conflicts occur when two partitions are merged.
Referential integrity cannot reference rows in other partitions or enforce relationships with them, because they belong to a separate database.
Queries across partitions should be avoided as much as possible, because they require querying each partition and merging the results. Cross-partition "fan-out" queries are expensive not only in terms of performance, but also because they increase the complexity of the partitioning framework that supports them. If you must query across partitions, the usual strategy is to query each partition asynchronously. However, a synchronous approach sometimes gives better control over the size of the result set.
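A fan-out query with asynchronous per-partition calls might look like the following minimal sketch. It is not CSF code: the PARTITIONS dictionary is stand-in data, and query_partition and fan_out are hypothetical names. Each simulated partition answers the same query on its own rows, and the caller merges and re-sorts the results.

```python
import asyncio

# Stand-in data: each list plays the role of one partition's table.
PARTITIONS = {
    "partition_0": [{"user": "ana", "total": 30}, {"user": "ben", "total": 5}],
    "partition_1": [{"user": "cy", "total": 12}],
    "partition_2": [{"user": "dee", "total": 44}, {"user": "eli", "total": 8}],
}


async def query_partition(name, min_total):
    """Simulate one partition answering the query on its own data."""
    await asyncio.sleep(0)  # placeholder for network and database latency
    return [row for row in PARTITIONS[name] if row["total"] >= min_total]


async def fan_out(min_total):
    """Send the query to every partition concurrently, then merge the results."""
    tasks = [query_partition(name, min_total) for name in PARTITIONS]
    per_partition = await asyncio.gather(*tasks)
    merged = [row for rows in per_partition for row in rows]
    # Ordering has to be re-applied after the merge; no single database sees all rows.
    return sorted(merged, key=lambda row: row["total"], reverse=True)


results = asyncio.run(fan_out(min_total=10))
print([row["user"] for row in results])  # ['dee', 'ana', 'cy']
```

Even in this toy form, the merge and re-sort step shows where the extra cost and complexity of a fan-out query come from.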

In most cases, partitioning is a data access layer (DAL) concept that abstracts most of the complexity away from the higher-level application logic.

How you define a "tenant" is one of the most important decisions you make when you build a partitioned architecture. A tenant is the largest unique classification of data that is guaranteed to be on the same partition. Queries restricted to a single tenant are usually faster, because under normal operating conditions they do not need to fan out. Some of the factors that influence the choice of tenant definition are:

The higher-level application code's awareness of the identifier.
The ability to perform most core business transactions at this level.
The ability to avoid throttling during routine day-to-day operations at the tenant level.

To illustrate these concepts more concretely, the Windows Azure customer consulting team built a basic partitioned data access layer (DAL) in the Cloud Service Fundamentals (CSF) package (http://code.msdn.microsoft.com/cloud-service-fundamentals-4ca72649).

In CSF, the tenant is defined as a single user. Factors that contributed to this choice include:

Most core business requirements do not require queries across multiple users.
An unavailable partition affects only a defined group of users, while other users continue to use the system normally.
The number of users on a single partition can be kept to a number that the business can tolerate.

The definition and implementation of the tenant ensure that cross-database transactions are not required. In Figure 1, we call this set of data a shardlet: the transaction boundary of the data model.

Figure 1. Data model transaction boundary

The first time a user connects to the database in a session, a series of simple queries can determine whether any functionality is unavailable because a partition is offline.

To simplify the partitioning approach demonstrated in CSF, we decided to create a partition set with enough storage to meet foreseeable capacity requirements. Because the size is fixed, there is no need to demonstrate growing and shrinking the number of partitions, or operations such as tenant migration. Applying a hash algorithm to the tenant name generates an integer, which is then used to look up the matching range in the partition map. CSF uses a range-based mechanism in which each partition (recorded in the partition map) is assigned a range of these numbers.
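The hash-and-range lookup can be sketched as follows. The article does not say which hash algorithm or map layout CSF uses, so everything here is an assumption for illustration: SHA-256 truncated to 32 bits as the hash, equal-sized contiguous ranges, and the names HASH_SPACE, build_partition_map, partition_for_hash, and find_partition.

```python
import bisect
import hashlib

HASH_SPACE = 2**32  # assumed size of the integer key space


def tenant_hash(tenant_name):
    """Map a tenant name to a stable integer in [0, HASH_SPACE)."""
    digest = hashlib.sha256(tenant_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_partition_map(partition_names):
    """Split the hash space into contiguous ranges, one per partition.

    Returns (starts, names); partition i owns hashes from starts[i] up to
    starts[i + 1] - 1, and the last partition owns the rest of the space.
    """
    size = HASH_SPACE // len(partition_names)
    starts = [i * size for i in range(len(partition_names))]
    return starts, list(partition_names)


def partition_for_hash(partition_map, value):
    """Find the partition whose range contains the hash value."""
    starts, names = partition_map
    return names[bisect.bisect_right(starts, value) - 1]


def find_partition(partition_map, tenant_name):
    """Hash the tenant name, then look up the matching range."""
    return partition_for_hash(partition_map, tenant_hash(tenant_name))


partition_map = build_partition_map(["partition_0", "partition_1", "partition_2"])
print(find_partition(partition_map, "alice@example.com"))
```

A cryptographic digest is used instead of Python's built-in hash() because string hashes from hash() are randomized per process by default, which would send the same tenant to different partitions after a restart.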

If you need to add or remove partitions from the partition set, a tenant must be made unavailable while it is migrated to a new partition. Because of this severe limitation, the partition set should be significantly over-provisioned when it is first created, to mitigate or eliminate the need for complex partition management.

This solution requires the data access layer (DAL) to be aware of the tenant ID so that it can determine the tenant's location in the partition set. If a query includes a partition that is unavailable, the entire query fails. If the DAL is not given a tenant ID, all partitions must be queried, which increases the probability of failure and lowers performance.
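The routing behavior described above can be illustrated with a small sketch. It is not the CSF implementation: the Partition and DataAccessLayer classes, the PartitionUnavailable exception, and the owner lookup table are hypothetical, and in-memory lists stand in for databases.

```python
class PartitionUnavailable(Exception):
    """Raised when a query touches a partition that is offline."""


class Partition:
    def __init__(self, name, rows, online=True):
        self.name = name
        self.rows = rows  # list of (tenant_id, value) tuples
        self.online = online

    def query(self, predicate):
        if not self.online:
            raise PartitionUnavailable(self.name)
        return [row for row in self.rows if predicate(row)]


class DataAccessLayer:
    def __init__(self, partitions, locate):
        self.partitions = partitions  # dict: partition name -> Partition
        self.locate = locate  # callable: tenant_id -> partition name

    def query(self, predicate, tenant_id=None):
        if tenant_id is not None:
            # Tenant-aware: exactly one partition is touched.
            targets = [self.partitions[self.locate(tenant_id)]]
        else:
            # Not tenant-aware: every partition must be queried.
            targets = list(self.partitions.values())
        results = []
        for partition in targets:
            # An exception from any partition aborts the whole query.
            results.extend(partition.query(predicate))
        return results


partitions = {
    "p0": Partition("p0", [("ana", 1), ("ben", 2)]),
    "p1": Partition("p1", [("cy", 3)], online=False),
}
owner = {"ana": "p0", "ben": "p0", "cy": "p1"}
dal = DataAccessLayer(partitions, owner.__getitem__)

print(dal.query(lambda row: row[0] == "ana", tenant_id="ana"))  # [('ana', 1)]
try:
    dal.query(lambda row: row[1] > 0)
except PartitionUnavailable as exc:
    print("fan-out failed on", exc)  # fan-out failed on p1
```

The tenant-aware query succeeds even though p1 is offline, while the query without a tenant ID fails because it has to touch every partition.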

Some preliminary work is currently underway to provide more sample code that demonstrates more advanced partitioning methods. These examples are expected to improve the following areas:

Passive and active partition management.
Global uniqueness and identity management.
Tenant migration between partitions within a partition set.
Expansion and contraction of the partition set.
Improvements to queries that are not tenant-aware.

In summary, the Cloud Service Fundamentals sample code is a great way to start exploring basic partitioning concepts, and partitioning is an important technique for creating "cloud scale" applications.
