Scale out or scale up?

Source: Internet
Author: User

For a website, the general feeling is that plans cannot keep up with change. When courting venture capital, we can draw up all kinds of plans that project rapid growth in users and PV/UV by a certain year, because the typical Internet business model still centers on users. In practice, what is supposed to grow often does not, and the growth that does arrive is often unpredictable. That makes it hard to decide how much storage and computing power to provision and when to add more. Idle capacity is wasted money, but insufficient capacity is worse: nobody wants to see the website crash. Cloud computing as a concept has been hyped for a long time, yet in China it is still nowhere to be seen. As technicians, we certainly look forward to a cloud computing platform (in fact, many websites do not have much core data worth worrying about). It would be good to buy storage and computing capacity on demand and no longer have to rush to the IDC in the middle of the night. The reality, however, is that in China we still have to purchase storage devices, servers, and bandwidth and maintain them ourselves.

From the perspective of enterprise IT architecture, a website must be designed for scalability: as the number of users grows, the capacity of the IT system has to be expanded in time. There are usually two approaches, scale up and scale out, which relieve pressure on the database along two different dimensions. Scale up is vertical expansion; scale out is horizontal expansion. Scaling up means upgrading the DB server itself, adding hardware so that faster hardware absorbs the access load. Scaling out means splitting the application data and distributing data that was stored centrally across different physical database servers according to some rule. Scaling up is expensive, and beyond a certain load the hardware may no longer be able to keep up. For a website, being able to expand storage and computing capacity by adding inexpensive machines is an effective route to long-term scalability, and that is the advantage of scale out.
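As a rough illustration of distributing data "according to certain rules", the sketch below routes user IDs to database servers with a stable hash. The shard names, the choice of MD5, and the function names are assumptions made for this example only; they do not come from any particular site's implementation.

```python
import hashlib

# Hypothetical shard names; a real deployment would hold connection settings here.
SHARDS = ["db-0", "db-1", "db-2", "db-3"]


def shard_for(user_id, shards=SHARDS):
    """Map a user ID to one database server using a stable hash."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def group_by_shard(user_ids, shards=SHARDS):
    """Show which IDs land on which server under the rule above."""
    placement = {name: [] for name in shards}
    for uid in user_ids:
        placement[shard_for(uid, shards)].append(uid)
    return placement


if __name__ == "__main__":
    for name, ids in group_by_shard(range(20)).items():
        print(name, ids)
```

One caveat of this simple modulo rule is that changing the number of servers moves most keys to a different server, so sites that expand often tend to prefer a lookup table or a consistent-hashing scheme instead.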

Network storage calls for shared storage. Whether the data is blocks or files, and whether it is accessed locally or remotely, nobody wants to stay in a single-machine processing environment. The data center needs a new kind of service, one that is not just about server, storage, or network products but about system applications and infrastructure resources, so that the systems an enterprise builds can adapt easily to future needs. Scale out is currently a popular model in the storage field, and many vendors have launched corresponding products and services. IBM recently released a new, highly scalable disk storage system. Its technology comes from XIV, which IBM acquired in January this year, and it is designed for today's diverse mix of information, from Web 2.0 to traditional applications such as financial services. This enterprise-class disk product developed by XIV uses a grid-based storage architecture that is easier to manage and scales performance better. With self-tuning, self-healing, and thin provisioning features, it helps reduce the cost and complexity of information storage and supports continuous, fast data access under today's dynamic workloads. According to IBM, the XIV system is built on SATA disks and uses a parallel architecture and caching algorithm that not only eliminate hotspots but also deliver performance far beyond that of FC-based disk systems.
In addition, IBM's SoFS (Scale out File Services) helps enterprises quickly build a highly scalable global clustered NAS system. This eases current data storage challenges, in particular the trouble that insufficient storage network bandwidth causes enterprises.

Many foreign websites, such as Flickr and Digg, use scale out to expand computing capacity while keeping their services stable. Flickr uses sharding: a large number of low-cost database servers are deployed horizontally so that the site responds quickly while handling hundreds of millions of transactions per day. At the same time, expansion stays cheap as the data keeps growing, and adding capacity has little impact on the site.
Shards means fragments. Here, the application data is split horizontally: if there are tens of millions of user records, they can be divided into groups, and each group is placed on a different database server. Digg's approach is also quite interesting. When sharding, it separates heavily accessed data from rarely accessed data. The "hot" data with high traffic is served from better hardware to give a better experience, while for the rest, slightly slower access has little impact on users.
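The hot/cold split described for Digg can be sketched roughly as follows. The article does not say how Digg decides what counts as "hot", so the access counter, the threshold, and the class name here are all assumptions for illustration.

```python
from collections import Counter

HOT_THRESHOLD = 3  # assumed cut-off; a real site would tune this from traffic data


class TieredRouter:
    """Send frequently accessed keys to a 'hot' tier on better hardware."""

    def __init__(self, hot_threshold=HOT_THRESHOLD):
        self.hot_threshold = hot_threshold
        self.hits = Counter()

    def record_access(self, key):
        self.hits[key] += 1

    def tier_for(self, key):
        # Keys never seen before count as zero hits and stay in the cold tier.
        if self.hits[key] >= self.hot_threshold:
            return "hot"
        return "cold"


if __name__ == "__main__":
    router = TieredRouter()
    for _ in range(5):
        router.record_access("story:1")
    router.record_access("story:2")
    print(router.tier_for("story:1"), router.tier_for("story:2"))
```

Within each tier, the data could still be spread over several servers with an ordinary sharding rule; the tier only decides which class of hardware serves the request.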

On the other hand, a scale out strategy is more complex and more costly to maintain than a scale up strategy. To use scale out, you must first solve the hard problems of distributed computing, a significant technical barrier that the scale up approach does not face. In addition, scale out usually requires extensive rewriting of the existing software so that it can run across distributed servers, whereas scale up requires almost no changes to existing software. Finally, scale out always faces the dataset problem: the split data must still stay relatively concentrated within a logical unit on a server and cannot be split arbitrarily without limit.

Several open-source projects abroad address the technical problems of distributed computing; they can serve as references and solve these problems to some extent. At the same time, we need to consider the characteristics of our own site's data. Data for online games, IM, and BSP sites can usually be abstracted into data objects that can be stored independently anywhere, with little correlation between them. Such data suits scale out. Other data, such as the sales information of e-commerce sites, is highly correlated, and queries against it often consume a lot of resources. For transaction-based applications, guaranteeing data integrity matters even more. In these cases scale out is not necessarily suitable. On the whole, scale out is the mainstream choice for websites because it fits the continuous and unpredictable growth of website data, while scale up is better suited to enterprises with highly correlated business data and predictable data growth.
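To see why highly correlated data is awkward under scale out, consider a minimal sketch in which orders are sharded by buyer. The shard count, the record layout, and the function names are assumptions for illustration. A per-user lookup touches one shard, but a question that cuts across users, such as total sales of one product, has to visit every shard.

```python
NUM_SHARDS = 4  # assumed number of database servers


def shard_index(user_id):
    return user_id % NUM_SHARDS


def build_shards(orders):
    """Place each order on the shard of the user who bought it."""
    shards = [[] for _ in range(NUM_SHARDS)]
    for order in orders:
        shards[shard_index(order["user_id"])].append(order)
    return shards


def orders_for_user(shards, user_id):
    """Independent data: only one shard is consulted."""
    shard = shards[shard_index(user_id)]
    rows = [o for o in shard if o["user_id"] == user_id]
    return rows, 1


def revenue_for_product(shards, product):
    """Correlated data: every shard must be scanned and the results merged."""
    total = 0
    touched = 0
    for shard in shards:
        touched += 1
        total += sum(o["amount"] for o in shard if o["product"] == product)
    return total, touched


if __name__ == "__main__":
    orders = [
        {"user_id": 1, "product": "book", "amount": 10},
        {"user_id": 2, "product": "book", "amount": 15},
        {"user_id": 5, "product": "pen", "amount": 2},
    ]
    shards = build_shards(orders)
    print(orders_for_user(shards, 5))
    print(revenue_for_product(shards, "book"))
```

The cost of the second query grows with the number of shards, and a transaction that must update rows on several shards at once raises the integrity concerns mentioned above.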


The original source of this article is unknown.
