In enterprise IT architecture, and especially for Web 2.0 sites, scalability must be considered: the ability to expand IT systems in a timely manner as the number of users grows. There are usually two ways to address this, scale up and scale out, and the two approaches relieve database pressure along two different dimensions.
Scale out (horizontal expansion): literally, scaling outward. Scale out increases computing power by adding processors in the form of additional independent servers. Enterprises can add servers and storage as requirements demand, relying on multiple servers and storage units working together, with load balancing and fault tolerance, to improve computing capacity and reliability.
Scale up (vertical expansion): adding processors and other computing resources to a large back-end server to meet application performance requirements. Larger, more powerful servers are also more expensive, and often cost more than deploying a large number of relatively inexpensive servers to achieve the same performance improvement. The IBM zSeries mainframe is representative of this approach. There is also a limit to how far the performance of a single server can be improved.
Disadvantages of scale up
For now, generally speaking, scale out is cheaper than scale up. The major search engines generally use a scale-out architecture built from commodity x86 servers running Linux, and most other mainstream web services, such as Yahoo, Taobao, and Sina, use the same architecture. Some ERP vendors, such as Ufida, have also begun to abandon expensive RISC architectures in the past two years.
From the point of view of data storage, a scale-up storage system used to be almost the best and only option for building a large, growing data center. But it also means that adding expensive storage devices requires interrupting the business, and the system becomes harder to manage. Of course, both the front-end processing capacity and the number of back-end disks in a scale-up storage system can be expanded, but in general the expansion happens within a fixed storage architecture. Past a certain point it becomes difficult to expand further, especially the number of front-end controllers. The result is a performance bottleneck: the back-end disks keep growing while the front-end controllers cannot be extended.
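The bottleneck described above can be illustrated with a toy model. This is a minimal sketch, not a measurement: the per-controller and per-disk throughput figures and the function names are hypothetical, chosen only to show that deliverable throughput is capped by whichever tier saturates first.

```python
# Toy model: all figures are hypothetical, for illustration only.
CONTROLLER_MBPS = 1000   # assumed throughput one front-end controller can serve
DISK_MBPS = 100          # assumed throughput one back-end disk can deliver


def deliverable_throughput(controllers: int, disks: int) -> int:
    """Throughput is limited by whichever tier saturates first."""
    return min(controllers * CONTROLLER_MBPS, disks * DISK_MBPS)


def scale_up_growth(controllers: int, disk_counts: list[int]) -> list[int]:
    """Fixed number of controllers; only the back-end disks grow."""
    return [deliverable_throughput(controllers, d) for d in disk_counts]


if __name__ == "__main__":
    disk_counts = [10, 20, 30, 40]
    # Two controllers stay fixed while disks are added behind them.
    for disks, mbps in zip(disk_counts, scale_up_growth(2, disk_counts)):
        print(f"{disks} disks -> {mbps} MB/s")
```

Under these assumed numbers, throughput stops growing once the disks can deliver more than the two controllers can serve, so adding further disks adds capacity but no performance.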
Scale out will be the future enterprise architecture
Now, with scale-out storage systems, everything seems much simpler. Deployment is greatly simplified and the storage structure is easy to understand. In addition, a scale-out architecture expands both dimensions at once: performance and capacity grow together, without affecting existing use. Users buy storage on demand; once capacity runs short, they purchase another storage unit and add it to the existing system.
So future enterprise architectures should use scale out to achieve scalability, while leaving users the option of adding servers later to increase system capacity as needed.
From a storage standpoint, future enterprises need an architecture that can cope with the enormous challenges posed by the proliferation of unstructured data. Such an architecture is NAS-based, can add multiple nodes that work in parallel and are managed as a single node, and allows throughput and capacity to be expanded independently. Under a single system image, these systems can scale to multiple petabytes of storage, making them ideal consolidation platforms.
At the same time, a scale-out storage pool can virtualize the underlying storage, creating resources that can be adjusted dynamically as business requirements change; bandwidth, processing power, and storage capacity can each be adjusted individually and extended in real time. This pooled approach to resources is critical to the availability and reliability of a continually growing enterprise infrastructure. Scale-out storage helps minimize management costs, data center space, power, and cooling requirements. Shared resource pools provide higher utilization and greatly reduce waste. The economic value of scale-out storage is reflected in improved scalability, faster configuration, improved performance, simplified management, and increased storage utilization.
A scale-out storage system overcomes the limits of physical racks and modules. It can be operated as a single system, with performance and capacity upgraded independently by adding controllers or capacity nodes, which increases the return on IT investment. Its linear scalability also protects the long-term cost-effectiveness of the business. It removes the drawbacks of traditional single and modular systems, which require management at the physical disk level, manual data layout, and performance tuning. A scale-out platform can both improve performance and reduce operating costs, so that a single system under a single global namespace can be extended simply to a capacity of multiple petabytes, making it an ideal storage platform for managing soaring data volumes.
In 2011, scale-out storage products and solutions are emerging as enterprise IT users demand greater scalability, flexibility, and performance from their storage systems. EMC, HDS, and NetApp have already introduced more scalable x86-based solutions: EMC's VMAX platform (SAN) and Isilon products (multiprotocol); HDS's USP-V (SAN), VSP (SAN), and HNAS (BlueArc-based NAS system); and NetApp's GX (now ONTAP 8 Cluster-Mode) NAS system. The domestic manufacturer Huawei Symantec (HS) also launched its own scale-out product, the Oceanspace 5000, last year. IBM's DS8000 series also has scale-out characteristics.
There are also many scale-out storage solutions in open source software, such as the cluster file systems Lustre, PVFS2, and GlusterFS, and the distributed file systems HDFS, KFS, MFS, FastDFS, and TaobaoFS.
The bottleneck of horizontal scaling
First, implementing a scale-out architecture means facing the problems of distributed computing. Many solutions are built on distributed computing, such as Hadoop and MapR, but performance optimization for distributed computing remains one of the bottlenecks that has kept many enterprises from adopting it.
Second, the scale-out approach also requires substantial rewriting of existing software so that the system can run on distributed servers (the scale-up approach requires almost no changes to existing software). This step is often a nightmare for developers in any company.
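To give a sense of the rewrite involved, here is a minimal sketch under simplifying assumptions. The order records, the `partition` helper, and the in-process "servers" are all hypothetical; a real system would move the partitions and partial results over a network. The single-server version is one loop, while the scale-out version has to partition the data, compute partial results per server, and merge them.

```python
from collections import Counter

# Hypothetical records: (user_id, amount). Purely illustrative data.
ORDERS = [("alice", 30), ("bob", 20), ("alice", 5), ("carol", 45), ("bob", 10)]


def totals_single_server(orders):
    """Scale-up style: one process sees all the data."""
    totals = Counter()
    for user, amount in orders:
        totals[user] += amount
    return dict(totals)


def partition(orders, num_servers):
    """Split the records into one chunk per (simulated) server, round-robin."""
    return [orders[i::num_servers] for i in range(num_servers)]


def totals_scale_out(orders, num_servers):
    """Scale-out style: per-server partial totals, then a merge step."""
    partials = [Counter() for _ in range(num_servers)]
    for server, chunk in enumerate(partition(orders, num_servers)):
        for user, amount in chunk:
            partials[server][user] += amount   # would run on that server
    merged = Counter()
    for partial in partials:                   # would run on a coordinator
        merged.update(partial)                 # Counter.update adds counts
    return dict(merged)
```

Both functions return the same totals, but the second one introduces partitioning and merging logic that the original program never needed, which is the kind of change the paragraph above refers to.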
Furthermore, the scale-out approach always faces the problem of data centralization: data that has been split across servers is still relatively concentrated in the logical system and cannot be split arbitrarily and without limit. If a lot of logic is placed on a traditional database server, that database server will cost the system its ability to scale out. Therefore, to preserve the ability to scale out, the database should handle only essential data writes and unavoidable queries, which is why MongoDB, Redis, and other new NoSQL solutions are increasingly popular.
For the technical problems of distributed computing, there are a number of open source projects abroad that we can learn from and use to address them to some extent. At the same time, we also need to take into account the characteristics of our own site's data. For online games, IM, and BSP data, each user can usually be abstracted as a data object that can be stored independently anywhere, with no relationships between objects; this situation is well suited to scale out. But other data, such as sales information on an e-commerce site, is heavily interrelated, and queries often consume a lot of resources. There are also transactional applications where guaranteeing data integrity matters more; in these cases, scale out is not necessarily appropriate. Overall, scale out is the mainstream choice for Web 2.0 sites, matching their main need: data that grows continuously and unpredictably. Scale up is more suitable for enterprises whose business data is strongly interrelated and whose data growth is predictable.
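The per-user case described above can be sketched as hash-based sharding. This is an illustrative assumption rather than the method of any site named here: the shard count, the `shard_for` function, and the in-memory dictionaries standing in for servers are all hypothetical.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of servers

# In-memory dictionaries stand in for independent database servers.
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user to a shard with a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def save_user(user_id: str, profile: dict) -> None:
    """Store the whole user object on the single shard that owns it."""
    shards[shard_for(user_id)][user_id] = profile


def load_user(user_id: str):
    """Look up a user by visiting only the owning shard."""
    return shards[shard_for(user_id)].get(user_id)
```

Because each user object is self-contained, a lookup touches exactly one shard. A query that relates many users' records to each other, as in the e-commerce example, would have to visit every shard, which is why strongly interrelated data fits this scheme poorly.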
(Responsible editor: The good of the Legacy)