People often talk about cloud storage, but without seeing an actual diagram it is hard to picture what cloud storage looks like. Here is a simple architecture diagram of a cloud storage system.
The orange storage nodes are responsible for storing files. The blue control nodes hold the file index and monitor the balance of capacity and load across the storage nodes. Together, the two kinds of node make up the cloud storage system. Both storage nodes and control nodes are ordinary servers; the storage nodes simply carry more hard disks. A storage node server does not need RAID support, as long as Linux can be installed on it. A control node, in order to protect its data, needs simple RAID 01 support.

Each storage node and control node has at least two network cards (gigabit or 10-gigabit both work, and some products also support InfiniBand). One card faces inward and handles communication between storage nodes and control nodes as well as data migration. The other faces outward and handles data reads and writes from external applications. With one gigabit card, reads can reach about 100 MB/s and writes about 70 MB/s. If one external network card is not enough, you can install several more.
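The division of labour between control node and storage nodes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the "least full node wins" placement rule, and the GB units are all hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    """One storage server; capacity figures are in GB (illustrative units)."""
    name: str
    capacity_gb: int
    used_gb: int = 0

    @property
    def fill_ratio(self) -> float:
        return self.used_gb / self.capacity_gb


@dataclass
class ControlNode:
    """Keeps the file index and picks a storage node for each new file."""
    nodes: list[StorageNode]
    index: dict[str, str] = field(default_factory=dict)  # file path -> node name

    def place(self, path: str, size_gb: int) -> str:
        # Only nodes with enough free space are candidates.
        candidates = [n for n in self.nodes if n.capacity_gb - n.used_gb >= size_gb]
        if not candidates:
            raise RuntimeError("no storage node has enough free space")
        # Balance capacity: choose the node that is proportionally least full
        # (an assumed policy, chosen here only because it is easy to read).
        target = min(candidates, key=lambda n: n.fill_ratio)
        target.used_gb += size_gb
        self.index[path] = target.name
        return target.name

    def locate(self, path: str) -> str:
        # The index answers "which storage node holds this file?"
        return self.index[path]


control = ControlNode([StorageNode("s1", 1000, 900), StorageNode("s2", 1000, 100)])
print(control.place("/video/a.mp4", 50))  # prints "s2", the emptier node
```

A real control node would also track load, not just capacity, and would migrate existing data over the internal network card when nodes drift out of balance.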
The gray squares at the top (NFS, HTTP, FTP, WebDAV) are the application side, and the gray square in the upper left corner (Mgmt console) is a PC used to manage the storage nodes in the cloud storage system. To the application side, cloud storage is just a file system, and it generally supports standard protocols such as NFS, HTTP, FTP, and WebDAV, so it is easy to connect existing systems to cloud storage without changing the applications.
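To illustrate why the application side does not need to change: if the cloud storage is exported over NFS and mounted like any other file system, ordinary file I/O keeps working. The sketch below assumes a hypothetical mount point and made-up function names; a temporary directory stands in for the mount so the example runs anywhere.

```python
import tempfile
from pathlib import Path


def save_report(mount_point: Path, name: str, text: str) -> Path:
    """Write a file with ordinary file I/O; nothing here is storage-specific."""
    target = mount_point / name
    target.write_text(text, encoding="utf-8")
    return target


def load_report(mount_point: Path, name: str) -> str:
    """Read the file back the same way an existing application would."""
    return (mount_point / name).read_text(encoding="utf-8")


# In production, mount_point might be an NFS mount such as /mnt/cloud
# (a hypothetical path); a temporary directory stands in for it here.
with tempfile.TemporaryDirectory() as tmp:
    mount = Path(tmp)
    save_report(mount, "report.txt", "hello")
    print(load_report(mount, "report.txt"))  # prints "hello"
```

The same functions work whether the path points at a local disk, a disk array, or a mounted cloud storage export, which is the sense in which old systems combine easily with cloud storage.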
Cloud storage is not meant to replace existing disk arrays. Rather, it is a new form of storage system created to cope with rapidly growing data volumes and bandwidth demands, so it is typically designed with the following three points in mind:
1. Capacity and bandwidth are simple to expand
Expansion requires no downtime. The capacity of a new storage node is automatically added to the existing storage pool, with no cumbersome configuration.
2. Bandwidth grows linearly
Many customers who adopt cloud storage are planning for future bandwidth growth, so the quality of a product's design makes a big difference. Some products reach saturation at little more than 10 nodes, which hampers later bandwidth expansion. Be sure to clarify this beforehand; otherwise you may discover that the system cannot meet demand only after buying hundreds of TB, when it is too late for regrets.
3. Management is easy
Never mind Google, with its 50,000 storage servers: plenty of domestic customers already have more than 500 storage devices. Without cloud storage to unify management, looking after 500 storage devices is a huge job, and a single slip can crash applications, so the move to cloud storage is an inevitable trend. Once users migrate their applications to cloud storage, they manage a single storage system, not 500 or even 50,000. Managing one system leaves little room for mistakes; managing 50,000 units separately is hard.
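The first two design points can be made concrete with a small model. It is an idealized sketch, not a description of any product: it assumes one external gigabit card per storage node at the roughly 100 MB/s read rate quoted earlier, clients spread evenly across nodes, and a hypothetical `saturation_nodes` parameter to mimic a design that stops scaling.

```python
from dataclasses import dataclass, field
from typing import Optional

# Read throughput of one gigabit card, as quoted earlier in the article.
READ_MB_S_PER_NODE = 100


@dataclass
class StoragePool:
    """Applications see one pool, however many storage nodes back it."""
    node_capacities_tb: list[float] = field(default_factory=list)

    def add_node(self, capacity_tb: float) -> None:
        # Expansion: the new node's capacity simply joins the existing pool.
        self.node_capacities_tb.append(capacity_tb)

    @property
    def capacity_tb(self) -> float:
        return sum(self.node_capacities_tb)

    def read_bandwidth_mb_s(self, saturation_nodes: Optional[int] = None) -> int:
        # Ideal design: aggregate bandwidth scales with the node count.
        # A design that saturates gains nothing past saturation_nodes.
        n = len(self.node_capacities_tb)
        if saturation_nodes is not None:
            n = min(n, saturation_nodes)
        return n * READ_MB_S_PER_NODE


pool = StoragePool()
for _ in range(20):
    pool.add_node(10.0)  # twenty nodes of 10 TB each

print(pool.capacity_tb)                               # 200.0
print(pool.read_bandwidth_mb_s())                     # 2000 (linear scaling)
print(pool.read_bandwidth_mb_s(saturation_nodes=10))  # 1000 (saturates at 10 nodes)
```

With 20 nodes, the design that saturates at 10 delivers half the bandwidth of the linear one even though both offer the same 200 TB, which is why the scaling limit is worth asking about before buying.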
This is a pure software cloud storage solution. Some products are hardware solutions that put the orange storage node and the blue control node in a single device. The disadvantages are that the cost is high and customers cannot freely choose hardware specifications to suit their own needs, such as read/write performance, network cards, and hard disk capacity. I personally think the software solution will be the final winner, because cloud storage users care a great deal about cost and do not want to give up their existing hardware investment, and hardware solutions cannot satisfy either requirement.