Swift is the object storage component of the OpenStack cloud platform. It provides highly available, distributed, durable storage for large objects. In addition, Swift can run on a range of inexpensive storage hardware while still providing secure and reliable storage services.
Q: Why use Swift? What advantages does it have?
1: Data durability
Durability is an important metric for a storage system. It describes how unlikely it is that a user's data is lost once it has been stored in the system. To prevent data loss and improve durability, Swift keeps redundant copies (replicas) of the data; the default replica count is 3.
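As a rough illustration of why extra replicas help, the sketch below estimates the chance that every replica of an object is lost. It assumes, purely for illustration, that each replica is lost independently with the same probability p. Real failures are often correlated, and the function name and the value of p are hypothetical, not taken from Swift.

```python
def loss_probability(p: float, replicas: int = 3) -> float:
    """Probability that all replicas are lost, assuming independent failures.

    p is a hypothetical probability of losing any single replica.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return p ** replicas


if __name__ == "__main__":
    # Compare one, two, and three replicas for the same per-replica risk.
    for count in (1, 2, 3):
        print(count, loss_probability(0.01, count))
```

Under this simplified model, each additional replica multiplies the loss probability by p again, which is why a small replica count already improves durability considerably.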
2: Architectural symmetry
Symmetry means that, in Swift's architecture, every server node has the same function and role, in contrast to the master-slave architecture used by HDFS (Hadoop Distributed File System). In a master-slave architecture the master node often carries a heavy load, which makes maintenance harder, and once the master node goes down the service becomes unreliable. The benefit of symmetry is that the system is simple to maintain and the failure of a single node does not affect the service.
3: No single point of failure
Because Swift's design is symmetrical, every node has exactly the same status, so no node is a single point of failure. In other words, the failure of one node does not make the whole system unavailable. In addition, Swift stores metadata (descriptive information about the data, such as owner, permissions, and type) the same way it stores object files: as multiple replicas distributed uniformly at random across the cluster.
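To make the idea of uniform, replicated placement more concrete, here is a minimal sketch of hash-based placement. It is a simplification for illustration and not Swift's actual placement code: the node names, the partition count, the use of SHA-256, and the "next nodes in the list" replica rule are all assumptions.

```python
import hashlib
from typing import List

PART_POWER = 8   # assumption: 2**8 = 256 partitions
REPLICAS = 3     # matches the default replica count mentioned above
NODES = ["node1", "node2", "node3", "node4", "node5"]  # hypothetical names


def partition_for(path: str) -> int:
    """Map an object path to a partition using the top bits of its hash."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - PART_POWER)


def nodes_for(path: str) -> List[str]:
    """Pick REPLICAS distinct nodes for the partition that holds this path."""
    start = partition_for(path) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(REPLICAS)]


if __name__ == "__main__":
    print(nodes_for("/account/container/object"))
```

Because the placement is a pure function of the path, any node can compute where an object lives, which fits the point that no node plays a special role.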
4: Scalability
When a new node is added to Swift, it brings:
Increased storage capacity
Higher system performance
Because the architecture is symmetric, expanding the system is relatively straightforward. However, a newly added node holds no data at first, so to put the new node on an equal footing with the existing ones, data already stored in Swift has to be migrated to it.
One resulting problem is that as the amount of stored data grows, more data has to be migrated, which increases both the difficulty and the duration of the migration. This is one of the costs of expanding a Swift system.
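The migration cost described above can be sketched with a toy partition table. This is an illustrative model, not Swift's rebalancing algorithm: the round-robin starting layout, the node names, and the rule for choosing which partitions move are assumptions.

```python
from collections import Counter
from typing import Dict, List


def initial_assignment(partitions: int, nodes: List[str]) -> Dict[int, str]:
    """Spread partitions evenly over the nodes, round-robin."""
    return {p: nodes[p % len(nodes)] for p in range(partitions)}


def add_node(assignment: Dict[int, str], new_node: str) -> int:
    """Move partitions to new_node until it holds an equal share.

    Returns the number of partitions that had to migrate.
    """
    node_count = len(set(assignment.values())) + 1
    target = len(assignment) // node_count   # fair share for the new node
    load = Counter(assignment.values())
    moved = 0
    for part in sorted(assignment):
        if moved == target:
            break
        owner = assignment[part]
        # Only take partitions from nodes that still hold more than their share.
        if load[owner] > target:
            assignment[part] = new_node
            load[owner] -= 1
            moved += 1
    return moved


if __name__ == "__main__":
    table = initial_assignment(256, ["n1", "n2", "n3", "n4"])
    print(add_node(table, "n5"), "of", len(table), "partitions moved")
```

In this model roughly one partition in every (n + 1) moves when the cluster grows from n to n + 1 nodes, so the number of bytes to copy grows with the amount of data already stored.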
5: Simple and reliable
Swift follows the principle of simplicity. Its architecture, code, and algorithms are easy to understand, which makes it highly reliable and easy to maintain.
The architecture of Swift
There are 3 main types of servers in Swift:
1. Authentication node, which provides identity authentication
If Swift is used on its own, it can use its built-in authentication service, which is placed on the authentication node. If Swift is deployed as part of OpenStack, it uses the authentication service provided by Keystone; in that case, the authentication node is not part of Swift.
2. Proxy node, which forwards client requests to Swift and runs the process that serves the Swift API
The proxy server provides a RESTful API, which allows developers to build their own applications on top of the Swift API.
3. Storage nodes, which turn disk storage into storage services within Swift. Depending on what is being stored, the storage services running on a storage node fall into the following three types:
Object Server: The object service stores objects (that is, the data users want to store) as binary large objects, directly using the storage capabilities of the file system. An object's metadata is stored in the file system's extended attributes, so the object server requires an underlying file system that supports extended attributes.
Container Server: The container service mainly handles the list of objects. (A container is a storage compartment for objects; it can be thought of as a folder, except that containers cannot be nested.) The mapping from container to object is one-way: the container service does not know which container a given object resides in, but it knows which objects are stored in each container. This information is stored as files and replicated the same way as object data, as multiple replicas distributed uniformly at random. The only difference is that it is stored in SQLite format (SQLite is a lightweight, embedded relational database that uses very few resources and is common in embedded systems).
Account Server: The account service mainly handles the list of containers; apart from that, it works the same way as the container service.
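To picture how a container's object list can live in SQLite, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, column names, and class name are hypothetical and much simpler than whatever schema Swift really uses.

```python
import sqlite3
from typing import List


class ContainerListing:
    """Toy object listing for a single container, backed by SQLite."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(db_path)
        # Hypothetical schema: one row per object stored in the container.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS object "
            "(name TEXT PRIMARY KEY, size INTEGER, etag TEXT)"
        )

    def record_object(self, name: str, size: int, etag: str) -> None:
        """Add or update the entry for one object."""
        self.conn.execute(
            "INSERT OR REPLACE INTO object (name, size, etag) VALUES (?, ?, ?)",
            (name, size, etag),
        )
        self.conn.commit()

    def list_objects(self, prefix: str = "") -> List[str]:
        """Return object names sorted by name, optionally filtered by prefix."""
        rows = self.conn.execute(
            "SELECT name FROM object WHERE substr(name, 1, ?) = ? ORDER BY name",
            (len(prefix), prefix),
        )
        return [name for (name,) in rows]


if __name__ == "__main__":
    listing = ContainerListing()
    listing.record_object("photos/a.jpg", 5, "etag-a")
    listing.record_object("doc.txt", 1, "etag-d")
    print(listing.list_objects())
```

Note that the lookup only goes from container to objects, matching the one-way mapping described above. An account listing could be sketched the same way with containers as rows.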
A simple Swift deployment runs the object, container, and account services together on every storage node. If you deploy this way and give every node the same hardware configuration, all storage nodes have equal status.
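As an example of what building on the proxy server's RESTful API can look like, the sketch below constructs an object upload request with Python's standard library. The storage URL, token, container, and object names are placeholders, and the helper name is hypothetical. The request is only built, not sent, so no running cluster is assumed.

```python
import urllib.request
from urllib.parse import quote


def build_put_request(storage_url: str, token: str, container: str,
                      name: str, data: bytes) -> urllib.request.Request:
    """Build (but do not send) a PUT request that uploads one object.

    storage_url is the account URL handed out by the authentication
    service, for example https://swift.example.com/v1/AUTH_test (placeholder).
    """
    url = "{}/{}/{}".format(
        storage_url.rstrip("/"),
        quote(container, safe=""),  # container names cannot contain "/"
        quote(name),                # object names may contain "/"
    )
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"X-Auth-Token": token},
    )


if __name__ == "__main__":
    req = build_put_request(
        "https://swift.example.com/v1/AUTH_test", "secret-token",
        "photos", "2024/cat 1.jpg", b"...image bytes...",
    )
    print(req.get_method(), req.full_url)
```

Against a real deployment, the request could then be sent with urllib.request.urlopen(req), using a token obtained from the authentication service.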
Swift Fault Handling
The real difficulty for Swift is data inconsistency caused by data corruption or physical hardware failure.
Storage systems generally use fully uniform random placement of multiple replicas to avoid losing data, but this also makes it possible for the replicas to become inconsistent with one another. For example, suppose a file has 3 replicas, stored on servers A, B, and C. If server A suffers a sudden power outage or another unexpected failure, then after A restarts, its data can differ from the data on B and C (for instance, if writes arrived while A was down).
Swift mainly relies on the following three services to keep data consistent when failures occur:
Auditor: The audit service. Auditors repeatedly check the integrity of accounts, containers, and objects. As soon as a file's data is found to be incomplete, the file is quarantined. The auditor then notifies the replicator, which replaces the file with a copy from one of the remaining replicas. If another kind of error occurs, for example all replicas are unavailable, the error is recorded in the log.
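The check-and-quarantine step can be sketched as follows. This is a minimal illustration, not Swift's auditor: it assumes the expected checksum is known from somewhere (for example stored metadata), uses SHA-256 as the hash, and the directory layout and function names are hypothetical.

```python
import hashlib
import os
import shutil
import tempfile


def checksum(path: str) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def audit(path: str, expected: str, quarantine_dir: str) -> bool:
    """Return True if the file is intact; otherwise move it to quarantine."""
    if checksum(path) == expected:
        return True
    os.makedirs(quarantine_dir, exist_ok=True)
    shutil.move(path, os.path.join(quarantine_dir, os.path.basename(path)))
    return False


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    obj = os.path.join(workdir, "obj.data")
    with open(obj, "wb") as f:
        f.write(b"original contents")
    expected = checksum(obj)
    with open(obj, "wb") as f:
        f.write(b"corrupted contents")   # simulate on-disk corruption
    print("intact:", audit(obj, expected, os.path.join(workdir, "quarantined")))
```

Moving the damaged file out of the way, rather than deleting it, keeps it from being served while leaving it available for inspection.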
Updater: The main job of the updater is to perform deferred updates. Updates are deferred mainly because of a failure or exception while user data is being uploaded. Under normal circumstances, the update order is as follows. After the user's upload succeeds, the object server notifies the container server that a new object has been added to a container. The container server receives the notification, updates its object list, and notifies the account server. The account server receives the notification and updates its container list. This is the ideal update order.
In practice, an update can fail for various reasons, such as a network interruption, high system load, or disks waiting on writes. When an update fails, the update operation is added to an update queue, and the updater later processes these failed updates.
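The queue-and-retry behaviour described above can be sketched like this. It is an in-memory illustration only: the class name, the shape of an update, and the use of OSError to signal a failed notification are assumptions, and a real system would persist the queue.

```python
from collections import deque
from typing import Callable, Deque, Tuple

Update = Tuple[str, str]  # (container, object name); a hypothetical shape


class DeferredUpdates:
    """Send listing updates immediately, queueing the ones that fail."""

    def __init__(self, send: Callable[[str, str], None]) -> None:
        self.send = send                      # notifies the container server
        self.pending: Deque[Update] = deque()

    def notify(self, container: str, name: str) -> None:
        """Try the update now; put it on the queue if the send fails."""
        try:
            self.send(container, name)
        except OSError:
            self.pending.append((container, name))

    def run_updater(self) -> int:
        """Retry every queued update once; return how many succeeded."""
        done = 0
        for _ in range(len(self.pending)):
            container, name = self.pending.popleft()
            try:
                self.send(container, name)
                done += 1
            except OSError:
                # Still failing: keep it for the next updater pass.
                self.pending.append((container, name))
        return done
```

A periodic job would call run_updater() at intervals, so an upload does not have to wait for the container server to become reachable again.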
Replicator: The replicator is responsible for replacing corrupted data with an intact copy. It typically scans the hash values of local files at regular intervals and compares them with the hashes of the remote replicas; if they differ, it performs the corresponding copy and replace operation.
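The hash comparison can be sketched with dictionaries standing in for the servers' disks. This is a simplified illustration, not Swift's replicator: it assumes the local copy is the intact one (for example because it has already passed an audit), uses SHA-256, and all names are hypothetical.

```python
import hashlib
from typing import Dict, List

Store = Dict[str, bytes]  # object name -> contents; stands in for one server


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of some object contents."""
    return hashlib.sha256(data).hexdigest()


def replicate(local: Store, remotes: List[Store]) -> int:
    """Push the local copy to every remote whose hash differs.

    Returns the number of copies that were replaced or created.
    """
    pushed = 0
    for name, data in local.items():
        local_hash = digest(data)
        for remote in remotes:
            if name not in remote or digest(remote[name]) != local_hash:
                remote[name] = data
                pushed += 1
    return pushed


if __name__ == "__main__":
    server_a = {"obj": b"current data"}
    server_b = {"obj": b"stale data"}
    server_c = {}   # lost its copy entirely
    print(replicate(server_a, [server_b, server_c]), "copies repaired")
```

Comparing hashes first means that only the copies that actually differ are transferred, so a pass over consistent replicas moves no data.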
This article is from the "11097124" blog, please be sure to keep this source http://11107124.blog.51cto.com/11097124/1962076