Swift is not a file system or a real-time data storage system, but an object store for long-term storage of largely static data. Stored data can be retrieved, updated, and deleted as needed. Swift is best suited for data such as virtual machine images, pictures, email, and archive backups.
Swift does not use RAID, and it has no central controller or master node. Instead it relies on consistent hashing and software-level data redundancy, sacrificing a degree of data consistency to achieve high availability and scalability. It supports multi-tenancy, containers, and object storage. Its best use case is storing unstructured data. Unstructured data is defined in contrast to structured data: structured data can be stored in a database and expressed logically with a two-dimensional table, while unstructured data cannot conveniently be represented that way. It includes Office documents of all formats, plain text, pictures, XML and HTML (subsets of the Standard Generalized Markup Language), reports of various kinds, images, and audio/video.
Key technical features of Swift:
1. Extremely high data durability
2. Completely symmetric system architecture: every node in Swift is fully equivalent, which significantly reduces costs
3. Unlimited scalability: on one hand, storage capacity can grow without limit; on the other hand, Swift's performance can improve linearly as nodes are added, mainly thanks to its architecture design
4. No single point of failure: Swift's metadata is distributed completely, uniformly, and randomly, and, like the object data, it is stored in multiple copies.
Technical differences between Swift and HDFS
1. In Swift, metadata is distributed and replicated across the cluster, while HDFS uses a central system (the NameNode) to maintain metadata. For HDFS this creates a single point of failure and makes scaling to very large environments difficult.
2. Swift's design takes multi-tenancy into account, while HDFS has no such concept (multi-tenancy architecture to be studied later).
3. In Swift, a file can be written multiple times; in concurrent scenarios, the most recent operation wins. In HDFS, a file is written once, and only one writer at a time is allowed.
4. Swift can reliably store a very large number of files of varying sizes, while HDFS is used to store medium-sized files in support of data processing.
Note on storage system metadata: metadata records the mapping between the logical and physical locations of data. There are generally two approaches to managing it. One uses a centralized metadata service, as in HDFS, which can lead to a single point of failure and a performance bottleneck. The other uses a distributed metadata service, as in OpenStack Swift, which carries the performance overhead of keeping metadata synchronized and a consistency problem; Swift trades a certain degree of data consistency for high availability and scalability.
The principle of Swift:
1. Consistent hash
The consistent hashing algorithm defines four criteria that a good hash algorithm should satisfy in a dynamically changing cache environment:
1. Balance: the results of the hash should be distributed across all buffers as evenly as possible, so that all buffer space is utilized. To better satisfy balance, the concept of virtual nodes is introduced: a virtual node is a replica of an actual node in the hash space, and one real node corresponds to several virtual nodes (the number of which is also called the replica count); virtual nodes are arranged by hash value in the hash space.
2. Monotonicity: if some content has already been assigned to a buffer by hashing and a new buffer is then added to the system, the hash result should guarantee that previously assigned content is mapped either to its original buffer or to the new buffer, never to some other old buffer.
3. Dispersion: in a distributed environment, a client may not see all of the buffers, only a subset. When clients map content to buffers by hashing, different clients may see different buffer ranges and thus map the same content to different buffers. This situation should be avoided, because mapping the same content to different buffers reduces the storage efficiency of the system.
4. Load: load looks at the dispersion requirement from another angle. Since the same content may be mapped to different buffers, a given buffer may also be mapped to different content by different users. As with dispersion, this situation should be avoided.
Swift uses consistent hashing mainly to minimize the remapping between keys and nodes when the number of nodes in the cluster changes, thereby satisfying monotonicity.
Because with few nodes, adding or removing a node can cause a large amount of data migration, virtual nodes are used.
There are two mapping relationships in Swift: for a file, a hash locates the corresponding virtual node, and a mapping table then locates the device corresponding to that virtual node, so the file ends up stored on that device.
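The hashing scheme described above can be sketched as follows. This is a minimal, generic consistent-hash ring with virtual nodes, not Swift's actual implementation; the class and its replica count are illustrative assumptions.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hash ring with virtual nodes (a sketch,
    not Swift's real ring)."""

    def __init__(self, replicas=100):
        self.replicas = replicas   # virtual nodes per physical node
        self._keys = []            # sorted hash positions of virtual nodes
        self._ring = {}            # hash position -> physical node

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        # each physical node is inserted at several hash positions
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            bisect.insort(self._keys, h)
            self._ring[h] = node

    def remove_node(self, node):
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            self._keys.remove(h)
            del self._ring[h]

    def get_node(self, key):
        # walk clockwise to the first virtual node at or after the key's hash
        h = self._hash(key)
        idx = bisect.bisect(self._keys, h) % len(self._keys)
        return self._ring[self._keys[idx]]
```

When a node is added, only the keys falling between the new node's virtual positions and their predecessors move, and they all move to the new node, which is exactly the monotonicity property described above.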
Data consistency model
According to the CAP theorem, of consistency, availability, and partition tolerance, at most two can be satisfied at once; Swift relaxes strong consistency in order to achieve high availability and horizontal scalability.
To control consistency, Swift employs an NWR policy (quorum protocol), a strategy for controlling consistency levels in distributed storage systems. N is the number of replicas of a piece of data, W is the number of replicas that must acknowledge a write for the update to be considered successful, and R is the number of replicas that must be read. The condition W + R > N guarantees that the read set and write set always intersect, so a read cannot miss the latest write, and the condition W > N/2 ensures that two writers cannot commit conflicting writes to the same data concurrently. In distributed systems the replica count is usually set to 3: with N = 1, a single failed replica means the data is lost; with N = 2, the failure of one storage node leaves a single point; and setting the replica count too high makes the system expensive to maintain.
In addition, for NWR with strong consistency (R + W > N), the read and write replica sets are guaranteed to intersect, so the latest version can always be read. If W = N and R = 1, every replica must be updated on write, which suits workloads with many reads and few writes. If R = N and W = 1, only one replica is updated on write and the latest version is obtained by reading all replicas, which suits workloads with few reads and many writes.
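The quorum conditions above can be expressed directly in code; this helper is only an illustration of the arithmetic, not part of Swift.

```python
def check_nwr(n, w, r):
    """Evaluate the NWR quorum conditions.

    n: total number of replicas
    w: write quorum (replicas that must acknowledge a write)
    r: read quorum (replicas that must be read)
    """
    return {
        # read and write sets must intersect -> reads see the latest write
        "strong_consistency": r + w > n,
        # two concurrent writes cannot both reach a majority
        "no_conflicting_writes": w > n / 2,
    }
```

For example, the common N = 3, W = 2, R = 2 configuration satisfies both conditions, whereas N = 3, W = 1, R = 1 satisfies neither.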
Ring
The ring is an important component of Swift: it records the mapping between storage objects and their physical locations, and lookups of account, container, and object information all consult the cluster's ring.
The ring is designed to map virtual nodes (partitions) to a set of physical storage devices and to provide a degree of redundancy.
Swift defines separate rings for accounts, containers, and objects, and the lookup process is the same for each. Each partition in the ring has, by default, 3 replicas in the cluster; each partition's location is maintained by the ring and stored in its mapping.
The ring uses zones to guarantee the physical isolation of data: the replicas of each partition are placed in different zones.
In summary: the ring introduces consistent hashing to reduce the amount of data moved when the number of nodes changes (improving monotonicity); it introduces partitions to reduce the number of data items that must move when nodes change; it introduces replicas to avoid single points of data and improve redundancy; it introduces zones to ensure partition tolerance through physical isolation; and it introduces weights to keep partition allocation balanced.
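The partition lookup can be sketched as follows. Swift's ring derives a partition from an MD5 of the object path using a "partition power" right shift; the constants here (partition power 10, no per-cluster hash salt) are simplifying assumptions, as a real deployment also mixes a cluster-specific hash path prefix/suffix into the digest.

```python
import hashlib
import struct

PART_POWER = 10  # assumed: 2**10 = 1024 partitions

def get_partition(account, container=None, obj=None):
    """Map an account/container/object path to a partition number:
    MD5 the path, take the top 32 bits of the digest, then keep only
    the top PART_POWER bits via a right shift."""
    path = "/" + "/".join(p for p in (account, container, obj) if p)
    digest = hashlib.md5(path.encode()).digest()
    top32 = struct.unpack(">I", digest[:4])[0]
    return top32 >> (32 - PART_POWER)
```

The ring then maps the resulting partition number to the devices holding its replicas, one per zone.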
Swift Architecture Design
Swift Deployment architecture
Main components:
1. Proxy Server
The proxy server is the server process that exposes the Swift API. Through the proxy server, Swift provides an HTTP, REST-based service interface, and the proxy is responsible for communicating with the other Swift components. For each client request, it looks up the location of the account, container, or object in the ring and forwards the request accordingly. Because the REST protocol is stateless, proxy servers can be scaled out to balance load. Before accessing Swift, a client first obtains an access token from the authentication service and then adds the X-Auth-Token header to each request it sends.
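A client request to the proxy can be sketched with the standard library; the storage URL and token below are hypothetical placeholders, as in a real deployment both are returned by the authentication service.

```python
import urllib.request

# Hypothetical values; in practice both come from the auth service.
STORAGE_URL = "http://swift.example.com:8080/v1/AUTH_demo"
AUTH_TOKEN = "AUTH_tk1234"

def build_object_request(container, obj, method="GET"):
    """Build an authenticated Swift API request: the token obtained
    from the auth service travels in the X-Auth-Token header."""
    req = urllib.request.Request(
        f"{STORAGE_URL}/{container}/{obj}", method=method)
    req.add_header("X-Auth-Token", AUTH_TOKEN)
    return req

# e.g. urllib.request.urlopen(build_object_request("photos", "cat.jpg"))
# would fetch the object via the proxy server.
```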
2. Storage Server
Storage servers provide storage services on disk devices. Swift has three types of storage server: account servers (Account Server), container servers (Container Server), and object servers (Object Server).
2.1 Account Server
The account server provides account metadata and statistics and maintains the list of containers in each account; this information is stored in SQLite database files.
2.2 Container Server
The container server provides container metadata and statistics and maintains the list of objects in each container. The container does not know where an object physically lives, only which objects are stored in the container. This object information is stored as SQLite database files, which, like objects, are replicated across the cluster.
2.3 Object Server
The object server provides metadata and content services for storing, retrieving, and deleting objects on local devices. In the file system, each object is stored as a binary file, with its metadata kept in the file system's extended attributes; it is therefore recommended to use XFS, which supports extended attributes by default. Each object's storage path is derived from the hash of the object name plus the timestamp of the operation. The last write always wins, ensuring that the latest version of the object is served.
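The on-disk layout implied above can be sketched as follows. The directory structure (objects/partition/suffix/hash/timestamp.data) follows Swift's general scheme, but the helper, the device root, and the choice of a 3-character suffix are illustrative assumptions rather than a faithful reimplementation.

```python
import hashlib
import os

def object_disk_path(device_root, partition, name_hash, timestamp):
    """Sketch of an object's on-disk path:
    <device>/objects/<partition>/<suffix>/<hash>/<timestamp>.data
    The suffix is the tail of the name hash, and the newest
    timestamped .data file is the current version (last write wins)."""
    suffix = name_hash[-3:]
    return os.path.join(device_root, "objects", str(partition),
                        suffix, name_hash, f"{timestamp}.data")

# hash of the object's full path decides where it lives
name_hash = hashlib.md5(b"/AUTH_demo/photos/cat.jpg").hexdigest()
path = object_disk_path("/srv/node/sdb1", 512, name_hash, "1700000000.00000")
```

Because each write lands in a new timestamp-named file, replicas can compare timestamps and keep only the newest version.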
3. Consistency Server
The main challenge of storing data on disk behind a RESTful API is handling failures. Swift's consistency servers exist to find and resolve errors caused by data corruption and hardware failure. There are three main processes: the auditor, the updater, and the replicator. The auditor runs continuously in the background on every Swift server, scanning disks to verify the integrity of objects, containers, and accounts; when corruption is found, the auditor moves the file to a quarantine area, and the replicator is responsible for replacing it with a good copy. Under high system load or failure, data for a container or account may not be updated immediately; failed updates are queued on the local file system, and the updater retries these failed operations.
3.1 Audit Services (Auditor)
The auditor repeatedly checks the integrity of objects, accounts, and containers on the local server. If a bit-level error is found by comparison, the file is quarantined and the replicator replaces it with a good copy; other types of errors are recorded in the log.
3.2 Replication Service (Replicator)
The replicator detects inconsistencies between local partition replicas and remote replicas by comparing hashes and high-level watermarks, and pushes updates to remote replicas when inconsistencies are found. For object replication, updating is simply an rsync of files to the peer node; account and container replication pushes missing records over HTTP, or the whole database file via rsync. The replicator also ensures that deleted objects are removed from the file system: when an item is deleted, a tombstone file is set as its latest version, and when the replicator detects the tombstone it makes sure the item is removed from the entire system.
3.3 Update Service (Updater)
When an object cannot be updated immediately because of high load or a system failure, the update task is serialized to the local file system and applied asynchronously once the system recovers.
4. Cache Server
Cached content includes authentication tokens and account and container existence information, but the object data itself is never cached.
Openstack Swift Learning Notes