Software-defined storage logic: Efficient and agile storage management in software defined environments

Source: Internet
Author: User
Tags fcoe

Note: this post is partly a translation of the paper [1], partly my own understanding of it, and partly just my opinion.

Compared with IOFlow, this paper pays more attention to the software-defined storage framework itself (as I understand it, it builds a new framework on top of existing frameworks and then reuses existing protocols), rather than focusing on communication protocols as IOFlow does. In addition, the framework is a software-defined environment (SDE) framework, not just a storage framework, although the paper concentrates on storage, which is the more challenging part. In particular, the paper is a good reference for the logic of software-defined storage (SDS).

SDE (software-defined environment)

The data center environment includes compute, network, and storage resources.

The demand for flexibility and speed in data centers is growing, and the performance of data center applications depends closely on compute, network, and storage resources. An intelligent orchestration scheme with a global view of the data center can break down the boundaries between compute, network, and storage, and is expected to improve QoS and user experience significantly.

Objective of SDS

The goal of SDS is the same as that of software-defined networking (SDN). SDN goals can be divided into two dimensions: horizontal optimization, and vertical optimization, which is the software integration of the control plane and the data plane.

From the perspective of high-level applications (as opposed to low-level LUNs and RAID), deploying an application requires storage to be provisioned with a certain level of performance, viewed across the whole system, and the system assigns the required logical volumes to the application. However, an application's needs change dynamically over its life cycle: storage capacity, performance requirements, data protection requirements, copy policies, recovery point objective, and recovery time all change. If these changes can be expressed explicitly as configuration, the application benefits a great deal.

In short, the overall goal of SDS is to decouple application demand from the underlying infrastructure.

The main contribution of this paper is an SDS solution called IBM Open Platform, built on OpenStack (using its extension interfaces).

Existing storage solutions

Enterprise-level solutions

The storage industry proposed SMI-S (the Storage Management Initiative Specification) as a unified interface for managing different storage devices. However, it is not the same thing as SDS, and it cannot meet the SDS goal stated above. To narrow the gap with SDS and better serve user-level requirements, IBM's storage management solution VSC has done a lot of work; other enterprise-level storage offerings include RMC and NetApp.

Open-source community solutions

OpenStack is an open-source cloud management system. Many vendors now take part in the project, and it has achieved a great deal. OpenStack provides an SDS platform. Its storage mainly consists of Swift (object storage for applications and virtual machines) and Cinder (block storage for virtual machines).

Swift

Swift manages object storage and exposes the corresponding APIs to clients. An important feature of Swift is that it automatically replicates data across the available disks and nodes, which provides scalability, availability, and data protection without the client having to do anything (Swift handles these functions transparently). Swift also aims to reduce storage cost, because the cluster is built from commodity machines with large storage capacity. Object storage is now widely used in new applications, especially web applications, because Swift's REST APIs work over HTTP. Compared with file storage and block storage, object storage is more scalable and flexible.

Cinder

Cinder is the block storage management component. It provides functions such as creating a volume, attaching a block device to a server, and deleting a server's block devices. (The volumes are no longer limited to the local storage of a plain Linux server; unified storage back ends are used, and even Ceph and NetApp are supported.) Cinder can currently manage many storage systems, such as GPFS (IBM's distributed parallel file system, which acts as the underlying file system) and LVM (the Logical Volume Manager). Cinder includes a scheduler, which is pluggable so a third-party one can be used, that selects the best block storage back end for the server according to the request, which may include a volume type (itself an abstraction of storage resources).

These OpenStack storage mechanisms support the software-defined concept to some extent, but on their own they are not enough to count as full SDS.

Framework Architecture

This paper proposes a platform called IBM Open Platform, which mainly comprises workload abstraction, resource abstraction, the mapping between workloads and resources, and related optimization. Workload abstraction and resource abstraction are what make the framework work. Workload abstraction expresses different workloads in an abstract form (such as JSON or XML) and captures the application-related requirements, for example infrastructure and operational flow.

Resource abstraction offers a unified interface for provisioning, managing, and monitoring the underlying compute, storage, and network resources, so the complexity of the devices is hidden from users. The SDE unified control plane at the core translates between the abstract representation of the workload and the abstract representation of the lower-level resources, and then orchestrates the SDC, SDN, and SDS components for flexible and effective management.

Workload Abstraction

For example, a simple three-tier web application generally includes a web server, an application server, and a database. This software model can be described in a declarative format such as JSON. The unified control plane engine in the middle then parses the description and arranges the underlying resources for the workload.
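A minimal sketch of what such a description could look like, assuming a hypothetical JSON schema: the field names `tiers`, `instances`, `storage`, `size_gb`, and `service_type` are illustrative and are not taken from the paper. The parser only collects the storage requests that a control plane would then hand to the SDS layer.

```python
import json

# Hypothetical workload description; the schema is an assumption for illustration.
WORKLOAD_JSON = """
{
  "name": "three-tier-web",
  "tiers": [
    {"role": "web", "instances": 2, "storage": {"size_gb": 20, "service_type": "bronze"}},
    {"role": "app", "instances": 2, "storage": {"size_gb": 50, "service_type": "silver"}},
    {"role": "db", "instances": 1, "storage": {"size_gb": 500, "service_type": "platinum"}}
  ]
}
"""


def storage_requests(description: str) -> list[dict]:
    """Parse a workload description and list one storage request per instance."""
    workload = json.loads(description)
    requests = []
    for tier in workload["tiers"]:
        for index in range(tier["instances"]):
            requests.append({
                "volume": f"{workload['name']}-{tier['role']}-{index}",
                "size_gb": tier["storage"]["size_gb"],
                "service_type": tier["storage"]["service_type"],
            })
    return requests


if __name__ == "__main__":
    for request in storage_requests(WORKLOAD_JSON):
        print(request)
```

Because the requirements live in the description rather than in device configuration, changing a tier's `service_type` later in the application's life only means editing the document and asking the control plane to reconcile it.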

From the storage point of view, this framework makes flexible storage semantics possible. For example, an application can specify the storage volume size, the storage service type, and other related policies. The framework therefore lets application developers and system administrators state their requirements for the underlying storage explicitly. These user-level storage requirements can also change during the application's lifecycle, which gives even more flexibility.

Resource Abstraction

So how does the resource abstraction work?

The IBM Open Platform supports not only enterprise-level storage subsystems but also commodity storage devices, for example via GPFS. Resource abstraction means abstracting storage resources (SAN, NAS, or DAS) into a resource pool, and then managing and allocating those resources according to the needs of the workload.

For example, the framework can create a file on the underlying storage devices through GPFS and then assign that file to a virtual machine as a block device. It can also use the snapshot and file migration functions of GPFS to manage these virtual block devices. Some underlying devices are enterprise-level and provide advanced features, such as firmware-level compression and deduplication, and the framework can use those features when they are present. If the underlying storage device does not have them, the framework has to fall back on a software implementation to achieve the same result.
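The fallback described above can be sketched as a capability check. This is only an illustration under stated assumptions: the `Backend` class and its `native_compression` flag are hypothetical, and `zlib` stands in for whatever software compression a real framework would use.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical storage back end; `native_compression` is an assumed capability flag."""
    name: str
    native_compression: bool


def prepare_write(backend: Backend, data: bytes) -> tuple[bytes, str]:
    """Return the bytes to send to the device and where compression happens."""
    if backend.native_compression:
        # The device compresses on its own, so the data is passed through unchanged.
        return data, "device"
    # No device support: compress in software before writing.
    return zlib.compress(data), "software"


if __name__ == "__main__":
    payload = b"log line\n" * 1000
    for backend in (Backend("enterprise-array", True), Backend("commodity-disk", False)):
        to_write, where = prepare_write(backend, payload)
        print(backend.name, where, len(to_write))
```

The caller sees the same interface either way, which is the point of the resource abstraction: the decision about where a feature runs is made below the application.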

In short, this framework provides an abstraction of storage resources, which can achieve good storage resource utilization, improve operation performance, and reduce the complexity of the storage system.

Resource mapping

From the description above, we can see that the role of the unified control plane in this framework is to take the requests coming from above and organize the resources below. For storage requests, it parses the storage requirements and passes them to the SDS module for resource mapping. The following examples show how storage resources are mapped.

Performance-aware storage Configuration

Concretely, every storage creation request carries a "service type" label that represents specific storage provisioning requirements (for example, the RAID level or the resiliency profile). Each service type stands for a set of storage configurations. For example, a "platinum" service may require low latency, so storage volumes created for such a workload are more likely to be placed on SSDs than on ordinary disks. In addition, if many SSDs are available and form a resource pool, the framework analyzes the utilization of the devices in the SSD pool and then places the newly created volume on the SSD where the performance impact is smallest.
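A minimal sketch of this placement step, under several assumptions: the mapping from service type to media class, the `Device` fields, and the use of "lowest observed utilization" as a stand-in for "smallest performance impact" are all illustrative choices, not details from the paper.

```python
from dataclasses import dataclass

# Assumed mapping from service type to the media class of the backing pool.
SERVICE_TYPE_TO_MEDIA = {"platinum": "ssd", "gold": "ssd", "bronze": "hdd"}


@dataclass
class Device:
    name: str
    media: str          # "ssd" or "hdd"
    free_gb: int
    utilization: float  # observed I/O utilization, 0.0 to 1.0


def place_volume(devices: list[Device], service_type: str, size_gb: int) -> Device:
    """Pick the least-utilized device of the right media class with enough space."""
    media = SERVICE_TYPE_TO_MEDIA[service_type]
    candidates = [d for d in devices if d.media == media and d.free_gb >= size_gb]
    if not candidates:
        raise ValueError(f"no {media} device can hold {size_gb} GB")
    chosen = min(candidates, key=lambda d: d.utilization)
    chosen.free_gb -= size_gb  # reserve the capacity on the chosen device
    return chosen


if __name__ == "__main__":
    pool = [
        Device("ssd-a", "ssd", 400, 0.7),
        Device("ssd-b", "ssd", 400, 0.2),
        Device("hdd-a", "hdd", 4000, 0.05),
    ]
    print(place_volume(pool, "platinum", 100).name)
```

A real implementation would likely predict the impact of the new volume's expected load rather than look only at current utilization; the greedy choice here just keeps the sketch short.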

Storage fabric Management

After the storage volume is created, the connection between the workload's server and the volume has to be set up. The framework provides a unified interface over the underlying storage network technologies, such as FC, iSCSI, InfiniBand, and FCoE. It then applies zone management and fabric analysis to determine the best storage fabric between the server and the storage volume. For example, it can analyze the utilization of the storage device and balance the load across multiple ports, or use a hill-climbing algorithm to find the best path.
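The text only names hill climbing and port load balancing, so the following is a sketch under assumptions: the cost function (load on the busiest port), the move set (reassign one volume to another port), and all names are hypothetical.

```python
def port_loads(assignment: dict[str, str], demand: dict[str, float],
               ports: list[str]) -> dict[str, float]:
    """Sum the I/O demand of the volumes assigned to each port."""
    loads = {port: 0.0 for port in ports}
    for volume, port in assignment.items():
        loads[port] += demand[volume]
    return loads


def hill_climb(assignment: dict[str, str], demand: dict[str, float],
               ports: list[str]) -> tuple[dict[str, str], float]:
    """Move one volume at a time while the busiest port's load keeps dropping."""
    current = dict(assignment)  # work on a copy, leave the input untouched
    best_cost = max(port_loads(current, demand, ports).values())
    improved = True
    while improved:
        improved = False
        for volume in current:
            kept_port = current[volume]
            for port in ports:
                if port == kept_port:
                    continue
                current[volume] = port  # tentative move
                cost = max(port_loads(current, demand, ports).values())
                if cost < best_cost:
                    best_cost, kept_port, improved = cost, port, True
                else:
                    current[volume] = kept_port  # undo a non-improving move
    return current, best_cost


if __name__ == "__main__":
    demand = {"v1": 40, "v2": 30, "v3": 20, "v4": 10}
    start = {volume: "p0" for volume in demand}
    print(hill_climb(start, demand, ["p0", "p1"]))
```

Hill climbing stops at the first assignment that no single move improves, so it can end in a local optimum; restarting from several random assignments is a common way to reduce that risk.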

Storage resiliency

Resiliency is an important part of data protection. The platform offers two forms of it. The first is fabric resiliency: the application can select an I/O path on which every node is healthy (no failures, as required for FCoE storage nodes). The second is device-level data protection: the framework lets applications choose a replication policy, such as point-in-time snapshots, synchronous mirroring, or asynchronous mirroring. Storage resiliency can also draw on the other capabilities of the SDE, including compute, to achieve more holistic data recovery.
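As a rough illustration of the two forms, the sketch below filters candidate I/O paths down to those with no failed hop and lists the replication choices as an enum. The path representation (a list of node names), the "shortest healthy path" rule, and every identifier are assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class ReplicationPolicy(Enum):
    """Device-level protection options named in the text."""
    SNAPSHOT = "point-in-time snapshot"
    SYNC_MIRROR = "synchronous mirror"
    ASYNC_MIRROR = "asynchronous mirror"


def healthy_paths(paths: list[list[str]], failed: set[str]) -> list[list[str]]:
    """Keep only the paths on which no hop has failed."""
    return [path for path in paths if not failed.intersection(path)]


def pick_path(paths: list[list[str]], failed: set[str]) -> Optional[list[str]]:
    """Choose the shortest fully healthy path, or None if there is none."""
    candidates = healthy_paths(paths, failed)
    return min(candidates, key=len) if candidates else None


if __name__ == "__main__":
    paths = [
        ["host", "sw1", "ctrl-a"],
        ["host", "sw2", "ctrl-b"],
        ["host", "sw2", "sw3", "ctrl-a"],
    ]
    print(pick_path(paths, failed={"sw1"}), ReplicationPolicy.SYNC_MIRROR.value)
```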

Continuous mode optimization ILM

An important concept is storage information lifecycle management (ILM). The value of data changes over its life cycle. For example, email data is most valuable when it first arrives and becomes less and less valuable over time. The goal of ILM is therefore to put the right data on the right storage tier at the right time. This is storage tiering (the tier is chosen based on historical I/O behavior). The framework lets applications control the storage tier according to their own needs. For example, a service type can name a higher tier and a lower tier, and tiering policies then run automatically when certain I/O thresholds are reached.

Note: for example, when writing an application, you can specify which disk tier the data should be stored on at some point in the future.
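A threshold-driven tiering policy of this kind could be sketched as follows. The `TierPolicy` fields, the unit (I/O operations per day), and the threshold values are assumptions chosen for illustration; the paper's actual policy format is not shown in this post.

```python
from dataclasses import dataclass


@dataclass
class TierPolicy:
    """Hypothetical per-service-type policy; thresholds are I/O operations per day."""
    higher_tier: str
    lower_tier: str
    promote_above: int
    demote_below: int


def next_tier(current_tier: str, ios_per_day: int, policy: TierPolicy) -> str:
    """Decide where the data should live based on its recent I/O activity."""
    if ios_per_day > policy.promote_above:
        return policy.higher_tier
    if ios_per_day < policy.demote_below:
        return policy.lower_tier
    return current_tier  # between the thresholds: leave the data where it is


if __name__ == "__main__":
    mail_policy = TierPolicy("ssd", "hdd", promote_above=1000, demote_below=100)
    # New mail is read often, old mail rarely.
    print(next_tier("hdd", 5000, mail_policy))
    print(next_tier("ssd", 10, mail_policy))
```

Keeping a gap between the two thresholds avoids moving data back and forth when its activity hovers around a single cut-off.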

SDE Integration

The storage part of this SDE framework works well with the other parts, such as SDC and SDN. The framework as a whole exposes a wide range of APIs, and many optimizations can be built on top of it. One example is the VM placement problem.

Storage-aware VM placement

VM placement depends on CPU, memory, and vdisk. For an I/O-intensive VM, where its vdisk is placed in the SAN matters a great deal for latency and bandwidth. For a compute-intensive VM, the allocation of CPU and memory matters more. VM placement therefore has to take CPU, memory, and storage (vdisk) into account together. The logic of this framework can provide a flexible, storage-aware VM placement solution.

Lab

IBM researchers built a small lab environment, in which the framework was deployed and worked well.

References

[1] Alba, A., et al. "Efficient and agile storage management in software defined environments." IBM Journal of Research and Development 58.2 (2014): 1-12.
