Application Data Persistence for Kubernetes


1. Stateless applications and stateful applications

Whether an application is stateful or stateless depends on whether it needs to retain persistent data: an application that persists data is stateful, and one that does not is stateless. Complete systems are usually stateful. A microblogging service, for example, must keep a record of all the content and messages its users publish. A system, however, is often composed of many microservices or smaller application modules, and some of these do not actually require data persistence. Building a WordPress blog system, for instance, requires deploying a front-end PHP application as well as a back-end MySQL database. The blogging system as a whole has persistence needs and is therefore stateful, but its front-end PHP module does not store data locally; it stores the data in the MySQL database. A WordPress system can thus be decomposed into a stateless front end and a stateful back end. Stateful and stateless applications both abound in practice. Counted by instances, stateless instances are usually in the majority, because for most systems the number of read requests is far higher than the number of write requests.

Non-persistence of containers. A defining characteristic of containers is that when a container exits, all of its internal data and state are lost. A new container started from the same image does not, by default, inherit the state of the previous instance. For stateless applications this is not a problem; it is in fact a welcome property that guarantees the consistency of stateless applications. For stateful applications, however, it is a major obstacle. Imagine the disaster if your MySQL container lost all of its data on every restart.

Container data persistence. Inevitably, users will run stateful applications in containers, so the container engine must meet the need for data persistence. Docker provides the concept of a volume at the container engine level. A user can create a data volume container to provide persistence support: application containers persist data to the data volume container, and when an application container exits, the data remains safely stored in the data volume container. In addition, Docker supports multiple storage backends in the form of plug-ins. Through volume plug-ins, Docker containers can use host directories, software-defined storage (such as GlusterFS and Ceph), cloud storage (such as that provided by public clouds like AWS and GCE), and storage management solutions (such as Flocker).

Persistent volumes and persistent volume claims. Docker's volume mechanism at the container engine level meets the basic need for container data persistence, but in a multi-host container cloud environment there are more details to consider. For example, when a container instance of a stateful application is rescheduled from one host to another, how do we ensure that the volume it was mounting is reattached correctly? Furthermore, on a cloud platform users need to acquire and consume storage resources in a simple way, without worrying unduly about the underlying implementation details. For example, a user may have an application that needs 100GB of high-speed storage to hold a large number of small, fragmented files.
What the user needs to do is submit the resource request to the cloud platform and then obtain and consume the storage resource, without caring which disk of which storage server the underlying storage actually comes from.

To meet the storage needs of container users in the cloud, Kubernetes provides the concepts of the persistent volume (PersistentVolume, PV) and the persistent volume claim (PersistentVolumeClaim, PVC) at the container orchestration level. A persistent volume defines concrete storage connection information, such as the address and port of an NFS server, the location of the volume, the size of the volume, and its access modes. In OpenShift, the cluster administrator defines a series of persistent volumes, which together form a persistent volume resource pool. When deploying a container application with persistence requirements, the user creates a persistent volume claim, declaring in it the size of the storage required and the access mode. Kubernetes is then responsible for binding the claim to a persistent volume in the pool that matches those requirements. The end result is that after the container starts, the back-end storage defined by the persistent volume is mounted at the container's specified directory. Because OpenShift is architecturally based on Kubernetes, users can use the Kubernetes provisioning model of persistent volumes and persistent volume claims in OpenShift to meet their data persistence demands.
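To make the model concrete, here is a minimal sketch of the two objects involved. The names, sizes, NFS server address, and export path are illustrative assumptions, not values from the article:

    {
      "apiVersion": "v1",
      "kind": "PersistentVolume",
      "metadata": { "name": "pv0001" },
      "spec": {
        "capacity": { "storage": "5Gi" },
        "accessModes": [ "ReadWriteOnce" ],
        "persistentVolumeReclaimPolicy": "Retain",
        "nfs": { "server": "192.168.1.10", "path": "/exports/pv0001" }
      }
    }

    {
      "apiVersion": "v1",
      "kind": "PersistentVolumeClaim",
      "metadata": { "name": "myapp-claim" },
      "spec": {
        "accessModes": [ "ReadWriteOnce" ],
        "resources": { "requests": { "storage": "5Gi" } }
      }
    }

The administrator creates objects like the first; the user creates objects like the second; Kubernetes binds the claim to any available volume that satisfies its size and access-mode constraints.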

2. The life cycle of a persistent volume

A persistent volume's life cycle is divided into five stages: provisioning, binding, using, releasing, and reclaiming.

1. Provisioning. In Kubernetes, the provisioning of storage resources comes in two forms: static provisioning and dynamic provisioning. With static provisioning, the cluster administrator creates a series of persistent volumes, which form the persistent volume resource pool. With dynamic provisioning, persistent volumes are created on demand by the cluster's underlying infrastructure cloud, such as OpenStack or Amazon Web Services.

The access modes describe the access characteristics of a persistent volume: whether it is read-only or readable and writable, and whether it can be mounted by only one node or by multiple nodes. Three access modes are currently available:
• ReadWriteOnce: readable and writable, mountable by a single node only.
• ReadWriteMany: readable and writable, mountable by multiple nodes.
• ReadOnlyMany: read-only, mountable by multiple nodes.
It is important to note that the usable access modes depend heavily on the back-end storage: merely setting a persistent volume to ReadWriteMany does not make it mountable by multiple nodes. For example, block devices such as OpenStack Cinder and Ceph RBD do not support the ReadWriteMany mode. The access modes supported by the various storage backends are described in detail in the official Kubernetes documentation.

2. Binding. When deploying a container application, the user defines a persistent volume claim, declaring in it the attributes of the required storage, such as size and access mode. Kubernetes locates a matching persistent volume object in the resource pool and docks the claim with the target volume. The status of both the persistent volume and the persistent volume claim then becomes Bound.

3. Using. When deploying a container, the user specifies the volume's mount point in the container definition of the DeploymentConfig and associates that mount point with a persistent volume claim. When the container starts, the back-end storage specified by the persistent volume is mounted at the container's defined mount point. The application runs inside the container, and its data is ultimately written through the mount point to the back-end storage, thus achieving persistence (a sketch of such a container definition follows at the end of this section).

4. Releasing. When the application no longer uses the storage, the associated persistent volume claim can be deleted, upon which the status of the persistent volume becomes Released.

5. Reclaiming. When a persistent volume's status becomes Released, Kubernetes reclaims the volume according to the reclaim policy defined on it. Three reclaim policies are currently supported:
• Retain: keep the data and reclaim the persistent volume manually.
• Recycle: delete all data on the volume by executing rm -rf. Only NFS and hostPath support this policy at present.
• Delete: delete the back-end storage dynamically. This policy applies to backends whose underlying IaaS supports it, such as AWS EBS, GCE PD, and OpenStack Cinder.

Through persistent volumes and persistent volume claims, Kubernetes offers users a model for consuming storage on the container cloud. Under this model, users can quickly and easily build storage solutions that meet the needs of their applications.
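The "using" stage boils down to referencing a claim from the container definition. Below is a minimal sketch of a pod doing so; the pod name, image, claim name, and mount path are illustrative, and the article's own examples use an OpenShift DeploymentConfig, which embeds the same volumes and volumeMounts stanzas in its pod template:

    {
      "apiVersion": "v1",
      "kind": "Pod",
      "metadata": { "name": "myapp" },
      "spec": {
        "containers": [{
          "name": "myapp",
          "image": "myapp:latest",
          "volumeMounts": [{ "name": "data", "mountPath": "/var/lib/myapp" }]
        }],
        "volumes": [{
          "name": "data",
          "persistentVolumeClaim": { "claimName": "myapp-claim" }
        }]
      }
    }

Kubernetes resolves the claim to its bound persistent volume and mounts the underlying storage at /var/lib/myapp before the application process starts.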
In Kubernetes 1.3, persistent volumes and persistent volume claims introduced the concept of labels, which gives users greater flexibility. For example, we can attach different labels to different classes of persistent volumes, such as "SSD", "RAID0", "Ceph", "Shenzhen data center", or "US data center". The user can then define a corresponding label selector in a persistent volume claim to obtain a back-end persistent volume that matches the application's requirements more precisely.

3. Storage backends supported by persistent volumes

Kubernetes persistent volumes support many kinds of back-end storage, including the host's local directory (hostPath), the Network File System (NFS), OpenStack Cinder, distributed storage (such as GlusterFS, Ceph RBD, and CephFS), and cloud storage (such as awsElasticBlockStore or gcePersistentDisk). A common puzzle is "which kind of storage should I choose?" Different backends have different characteristics, and no single storage fits all scenarios. Users should choose a backend based on the requirements of the container application at hand.

1. hostPath. hostPath storage mounts a directory on the compute node's host into the container. This approach is suitable only for testing. Allowing a container to mount a host directory introduces a security risk, and because the data depends on one particular node, the container becomes strongly bound to that compute node, which introduces a single point of failure.

2. NFS. NFS is a common storage backend. It has been around for a long time and is widely used on UNIX and Linux; no Linux system administrator is unfamiliar with it. Thanks to this broad system support, NFS is currently the most common storage backend for persistent volumes.

3. GlusterFS. GlusterFS is an open-source distributed file system with strong scaling abilities. On commodity hardware, users can build storage clusters from gigabytes up to petabytes for many types of data, such as video, images, and documents. The main features of GlusterFS are:
• Implemented entirely in software, completely independent of any specific host, storage, or network hardware.
• Highly elastic scaling: storage capacity can grow from gigabytes to petabytes.
• High availability: data can be kept in multiple replicas within the storage cluster, avoiding single points of failure.
• Compatibility with the POSIX file system standard: because it follows the standard, upper-layer applications need no modification.
• Support for many volume types, such as replicated volumes, distributed volumes, and striped volumes, meeting the needs of different scenarios.

4. Ceph. Ceph is currently a very popular open-source distributed storage solution. Like GlusterFS, Ceph is fully software-based. A distinguishing feature of Ceph is that it natively provides several access interfaces: RESTful object, block, and file system. GlusterFS and Ceph are both excellent distributed storage systems, and many people like to compare them. Each has its merits and drawbacks; which to prefer is largely a matter of taste and use case. Kubernetes persistent volumes can mount Ceph storage in two ways: as a block device (RBD) or as a file system (CephFS). Although CephFS support is already present in the Kubernetes and OpenShift code, the Ceph project does not yet consider CephFS mature enough for the standards of enterprise production use, so it is not recommended in production.

5. OpenStack Cinder. Cinder is OpenStack's block storage service, providing flexible storage support for instances on OpenStack. For OpenShift clusters running on OpenStack, users can define persistent volumes backed by Cinder. An example of a Cinder persistent volume definition is shown below; its volumeID property points to the unique identifier of the data volume that the administrator created in Cinder.
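The article's original listing was not preserved, so the following is a minimal sketch of such a Cinder-backed persistent volume; the volume ID, name, and size are placeholders:

    {
      "apiVersion": "v1",
      "kind": "PersistentVolume",
      "metadata": { "name": "pv-cinder" },
      "spec": {
        "capacity": { "storage": "5Gi" },
        "accessModes": [ "ReadWriteOnce" ],
        "persistentVolumeReclaimPolicy": "Retain",
        "cinder": {
          "volumeID": "f37a03aa-0000-0000-0000-000000000000",
          "fsType": "ext4"
        }
      }
    }

As with other block devices, a Cinder-backed volume supports only the ReadWriteOnce access mode.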
4. Directed matching of storage resources

Requirements for storage vary from user to user. Beyond size and access mode, there may be special requirements for disk speed, the data center where the data resides, and so on. To match storage requirements with storage resources flexibly, Kubernetes supports attaching labels to persistent volumes and defining a label selector on the persistent volume claim to state which volumes the claim may match. Through labels and label selectors, Kubernetes implements directed matching of persistent volume claims. The steps are sketched in code after this list.

1. Create persistent volumes. Create two persistent volumes, pv0001 and pv0002. Both have the same size and access mode, and initially neither carries any labels.

2. Label a volume. Using the oc label command, attach the label disktype=ssd to the persistent volume pv0002.

3. Create a persistent volume claim with a label selector. The claim requests 1Gi of storage with the ReadWriteOnce (RWO) access mode. Its label selector is of type matchLabels with the value "disktype": "ssd", meaning that any persistent volume matched to this claim must carry the label "disktype": "ssd".

4. Match the claim. After the claim is created, check the status of the persistent volumes. Although pv0001 and pv0002 both satisfy pvc0001's requirements for size and access mode, pvc0001 is ultimately bound to pv0002, the volume carrying the target label.

5. Label selectors. Persistent volume claims currently support two kinds of label selectors: matchLabels and matchExpressions. The matchLabels selector matches one or more labels exactly, while matchExpressions supports fuzzy matching: using the operators In or NotIn, the user can match label values loosely.
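A consolidated sketch of these steps follows. The original manifests were not preserved in this copy, so the backing storage shown (NFS) and all names, addresses, and paths are placeholder assumptions chosen only to illustrate the label-matching flow:

    # Step 1: create two volumes of equal size and access mode (shown for pv0002;
    # pv0001 is identical except for its name and backing path).
    oc create -f - <<'EOF'
    {
      "apiVersion": "v1",
      "kind": "PersistentVolume",
      "metadata": { "name": "pv0002" },
      "spec": {
        "capacity": { "storage": "1Gi" },
        "accessModes": [ "ReadWriteOnce" ],
        "nfs": { "server": "192.168.1.10", "path": "/exports/pv0002" }
      }
    }
    EOF

    # Step 2: label pv0002 as SSD-backed.
    oc label pv pv0002 disktype=ssd

    # Step 3: a claim whose selector only matches volumes labeled disktype=ssd.
    oc create -f - <<'EOF'
    {
      "apiVersion": "v1",
      "kind": "PersistentVolumeClaim",
      "metadata": { "name": "pvc0001" },
      "spec": {
        "accessModes": [ "ReadWriteOnce" ],
        "resources": { "requests": { "storage": "1Gi" } },
        "selector": { "matchLabels": { "disktype": "ssd" } }
      }
    }
    EOF

    # Step 4: observe that pvc0001 binds to pv0002, not pv0001.
    oc get pv
    oc get pvc

For step 5, a matchExpressions selector would replace the matchLabels stanza, for example:

    "selector": {
      "matchExpressions": [
        { "key": "disktype", "operator": "In", "values": [ "ssd", "nvme" ] }
      ]
    }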
5. A persistence example

1. Check the mount point. First inspect the registry container's existing mount configuration.

2. Back up the data. The example in the previous section pushed many images to the registry, so the current container's /registry directory holds many image-related files. These files need to be backed up first. The oc rsync command synchronizes data from a directory in a container to the host. oc rsync is a handy, practical command that can synchronize files in both directions between a container and the host. To use it, the target container must contain one of two utilities: rsync or tar.

3. Create the storage. For ease of experimentation, this example uses NFS as the back-end storage. The configuration process and steps for using GlusterFS, Ceph, or another storage backend in production are similar. Create a shared directory for NFS and export it.

4. Create a persistent volume. Based on the NFS share created above, create a persistent volume. On the experiment host, save the volume definition JSON as a file Pv.json. Once it is created, you can view the new persistent volume with oc get pv; its status will be Available.

5. Create a persistent volume claim. Next, create a persistent volume claim declaring the application's storage requirements. On the experiment host, save the definition as a file Pvc.json. The claim states that 3GB of back-end storage is needed and that the access mode is ReadWriteOnce. Looking at the status of the claim and the volume afterwards, you will see that the system has connected them: the status of both has become Bound.

6. Associate the persistent volume claim. Restore the backed-up data to the NFS directory created earlier. At this point you can test deleting the registry container; the replication controller will recreate it. Once the new container has started, check its /registry directory again, and you will find that the data has disappeared, because containers do not persist data by default. Now add a persistent volume claim named docker-registry-claim to the registry container definition and associate it with the mount point registry-storage. After the container definition in the deployment config is modified, OpenShift creates a new container instance. Check the /registry directory once more, and you will find the data restored. At this point we have successfully attached the registry component to persistent storage. This example's configuration is based on an NFS persistent volume; the process for GlusterFS or Ceph persistent volumes is similar, requiring only slight modifications to the persistent volume definition. The commands and manifests for this walkthrough are sketched below.
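The original command listings and JSON files were not preserved in this copy, so the following reconstruction rests on stated assumptions: the registry pod name, NFS server address, export path, and capacity are placeholders to adjust for your environment.

    # Step 2: back up the registry data from the running container to the host.
    # Replace docker-registry-1-abcde with your pod name from "oc get pods".
    oc rsync docker-registry-1-abcde:/registry ./registry-backup

    # Step 3: create and export an NFS shared directory (run on the NFS server).
    mkdir -p /exports/registry
    chmod 777 /exports/registry
    echo "/exports/registry *(rw,sync,no_root_squash)" >> /etc/exports
    exportfs -a

A plausible Pv.json for step 4:

    {
      "apiVersion": "v1",
      "kind": "PersistentVolume",
      "metadata": { "name": "registry-pv" },
      "spec": {
        "capacity": { "storage": "3Gi" },
        "accessModes": [ "ReadWriteOnce" ],
        "nfs": { "server": "192.168.1.10", "path": "/exports/registry" }
      }
    }

And a matching Pvc.json for step 5:

    {
      "apiVersion": "v1",
      "kind": "PersistentVolumeClaim",
      "metadata": { "name": "docker-registry-claim" },
      "spec": {
        "accessModes": [ "ReadWriteOnce" ],
        "resources": { "requests": { "storage": "3Gi" } }
      }
    }

    # Create both objects and watch them bind.
    oc create -f Pv.json
    oc create -f Pvc.json
    oc get pv && oc get pvc

    # Step 6: restore the backup into the export, then attach the claim to the
    # registry's deployment config at the mount point registry-storage.
    cp -a ./registry-backup/. /exports/registry/
    oc volume dc/docker-registry --add --overwrite --name=registry-storage \
      -t pvc --claim-name=docker-registry-claim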
