Kubernetes is not just a container management tool. It is a platform designed to handle a variety of workloads packaged in any number of containers and combinations. Kubernetes has multiple built-in controllers that map to the various layers of a cloud-native architecture.
DevOps engineers can view Kubernetes controllers as a means of specifying the infrastructure requirements of the various workloads the team runs. They define the required configuration state through declarative methods. For example, containers/pods deployed as part of a ReplicationController are guaranteed to always be available. A container packaged as a DaemonSet is guaranteed to run on every node of the cluster. The declarative approach enables DevOps teams to control the infrastructure through code. Some of the deployment models discussed below follow the principles of immutable infrastructure, where each update results in a new, atomic deployment rather than an in-place change.
DevOps engineers can define the required configuration state through declarative methods; each workload is mapped to a controller.
Understanding cloud-native use cases
The Kubernetes control plane keeps track of deployments to ensure that they meet the required configuration state defined by the DevOps team.
The basic deployment unit in Kubernetes is the pod: the basic building block of Kubernetes and the smallest, simplest unit in the Kubernetes object model. A pod represents a process running on the cluster. Whether a service is stateful or stateless, it is always packaged and deployed as a pod.
A controller can create and manage multiple pods in the cluster, handling replicas and providing self-healing capabilities at the cluster level. For example, if a node fails, the controller can automatically replace the failed pod by scheduling an identical pod on a different node.
Kubernetes ships with multiple controllers that can handle the required pod state, such as the ReplicationController, Deployment, DaemonSet, and StatefulSet controllers. Each controller uses the provided pod template to create the desired state of the pods it is responsible for. Like other Kubernetes objects, pods are defined in YAML files and submitted to the control plane.
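As a point of reference, a minimal pod definition might look like the following sketch; the names and the nginx image are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: web            # illustrative pod name
  labels:
    app: web
spec:
  containers:
  - name: nginx
    image: nginx:1.25  # any container image works here
    ports:
    - containerPort: 80

In practice, pods are rarely submitted directly like this; one of the controllers described below usually creates them from a pod template.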
When running cloud-native applications in Kubernetes, operators need to understand the use cases that each controller addresses in order to take full advantage of the platform's features. This helps them define and maintain the required configuration state of the application.
Each pattern introduced in the previous section maps to specific Kubernetes controllers. These controllers allow more precise, fine-grained control over Kubernetes workloads, in an automated manner.
Kubernetes' declarative configuration encourages an immutable infrastructure. The control plane tracks and manages deployments to ensure that the required configuration state is maintained throughout the application lifecycle. Compared to traditional VM-based deployments, DevOps engineers spend far less time maintaining workloads. An effective CI/CD strategy that leverages Kubernetes primitives and deployment patterns frees operators from tedious tasks.
Scalable layer: stateless workloads
Stateless workloads are packaged and deployed as ReplicaSets in Kubernetes. The ReplicationController forms the basis of the ReplicaSet: it ensures that a specified number of pod replicas is running at any given time. In other words, a ReplicationController ensures that a pod, or a group of identical pods, is always available.
If there are too many pods, the ReplicationController terminates the extra ones. If there are too few, it starts additional pods. Unlike manually created pods, pods maintained by a ReplicationController are automatically replaced when they fail, are deleted, or are terminated; pods are re-created on a node after disruptive maintenance such as a kernel upgrade, for example. For this reason, using a ReplicationController is recommended even if the application requires only a single pod.
A simple use case is to create one ReplicationController object to reliably run a single instance of a pod indefinitely. A more complex use case is to run several identical replicas of a scaled-out service, such as a web server. When deploying to Kubernetes, DevOps teams and operators package stateless workloads as ReplicationControllers.
In recent versions of Kubernetes, ReplicaSets have replaced ReplicationControllers. Both target the same scenario, but a ReplicaSet uses set-based label selectors, which allow more expressive queries against labels. In addition, Deployments in Kubernetes rely on ReplicaSets.
A Deployment is an abstraction over ReplicaSets. When the desired state is declared in a Deployment object, the Deployment controller changes the actual state to the desired state at a controlled rate.
Deployments are highly recommended for managing the stateless services of cloud-native applications. Although a service can be deployed as a pod or a ReplicaSet, Deployments make upgrading and patching applications much easier. DevOps teams can upgrade pods in place with a Deployment, which they cannot do with a ReplicaSet alone. A new version of an application can thus be rolled out with minimal downtime. Deployments bring PaaS-like capabilities to application management.
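To make this concrete, a minimal Deployment for a stateless web tier might look like the following sketch; the app label, replica count, and nginx image are placeholders:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                 # desired number of identical pod replicas
  selector:
    matchLabels:
      app: web
  template:                   # pod template from which the ReplicaSet creates pods
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.25     # updating this image triggers a controlled rolling update
        ports:
        - containerPort: 80

Changing the image tag in this manifest and reapplying it is all it takes to trigger a rolling update; the Deployment controller creates a new ReplicaSet and gradually shifts pods over to it.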
Persistence layer: stateful workloads
Stateful workloads can be divided into two categories: services that require persistent storage (single-instance) and services that must run in a highly reliable and available mode (replicated multi-instance). A pod that needs access to a durable storage backend is very different from a set of pods running a relational database cluster. While the former needs long-term durability, the latter needs high availability of the workload. Kubernetes addresses both scenarios.
A single pod can be backed by exposing the underlying storage to the service through a volume. The volume can be mapped to whichever node the pod is scheduled on. If multiple pods scheduled on different nodes of the cluster need to share a backend, a distributed file system such as Network File System (NFS) or Gluster must be configured manually before the application is deployed. Modern storage drivers in the cloud-native ecosystem offer container-native storage, where the file system itself is exposed through containers. Use this configuration when pods simply need persistence and durability.
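As a sketch of the single-instance case (the claim name, image, and storage size are illustrative), a pod can request durable storage through a PersistentVolumeClaim and mount it as a volume:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
  - ReadWriteOnce          # a single node may mount the volume read-write
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: single-instance-db
spec:
  containers:
  - name: db
    image: postgres:16     # illustrative single-instance database
    volumeMounts:
    - name: data
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-claim   # binds the pod to the claim above

If the pod is rescheduled, the claim, and therefore the data, survives independently of the pod's lifecycle.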
For scenarios where high availability is expected, Kubernetes provides StatefulSets: a specialized set of pods with guaranteed ordering and uniqueness. This is especially useful for running primary/secondary (formerly known as master/slave) database cluster configurations.
Similar to a Deployment, a StatefulSet manages pods based on an identical container specification. Unlike a Deployment, a StatefulSet maintains a sticky identity for each pod. The pods are created from the same specification but are not interchangeable: each one has a persistent identifier that it retains across any rescheduling.
StatefulSets are valuable for workloads that require one or more of the following:
Stable, unique network identifiers.
Stable, persistent storage.
Ordered, graceful deployment and scaling.
Ordered, graceful deletion and termination.
Ordered, automated rolling updates.
Kubernetes treats StatefulSets differently from other controllers. When a StatefulSet with N replicas is scheduled, its pods are created sequentially, in order from 0 to N-1. When the pods of a StatefulSet are deleted, they are terminated in reverse order, from N-1 to 0. Before a scaling operation is applied to a pod, all of its predecessors must be running and ready; likewise, Kubernetes ensures that all of a pod's successors are fully shut down before the pod itself is terminated.
StatefulSets are recommended when a service needs to run Cassandra, MongoDB, MySQL, or PostgreSQL clusters, or any database workload with high-availability requirements.
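A minimal StatefulSet might look like the following sketch; the names, image, and storage size are illustrative, and the referenced headless Service (db) is assumed to exist:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db           # headless Service that gives each pod a stable DNS name
  replicas: 3               # creates pods db-0, db-1, db-2, in order
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: postgres
        image: postgres:16
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:     # each replica gets its own PersistentVolumeClaim
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi

Unlike a Deployment, each replica here keeps its own identity (db-0, db-1, db-2) and its own volume across rescheduling.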
Not every persistent workload needs to be a StatefulSet. Some containers simply rely on a durable storage backend to store data. To add persistence to those types of applications, the pods can rely on volumes backed by host-based storage or by container-native storage backends.
Parallelizable layer: batch processing
Kubernetes has built-in primitives for batch processing, which are useful for run-to-completion and scheduled jobs.
Run-to-completion jobs are typically used for processes that perform an operation and exit. A big data workload that runs until the data has been processed is one example; another is a job that processes each message in a queue until the queue is empty.
A Job is a controller that creates one or more pods and ensures that a specified number of them terminate successfully. As pods complete successfully, the Job tracks the successful completions. When the specified number of successful completions is reached, the Job itself is complete. Deleting a Job cleans up the pods it created.
Jobs can also run multiple pods in parallel, which makes them ideal for machine-learning training. Jobs also support parallel processing of a set of independent but related work items.
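As a sketch, a Job that works through a queue with a fixed number of successful completions might look like this; the image and counts are hypothetical:

apiVersion: batch/v1
kind: Job
metadata:
  name: queue-worker
spec:
  completions: 5        # the Job is complete after 5 pods finish successfully
  parallelism: 2        # run at most 2 worker pods at a time
  template:
    spec:
      restartPolicy: Never     # completed pods are not restarted in place
      containers:
      - name: worker
        image: example.com/queue-worker:1.0   # hypothetical worker image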
Machine-learning training can take advantage of Jobs when Kubernetes runs on hardware equipped with GPUs. Emerging projects such as Kubeflow, a project dedicated to making deployments of machine learning on Kubernetes simple, portable, and scalable, package machine-learning training primitives as Jobs.
Besides running parallelizable jobs, you may also need to run scheduled jobs. Kubernetes exposes CronJobs, which run either once at a specified point in time or periodically on a specified schedule. A CronJob object in Kubernetes is similar to one line of a crontab (cron table) file in Unix: it runs a job periodically on a given schedule, written in cron format.
Cron jobs are particularly useful for scheduling regular jobs such as database backups or sending emails.
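A CronJob wrapping such a backup might be sketched as follows; the schedule and image are illustrative:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: db-backup
spec:
  schedule: "0 2 * * *"        # standard cron format: every day at 02:00
  jobTemplate:                 # the Job created on each scheduled run
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: backup
            image: example.com/db-backup:1.0   # hypothetical backup image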
Event-driven layer: Serverless
Serverless computing refers to the concept of building and running applications without having to manage servers. It describes a more fine-grained deployment model in which applications, bundled as one or more functions, are uploaded to a platform and then executed, scaled, and billed in response to the exact demand of the moment.
Functions as a Service (FaaS) runs within a serverless computing environment to provide event-driven computing. Developers run and manage application code using functions triggered by events or HTTP requests. They deploy small units of code to the FaaS platform, where the code is executed on demand as independent components and scaled without the need to manage servers or any other underlying infrastructure.
Although Kubernetes does not include an integrated event-driven primitive that responds to alerts and events raised by other services, there are efforts to bring event-driven capabilities to it. The Cloud Native Computing Foundation, the custodian of Kubernetes, hosts a working group dedicated to serverless. Open source projects such as Apache OpenWhisk, Fission, Kubeless, OpenFaaS, and Oracle's Fn can run as an event-driven serverless layer on a Kubernetes cluster.
Code deployed in a serverless environment is fundamentally different from code packaged as a pod. It consists of autonomous functions that can be wired to one or more events that trigger the code.
When event-driven, serverless computing becomes an integral part of Kubernetes, developers will be able to deploy functions that respond to internal events generated by the Kubernetes control plane as well as custom events raised by application services.
Legacy layer: Headless Service
Even when an organization routinely uses a microservices architecture to build and deploy applications in containers on the cloud, some applications will continue to live outside of Kubernetes. Cloud-native applications and services must interact with these traditional, monolithic applications.
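The headless Service pattern is one way to bridge the two worlds. As an illustrative sketch (the service name, port, and IP address are hypothetical), a Service without a pod selector can be paired with a manually managed Endpoints object that points at the legacy system, so in-cluster workloads can address it by a stable DNS name:

apiVersion: v1
kind: Service
metadata:
  name: legacy-billing        # hypothetical name for the external monolith
spec:
  clusterIP: None             # headless: DNS resolves directly to the endpoints below
  ports:
  - port: 5432
---
apiVersion: v1
kind: Endpoints
metadata:
  name: legacy-billing        # must match the Service name
subsets:
- addresses:
  - ip: 10.0.12.34            # hypothetical address of the legacy system outside the cluster
  ports:
  - port: 5432

Pods inside the cluster can then reach the legacy application at legacy-billing:5432, while the actual address stays in one easily updated place.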