Analysis of the Kubernetes Application Deployment Model (Principles)

Source: Internet
Author: User
Tags: etcd


Abstract: This series of articles focuses on actual deployment to give you a quick grasp of Kubernetes. This article describes the principles and concepts you need to understand before deployment, including the Kubernetes component structure, the function of each component role, and the Kubernetes application model.

[Editor's note] Kubernetes can be used to manage Linux container clusters, accelerating development and simplifying operations and maintenance (in other words, DevOps). At present, however, articles about Kubernetes far outnumber reports of actual usage. This series focuses on actual deployment to give you a quick grasp of Kubernetes. This first article introduces the principles and concepts you need to understand before deployment, including the Kubernetes component structure, the function of each component role, and the Kubernetes application model.

For more than a decade, Google has used containers to run its business in production. The system responsible for managing its container clusters is Borg, the predecessor of Kubernetes. In fact, many Google developers working on Kubernetes previously worked on Borg. Most of the Kubernetes application deployment model originates from Borg, and understanding that model is the key to understanding Kubernetes. The Kubernetes API version is currently v1; this article describes the application deployment model based on the 0.18.2 codebase. Finally, a simple use case illustrates the deployment process, after which we explain how iptables rules are used to implement the various kinds of Service access.

 

Kubernetes Architecture

 

A Kubernetes cluster contains two roles: Kubernetes agent nodes and the Kubernetes service (master) node. The components of the agent role are kube-proxy and kubelet; they are deployed together on one node, which is then an agent (proxy) node. The components of the service role are kube-apiserver, kube-scheduler, and kube-controller-manager. They can be deployed on the same node or on different nodes (although the latter does not appear to work in the current version). A Kubernetes cluster depends on two third-party components: etcd and Docker. The former provides state storage, and the latter manages containers. The cluster can also use distributed storage to provide storage space for containers.

 

Kubernetes Proxy Node

Kubelet and kube-proxy run on the proxy node. They watch the service (master) node for information, start containers, and implement the Kubernetes network and other business models such as Services and Pods. Each proxy node also runs Docker, which is responsible for downloading container images and running containers.

 

 

Kubelet

 

The kubelet component manages Pods and their containers, images, volumes, and related information.

 

Kube-Proxy

Kube-proxy is a simple network proxy and load balancer. It implements the Service model; every Service is reflected on every node running kube-proxy. Based on the Pods covered by a Service's selector, kube-proxy load-balances across those Pods to serve the Service's clients.

 

 

Kubernetes Service Node

The Kubernetes service components form the control plane of Kubernetes. Currently they run on a single node, but in the future they will be deployed separately to support high availability.

 

 

Etcd

All persistent state is stored in etcd. Etcd also supports watches, so components can easily be notified of changes in system state and respond and coordinate quickly.

 

 

Kubernetes API Server

This component serves the API, responds to REST operations, validates API objects, and updates the corresponding objects in etcd.

 

 

Scheduler

The scheduler assigns Pods to nodes through the Kubernetes /binding API. The scheduler is pluggable, and Kubernetes will support custom schedulers in the future.

 

 

Kubernetes Controller Manager Server

The controller manager server is responsible for all other functions. For example, the endpoints controller creates and updates Endpoints objects, and the node controller handles node discovery, management, and monitoring. In the future these controllers may be split out and implemented as plug-ins.

 

 

Kubernetes Model

The strength of Kubernetes lies in its application deployment model, which includes the Pod, the Replication Controller, Labels, and the Service.

 

 

Pod

The minimum deployment unit of Kubernetes is the Pod rather than the container. As first-class API citizens, Pods can be created, scheduled, and managed. Simply put, like peas in a pod, the application containers in a Pod share the same context:

 

 

  1. PID namespace: applications in the same Pod can see one another's process IDs (not yet supported by Docker).
  2. Network namespace: containers in the same Pod share the same IP address and port space.
  3. IPC namespace: applications in the same Pod can communicate using SystemV IPC and POSIX message queues.
  4. UTS namespace: applications in the same Pod share a host name.
  5. Each container in the Pod can also access shared volumes defined at the Pod level.

 

In terms of lifecycle, Pods should be short-lived rather than long-lived applications. A Pod is scheduled to a node and remains there until it is destroyed. When a node dies, the Pods assigned to it are deleted. Pod migration may be implemented in the future. In practice we generally do not create Pods directly; instead we use a Replication Controller to create, replicate, monitor, and destroy them. A Pod can contain multiple containers, and they often collaborate to perform one application function.
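
To make the Pod model concrete, here is a minimal sketch of a Pod definition written as a Python dictionary that mirrors the v1 API object; the names, images, and paths below are hypothetical and chosen only for illustration.

# A minimal sketch of a v1 Pod: two containers sharing the Pod's network
# namespace and a Pod-level volume. All names and images are hypothetical.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web-pod", "labels": {"app": "web"}},
    "spec": {
        "volumes": [{"name": "shared-data", "emptyDir": {}}],
        "containers": [
            {
                "name": "nginx",
                "image": "nginx",
                "ports": [{"containerPort": 80}],
                "volumeMounts": [
                    {"name": "shared-data", "mountPath": "/usr/share/nginx/html"}
                ],
            },
            {
                "name": "content-writer",
                "image": "busybox",
                # Runs next to nginx: same IP, same port space, same volume.
                "command": ["sh", "-c", "echo hello > /data/index.html; sleep 3600"],
                "volumeMounts": [{"name": "shared-data", "mountPath": "/data"}],
            },
        ],
    },
}

Such a dictionary can be serialized to JSON or YAML and submitted to the API server.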

 

Replication controller

The Replication Controller ensures that a given number of Pod replicas are running. If there are too many, the controller kills some; if there are too few, it starts more. It also maintains this count across node failures and maintenance. For this reason we strongly recommend using a Replication Controller even when the replica count is 1, instead of creating the Pod directly.

 

In terms of lifecycle, a Replication Controller does not terminate on its own, but it is not expected to live as long as a Service. A Service can manage Pods that span multiple Replication Controllers, and Replication Controllers can be created and deleted within the lifetime of a Service. Neither the Service nor client programs are aware of the Replication Controller's existence.

Pods created by a Replication Controller are interchangeable and semantically identical, which makes this model especially suitable for stateless services.
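
As an illustration (not taken from the original article), a Replication Controller definition might be sketched as a Python dictionary like the following; the name, label, and image are hypothetical.

rc = {
    "apiVersion": "v1",
    "kind": "ReplicationController",
    "metadata": {"name": "nginx-rc"},
    "spec": {
        "replicas": 3,                      # keep three interchangeable Pods running
        "selector": {"app": "nginx"},       # must match the labels in the template
        "template": {
            "metadata": {"labels": {"app": "nginx"}},
            "spec": {
                "containers": [
                    {"name": "nginx", "image": "nginx",
                     "ports": [{"containerPort": 80}]}
                ]
            },
        },
    },
}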

A Pod is an ephemeral object: it is created and destroyed and is not resurrected. The Replication Controller dynamically creates and destroys Pods. Although each Pod is assigned an IP address, that address is not durable. This raises a question: how can consumers reliably access the services the Pods provide?

 

Service

A Service defines a logical set of Pods and a policy for accessing that set. The set is determined by the Label selector supplied in the Service definition. For example, suppose three Pod replicas back an image-processing backend. The replicas are logically identical, and the frontend does not care which one serves it. Even though the actual Pods making up the backend may change, the frontend clients are not aware of the change and do not track the backend. A Service is the abstraction that achieves this decoupling.

 

For a Service, we can also define Endpoints, which dynamically connect the Service to its Pods.
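
Continuing the image-processing example, a Service definition might be sketched as follows, again as a Python dictionary mirroring the v1 API object; the name, label, and port numbers are hypothetical.

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "image-backend"},
    "spec": {
        # Pods carrying this label form the logical set behind the Service;
        # Kubernetes keeps the matching Endpoints object up to date.
        "selector": {"app": "image-backend"},
        "ports": [{"protocol": "TCP", "port": 1234, "targetPort": 8080}],
    },
}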

 

Service Cluster IP and kube-proxy

Each proxy node runs a kube-proxy process. This process watches the master for changes to Service and Endpoints objects. For each Service, it opens a local port; any connection made to that port is proxied to one of the Pod IP addresses and ports in the backend Pod set. After a Service is created, its Endpoints object lists the IP addresses and ports of the backend Pods, and kube-proxy chooses backends from the list maintained in this Endpoints object. The sessionAffinity attribute of the Service object also influences which backend kube-proxy selects: by default a backend Pod is chosen at random, but setting service.spec.sessionAffinity to "ClientIP" sends traffic from the same client IP to the same backend. In terms of implementation, kube-proxy installs iptables rules that redirect traffic destined for the Service's Cluster IP and port to this local port. The following sections explain what the Cluster IP of a Service is.
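
The backend selection just described can be sketched roughly as follows. This is not kube-proxy's actual code, only a minimal Python illustration of random selection versus ClientIP session affinity; the IP addresses are made up.

import random

def pick_backend(endpoints, client_ip, session_affinity="None", affinity_table=None):
    # endpoints: list of (pod_ip, port) tuples taken from the Endpoints object.
    # affinity_table: remembers which backend a client IP was last sent to.
    if affinity_table is None:
        affinity_table = {}
    if session_affinity == "ClientIP" and client_ip in affinity_table:
        return affinity_table[client_ip]      # stick with the previous backend
    backend = random.choice(endpoints)        # default: pick a backend at random
    if session_affinity == "ClientIP":
        affinity_table[client_ip] = backend
    return backend

backends = [("10.244.1.5", 8080), ("10.244.2.7", 8080), ("10.244.3.2", 8080)]
print(pick_backend(backends, "192.168.1.10", session_affinity="ClientIP"))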

 

Note: in versions earlier than 0.18, the Cluster IP was called the PortalNet IP.

 

Internal user service discovery

Pods created in the cluster, and clients running on the cluster's proxy nodes, are called internal users. To expose Services to internal users, Kubernetes supports two methods: environment variables and DNS.

 

 

Environment Variable

When kubelet starts a Pod on a node, it sets a series of environment variables in the Pod's containers for each Service that is already running, so that the Pod can reach those Services. In general, the variables {SVCNAME}_SERVICE_HOST and {SVCNAME}_SERVICE_PORT are used, where {SVCNAME} is the Service name in upper case with hyphens converted to underscores. For example, for the Service "redis-master", whose port is TCP 6379 and whose assigned Cluster IP is 10.0.0.11, kubelet would generate the following variables for a newly created Pod's containers:

 

REDIS_MASTER_SERVICE_HOST=10.0.0.11
REDIS_MASTER_SERVICE_PORT=6379
REDIS_MASTER_PORT=tcp://10.0.0.11:6379
REDIS_MASTER_PORT_6379_TCP=tcp://10.0.0.11:6379
REDIS_MASTER_PORT_6379_TCP_PROTO=tcp
REDIS_MASTER_PORT_6379_TCP_PORT=6379
REDIS_MASTER_PORT_6379_TCP_ADDR=10.0.0.11

Note that only Pods created after a Service exists receive that Service's environment variables.
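
For instance, application code running in such a Pod could read these variables as in the Python sketch below; the fallback values are arbitrary and only for illustration.

import os

# Read the redis-master Service address injected by kubelet (sketch only).
redis_host = os.environ.get("REDIS_MASTER_SERVICE_HOST", "127.0.0.1")
redis_port = int(os.environ.get("REDIS_MASTER_SERVICE_PORT", "6379"))
print("connecting to redis at %s:%d" % (redis_host, redis_port))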

 

DNS

An optional (and recommended) Kubernetes add-on is the cluster DNS service. It watches the Service objects in the cluster and creates DNS records for each of them. In this way, all Pods can reach Services by name through DNS.

 

For example, if we have a Service named "my-service" in the Kubernetes namespace "my-ns", the DNS service creates a DNS record for "my-service.my-ns". Pods in the same namespace can use the name "my-service" to obtain the Cluster IP assigned to the Service; Pods in other namespaces must use the fully qualified name "my-service.my-ns".
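
A Pod can resolve the name with an ordinary DNS lookup; the sketch below assumes it runs inside a cluster whose DNS add-on is active.

import socket

# Resolve the Service's Cluster IP by its fully qualified name.
cluster_ip = socket.gethostbyname("my-service.my-ns")
print("my-service resolves to %s" % cluster_ip)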

 

Pod IP and Service Cluster IP

A Pod's IP address actually exists on a network interface (possibly a virtual device). The Service Cluster IP is different: no network device answers for that address. Kube-proxy uses iptables rules to redirect traffic sent to it to its own local port and then balances the traffic across the backend Pods. The Service environment variables and DNS records described above both use the Service's Cluster IP and port.

 

Take the image-processing backend mentioned above as an example. When its Service is created, Kubernetes assigns it an address, say 10.0.0.1, allocated from the pool specified by the API server's service-cluster-ip-range parameter (portal_net in older versions), for example --service-cluster-ip-range=10.0.0.0/16. Assume the Service's port is 1234. Every kube-proxy in the cluster notices this Service. When a proxy discovers the new Service, it opens an arbitrary port on its local node, creates corresponding iptables rules that redirect the Service IP and port to this new port, and starts accepting connections for the Service.

When a client accesses the Service, these iptables rules take effect: the client's traffic is redirected to the port kube-proxy opened for the Service, and kube-proxy randomly selects a backend Pod to serve the client.

Under the Kubernetes network model, a client on any proxy node can reach a Service through its Cluster IP and port. To make the Service reachable from outside the cluster, we need to give it an external IP address.

 

External Access Service

An IP address assigned to a Service from the Cluster IP range pool is only reachable from inside the cluster. That is fine when the Service is an internal layer of an application. But if the Service is a frontend that serves clients outside the cluster, it needs a public IP address.

 

External visitors are clients that reach the cluster's proxy nodes from outside. To serve them, we can specify spec.publicIPs when defining the Service; typically a publicIP is the physical IP address of a proxy node. As with the virtual addresses allocated from the Cluster IP range, kube-proxy installs iptables redirection rules for these public IPs and forwards the traffic to the backend Pods. With public IPs in place, we can use ordinary Internet techniques, such as an external load balancer, to organize external access to the Service.

In newer versions, spec.publicIPs is marked as deprecated and replaced by spec.type=NodePort. For a Service of this type, the system allocates a node-level port on every proxy node in the cluster, and any client that can reach a proxy node can access the Service through that port.
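
A NodePort Service might be sketched like this (a Python dictionary mirroring the v1 API object; the name, label, and port numbers are hypothetical).

nodeport_service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "image-frontend"},
    "spec": {
        "type": "NodePort",
        "selector": {"app": "image-frontend"},
        # External clients reach the Service at <any-node-ip>:30080.
        "ports": [{"port": 80, "targetPort": 8080, "nodePort": 30080}],
    },
}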

 

Label and Label selector

Labels play an important role in the Kubernetes model. A Label is a key/value pair attached to an object managed by Kubernetes, typically a Pod. Labels define identifying attributes of these objects and are used to organize and select them. A Label can be attached to an object when it is created, or managed through the API while the object exists.

 

Once an object's Labels are defined, other objects can use Label selectors to specify which objects they act on.

Two kinds of Label selectors are available: equality-based and set-based.

For example, equality-based selectors look like this:

 

environment = production
tier != frontend
environment = production,tier != frontend

 

Of the selectors above, the first matches objects whose environment key equals production, and the second matches objects whose tier key is not equal to frontend. Because Kubernetes combines requirements with AND logic, the third matches objects whose environment is production and whose tier is not frontend.

Set-based selectors look like this:

 

environment in (production, qa)
tier notin (frontend, backend)
partition

 

The first selects objects whose environment key has the value production or qa. The second selects objects that have a tier key whose value is neither frontend nor backend. The third selects any object that has a partition key, regardless of its value.
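
To illustrate how such requirements are evaluated against an object's Labels, here is a simplified Python sketch. It follows the article's description; the real selector implementation (for example, how objects lacking the key are treated by != and notin) may differ in detail.

def equality_match(labels, key, value, equals=True):
    # key = value  or  key != value
    if equals:
        return labels.get(key) == value
    return key in labels and labels[key] != value

def set_match(labels, key, values=None, op="in"):
    # key in (...), key notin (...), or bare key existence
    if op == "exists":
        return key in labels
    if op == "in":
        return labels.get(key) in values
    return key in labels and labels[key] not in values   # "notin"

pod_labels = {"environment": "production", "tier": "backend", "partition": "a"}
print(equality_match(pod_labels, "environment", "production"))              # True
print(equality_match(pod_labels, "tier", "frontend", equals=False))         # True
print(set_match(pod_labels, "environment", {"production", "qa"}))           # True
print(set_match(pod_labels, "tier", {"frontend", "backend"}, op="notin"))   # False
print(set_match(pod_labels, "partition", op="exists"))                      # True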

The Replication Controller and the Service use Labels and Label selectors to dynamically bind to their target objects. When defining the Pod template, a Replication Controller specifies the Labels of the Pods it will create and the selector that matches those Pods, and the API server verifies the definition. We can dynamically modify the Labels of Pods created by a Replication Controller, for example for debugging or data recovery. Once a Pod is removed from a Replication Controller because its Labels changed, the Replication Controller immediately starts a new Pod to maintain the replica count. For a Service, the Label selector selects the Service's backend Pods.


Next article: Analysis of the Kubernetes Application Deployment Model (Deployment)

Author profile: Yong Yongsheng, an architect at Kyushu Cloud, has years of development experience in Linux, J2EE, and cloud computing technologies. He is currently active in various OpenStack community projects. His main technical focus is the virtual network project Neutron, and he was one of the major contributors to the Neutron project in its early stages.

