Editor's note: This article covers a Docker-based PaaS platform built on Kubernetes and used for continuous integration and continuous deployment, focusing on how its network solution was selected and applied, and how the solution changed as business requirements grew, including:
- Kubernetes + flannel;
- Network customization based on Docker libnetwork;
- Kubernetes + Contiv + kube-haproxy;
- A fixed container IP scheme based on Kubernetes.
Background
The PaaS platform deploys applications as images, and each corporate business domain corresponds to an application on the platform. Application management on the platform covers both configuration management and run-state management. A running application corresponds to one Kubernetes Replication Controller (RC for short) and one Service, and each application instance corresponds to a Pod. Under this management model, the platform must provide calls between applications, and for some applications also direct HTTP/TCP access from outside.
Kubernetes + Flannel
First, Kubernetes + Flannel. Flannel mainly provides inter-host container communication; within Kubernetes' Pod/Service model, kube-proxy and iptables implement Pod-to-Service communication (external clients reach a Service via host port -> cluster IP:port -> pod IP:port).
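As a concrete illustration of that external access path, a Service of type NodePort opens a host port on every node, which kube-proxy's iptables rules translate to the cluster IP and then to one of the pod IPs. All names and port numbers below are assumptions for illustration, not the platform's actual configuration:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: demo-app            # illustrative application name
spec:
  type: NodePort
  selector:
    app: demo-app           # matches the pods created by the application's RC
  ports:
  - port: 8080              # cluster IP port
    targetPort: 8080        # pod port
    nodePort: 30080         # host port opened on every Kubernetes node
```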
Based on these network access characteristics, our platform provides the following features:
- Platform-domain-name access based on Gorouter, which watches Kubernetes Endpoints events to manage its routing information;
- Business-domain-name access between applications (Pods) in the same namespace, based on SkyDNS plus customized kube2sky and Kubelet components: kube2sky resolves and registers domain information from Kubernetes Service annotations, and the Kubelet sets the DNS search domains and external DNS servers when a container starts;
- A container TTY access console: each Kubernetes node runs the platform's TTY agent component, which establishes a TTY connection to the Kubernetes node a Pod belongs to, based on the Pod's node information.
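The first feature above, keeping router information current by watching Endpoints events, can be approximated with a small event handler. The sketch below is plain Python with a hand-written event shape that mirrors the Kubernetes watch API; the function name and the route-table dict are illustrative, since the platform's actual Gorouter integration is not public:

```python
def apply_endpoints_event(route_table, event):
    """Apply one Endpoints watch event to an in-memory routing table.

    `event` mirrors the shape of Kubernetes watch events:
    {"type": "ADDED"|"MODIFIED"|"DELETED", "object": <Endpoints as a dict>}.
    """
    obj = event["object"]
    name = obj["metadata"]["name"]
    if event["type"] in ("ADDED", "MODIFIED"):
        # flatten all ready addresses across the Endpoints subsets
        route_table[name] = [addr["ip"]
                             for subset in obj.get("subsets", [])
                             for addr in subset.get("addresses", [])]
    elif event["type"] == "DELETED":
        route_table.pop(name, None)
    return route_table
```

A real router would feed this from a watch stream on the apiserver and map each service name to its platform domain name.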
The network access diagram is as follows:
Under the Kubernetes + Flannel model, the container network is a closed subnet. It supports layer-4 and layer-7 calls between platform applications, and direct external access by domain name (working at layer 7), but it cannot satisfy users who need to access instances directly by IP from outside the platform.
Network customization based on Docker Libnetwork
After the Flannel network had been running stably, we began studying network plugins so that application service instances could be exposed to users directly via public IPs. At the time we were on Docker 1.8, which did not itself support network plugins; Kubernetes provided its own network plugin mechanism, but it had a bug ("invoked twice with non-infra container ID", kubernetes#20379, https://github.com/kubernetes/.../20379). So we started from the Docker network plugin side instead, customizing the network by integrating libnetwork at the Docker source-code level.
The entire architecture is divided into three tiers:
- Client layer: the Docker CLI and Kubernetes (as a Docker client);
- Docker layer: the Docker daemon with libnetwork integrated at the code level (built-in OVS driver);
- Controller layer: ovsdb-server and a network controller (self-developed IPAM).
The entire solution consists of the following three processes:
Starting the Docker daemon:
Initialize the network controller and load the OVS driver; the OVS driver calls libovsdb to create the docker0-ovs bridge, then attaches one of the host's physical NICs to docker0-ovs.
Starting a container:
The OVS driver creates a veth pair to connect the network namespaces; it calls the network controller to obtain the container's IP and VLAN tag; it attaches one end of the veth pair to docker0-ovs and sets the VLAN tag; it then sets the container interface's IP and MAC address and its routes, and brings each network interface up.
Stopping a container:
The OVS driver calls the network controller to release the container's IP, deletes the network link, and calls libovsdb to delete the port.
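The container-start steps above amount to a fixed sequence of network operations. The sketch below expresses that sequence as argv-style commands for readability; this is a hedged illustration only, since the real driver goes through libovsdb and netlink APIs rather than shelling out, and the veth naming scheme here is invented. The bridge name docker0-ovs comes from the text:

```python
def container_start_commands(container_id, ip_cidr, gateway, vlan_tag):
    """Return the command sequence wiring one container into docker0-ovs."""
    veth_host = "veth" + container_id[:8]   # host-side end of the veth pair (invented scheme)
    veth_cont = "eth0"                      # container-side end
    return [
        # 1. create a veth pair linking the host and the container netns
        ["ip", "link", "add", veth_host, "type", "veth", "peer", "name", veth_cont],
        # 2. plug the host end into the OVS bridge with the VLAN tag from the controller
        ["ovs-vsctl", "add-port", "docker0-ovs", veth_host, "tag=%d" % vlan_tag],
        ["ip", "link", "set", veth_host, "up"],
        # 3. move the container end into the container's network namespace
        ["ip", "link", "set", veth_cont, "netns", container_id],
        # 4. set IP and default route inside the namespace and bring the interface up
        ["ip", "netns", "exec", container_id, "ip", "addr", "add", ip_cidr, "dev", veth_cont],
        ["ip", "netns", "exec", container_id, "ip", "link", "set", veth_cont, "up"],
        ["ip", "netns", "exec", container_id, "ip", "route", "add", "default", "via", gateway],
    ]
```

Stopping a container is the mirror image: release the IP back to the controller, then `ovs-vsctl del-port` and delete the link.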
Kubernetes + Contiv + kube-haproxy
As Docker evolved, Docker 1.9 began to support the Contiv netplugin, so we started studying Contiv for our applications, and at the same time completed replacing kube-proxy with HAProxy (https://github.com/Adohe/kube2haproxy). We finally went live on Docker 1.10 with Contiv.
Contiv's network model was explained in detail in a previous share (http://dockone.io/article/1691). Here we describe the overall Contiv deployment structure within our PaaS according to the actual network access relationships:
Kube-haproxy replaces kube-proxy mainly to expose Service IPs for public calls, while avoiding the large growth of iptables rules that comes with an increasing number of containers, which also makes debugging easier. Note that the machines running HAProxy must have IPs on the same network segment as the Kubernetes Service IPs.
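As an illustration of this arrangement, a kube2haproxy-style tool renders the current Service and endpoint state into an HAProxy configuration roughly like the following; the service name, IPs, and ports here are made up:

```
# frontend binds on the kubernetes Service IP (same segment as the HAProxy host)
frontend svc-demo-app
    bind 10.254.0.10:8080
    default_backend be-demo-app

# backend balances across the Service's current pod endpoints
backend be-demo-app
    balance roundrobin
    server pod-1 10.244.1.5:8080 check
    server pod-2 10.244.2.7:8080 check
```

When Endpoints change, the tool regenerates this file and reloads HAProxy, replacing the per-service iptables rules kube-proxy would otherwise maintain.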
The convenience of Contiv is that users can access instances directly by IP, but we hit one problem during use: a machine-room power outage led to some IPs being allocated incorrectly, and at the time Contiv provided no interface for viewing the assigned IPs.
Application Container IP Pinning
Docker 1.10 supports starting a container with a designated IP. Since some applications have a hard requirement for fixed instance IPs, we began designing and developing a fixed container IP solution.
As mentioned earlier, a running application corresponds to one Replication Controller and one Service within Kubernetes. The current redeployment strategy is primarily a rebuild strategy: delete the RC and all Pods under it, then update and create a new RC (Kubernetes generates new Pods from the RC configuration).
In the default Kubernetes + Contiv network environment, container (Pod) IP network connectivity is handled by the Contiv network plugin, and Contiv Master performs only simple IP address assignment and recycling, so there is no guarantee that a Pod's IP stays the same across deployments. We therefore introduced a new Pod-level IPAM to ensure that a Pod's IP remains constant when the same application is deployed multiple times.
Since the Pod is the smallest schedulable unit in Kubernetes, we integrated this Pod-level IPAM directly into Kubernetes. The original Kubernetes Pod registry (which handles all Pod and Pod subresource requests: Pod creation and deletion, Pod binding and status updates, exec/attach/log) does not support assigning IPs to Pods; a Pod's IP is obtained from the Pod infra container's IP, which Contiv allocates dynamically.
The custom Pod registry's operation flowchart:
Based on the original Kubernetes code, we modified the Pod structure (adding a PodIP field to PodSpec), rewrote the Pod registry, and introduced two new resource objects:
- Pod IP Allocator: an etcd-based IP address allocator that implements the allocation and recovery of Pod IPs. It records IP address assignments in a bitmap and persists the bitmap to etcd;
- Pod IP Recycler: an etcd-based IP address recycle bin, and the core of the consistent Pod IP implementation. Keyed by the RC full name (namespace/RC name), it records the IP addresses each application has used, so that recycled IPs can be reused preferentially at the next deployment.
The Pod IP Recycler only reclaims the IPs of Pods created through an RC; the IPs of Pods created by other controllers or created directly are not recorded, so such Pods do not keep the same IP. The Pod IP Recycler also checks the TTL of each reclaimed IP object, currently set to a one-day retention period.
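The two objects can be sketched in Python as follows. This is an illustrative model only: the real implementation is Go code inside the Kubernetes apiserver and persists its state to etcd, whereas here plain dicts stand in for etcd and the class names simply mirror the text:

```python
import time

class PodIPAllocator:
    """Bitmap-style IP allocator over one /24 (etcd persistence omitted)."""
    def __init__(self, base="10.244.1.", first=2, last=254):
        self.base, self.first, self.last = base, first, last
        self.bitmap = {}  # offset -> True when allocated; a bitmap in etcd in reality

    def allocate(self, preferred=None):
        """Hand out `preferred` if it is free, otherwise the first free IP."""
        if preferred and preferred.startswith(self.base):
            off = int(preferred[len(self.base):])
            if not self.bitmap.get(off):
                self.bitmap[off] = True
                return preferred
        for off in range(self.first, self.last + 1):
            if not self.bitmap.get(off):
                self.bitmap[off] = True
                return self.base + str(off)
        raise RuntimeError("IP pool exhausted")

    def release(self, ip):
        self.bitmap.pop(int(ip[len(self.base):]), None)

class PodIPRecycler:
    """Recycle bin: RC full name (namespace/rc-name) -> released IPs, with a TTL."""
    def __init__(self, ttl=24 * 3600):  # one-day retention, as described in the text
        self.ttl = ttl
        self.recycled = {}  # rc_full_name -> list of (ip, released_at)

    def put(self, rc_full_name, ip):
        self.recycled.setdefault(rc_full_name, []).append((ip, time.time()))

    def get(self, rc_full_name):
        """Return the oldest non-expired recycled IP for this RC, or None."""
        now = time.time()
        entries = [(ip, t) for ip, t in self.recycled.get(rc_full_name, [])
                   if now - t < self.ttl]
        self.recycled[rc_full_name] = entries
        return entries.pop(0)[0] if entries else None
```

On redeployment, the registry first asks the recycler for the RC's previously used IPs and passes them to `allocate(preferred=...)`, which is how the same application keeps its IPs across rebuilds.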
The Kubelet was also modified: it now creates containers with the IP specified in the Pod spec (passed to docker run via --ip) and releases the IP when the Pod is deleted. The UML sequence diagram for creating a Pod is as follows:
There are two main Pod-creation scenarios in the PaaS:
- First deployment or scale-up of an application: IPs are mainly allocated randomly from the IP pool;
- Application redeployment: the freed IPs have already been stored in the recycle list under the RC full name, and IPs are taken from the recycle list first, achieving the fixed-IP effect.
The UML timing diagram for deleting pods is as follows:
Scenarios that trigger the Pod deletion process: deleting an application, scaling an application down, and redeploying under the rebuild strategy.
The overall deletion flow: Paasng or kube-controller-manager calls the apiserver's Pod delete API, which sets deletionTimestamp; the Kubelet observes the delete event and obtains the graceful deletion time, deletes the application container, and notifies the apiserver to release the IP (when the Pod belongs to an RC, whether the IP is stored in the recycle list depends on whether a corresponding RC name exists); it then deletes the pause Pod and notifies the apiserver to delete the Pod object.
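The release decision in this flow, always return the IP to the pool, but only park it in the recycle list when the Pod belongs to an RC, can be sketched as follows; the function and parameter names are illustrative, not the platform's actual code:

```python
def release_pod_ip(pod, allocator_release, recycler_put):
    """Release a deleted pod's IP.

    pod: dict with "ip" and, when created via an RC,
         "rc_full_name" (namespace/rc-name).
    """
    allocator_release(pod["ip"])      # always return the IP to the pool
    rc = pod.get("rc_full_name")
    if rc:                            # RC-owned pod: remember the IP for redeployment
        recycler_put(rc, pod["ip"])
        return "recycled"
    return "released"                 # bare pod / other controller: not recorded
```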
In addition, to guard against problems, we added extra REST APIs to Kubernetes, including querying assigned IPs and manually assigning/releasing IPs.
Summary
The fixed container IP scheme is now live and basically runs without problems, but its stability still needs improvement. The main symptom is old Pods not stopping within the expected time, so their IPs cannot be released and reused (the initial cause was occasional Docker stalls that prevented containers from being stopped within the specified time). Our short-term workaround is manual repair through the extra REST APIs; later we will incorporate monitoring and provide an automated remediation mechanism for IP-change issues caused by unpredictable problems, while continuing to harden and optimize the fixed-IP feature itself.
Q&A
Q: Can these IPs be accessed directly from the external network?
A: In the Flannel network, Pods cannot be accessed directly. In the Contiv network, the network segments allocated to Pods are tagged on the switch side, so by default the office network can access them directly.
Q: Your company has spent considerable effort on container interconnectivity and significantly increased complexity. Could you bypass these complications by using Kubernetes' host network directly?
A: It comes down to the company's business needs. The company has 1000+ business domains running in different Docker containers, and each business domain's configuration is basically fixed, such as the ports used by Tomcat or Nginx. With host networking, port conflicts would be the first problem, and the whole service management model would also have to change.
Q: The article mentions that an application's running state corresponds to one Kubernetes RC and Service. Wouldn't an RS be better than an RC?
A: Our Kubernetes version is 1.2, and in Kubernetes 1.2 the ReplicaSet was still at a very early stage; it only matured later. RS is indeed the recommended next generation of RC.
Q: What kinds of applications need fixed IPs? Is there any other way to avoid this?
A: Business domains call each other, and some require the caller to be on an IP whitelist. There are also business domains that need online data access, which requires adding appropriate firewall permissions, and so on.
Q: With fixed IPs, is the container's IP connected to the host network through a bridge to the host's bridge?
A: With fixed IPs, the Contiv-based network still works as before, except that the IP allocator is responsible for assigning the IP, and Docker is started with --ip at docker run time. Currently OVS works with a physical NIC on the host attached to the OVS bridge.
Q: With fixed IPs, do the assigned IPs need to be on the same network segment as the host?
A: The Kubernetes node host network segment and the Pod network segment are different. In principle they can be the same or different.
Q: Does Kubernetes support passing Docker startup parameters such as --ip?
A: Not by default; we made some changes to the Kubelet, for example passing in the VLAN ID by parameter and specifying docker run's --ip according to the IP assigned in the PodSpec.
Q: As far as I know, Contiv currently mostly supports CNM. How much custom development did you do for Kubernetes?
A: With Kubernetes we use CNM. Mostly we adapted the relevant components to the current Contiv, plus the later fixed-IP work (integrating the IPAM into the Kubernetes apiserver).
The above content was organized from the group share on the evening of November 8, 2016. Sharer:
Wang Chengchang, a senior development engineer on the PaaS platform, focuses on platform DevOps process optimization, continuous deployment, platform log collection, and Docker and Kubernetes research. DockOne organizes a technology share every week; interested readers can add WeChat: Liyingjiesz to join the group and participate, and can leave a message with topics they would like to hear or to share.