Summary
Project background (a bank customer, anonymized as XX Bank): the customer's private cloud runs stateful database services such as MySQL on Kubernetes. These workloads are sensitive to network performance and latency, unlike stateless web-style applications, where slightly higher latency is usually acceptable. Overlay networking gives poor performance and high latency, and its architecture is comparatively complex; in addition, the bank needs IP address management to be simple and controllable. SR-IOV implements virtual NICs in hardware, so the performance loss is small and close to host networking. Beyond that, the customer also requires QoS and VLAN support. The goal, therefore, is a customized CNI network plugin based on SR-IOV.
The Kubernetes Pause Pod
Problem: as of Kubernetes 1.8 (later versions may add support), the pod spec has no network-related configuration. When the kubelet calls a CNI plugin, by default it only passes basic information such as the pod name through CNI_ARGS. If you want to assign a specific IP address or configure network features such as QoS and VLAN, they cannot be passed through CNI_ARGS, and you cannot declare optional network parameters in the pod spec (the way volumes are declared) for delivery to the CNI plugin.

A workable solution: before declaring a pod, store its network configuration, keyed by pod name, in an external ConfigMap or another store. When Kubernetes sets up the pause pod, the custom CNI/IPAM plugin looks up the configuration externally using the pod name it receives.
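As a sketch of this workaround (all names here are hypothetical, not taken from an actual plugin), a custom IPAM plugin could resolve the pod's statically assigned IP and VLAN from configuration stored externally before the pod was declared, keyed by the pod name received via CNI_ARGS:

```go
package main

import (
	"errors"
	"fmt"
)

// PodNetConf holds the per-pod settings that cannot be passed
// through CNI_ARGS in Kubernetes 1.8 (hypothetical structure).
type PodNetConf struct {
	IP   string // static IP the operator assigns to this pod
	VLAN int    // VLAN tag for the SR-IOV virtual function
}

// lookupPodNetConf resolves the configuration that was stored
// externally (e.g. in a ConfigMap) before the pod was declared.
// The store is keyed by pod name, matching the K8S_POD_NAME
// value the kubelet passes in CNI_ARGS.
func lookupPodNetConf(podName string, store map[string]PodNetConf) (PodNetConf, error) {
	conf, ok := store[podName]
	if !ok {
		return PodNetConf{}, errors.New("no network config stored for pod " + podName)
	}
	return conf, nil
}

func main() {
	store := map[string]PodNetConf{
		"mysql-0": {IP: "10.0.20.11/24", VLAN: 100},
	}
	conf, err := lookupPodNetConf("mysql-0", store)
	if err != nil {
		panic(err)
	}
	fmt.Printf("assign %s on VLAN %d\n", conf.IP, conf.VLAN)
}
```

In a real plugin the `store` would be fetched from the apiserver or another external source rather than hard-coded.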
How CNI Works
(Reference: Kubernetes Guide, CNI chapter)
CNI (Container Network Interface): a network plugin is a standalone executable, invoked by the container management platform above it. The plugin has only two jobs: add a container to a network, and remove a container from a network. Data is passed to the plugin in two ways: environment variables and standard input. Kubernetes' workflow after enabling a CNI network plugin:
1. Kubernetes first creates the pause container, which generates the pod's network namespace.
2. It then calls the network driver (since CNI is configured, the CNI-related code is invoked).
3. The CNI driver calls the specific CNI plugin according to the configuration.
4. The CNI plugin configures the correct network for the pause container; all other containers in the pod reuse the pause container's network.
Reference: "The Kubernetes Network Interface CNI and the Practice of Lingque Cloud (Alauda)"
From an operations perspective: traditional operations teams (at banks and similar organizations) require strong control over IP addresses, so pods need fixed IPs. For operators, IPs are critical resources; if services keep moving and their IPs keep changing, confidence in the platform drops sharply. Many operational services are IP-based, for example traffic and burst monitoring: if a service's IP keeps changing, monitoring by IP becomes meaningless, because you cannot tell which service a given IP's traffic belongs to.

IP-based security policies also become impossible to apply.
Logical diagram of how the kubelet invokes a CNI plugin:
From "A Hacker's Guide to Kubernetes Networking": Kubernetes unfortunately still supports only one CNI interface per pod, with one cluster-wide configuration. This is very limiting, since we might want to configure multiple network interfaces per pod, potentially using different overlay solutions with different policies (subnet, security, QoS). The kubelet passes the pod name and namespace as part of the CNI_ARGS variable (for example, "K8S_POD_NAMESPACE=default;K8S_POD_NAME=mytests-1227152546-vq7kw"). We can use this to customize the network configuration per pod or namespace (e.g. put every namespace in a different subnet). Future Kubernetes versions will treat networks as first-class citizens and include network configuration as part of the pod or namespace spec, just like memory, CPU, and volumes. For the time being, we can use annotations to store configuration or to record pod networking data/state.
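The CNI_ARGS value quoted above is a semicolon-separated list of KEY=VALUE pairs. A small helper to parse it into a map (the function name is invented for illustration) might look like:

```go
package main

import (
	"fmt"
	"strings"
)

// parseCNIArgs splits the semicolon-separated CNI_ARGS string
// (e.g. "K8S_POD_NAMESPACE=default;K8S_POD_NAME=mypod") into a map,
// ignoring any malformed pairs without an "=" sign.
func parseCNIArgs(args string) map[string]string {
	out := map[string]string{}
	for _, pair := range strings.Split(args, ";") {
		if kv := strings.SplitN(pair, "=", 2); len(kv) == 2 {
			out[kv[0]] = kv[1]
		}
	}
	return out
}

func main() {
	args := parseCNIArgs("IgnoreUnknown=1;K8S_POD_NAMESPACE=default;K8S_POD_NAME=mytests-1227152546-vq7kw")
	fmt.Println(args["K8S_POD_NAMESPACE"], args["K8S_POD_NAME"])
}
```

A custom plugin can use the parsed K8S_POD_NAME to key the external configuration lookup described earlier.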
See also: Multus-CNI, which enables multiple network interfaces per pod.
The integration of Kubernetes and CNI plugins, from a source-code perspective
kubernetes/pkg/kubelet/network/cni/cni.go:

```go
func (plugin *cniNetworkPlugin) buildCNIRuntimeConf(podName string, podNs string, podSandboxID kubecontainer.ContainerID, podNetnsPath string) (*libcni.RuntimeConf, error) {
	rt := &libcni.RuntimeConf{
		ContainerID: podSandboxID.ID,
		NetNS:       podNetnsPath,
		IfName:      network.DefaultInterfaceName,
		Args: [][2]string{
			{"IgnoreUnknown", "1"},
			{"K8S_POD_NAMESPACE", podNs},
			{"K8S_POD_NAME", podName},
			{"K8S_POD_INFRA_CONTAINER_ID", podSandboxID.ID},
		},
	}
	return rt, nil
}
```

In libcni, AddNetwork executes the plugin, and RuntimeConf.Args are passed to the plugin as environment variables:

```go
func (c *CNIConfig) AddNetwork(net *NetworkConfig, rt *RuntimeConf) (types.Result, error) {
	// ...
	return invoke.ExecPluginWithResult(pluginPath, net.Bytes, c.args("ADD", rt))
}

// Inside invoke, the args end up as the plugin's environment:
stdoutBytes, err := e.RawExec.ExecPlugin(pluginPath, netconf, args.AsEnv())
```

The libcni interface and runtime configuration:

```go
type CNI interface {
	AddNetworkList(net *NetworkConfigList, rt *RuntimeConf) (types.Result, error)
	DelNetworkList(net *NetworkConfigList, rt *RuntimeConf) error
	AddNetwork(net *NetworkConfig, rt *RuntimeConf) (types.Result, error)
	DelNetwork(net *NetworkConfig, rt *RuntimeConf) error
}

type RuntimeConf struct {
	ContainerID string
	NetNS       string
	IfName      string
	Args        [][2]string
	// A dictionary of capability-specific data passed by the runtime
	// to plugins as top-level keys in the "runtimeConfig" dictionary
	// of the plugin's stdin data. libcni will ensure that only keys
	// in this map which match the capabilities of the plugin are
	// passed to the plugin.
	CapabilityArgs map[string]interface{}
}
```
CNI development reference resources:
- CNI source analysis (fairly systematic)
- In-depth understanding of CNI
- The CNI spec document
- The officially maintained plugins repository
- Debugging your own plugin with cnitool
- Running a container to test your own plugin with the docker-run.sh script
- Official plugin sample
Kubelet Principle
References: Kubernetes Guide: Kubelet; Kubernetes Introduction: Kubelet and Pod
The kubelet's core job is to watch the apiserver, the configuration directory (default /etc/kubernetes/manifests/), and other pod sources. Whenever the pod configuration for the current node changes, it performs the corresponding actions according to the latest configuration, ensuring that the running state of the pods matches the desired state.
If a local pod is found to have been modified, the kubelet makes the corresponding change: for example, if a container is removed from a pod, it deletes that container through the Docker client. If a pod is deleted from this node, it removes the pod and deletes its containers through the Docker client. The kubelet regularly reports the current node's status to the apiserver for scheduling use, and monitors node and container resources through cAdvisor. For each pod, it creates a container from the "kubernetes/pause" image; the pause container takes over the network for all other containers in the pod. Each time a new pod is created, the kubelet first creates the pause container and then creates the other containers.
Kubelet Source Analysis
Kubelet source analysis: startup process (v1.5.0). After parsing arguments and configuration and completing other initialization, it creates the kubeDeps object:
kubeDeps holds the objects of the various important components the kubelet depends on. Passing it in as a parameter achieves dependency injection: the component objects the kubelet depends on are supplied from outside to control the kubelet's behavior. For example, when testing, you only need to build fake component implementations to test easily. kubeDeps contains many components; some are listed below:
- CAdvisorInterface: provides cAdvisor functionality, responsible for collecting container monitoring data
- DockerClient: Docker client, used to interact with Docker
- KubeClient: apiserver client, used to communicate with the API server
- Mounter: performs mount-related operations
- NetworkPlugins: network plugins, perform network setup
- VolumePlugins: volume plugins, perform volume setup
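To illustrate the dependency-injection idea in miniature (a toy sketch, not the real kubelet types): the kubelet sees only narrow interfaces, so a test can swap in a fake implementation without a running Docker daemon.

```go
package main

import "fmt"

// DockerClient is the narrow interface our simplified kubelet
// needs; the real kubeDeps bundle carries many such components.
type DockerClient interface {
	StartContainer(name string) error
}

// Deps mirrors the idea of kubeDeps: everything the kubelet
// depends on is injected through this struct.
type Deps struct {
	Docker DockerClient
}

type Kubelet struct {
	deps Deps
}

func NewKubelet(deps Deps) *Kubelet { return &Kubelet{deps: deps} }

// RunPod delegates to whatever DockerClient was injected.
func (k *Kubelet) RunPod(name string) error {
	return k.deps.Docker.StartContainer(name)
}

// fakeDocker records calls instead of talking to a daemon,
// which is exactly why dependency injection helps testing.
type fakeDocker struct{ started []string }

func (f *fakeDocker) StartContainer(name string) error {
	f.started = append(f.started, name)
	return nil
}

func main() {
	fake := &fakeDocker{}
	k := NewKubelet(Deps{Docker: fake})
	_ = k.RunPod("pause")
	fmt.Println(fake.started)
}
```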
The RunKubelet function creates the kubelet object via a builder (pkg/kubelet/kubelet.go#NewMainKubelet) and runs it according to the run mode, starting the various components as goroutines.
Asynchronous and event-driven: syncLoop is the kubelet's main loop. It listens for changes from different channels (files, URLs, and the apiserver) and merges them; when a new change arrives, it invokes the corresponding handler function to ensure the pods reach the desired state.
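A highly simplified sketch of this event loop (the real syncLoop selects over several distinct channels plus timers; the names here are invented, and the sources are merged into one channel for brevity):

```go
package main

import "fmt"

// PodUpdate is a simplified version of the update events the
// kubelet's syncLoop receives from its config sources.
type PodUpdate struct {
	Source string // "file", "http", or "api"
	Pod    string
}

// syncLoop drains updates from the merged sources and dispatches
// each one to a handler, stopping when the channel is closed.
func syncLoop(updates <-chan PodUpdate, handle func(PodUpdate)) {
	for u := range updates {
		handle(u)
	}
}

func main() {
	updates := make(chan PodUpdate, 2)
	updates <- PodUpdate{Source: "api", Pod: "mysql-0"}
	updates <- PodUpdate{Source: "file", Pod: "static-web"}
	close(updates)
	syncLoop(updates, func(u PodUpdate) {
		fmt.Printf("sync %s from %s\n", u.Pod, u.Source)
	})
}
```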
Important objects contained in the Kubelet object:
- PodConfig: merges the pod information this node should run from three sources (file, HTTP, and apiserver) and sends it through a channel; reading the channel yields the latest pod configuration in real time
- Runtime: the container runtime, a layer of encapsulation over the container engine (Docker or rkt), responsible for invoking the engine's interface to manage container state, e.g. starting, pausing, and killing containers
- ProbeManager: if a pod is configured with health probes, ProbeManager periodically checks whether the pod is running properly and updates the pod's status to the apiserver via StatusManager
- VolumeManager: responsible for the volume management the containers need, e.g. detecting whether a volume is already mounted and obtaining the volumes a pod uses
- PodWorkers: the concrete executor; every time a pod needs to be updated, the update is sent to it
- PodManager: caches pod information; it is the place all pod information is accessed from
- NodeLister: can read node information from the apiserver
Reference: Kubelet Source Analysis: Startup and Information Processing
Kubelet source analysis: pod creation flow (v1.5.0). Pod creation order:
- Create the pod's data directories and store volume and plugin information
- If PVs are defined, wait for all volumes to be mounted (VolumeManager does this in the background)
- If there are image secrets, fetch the corresponding secrets data from the apiserver
- Call the container runtime's method to implement the actual container-creation logic
Create and run the pause container through Docker (the pause container takes over the network for all other containers in the pod). Network configuration: if the pod uses host mode, so does the container; in other cases the container uses the "none" network mode, leaving the kubelet's network plugin to do the configuration itself.
Article Source: https://yucs.github.io/2017/12/06/%202017-12-6-CNI/
The markdown files are kept up to date at github.com/yucs/yucs-awesome-resource; stars and watches are welcome.
If you find discrepancies, please get in touch; the article is continuously updated.