Practical Experience with a Neutron-based Kubernetes SDN


First, a brief primer on CNI, the network plugin interface that Kubernetes chose, and some background on that choice.

The Container Network Interface (CNI) is a specification for container networking that defines the methods, parameters, and response formats a plugin must implement. CNI only requires that network resources be allocated when a container is created and released when the container is deleted. The overall interaction between the caller and the plugin is as follows:

A CNI plugin interacts with the outside world only through process arguments, environment variables, and standard input/output; as long as the output conforms to the CNI specification, there is no requirement on the implementation language. For example, earlier versions of Calico implemented the CNI specification in Python to provide a network implementation for Kubernetes. The common environment variables are as follows (a minimal plugin sketch follows the list):

    • CNI_COMMAND: the operation to perform; ADD attaches a network interface to the container, DEL releases it

    • CNI_CONTAINERID: the container ID

    • CNI_NETNS: the path of the container's network namespace file

    • CNI_ARGS: additional arguments to pass through

    • CNI_IFNAME: the name to give the container's network interface, such as eth0
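
To make the calling convention concrete, the following is a minimal sketch of a CNI plugin in Python (not from the original article). It only shows the plumbing: environment variables in, JSON network configuration on stdin, JSON result on stdout. The returned address is a hard-coded placeholder where a real plugin would call its IPAM backend, so treat it as an illustration of the protocol rather than a working network implementation.

import json
import os
import sys

def main():
    # CNI passes the operation and container context via environment variables.
    command = os.environ.get("CNI_COMMAND")          # ADD or DEL
    container_id = os.environ.get("CNI_CONTAINERID")
    netns_path = os.environ.get("CNI_NETNS")
    ifname = os.environ.get("CNI_IFNAME", "eth0")

    # The network configuration (name, type, plugin-specific keys) arrives on stdin as JSON.
    net_conf = json.load(sys.stdin)

    if command == "ADD":
        # A real plugin would create the interface inside netns_path here
        # and allocate an address from its IPAM backend.
        result = {
            "cniVersion": net_conf.get("cniVersion", "0.3.1"),
            "interfaces": [{"name": ifname, "sandbox": netns_path}],
            "ips": [{"version": "4", "address": "10.0.0.5/24"}],  # placeholder address
        }
        json.dump(result, sys.stdout)
    elif command == "DEL":
        # A real plugin would release the address and tear down the interface here.
        pass
    else:
        json.dump({"code": 4, "msg": "unsupported CNI_COMMAND %s" % command}, sys.stdout)
        sys.exit(1)

if __name__ == "__main__":
    main()

Kubelet would invoke such a binary once per pod network operation, with CNI_COMMAND set to ADD or DEL.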

Because of this, CNI implementations are easy to extend: in addition to the built-in bridge, macvlan, and other basic plugins, there are many third-party implementations to choose from, including Calico, Romana, flannel, and other common ones. At the same time, CNI supports a variety of container runtimes, including Docker, rkt, Mesos, and Hyper. This is also a major reason why Kubernetes chose CNI.

In contrast, the CNM (Container Network Model) proposed by Docker is more complex to implement, but also more complete and closer to traditional networking concepts, as shown below:

A sandbox is the container's network namespace, an endpoint is a network interface attached to the container, and a network is a set of endpoints that can communicate with each other, which is quite close to the network definition in Neutron.

In CNM, the Docker engine invokes the network implementation through an HTTP REST API to configure the container's network. This API covers more than ten endpoints for network management, container management, endpoint creation, and so on. The CNM model also carries implicit dependencies on mechanisms bundled with Docker itself, such as its service and DNS mechanisms, so to some extent CNM is tailored specifically to Docker containers and is not friendly to other container runtimes.

For these technical reasons, as well as some commercial ones, Kubernetes ultimately chose CNI as its network plugin interface.

Of course, Kubernetes also provides some workarounds that translate CNI invocations into CNM operations, so that the two models can be used together. For example, the to_docker script transforms Kubernetes's CNI calls into the corresponding Docker CNM network operations, converting CNI to CNM. A rough sketch of such an adapter is shown below.
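
Purely for illustration, here is a hedged sketch of what such a CNI-to-CNM adapter could look like; it is not the actual to_docker script. It maps the CNI ADD and DEL operations onto the docker network connect and docker network disconnect commands, and assumes the target Docker network name arrives in the CNI configuration under a hypothetical docker_network key.

import json
import os
import subprocess
import sys

def main():
    command = os.environ.get("CNI_COMMAND")
    container_id = os.environ.get("CNI_CONTAINERID")
    net_conf = json.load(sys.stdin)
    # Hypothetical config key naming the Docker (CNM) network to attach to.
    docker_network = net_conf.get("docker_network", "bridge")

    if command == "ADD":
        # Delegate the attach operation to Docker's CNM implementation.
        subprocess.check_call(["docker", "network", "connect", docker_network, container_id])
    elif command == "DEL":
        subprocess.check_call(["docker", "network", "disconnect", docker_network, container_id])

if __name__ == "__main__":
    main()

A real adapter would also have to inspect the container afterwards and report the assigned IP back in CNI's result format; that part is omitted here.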

Next, let's introduce the networking concepts and communication principles in Kubernetes.

Kubernetes's network model imposes three basic constraints:

    1. All containers can communicate with each other directly by IP, without SNAT.

    2. All hosts and containers can communicate with each other directly by IP, without SNAT.

    3. The IP a container sees for itself is the same IP that other containers use to reach it.

As long as these constraints are satisfied, Kubernetes does not care how the underlying network actually works; it simply takes the three constraints as given and builds its own logic on top of them, which keeps Kubernetes itself from getting entangled in complex network implementations.

In its network model, Kubernetes has two core kinds of IP:

    • Pod IP: the pod's IP, provided by the network implementation. Kubernetes does not care whether this IP is actually reachable; it only uses it to configure iptables, run health checks, and so on. By default, this IP is reachable within the Kubernetes cluster and responds to ping.

    • Cluster IP: the service IP. It is used only for service communication inside Kubernetes and is essentially just a set of DNAT rules in iptables. By default, it only serves the service ports and does not respond to ping.

Taking the cluster DNS service as an example, the core iptables rules take roughly the following shape:
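
The exact listing is environment specific, so the following is an illustrative sketch of the nat-table rules kube-proxy installs for a DNS service on 10.254.0.3; the chain-name suffixes and the backend pod IPs are placeholders.

-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A KUBE-SERVICES -d 10.254.0.3/32 -p udp -m udp --dport 53 -j KUBE-SVC-XXXXXXXXXXXXXXXX
-A KUBE-SVC-XXXXXXXXXXXXXXXX -m statistic --mode random --probability 0.5 -j KUBE-SEP-AAAAAAAAAAAAAAAA
-A KUBE-SVC-XXXXXXXXXXXXXXXX -j KUBE-SEP-BBBBBBBBBBBBBBBB
-A KUBE-SEP-AAAAAAAAAAAAAAAA -p udp -m udp -j DNAT --to-destination 10.1.0.10:53
-A KUBE-SEP-BBBBBBBBBBBBBBBB -p udp -m udp -j DNAT --to-destination 10.1.0.11:53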

These iptables rules are generated by kube-proxy, but kube-proxy itself is not on the forwarding path, so even if the kube-proxy service fails, the rules it has already installed still let traffic flow correctly between the service IP and the pod IPs. The network traffic path is as follows:

When the DNS service's port on 10.254.0.3 is accessed, the DNAT rules generated by kube-proxy forward the traffic to a backend pod IP and the corresponding port, distributing it randomly and evenly across the backend pod IPs.

kube-proxy watches kube-apiserver for service and pod status updates and adjusts the iptables rules accordingly at any time, which gives services high availability and dynamic scaling.

On top of this IP communication mechanism, Kubernetes further improves network security and traffic handling through Network Policy and Ingress.

Network Policy provides network isolation. It is evolved under the sig-network group; Kubernetes itself only provides the built-in label and label-selector mechanism and the Network Policy API definition, and is not responsible for how the isolation is actually implemented. Among the network implementations used with Kubernetes, only a few, such as Calico, Romana, and Contiv, have integrated Network Policy. A typical Network Policy definition is as follows:

apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: test-network-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      role: db
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend
    ports:
    - protocol: TCP
      port: 6379

It specifies that pods labeled role: db may only be accessed on TCP port 6379 by pods labeled role: frontend; all other traffic is denied. Functionally, a Network Policy is roughly equivalent to a Neutron security group, as the sketch below illustrates.
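
As a hedged illustration of that equivalence (not part of the original article), the same constraint could be expressed as a Neutron security group rule that allows TCP 6379 into a "db" group only from a "frontend" group. The group IDs below are placeholders and the client construction is simplified.

from neutronclient.v2_0 import client as neutron_client

# Simplified client construction; real deployments would typically use a Keystone session.
neutron = neutron_client.Client(username="admin", password="secret",
                                tenant_name="admin",
                                auth_url="http://controller:5000/v2.0")

DB_GROUP_ID = "PLACEHOLDER-db-security-group-id"
FRONTEND_GROUP_ID = "PLACEHOLDER-frontend-security-group-id"

# Allow TCP 6379 into members of the db group, but only from members of the
# frontend group, mirroring the role: db / role: frontend NetworkPolicy above.
neutron.create_security_group_rule({
    "security_group_rule": {
        "security_group_id": DB_GROUP_ID,
        "direction": "ingress",
        "protocol": "tcp",
        "port_range_min": 6379,
        "port_range_max": 6379,
        "remote_group_id": FRONTEND_GROUP_ID,
    }
})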

Ingress is responsible for exposing services externally: through Nginx it provides a single entry point for all services in the cluster, replacing the practice of exposing each service with a NodePort. At present Kubernetes's Ingress has two implementations, nginx and GCE; interested readers can refer directly to the official documentation at https://github.com/kubernetes/ingress/tree/master/controllers.

In the Kubernetes community, the more common network implementations fall mainly into the following two categories:

    1. Overlay-network based, represented by flannel and Weave. Flannel is CoreOS's overlay network solution for Kubernetes and also the default Kubernetes network implementation. It builds a cluster-wide overlay network based on VXLAN or UDP, which enables container-to-container communication across the cluster and satisfies the three basic constraints of the Kubernetes network model. Because packets must be encapsulated and decapsulated along the way, performance is relatively poor, but it is basically adequate.

    2. L3-routing based, represented by Calico and Romana. Calico is widely regarded as the best-performing Kubernetes network implementation: it achieves communication with pure layer-3 routing and security control with iptables, which meets the performance requirements of most cloud deployments. However, because it requires BGP to be enabled on the hosts to form the routing topology, it may not be allowed in some data centers. Calico also supported Network Policy early on and can store its own data directly in Kubernetes, enabling deep integration with Kubernetes.

Judging from the implementations above, Kubernetes networking is still short of a mature SDN, so after evaluating Kubernetes our company decided to build a usable SDN implementation for Kubernetes on top of Neutron. This is the origin of the Skynet project.

Let me share some of the experience gained while putting Skynet into practice.

In practice, the first problem to solve is how to translate the networking concepts in Kubernetes into Neutron concepts so that the functionality can be implemented appropriately.

In the first version, the Kubernetes network concepts were translated as follows:

    • Pod ----> virtual machine

    • Service ----> load balancer

    • Endpoints ----> pool

    • Service backend pod ----> member

However, Kubernetes allows multiple ports on the same service, while each Neutron load balancer supports only one external port. Fortunately, with the OpenStack Mitaka release last year, Neutron LBaaS v2 became officially available, which led to a second version of the concept translation:

    • Pod ----> virtual machine

    • Service ----> LBaaS v2 load balancer

    • Service port ----> LBaaS v2 listener

    • Endpoints ----> LBaaS v2 pool

    • Service backend pod ----> LBaaS v2 member

    • Pod livenessProbe ----> health monitor

The basic LBaaS v2 terms are explained below (an example of creating them with the neutron CLI follows the list):

    • Load balancer: the load balancer itself, corresponding to one HAProxy process occupying one subnet IP; logically it maps to a Kubernetes service.

    • Listener: a front-end listening port provided by the load balancer; it corresponds to an entry in the ports list of the service definition.

    • Pool: the collection of members behind a listener.

    • Member: a backend member of a listener. Members come from the addresses list of the endpoints used by the service, each corresponding to a targetPort mapping in the service declaration.

    • Health monitor: the health checker for the members of a pool, similar to a livenessProbe in Kubernetes; it is not mapped for now.
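
To make these terms concrete, here is a hedged sketch of how the same objects could be created by hand with the Mitaka-era neutron LBaaS v2 CLI; the subnet name and addresses are placeholders, and the exact flags may differ between releases.

neutron lbaas-loadbalancer-create --name neutron-service private-subnet
neutron lbaas-listener-create --name neutron-service-8888 --loadbalancer neutron-service --protocol TCP --protocol-port 8888
neutron lbaas-pool-create --name neutron-service-8888 --listener neutron-service-8888 --protocol TCP --lb-algorithm ROUND_ROBIN
neutron lbaas-member-create --subnet private-subnet --address 192.168.119.187 --protocol-port 8000 neutron-service-8888
neutron lbaas-member-create --subnet private-subnet --address 192.168.119.188 --protocol-port 8000 neutron-service-8888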

In terms of resource counts: one Kubernetes service corresponds to one load balancer. Each port of the service corresponds to one listener on that load balancer, and each listener is backed by one pool holding its backend resources. Each Kubernetes service has a corresponding endpoints object containing the backend pods, and each combination of an endpoints address and a service port's targetPort corresponds to one member in the pool. For example, the neutron-service shown later, with two ports and two endpoint addresses, maps to one load balancer, two listeners, two pools, and four members.

With the concept mapping settled, let me briefly introduce the ideas behind the development.

In the overall architecture, Skynet sits between Kubernetes and Neutron: it implements the CNI specification and configures the container network through Neutron. A separate component, service-watcher, watches Kubernetes resources and translates the service concepts into Neutron, completing the network functionality. As shown below:

Kubelet is the component that actually creates pods; when setting up a pod's network it calls Skynet through the CNI interface. Skynet allocates an IP for the container by calling Neutron, and configures the IP, routes, and other communication settings by operating inside the pod's network namespace.

Neutron's native DHCP, LBaaS v2, and other mechanisms can remain basically unchanged, which gives complete integration and lets the Kubernetes cluster use the full Neutron SDN feature set. When DNS is needed in a container, name resolution can be handled by Neutron's own DHCP agent and works normally within the cluster network.

As mentioned earlier, Skynet implements the CNI specification; the interaction between kubelet and Skynet is as follows:

A brief walkthrough of each step:

Kubelet invokes Skynet through the CNI mechanism; the main parameters are:

    • CNI_COMMAND: the operation to perform; ADD attaches a network interface to the container, DEL releases it

    • CNI_CONTAINERID: the container ID

    • CNI_NETNS: the path of the container's network namespace file

    • CNI_ARGS: additional arguments to pass through

    • CNI_IFNAME: the name to give the container's network interface, such as eth0

For the ADD operation, Skynet first asks neutron-server to create a port for the pod, based on the parameters passed in and the pod's configuration.

Skynet then creates a network device for the container and moves it into the container's network namespace, based on the port and network configuration.

Finally, neutron-linuxbridge-agent generates iptables rules from the container's network and security group configuration. This reuses Neutron's native security group function and also lets us take direct advantage of Neutron's full SDN stack, including vRouter, FWaaS, VPNaaS, and other services. A hedged sketch of the ADD flow follows.
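
For illustration only, here is a hedged Python sketch of what such an ADD flow could look like; it is not Skynet's actual code. It creates a Neutron port with python-neutronclient and wires a veth pair into the pod's namespace with the ip command. The network_id key in the CNI configuration, the /24 prefix, the credentials, and the tap-style host interface naming are all assumptions.

from neutronclient.v2_0 import client as neutron_client
import json
import os
import subprocess
import sys

def cni_add():
    # Parameters delivered by kubelet through the CNI environment variables.
    container_id = os.environ["CNI_CONTAINERID"]
    netns_path = os.environ["CNI_NETNS"]            # e.g. /proc/<pid>/ns/net
    ifname = os.environ.get("CNI_IFNAME", "eth0")
    conf = json.load(sys.stdin)                     # CNI network configuration from stdin

    neutron = neutron_client.Client(username="admin", password="secret",
                                    tenant_name="admin",
                                    auth_url="http://controller:5000/v2.0")

    # 1. Ask neutron-server to create a port on the pod's network
    #    (network_id assumed to be carried in the CNI configuration).
    port = neutron.create_port({"port": {
        "network_id": conf["network_id"],
        "name": "skynet-" + container_id[:12],
    }})["port"]
    ip_address = port["fixed_ips"][0]["ip_address"]

    # 2. Expose the namespace file under /var/run/netns so "ip netns" can address it.
    netns_name = container_id[:12]
    if not os.path.isdir("/var/run/netns"):
        os.makedirs("/var/run/netns")
    os.symlink(netns_path, "/var/run/netns/" + netns_name)

    # 3. Create a veth pair, keep one end on the host (named so the linuxbridge
    #    agent can pick it up, an assumption), move the other into the namespace.
    host_if = "tap" + port["id"][:11]
    tmp_if = "veth" + container_id[:8]
    subprocess.check_call(["ip", "link", "add", host_if, "type", "veth", "peer", "name", tmp_if])
    subprocess.check_call(["ip", "link", "set", host_if, "up"])
    subprocess.check_call(["ip", "link", "set", tmp_if, "netns", netns_name])
    subprocess.check_call(["ip", "netns", "exec", netns_name,
                           "ip", "link", "set", tmp_if, "name", ifname])
    subprocess.check_call(["ip", "netns", "exec", netns_name,
                           "ip", "addr", "add", ip_address + "/24", "dev", ifname])  # /24 assumed
    subprocess.check_call(["ip", "netns", "exec", netns_name,
                           "ip", "link", "set", ifname, "up"])

    # 4. neutron-linuxbridge-agent notices the new port and installs the
    #    security-group iptables rules; nothing more to do here.
    json.dump({"ips": [{"version": "4", "address": ip_address + "/24"}]}, sys.stdout)

if __name__ == "__main__":
    cni_add()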

service-watcher maps Kubernetes services onto Neutron LBaaS v2. In the case of VLAN networks, the traffic between pods and services flows as follows:

When a container in the cluster accesses a service, it does so by service name by default. Through the Neutron DHCP mechanism, the dnsmasq process of each network resolves the service name to the IP address of the corresponding load balancer, after which normal network communication can proceed, with the physical switch responsible for relaying the traffic.

As a concrete example of how a Kubernetes service is mapped to a Neutron load balancer in the actual implementation, consider the following service and its endpoints:

kind: Service
apiVersion: v1
metadata:
  name: neutron-service
  namespace: default
  labels:
    app: neutron-service
  annotations:
    skynet/subnet_id: a980172e-638d-474a-89a2-52b967803d6c
spec:
  ports:
  - name: port1
    protocol: TCP
    port: 8888
    targetPort: 8000
  - name: port2
    protocol: TCP
    port: 9999
    targetPort: 9000
  selector:
    app: neutron-service
  type: NodePort

kind: Endpoints
apiVersion: v1
metadata:
  name: neutron-service
  namespace: default
  labels:
    app: neutron-service
subsets:
- addresses:
  - ip: 192.168.119.187
    targetRef:
      kind: Pod
      namespace: default
      name: neutron-service-puds0
      uid: eede8e24-85f5-11e6-ab34-000c29fad731
      resourceVersion: '2381789'
  - ip: 192.168.119.188
    targetRef:
      kind: Pod
      namespace: default
      name: neutron-service-u9nnw
      uid: eede9b70-85f5-11e6-ab34-000c29fad731
      resourceVersion: '2381787'
  ports:
  - name: port1
    port: 8000
    protocol: TCP
  - name: port2
    port: 9000
    protocol: TCP

Pods and services use specific annotations to specify the Neutron network, IP, and other configuration, keeping the coupling with Kubernetes as loose as possible. A sketch of such pod annotations is shown below.
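
For illustration, a pod could carry annotations along these lines; skynet/subnet_id matches the key used on the service above, while skynet/ip is a hypothetical key shown only to indicate how a fixed IP might be requested.

apiVersion: v1
kind: Pod
metadata:
  name: neutron-service-puds0
  annotations:
    skynet/subnet_id: a980172e-638d-474a-89a2-52b967803d6c   # same annotation key as on the service
    skynet/ip: 192.168.119.187                                # hypothetical key for requesting a fixed IP
spec:
  containers:
  - name: app
    image: nginx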

When the service above is mapped to a load balancer, the resulting LBaaS v2 status tree looks like this:

{
  "statuses": {
    "loadbalancer": {
      "name": "neutron-service",
      "provisioning_status": "ACTIVE",
      "listeners": [
        {
          "name": "neutron-service-8888",
          "provisioning_status": "ACTIVE",
          "pools": [
            {
              "name": "neutron-service-8888",
              "provisioning_status": "ACTIVE",
              "healthmonitor": {},
              "members": [
                {
                  "name": "",
                  "provisioning_status": "ACTIVE",
                  "address": "192.168.119.188",
                  "protocol_port": 8000,
                  "id": "461a0856-5c97-417e-94b4-c3486d8e2160",
                  "operating_status": "ONLINE"
                },
                {
                  "name": "",
                  "provisioning_status": "ACTIVE",
                  "address": "192.168.119.187",
                  "protocol_port": 8000,
                  "id": "1d1b3da6-b1a1-485b-a25a-243e904fcedb",
                  "operating_status": "ONLINE"
                }
              ],
              "id": "95f42465-0cab-477e-a7de-008621235d52",
              "operating_status": "ONLINE"
            }
          ],
          "l7policies": [],
          "id": "6cf0c3dd-3aec-4b35-b2a5-3c0a314834e8",
          "operating_status": "ONLINE"
        },
        {
          "name": "neutron-service-9999",
          "provisioning_status": "ACTIVE",
          "pools": [
            {
              "name": "neutron-service-9999",
              "provisioning_status": "ACTIVE",
              "healthmonitor": {},
              "members": [
                {
                  "name": "",
                  "provisioning_status": "ACTIVE",
                  "address": "192.168.119.188",
                  "protocol_port": 9000,
                  "id": "2faa9f42-2734-416a-a6b2-ed922d01ca50",
                  "operating_status": "ONLINE"
                },
                {
                  "name": "",
                  "provisioning_status": "ACTIVE",
                  "address": "192.168.119.187",
                  "protocol_port": 9000,
                  "id": "81f777b1-d999-48b0-be79-6dbdedca5e97",
                  "operating_status": "ONLINE"
                }
              ],
              "id": "476952ac-64a8-4594-8972-699e87ae5b9b",
              "operating_status": "ONLINE"
            }
          ],
          "l7policies": [],
          "id": "c6506b43-2453-4f04-ba87-f5ba4ee19b17",
          "operating_status": "ONLINE"
        }
      ],
      "pools": [
        {
          "name": "neutron-service-8888",
          "provisioning_status": "ACTIVE",
          "healthmonitor": {},
          "members": [
            {
              "name": "",
              "provisioning_status": "ACTIVE",
              "address": "192.168.119.188",
              "protocol_port": 8000,
              "id": "461a0856-5c97-417e-94b4-c3486d8e2160",
              "operating_status": "ONLINE"
            },
            {
              "name": "",
              "provisioning_status": "ACTIVE",
              "address": "192.168.119.187",
              "protocol_port": 8000,
              "id": "1d1b3da6-b1a1-485b-a25a-243e904fcedb",
              "operating_status": "ONLINE"
            }
          ],
          "id": "95f42465-0cab-477e-a7de-008621235d52",
          "operating_status": "ONLINE"
        },
        {
          "name": "neutron-service-9999",
          "provisioning_status": "ACTIVE",
          "healthmonitor": {},
          "members": [
            {
              "name": "",
              "provisioning_status": "ACTIVE",
              "address": "192.168.119.188",
              "protocol_port": 9000,
              "id": "2faa9f42-2734-416a-a6b2-ed922d01ca50",
              "operating_status": "ONLINE"
            },
            {
              "name": "",
              "provisioning_status": "ACTIVE",
              "address": "192.168.119.187",
              "protocol_port": 9000,
              "id": "81f777b1-d999-48b0-be79-6dbdedca5e97",
              "operating_status": "ONLINE"
            }
          ],
          "id": "476952ac-64a8-4594-8972-699e87ae5b9b",
          "operating_status": "ONLINE"
        }
      ],
      "id": "31b61658-4708-4a48-a3c4-0d61a127cd09",
      "operating_status": "ONLINE"
    }
  }
}

The corresponding HAProxy process configuration is as follows:

# Configuration for neutron-service
global
    daemon
    user nobody
    group nogroup
    log /dev/log local0
    log /dev/log local1 notice
    stats socket /var/lib/neutron/lbaas/v2/31b61658-4708-4a48-a3c4-0d61a127cd09/haproxy_stats.sock mode 0666 level user

defaults
    log global
    retries 3
    option redispatch
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend 6cf0c3dd-3aec-4b35-b2a5-3c0a314834e8
    option tcplog
    bind 192.168.119.178:8888
    mode tcp
    default_backend 95f42465-0cab-477e-a7de-008621235d52

frontend c6506b43-2453-4f04-ba87-f5ba4ee19b17
    option tcplog
    bind 192.168.119.178:9999
    mode tcp
    default_backend 476952ac-64a8-4594-8972-699e87ae5b9b

backend 476952ac-64a8-4594-8972-699e87ae5b9b
    mode tcp
    balance roundrobin
    server 81f777b1-d999-48b0-be79-6dbdedca5e97 192.168.119.187:9000 weight 1
    server 2faa9f42-2734-416a-a6b2-ed922d01ca50 192.168.119.188:9000 weight 1

backend 95f42465-0cab-477e-a7de-008621235d52
    mode tcp
    balance roundrobin
    server 1d1b3da6-b1a1-485b-a25a-243e904fcedb 192.168.119.187:8000 weight 1
    server 461a0856-5c97-417e-94b4-c3486d8e2160 192.168.119.188:8000 weight 1

In summary, with the Neutron-based Skynet we have an initial SDN implementation for Kubernetes, along with the following network enhancements:

    1. The pod's IP, MAC, hostname, and other network configuration can be preserved;

    2. Network isolation between pods is implemented with Neutron security groups, which is more general;

    3. Services can be exposed externally directly through HAProxy, with much better performance than the native iptables approach.

Of course, some Kubernetes features are not yet supported in the Skynet network scheme and will need to be enhanced or implemented later:

    1. Headless services, which have no cluster IP, cannot be handled yet.

    2. Because messages between neutron-server and the Neutron plugins and agents travel through RabbitMQ, the scheme is not particularly well suited to the rapid network changes of a container environment; this is a bottleneck of the whole approach.

