- Service
- Kube-proxy
- NodePort
K8s can run multiple replicas of a pod, but accessing those pods directly raises several issues:
- The client has to know the address of every pod
- If a pod on some node fails, the client has to detect the failure and switch to another replica
To solve these problems, k8s introduced the concept of a Service to steer client traffic.
Service
Take the following my-nginx as an example. The definition files for the Deployment (Pod) and the Service are as follows:
[root@localhost k8s]# cat run-my-nginx.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx
        ports:
        - containerPort: 80

[root@localhost k8s]# cat run-my-nginx-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-nginx
  labels:
    run: my-nginx
spec:
  ports:
  - port: 80
    protocol: TCP
  selector:
    run: my-nginx
The Deployment my-nginx specifies replicas: 2, i.e. 2 replicas, each exposing port 80. The Service my-nginx defines the selector run: my-nginx, that is, it selects all pods carrying the label run: my-nginx, and its port is also 80.
After creating both with kubectl create -f xx.yaml, you can see 2 pods in the cluster, with addresses 10.244.1.10 and 10.244.2.10 respectively, and 1 Service whose IP/port is 10.11.97.177/80. The Service's endpoints are 10.244.1.10:80 and 10.244.2.10:80, i.e. the service addresses of the 2 pods. All three of these addresses can be reached with curl from any node in the cluster.
[root@localhost k8s]# kubectl get pods -n default -o wide
NAME                       READY     STATUS    RESTARTS   IP            NODE
my-nginx-379829228-3n755   1/1       Running   0          10.244.1.10   node2
my-nginx-379829228-xh214   1/1       Running   0          10.244.2.10   node1
[root@localhost ~]# kubectl describe svc my-nginx
Name:              my-nginx
Namespace:         default
Labels:            run=my-nginx
Selector:          run=my-nginx
Type:              ClusterIP
IP:                10.11.97.177
Port:              <unset> 80/TCP
Endpoints:         10.244.1.10:80,10.244.2.10:80
Session Affinity:  None
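For instance, from any node either the Service's virtual IP or a pod address can be queried directly (a quick check using the addresses shown above):

# Service (cluster) IP -- a virtual address, answered by one of the 2 pods
curl http://10.11.97.177:80
# Pod addresses -- reach each replica directly
curl http://10.244.1.10:80
curl http://10.244.2.10:80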
However, if you inspect the IP addresses configured on each node in the cluster, you will not find 10.11.97.177 anywhere. How, then, does curl reach the backend endpoints through this virtual IP address?
Here is the answer.
Kube-proxy
K8s supports 2 proxy modes, userspace and iptables, selected with kube-proxy's --proxy-mode flag. Starting with v1.2, the iptables proxy is used by default. So what is the difference between the two modes?
1. userspace
As the name implies, userspace means the proxying is done in user space. Why is it called that?
For each Service, kube-proxy opens a random local port (the proxy port) and installs an iptables rule that redirects packets destined for clusterIP:port to that proxy port. When kube-proxy receives a packet on the proxy port it is listening on, it forwards it to one of the backend pods, using either round robin (the default) or session affinity (i.e. all requests from the same client IP take the same path to the same pod).
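Roughly, the redirect rule installed in userspace mode looks like the sketch below (illustrative only, not taken from a real cluster; the proxy port 36000 and the chain name are assumptions for the example):

# userspace mode (sketch): packets to the Service VIP are redirected to a
# local proxy port on which the kube-proxy process itself is listening
-A KUBE-PORTALS-CONTAINER -d 10.11.97.177/32 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 36000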
Obviously, userspace mode pushes every packet through a user-space process, so its performance is poor, and k8s no longer uses it by default.
2. iptables
Looking back at userspace mode: since passing through user space costs performance, is there a way around it? In fact, the user-space process is only doing packet load balancing, which iptables can do entirely by itself. The k8s documentation has a diagram that clearly illustrates the difference between iptables and userspace mode: in iptables mode, kube-proxy is only a controller, not a server; the real forwarding is done by netfilter in the kernel, and iptables is merely the user-space tool that configures it.
Kube-proxy's iptables mode also supports round robin (default) and session affinity.
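Session affinity is enabled on the Service itself; a minimal sketch based on the my-nginx Service above (only the sessionAffinity field is new, everything else mirrors the earlier definition):

# Sketch: the my-nginx Service with client-IP session affinity enabled
apiVersion: v1
kind: Service
metadata:
  name: my-nginx
spec:
  ports:
  - port: 80
    protocol: TCP
  selector:
    run: my-nginx
  sessionAffinity: ClientIP   # default is None (traffic spread across endpoints)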
So how does iptables do load balancing, round robin included? Let's look at the iptables rules for the my-nginx Service on a node, taken from iptables-save.
-A KUBE-SERVICES -d 10.11.97.177/32 -p tcp -m comment --comment "default/my-nginx: cluster IP" -m tcp --dport 80 -j KUBE-SVC-BEPXDJBUHFCSYIC3
-A KUBE-SVC-BEPXDJBUHFCSYIC3 -m comment --comment "default/my-nginx:" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-U4UWLP4OR3LOJBXU
-A KUBE-SVC-BEPXDJBUHFCSYIC3 -m comment --comment "default/my-nginx:" -j KUBE-SEP-QHRWSLKOO5YUPI7O
-A KUBE-SEP-U4UWLP4OR3LOJBXU -s 10.244.1.10/32 -m comment --comment "default/my-nginx:" -j KUBE-MARK-MASQ
-A KUBE-SEP-U4UWLP4OR3LOJBXU -p tcp -m comment --comment "default/my-nginx:" -m tcp -j DNAT --to-destination 10.244.1.10:80
-A KUBE-SEP-QHRWSLKOO5YUPI7O -s 10.244.2.10/32 -m comment --comment "default/my-nginx:" -j KUBE-MARK-MASQ
-A KUBE-SEP-QHRWSLKOO5YUPI7O -p tcp -m comment --comment "default/my-nginx:" -m tcp -j DNAT --to-destination 10.244.2.10:80
Rule 1: here, at last, is the virtual IP. No node needs to actually own this address; iptables matches TCP packets whose destination is the virtual IP and sends them down the KUBE-SVC-BEPXDJBUHFCSYIC3 chain.
Rules 2 and 3: the KUBE-SVC-BEPXDJBUHFCSYIC3 chain spreads packets randomly across the 2 endpoints: the first rule matches with a statistical probability of 50%, and whatever falls through is caught by the second rule, so each endpoint receives roughly half of the traffic (the iptables way of doing round robin).
Rules 4/5 and 6/7 are two pairs of rules, one pair per endpoint chain, that mark the packet for masquerading and then DNAT it to a real backend pod.
At this point we have traced a complete path: from a packet arriving at a physical node with destination address 10.11.97.177 and port 80, to a my-nginx pod receiving it and responding. Note that the entire path never passes through any user-space process, so both efficiency and stability are relatively high.
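To trace this on your own cluster, something like the following should list the relevant chains (a sketch; the KUBE-SVC/KUBE-SEP hashes will differ from cluster to cluster):

# Dump the NAT rules kube-proxy programs for the my-nginx Service
iptables -t nat -S KUBE-SERVICES | grep my-nginx
# Then follow the KUBE-SVC-... chain it jumps to, for example:
iptables -t nat -S KUBE-SVC-BEPXDJBUHFCSYIC3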
NodePort
In the example above, 10.11.97.177 is only a valid address inside the cluster; since it does not actually exist on any interface, access from outside the cluster fails, yet the service still needs to be exposed externally. One solution k8s offers is NodePort: a client can reach the k8s service through the IP of any physical node in the cluster plus the nodePort. How is this done?
The answer is still iptables. Let's look at the sock-shop example below; for how to create it, see k8s.io, which we won't repeat here.
[root@localhost ~]# kubectl describe svc front-end -n sock-shop
Name:              front-end
Namespace:         sock-shop
Labels:            name=front-end
Selector:          name=front-end
Type:              NodePort
IP:                10.15.9.0
Port:              <unset> 80/TCP
NodePort:          <unset> 30001/TCP
Endpoints:         10.244.2.5:8079
Session Affinity:  None
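For reference, a Service of this shape is declared with type: NodePort; a minimal sketch matching the output above (the ports and names are taken from the describe output, the rest is illustrative):

# Sketch of the front-end NodePort Service
apiVersion: v1
kind: Service
metadata:
  name: front-end
  namespace: sock-shop
spec:
  type: NodePort
  selector:
    name: front-end
  ports:
  - port: 80            # ClusterIP port
    targetPort: 8079    # pod port (matches the endpoint 10.244.2.5:8079)
    nodePort: 30001     # opened on every node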
Now look at iptables-save on any node:
-A KUBE-NODEPORTS -p tcp -m comment --comment "sock-shop/front-end:" -m tcp --dport 30001 -j KUBE-MARK-MASQ
-A KUBE-NODEPORTS -p tcp -m comment --comment "sock-shop/front-end:" -m tcp --dport 30001 -j KUBE-SVC-LFMD53S3EZEAOUSJ
-A KUBE-SERVICES -d 10.15.9.0/32 -p tcp -m comment --comment "sock-shop/front-end: cluster IP" -m tcp --dport 80 -j KUBE-SVC-LFMD53S3EZEAOUSJ
-A KUBE-SVC-LFMD53S3EZEAOUSJ -m comment --comment "sock-shop/front-end:" -j KUBE-SEP-SM6TGF2R62ADFGQA
-A KUBE-SEP-SM6TGF2R62ADFGQA -s 10.244.2.5/32 -m comment --comment "sock-shop/front-end:" -j KUBE-MARK-MASQ
-A KUBE-SEP-SM6TGF2R62ADFGQA -p tcp -m comment --comment "sock-shop/front-end:" -m tcp -j DNAT --to-destination 10.244.2.5:8079
You have probably spotted how it works already: the KUBE-NODEPORTS chain matches TCP packets arriving on port 30001 of any node, marks them for masquerading, and jumps to the same KUBE-SVC-LFMD53S3EZEAOUSJ chain used by the cluster-IP rule, so NodePort traffic ends up DNATed to the same backend endpoint, 10.244.2.5:8079.
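In other words, from outside the cluster the service can be reached through any node's address (a quick check; <node-ip> is a placeholder for whichever node you pick):

# NodePort: any node's IP plus port 30001 reaches the front-end pod
curl http://<node-ip>:30001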
However, kube-proxy's iptables mode has a flaw: when a backend pod fails, it cannot automatically retry another endpoint, so it has to rely on readiness probes. The basic idea is to probe the container, and when a backend pod is detected as unready (or ready again), the endpoints and the iptables rules are updated accordingly.
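A readiness probe is declared on the pod's container; a minimal sketch for the my-nginx container above (the probe path and timings are assumptions for illustration, not from the original manifests):

# Sketch: readiness probe added to the my-nginx container
containers:
- name: my-nginx
  image: nginx
  ports:
  - containerPort: 80
  readinessProbe:
    httpGet:
      path: /          # illustrative; probe whatever endpoint signals readiness
      port: 80
    periodSeconds: 5
    failureThreshold: 3

When the probe fails, the pod is removed from the Service's endpoints and kube-proxy drops the corresponding KUBE-SEP/DNAT rules, so traffic stops flowing to the failed pod.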