IPVS Study Notes (1)

Source: Internet
Author: User

Many thanks to yfydz for publishing the ip_vs implementation analysis series of articles, which let me understand the working principles and source code layout of IPVS quickly.

However, yfydz's articles are too long for easy lookup later, so I plan to condense them into notes and post them on my blog.

 

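1. IPVS has three load-balancing modes

NAT, Tunnel (TUN), Direct Routing (DR)

NAT: all traffic, in both directions, must pass through the balancer.

Tunnel: half-connection handling. The balancer handles only the request direction and forwards packets to the real server through IP encapsulation.

DR: rewrites the destination MAC address. The balancer and the real servers must be on the same network segment.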

 

2. Scheduling algorithms supported by IPVS

Round-robin scheduling (rr)

Weighted round-robin scheduling (wrr)

Least-connection scheduling (lc)

Weighted least-connection scheduling (wlc)

Locality-based least-connection scheduling (lblc)

Locality-based least-connection with replication scheduling (lblcr)

Destination hashing scheduling (dh)

Source hashing scheduling (sh)
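To make one of these policies concrete, here is a minimal user-space sketch of weighted least-connection selection. The `struct server` type and the `wlc_pick` function are hypothetical names invented for this example and are not kernel code; the in-kernel scheduler works on kernel structures and counts connections in more detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of a real server. */
struct server {
	const char *name;
	unsigned int weight;	/* 0 means "do not schedule" */
	unsigned int conns;	/* current connection count */
};

/*
 * Pick the server with the smallest conns/weight ratio.
 * a.conns/a.weight > b.conns/b.weight is evaluated as
 * a.conns*b.weight > b.conns*a.weight to avoid division.
 */
const struct server *wlc_pick(const struct server *srv, size_t n)
{
	const struct server *least = NULL;
	size_t i;

	assert(srv != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (srv[i].weight == 0)
			continue;
		if (least == NULL ||
		    (unsigned long long)least->conns * srv[i].weight >
		    (unsigned long long)srv[i].conns * least->weight)
			least = &srv[i];
	}
	return least;	/* NULL if no server is schedulable */
}
```

With servers a (weight 1, 3 connections) and b (weight 3, 6 connections), the sketch picks b, because 6/3 is smaller than 3/1.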

 

3. IPVS code notes

The kernel version is Linux 3.3.7.

 

3.1 Structures

The IPVS structures are defined in the header files include/net/ip_vs.h and include/linux/ip_vs.h.

struct ip_vs_protocol

Describes an IP protocol supported by IPVS. At the IP layer, IPVS supports TCP, UDP, AH, and ESP.

struct ip_vs_conn

Describes an IPVS connection.

struct ip_vs_service

Describes the virtual server that IPVS presents to the outside.

struct ip_vs_dest

Describes a real server.

struct ip_vs_scheduler

Describes an IPVS scheduling algorithm. The scheduling methods currently available are rr, wrr, lc, wlc, lblc, lblcr, dh, and sh.

struct ip_vs_app

Describes an IPVS application module object.

struct ip_vs_service_user

Describes the virtual service information passed in from user space.

struct ip_vs_dest_user

Describes the real server information passed in from user space.

struct ip_vs_stats_user

Describes the IPVS statistics exposed to user space.

struct ip_vs_getinfo

Describes the general IPVS information retrieved by user space.

struct ip_vs_service_entry

Describes one virtual service rule entry as seen from user space.

struct ip_vs_dest_entry

Describes one real server rule entry as seen from user space.

struct ip_vs_get_dests

Used by user space to retrieve the real server entries.

struct ip_vs_get_services

Used by user space to retrieve the virtual service entries.

struct ip_vs_timeout_user

Describes the timeout information exchanged with user space.

struct ip_vs_daemon_user

Describes the IPVS kernel daemon information.

 

3.2 Module initialization

File net/netfilter/ipvs/ip_vs_core.c

static int __init ip_vs_init(void)

IPVS service initialization

File net/netfilter/ipvs/ip_vs_ctl.c

int __init ip_vs_control_init(void)

ioctl (control interface) initialization

File net/netfilter/ipvs/ip_vs_proto.c

int __init ip_vs_protocol_init(void)

Protocol initialization

File net/netfilter/ipvs/ip_vs_conn.c

int __init ip_vs_conn_init(void)

Connection initialization

File net/netfilter/ipvs/ip_vs_core.c

static struct nf_hook_ops ip_vs_ops[]

ret = nf_register_hooks(ip_vs_ops, ARRAY_SIZE(ip_vs_ops));

Array of netfilter hook points. For the actual packet processing, see the corresponding hook functions.
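The initialization steps above run in sequence, and a failure part-way through has to undo the steps that already succeeded. The following user-space sketch shows that unwinding pattern in a table-driven form. The `struct stage` type, `run_stages`, and the demo callbacks are assumptions made for illustration; the kernel source expresses the same idea differently, and none of these names exist there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical init/cleanup pair for one subsystem. */
struct stage {
	const char *name;
	int (*init)(void);
	void (*cleanup)(void);
};

/*
 * Run each stage in order. If one fails, clean up the stages that
 * already succeeded, in reverse order, and return the error code.
 */
int run_stages(const struct stage *st, size_t n)
{
	size_t i;
	int ret;

	assert(st != NULL || n == 0);
	for (i = 0; i < n; i++) {
		ret = st[i].init();
		if (ret < 0) {
			while (i-- > 0)
				st[i].cleanup();
			return ret;
		}
	}
	return 0;
}

/* Demo callbacks: counters record what ran. */
int inits_done, cleanups_done;

int ok_init(void) { inits_done++; return 0; }
int bad_init(void) { return -1; }
void count_cleanup(void) { cleanups_done++; }
```

If the third of three stages fails, the first two are cleaned up and the error from the third is returned to the caller.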

 

3.3 Implementation of the scheduling algorithms

Each algorithm is described by a struct ip_vs_scheduler.

The rr algorithm is implemented in net/netfilter/ipvs/ip_vs_rr.c, and the other algorithms follow the same pattern.

static struct ip_vs_scheduler ip_vs_rr_scheduler = {
    .name           = "rr",    /* name */
    .refcnt         = ATOMIC_INIT(0),
    .module         = THIS_MODULE,
    .n_list         = LIST_HEAD_INIT(ip_vs_rr_scheduler.n_list),
    .init_service   = ip_vs_rr_init_svc,
    .update_service = ip_vs_rr_update_svc,
    .schedule       = ip_vs_rr_schedule,
};

init_service()

Algorithm initialization. Called when the virtual service ip_vs_service is bound to the scheduler (in the ip_vs_bind_scheduler() function).

update_service()

Called when the set of destination servers changes (for example, from ip_vs_add_dest() or ip_vs_edit_dest()).

schedule()

The core function of the algorithm. It is called from ip_vs_schedule() before a new IPVS connection is created, to find the real server that will provide the service; the IPVS connection is then established.

For the details of each algorithm, read the source code together with yfydz's IPVS implementation analysis.
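As a rough picture of how a scheduler plugs in through a table of function pointers, here is a user-space sketch of round-robin selection. The `struct dest`, `struct service`, and `struct scheduler` types are reduced, hypothetical stand-ins for the kernel structures, and `rr_init` and `rr_schedule` are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

struct dest {
	const char *name;
	int weight;		/* 0 means "do not schedule" */
};

struct service {
	struct dest *dests;
	size_t ndests;
	size_t next;		/* scheduler-private cursor */
};

/* Hypothetical, reduced counterpart of struct ip_vs_scheduler. */
struct scheduler {
	const char *name;
	void (*init_service)(struct service *svc);
	struct dest *(*schedule)(struct service *svc);
};

void rr_init(struct service *svc)
{
	svc->next = 0;
}

/* Return the next destination with weight > 0, or NULL if none. */
struct dest *rr_schedule(struct service *svc)
{
	size_t tried;

	assert(svc != NULL);
	for (tried = 0; tried < svc->ndests; tried++) {
		struct dest *d = &svc->dests[svc->next];

		svc->next = (svc->next + 1) % svc->ndests;
		if (d->weight > 0)
			return d;
	}
	return NULL;
}

const struct scheduler rr_scheduler = {
	.name = "rr",
	.init_service = rr_init,
	.schedule = rr_schedule,
};
```

The caller only ever goes through `rr_scheduler.schedule(&svc)`, so another policy can be swapped in by pointing at a different `struct scheduler`.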

 

3.4 Connection management

File net/netfilter/ipvs/ip_vs_conn.c

struct ip_vs_conn *ip_vs_conn_in_get(const struct ip_vs_conn_param *p)

Look up a connection in the inbound direction

struct ip_vs_conn *ip_vs_conn_out_get(const struct ip_vs_conn_param *p)

Look up a connection in the outbound direction

struct ip_vs_conn *
ip_vs_conn_new(const struct ip_vs_conn_param *p,
               const union nf_inet_addr *daddr, __be16 dport, unsigned flags,
               struct ip_vs_dest *dest, __u32 fwmark)

Create a new connection

static inline void
ip_vs_bind_dest(struct ip_vs_conn *cp, struct ip_vs_dest *dest)

Bind the connection to a real server

int ip_vs_bind_app(struct ip_vs_conn *cp, struct ip_vs_protocol *pp)

Bind the application protocol

static inline void ip_vs_bind_xmit(struct ip_vs_conn *cp)

Bind the transmit method

static inline int ip_vs_conn_hash(struct ip_vs_conn *cp)

Add the connection structure to the connection hash table

static inline int ip_vs_conn_unhash(struct ip_vs_conn *cp)

Remove the connection from the connection hash table

static void ip_vs_conn_expire(unsigned long data)

Connection timeout handler

static inline void ip_vs_control_del(struct ip_vs_conn *cp)

Detach the connection from its controlling (master) connection

void ip_vs_unbind_app(struct ip_vs_conn *cp)

Unbind the connection from the application

static inline void ip_vs_unbind_dest(struct ip_vs_conn *cp)

Unbind the connection from the real server

static void ip_vs_conn_flush(struct net *net)

Release all connections

void ip_vs_random_dropentry(struct net *net)

Periodically drop connections at random

static inline int todrop_entry(struct ip_vs_conn *cp)

Decide whether a connection should be dropped
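The hash, unhash, and lookup operations listed above can be pictured with a small chained hash table. This is a user-space sketch under simplifying assumptions: the key is only protocol, client address, and client port; the hash function and table size are arbitrary; and locking, reference counts, and timers are left out. All names (`struct conn`, `conn_hash`, `conn_unhash`, `conn_get`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CONN_TAB_SIZE 256	/* power of two; size chosen for the example */

struct conn {
	uint8_t proto;
	uint32_t caddr;		/* client address */
	uint16_t cport;		/* client port */
	struct conn *next;	/* bucket chain */
	int hashed;
};

struct conn *conn_tab[CONN_TAB_SIZE];

static unsigned int conn_key(uint8_t proto, uint32_t addr, uint16_t port)
{
	uint32_t h = addr ^ ((uint32_t)port << 16) ^ proto;

	h ^= h >> 16;
	h ^= h >> 8;
	return h & (CONN_TAB_SIZE - 1);
}

/* Insert at the head of the bucket; returns 0 if already hashed. */
int conn_hash(struct conn *cp)
{
	unsigned int b;

	assert(cp != NULL);
	if (cp->hashed)
		return 0;
	b = conn_key(cp->proto, cp->caddr, cp->cport);
	cp->next = conn_tab[b];
	conn_tab[b] = cp;
	cp->hashed = 1;
	return 1;
}

/* Unlink from the bucket; returns 0 if it was not hashed. */
int conn_unhash(struct conn *cp)
{
	struct conn **pp;

	assert(cp != NULL);
	if (!cp->hashed)
		return 0;
	pp = &conn_tab[conn_key(cp->proto, cp->caddr, cp->cport)];
	while (*pp != NULL && *pp != cp)
		pp = &(*pp)->next;
	if (*pp == cp)
		*pp = cp->next;
	cp->next = NULL;
	cp->hashed = 0;
	return 1;
}

/* Walk the bucket chain and compare the full key. */
struct conn *conn_get(uint8_t proto, uint32_t addr, uint16_t port)
{
	struct conn *cp = conn_tab[conn_key(proto, addr, port)];

	while (cp != NULL) {
		if (cp->proto == proto && cp->caddr == addr &&
		    cp->cport == port)
			return cp;
		cp = cp->next;
	}
	return NULL;
}
```

A connection is found by `conn_get` only between a successful `conn_hash` and the matching `conn_unhash`, which is the property the expiry and flush paths rely on.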

 

3.5 Protocol management

File net/netfilter/ipvs/ip_vs_proto.c

static int __used __init register_ip_vs_protocol(struct ip_vs_protocol *pp)

Register an IPVS protocol

static int unregister_ip_vs_protocol(struct ip_vs_protocol *pp)

Unregister an IPVS protocol

struct ip_vs_protocol *ip_vs_proto_get(unsigned short proto)

Return a pointer to the protocol structure for the given protocol number

void ip_vs_protocol_timeout_change(struct netns_ipvs *ipvs, int flags)

Change the protocol timeout flags

int *ip_vs_create_timeout_table(int *table, int size)

Create a state timeout table

int ip_vs_set_state_timeout(int *table, int num, const char *const *names,
                            const char *name, int to)

Modify the state timeout table

const char *ip_vs_state_name(__u16 proto, int state)

Return the name of a protocol state
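Registration and lookup by protocol number can be sketched as a small table indexed by a hash of the protocol number. This is a user-space illustration only: `struct proto`, `PROTO_TAB_SIZE`, and the three functions are hypothetical names, and the table size and error value are choices made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define PROTO_TAB_SIZE 32	/* power of two; value chosen for the example */
#define PROTO_HASH(p) ((p) & (PROTO_TAB_SIZE - 1))

struct proto {
	const char *name;
	unsigned short protocol;	/* IP protocol number, e.g. 6 for TCP */
	struct proto *next;		/* chain of colliding entries */
};

struct proto *proto_table[PROTO_TAB_SIZE];

/* Link the protocol at the head of its bucket. */
void proto_register(struct proto *pp)
{
	unsigned int h;

	assert(pp != NULL);
	h = PROTO_HASH(pp->protocol);
	pp->next = proto_table[h];
	proto_table[h] = pp;
}

/* Unlink the protocol; returns -1 if it was not registered. */
int proto_unregister(struct proto *pp)
{
	struct proto **p;

	assert(pp != NULL);
	for (p = &proto_table[PROTO_HASH(pp->protocol)]; *p != NULL;
	     p = &(*p)->next) {
		if (*p == pp) {
			*p = pp->next;
			pp->next = NULL;
			return 0;
		}
	}
	return -1;
}

/* Find the entry whose protocol number matches exactly. */
struct proto *proto_get(unsigned short protocol)
{
	struct proto *pp;

	for (pp = proto_table[PROTO_HASH(protocol)]; pp != NULL; pp = pp->next)
		if (pp->protocol == protocol)
			return pp;
	return NULL;
}
```

Because the bucket index is only a hash, `proto_get` still compares the full protocol number, so two protocols that land in the same bucket stay distinguishable.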

The implementation of the TCP protocol is described in detail below. The relevant code file is net/netfilter/ipvs/ip_vs_proto_tcp.c.

struct ip_vs_protocol ip_vs_protocol_tcp = {
    .name             = "TCP",
    .protocol         = IPPROTO_TCP,
    .num_states       = IP_VS_TCP_S_LAST,
    .dont_defrag      = 0,
    .init             = NULL,
    .exit             = NULL,
    .init_netns       = __ip_vs_tcp_init,
    .exit_netns       = __ip_vs_tcp_exit,
    .register_app     = tcp_register_app,
    .unregister_app   = tcp_unregister_app,
    .conn_schedule    = tcp_conn_schedule,
    .conn_in_get      = ip_vs_conn_in_get_proto,
    .conn_out_get     = ip_vs_conn_out_get_proto,
    .snat_handler     = tcp_snat_handler,
    .dnat_handler     = tcp_dnat_handler,
    .csum_check       = tcp_csum_check,
    .state_name       = tcp_state_name,
    .state_transition = tcp_state_transition,
    .app_conn_bind    = tcp_app_conn_bind,
    .debug_packet     = ip_vs_tcpudp_debug_packet,
    .timeout_change   = tcp_timeout_change,
};

static void __ip_vs_tcp_init(struct net *net, struct ip_vs_proto_data *pd)

TCP initialization function

static void __ip_vs_tcp_exit(struct net *net, struct ip_vs_proto_data *pd)

TCP exit function

static int tcp_register_app(struct net *net, struct ip_vs_app *inc)

Register a TCP application protocol

static void tcp_unregister_app(struct net *net, struct ip_vs_app *inc)

Unregister a TCP application protocol

static int
tcp_conn_schedule(int af, struct sk_buff *skb, struct ip_vs_proto_data *pd,
                  int *verdict, struct ip_vs_conn **cpp)

TCP connection scheduling. This function is called from the ip_vs_in() function.

struct ip_vs_conn *
ip_vs_conn_in_get_proto(int af, const struct sk_buff *skb,
                        const struct ip_vs_iphdr *iph,
                        unsigned int proto_off, int inverse)

Look up a connection in the inbound direction

struct ip_vs_conn *
ip_vs_conn_out_get_proto(int af, const struct sk_buff *skb,
                         const struct ip_vs_iphdr *iph,
                         unsigned int proto_off, int inverse)

Look up a connection in the outbound direction

static int
tcp_snat_handler(struct sk_buff *skb,
                 struct ip_vs_protocol *pp, struct ip_vs_conn *cp)

This function performs the source NAT operation on the protocol-specific part of the packet. For TCP, the part that is rewritten is the source port.

static inline void
tcp_fast_csum_update(int af, struct tcphdr *tcph,
                     const union nf_inet_addr *oldip,
                     const union nf_inet_addr *newip,
                     __be16 oldport, __be16 newport)

Fast update of the TCP checksum. Because only individual fields such as the port are modified, the checksum can be updated incrementally as described in RFC 1141 instead of being recomputed over the whole segment.
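The idea behind the incremental update can be checked in user space. The sketch below is not the kernel function: `csum_full` and `csum_update16` are hypothetical helpers, the "packet" is just an array of 16-bit words, and the update uses the form given in RFC 1624, which revises the RFC 1141 equation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Full Internet checksum over 16-bit words (for comparison only). */
uint16_t csum_full(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;
	size_t i;

	assert(words != NULL || n == 0);
	for (i = 0; i < n; i++)
		sum += words[i];
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Incremental update when one 16-bit word changes from old_w to new_w:
 *   new_check = ~(~old_check + ~old_w + new_w)
 * where the additions are one's-complement sums (carries folded back in).
 */
uint16_t csum_update16(uint16_t old_check, uint16_t old_w, uint16_t new_w)
{
	uint32_t sum = (uint16_t)~old_check;

	sum += (uint16_t)~old_w;
	sum += new_w;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

Changing a port word from 0x0050 to 0x1f90 and calling `csum_update16` with the old checksum gives the same value as recomputing `csum_full` over the modified words, while touching only three 16-bit values.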

static int
tcp_dnat_handler(struct sk_buff *skb,
                 struct ip_vs_protocol *pp, struct ip_vs_conn *cp)

This function performs the destination NAT operation on the protocol-specific part of the packet. For TCP, the part that is rewritten is the destination port.

static int
tcp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp)

Checks the checksum of the protocol data carried in the IP packet. Both the TCP and the UDP header contain a checksum field. The TCP checksum is mandatory, while the UDP checksum is optional.

This function uses the standard checksum calculation functions provided by the Linux kernel.

static const char *tcp_state_name(int state)

This function returns the name string of a protocol state

static const char *const tcp_state_name_table[IP_VS_TCP_S_LAST+1] = {
    [IP_VS_TCP_S_NONE]        = "NONE",
    [IP_VS_TCP_S_ESTABLISHED] = "ESTABLISHED",
    [IP_VS_TCP_S_SYN_SENT]    = "SYN_SENT",
    [IP_VS_TCP_S_SYN_RECV]    = "SYN_RECV",
    [IP_VS_TCP_S_FIN_WAIT]    = "FIN_WAIT",
    [IP_VS_TCP_S_TIME_WAIT]   = "TIME_WAIT",
    [IP_VS_TCP_S_CLOSE]       = "CLOSE",
    [IP_VS_TCP_S_CLOSE_WAIT]  = "CLOSE_WAIT",
    [IP_VS_TCP_S_LAST_ACK]    = "LAST_ACK",
    [IP_VS_TCP_S_LISTEN]      = "LISTEN",
    [IP_VS_TCP_S_SYNACK]      = "SYNACK",
    [IP_VS_TCP_S_LAST]        = "BUG!",
};

Definition of the TCP state names

static void
tcp_state_transition(struct ip_vs_conn *cp, int direction,
                     const struct sk_buff *skb,
                     struct ip_vs_proto_data *pd)

TCP state transition

static inline void
set_tcp_state(struct ip_vs_proto_data *pd, struct ip_vs_conn *cp,
              int direction, struct tcphdr *th)

Set the TCP connection state

static struct tcp_states_t tcp_states[]

TCP state transition table
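A table-driven state machine of this kind can be sketched in a few lines. The states, events, and transitions below are invented and heavily reduced for illustration; they are not the contents of the kernel's tcp_states[] table, and `tcp_next` and `state_str` are hypothetical names.

```c
#include <assert.h>

enum state { S_NONE, S_SYN_RECV, S_ESTABLISHED, S_FIN_WAIT, S_CLOSE, S_LAST };
enum event { E_SYN, E_ACK, E_FIN, E_RST, E_LAST };

/* next_state[event][current state]; rows are invented for the example. */
static const enum state next_state[E_LAST][S_LAST] = {
	/* current:  NONE        SYN_RECV       ESTABLISHED    FIN_WAIT    CLOSE */
	[E_SYN] = { S_SYN_RECV, S_SYN_RECV,    S_ESTABLISHED, S_SYN_RECV, S_SYN_RECV },
	[E_ACK] = { S_CLOSE,    S_ESTABLISHED, S_ESTABLISHED, S_FIN_WAIT, S_CLOSE },
	[E_FIN] = { S_CLOSE,    S_FIN_WAIT,    S_FIN_WAIT,    S_FIN_WAIT, S_CLOSE },
	[E_RST] = { S_CLOSE,    S_CLOSE,       S_CLOSE,       S_CLOSE,    S_CLOSE },
};

static const char *const state_names[S_LAST] = {
	[S_NONE]        = "NONE",
	[S_SYN_RECV]    = "SYN_RECV",
	[S_ESTABLISHED] = "ESTABLISHED",
	[S_FIN_WAIT]    = "FIN_WAIT",
	[S_CLOSE]       = "CLOSE",
};

/* One table lookup per packet: no chain of if/else on flags and state. */
enum state tcp_next(enum state cur, enum event ev)
{
	assert(cur < S_LAST && ev < E_LAST);
	return next_state[ev][cur];
}

/* Bounds-checked name lookup; "ERR!" is this example's fallback string. */
const char *state_str(int s)
{
	if (s < 0 || s >= S_LAST)
		return "ERR!";
	return state_names[s];
}
```

Feeding SYN, ACK, FIN, RST in that order walks a connection from NONE through SYN_RECV, ESTABLISHED, and FIN_WAIT to CLOSE. Keeping the policy in a table means the transition rules can be changed without touching the lookup code.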

static void tcp_timeout_change(struct ip_vs_proto_data *pd, int flags)

Timeout change

static int
tcp_app_conn_bind(struct ip_vs_conn *cp)

This function binds the processing module of a multi-connection application protocol to the IPVS connection.

 

To be continued

 
