Name
ip - Linux IPv4 protocol implementation
Synopsis
#include <sys/socket.h>
#include <netinet/in.h>
tcp_socket = socket(PF_INET, SOCK_STREAM, 0);
raw_socket = socket(PF_INET, SOCK_RAW, protocol);
udp_socket = socket(PF_INET, SOCK_DGRAM, protocol);
Description
Linux implements the Internet Protocol, version 4, described in RFC 791 and RFC 1122. ip contains a level 2 multicasting implementation conforming to RFC 1112. It also contains an IP router including a packet filter.
The programmer's interface is BSD sockets compatible. For more information on sockets, see socket(7).
An IP socket is created by calling the socket(2) function as socket(PF_INET, socket_type, protocol). Valid socket types are SOCK_STREAM to open a tcp(7) socket, SOCK_DGRAM to open a udp(7) socket, or SOCK_RAW to open a raw(7) socket that accesses the IP protocol directly. protocol is the IP protocol in the IP header to be received or sent. The only valid values for protocol are 0 and IPPROTO_TCP for TCP sockets, and 0 and IPPROTO_UDP for UDP sockets. For SOCK_RAW you may specify a valid IANA IP protocol number as defined in the RFC 1700 assigned numbers.
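The socket(2) calls described above can be sketched as follows. This is a minimal illustration, not part of the original page; the helper names open_ip_socket() and socket_type_of() are hypothetical.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Open an IPv4 socket of the given type (SOCK_STREAM or SOCK_DGRAM).
 * Passing 0 as the protocol lets the kernel choose TCP or UDP to
 * match the socket type. Returns the descriptor, or -1 on error. */
int open_ip_socket(int socket_type)
{
    return socket(PF_INET, socket_type, 0);
}

/* Ask the kernel which type it recorded for fd (via SO_TYPE).
 * Returns the type, or -1 on error. */
int socket_type_of(int fd)
{
    int type = 0;
    socklen_t len = sizeof(type);

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -1;
    return type;
}
```

Calling open_ip_socket(SOCK_STREAM) yields a TCP socket and open_ip_socket(SOCK_DGRAM) a UDP socket. Raw sockets are left out here because they normally require privileges.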
When a process wants to receive new incoming packets or connections, it should bind a socket to a local interface address using bind(2). Only one IP socket may be bound to any given local (address, port) pair. When INADDR_ANY is specified in the bind call, the socket will be bound to all local interfaces. When listen(2) or connect(2) is called on an unbound socket, the socket is automatically bound to a random free port with the local address set to INADDR_ANY.
A TCP local socket address that has been bound is unavailable for some time after closing, unless the SO_REUSEADDR flag has been set. Care should be taken when using this flag, as it makes TCP less reliable.
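A minimal sketch of binding with SO_REUSEADDR set, assuming a TCP socket bound to INADDR_ANY. The function name bind_tcp_any() is hypothetical, and passing port 0 (which asks the kernel for a free port) is a choice made for this illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket, set SO_REUSEADDR, and bind it to INADDR_ANY on
 * the given port (host byte order; 0 asks the kernel for a free port).
 * On success returns the descriptor and stores the port actually
 * bound in *bound_port; on failure returns -1. */
int bind_tcp_any(unsigned short port, unsigned short *bound_port)
{
    int on = 1;
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(PF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0 ||
        bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *bound_port = ntohs(addr.sin_port);
    return fd;
}
```

The option has to be set before bind(2) is called; setting it afterwards does not affect an address that is already bound.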
Address format
An IP socket address is defined as a combination of an IP interface address and a port number. The basic IP protocol does not supply port numbers; they are implemented by higher level protocols like udp(7) and tcp(7). On raw sockets sin_port is set to the IP protocol.
-
    struct sockaddr_in {
        sa_family_t    sin_family; /* address family: AF_INET */
        u_int16_t      sin_port;   /* port in network byte order */
        struct in_addr sin_addr;   /* internet address */
    };

    /* Internet address. */
    struct in_addr {
        u_int32_t      s_addr;     /* address in network byte order */
    };
sin_family is always set to AF_INET. This is required; in Linux 2.2 most networking functions return EINVAL when this setting is missing. sin_port contains the port in network byte order. The port numbers below 1024 are called reserved ports. Only processes with effective user ID 0 or the CAP_NET_BIND_SERVICE capability may bind(2) to these sockets. Note that the raw IPv4 protocol as such has no concept of a port; ports are only implemented by higher protocols like tcp(7) and udp(7).
sin_addr is the IP host address. The s_addr member of struct in_addr contains the host interface address in network byte order. in_addr should only be accessed using the inet_aton(3), inet_addr(3), and inet_makeaddr(3) library functions, or directly with the name resolver (see gethostbyname(3)). IPv4 addresses are divided into unicast, broadcast, and multicast addresses. Unicast addresses specify a single interface of a host, broadcast addresses specify all hosts on a network, and multicast addresses address all hosts in a multicast group. Datagrams to broadcast addresses can only be sent or received when the SO_BROADCAST socket flag is set. In the current implementation, connection-oriented sockets are only allowed to use unicast addresses.
Note that the address and the port are always stored in network byte order. In particular, this means that you need to call htons(3) on the number that is assigned to a port. All address/port manipulation functions in the standard library work in network byte order.
There are several special addresses: INADDR_LOOPBACK (127.0.0.1) always refers to the local host via the loopback device; INADDR_ANY (0.0.0.0) means any address for binding; INADDR_BROADCAST (255.255.255.255) means any host and has the same effect on bind as INADDR_ANY.
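The byte-order rules above can be illustrated with a short sketch that fills a struct sockaddr_in from a dotted-quad string and a host-order port. The helper name make_ipv4_addr() is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* makes inet_aton() visible on glibc */
#endif
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Fill *out from a dotted-quad string and a port in host byte order.
 * Returns 0 on success, -1 if the string is not a valid IPv4 address. */
int make_ipv4_addr(const char *dotted, unsigned short port,
                   struct sockaddr_in *out)
{
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;     /* required on every IP socket address */
    out->sin_port = htons(port);   /* ports are stored in network order */
    if (inet_aton(dotted, &out->sin_addr) == 0)
        return -1;                 /* inet_aton() returns 0 on bad input */
    return 0;
}
```

Constants such as INADDR_LOOPBACK are defined in host byte order, so they need htonl(3) before being compared with or stored in s_addr.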
Socket options
IP supports some protocol-specific socket options that can be set with setsockopt(2) and read with getsockopt(2). The socket option level for IP is SOL_IP.
-
IP_OPTIONS
-
Sets or gets the IP options to be sent with every packet from this socket. The arguments are a pointer to a memory buffer containing the options and the option length. The setsockopt(2) call sets the IP options associated with a socket. The maximum option size for IPv4 is 40 bytes. See RFC 791 for the allowed options. When the initial connection request packet for a SOCK_STREAM socket contains IP options, the IP options will be set automatically to the options from the initial packet, with routing headers reversed. Incoming packets are not allowed to change options after the connection is established. The processing of all incoming source routing options is disabled by default and can be enabled by using the accept_source_route sysctl. Other options like timestamps are still handled. For datagram sockets, IP options can only be set by the local user. Calling getsockopt(2) with IP_OPTIONS puts the current IP options used for sending into the supplied buffer.
-
IP_PKTINFO
-
Pass an IP_PKTINFO ancillary message that contains a pktinfo structure supplying some information about the incoming packet. This only works for datagram-oriented sockets.
-
    struct in_pktinfo {
        unsigned int   ipi_ifindex;  /* interface index */
        struct in_addr ipi_spec_dst; /* routing destination address */
        struct in_addr ipi_addr;     /* header destination address */
    };
-
ipi_ifindex is the unique index of the interface the packet was received on. ipi_spec_dst is the destination address of the routing table entry, and ipi_addr is the destination address in the packet header. If IP_PKTINFO is passed to sendmsg(2), the outgoing packet will be sent over the interface specified in ipi_ifindex with the destination address set to ipi_spec_dst.
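A minimal sketch of receiving the IP_PKTINFO ancillary message on a datagram socket with recvmsg(2). The function names enable_pktinfo() and recv_with_pktinfo() are hypothetical, and the code assumes a glibc system where struct in_pktinfo is exposed by <netinet/in.h>.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes struct in_pktinfo on glibc */
#endif
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Ask the kernel to attach an IP_PKTINFO message to each datagram. */
int enable_pktinfo(int fd)
{
    int on = 1;
    return setsockopt(fd, SOL_IP, IP_PKTINFO, &on, sizeof(on));
}

/* Receive one datagram and copy its in_pktinfo into *info.
 * Returns the payload length, or -1 on error or when no IP_PKTINFO
 * message was attached to the datagram. */
ssize_t recv_with_pktinfo(int fd, void *buf, size_t buflen,
                          struct in_pktinfo *info)
{
    union { /* union keeps the control buffer aligned for cmsghdr */
        char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
        struct cmsghdr align;
    } control;
    struct iovec iov = { .iov_base = buf, .iov_len = buflen };
    struct msghdr msg;
    struct cmsghdr *c;
    ssize_t n;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;

    for (c = CMSG_FIRSTHDR(&msg); c != NULL; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_IP && c->cmsg_type == IP_PKTINFO) {
            memcpy(info, CMSG_DATA(c), sizeof(*info));
            return n;
        }
    }
    return -1;
}
```

After a successful call, info->ipi_ifindex identifies the receiving interface and info->ipi_addr holds the destination address taken from the packet header.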
-
IP_RECVTOS
-
If enabled, the IP_TOS ancillary message is passed with incoming packets. It contains a byte which specifies the Type of Service/Precedence field of the packet header. Expects a boolean integer flag.
-
IP_RECVTTL
-
When this flag is set, pass an IP_RECVTTL control message with the time-to-live field of the received packet as a byte. Not supported for SOCK_STREAM sockets.
-
IP_RECVOPTS
-
Pass all incoming IP options to the user in an IP_OPTIONS control message. The routing header and other options are already filled in for the local host. Not supported for SOCK_STREAM sockets.
-
IP_RETOPTS
-
Identical to IP_RECVOPTS, but returns raw unprocessed options, with timestamp and route record options not filled in for this hop.
-
IP_TOS
-
Set or receive the Type-Of-Service (TOS) field that is sent with every IP packet originating from this socket. It is used to prioritize packets on the network. TOS is a byte. Some standard TOS flags are defined: IPTOS_LOWDELAY to minimize delays for interactive traffic, IPTOS_THROUGHPUT to optimize throughput, IPTOS_RELIABILITY to optimize for reliability, and IPTOS_MINCOST for "filler data" where slow transmission does not matter. At most one of these TOS values can be specified. Other bits are invalid and shall be cleared. By default, Linux sends IPTOS_LOWDELAY datagrams first, but the exact behaviour depends on the configured queueing discipline. Some high priority levels may require an effective user ID of 0 or the CAP_NET_ADMIN capability. The priority can also be set in a protocol-independent way with the (SOL_SOCKET, SO_PRIORITY) socket option (see socket(7)).
-
IP_TTL
-
Set or retrieve the current time-to-live field that is sent in every packet sent from this socket.
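Integer-valued options such as IP_TTL and IP_TOS are set and read the same way. The sketch below is an illustration under that assumption; the helper names set_ip_int_option() and get_ip_int_option() are hypothetical.

```c
#include <netinet/in.h>
#include <netinet/ip.h> /* IPTOS_LOWDELAY and the other IPTOS_* flags */
#include <sys/socket.h>

/* Set an integer-valued SOL_IP option such as IP_TTL or IP_TOS.
 * Returns 0 on success, -1 on error. */
int set_ip_int_option(int fd, int option, int value)
{
    return setsockopt(fd, SOL_IP, option, &value, sizeof(value));
}

/* Read an integer-valued SOL_IP option. Returns the value (which is
 * non-negative for IP_TTL and IP_TOS), or -1 on error. */
int get_ip_int_option(int fd, int option)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, SOL_IP, option, &value, &len) < 0)
        return -1;
    return value;
}
```

For example, set_ip_int_option(fd, IP_TTL, 32) limits packets from fd to 32 hops, and set_ip_int_option(fd, IP_TOS, IPTOS_LOWDELAY) marks them as interactive traffic.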
-
IP_HDRINCL
-
If enabled, the user supplies an IP header in front of the user data. Only valid for SOCK_RAW sockets. See raw(7) for more information. When this flag is enabled, the values set by IP_OPTIONS, IP_TTL, and IP_TOS are ignored.
-
IP_RECVERR
-
Enable extended reliable error message passing. When enabled on a datagram socket, all generated errors will be queued in a per-socket error queue. When the user receives an error from a socket operation, the errors can be received by calling recvmsg(2) with the MSG_ERRQUEUE flag set. The sock_extended_err structure describing the error will be passed in an ancillary message with the type IP_RECVERR and the level SOL_IP. This is useful for reliable error handling on unconnected sockets. The received data portion of the error queue contains the error packet.
-
IP uses the sock_extended_err structure as follows: ee_origin is set to SO_EE_ORIGIN_ICMP for errors received as an ICMP packet, or SO_EE_ORIGIN_LOCAL for locally generated errors. ee_type and ee_code are set from the type and code fields of the ICMP header. ee_info contains the discovered MTU for EMSGSIZE errors. ee_data is currently not used. When the error originated from the network, all IP options (IP_OPTIONS, IP_TTL, etc.) enabled on the socket and contained in the error packet are passed as control messages. The payload of the packet causing the error is returned as normal data.
-
On SOCK_STREAM sockets, IP_RECVERR has slightly different semantics. Instead of saving the errors for the next timeout, it passes all incoming errors immediately to the user. This might be useful for very short-lived TCP connections which need fast error handling. Use this option with care: it makes TCP unreliable by not allowing it to recover properly from routing shifts and other normal conditions, and it breaks the protocol specification. Note that TCP has no error queue; MSG_ERRQUEUE is illegal on SOCK_STREAM sockets. Thus all errors are returned by the socket function return value or SO_ERROR only.
-
For raw sockets, IP_RECVERR enables passing of all received ICMP errors to the application; otherwise errors are only reported on connected sockets.
-
It sets or retrieves an integer boolean flag. IP_RECVERR defaults to off.
-
IP_MTU_DISCOVER
-
Sets or receives the Path MTU Discovery setting for a socket. When enabled, Linux will perform Path MTU Discovery as defined in RFC 1191 on this socket. The don't-fragment flag is set on all outgoing datagrams. The system-wide default is controlled by the ip_no_pmtu_disc sysctl for SOCK_STREAM sockets, and disabled on all others. For non-SOCK_STREAM sockets it is the user's responsibility to packetize the data in MTU-sized chunks and to do the retransmits if necessary. The kernel will reject packets that are bigger than the known path MTU if this flag is set (with EMSGSIZE).
Path MTU discovery flags | Meaning
IP_PMTUDISC_WANT | Use per-route settings.
IP_PMTUDISC_DONT | Never do Path MTU Discovery.
IP_PMTUDISC_DO | Always do Path MTU Discovery.
When PMTU discovery is enabled, the kernel automatically keeps track of the path MTU per destination host. When the socket is connected to a specific peer with connect(2), the currently known path MTU can be retrieved conveniently using the IP_MTU socket option (e.g., after an EMSGSIZE error occurred). It may change over time. For connectionless sockets with many destinations, the new MTU for a given destination can also be accessed using the error queue (see IP_RECVERR). A new error will be queued for every incoming MTU update.
While MTU discovery is in progress, initial packets from datagram sockets may be dropped. Applications using UDP should be aware of this when designing their packet retransmission strategy.
To bootstrap the path MTU discovery process on unconnected sockets, it is possible to start with a big datagram size (up to 64K minus the header bytes) and let it shrink by updates of the path MTU.
To get an initial estimate of the path MTU, connect a datagram socket to the destination address using connect(2) and retrieve the MTU by calling getsockopt(2) with the IP_MTU option.
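The estimate procedure above can be sketched like this. The path MTU discovery option is spelled IP_MTU_DISCOVER in the Linux headers, and that is the name used here; the helper names force_pmtu_discovery() and estimate_path_mtu() are hypothetical.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Always perform path MTU discovery on this socket. Datagrams larger
 * than the known path MTU are then rejected with EMSGSIZE. */
int force_pmtu_discovery(int fd)
{
    int mode = IP_PMTUDISC_DO;
    return setsockopt(fd, SOL_IP, IP_MTU_DISCOVER, &mode, sizeof(mode));
}

/* Connect a datagram socket to dest and return the kernel's current
 * path MTU estimate for it, or -1 on error. The estimate may change
 * later as ICMP feedback arrives. */
int estimate_path_mtu(int fd, const struct sockaddr_in *dest)
{
    int mtu = 0;
    socklen_t len = sizeof(mtu);

    if (connect(fd, (const struct sockaddr *)dest, sizeof(*dest)) < 0)
        return -1;
    if (getsockopt(fd, SOL_IP, IP_MTU, &mtu, &len) < 0)
        return -1;
    return mtu;
}
```

Connecting a UDP socket sends no packets, so the first value returned is normally the MTU of the outgoing route rather than a measured end-to-end figure.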
-
IP_MTU
-
Retrieve the current known path MTU of the current socket. Only valid when the socket has been connected. Returns an integer. Only valid as a getsockopt(2).
-
IP_ROUTER_ALERT
-
Pass all to-be-forwarded packets with the IP Router Alert option set to this socket. Only valid for raw sockets. This is useful, for instance, for user space RSVP daemons. The tapped packets are not forwarded by the kernel; it is the user's responsibility to send them out again. Socket binding is ignored; such packets are only filtered by protocol. Expects an integer flag.
-
IP_MULTICAST_TTL
-
Set or read the time-to-live value of outgoing multicast packets for this socket. It is very important for multicast packets to set the smallest TTL possible. The default is 1, which means that multicast packets do not leave the local network unless the user program explicitly requests it. The argument is an integer.
-
IP_MULTICAST_LOOP
-
Set or read a boolean integer argument that determines whether sent multicast packets should be looped back to the local sockets.
-
IP_ADD_MEMBERSHIP
-
Join a multicast group. The argument is a struct ip_mreqn structure.
-
    struct ip_mreqn {
        struct in_addr imr_multiaddr; /* IP multicast group address */
        struct in_addr imr_address;   /* IP address of local interface */
        int            imr_ifindex;   /* interface index */
    };
-
imr_multiaddr contains the address of the multicast group the application wants to join or leave. It must be a valid multicast address. imr_address is the address of the local interface with which the system should join the multicast group; if it is equal to INADDR_ANY, an appropriate interface is chosen by the system. imr_ifindex is the interface index of the interface that should join or leave the imr_multiaddr group, or 0 to indicate any interface.
-
For compatibility, the old ip_mreq structure is still supported. It differs from ip_mreqn only by not including the imr_ifindex field. Only valid as a setsockopt(2).
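Joining and leaving a group with struct ip_mreqn can be sketched as follows. The helper names make_mreqn(), join_group(), and leave_group() are hypothetical, and the code assumes glibc, which exposes struct ip_mreqn through <netinet/in.h>.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes struct ip_mreqn and inet_aton() on glibc */
#endif
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Fill *req for the given group address and interface index (0 means
 * any interface). Returns 0, or -1 if group is not a multicast address. */
int make_mreqn(const char *group, int ifindex, struct ip_mreqn *req)
{
    memset(req, 0, sizeof(*req));
    if (inet_aton(group, &req->imr_multiaddr) == 0 ||
        !IN_MULTICAST(ntohl(req->imr_multiaddr.s_addr)))
        return -1;
    req->imr_address.s_addr = htonl(INADDR_ANY); /* let the system choose */
    req->imr_ifindex = ifindex;
    return 0;
}

/* Join the multicast group on fd. Returns 0 on success, -1 on error. */
int join_group(int fd, const char *group, int ifindex)
{
    struct ip_mreqn req;

    if (make_mreqn(group, ifindex, &req) < 0)
        return -1;
    return setsockopt(fd, SOL_IP, IP_ADD_MEMBERSHIP, &req, sizeof(req));
}

/* Leave a group previously joined with the same arguments. */
int leave_group(int fd, const char *group, int ifindex)
{
    struct ip_mreqn req;

    if (make_mreqn(group, ifindex, &req) < 0)
        return -1;
    return setsockopt(fd, SOL_IP, IP_DROP_MEMBERSHIP, &req, sizeof(req));
}
```

join_group() can fail on hosts without a multicast-capable interface or route, so callers should check its return value and errno.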
-
IP_DROP_MEMBERSHIP
-
Leave a multicast group. The argument is an ip_mreqn or ip_mreq structure, used as for IP_ADD_MEMBERSHIP.
-
IP_MULTICAST_IF
-
Set the local device for a multicast socket. The argument is an ip_mreqn or ip_mreq structure, used as for IP_ADD_MEMBERSHIP.
-
When an invalid socket option is passed, ENOPROTOOPT is returned.
Sysctls
The IP protocol supports the sysctl interface to configure some global options. The sysctls can be accessed by reading or writing the /proc/sys/net/ipv4/* files or by using the sysctl(2) interface.
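Reading one of these values through the /proc interface can be sketched as follows. The helper name read_ipv4_sysctl() is hypothetical, and it only handles sysctls that hold a single integer.

```c
#include <stdio.h>

/* Read a single-integer IPv4 sysctl, for example "ip_default_ttl",
 * from /proc/sys/net/ipv4/. Returns 0 on success, -1 if the file
 * cannot be opened or does not start with an integer. */
int read_ipv4_sysctl(const char *name, long *value)
{
    char path[256];
    FILE *f;
    int ok;
    int n = snprintf(path, sizeof(path), "/proc/sys/net/ipv4/%s", name);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1; /* name too long for the path buffer */

    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

Writing a sysctl works the same way with the file opened for writing, but it normally requires root privileges.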
-
ip_default_ttl
-
Set the default time-to-live value of outgoing packets. This can be changed per socket with the IP_TTL option.
-
ip_forward
-
Enable IP forwarding with a boolean flag. IP forwarding can also be set on a per-interface basis.
-
ip_dynaddr
-
Enable dynamic socket address and masquerading entry rewriting on interface address change. This is useful for dialup interfaces with changing IP addresses. 0 means no rewriting, 1 turns it on, and 2 enables verbose mode.
-
ip_autoconfig
-
Not documented.
-
ip_local_port_range
-
Contains two integers that define the default local port range allocated to sockets. Allocation starts with the first number and ends with the second number. Note that these should not conflict with the ports used by masquerading (although the case is handled). Also, arbitrary choices may cause problems with some firewall packet filters that make assumptions about the local ports in use. The first number should be at least greater than 1024, and preferably greater than 4096, to avoid clashes with well-known ports and to minimize firewall problems.
-
ip_no_pmtu_disc
-
If enabled, do not do Path MTU Discovery for TCP sockets by default. Path MTU Discovery may fail if misconfigured firewalls (that drop all ICMP packets) or misconfigured interfaces (e.g., a point-to-point link where the two ends do not agree on the MTU) are on the path. It is better to fix the broken routers on the path than to turn off Path MTU Discovery globally, because not doing it incurs a high cost to the network.
-
ipfrag_high_thresh, ipfrag_low_thresh
-
If the amount of queued IP fragments reaches ipfrag_high_thresh, the queue is pruned down to ipfrag_low_thresh. Each contains an integer with the number of bytes.
-
ip_always_defrag
-
[New with kernel 2.2.13; in earlier kernel versions this feature was controlled at compile time by the CONFIG_IP_ALWAYS_DEFRAG option.]
When this boolean flag is enabled (not equal to 0), incoming fragments (parts of IP packets that arose when some host between origin and destination decided that the packets were too large and cut them into pieces) will be reassembled (defragmented) before being processed, even if they are about to be forwarded.
Only enable this if running either a firewall that is the sole link to your network or a transparent proxy; never turn it on for a normal router or host. Otherwise, fragmented communication may be disturbed when the fragments travel over different links. Defragmentation also has a large memory and CPU time cost.
This is automatically turned on when masquerading or transparent proxying is configured.
-
neigh/*
-
See arp(7).
IOCTLs
All ioctls described in socket(7) apply to ip.
The ioctls to configure firewalling are documented in ipfw(7) from the ipchains package.
The ioctls to configure generic device parameters are described in netdevice(7).
Notes
Be very careful with the SO_BROADCAST option; it is not privileged in Linux. It is easy to overload the network with careless broadcasts. For new application protocols it is better to use a multicast group instead of broadcasting. Broadcasting is discouraged.
Some other BSD sockets implementations provide the IP_RCVDSTADDR and IP_RECVIF socket options to get the destination address and the interface of received datagrams. Linux has the more general IP_PKTINFO for the same task.
Errors
Enobufs, eperm for eacces, etc .)
-
ENOTCONN
-
The operation is only defined on a connected socket, but the socket was not connected.
-
EINVAL
-
An invalid argument was passed. For send operations this can be caused by sending to a blackhole route.
-
EMSGSIZE
-
The datagram is bigger than an MTU on the path and it cannot be fragmented.
-
EACCES
-
The user tried to execute an operation without the necessary permissions. These include:
Sending a packet to a broadcast address without having the SO_BROADCAST flag set.
Sending a packet via a prohibit route.
Modifying firewall settings without CAP_NET_ADMIN or an effective user ID of 0.
Binding to a reserved port without the CAP_NET_BIND_SERVICE capability or an effective user ID of 0.
-
EADDRINUSE
-
Tried to bind to an address already in use.
-
ENOMEM and ENOBUFS
-
Not enough memory available.
-
ENOPROTOOPT and EOPNOTSUPP
-
Invalid socket option passed.
-
EPERM
-
The user does not have permission to set a high priority, change the configuration, or send signals to the requested process or group.
-
EADDRNOTAVAIL
-
A non-existent interface was requested, or the requested source address was not local.
-
EAGAIN
-
The operation on a non-blocking socket would block.
-
ESOCKTNOSUPPORT
-
The socket is not configured, or an unknown socket type was requested.
-
EISCONN
-
connect(2) was called on an already connected socket.
-
EALREADY
-
A connection operation on a non-blocking socket is already in progress.
-
ECONNABORTED
-
A connection was closed during an accept(2).
-
EPIPE
-
The connection was unexpectedly closed or shut down by the other end.
-
ENOENT
-
SIOCGSTAMP was called on a socket where no packet had arrived.
-
EHOSTUNREACH
-
No valid routing table entry matches the destination address. This error can be caused by an ICMP message from a remote router or by the local routing table.
-
ENODEV
-
The network device is not available or not capable of sending IP.
-
ENOPKG
-
A kernel subsystem was not configured.
-
ENOBUFS, ENOMEM
-
Not enough free memory. This often means that the memory allocation is limited by the socket buffer limits, not by the system memory, but this is not 100% consistent.
Other errors may be generated by the overlaying protocols; see tcp(7), raw(7), udp(7), and socket(7).
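As an illustration of how these errors surface to a program, the sketch below provokes EADDRINUSE by binding two TCP sockets to the same port. The helper names try_bind() and bound_port() are hypothetical, and returning the errno value directly is a convention chosen for this example only.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Bind fd to INADDR_ANY on the given port (host byte order).
 * Returns 0 on success, or the errno value reported by bind(2). */
int try_bind(int fd, unsigned short port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return errno;
    return 0;
}

/* Return the port (host byte order) a bound socket ended up on,
 * or 0 if it cannot be determined. */
unsigned short bound_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return 0;
    return ntohs(addr.sin_port);
}
```

Binding a first socket with try_bind(a, 0) and then calling try_bind(b, bound_port(a)) on a second socket returns EADDRINUSE, provided SO_REUSEADDR is not set on both sockets.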