Socket Programming Overview
Network programming is inseparable from sockets. Most of the time I have simply memorized how to use them; this time I want to understand some of the lower-level details, which are, after all, the foundation of network programming.
(1) Socket address structure
Most socket functions require a pointer to a socket address structure as a parameter. Each protocol family defines its own socket address structure, and the names of these structures all begin with sockaddr_.
IPv4 socket address structure
The IPv4 socket address structure is often called the "Internet socket address structure". It is named sockaddr_in and is defined in the <netinet/in.h> header:
/* Internet address. */
typedef uint32_t in_addr_t;
struct in_addr {
    in_addr_t s_addr;            /* 32-bit IPv4 address */
};

/* The definition in the book (below) is not quite the same as the one in the source code, which is posted later. */
/* This is probably related to the standards the system supports. */
/* In the book: */
struct sockaddr_in {
    uint8_t        sin_len;      /* unsigned 8-bit integer (1 byte) */
    sa_family_t    sin_family;   /* 8-bit integer (1 byte) */
    in_port_t      sin_port;     /* at least 16-bit unsigned (2 bytes) */
    struct in_addr sin_addr;     /* 32 bits (4 bytes) */
    char           sin_zero[8];  /* (8 bytes) */
};
/* The size of the structure is at least 16 bytes */
The book notes that sa_family_t can be any unsigned integer type. In implementations that support the length field (sin_len), it is usually an 8-bit unsigned integer, whereas in implementations without a length field it is a 16-bit unsigned integer.
My Ubuntu 14.04 system falls into the latter case, so its source code differs from the definition given in the book:
struct sockaddr_in
  {
    __SOCKADDR_COMMON (sin_);    /* macro definition */
    in_port_t sin_port;          /* Port number. */
    struct in_addr sin_addr;     /* Internet address. */

    /* Pad to size of `struct sockaddr'. */
    /* This part makes sockaddr_in and sockaddr equal in size. */
    unsigned char sin_zero[sizeof (struct sockaddr) -
                           __SOCKADDR_COMMON_SIZE -
                           sizeof (in_port_t) -
                           sizeof (struct in_addr)];
  };
__SOCKADDR_COMMON (sin_) is a macro, defined as follows:
#define __SOCKADDR_COMMON(sa_prefix) \
  sa_family_t sa_prefix##family
In plain words, the ## operator pastes the sa_prefix argument onto the token family, so __SOCKADDR_COMMON (sin_); is equivalent to the following declaration:
sa_family_t sin_family;
This also means that the source code on my computer has no sin_len field, which is exactly one of the cases the book describes:
In implementations that support the length field, sa_family_t is an 8-bit unsigned integer; in implementations that do not, it is a 16-bit unsigned integer.
So on my system sa_family_t should be a 16-bit (2-byte) unsigned integer, and indeed the source contains the following definition:
typedef unsigned short int sa_family_t;   /* unsigned short int, 2 bytes */
The POSIX specification requires only three fields in this structure: sin_family, sin_addr (in_addr_t is at least a 32-bit unsigned integer), and sin_port (in_port_t is at least a 16-bit unsigned integer). Implementations may add extra fields, and almost all of them pad the structure to at least 16 bytes, which lets it be converted to and from struct sockaddr with a pointer cast.
A brief word on sockaddr_in versus sockaddr: application code normally fills in the former, while the socket functions and the kernel accept the latter. For details, see: sockaddr and sockaddr_in differences.
A socket address structure is always passed to a socket function by reference, as a pointer to the structure. Because these functions are generic, that pointer has to accept the address structure of any protocol family, which is why the generic struct sockaddr exists.
Two more important points:
1. The IPv4 address and port number are always stored in the socket address structure in network byte order; note the difference from host byte order.
2. The 32-bit IPv4 address can be accessed in two different ways, because sin_addr is itself a structure. Suppose serv is an Internet socket address structure:
(a) serv.sin_addr refers to the 32-bit IPv4 address as a struct in_addr.
(b) serv.sin_addr.s_addr refers to the same 32-bit IPv4 address as an in_addr_t (usually a 32-bit unsigned integer).
Which form to use depends on the situation.
IPv6 socket address structure
Defined in the <netinet/in.h> header:
struct in6_addr {
    uint8_t s6_addr[16];            /* 128-bit IPv6 address */
};

#define SIN6_LEN

struct sockaddr_in6 {
    uint8_t         sin6_len;
    sa_family_t     sin6_family;
    in_port_t       sin6_port;
    uint32_t        sin6_flowinfo;
    struct in6_addr sin6_addr;
    uint32_t        sin6_scope_id;
};
As with IPv4, the source code on my computer differs somewhat from the book, probably again because of differences in the standards supported:
struct sockaddr_in6
  {
    __SOCKADDR_COMMON (sin6_);
    in_port_t sin6_port;          /* Transport layer port # */
    uint32_t sin6_flowinfo;       /* IPv6 flow information */
    struct in6_addr sin6_addr;    /* IPv6 address */
    uint32_t sin6_scope_id;       /* IPv6 scope-id */
  };
At the socket level there are a few small differences between IPv6 and IPv4. The IPv6 address family is AF_INET6, whereas the IPv4 address family is AF_INET. The IPv6 socket address structure is at least 28 bytes, while the IPv4 structure is at least 16 bytes. A new generic socket address structure, struct sockaddr_storage, was defined as part of the IPv6 sockets API. It overcomes some of the shortcomings of the existing struct sockaddr and is large enough to hold any socket address structure supported by the system.
(2) Value-result parameters
When a socket address structure is passed to a socket function, it is always passed by pointer, and the length of the structure (obtained with sizeof) is passed as another argument. How that length is passed depends on the direction in which the structure travels: from the process to the kernel, or from the kernel to the process.
Process to kernel
Three socket functions pass a socket address structure in this direction: bind, connect, and sendto. For example:
struct sockaddr_in serv;
/* fill in serv */
connect(sockfd, (struct sockaddr *)&serv, sizeof(serv));
Both the pointer and the size of what it points to are passed to the kernel, so the kernel knows exactly how much data to copy in from the process.
Kernel to process
Four functions pass a socket address structure in this direction: accept, recvfrom, getpeername, and getsockname.
As in the process-to-kernel case, they take a pointer to a socket address structure and a length argument, but here the length is itself passed as a pointer to an integer. The reason is easy to see: on the call, the value tells the kernel the size of the structure so that the kernel does not write past its end; on return, the value is a result that tells the process how much information the kernel actually stored in the structure.
PS: For IPv4, the size of sockaddr_in is 16, both as the value passed in and as the result returned.
Summary
For parameters passed from the process to the kernel, the kernel must be told how many bytes it needs to read.
For parameters passed from the kernel to the process, the process must be told how many bytes the kernel has written.
(3) Byte ordering functions
This part focuses on how to convert between host byte order and network byte order.
Big-endian and little-endian
I never really got a firm grasp of this topic back in my computer organization course, so this time I want to understand it thoroughly!
First, take a 16-bit integer whose hexadecimal representation is 0x1234. Then we say:
The high-order byte of this number is 0x12
The low-order byte of this number is 0x34
Memory is addressed byte by byte, so the two bytes occupy two consecutive address offsets. Little-endian and big-endian are defined as follows:
Storing the low-order byte at the starting address (the smaller offset) is called little-endian mode
Storing the high-order byte at the starting address (the smaller offset) is called big-endian mode
For details, refer to: Big-endian and little-endian modes explained
By this definition, 0x1234 is laid out in memory as follows in the two modes:
Address offset | Big-endian mode | Little-endian mode
0x00 | 0x12 (high-order byte) | 0x34 (low-order byte)
0x01 | 0x34 (low-order byte) | 0x12 (high-order byte)
Reference code:
/* test.c */
#include <stdio.h>

int main(void)
{
    /* The union lets us look at the individual bytes of a short. */
    union {
        short s;
        char  c[sizeof(short)];
    } un;

    un.s = 0x0102;
    if (sizeof(short) == 2) {
        if (un.c[0] == 1 && un.c[1] == 2)
            printf("big-endian\n");
        else if (un.c[0] == 2 && un.c[1] == 1)
            printf("little-endian\n");
        else
            printf("unknown\n");
    }
    return 0;
}
The byte order used by a given system, whether big-endian or little-endian, is called the "host byte order". Its counterpart is the network byte order: the Internet protocols transmit multibyte integers in big-endian byte order. The following four functions convert between the two byte orders:
#include <netinet/in.h>

/* Both return the value in network byte order */
uint16_t htons(uint16_t host16bitvalue);
uint32_t htonl(uint32_t host32bitvalue);

/* Both return the value in host byte order */
uint16_t ntohs(uint16_t net16bitvalue);
uint32_t ntohl(uint32_t net32bitvalue);

/*
 * These four functions are often defined as macros.
 * h: host
 * n: network
 * l: long
 * s: short
 */
(4) Byte manipulation functions
Byte manipulation functions
#include <strings.h>

void bzero(void *dest, size_t nbytes);                          /* zero-fill (initialize) */
void bcopy(const void *src, void *dest, size_t nbytes);         /* copy */
int  bcmp(const void *ptr1, const void *ptr2, size_t nbytes);   /* 0 if equal, nonzero otherwise */
The C standard library has similar functions:
#include <string.h>

void *memset(void *dest, int c, size_t len);                      /* corresponds to bzero */
void *memcpy(void *dest, const void *src, size_t nbytes);         /* corresponds to bcopy */
int   memcmp(const void *ptr1, const void *ptr2, size_t nbytes);  /* 0 if equal, otherwise <0 or >0 */
When using them, pay attention to the order of src and dest: bcopy takes the source first, while memcpy takes the destination first. If you cannot remember, type this in a terminal:
man memcpy
That shows what each parameter of these functions means.
(5) Address conversion functions
These functions are inet_aton, inet_addr, inet_ntoa, inet_pton, and inet_ntop.
They convert Internet addresses between ASCII strings and network byte order binary values: people prefer to write addresses as strings, while the values stored in socket address structures are binary.
#include <arpa/inet.h>

/* Convert a dotted-decimal string to an IPv4 address */
int inet_aton(const char *strptr, struct in_addr *addrptr);
in_addr_t inet_addr(const char *strptr);

/* Return a pointer to a dotted-decimal string */
char *inet_ntoa(struct in_addr inaddr);
The functions above are either deprecated or have better replacements. inet_pton and inet_ntop are newer functions that arrived with IPv6 and work with both IPv4 and IPv6 addresses.
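#include <arpa/inet.h>

/* Returns 1 on success, 0 if strptr is not a valid address string, -1 on error */
int inet_pton(int family, const char *strptr, void *addrptr);

const char *inet_ntop(int family, const void *addrptr, char *strptr, size_t len);

/*
 * p: presentation
 * n: numeric
 * inet_pton converts from presentation format to numeric format.
 * inet_ntop converts from numeric format to presentation format
 * (strptr must point to a buffer allocated beforehand; len guards against buffer overflow).
 */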
The family parameter can be either AF_INET or AF_INET6.
Example
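inet_pton(AF_INET, cp, &foo.sin_addr);
/* For a valid address string, the statement above is equivalent to: */
foo.sin_addr.s_addr = inet_addr(cp);

/* ---------------------------- */

char str[INET_ADDRSTRLEN];
ptr = inet_ntop(AF_INET, &foo.sin_addr, str, sizeof(str));
/* The statement above is equivalent to: */
ptr = inet_ntoa(foo.sin_addr);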
The UNIX Network Programming book wraps inet_ntop in another layer, sock_ntop, so that callers do not need to care about the protocol family:
#include "unp.h"

#ifdef HAVE_SOCKADDR_DL_STRUCT
#include <net/if_dl.h>
#endif

/* include sock_ntop */
char *
sock_ntop(const struct sockaddr *sa, socklen_t salen)
{
    char        portstr[8];
    static char str[128];       /* Unix domain is largest */

    switch (sa->sa_family) {
    case AF_INET: {
        struct sockaddr_in *sin = (struct sockaddr_in *) sa;

        if (inet_ntop(AF_INET, &sin->sin_addr, str, sizeof(str)) == NULL)
            return (NULL);
        if (ntohs(sin->sin_port) != 0) {
            /* append ":port" only when a port is set */
            snprintf(portstr, sizeof(portstr), ":%d", ntohs(sin->sin_port));
            strcat(str, portstr);
        }
        return (str);
    }
/* end sock_ntop */

#ifdef IPV6
    case AF_INET6: {
        struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *) sa;

        str[0] = '[';
        if (inet_ntop(AF_INET6, &sin6->sin6_addr, str + 1, sizeof(str) - 1) == NULL)
            return (NULL);
        if (ntohs(sin6->sin6_port) != 0) {
            /* "[address]:port" form */
            snprintf(portstr, sizeof(portstr), "]:%d", ntohs(sin6->sin6_port));
            strcat(str, portstr);
            return (str);
        }
        return (str + 1);       /* no port: skip the leading '[' */
    }
#endif
    /* ... */
    }
    return (NULL);
}

/* External interface: the capitalized wrapper reports errors itself */
char *
Sock_ntop(const struct sockaddr *sa, socklen_t salen)
{
    char *ptr;

    if ((ptr = sock_ntop(sa, salen)) == NULL)
        err_sys("sock_ntop error");     /* inet_ntop() sets errno */
    return (ptr);
}
(6) I/O functions
The functions we use most often are read and write. The book points out that a read or write on a stream socket may transfer fewer bytes than requested, because of buffer limits, so we may have to call read or write several times. For convenience, the author wraps them once more:
readn: reads n bytes from a descriptor.
writen: writes n bytes to a descriptor.
readline: reads a line of text from a descriptor, one byte at a time.
The book gives the implementations of these functions. They are nothing more than loops that call read and write internally, so for now it is enough to know how to use them.
(7) Summary
Socket programming basics: these cover the data structures, a number of functions, and the author's wrappers around those functions, roughly in the following order:
Socket address structures -> functions that take a socket address structure as a parameter -> conversion between network byte order and host byte order (used for port numbers) -> buffer operations (copying, etc.) -> I/O operations.
Understanding the socket address structures is very important. Beyond that, network programming is largely a matter of calling the right specialized functions. There is no need to memorize them: if you cannot remember one, use the man command in a Linux terminal to look up its parameters, return value, and other details.
UNIX Network Programming notes (2)-Introduction to socket programming