UNIX Network Programming study notes, Chapter 11: Name and Address Conversions


One, Domain Name System (DNS)

1. Brief introduction

DNS is primarily used for mapping between host names and IP addresses.

A host name can be a simple name, such as ljm, or a fully qualified domain name (FQDN).

2. Resource records

Entries in the DNS are called resource records (RRs). We are interested in only a few RR types:

A: an A record maps a host name to a 32-bit IPv4 address.

AAAA: a "quad A" record maps a host name to a 128-bit IPv6 address.

For example:

ljm  IN  A

     IN  AAAA  3ffe:1f8d:9bc3:1234:ef93:ac89

PTR: a pointer record maps an IP address to a host name. For an IPv4 address, the four bytes of the 32-bit address are reversed and in-addr.arpa is appended. For an IPv6 address, the 32 four-bit nibbles of the 128-bit address are reversed and ip6.arpa is appended.

For example, the host ljm above has two PTR records: one under in-addr.arpa and one under ip6.arpa.
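The IPv4 byte reversal can be sketched in a few lines of C. This is only an illustration and not part of the original notes: the helper name `ptr_name_ipv4` is hypothetical, and it builds the query name without performing any lookup.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: build the in-addr.arpa name used for a PTR query.
 * Returns 0 on success, -1 if `dotted` is not a valid IPv4 address
 * or `out` is too small. */
int ptr_name_ipv4(const char *dotted, char *out, size_t outlen)
{
    struct in_addr addr;
    /* The address is stored in network byte order: b[0] is the first byte. */
    const unsigned char *b = (const unsigned char *) &addr;
    int n;

    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return -1;
    /* Reverse the four bytes, then append the in-addr.arpa suffix. */
    n = snprintf(out, outlen, "%u.%u.%u.%u.in-addr.arpa",
                 (unsigned) b[3], (unsigned) b[2],
                 (unsigned) b[1], (unsigned) b[0]);
    return (n > 0 && (size_t) n < outlen) ? 0 : -1;
}
```

For the documentation address 192.0.2.10, the helper produces 10.2.0.192.in-addr.arpa.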

3. Resolvers and name servers

1> Name servers

Organizations typically run one or more name servers (sometimes called DNS servers), which hold the resource records that map between host names and IP addresses.

2> Resolvers

Applications, both clients and servers, convert host names and IP addresses by calling functions in the resolver.

The typical resolver functions are gethostbyname and gethostbyaddr.

3> The lookup process

When our code needs to convert a host name or an IP address, it calls a resolver function. The resolver reads the IP addresses of the local name servers from its configuration file (/etc/resolv.conf) and sends a UDP query to one of them. If the local name server does not know the answer, it queries other name servers across the Internet. If the answer is too large for a UDP datagram, the resolver automatically switches to TCP.

Two, gethostbyname and gethostbyaddr functions

1. gethostbyname function

Maps a host name to its IPv4 addresses.

The function queries only A records, so it returns only IPv4 addresses, even if the host also has an IPv6 address.

#include <netdb.h>

struct hostent *gethostbyname(const char *hostname);
// Returns: non-null pointer on success; NULL on error, with h_errno set

1> Let's take a look at the structure hostent:

struct hostent {
    char  *h_name;       // canonical name
    char **h_aliases;    // host aliases
    int    h_addrtype;   // address family: AF_INET
    int    h_length;     // address length: 4
    char **h_addr_list;  // IPv4 addresses
};

The canonical name can also be understood as the fully qualified domain name.

h_aliases: a host may have several aliases, so this member is a NULL-terminated array of pointers to strings (char **).

h_addrtype and h_length: because gethostbyname maps only IPv4 addresses, these two values are always AF_INET and 4. Since they never change here, the two members feel a little superfluous; presumably they exist because the structure was designed to be independent of the address family.

h_addr_list: a host may have several IP addresses, so this member is also a NULL-terminated array of pointers. The addresses are stored in binary form, so they must be converted to presentation form before being printed. Note that the element type is char *. Why not an array of in_addr structures? Presumably for the same reason: each entry is simply h_length raw bytes of whatever address family is in use.

2> When gethostbyname returns an error:

When the function fails it does not set errno; it sets the global integer h_errno instead. We generally use the function hstrerror to turn an h_errno value into a readable message.

const char *hstrerror(int err);

3> Let's write an example program that takes any number of host names and prints the canonical name, aliases, and IP addresses for each.

#include <stdio.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(int argc, char **argv)
{
    char *ptr, **pptr;
    char buff[INET_ADDRSTRLEN];      /* room for a dotted-decimal string */
    struct hostent *hptr;

    while (--argc > 0) {
        ptr = *++argv;
        if ((hptr = gethostbyname(ptr)) == NULL) {
            printf("error for host %s: %s\n", ptr, hstrerror(h_errno));
            continue;
        }
        printf("canonical name: %s\n", hptr->h_name);
        for (pptr = hptr->h_aliases; *pptr != NULL; pptr++)
            printf("alias: %s\n", *pptr);
        switch (hptr->h_addrtype) {
        case AF_INET:
            for (pptr = hptr->h_addr_list; *pptr != NULL; pptr++)
                printf("address: %s\n",
                       inet_ntop(hptr->h_addrtype, *pptr, buff, sizeof(buff)));
            break;
        default:
            printf("unknown address type\n");
            break;
        }
    }
    return 0;
}

2. gethostbyaddr function

The opposite of the function above: it converts a binary IPv4 address to the corresponding host name.

#include <netdb.h>

struct hostent *gethostbyaddr(const char *addr, socklen_t len, int family);
// For IPv4, len is 4 and family is AF_INET
// Returns: non-null pointer on success; NULL on error, with h_errno set

The function queries the PTR records under in-addr.arpa.

The function returns the same structure as gethostbyname, but here we normally care only about its h_name member.
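A reverse lookup along these lines might look as follows. This is a hedged sketch rather than code from the notes: the helper name `reverse_lookup` is hypothetical, and whether a given address resolves depends entirely on the local resolver configuration.

```c
#include <stdio.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: reverse-map a dotted-decimal IPv4 string to a host name.
 * Returns 0 and fills `name` on success, -1 on a bad address or failed lookup. */
int reverse_lookup(const char *dotted, char *name, size_t namelen)
{
    struct in_addr addr;
    struct hostent *hp;

    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return -1;                               /* not a valid IPv4 address */
    hp = gethostbyaddr((const char *) &addr, sizeof(addr), AF_INET);
    if (hp == NULL) {
        fprintf(stderr, "gethostbyaddr(%s): %s\n", dotted, hstrerror(h_errno));
        return -1;
    }
    /* Copy h_name out, because the result lives in static storage. */
    snprintf(name, namelen, "%s", hp->h_name);
    return 0;
}
```

Copying h_name into a caller-supplied buffer means a later resolver call cannot overwrite the answer.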

Three, getservbyname and getservbyport functions

Just as we can use a host name in place of an IP address, we can use a service name in place of a port number.

Host: can be identified by either a host name or an IP address.

Service: can be identified by either a service name or a port number.

A service can support multiple protocols (TCP, UDP, SCTP), just as a process can listen on both a TCP socket and a UDP socket. A service generally has a single port number, for example 80 for HTTP.

The mapping between service names and port numbers is stored in a file (/etc/services).

So even if a port number changes, we only need to modify /etc/services, without having to recompile any programs.

Take a look at the contents of /etc/services:

# service-name  port/protocol  [aliases ...]  [# comment]
tcpmux          1/tcp                         # TCP Port Service Multiplexer
tcpmux          1/udp                         # TCP Port Service Multiplexer
http            80/tcp         www www-http   # WorldWideWeb HTTP
http            80/udp         www www-http   # HyperText Transfer Protocol
http            80/sctp                       # HyperText Transfer Protocol
https           443/tcp                       # http protocol over TLS/SSL
https           443/udp                       # http protocol over TLS/SSL
https           443/sctp                      # http protocol over TLS/SSL

1. getservbyname function

Maps a service name to a port number.

#include <netdb.h>

struct servent *getservbyname(const char *servname, const char *protoname);
// Returns: non-null pointer on success, NULL on error

This function searches the local /etc/services file. As shown above, each line of /etc/services carries three things: service name, port, and protocol. The function therefore takes two arguments, which together identify a port number.

The function returns a servent structure:

struct servent {
    char  *s_name;      // official service name
    char **s_aliases;   // alias list
    int    s_port;      // port number, network byte order
    char  *s_proto;     // protocol to use
};

Here we care only about the port. Note that it is returned in network byte order, so it can be assigned to the sin_port member of a sockaddr_in without any byte-order conversion.

The second argument may be NULL, in which case the port number returned depends on the implementation. This generally does not matter, because a service usually uses the same port number for every protocol it supports.

If protoname is specified, the service must support that protocol.

struct servent *ser;
ser = getservbyname("tcpmux", "tcp");    // OK
ser = getservbyname("tcpmux", "sctp");   // error: no tcpmux/sctp entry

2. getservbyport function

Maps a port number to a service name.

struct servent *getservbyport(int port, const char *protoname);
// Returns: non-null pointer on success, NULL on error

Note that the port argument must be in network byte order.

struct servent *ser;
ser = getservbyport(htons(1), "tcp");    // OK
ser = getservbyport(htons(1), NULL);     // OK
ser = getservbyport(htons(1), "sctp");   // error

In previous programs we used an IP address and a port number to identify a process on a target host. Now we can use a host name and a service name instead.

We change our earlier time-fetching (daytime) client to take a host name and a service name.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define MAXLINE 1024

void err_sys(const char *s)
{
    fprintf(stderr, "%s\n", s);
    exit(1);
}

int main(int argc, char **argv)
{
    int sockfd = -1;
    ssize_t nbytes;
    struct sockaddr_in servaddr;
    char buff[MAXLINE + 1];
    char str[INET_ADDRSTRLEN];
    struct hostent *hp;
    struct in_addr **pptr;    /* array of pointers, each pointing to a struct in_addr */
    struct servent *ser;

    if (argc != 3)            /* expect a host name and a service name */
        err_sys("usage: <hostname> <service>");
    if ((hp = gethostbyname(argv[1])) == NULL)
        err_sys("wrong hostname");
    pptr = (struct in_addr **) hp->h_addr_list;
    if ((ser = getservbyname(argv[2], "tcp")) == NULL)
        err_sys("wrong service name");

    for (; *pptr != NULL; pptr++) {
        if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
            continue;
        memset(&servaddr, 0, sizeof(servaddr));
        servaddr.sin_family = AF_INET;
        servaddr.sin_port = ser->s_port;      /* already in network byte order */
        memcpy(&servaddr.sin_addr, *pptr, sizeof(struct in_addr));
        printf("trying %s\n",
               inet_ntop(AF_INET, &servaddr.sin_addr, str, sizeof(str)));
        if (connect(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) == 0)
            break;                            /* success */
        close(sockfd);
    }
    if (*pptr == NULL)
        err_sys("unable to connect");

    while ((nbytes = read(sockfd, buff, MAXLINE)) > 0) {
        buff[nbytes] = 0;
        fputs(buff, stdout);
    }
    exit(0);
}

Here we take two command-line arguments: the server's host name and the service name.

We call gethostbyname to get the list of the server's IP addresses and try to connect to each in turn. If a connect fails, we must close the socket and create a new one before trying the next address; a socket whose connect has failed cannot simply be connected again.

Then we call getservbyname to get the port number. This is the service's well-known port, because the function learns it from the local /etc/services file. The server host binds the same well-known port for this service name.

Five, getaddrinfo function

gethostbyname and gethostbyaddr support only IPv4 addresses, so there is a newer function, getaddrinfo, that supports both IPv4 and IPv6.

getaddrinfo handles both the host name to IP address conversion and the service name to port number conversion.

#include <netdb.h>

int getaddrinfo(const char *hostname, const char *service,
                const struct addrinfo *hints, struct addrinfo **result);
// Returns: 0 on success, nonzero on error

1. hostname is the host name to look up, and service is the service name. hints describes the kind of results wanted; it influences what is returned and may be NULL. result is filled in by the function.

2. Why does result have two asterisks? The function returns a linked list, and *result is set to point to the first structure in the list. The memory is allocated inside the function; the caller supplies only a pointer, and the function must modify that pointer itself rather than what it points to. For example:

#include <stdlib.h>

void getaddrinfo1(char **s) { *s = malloc(16); }   /* modifies the caller's pointer */
void getaddrinfo2(char *s)  { s = malloc(16); }    /* modifies only the local copy (and leaks) */

int main(void)
{
    char *s1, *s2;
    getaddrinfo1(&s1);   /* s1 now points to the allocated memory */
    getaddrinfo2(s2);    /* s2 is unchanged */
    free(s1);
    return 0;
}

Here s1 points to the memory allocated inside the function, whereas s2 is still an uninitialized (wild) pointer after the call.

3. Here's a look at the structure addrinfo:

struct addrinfo {
    int              ai_flags;       // AI_PASSIVE, AI_CANONNAME
    int              ai_family;      // AF_xxx
    int              ai_socktype;    // SOCK_xxx
    int              ai_protocol;    // 0 or IPPROTO_xxx
    socklen_t        ai_addrlen;     // length of ai_addr
    char            *ai_canonname;   // ptr to canonical name for host
    struct sockaddr *ai_addr;        // ptr to socket address structure
    struct addrinfo *ai_next;        // ptr to next structure in linked list
};

The first four members are what we set in the hints argument; the last four are filled in for each structure returned through result. The values of the first four influence the values of the last four.

1> ai_flags: the common flag values are AI_PASSIVE and AI_CANONNAME.

AI_PASSIVE: the socket will be used for a passive open.

AI_CANONNAME: tells the function to return the canonical name of the host, in ai_canonname.

2> ai_family selects whether IPv4 (AF_INET) or IPv6 (AF_INET6) addresses are returned for the host name.

3> ai_socktype selects the socket type, for example SOCK_STREAM for TCP or SOCK_DGRAM for UDP, because a service name may correspond to several protocols.

4> ai_protocol names a specific protocol and is used when ai_socktype alone cannot identify one. For example, SCTP is also a stream protocol and its socket type is also SOCK_STREAM, so if a service supports both TCP and SCTP we must specify the protocol, for example IPPROTO_TCP or IPPROTO_SCTP.

The last four members are self-explanatory and are not discussed further.

4. If hints is NULL, the function behaves as if ai_flags, ai_socktype, and ai_protocol were 0 and ai_family were AF_UNSPEC.

5. Why does the function return a linked list?

When the host name has several IP addresses, one structure is returned for each address.

When ai_socktype is not specified and the service name supports several protocols, one structure is returned for each protocol.

So when the host name has 2 IP addresses, the service supports both TCP and UDP, and ai_socktype is not specified, 2 * 2 = 4 structures are returned.

The order of the structures in the returned list is not guaranteed.

6. The members of each returned structure can be used directly in the socket and connect calls: ai_family, ai_socktype, and ai_protocol for socket, and ai_addr with ai_addrlen for connect. The address is already in binary form inside a generic sockaddr structure, which keeps the calling code protocol-independent, and both the IP address and the port number are in network byte order, so no conversion functions are needed.
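Walking the returned list might look like the sketch below. The helper name `print_addrs` and its parameters are assumptions made for illustration, and it prints only IPv4 and IPv6 entries; the number of structures returned varies by system and by the hints supplied.

```c
#include <stdio.h>
#include <string.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: resolve host/service, print every returned address,
 * and return how many structures were in the list (-1 on error). */
int print_addrs(const char *host, const char *service, int socktype, int flags)
{
    struct addrinfo hints, *res, *ai;
    char text[INET6_ADDRSTRLEN];
    int ret, count = 0;

    memset(&hints, 0, sizeof(hints));   /* members we do not set must be zero */
    hints.ai_family = AF_UNSPEC;        /* accept IPv4 and IPv6 */
    hints.ai_socktype = socktype;       /* 0 means any socket type */
    hints.ai_flags = flags;

    if ((ret = getaddrinfo(host, service, &hints, &res)) != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(ret));
        return -1;
    }
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        const void *addr = NULL;
        unsigned port = 0;

        if (ai->ai_family == AF_INET) {
            const struct sockaddr_in *sin = (const struct sockaddr_in *) ai->ai_addr;
            addr = &sin->sin_addr;
            port = ntohs(sin->sin_port);
        } else if (ai->ai_family == AF_INET6) {
            const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *) ai->ai_addr;
            addr = &sin6->sin6_addr;
            port = ntohs(sin6->sin6_port);
        }
        if (addr != NULL && inet_ntop(ai->ai_family, addr, text, sizeof(text)) != NULL)
            printf("family=%d socktype=%d protocol=%d %s port %u\n",
                   ai->ai_family, ai->ai_socktype, ai->ai_protocol, text, port);
        count++;
    }
    freeaddrinfo(res);                  /* release the whole list */
    return count;
}
```

Passing 0 for socktype shows the "one structure per protocol" effect described in item 5; passing SOCK_STREAM narrows the list.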

7. Some common inputs to getaddrinfo

The function has six inputs that can be specified: hostname, service, and the first four members of hints.

(1) Client

For TCP and UDP clients, we use this function to set up a connection to the server, so we specify both hostname and service. As for the four members of hints, if we know which type of socket we are dealing with, we should specify ai_socktype or ai_protocol. A client generally does not set AI_PASSIVE in ai_flags. We leave ai_family unspecified, because the host may have both IPv4 and IPv6 addresses; we then try socket and connect on each returned address in turn.

(2) Server

For servers, we use this function only to obtain the port number. We therefore usually do not specify a host name, only the service. When the host name is NULL, the IP address in the returned socket address structure is the wildcard address, which is exactly what we want.

Note that if ai_family is unspecified or is AF_UNSPEC, at least two structures are returned on a host that supports both protocols: one containing the IPv4 wildcard address INADDR_ANY and the other containing the IPv6 wildcard address IN6ADDR_ANY_INIT. We can then create a socket for each and use select to wait on both.

For the first four members of hints, we set ai_flags to AI_PASSIVE, and we should specify the socket type to avoid getting back multiple structures.

Note: to call accept we need a socket address structure of the right size to receive the client's address. How is that size determined?

In the linked list returned by the function, each addrinfo structure carries the size of its socket address structure in ai_addrlen, so we can allocate based on that value.
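The server-side case can be checked without opening a socket. The following sketch resolves only the IPv4 wildcard address; the helper name `passive_ipv4_addr` is hypothetical, and restricting ai_family to AF_INET is a simplification made for the example.

```c
#include <string.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: resolve the IPv4 wildcard address for a server.
 * A NULL host name together with AI_PASSIVE asks getaddrinfo for the
 * wildcard address. Returns 0 and fills *out on success, -1 on error. */
int passive_ipv4_addr(const char *service, struct sockaddr_in *out)
{
    struct addrinfo hints, *res;

    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = AI_PASSIVE;        /* socket will be used for bind/accept */
    hints.ai_family = AF_INET;          /* restrict the results to IPv4 */
    hints.ai_socktype = SOCK_STREAM;    /* restrict the results to TCP */

    if (getaddrinfo(NULL, service, &hints, &res) != 0)
        return -1;
    /* ai_addrlen tells us how large the socket address structure is. */
    if (res->ai_addrlen != sizeof(*out)) {
        freeaddrinfo(res);
        return -1;
    }
    memcpy(out, res->ai_addr, res->ai_addrlen);
    freeaddrinfo(res);
    return 0;
}
```

The resulting structure holds INADDR_ANY and the requested port in network byte order, ready to be passed to bind.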

8. The getaddrinfo function is very powerful, but it is also quite complex to use.

Six, gai_strerror function

As mentioned above, getaddrinfo returns 0 on success and a nonzero integer on failure.

The gai_strerror function converts that error value into a readable message.

const char *gai_strerror(int error);

Typical use:

int ret = getaddrinfo(...);
if (ret != 0) {
    fprintf(stderr, "getaddrinfo error: %s\n", gai_strerror(ret));
}

Seven, freeaddrinfo function

getaddrinfo returns a linked list of structures whose storage is allocated dynamically (for example with malloc). We therefore need to release that memory when we are finished with it, by calling freeaddrinfo.

void freeaddrinfo(struct addrinfo *ai);

Typical usage:

struct addrinfo *result;
int ret = getaddrinfo(..., &result);
/* ... use the list ... */
freeaddrinfo(result);

This releases the entire linked list.

There is a pitfall, however. Suppose we walk the list to find the structure we need, copy that structure, and then call freeaddrinfo to release all of the memory.

Copying only the structure itself is not enough, because the structure contains pointers (to the socket address structure and to the canonical name). When copying, the memory those pointers point to must be copied as well. Otherwise freeaddrinfo releases that memory and the pointers in our copy are left pointing at freed memory, which is dangerous.

So when copying an addrinfo structure, remember that a deep copy is required.
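One possible shape for such a deep copy is sketched below. Both function names (`copy_addrinfo`, `free_addrinfo_copy`) are hypothetical; the copy is allocated with malloc, so it must be released with the matching helper, not with freeaddrinfo.

```c
#include <stdlib.h>
#include <string.h>
#include <netdb.h>
#include <sys/socket.h>

/* Hypothetical helper: deep-copy one addrinfo node so the copy stays valid
 * after freeaddrinfo() releases the original list. The copy is a single
 * node (ai_next is NULL); release it with free_addrinfo_copy(). */
struct addrinfo *copy_addrinfo(const struct addrinfo *src)
{
    struct addrinfo *dst = malloc(sizeof(*dst));

    if (dst == NULL)
        return NULL;
    *dst = *src;                      /* shallow copy of the scalar members */
    dst->ai_next = NULL;
    dst->ai_addr = NULL;
    dst->ai_canonname = NULL;

    if (src->ai_addr != NULL) {       /* duplicate the socket address structure */
        dst->ai_addr = malloc(src->ai_addrlen);
        if (dst->ai_addr == NULL)
            goto fail;
        memcpy(dst->ai_addr, src->ai_addr, src->ai_addrlen);
    }
    if (src->ai_canonname != NULL) {  /* duplicate the canonical name string */
        size_t n = strlen(src->ai_canonname) + 1;
        dst->ai_canonname = malloc(n);
        if (dst->ai_canonname == NULL)
            goto fail;
        memcpy(dst->ai_canonname, src->ai_canonname, n);
    }
    return dst;

fail:
    free(dst->ai_addr);
    free(dst);
    return NULL;
}

/* Release a node created by copy_addrinfo(). */
void free_addrinfo_copy(struct addrinfo *ai)
{
    if (ai == NULL)
        return;
    free(ai->ai_addr);
    free(ai->ai_canonname);
    free(ai);
}
```

An alternative that avoids the extra allocations is to copy only the socket address into a caller-owned sockaddr_storage before freeing the list.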

Eight, some practical examples

1. For a TCP client:

Provide the host name and service name, and then, for each of the server's IP addresses, do the following:

while (...) { socket -> connect }
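The client loop can be written out roughly as follows. This is a sketch in the spirit of the loop above, not the book's own code; the function name `tcp_connect_sketch` is an assumption.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>

/* Sketch of a protocol-independent TCP client connect.
 * Returns a connected socket descriptor, or -1 on failure. */
int tcp_connect_sketch(const char *host, const char *service)
{
    struct addrinfo hints, *res, *ai;
    int sockfd = -1, ret;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;       /* try IPv4 and IPv6 addresses */
    hints.ai_socktype = SOCK_STREAM;   /* TCP */

    if ((ret = getaddrinfo(host, service, &hints, &res)) != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(ret));
        return -1;
    }
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        sockfd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (sockfd < 0)
            continue;                  /* family may be unsupported: try the next */
        if (connect(sockfd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                     /* success */
        close(sockfd);                 /* a socket whose connect failed is not reused */
        sockfd = -1;
    }
    freeaddrinfo(res);
    return sockfd;                     /* -1 if every address failed */
}
```

Because every value passed to socket and connect comes from the addrinfo structure, the same code works for IPv4 and IPv6 without change.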

2. For a TCP server:

The host name is usually not provided, only the service name. Then, for each local wildcard address returned for that service name:

while (...) { socket -> bind }
accept() ...

As soon as a bind succeeds, the loop exits. One drawback is that only one address ends up bound: either the IPv4 wildcard address or the IPv6 wildcard address.

Nine, getnameinfo function

This function is the complement of getaddrinfo: given a socket address, it returns the host name and the service name.

#include <netdb.h>

int getnameinfo(const struct sockaddr *addr, socklen_t addrlen,
                char *host, socklen_t hostlen,
                char *serv, socklen_t servlen, int flags);
// Returns: 0 on success, nonzero on error

Most of the parameters are self-explanatory; only the last one, flags, needs explanation. It modifies how the function behaves. For example:

NI_DGRAM: indicates a datagram service, so the service name returned is the UDP one. This matters because the same port number may correspond to different services under different protocols.

Some of the other flag values: NI_NAMEREQD (return an error if the name cannot be resolved), NI_NOFQDN (return only the host-name portion of the FQDN), NI_NUMERICHOST (return the numeric address string instead of a host name), and NI_NUMERICSERV (return the port number as a string instead of a service name).

Note: the flag values can be ORed together, so several can be set at once.

Note: getnameinfo and getaddrinfo both involve DNS. Servers generally avoid calling getnameinfo and simply identify clients by IP address, because a DNS lookup can be very time-consuming.

Ten, reentrant functions

1. Let's take a look at how gethostbyname and gethostbyaddr are implemented:

static struct hostent host;   /* the result is stored here */

struct hostent *gethostbyname(const char *hostname)
{
    /* ... fill in host ... */
    return (&host);
}

struct hostent *gethostbyaddr(const char *addr, socklen_t len, int family)
{
    /* ... fill in host ... */
    return (&host);
}

Both functions return a pointer to the same static object, and that is the source of the problem.

Consider what happens if we call gethostbyname in the main program and also call it in a signal handler:

main()
{
    struct hostent *hp;
    ...
    signal(SIGALRM, sig_alrm);
    ...
    hp = gethostbyname(...);
}

void sig_alrm(int signo)
{
    struct hostent *hp1;
    hp1 = gethostbyname(...);
}

Suppose the main program is inside gethostbyname, which has finished filling in the static host object and is about to return, when a signal arrives. The main program is interrupted, the signal handler runs, and it calls gethostbyname again. Because the process has only one copy of the host object, it is reused: the value computed for the main program is overwritten by the value computed for the signal handler's call, and the main program then sees the wrong result.

This is what makes a function non-reentrant.
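The overwriting effect can be reproduced without signals or DNS. In the sketch below both functions are hypothetical stand-ins: `label_static` mimics the static-result design, and `label_r` shows the caller-supplied-buffer alternative.

```c
#include <stdio.h>

/* Hypothetical non-reentrant function: every call writes into the same
 * static buffer and returns a pointer to it, as gethostbyname does
 * with its static hostent. */
const char *label_static(int id)
{
    static char buf[32];

    snprintf(buf, sizeof(buf), "host-%d", id);
    return buf;
}

/* Reentrant variant: the caller supplies the storage, so two results
 * can exist at the same time. */
const char *label_r(int id, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "host-%d", id);
    return buf;
}
```

If a second call to label_static happens before the first result has been used, for example from a signal handler, both pointers refer to the same buffer and the first result is lost.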

2. Reentrancy of the functions covered so far:

1> gethostbyname, gethostbyaddr, getservbyname, and getservbyport are not reentrant.

2> inet_pton and inet_ntop are reentrant.

3> getaddrinfo is reentrant only if the functions it calls are reentrant, that is, if it calls reentrant versions of gethostbyname and getservbyname.

4> getnameinfo is reentrant only if the functions it calls are reentrant, that is, if it calls reentrant versions of gethostbyaddr and getservbyport.

3. The errno variable has a similar problem.

First, each process has its own copy of errno. Within a single process, consider code such as:

if (close(fd) < 0) {
    fprintf(stderr, "close error, errno = %d\n", errno);
}

Suppose close fails and the kernel sets errno. After close returns but before the program prints the message, a signal arrives, and a function called in the signal handler also fails and overwrites errno. When control returns to the main program, it prints the wrong error value.

4. One way to avoid reentrancy problems is never to call non-reentrant functions in a signal handler, and to save errno on entry to the handler and restore it before returning. For example:

void sig_handler(int signo)
{
    int errno_save = errno;
    /* ... other code ... */
    errno = errno_save;
}

Also, do not call standard I/O functions such as fprintf in a signal handler, because many implementations of the standard I/O library are not reentrant.

5. gethostbyname_r and gethostbyaddr_r functions

There are two ways to turn a non-reentrant function into a reentrant one:

1> The problem with the non-reentrant functions is that they return a pointer to a global static object. Instead, the caller can allocate the object and pass it in for the function to fill. For gethostbyname, the caller allocates a hostent structure and lets the function fill it in.

This is what the gethostbyname_r and gethostbyaddr_r functions do.

But this introduces a problem:

The caller must provide not only the hostent structure itself but also the memory that the pointers inside the hostent structure point to.

struct hostent *gethostbyname_r(const char *hostname, struct hostent *result, char *buf, int buflen, int *h_errnop);

Here result is the structure the function fills in, and buf is the memory that the pointers in that structure point into. The error code is returned through h_errnop rather than the global variable h_errno. (This is the prototype as given in these notes; the exact prototype differs between systems.)

It is hard to know how large buf needs to be, which makes the function awkward to use.

2> Alternatively, the function can allocate a new object itself and return it, instead of returning a global static object. This is what getaddrinfo does.

The problem this introduces: the memory allocated inside the function must be explicitly released by the caller, by calling freeaddrinfo. Otherwise memory is leaked.

Eleven, other network information

We covered the gethostbyXXX and getservbyXXX functions above.

Network information comes in four kinds: hosts, networks, protocols, and services. Alongside the host and service functions there are corresponding getnetbyXXX and getprotobyXXX functions.

Of these, host and network information is obtained through DNS.

Protocol and service information is obtained from files on the local host.

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.
