Linux kernel network stack implementation analysis (6): how the application layer obtains data packets (top)

This article analyzes the Linux 1.2.13 kernel.

Original work. When reprinting, please cite the source: http://blog.csdn.net/yming0221/article/details/7541907

For more articles, see the column at http://blog.csdn.net/column/details/linux-kernel-net.html

Author: Yan Ming

Note: "(top)" and "(bottom)" in the title indicate the direction in which the packet path is analyzed. "(top)" means the analysis goes from the bottom up; "(bottom)" means it goes from the top down.


In the previous post, the transport layer obtained data packets from the network layer and queued the packet buffer structure, sk_buff, on the receive queue of a specific sock structure.

Next, we analyze how the application obtains network data packets from the transport layer. The application layer has two main ways to obtain packets from the transport layer: system calls and file operations.

System call:

In Linux, a user program requests a kernel service through a system call, which switches execution from user mode to kernel mode, runs the corresponding kernel function, and returns.

Part of the network stack's functionality is reached through the sys_socketcall system call.

The code is in net/socket.c. The functions in this file act as a bridge between the system call interface and the kernel network stack.
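To make the user-to-kernel transition concrete, here is a minimal user-space sketch. It assumes a modern Linux system with glibc, which is not the 1.2.13 environment analyzed in this article. It asks the kernel for the process ID by system call number through the generic syscall() entry point; the helper name raw_getpid is hypothetical.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid */
#include <sys/types.h>
#include <unistd.h>        /* syscall(), getpid() */

/* Hypothetical helper: enter the kernel by system call number. */
static long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

The result should match what the ordinary getpid() library wrapper returns, since both end up in the same kernel service.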

/*
 *  System call vectors. Since I (RIB) want to rewrite sockets as streams,
 *  we have this level of indirection. Not a lot of overhead, since more of
 *  the work is done via read/write/select directly.
 *
 *  I'm now expanding this up to a higher level to separate the assorted
 *  kernel/user space manipulations and global assumptions from the protocol
 *  layers proper - AC.
 */

asmlinkage int sys_socketcall(int call, unsigned long *args)
{
    int er;
    switch(call)
    {
        case SYS_SOCKET:
            er=verify_area(VERIFY_READ, args, 3 * sizeof(long));
            if(er)
                return er;
            return(sock_socket(get_fs_long(args+0),
                get_fs_long(args+1),
                get_fs_long(args+2)));
        case SYS_BIND:
            er=verify_area(VERIFY_READ, args, 3 * sizeof(long));
            if(er)
                return er;
            return(sock_bind(get_fs_long(args+0),
                (struct sockaddr *)get_fs_long(args+1),
                get_fs_long(args+2)));
        case SYS_CONNECT:
            er=verify_area(VERIFY_READ, args, 3 * sizeof(long));
            if(er)
                return er;
            return(sock_connect(get_fs_long(args+0),
                (struct sockaddr *)get_fs_long(args+1),
                get_fs_long(args+2)));
        case SYS_LISTEN:
            er=verify_area(VERIFY_READ, args, 2 * sizeof(long));
            if(er)
                return er;
            return(sock_listen(get_fs_long(args+0),
                get_fs_long(args+1)));
        case SYS_ACCEPT:
            er=verify_area(VERIFY_READ, args, 3 * sizeof(long));
            if(er)
                return er;
            return(sock_accept(get_fs_long(args+0),
                (struct sockaddr *)get_fs_long(args+1),
                (int *)get_fs_long(args+2)));
        case SYS_GETSOCKNAME:
            er=verify_area(VERIFY_READ, args, 3 * sizeof(long));
            if(er)
                return er;
            return(sock_getsockname(get_fs_long(args+0),
                (struct sockaddr *)get_fs_long(args+1),
                (int *)get_fs_long(args+2)));
        case SYS_GETPEERNAME:
            er=verify_area(VERIFY_READ, args, 3 * sizeof(long));
            if(er)
                return er;
            return(sock_getpeername(get_fs_long(args+0),
                (struct sockaddr *)get_fs_long(args+1),
                (int *)get_fs_long(args+2)));
        case SYS_SOCKETPAIR:
            er=verify_area(VERIFY_READ, args, 4 * sizeof(long));
            if(er)
                return er;
            return(sock_socketpair(get_fs_long(args+0),
                get_fs_long(args+1),
                get_fs_long(args+2),
                (unsigned long *)get_fs_long(args+3)));
        case SYS_SEND:
            er=verify_area(VERIFY_READ, args, 4 * sizeof(unsigned long));
            if(er)
                return er;
            return(sock_send(get_fs_long(args+0),
                (void *)get_fs_long(args+1),
                get_fs_long(args+2),
                get_fs_long(args+3)));
        case SYS_SENDTO:
            er=verify_area(VERIFY_READ, args, 6 * sizeof(unsigned long));
            if(er)
                return er;
            return(sock_sendto(get_fs_long(args+0),
                (void *)get_fs_long(args+1),
                get_fs_long(args+2),
                get_fs_long(args+3),
                (struct sockaddr *)get_fs_long(args+4),
                get_fs_long(args+5)));
        case SYS_RECV:
            er=verify_area(VERIFY_READ, args, 4 * sizeof(unsigned long));
            if(er)
                return er;
            return(sock_recv(get_fs_long(args+0),
                (void *)get_fs_long(args+1),
                get_fs_long(args+2),
                get_fs_long(args+3)));
        case SYS_RECVFROM:
            er=verify_area(VERIFY_READ, args, 6 * sizeof(unsigned long));
            if(er)
                return er;
            return(sock_recvfrom(get_fs_long(args+0),
                (void *)get_fs_long(args+1),
                get_fs_long(args+2),
                get_fs_long(args+3),
                (struct sockaddr *)get_fs_long(args+4),
                (int *)get_fs_long(args+5)));
        case SYS_SHUTDOWN:
            er=verify_area(VERIFY_READ, args, 2 * sizeof(unsigned long));
            if(er)
                return er;
            return(sock_shutdown(get_fs_long(args+0),
                get_fs_long(args+1)));
        case SYS_SETSOCKOPT:
            er=verify_area(VERIFY_READ, args, 5 * sizeof(unsigned long));
            if(er)
                return er;
            return(sock_setsockopt(get_fs_long(args+0),
                get_fs_long(args+1),
                get_fs_long(args+2),
                (char *)get_fs_long(args+3),
                get_fs_long(args+4)));
        case SYS_GETSOCKOPT:
            er=verify_area(VERIFY_READ, args, 5 * sizeof(unsigned long));
            if(er)
                return er;
            return(sock_getsockopt(get_fs_long(args+0),
                get_fs_long(args+1),
                get_fs_long(args+2),
                (char *)get_fs_long(args+3),
                (int *)get_fs_long(args+4)));
        default:
            return(-EINVAL);
    }
}

The call numbers accepted by sys_socketcall are defined by the following macros:

#define SYS_SOCKET      1   /* sys_socket(2)      */
#define SYS_BIND        2   /* sys_bind(2)        */
#define SYS_CONNECT     3   /* sys_connect(2)     */
#define SYS_LISTEN      4   /* sys_listen(2)      */
#define SYS_ACCEPT      5   /* sys_accept(2)      */
#define SYS_GETSOCKNAME 6   /* sys_getsockname(2) */
#define SYS_GETPEERNAME 7   /* sys_getpeername(2) */
#define SYS_SOCKETPAIR  8   /* sys_socketpair(2)  */
#define SYS_SEND        9   /* sys_send(2)        */
#define SYS_RECV        10  /* sys_recv(2)        */
#define SYS_SENDTO      11  /* sys_sendto(2)      */
#define SYS_RECVFROM    12  /* sys_recvfrom(2)    */
#define SYS_SHUTDOWN    13  /* sys_shutdown(2)    */
#define SYS_SETSOCKOPT  14  /* sys_setsockopt(2)  */
#define SYS_GETSOCKOPT  15  /* sys_getsockopt(2)  */
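The indirection used by sys_socketcall (one entry point, a call number, and an array of packed arguments) can be sketched in user space as a toy model. Everything prefixed with toy_ or TOY_, and the wrapped_listen helper, is hypothetical and only stands in for the kernel functions; the two call numbers are copied from the macro list.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Call numbers with the same values as SYS_SOCKET and SYS_LISTEN. */
enum { TOY_SYS_SOCKET = 1, TOY_SYS_LISTEN = 4 };

/* Hypothetical stand-ins for sock_socket() and sock_listen(). */
static int toy_socket(int family, int type, int protocol)
{
    (void)family; (void)type; (void)protocol;
    return 3;                       /* pretend descriptor 3 was allocated */
}

static int toy_listen(int fd, int backlog)
{
    (void)backlog;
    return fd == 3 ? 0 : -EBADF;
}

/* One entry point: pick the handler by call number, unpack the arguments. */
static int toy_socketcall(int call, const unsigned long *args)
{
    assert(args != NULL);
    switch (call) {
    case TOY_SYS_SOCKET:
        return toy_socket((int)args[0], (int)args[1], (int)args[2]);
    case TOY_SYS_LISTEN:
        return toy_listen((int)args[0], (int)args[1]);
    default:
        return -EINVAL;
    }
}

/* What a library wrapper for listen() might do on top of this model. */
static int wrapped_listen(int fd, int backlog)
{
    unsigned long args[2] = { (unsigned long)fd, (unsigned long)backlog };
    return toy_socketcall(TOY_SYS_LISTEN, args);
}
```

The real function additionally has to check that the argument array is readable (verify_area) and fetch each word from user space (get_fs_long), because the array lives in the caller's address space rather than the kernel's.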

Once the socket has been set up, the application layer can obtain data packets by issuing sys_socketcall with the call number SYS_RECV or SYS_RECVFROM. Because UDP is connectionless, you must use recvfrom to learn the sender's address if you need to reply. UDP can also use recv, but recv does not return the sender's address, so the application can only receive and cannot reply.

UDP in the INET domain is used as the example below.

If the call number is SYS_RECVFROM, the sock_recvfrom() function is executed after the argument area has been verified.
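The difference between recvfrom and recv for UDP can be illustrated from the application side. The following is a minimal sketch that assumes a modern POSIX system with a working loopback interface; the helper name udp_reply_demo and the "ping"/"pong" payloads are hypothetical. A "server" socket learns the sender's address from recvfrom() and uses it to reply, while the "client" collects the reply with plain recv().

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Returns 0 and stores the reply in out on success, -1 on any error. */
static int udp_reply_demo(char *out, size_t outlen)
{
    int srv = -1, cli = -1, rc = -1;
    struct sockaddr_in srv_addr, peer;
    socklen_t len = sizeof(srv_addr), peer_len = sizeof(peer);
    struct timeval tv = { 2, 0 };           /* do not block forever */
    char buf[64];
    ssize_t n;

    if (outlen == 0)
        return -1;
    srv = socket(AF_INET, SOCK_DGRAM, 0);
    cli = socket(AF_INET, SOCK_DGRAM, 0);
    if (srv < 0 || cli < 0)
        goto done;
    setsockopt(srv, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
    setsockopt(cli, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    memset(&srv_addr, 0, sizeof(srv_addr));
    srv_addr.sin_family = AF_INET;
    srv_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv_addr.sin_port = 0;                  /* let the kernel pick a port */
    if (bind(srv, (struct sockaddr *)&srv_addr, sizeof(srv_addr)) < 0 ||
        getsockname(srv, (struct sockaddr *)&srv_addr, &len) < 0)
        goto done;

    /* The client never calls bind(); the kernel assigns its address. */
    if (sendto(cli, "ping", 4, 0,
               (struct sockaddr *)&srv_addr, sizeof(srv_addr)) != 4)
        goto done;

    /* recvfrom() fills in peer, which recv() would not provide. */
    n = recvfrom(srv, buf, sizeof(buf), 0,
                 (struct sockaddr *)&peer, &peer_len);
    if (n != 4 || memcmp(buf, "ping", 4) != 0)
        goto done;

    /* Reply to exactly the address that recvfrom() reported. */
    if (sendto(srv, "pong", 4, 0, (struct sockaddr *)&peer, peer_len) != 4)
        goto done;

    n = recv(cli, out, outlen - 1, 0);      /* data only, no address */
    if (n < 0)
        goto done;
    out[n] = '\0';
    rc = 0;
done:
    if (srv >= 0) close(srv);
    if (cli >= 0) close(cli);
    return rc;
}
```

Without the address returned by recvfrom(), the server in this sketch would have no destination to pass to sendto().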

/*
 *  Receive a frame from the socket and optionally record the address of the
 *  sender. We verify the buffers are writable and if needed move the
 *  sender address from kernel to user space.
 */

static int sock_recvfrom(int fd, void *buff, int len, unsigned flags,
        struct sockaddr *addr, int *addr_len)
{
    struct socket *sock;
    struct file *file;
    char address[MAX_SOCK_ADDR];
    int err;
    int alen;

    if (fd < 0 || fd >= NR_OPEN || ((file = current->files->fd[fd]) == NULL))
        return(-EBADF);
    if (!(sock = sockfd_lookup(fd, NULL)))
        return(-ENOTSOCK);
    if(len<0)
        return -EINVAL;
    if(len==0)
        return 0;

    err=verify_area(VERIFY_WRITE,buff,len);
    if(err)
        return err;

    // After the checks, call the lower-level function; for the INET domain
    // this is inet_recvfrom()
    len=sock->ops->recvfrom(sock, buff, len, (file->f_flags & O_NONBLOCK),
            flags, (struct sockaddr *)address, &alen);

    if(len<0)
        return len;
    // Copy the sender's address from kernel space to user space
    if(addr!=NULL && (err=move_addr_to_user(address,alen, addr, addr_len))<0)
        return err;

    return len;
}

The inet_recvfrom() function in turn calls the protocol-specific operation. The UDP protocol operations are defined as follows:

struct proto udp_prot = {
    sock_wmalloc,
    sock_rmalloc,
    sock_wfree,
    sock_rfree,
    sock_rspace,
    sock_wspace,
    udp_close,
    udp_read,
    udp_write,
    udp_sendto,
    udp_recvfrom,
    ip_build_header,
    udp_connect,
    NULL,
    ip_queue_xmit,
    NULL,
    NULL,
    NULL,
    udp_rcv,
    datagram_select,
    udp_ioctl,
    NULL,
    NULL,
    ip_setsockopt,
    ip_getsockopt,
    128,
    0,
    {NULL,},
    "UDP",
    0, 0
};

As you can see, the corresponding receive function is udp_recvfrom():

/*
 *  This should be easy, if there is something there we
 *  return it, otherwise we block.
 */

int udp_recvfrom(struct sock *sk, unsigned char *to, int len,
        int noblock, unsigned flags, struct sockaddr_in *sin,
        int *addr_len)
{
    int copied = 0;
    int truesize;
    struct sk_buff *skb;
    int er;

    /*
     *  Check any passed addresses
     */

    if (addr_len)
        *addr_len = sizeof(*sin);

    /*
     *  From here the generic datagram does a lot of the work. Come
     *  the finished NET3, it will do _ALL_ the work!
     */

    skb=skb_recv_datagram(sk,flags,noblock,&er);
    if(skb==NULL)
        return er;

    truesize = skb->len;
    copied = min(len, truesize);

    /*
     *  FIXME : should use udp header size info value
     */

    // Extract the data part from the sk_buff structure
    skb_copy_datagram(skb,sizeof(struct udphdr),to,copied);
    sk->stamp=skb->stamp;

    /* Copy the address. */
    if (sin)
    {
        sin->sin_family = AF_INET;
        sin->sin_port = skb->h.uh->source;
        sin->sin_addr.s_addr = skb->daddr;
    }

    skb_free_datagram(skb);
    release_sock(sk);
    return(truesize);
}

In this way, the data reaches user space.
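The call chain traced above, from sock_recvfrom through sock->ops->recvfrom to the protocol's recvfrom entry, is a two-level dispatch through tables of function pointers. The following is a toy user-space model of that pattern, not the kernel's data structures: all toy_ names and the single pending string standing in for the receive queue are hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_sock;
struct toy_socket;

/* Protocol operations, a cut-down analogue of struct proto. */
struct toy_proto {
    int (*recvfrom)(struct toy_sock *sk, char *to, int len);
    const char *name;
};

/* Address-family operations, a cut-down analogue of sock->ops. */
struct toy_ops {
    int (*recvfrom)(struct toy_socket *sock, char *to, int len);
};

struct toy_sock {
    const struct toy_proto *prot;
    const char *pending;            /* one queued datagram */
};

struct toy_socket {
    const struct toy_ops *ops;
    struct toy_sock *sk;
};

/* Protocol layer: copy at most len bytes, report the datagram size. */
static int toy_udp_recvfrom(struct toy_sock *sk, char *to, int len)
{
    int truesize = (int)strlen(sk->pending);
    int copied = len < truesize ? len : truesize;
    memcpy(to, sk->pending, (size_t)copied);
    return truesize;
}

/* Family layer: forward to whichever protocol the sock uses. */
static int toy_inet_recvfrom(struct toy_socket *sock, char *to, int len)
{
    struct toy_sock *sk = sock->sk;
    return sk->prot->recvfrom(sk, to, len);
}

static const struct toy_proto toy_udp_prot = { toy_udp_recvfrom, "UDP" };
static const struct toy_ops toy_inet_ops = { toy_inet_recvfrom };

/* Socket layer: validate the length, then dispatch through the ops table. */
static int toy_sock_recvfrom(struct toy_socket *sock, char *buff, int len)
{
    assert(sock != NULL && buff != NULL);
    if (len < 0)
        return -1;
    if (len == 0)
        return 0;
    return sock->ops->recvfrom(sock, buff, len);
}
```

As in the udp_recvfrom listing, the toy protocol function copies only min(len, truesize) bytes but returns truesize, so a caller that passes a short buffer sees a return value larger than the number of bytes it actually received.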

Ordinary file operation interface:

The main functions here are the read and write functions, sock_read and sock_write, which let an application read and write network data with ordinary file operations. File operations work on a file structure (struct file), and the f_inode pointer in that structure points to the inode structure of the file.

The file operation set is defined as follows:

static struct file_operations socket_file_ops = {
    sock_lseek,
    sock_read,
    sock_write,
    sock_readdir,
    sock_select,
    sock_ioctl,
    NULL,           /* mmap */
    NULL,           /* no special open code... */
    sock_close,
    NULL,           /* no fsync */
    sock_fasync
};

The read and write functions work much like the recvfrom and send functions. sock_read is listed here for reference.

/*
 *  Read data from a socket. ubuf is a user mode pointer. We make sure the user
 *  area ubuf...ubuf+size-1 is writable before asking the protocol.
 */

static int sock_read(struct inode *inode, struct file *file, char *ubuf, int size)
{
    struct socket *sock;
    int err;

    if (!(sock = socki_lookup(inode)))
    {
        printk("NET: sock_read: can't find socket for inode!\n");
        return(-EBADF);
    }
    if (sock->flags & SO_ACCEPTCON)
        return(-EINVAL);

    if(size<0)
        return -EINVAL;
    if(size==0)
        return 0;
    if ((err=verify_area(VERIFY_WRITE,ubuf,size))<0)
        return err;
    // As with recvfrom, call the corresponding function of the INET domain
    return(sock->ops->read(sock, ubuf, size, (file->f_flags & O_NONBLOCK)));
}

For the INET domain, this calls inet_read(). inet_read() calls udp_read(), and udp_read() does its work by calling udp_recvfrom().
These two methods, system calls and file operations, are the user interface of the kernel network stack.
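From the application side, the file operation path means that a socket descriptor can be used with plain read() and write(). The following is a minimal sketch assuming a modern POSIX system; it uses a connected AF_UNIX datagram pair from socketpair() rather than INET UDP purely so that it needs no addresses, and the helper name socket_as_file_demo is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Send msg with write() and fetch it with read(), using no
 * socket-specific I/O call. Returns the number of bytes read, or -1.
 */
static ssize_t socket_as_file_demo(const char *msg, char *out, size_t outlen)
{
    int sv[2];
    ssize_t n = -1;
    size_t msglen = strlen(msg);

    if (outlen == 0)
        return -1;
    /* A connected pair of datagram sockets. */
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    if (write(sv[0], msg, msglen) == (ssize_t)msglen) {
        n = read(sv[1], out, outlen - 1);   /* same call as for a file */
        if (n >= 0)
            out[n] = '\0';
    }
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

read() carries no address argument, so, like recv, it suits cases where the peer is already known or no reply is needed.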
