Linux RAW sockets
==========================================
1. Why do I need to know more about raw sockets?
I actually learned the basics of raw sockets a long time ago and even wrote a small packet-capture program back then. At the time I thought I knew raw sockets very well, but recently, while reading about Nmap, a single sentence left me completely confused.
The SYN Scan section of "Nmap Network Discovery III" explains that Nmap builds a SYN packet on a raw socket and sends it to the target port; if the target port replies with a SYN/ACK packet, Nmap immediately sends a RST packet to close the connection quickly. Then comes this sentence: "Actually you do not need to send the RST packet, because the SYN packet was constructed manually by Nmap, so the kernel receives a packet it was not expecting, and the kernel will send the RST packet."
What?! I thought about it for half an hour and still did not understand. It was so different from what I knew about raw sockets: does a raw socket go through the protocol stack? Can a raw socket send a packet like that?
I felt I had to clear this up, so I went back to online material and books to re-learn raw sockets.
2. The protocol parameter of a raw socket
We know that the PF_INET protocol family supports many values for the protocol parameter. In ordinary socket programming we are used to calling socket(PF_INET, SOCK_STREAM, 0) to create a TCP socket; this was covered in the earlier article "Analysis of the three socket parameters (family, type, protocol)". So can we call socket(PF_INET, SOCK_RAW, 0) to create a raw socket?
The answer is no! Creating a raw socket with socket(PF_INET, SOCK_RAW, 0) fails with the error -EPROTONOSUPPORT. The reason can be found in inet_create, the kernel function that creates the socket:
    list_for_each_entry_rcu(answer, &inetsw[sock->type], list) {
        err = 0;
        /* Check the non-wild match. */
        if (protocol == answer->protocol) {
            if (protocol != IPPROTO_IP)
                break;
        } else {
            /* Check for the two wild cases. */
            if (IPPROTO_IP == protocol) {
                protocol = answer->protocol;
                break;
            }
            if (IPPROTO_IP == answer->protocol)
                break;
        }
        err = -EPROTONOSUPPORT;
    }

    /* inet_protosw entry for type SOCK_RAW */
    {
        .type     = SOCK_RAW,
        .protocol = IPPROTO_IP,  /* wild card */
        .prot     = &raw_prot,
        .ops      = &inet_sockraw_ops,
        .flags    = INET_PROTOSW_REUSE,
    }
The lower half of the code above is the inet_protosw entry for raw sockets, whose protocol value is IPPROTO_IP (equal to 0). When the caller also passes 0, the loop in the upper half takes the protocol == answer->protocol branch. Both values are IPPROTO_IP, so the condition if (protocol != IPPROTO_IP) is not satisfied and the loop does not break; err ends up as -EPROTONOSUPPORT and socket creation fails.
Can a raw socket whose protocol equals IPPROTO_RAW receive packets? That question is answered later, in the section on raw socket reception.
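To make the matching rule easier to follow, here is a minimal userspace sketch of one iteration of that loop. It is only a model, not kernel code: the function name model_protosw_match and its parameters are made up for illustration, and it examines a single inet_protosw entry instead of walking a list.

```c
#include <errno.h>
#include <netinet/in.h>

/*
 * Hypothetical userspace model of one iteration of the inet_create()
 * lookup loop. *protocol is the value the caller passed to socket();
 * registered is the .protocol field of the inet_protosw entry examined.
 * Returns 0 on a match, -EPROTONOSUPPORT otherwise.
 */
int model_protosw_match(int *protocol, int registered)
{
    if (*protocol == registered) {
        /* Exact match counts only for a concrete (non-zero) protocol. */
        if (*protocol != IPPROTO_IP)
            return 0;
    } else {
        /* Caller passed 0: adopt the protocol registered for this type. */
        if (*protocol == IPPROTO_IP) {
            *protocol = registered;
            return 0;
        }
        /* Entry is a wild card: accept whatever the caller asked for. */
        if (registered == IPPROTO_IP)
            return 0;
    }
    return -EPROTONOSUPPORT;
}
```

With the raw entry (registered protocol 0), a request for protocol 0 fails, while IPPROTO_TCP or IPPROTO_RAW succeeds through the wild-card case. With an entry registered as IPPROTO_TCP, as for SOCK_STREAM, a request for protocol 0 succeeds and is rewritten to IPPROTO_TCP, which is why socket(PF_INET, SOCK_STREAM, 0) works.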
3. Permission checking for raw sockets
Creating a raw socket requires permission to do so (normally this means a privileged user; the kernel checks for the CAP_NET_RAW capability). inet_create, the kernel function that creates the socket, performs the following check for raw sockets:
    if (sock->type == SOCK_RAW && !kern &&
        !ns_capable(net->user_ns, CAP_NET_RAW))
        goto out_rcu_unlock;
4. IP_HDRINCL
With a raw socket the application has to construct the transport-layer (TCP, UDP) header itself. If it also wants to construct the IP header itself, it must set the IP_HDRINCL (Header Include) option on the socket with setsockopt; the protocol stack will then no longer populate the IP header automatically.
On Linux, when a raw socket is created with the protocol IPPROTO_RAW, we do not need to call setsockopt explicitly to set IP_HDRINCL, because the kernel turns it on for us:
    if (SOCK_RAW == sock->type) {
        inet->inet_num = protocol;
        if (IPPROTO_RAW == protocol)
            inet->hdrincl = 1;  /* same effect as setsockopt(..., IP_HDRINCL) */
    }
However, it is still recommended to set the IP_HDRINCL option explicitly, for the readability and portability of the code.
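As a rough illustration of that recommendation, the sketch below creates a raw socket and sets IP_HDRINCL explicitly. The helper name open_hdrincl_socket is made up for this example; socket(), setsockopt() and IP_HDRINCL are the standard Linux interfaces. Without CAP_NET_RAW the socket() call is expected to fail, in which case the helper just reports the error.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a raw IPv4 socket for the given protocol and
 * enable IP_HDRINCL explicitly. Returns the fd, or -errno on failure
 * (for example -EPERM when the caller lacks CAP_NET_RAW).
 */
int open_hdrincl_socket(int protocol)
{
    int one = 1;
    int fd = socket(AF_INET, SOCK_RAW, protocol);

    if (fd < 0)
        return -errno;

    /* Explicit even for IPPROTO_RAW, where Linux already implies it. */
    if (setsockopt(fd, IPPROTO_IP, IP_HDRINCL, &one, sizeof(one)) < 0) {
        int err = errno;

        close(fd);
        return -err;
    }
    return fd;
}
```

A caller would use it as int fd = open_hdrincl_socket(IPPROTO_TCP); and treat a negative return value as an error code.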
5. Sending on a raw socket
The earlier article "Analysis of the three socket parameters (family, type, protocol)" already mentioned that the kernel has three important structures for a socket: inet_protosw, proto_ops and proto. They represent, respectively, the transport-layer operation set of the socket, the operation set of a particular socket type, and the operation set of a specific protocol. For raw sockets the three structures are:
    static struct inet_protosw inetsw_array[] = {
        ...
        {
            .type     = SOCK_RAW,
            .protocol = IPPROTO_IP,  /* wild card */
            .prot     = &raw_prot,
            .ops      = &inet_sockraw_ops,
            .flags    = INET_PROTOSW_REUSE,
        }
    };

    /*
     * For SOCK_RAW sockets; should be the same as inet_dgram_ops but without
     * udp_poll
     */
    static const struct proto_ops inet_sockraw_ops = {
        .family  = PF_INET,
        ...
        .sendmsg = inet_sendmsg,
        .recvmsg = inet_recvmsg,
        ...
    };

    struct proto raw_prot = {
        .name    = "raw",
        ...
        .sendmsg = raw_sendmsg,
        .recvmsg = raw_recvmsg,
        ...
    };
The send path of a raw socket is: sys_call (sendto) -> inet_sendmsg -> raw_sendmsg. raw_sendmsg is the main send function for raw sockets:
    static int raw_sendmsg(struct kiocb *iocb, struct sock *sk,
                           struct msghdr *msg, size_t len)
    {
        struct inet_sock *inet = inet_sk(sk);
        struct ipcm_cookie ipc;
        struct rtable *rt = NULL;
        struct flowi4 fl4;
        ...
        if (inet->hdrincl)
            /* XXX: stripping const */
            err = raw_send_hdrinc(sk, &fl4,
                                  (struct iovec *)msg->msg_iter.iov, len,
                                  &rt, msg->msg_flags);
        else {
            ...
            err = ip_append_data(sk, &fl4, raw_getfrag, &rfv, len, 0,
                                 &ipc, &rt, msg->msg_flags);
            ...
        }
        ...
    }
If IP_HDRINCL is set on the raw socket (or its protocol is IPPROTO_RAW), the raw_send_hdrinc function is called:
    static int raw_send_hdrinc(struct sock *sk, struct flowi4 *fl4,
                               void *from, size_t length,
                               struct rtable **rtp, unsigned int flags)
    {
        struct inet_sock *inet = inet_sk(sk);
        struct net *net = sock_net(sk);
        ...
        /*
         * We don't want to modify the ip header, but we do need to
         * be sure that it won't cause problems later along the network
         * stack.  Specifically we want to make sure that iph->ihl is a
         * sane value.  If ihl points beyond the length of the buffer passed
         * in, reject the frame as invalid
         */
        err = -EINVAL;
        if (iphlen > length)
            goto error_free;

        if (iphlen >= sizeof(*iph)) {
            if (!iph->saddr)
                iph->saddr = fl4->saddr;
            iph->check = 0;
            iph->tot_len = htons(length);
            if (!iph->id)
                ip_select_ident(skb, NULL);

            iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
        }
        if (iph->protocol == IPPROTO_ICMP)
            icmp_out_count(net, ((struct icmphdr *)
                skb_transport_header(skb))->type);

        err = NF_HOOK(NFPROTO_IPV4, NF_INET_LOCAL_OUT, skb, NULL,
                      rt->dst.dev, dst_output);
        ...
    }
The code above shows that if the source address is not set (it is 0), the kernel fills in a source address in the header, and the same goes for the IP identification (packet ID) field. The IP checksum and IP total length fields are always set by the kernel.
How fields in the IP header are set when IP_HDRINCL is set

| Field                        | Populated by the kernel |
| IP checksum                  | Always                  |
| Source address               | When the field is 0     |
| Packet ID                    | When the field is 0     |
| Total length (packet length) | Always                  |
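The rules in this table can be modelled in userspace. The sketch below is only an illustration of what raw_send_hdrinc does to the header, not the kernel implementation: model_hdrincl_fixup and ip_header_checksum are names invented here, route_saddr stands in for fl4->saddr, and fresh_id stands in for whatever ip_select_ident would choose.

```c
#include <arpa/inet.h>
#include <netinet/ip.h>
#include <stddef.h>
#include <stdint.h>

/* Internet checksum over len bytes, read as big-endian 16-bit words.
 * Returns the checksum as a host-order value. len must be even. */
uint16_t ip_header_checksum(const void *hdr, size_t len)
{
    const uint8_t *p = hdr;
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    while (sum >> 16)                   /* fold the carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Hypothetical model of the header fix-ups applied when IP_HDRINCL is set. */
void model_hdrincl_fixup(struct iphdr *iph, size_t length,
                         uint32_t route_saddr, uint16_t fresh_id)
{
    if (!iph->saddr)                    /* source address: only when 0 */
        iph->saddr = route_saddr;
    iph->check = 0;
    iph->tot_len = htons((uint16_t)length);     /* total length: always */
    if (!iph->id)                       /* packet ID: only when 0 */
        iph->id = fresh_id;
    /* checksum: always recomputed over ihl * 4 header bytes */
    iph->check = htons(ip_header_checksum(iph, (size_t)iph->ihl * 4));
}
```

Whatever the application wrote into the checksum and total length fields is overwritten, while a non-zero source address or packet ID is left alone.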
From the send path of a raw socket with IP_HDRINCL set, shown above, we can also learn the following:
- The protocol given when the socket was created (for example IPPROTO_TCP or IPPROTO_UDP) is not used when sending, and the IP header is not populated from it, because setting IP_HDRINCL means that we populate the IP header ourselves.
- The data no longer goes through the IP layer's normal output processing; dst_output is called at the end to send the datagram. So if the datagram is longer than the MTU, no IP fragments are generated: the send fails and the error code EMSGSIZE is returned. Conversely, if we do not set IP_HDRINCL and do not define the IP header ourselves, the code takes the ip_append_data path, which performs IP fragmentation.
- The datagram does not go through the TCP layer. This is the groundwork for answering the question that confused me: why the kernel automatically sends the RST.
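To make the points above concrete, here is a minimal sketch of how an application might hand-build a 40-byte IPv4 + TCP SYN datagram for a raw socket with IP_HDRINCL set. All names (build_syn, tcp_checksum and the helpers) are invented for this example, addresses are passed in host byte order, and the window size and TTL are arbitrary. The IP total length, ID and checksum are left at 0 for the kernel to fill in, as in the table above; the TCP checksum has to be computed by the application, because the packet never passes through the kernel's TCP layer.

```c
#include <stddef.h>
#include <stdint.h>

static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

static void put_be32(uint8_t *p, uint32_t v)
{
    put_be16(p, (uint16_t)(v >> 16));
    put_be16(p + 2, (uint16_t)(v & 0xffff));
}

/* Add len bytes (len must be even) as big-endian 16-bit words to sum. */
static uint32_t sum_be16(uint32_t sum, const uint8_t *p, size_t len)
{
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    return sum;
}

/* TCP checksum over the IPv4 pseudo-header plus the TCP segment. */
uint16_t tcp_checksum(uint32_t saddr, uint32_t daddr,
                      const uint8_t *seg, size_t len)
{
    uint8_t pseudo[12];
    uint32_t sum;

    put_be32(pseudo, saddr);
    put_be32(pseudo + 4, daddr);
    pseudo[8] = 0;
    pseudo[9] = 6;                      /* protocol number of TCP */
    put_be16(pseudo + 10, (uint16_t)len);

    sum = sum_be16(0, pseudo, sizeof(pseudo));
    sum = sum_be16(sum, seg, len);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Fill pkt with a 20-byte IP header and a 20-byte TCP SYN header. */
size_t build_syn(uint8_t pkt[40], uint32_t saddr, uint32_t daddr,
                 uint16_t sport, uint16_t dport, uint32_t seq)
{
    uint8_t *ip = pkt, *tcp = pkt + 20;

    for (size_t i = 0; i < 40; i++)
        pkt[i] = 0;

    ip[0] = 0x45;                       /* version 4, header length 5 words */
    ip[8] = 64;                         /* TTL */
    ip[9] = 6;                          /* protocol: TCP */
    put_be32(ip + 12, saddr);
    put_be32(ip + 16, daddr);
    /* bytes 2-3 (tot_len), 4-5 (id), 10-11 (check) stay 0 for the kernel */

    put_be16(tcp, sport);
    put_be16(tcp + 2, dport);
    put_be32(tcp + 4, seq);
    tcp[12] = 5 << 4;                   /* data offset: 5 words */
    tcp[13] = 0x02;                     /* flags: SYN only */
    put_be16(tcp + 14, 1024);           /* window */
    put_be16(tcp + 16, tcp_checksum(saddr, daddr, tcp, 20));
    return 40;
}
```

The resulting buffer would then be passed to sendto() on a raw socket. Nothing in this process creates any TCP state in the kernel.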
6. Receiving on a raw socket
When the NIC driver receives a packet, it is processed by netif_receive_skb() in softirq context. For an IP packet whose destination address is the local host, ip_rcv() eventually leads to ip_local_deliver_finish(). ip_local_deliver_finish() calls raw_local_deliver() for every packet, and raw_local_deliver() is the entry point for raw socket reception.
The receive path of a raw socket is mainly as follows. First, a hash of the packet's L4 protocol type is used to check whether the raw_v4_htable table contains a matching sock. If there is a matching sock structure, raw_v4_input() is called to handle network-layer raw sockets. raw_v4_input() in turn calls __raw_v4_lookup() to match the packet against the raw socket's source IP, destination IP and bound network interface, and finally raw_rcv() is called.
Whether or not a raw socket handles the packet, the packet still continues through the normal protocol stack: it is matched against the inet_protos[] array and dispatched by its L4 protocol type to the TCP, UDP, ICMP or other handler.
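The matching step can be sketched as a small userspace model. This is a simplification for illustration only: struct model_raw_sock, model_raw_hash and model_raw_match are invented names, the table size of 256 is an assumption, and the real __raw_v4_lookup() also checks things such as the network namespace.

```c
#include <netinet/in.h>
#include <stdint.h>

#define MODEL_RAW_HTABLE_SIZE 256       /* assumed table size */

/* Hypothetical, reduced view of the fields a raw socket is matched on. */
struct model_raw_sock {
    uint8_t  protocol;       /* protocol given to socket() */
    uint32_t peer_addr;      /* set by connect(), 0 = any sender */
    uint32_t local_addr;     /* set by bind(), 0 = any local address */
    int      bound_ifindex;  /* bound interface, 0 = any interface */
};

/* The bucket depends only on the packet's L4 protocol number. */
unsigned int model_raw_hash(uint8_t protocol)
{
    return protocol & (MODEL_RAW_HTABLE_SIZE - 1);
}

/* Returns 1 if the socket should get a copy of the packet, else 0. */
int model_raw_match(const struct model_raw_sock *sk, uint8_t pkt_protocol,
                    uint32_t pkt_saddr, uint32_t pkt_daddr, int pkt_ifindex)
{
    if (sk->protocol != pkt_protocol)
        return 0;
    if (sk->peer_addr && sk->peer_addr != pkt_saddr)
        return 0;
    if (sk->local_addr && sk->local_addr != pkt_daddr)
        return 0;
    if (sk->bound_ifindex && sk->bound_ifindex != pkt_ifindex)
        return 0;
    return 1;
}
```

In this model a socket created with IPPROTO_TCP and no bind or connect restrictions matches every incoming TCP packet, while a socket created with IPPROTO_RAW matches none of them, because the protocol numbers differ.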
Now that we understand the receive path of a raw socket, we can answer the two questions raised earlier:
Q: After a SYN packet is sent through a raw socket, why does the kernel send a RST to the peer when the SYN+ACK reply arrives?
A: From section 5 we know that a datagram sent on a raw socket bypasses the kernel's own L3 IP header processing and the L4 TCP/UDP layer, so the kernel's TCP stack is completely unaware of the SYN packet you "secretly" sent. When the peer sends back a SYN+ACK, section 6 tells us that the packet goes through the normal protocol stack whether or not a raw socket handles it. The protocol stack finds that it knows nothing about this connection, because it never sent the corresponding SYN, so it treats the SYN+ACK as an unexpected packet and replies with a RST.
Q: Can a raw socket whose protocol equals IPPROTO_RAW receive packets?
A: In the receive path above, the first filter is a lookup in raw_v4_htable using a hash generated from the packet's protocol. The L4 protocol of a packet received from the NIC is always a valid value such as TCP, UDP or ICMP; we never see a packet whose protocol is IPPROTO_RAW. So for a raw socket created with protocol IPPROTO_RAW, no received packet will match. In other words, a socket created with IPPROTO_RAW is only suitable for sending and cannot receive packets.
Reference link: http://sock-raw.org/papers/sock_raw (reading this paper is basically enough to learn what you need to know about raw sockets)