Broadcast in OpenVPN and an in-depth look at tun and tap devices

Last Update:2018-12-04 Source: Internet

Author: User



Does broadcast traffic pass through OpenVPN or not? A tun device works at layer 3. Although the IP addresses at both ends of a tun tunnel are in the same subnet, the two ends do not share a layer-2 segment, so broadcast is not possible; a tap device, by contrast, can carry broadcast. Because of the peculiarities of the Windows virtual NIC driver, OpenVPN and that driver do some special and fairly complex processing so that Windows machines can join the VPN. This article goes through the details. (Note: this article does not introduce OpenVPN terminology such as routing mode and bridge mode. For that, see the OpenVPN documentation or FAQ.)
How should we understand the statement that a tun device establishes a "point-to-point" link? A tun tunnel is a layer-3 tunnel with no layer-2 link at all, let alone a layer-2 broadcast link. The data link layer has two communication modes: point-to-point, as in PPP, and broadcast, as in Ethernet. A tunnel built on a tun device has exactly two endpoints, and what it encapsulates is IP datagrams. Even supposing ARP were needed to locate the MAC address of the peer tun device, when n machines join one virtual network and all belong to the same network segment, the other machines would never receive that ARP packet, because there is no layer-2 link to broadcast and forward it.

static inline unsigned int
mroute_extract_addr_from_packet(struct mroute_addr *src,
                                struct mroute_addr *dest,
                                struct mroute_addr *esrc,
                                struct mroute_addr *edest,
                                const struct buffer *buf,
                                int tunnel_type)
{
    ...
    unsigned int ret = 0;
    verify_align_4(buf);
    if (tunnel_type == DEV_TYPE_TUN)      // tun mode: the IPv4 header is parsed directly; broadcast is not handled, but IP multicast is
        ret = mroute_extract_addr_ipv4(src, dest, buf);
    else if (tunnel_type == DEV_TYPE_TAP) // layer-2 addresses are parsed only in tap mode
        ret = mroute_extract_addr_ether(src, dest, esrc, edest, buf);
    return ret;
}
In mroute_extract_addr_ether, the following check is made:
    if (is_mac_mcast_addr(eth->dest))
        ret |= MROUTE_EXTRACT_BCAST;
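The difference between the two branches can be illustrated with a small standalone sketch. It is not OpenVPN's code: the names classify_frame, mac_is_group, KIND_TUN, KIND_TAP, EXTRACT_OK and EXTRACT_BCAST are made up for illustration, and it assumes an untagged Ethernet header on tap and an IPv4 packet on tun. It shows that a broadcast flag can only come from the tap branch, because only there is a destination MAC available to test.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not OpenVPN's definitions. */
enum tunnel_kind { KIND_TUN, KIND_TAP };

#define EXTRACT_OK    (1u << 0)
#define EXTRACT_BCAST (1u << 1)

/* An Ethernet destination is a group (multicast or broadcast) address
 * when the least significant bit of its first octet is set. */
static int mac_is_group(const uint8_t *mac)
{
    return mac[0] & 1;
}

/* Classify one frame read from the virtual device.
 * tun: the buffer starts with an IPv4 header, so there is no MAC to test.
 * tap: the buffer starts with a 14-byte Ethernet header. */
static unsigned int classify_frame(enum tunnel_kind kind,
                                   const uint8_t *buf, size_t len)
{
    unsigned int flags = 0;

    if (kind == KIND_TUN) {
        if (len >= 20 && (buf[0] >> 4) == 4)   /* minimal IPv4 header */
            flags |= EXTRACT_OK;
    } else if (kind == KIND_TAP) {
        if (len >= 14) {
            flags |= EXTRACT_OK;
            if (mac_is_group(buf))             /* destination MAC is bytes 0..5 */
                flags |= EXTRACT_BCAST;
        }
    }
    return flags;
}
```

A tap frame addressed to ff:ff:ff:ff:ff:ff yields EXTRACT_OK | EXTRACT_BCAST, while a tun packet can yield EXTRACT_OK at most.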
Later code uses the MROUTE_EXTRACT_BCAST flag to decide whether to broadcast the data. Note that the broadcast is not performed in the tun/tap driver. In tap mode the OpenVPN process can be regarded as a layer-2 switch whose physical ports are the SSL side and the tap device. In one direction, an Ethernet frame is read from the SSL side and decrypted (loosely speaking an ssl_read, although OpenVPN does not call that libssl interface of OpenSSL) and then enters OpenVPN to be broadcast. One copy is delivered locally, that is, written to the local tap device. The copies broadcast to other machines still have to go through the tunnel: the data is encapsulated with the SSL protocol and sent through the socket. This sending is handled by multi_bcast. In fact multi_bcast does not send anything itself; it puts the data, together with its destination information, into a container, and the container is processed in one pass when the time is ripe. "When the time is ripe" means when multi_process_outgoing_link is called: multi_process_outgoing_link --> multi_get_queue. As the comments on multi_get_queue mention, this container holds not only broadcast data but also client-to-client data and multicast data.

Now for client-to-client. The following description is incorrect:
Client-to-client works with the server acting as a router. All client-to-client traffic must be relayed through the server. When a packet arrives at the server, it first travels up through ethX to the application layer, where the SSL encapsulation is removed. The server then writes the bare IP packet to tun0, and routing finds that the destination is an address in the VPN's virtual private network segment. The packet is therefore sent out of tun0 again and read back by OpenVPN. At this point the server checks whether client-to-client is set. If it is not set and the virtual private IP address just found is not the server's own, the packet is client-to-client traffic and is discarded; otherwise the packet is written to the SSL connection of the real IP that corresponds to the destination virtual private IP. How is that connection looked up, given that a server can have many clients? There are two methods. One is to configure a separate tun virtual NIC for each client and distinguish them by routing. The other is to solve the problem inside OpenVPN: data coming out of the virtual NIC is a datagram with a standard IP header, OpenVPN reads it through the character device, and by reading the IP header it can obtain the destination address and so find the right SSL connection. Comparing the two, the first has little impact on efficiency and allows fast forwarding, but is complicated to manage. The second makes the decision at a single point, which is simple and safe to manage, but parsing packets at the application layer has a significant performance cost; concurrency could be considered to offset it.
The description above is wrong. OpenVPN does not implement client-to-client in such a complicated way. Reading the OpenVPN source shows that the implementation is very simple, basically a forwarder. Consider the following flow:
while (true) {
    ...
    multi_process_io_udp(&multi);
    ...
}
static void multi_process_io_udp(struct multi_context *m)
{
    ...
    if (status & SOCKET_WRITE)       // write to the socket
        multi_process_outgoing_link(m, mpp_flags);
    else if (status & TUN_WRITE)     // write to the virtual NIC character device
        multi_process_outgoing_tun(m, mpp_flags);
    else if (status & SOCKET_READ) { // read from the socket
        read_incoming_link(&m->top);
        multi_release_io_lock(m);
        if (!IS_SIG(&m->top))
            multi_process_incoming_link(m, NULL, mpp_flags);
    }
    else if (status & TUN_READ) {    // read from the virtual NIC character device
        read_incoming_tun(&m->top);
        multi_release_io_lock(m);
        if (!IS_SIG(&m->top))
            multi_process_incoming_tun(m, mpp_flags);
    }
}
Here multi_process_outgoing_link is the socket write operation; SSL encapsulation is of course applied before the actual write. Following this function, you will find that the data written to the socket comes from a queue, the one processed in multi_get_queue. So the question is who puts data into that queue. Since OpenVPN's logic is the multi_process_io_udp shown above, it is evidently multi_process_incoming_tun that puts the data into the queue. multi_process_incoming_tun eventually calls mroute_extract_addr_from_packet, which ties back to the broadcast question at the start of this article. Overall, multi_process_io_udp gives OpenVPN the following two channels:
1. From tun/tap --> to socket
2. From socket --> to tun/tap
If only these two channels existed, client-to-client communication would work as in the incorrect description above. But OpenVPN also provides a third channel:
3. From socket --> to socket
It appears in the following call path:
multi_process_incoming_link:
    if (BLEN(&c->c2.buf) > 0) {
        /* After SSL decapsulation, c->c2.to_tun.len is set to the length of
         * the data to be written to tun/tap; by default the data is written
         * to the virtual NIC device. The logic below may reset
         * c->c2.to_tun.len to 0, which means the data has already been
         * handled. For client-to-client traffic, for example, there is no
         * need to write the data to the virtual NIC, which also confirms
         * that the description above was wrong. Just as there is a to_tun,
         * there is a c->c2.to_link, which holds data read from tun/tap that
         * is to be written to the link, that is, the socket. */
        process_incoming_link(c);
        if (TUNNEL_TYPE(m->top.c1.tuntap) == DEV_TYPE_TUN) {
            mroute_flags = mroute_extract_addr_from_packet(...);
            ...
            else if (m->enable_c2c) { // client-to-client is enabled
                if (mroute_flags & MROUTE_EXTRACT_MCAST) ... // multicast
                else {
                    mi = multi_get_instance_by_virtual_addr(m, &dest, true); // acting as the "router", the server looks up the target client
                    if (mi) {
                        multi_unicast(m, &c->c2.to_tun, mi); // "unicast" really just queues the data; the server relays between source and target client
                        register_activity(c, BLEN(&c->c2.to_tun));
                        c->c2.to_tun.len = 0; // the data has been handled and need not go to tun
                    }
                }
            }
    ...
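The socket-to-socket channel and the deferred queue can be sketched in a few lines of standalone C. This is only a toy model under stated assumptions: struct server, struct client, lookup_by_virtual_ip and handle_incoming_link are hypothetical names, the queue is a fixed array, and addresses are plain 32-bit integers. It mirrors two ideas from the text: the target client is looked up by virtual address, and a return value of 0 plays the role of setting to_tun.len to 0.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CLIENTS 4
#define MAX_QUEUE   8
#define MAX_PKT     64

/* All names here are hypothetical; they only mirror the roles in the text. */
struct client { uint32_t virtual_ip; int socket_id; };

struct queued_pkt { int socket_id; size_t len; uint8_t data[MAX_PKT]; };

struct server {
    struct client clients[MAX_CLIENTS];
    size_t n_clients;
    struct queued_pkt queue[MAX_QUEUE];  /* drained later by the socket writer */
    size_t n_queued;
    int enable_c2c;
};

/* Find the client that owns a virtual address, or NULL. */
static const struct client *lookup_by_virtual_ip(const struct server *s,
                                                 uint32_t ip)
{
    for (size_t i = 0; i < s->n_clients; i++)
        if (s->clients[i].virtual_ip == ip)
            return &s->clients[i];
    return NULL;
}

/* Handle a decrypted packet that arrived from a client socket.
 * Returns the number of bytes that still have to be written to tun:
 * 0 means the packet was queued for another client (socket to socket). */
static size_t handle_incoming_link(struct server *s, uint32_t dest_ip,
                                   const uint8_t *pkt, size_t len)
{
    if (s->enable_c2c && len <= MAX_PKT && s->n_queued < MAX_QUEUE) {
        const struct client *peer = lookup_by_virtual_ip(s, dest_ip);
        if (peer) {
            struct queued_pkt *q = &s->queue[s->n_queued++];
            q->socket_id = peer->socket_id;
            q->len = len;
            memcpy(q->data, pkt, len);
            return 0;           /* handled: nothing left for tun */
        }
    }
    return len;                 /* default path: write to tun */
}
```

A packet whose destination matches a connected client never touches the virtual NIC in this model; anything else falls through to the tun write, where ordinary routing takes over.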
So client-to-client traffic is handled inside the multi_process_incoming_link path; it does not have to be written to the tun/tap device and then come back from the tun/tap device through routing. The same call path also shows why a tun-based tunnel does not support broadcast: the MROUTE_EXTRACT_BCAST flag is set only in mroute_extract_addr_ether, and that function is called only in tap mode. ARP gets special handling only inside mroute_extract_addr_ether, in tap mode, and only when the packet filter compile-time macro is enabled. When that macro is not enabled, ARP in tap mode simply travels through the tap tunnel like any other frame, since ARP is a form of link-layer broadcast. This raises another question: in tun mode, how is the peer address found? The answer lies in the Linux kernel tun driver:
static void tun_net_init(struct net_device *dev)
{
    struct tun_struct *tun = netdev_priv(dev);
    switch (tun->flags & TUN_TYPE_MASK) {
    case TUN_TUN_DEV: // point-to-point setup for the tun device
        dev->hard_header_len = 0;
        dev->addr_len = 0;
        dev->mtu = 1500;
        dev->type = ARPHRD_NONE; // no ARP: the link is point-to-point, routed packets are sent straight out
        dev->flags = IFF_POINTOPOINT | IFF_NOARP | IFF_MULTICAST;
        dev->tx_queue_len = 10;
        break;
    case TUN_TAP_DEV:
        dev->set_multicast_list = tun_net_mclist;
        *(u16 *)dev->dev_addr = htons(0x00FF);
        get_random_bytes(dev->dev_addr + sizeof(u16), 4);
        ether_setup(dev);
        break;
    }
}
This brings us back to the original question. A tun device has no link layer and is point-to-point. Addressing from client to server is done by the tunnel itself; although both ends sit in one IP segment, they do not rely on ARP. ARP looks up link-layer addresses, so what would it look up on a tunnel with no link layer? The Windows tap device behaves differently, however. The Windows tap driver does not configure the NIC according to tun or tap mode the way the Linux tun driver does. Therefore an IP address that does not really exist has to be configured to stand for the peer, and that nonexistent address is used as the gateway. All data sent to this gateway in fact leaves through the virtual NIC and so enters the tunnel. This is where the net30 topology comes from.
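As a rough illustration of net30, each client can be pictured as owning a /30 block of four addresses, two of which are usable: one for the client's adapter and one for the nonexistent peer that serves as the gateway. The sketch below is an assumption, not taken from the OpenVPN source: struct net30_block and net30_block_at are hypothetical names, and the choice of which usable address is the local one follows commonly seen client configurations.

```c
#include <stdint.h>

/* One /30 block as used by a net30-style layout (hypothetical names). */
struct net30_block {
    uint32_t network;    /* first address of the block, not used by a host */
    uint32_t peer;       /* virtual "remote end", used only as a gateway */
    uint32_t local;      /* address configured on the client's adapter */
    uint32_t broadcast;  /* last address of the block, not used by a host */
};

/* Compute the n-th /30 block of a pool that starts at pool_base.
 * Addresses are in host byte order. The peer/local ordering is an
 * assumption that matches typical client blocks (n >= 1). */
static struct net30_block net30_block_at(uint32_t pool_base, uint32_t n)
{
    struct net30_block b;

    b.network   = pool_base + 4u * n;
    b.peer      = b.network + 1;
    b.local     = b.network + 2;
    b.broadcast = b.network + 3;
    return b;
}
```

For a pool starting at 10.8.0.0, block 1 covers 10.8.0.4 to 10.8.0.7, giving a client address of 10.8.0.6 with 10.8.0.5 as the stand-in peer.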
The tap-win32 driver for Windows always emits frames with Ethernet headers (you can confirm this by capturing packets or reading the code), so tap-win32 does not really support point-to-point IP links. Real point-to-point links are generally found on dedicated lines, for example with SLIP (Serial Line Internet Protocol, comparable in role to HDLC). Strictly speaking, such a link is not without a link layer; its link layer is just very simple, and it certainly has no ARP or broadcast mechanism for multi-point addressing. Windows machines are mostly personal computers, PCs generally use Ethernet, and point-to-point links are rare on them, so the Windows virtual NIC is essentially a virtual Ethernet adapter, as the name tap-win32 suggests. Here is the part of the tap-win32 driver that completes a read I/O request:
CompleteIRP:
    if (p_PacketBuffer->m_SizeFlags & TP_TUN) { // in tun mode the Ethernet header is not passed up to user space
        offset = ETHERNET_HEADER_SIZE;
        len = (int) (p_PacketBuffer->m_SizeFlags & TP_SIZE_MASK) - ETHERNET_HEADER_SIZE;
    } else {
        offset = 0;
        len = (p_PacketBuffer->m_SizeFlags & TP_SIZE_MASK);
    }
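The offset and length logic above can be restated as a small standalone function. This is a sketch with hypothetical names (user_visible_span, struct user_view, ETH_HDR_SIZE), not driver code, and it assumes the standard 14-byte untagged Ethernet header. It adds a length check that the excerpt does not show.

```c
#include <stddef.h>

#define ETH_HDR_SIZE 14  /* dst MAC (6) + src MAC (6) + EtherType (2) */

struct user_view { size_t offset; size_t len; };

/* Decide which part of a queued Ethernet frame user space gets to read.
 * In tun mode the Ethernet header is skipped so the reader sees a bare
 * IP packet; in tap mode the whole frame is returned.
 * Returns 0 on success, -1 if a tun-mode frame is shorter than a header. */
static int user_visible_span(int tun_mode, size_t frame_len,
                             struct user_view *out)
{
    if (tun_mode) {
        if (frame_len < ETH_HDR_SIZE)
            return -1;
        out->offset = ETH_HDR_SIZE;
        out->len = frame_len - ETH_HDR_SIZE;
    } else {
        out->offset = 0;
        out->len = frame_len;
    }
    return 0;
}
```

For a 60-byte frame, tun mode exposes bytes 14 to 59 (46 bytes) and tap mode exposes all 60.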
As this shows, the virtual NIC on Windows currently has no concept of a true point-to-point link (whether anyone will develop one in the future, I do not know). It follows the old mechanism and sends ARP for Ethernet addressing. On non-Windows systems running OpenVPN in tun mode, ARP is not needed and is never sent. On Windows, someone does answer the ARP request, so who? Since the whole thing comes from the tap-win32 driver, do not look for the ARP reply in the OpenVPN code; look in the tap-win32 driver itself. To repeat: OpenVPN's tun mode has no link layer, and to be compatible with Windows the net30 topology was devised to simulate one. In tun mode, although all IP addresses are in one network segment, communication between them does not go over a shared link layer (with ARP, as on Ethernet) but over a point-to-point link between each client and the server. In tap mode there obviously is a link layer, and it is Ethernet, so ARP is carried between client and server. Back to the tap-win32 driver: how is ARP sent, and how is the reply received?
DriverEntry:
    l_Properties->SendHandler = AdapterTransmit;
NDIS_STATUS
AdapterTransmit(IN NDIS_HANDLE p_AdapterContext,
                IN PNDIS_PACKET p_Packet,
                IN UINT p_Flags)
{
    ...
    if (l_Adapter->m_tun) {
        ETH_HEADER *e;
        if (l_PacketLength < ETHERNET_HEADER_SIZE)
            goto no_queue;

        e = (ETH_HEADER *) l_PacketBuffer->m_Data;
        switch (ntohs(e->proto)) {
        case ETH_P_ARP:
            ... // tap-win32 has to behave like a standard Ethernet card (in effect it is one),
                // so ARP must be handled; but ARP is meaningless in tun mode, so the driver
                // answers its own request.
            ProcessARP(l_Adapter,
                       (PARP_PACKET) l_PacketBuffer->m_Data,
                       l_Adapter->m_localIP,
                       l_Adapter->m_remoteNetwork,
                       l_Adapter->m_remoteNetmask,
                       l_Adapter->m_TapToUser.dest); // the adapter answers its own ARP request
        default:
            goto no_queue;
        case ETH_P_IP:
            ...
        }
    }
    if (IS_UP(l_Adapter)) // push the packet onto the read queue, from which user space reads it with ReadFile
        result = QueuePush(l_Adapter->m_Extension.m_PacketQueue, l_PacketBuffer);
    ...
}
BOOLEAN
ProcessARP(TapAdapterPointer p_Adapter,
           const PARP_PACKET src,
           const IPADDR adapter_ip,
           const IPADDR ip_network,
           const IPADDR ip_netmask,
           const MACADDR mac)
{
    if (src->m_Proto == htons(ETH_P_ARP)
        && MAC_EQUAL(src->m_MAC_Source, p_Adapter->m_MAC)
        ... // check whether this is an ARP request the adapter should answer
        && src->m_ARP_IP_Destination != adapter_ip) {
        ARP_PACKET *arp = (ARP_PACKET *) MemAlloc(sizeof(ARP_PACKET), TRUE);
        if (arp) {
            // Initialize ARP reply fields
            arp->m_Proto = htons(ETH_P_ARP);
            arp->m_MAC_AddressType = htons(MAC_ADDR_TYPE);
            arp->m_PROTO_AddressType = htons(ETH_P_IP);
            arp->m_MAC_AddressSize = sizeof(MACADDR);
            arp->m_PROTO_AddressSize = sizeof(IPADDR);
            arp->m_ARP_Operation = htons(ARP_REPLY);
            // ARP addresses
            COPY_MAC(arp->m_MAC_Source, mac); // mac is p_Adapter->m_MAC with its third byte increased by 1; it stands for the "remote end"
            COPY_MAC(arp->m_MAC_Destination, p_Adapter->m_MAC); // the reply from the "remote end" is addressed to p_Adapter->m_MAC, that is, the adapter itself
            COPY_MAC(arp->m_ARP_MAC_Source, mac);
            COPY_MAC(arp->m_ARP_MAC_Destination, p_Adapter->m_MAC);
            arp->m_ARP_IP_Source = src->m_ARP_IP_Destination;
            arp->m_ARP_IP_Destination = adapter_ip;
            InjectPacket(p_Adapter, (UCHAR *) arp, sizeof(ARP_PACKET)); // simulate receiving the ARP reply frame
            MemFree(arp, sizeof(ARP_PACKET));
        }
        return TRUE;
    }
    else
        return FALSE;
}
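The self-answering trick can be reduced to a portable sketch. The following is a simplified model rather than the driver's code: struct arp_msg, related_mac and self_answer_arp are hypothetical names, IP addresses are kept in host byte order, and the request checks (sender is the adapter, target lies in the remote network, target is not the adapter) are assumptions standing in for the conditions elided in the excerpt.

```c
#include <stdint.h>
#include <string.h>

/* Simplified ARP-over-Ethernet fields; not the driver's ARP_PACKET layout. */
struct arp_msg {
    uint8_t  eth_dst[6], eth_src[6];
    uint16_t operation;            /* 1 = request, 2 = reply */
    uint8_t  sender_mac[6];
    uint32_t sender_ip;
    uint8_t  target_mac[6];
    uint32_t target_ip;
};

enum { ARP_OP_REQUEST = 1, ARP_OP_REPLY = 2 };

/* Derive a "remote" MAC from the adapter's own by bumping the third byte. */
static void related_mac(uint8_t out[6], const uint8_t own[6])
{
    memcpy(out, own, 6);
    out[2] = (uint8_t)(out[2] + 1);
}

/* If req is the adapter asking "who has X?" for an X inside the remote
 * network (and X is not the adapter itself), fill reply and return 1. */
static int self_answer_arp(const struct arp_msg *req, const uint8_t own_mac[6],
                           uint32_t own_ip, uint32_t net, uint32_t mask,
                           struct arp_msg *reply)
{
    uint8_t fake[6];

    if (req->operation != ARP_OP_REQUEST
        || memcmp(req->sender_mac, own_mac, 6) != 0
        || req->sender_ip != own_ip
        || (req->target_ip & mask) != net
        || req->target_ip == own_ip)
        return 0;

    related_mac(fake, own_mac);
    memset(reply, 0, sizeof *reply);
    memcpy(reply->eth_src, fake, 6);       /* pretend the peer sent it */
    memcpy(reply->eth_dst, own_mac, 6);    /* addressed back to the adapter */
    reply->operation = ARP_OP_REPLY;
    memcpy(reply->sender_mac, fake, 6);
    reply->sender_ip = req->target_ip;     /* "I am the address you asked for" */
    memcpy(reply->target_mac, own_mac, 6);
    reply->target_ip = own_ip;
    return 1;
}
```

In the driver the reply is then handed back to the network stack as if it had arrived from the wire, so the operating system fills its ARP cache with the fabricated MAC and starts sending IP traffic to the gateway address.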
Finally, InjectPacket calls NdisMEthIndicateReceive and NdisMEthIndicateReceiveComplete so that the virtual NIC believes it has received data from the physical link. To sum up: in tun mode, a Linux or UNIX system never sends ARP, and ARP sent by Windows never reaches the OpenVPN process because it is answered directly inside tap-win32; this is the basis for topologies such as net30. As for broadcast, tun mode does not carry it, and even the Windows tap-win32 driver does not pass ARP broadcasts up to OpenVPN in user space. In the end, OpenVPN is simple and its implementation is very symmetric: basically, it is a forwarder between tun and link.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
