Statistical Analyst: A Linux Router Traffic Statistics System

This article first outlines three common billing methods for Linux routers and then introduces Netfilter, the packet filtering framework in Linux. On that basis, it describes an efficient, low-load, and scalable traffic statistics system built around Netfilter.
  
Billing methods for Linux routers
  
As Linux systems become more stable and available, Linux-based routers are being used more and more widely. Whether the router does pure IP forwarding or relies on NAT and transparent proxying, how to account for traffic efficiently and accurately is worth studying.
  
In the Linux world there is more than one solution to this problem. Three billing methods are currently common for Linux routers:
  
1. Billing based on network monitoring at the data link layer.

A dedicated monitoring host is placed in promiscuous mode and counts the packets that pass by. The monitoring host must be on the same broadcast segment as the router. Because monitoring a high-speed Ethernet causes serious packet loss, this method is not widely used.
  
2. SNMP-based billing.
  
An SNMP agent is configured on the Linux router to collect traffic statistics. This method is widely used, but running an SNMP agent increases the load on the router and introduces related security issues.
  
3. Billing with third-party software, such as xtacacsd.
  
Linux kernel-based Netfilter technology
  
Netfilter is the packet filtering framework in the Linux kernel. It aims to provide complete packet filtering, firewalling, network address translation (NAT), and related functions. It works by inspecting packet headers and processing each packet according to configured rules, which makes it possible to control, secure, masquerade, and segment traffic. In practice, it is used by loading the kernel module iptable_filter.o and issuing iptables commands.

Netfilter provides three tables (filter, nat, and mangle). Their functions are listed in Table 1.
  
   
  
Table 1 Netfilter composition
  
Figure 1 shows how Netfilter processes, by default, a packet that enters the Linux router from a network.
  
   
  
Figure 1 Netfilter packet routing
  
1. When a packet arrives (for example from network 1), the kernel first checks its destination (the routing decision).

2. If the packet is addressed to the local machine, it moves down the diagram to the INPUT chain. If it passes that chain, any process waiting for the packet receives it.

3. Otherwise, the packet is discarded if the kernel is not allowed to forward packets or does not know how to forward this one. If forwarding is permitted and the packet is destined for another network interface (for example network 2), the packet continues down the diagram to the FORWARD chain. If the chain accepts it (ACCEPT), it is sent out.

4. Processes on the router can also send network packets, which go directly through the OUTPUT chain. If a packet is accepted (ACCEPT), it continues to the network interface that can reach its destination.
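
The four steps above can be condensed into a small decision function. The following Python sketch is only a model of the filter-table path described in the list, not kernel code; the function name `filter_chain_for` and its parameters are hypothetical and were chosen for this illustration.

```python
from typing import Optional


def filter_chain_for(locally_generated: bool,
                     destined_for_local_host: bool,
                     forwarding_enabled: bool,
                     route_known: bool) -> Optional[str]:
    """Return the built-in filter chain a packet traverses, or None if it is dropped."""
    if locally_generated:
        # Step 4: packets created by local processes leave through OUTPUT.
        return "OUTPUT"
    if destined_for_local_host:
        # Step 2: packets addressed to the router itself reach INPUT.
        return "INPUT"
    if not forwarding_enabled or not route_known:
        # Step 3: the kernel discards packets it may not or cannot forward.
        return None
    # Step 3: everything else is routed to another interface via FORWARD.
    return "FORWARD"
```

For a router that mostly forwards traffic for its intranet, nearly all billable packets take the last branch, which is why the accounting rules later in this article hang off the FORWARD chain.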
  
The nat table manages network address translation. Its PREROUTING chain holds the rules for destination NAT: because the router looks only at a packet's destination IP address when routing, destination NAT must be performed before routing so that packets are routed correctly. The POSTROUTING chain holds the rules for source NAT; the system applies them only after the packet's route has been determined. The OUTPUT chain holds the destination NAT rules for locally generated packets.

In addition to the built-in chains, Netfilter supports user-defined chains, which allow more complex and capable applications.
  
Billing rules
  
Because Netfilter is extensible, custom chains can be used to account for packets. From a network management point of view, the traffic of interest is usually the total inbound and outbound traffic plus the traffic of some basic network applications, such as FTP, WWW, SMTP, and POP. More metrics can be added as needed.

Within a single chain, iptables can account for a given packet only once, so the custom chains must be designed carefully to ensure that every target packet passes through the appropriate billing chains. First, create the user-defined chains ACC-IN, ACC-OUT, ACC-OUT-SMTP, ACC-IN-POP3, ACC-IN-WWW, and ACC-IN-FTP.

Linux routers often need IP masquerading and a transparent proxy, so their billing chains differ from those of a traditional packet forwarding router. With IP masquerading (network address translation), the first packet of a connection passes through the PREROUTING chain of the nat table, and after that all packets pass through the FORWARD chain of the filter table. NAT traffic is therefore billed on the FORWARD chain. A transparent proxy essentially redirects HTTP requests for port 80 to the proxy service process. If that process runs locally, the corresponding traffic appears in the local OUTPUT chain.
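
As a minimal sketch of that first step, the following Python snippet builds one `iptables -N` command per accounting chain. The chain names come from the text; the helper name `chain_creation_commands` is hypothetical, and the script only prints the commands (a dry run) rather than executing them as root.

```python
ACCOUNTING_CHAINS = [
    "ACC-IN", "ACC-OUT", "ACC-OUT-SMTP",
    "ACC-IN-POP3", "ACC-IN-WWW", "ACC-IN-FTP",
]


def chain_creation_commands(chains=ACCOUNTING_CHAINS, iptables="iptables"):
    """Build one `iptables -N <chain>` command (as an argument list) per chain."""
    return [[iptables, "-N", name] for name in chains]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in chain_creation_commands():
        print(" ".join(argv))
```

A user-defined chain that contains no rules simply returns to the chain that called it, so in this design the packet and byte counters of interest are the ones attached to each rule that jumps into an ACC-* chain.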
  
Figure 2 shows the order in which the built-in iptables chains and the user-defined chains are traversed when inbound traffic is counted.
  
   
  
Figure 2 Order of billing chains
  
As Figure 2 shows, every inbound IP packet traverses all the billing chains; the outbound case is similar. Whenever a packet matches a billing rule, iptables increments that rule's counters. The corresponding rules are listed in Table 2.
  
   
  
Table 2 Billing Types
  
◆ Total inbound traffic. Only traffic that enters the intranet from outside is billed, so traffic originating inside the intranet is excluded. The corresponding rule is as follows (eth0 is the intranet NIC, the intranet is 10.0.0.0/24, and *.* stands for the address being billed):

# iptables -A FORWARD -o eth0 ! -s 10.0.0.0/24 -d *.* -j ACC-IN
  
◆ FTP traffic uses ports 20 and 21. Because port 21 carries only FTP control signaling, it is enough to bill port 20, which carries the data (this covers active-mode FTP only):

# iptables -A ACC-IN -o eth0 -p tcp --source-port 20 -d *.* -j ACC-IN-FTP
  
◆ If no transparent proxy is set up for WWW traffic, billing takes place in the FORWARD chain:

# iptables -A ACC-IN -o eth0 -p tcp --source-port 80 -d *.* -j ACC-IN-WWW
  
◆ For traffic that goes through the transparent proxy, billing takes place in the OUTPUT chain (10.x is the router's intranet IP address, and X is the port number of the proxy process):

# iptables -A OUTPUT -o eth0 -p tcp -s 10.x --source-port X -d *.* -j ACC-IN-WWW
  
Other protocols are billed in a similar way. Any TCP or UDP service can be billed if its port number is known.

The actual billing rules are generated automatically with Perl. First, a list of the IP addresses to be billed is produced; the Perl program then reads that file and generates the billing rules. An example of the program follows:
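
To illustrate how the same pattern extends to another port, here is a short Python sketch that builds the rule for one service and one billed address. The function name `service_rule`, its parameters, and the example address 10.0.0.5 are hypothetical; `--sport` and `--dport` are the short forms of `--source-port` and `--destination-port`.

```python
def service_rule(parent_chain, target_chain, address, port,
                 proto="tcp", direction="in", nic="eth0"):
    """Build an iptables argument list that counts one service for one address.

    direction="in":  replies from the service port towards the billed address,
                     leaving through the intranet interface (-o).
    direction="out": requests from the billed address to the service port,
                     arriving on the intranet interface (-i).
    """
    if direction == "in":
        match = ["-o", nic, "-p", proto, "--sport", str(port), "-d", address]
    elif direction == "out":
        match = ["-i", nic, "-p", proto, "-s", address, "--dport", str(port)]
    else:
        raise ValueError("direction must be 'in' or 'out'")
    return ["iptables", "-A", parent_chain] + match + ["-j", target_chain]


if __name__ == "__main__":
    # Inbound POP3 (port 110) for a hypothetical intranet host.
    print(" ".join(service_rule("ACC-IN", "ACC-IN-POP3", "10.0.0.5", 110)))
```

Returning an argument list instead of a shell string keeps the sketch safe to pass to `subprocess.run` later without quoting problems, should the rules be applied for real.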
  
sub set_iptables_rules {
    my (@mylist) = @_;
    # $mylist[0] is the configuration file; $mylist[2] supplies the
    # iptables action that precedes the chain name (for example -A)
    open(FILE, '<', $mylist[0]) or die "cannot open $mylist[0]: $!";
    @lines = <FILE>;    # read the configuration file
    close FILE;
    foreach $address (@lines)
    {
        chomp($address);    # process the configuration file
        @address_var = split(/ /, $address);
        # Generate billing rules based on the configuration file
        `$iptables $mylist[2] FORWARD -i $local_nic -s $address_var[0] ! -d $local_net -j ACC-OUT`;
        `$iptables $mylist[2] ACC-OUT -i $local_nic -p tcp -s $address_var[0] --dport 25 -j ACC-OUT-SMTP`;
        ......
    }
}
  
The firewall rules of a Linux router are also set with iptables, mainly on the INPUT and FORWARD chains. The billing rules use the built-in FORWARD and OUTPUT chains as well, so they overlap with the firewall rules on FORWARD. The billing rules on the FORWARD chain must therefore come before the firewall rules so that both sets of rules can work together.
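
One way to keep that ordering is to insert accounting rules at position 1 with `iptables -I` instead of appending them with `-A`. The Python sketch below builds such a command and checks an ordered list of rule targets; both helper names are hypothetical, and treating every target that starts with `ACC-` as an accounting jump is an assumption based on the chain names used in this article.

```python
def prepend_accounting_rule(match_args, target_chain, chain="FORWARD"):
    """Build an `iptables -I <chain> 1 ...` command that places the rule first."""
    return ["iptables", "-I", chain, "1"] + list(match_args) + ["-j", target_chain]


def accounting_rules_first(targets, prefix="ACC-"):
    """Return True if every accounting jump precedes all other rules in the chain."""
    seen_other = False
    for target in targets:
        if target.startswith(prefix):
            if seen_other:
                return False  # an accounting rule sits behind a firewall rule
        else:
            seen_other = True
    return True
```

Because a jump into an accounting chain without a terminating target returns to FORWARD, the firewall rules that follow still see every packet after it has been counted.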
  
System structure
  
In the overall traffic billing system, packets are counted as they flow through the Linux router, and the statistics are sent to a database. Users query their traffic through a front-end interface, which is also used to monitor and manage the billing rules.
  
   
  
Figure 3 System structure
  
System background program
  
The counters in iptables provide only a simple traffic display. Running iptables -nvxL gives more detailed traffic data, for example:
  
Chain FORWARD (policy ACCEPT 28951 packets, 18212425 bytes)
    pkts      bytes target     prot opt in     out     source               destination
     116       8465 ACC-OUT    all  --  eth0   *       10.2.229.2          !10.0.0.0/8
     134      59582 ACC-IN     all  --  *      eth0   !10.0.0.0/8           10.2.229.2
  
A practical billing system needs database support and a friendly query interface, so the data reported by iptables must be processed in the background and loaded into the database. Regular expressions in Perl can pick the required fields out of the iptables output. The traffic count and IP address must be extracted from each row, so each rule line is matched against a regular expression:
  
for ($i = 0; $i <= $count; $i++)
{
    # Fields: pkts, (bytes), (target ... out interface), (negated source/mask), (destination IP)
    $lines_in[$i] =~ /^\s*\d+\s+(\d+)\s+(\D+\d)\s+(\W\d+\.\d+\.\d+\.\d+\/\d+)\s+(\d+\.\d+\.\d+\.\d+).*$/;
    $ac_ip = $4;
    $ac_in = $1;    # read IP addresses and total traffic
    ......
    # Input traffic records to the database
    $something = $dbh->do("INSERT INTO $mysql_table_name
        (date, ip, acc_in, acc_out, acc_in_ftp, acc_in_pop3, acc_in_www, acc_out_smtp)
        VALUES ('$date', '$ac_ip', '$ac_in', '$ac_out',
        '$ac_in_ftp', '$ac_in_pop3', '$ac_in_www', '$ac_out_smtp')");
}
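
For readers who want to try the extraction step outside Perl, the following Python sketch parses the sample listing shown earlier. It is an illustration under stated assumptions, not the author's program: the names `RULE_LINE` and `parse_counters` are hypothetical, and the choice to bill ACC-IN bytes to the destination address and ACC-OUT bytes to the source address is inferred from the sample output.

```python
import re

# One rule line of `iptables -nvxL`:
# pkts bytes target prot opt in out source destination
RULE_LINE = re.compile(
    r"^\s*(?P<pkts>\d+)\s+(?P<bytes>\d+)\s+(?P<target>\S+)\s+(?P<prot>\S+)\s+"
    r"(?P<opt>\S+)\s+(?P<in_if>\S+)\s+(?P<out_if>\S+)\s+"
    r"(?P<source>!?\s*\S+)\s+(?P<destination>!?\s*\S+)"
)

SAMPLE = """\
Chain FORWARD (policy ACCEPT 28951 packets, 18212425 bytes)
    pkts      bytes target     prot opt in     out     source               destination
     116       8465 ACC-OUT    all  --  eth0   *       10.2.229.2          !10.0.0.0/8
     134      59582 ACC-IN     all  --  *      eth0   !10.0.0.0/8           10.2.229.2
"""


def parse_counters(listing):
    """Return {(target, billed_ip): byte_count} for ACC-IN and ACC-OUT rule lines."""
    totals = {}
    for line in listing.splitlines():
        m = RULE_LINE.match(line)
        if not m:
            continue  # chain header, column header, or blank line
        target = m.group("target")
        if target == "ACC-IN":
            ip = m.group("destination")   # inbound: billed host is the destination
        elif target == "ACC-OUT":
            ip = m.group("source")        # outbound: billed host is the source
        else:
            continue
        key = (target, ip)
        totals[key] = totals.get(key, 0) + int(m.group("bytes"))
    return totals


if __name__ == "__main__":
    for (target, ip), nbytes in sorted(parse_counters(SAMPLE).items()):
        print(target, ip, nbytes)
```

Named groups make it easier to see which column each value comes from than the positional `$1` and `$4` captures, at the cost of a longer pattern.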
  
The billing cycle can be chosen as needed, but storage capacity and efficiency must be taken into account. Here the billing cycle is 24 hours: a crontab entry runs the background program once a day to load the traffic data into the database. After the data has been recorded, the traffic counters are cleared:

# iptables -Z FORWARD
  
Because the billing background program is written in Perl, it is compact and efficient. Compared with SNMP-based billing, this method is simple and customizable, and because it runs only once a day it places almost no additional load on the Linux router.
  
Database and front-end implementation
  
Because the Linux router forwards IP packets for a large number of users, its load is heavy. To avoid affecting its performance, the database is not hosted on the router. Instead, multiple routers share one back-end MySQL database.
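
The record that each router writes to the shared database can be sketched as follows. This example uses Python's built-in `sqlite3` module purely as a stand-in for MySQL so that it runs without a server; the table name `traffic` and the helper names are hypothetical, while the column names follow the INSERT statement shown earlier. Values are passed as bound parameters instead of being interpolated into the SQL string.

```python
import sqlite3

COLUMNS = ("date", "ip", "acc_in", "acc_out", "acc_in_ftp",
           "acc_in_pop3", "acc_in_www", "acc_out_smtp")


def create_table(conn, table="traffic"):
    """Create the accounting table if it does not exist yet."""
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "date TEXT, ip TEXT, acc_in INTEGER, acc_out INTEGER, "
        "acc_in_ftp INTEGER, acc_in_pop3 INTEGER, "
        "acc_in_www INTEGER, acc_out_smtp INTEGER)"
    )


def insert_record(conn, record, table="traffic"):
    """Insert one day's counters for one IP address using bound parameters."""
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.execute(
        f"INSERT INTO {table} ({', '.join(COLUMNS)}) VALUES ({placeholders})",
        [record[c] for c in COLUMNS],
    )
    conn.commit()
```

With a MySQL driver the connection call and the placeholder style would differ (commonly `%s` instead of `?`), so this should be read as the shape of the operation, not as drop-in code for the system described here.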
  
The front-end query interface can be written in PHP. It lets users query daily, monthly, and annual traffic, can also be used to monitor and manage billing policies, and provides four display methods.
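
The daily, monthly, and annual queries behind such a front end reduce to one grouped sum. The sketch below shows that query logic in Python with `sqlite3` as a stand-in database; it is not the PHP front end itself. The function name `traffic_summary`, the table name `traffic`, and the assumption that dates are stored as ISO strings (YYYY-MM-DD) are all hypothetical.

```python
import sqlite3


def traffic_summary(conn, ip, period="month", table="traffic"):
    """Sum inbound and outbound bytes for one IP, grouped by day, month, or year.

    Assumes the date column holds ISO strings such as 2024-01-31, so a prefix
    of 10, 7, or 4 characters identifies the day, month, or year.
    """
    prefix_len = {"day": 10, "month": 7, "year": 4}[period]
    rows = conn.execute(
        f"SELECT substr(date, 1, ?) AS period, SUM(acc_in), SUM(acc_out) "
        f"FROM {table} WHERE ip = ? GROUP BY period ORDER BY period",
        (prefix_len, ip),
    )
    return rows.fetchall()
```

Storing one row per address per day, as the 24-hour billing cycle does, keeps these aggregates cheap: a yearly report for one address touches at most a few hundred rows.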
