Self-Built CDN Defense Against DDoS Attacks
Self-Built CDN Defense Against DDoS Attacks (1): Building a Sustainable Line of Defense
Preface
This material was shared at the OWASP Hangzhou security salon at the end of 2013; we have reorganized the overall content of that talk into this written version.
The DDoS cases and response experience in this article come from the production environment of a customer service system with a large market share. We analyze the costs, efficiency, and concrete architecture design (selection, configuration, and optimization) of using a self-built CDN to cope with different types of DDoS attacks.
Background
The customer service system's main business is real-time text chat embedded in Web pages, used in online product sales, online customer support, and similar fields. It has about 580,000 registered users and roughly 120,000 concurrently active users per day.
These fields tend to be fiercely competitive, and they include gray-market and high-margin businesses that cannot operate in the open, so DDoS attacks between competitors are frequent. The marketing websites themselves are usually already accelerated, and their promotions are highly time-sensitive, making them hard to take down directly. Some clever attackers therefore target the site's online customer service system instead: if visitors cannot reach an agent, no transactions can be completed, which achieves the same malicious goal. As a result the customer service system, originally there to support website marketing, became the primary target of attacks. Unfair as that is, it had to face the challenge.
The DDoS attacks we encounter fall into two categories: slow CC attacks and fatal high-volume attacks. The following sections describe the characteristics of each, our defense ideas, and some of our defense solutions.
Slow CC attacks
Attack features
Attackers use a large number of proxy server IP addresses and attack tools to generate seemingly legitimate requests against the target host.
This type of attack is cheap for the attacker, and plenty of ready-made tools are available online. The style is relatively "slow and steady": the goal is to pile up junk requests that consume the server's normal application resources (CPU, memory, NIC pressure, even network congestion) until requests go unanswered and no outbound traffic is produced. The website slows down and eventually becomes inaccessible.
Defense ideas
Two characteristics of such malicious CC attacks can be turned against them; the key is to respond quickly.
First, because the flood of illegitimate requests is generated artificially, inbound traffic rises abnormally (under normal conditions inbound traffic is small and outbound traffic is large). Second, the attack ramps up gradually. We need to make full use of that precious window: the machine should react automatically at the first sign and invoke a log analysis script for decision-making, so that we can block or divert the traffic.
There are multiple approaches; here are the two we use:
1. Use the traffic monitoring graph of monitoring software to trigger the log analysis script (zabbix, for example).
2. Use a bash script to track inbound traffic and, when an anomaly is found, call the corresponding log analysis script to block the attack:
#!/bin/bash
DEV=$1                              # NIC to monitor
LIMIT=$2                            # trigger threshold
WARN=$3                             # alarm threshold
TIME=$4                             # sampling interval in seconds
mobile_num="13xxxxxxxxxx"           # phone number for SMS alerts
LOCK="/tmp/.exchange_proxy.lock"

[ -z "$DEV" ]   && echo "Usage: $0 ethX limit_band(kbps) warn_limit(kbps) seconds" && exit 0
[ -z "$LIMIT" ] && LIMIT=800000     # default 800000 kbps
[ -z "$WARN" ]  && WARN=900000      # default 900000 kbps
[ -z "$TIME" ]  && TIME=10          # default: sample every 10 s

send_fetion() {
    :   # Fetion SMS gateway interface goes here
}

while :; do
    # Line 8 of ifconfig output holds the RX byte counter
    net_flood=$(ifconfig $DEV | sed -n '8p')
    rx_before=$(echo $net_flood | awk '{print $2}' | cut -c7-)
    sleep $TIME
    net_flood=$(ifconfig $DEV | sed -n '8p')
    rx_after=$(echo $net_flood | awk '{print $2}' | cut -c7-)
    rx_result=$(( (rx_after - rx_before) / TIME ))
    over_bw=$(( rx_result - LIMIT ))
    if [ $over_bw -gt 0 ]; then
        BOOL=$(echo "$rx_result > $WARN" | bc)   # 1 means we treat it as an attack
        if [ $BOOL -eq 1 ]; then
            # Confirmed attack: execute the blocking policy and send an SMS
            STR="Attack: ${rx_result} inbound on $DEV"
            send_fetion $mobile_num "$STR"
        else
            # Over the limit but below the attack threshold: alert only
            STR="Warning: ${rx_result} inbound on $DEV"
            send_fetion $mobile_num "$STR"
        fi
    fi
    sleep $TIME
done
The filter script enables the log analysis mechanism on the server to identify abnormal IP addresses, user agents, URLs, or other signatures as early as possible. At the kernel layer, iptables filters out malicious IP addresses; at the application layer, nginx filters on HTTP keywords and directly returns code 444 to intercept the request.
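As a sketch of that log-analysis step (the helper name, log format assumption, and threshold below are our own illustration, not the article's actual script), a small bash function can pull out the source IPs whose request counts exceed a threshold, ready to be handed to iptables or an nginx deny list:

```shell
#!/bin/bash
# Sketch (hypothetical helper): print client IPs whose request count in an
# access log exceeds a threshold. Assumes common/combined log format, where
# field 1 is the client IP.
top_ips() {
    local log="$1" threshold="$2"
    awk '{print $1}' "$log" | sort | uniq -c | sort -rn |
        awk -v t="$threshold" '$1 > t {print $2}'
}
```

A caller might then feed the output into the kernel-layer filter, e.g. `for ip in $(top_ips /dev/shm/access.log 100); do iptables -I INPUT -s "$ip" -j DROP; done`.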
Disadvantages
Whether at the kernel or application layer, this approach leans heavily on the server's own CPU and memory. iptables filtering puts significant CPU pressure on the server; once more than about 15,000 IP addresses are blocked, the server becomes essentially unusable. When nginx blocks HTTP requests, it allocates memory and walks its rule chain for every request, so memory is gradually exhausted. As traffic grows and the attack continues, the NIC comes under heavy pressure and resources are eventually depleted.
This solution is therefore only a stopgap.
Fatal high-volume attacks
Attack features
These attacks are generally based on TCP SYN, ICMP, and UDP floods (UDP especially, since a single UDP packet can be large). The largest attack the customer service system has suffered was 16 Gbps, which affected the entire data center. Attackers usually control a large botnet, or even collude directly with IDCs for server and bandwidth resources, and pour traffic at the target. The traffic quickly saturates the server's network bandwidth, leaving it unable to respond to any user request.
This type of attack requires large bandwidth resources, so it is quite expensive for the attacker, but it is "fast and precise": the website is rendered completely unresponsive within a short time.
With an attack of this size, the IDC's traffic monitoring equipment will notice, and the IDC usually blocks or even null-routes the attacked IP address outright, which finishes the attacker's job for him. For a customer who came asking for help, this adds insult to injury.
Defense ideas
Defense methods against such volumetric attacks include:
1. Deploy a hardware firewall
2. Rent anti-DDoS nodes
3. Rent a CDN to distribute the attack traffic
Disadvantages
Deploying a hardware firewall: a device with 2 Gbps of "hard" protection costs around 100,000 RMB on the market, and clustered protection costs even more. Hardware-level protection performs well, but a large enough flood still overwhelms it, and the side effects are not to be underestimated.
Renting anti-DDoS nodes: these are sold by protected bandwidth, protected traffic, and shared versus dedicated plans. Prices vary greatly across package combinations, as do the traffic-scrubbing policies. When the attack exceeds the promised protection level, the defense either fails or you pay more, and either way there are performance losses and side effects.
Renting a CDN to distribute the attack traffic: commercial CDN providers charge by traffic. A website that is frequently attacked ends up paying for the attack traffic as well, which is galling.
Whether you buy hardware, rent anti-DDoS resources, or rent CDN acceleration, the costs are high and the resources sit idle most of the time. At attack peaks, against a well-organized high-volume flood, their effectiveness is limited and comes with side effects (see, for example, the design principles of NSFOCUS's "black hole" firewall). None of this is a long-term plan.
The weaker party
To sum up, whichever option we choose, it is painful.
We dealt with the attackers for nearly a year. What we learned is that this is a complete industrial chain (the people upstream live abroad long-term and issue remote-control commands, making investigation practically impossible). They control a large pool of attack resources, and those resources themselves come from IDCs. To profit quickly, attackers favor and recommend this direct method of hitting a target; when launching an attack they can mobilize bandwidth from multiple IDCs at once (a phenomenon that also reflects how loosely IDCs are managed in China).
Seen this way, the attacked party is always the weak one: with a fragile architecture and extremely limited resources, it cannot resist a massed pool of attack resources.
We kept asking ourselves: if we keep pouring money in, what will we have left after the crisis passes, or in a few years? So we stepped outside the mindset of single-node defense and rented CDNs, combined the advantages of the solutions above, and turned to building our own CDN.
Long-term plan: a self-built CDN
The advantages of a self-built CDN are:
1. Off-path traffic scrubbing: attack traffic lands on edge nodes instead of the origin (better that the pimple grows on someone else's face)
2. Full resource utilization: route acceleration when there is no attack, node switching when there is one (one investment, several uses)
3. As investment grows, so does the ability to withstand DDoS attacks (long-term planning, high return on capital)
In the next article in this series, we introduce how to build the CDN and what it costs.
Self-Built CDN Defense Against DDoS Attacks (2): Architecture Design, Costs, and Deployment Details
In the first article in this series, we introduced the situation of DDoS attacks on our customer service system and the reasons why we decided to use self-built CDN to solve this problem.
Next, we will introduce the specific construction plan of self-built CDN, mainly from the following aspects: hardware cost, bandwidth cost, architecture design, actual deployment.
Hardware cost
For hardware, our selection criteria were strong performance in a 1U form factor and good cost-effectiveness.
We chose the "Twin Star" twin-node server line (brand name literally "strong oxygen", 强氧): 1U chassis + dual Xeon CPUs + up to 48 GB RAM + two dual-gigabit NICs, plus an H3C S1208 eight-port gigabit switch, with a three-year warranty, for a total of about 15,000 RMB.
Bandwidth cost
The data centers are single-line facilities whose bandwidth is purchased directly from the carriers' agents, with no third-party bundling, so there is ample room to choose and the cost-effectiveness is high. For example, we rent China Telecom and China Unicom single-line resources; each line comes with a bandwidth allotment and 8 IP addresses, and some data centers additionally provide hardware protection able to absorb 5-10 Gbps of attack traffic.
On average, bandwidth for each node costs roughly 16,000-25,000 RMB per year.
Architecture Design
The CDN architecture should fully embody anti-attack capability and flexibility. We therefore decompose each CDN node into three functional layers: reverse proxy + cache acceleration + attack defense.
1. Reverse proxy (role: route acceleration, hiding the master node, load balancing)
2. Cache acceleration (role: serving static content to save the backend master node's bandwidth)
3. Attack defense (role: fast parsing, matching, and filtering of malicious requests)
The open source world offers plenty of software that can act as a reverse proxy and cache, each with its own strengths and weaknesses. As architects we had to choose; we compared the candidates on performance, features, and configurability.
We ran tests, optimization, and production trials across the three functional layers, and evaluated the candidates as follows:
1. HTTP defense performance: when fending off a high-volume CC attack with regular-expression matching and header filtering, HAProxy consumes only 10-20% CPU. The other candidates consumed more than 90% CPU, which easily renders the whole system unresponsive.
2. Reverse proxy performance: Varnish with an in-memory cache has the highest forwarding efficiency, followed by ATS and Nginx. For large cache capacities, ATS is also a good choice, though its documentation is sparse and it requires continued attention. Nginx, built for the C10K problem, performs well and has a rich plugin ecosystem.
3. Configurable filtering rules: HAProxy, ATS, and Squid all support reading rule files, custom ACLs, hot loading, and hot restart. Nginx does not support regular-expression matching from external files, which is slightly less convenient, but it is highly malleable.
Based on these considerations, our final architecture is a combination of HAProxy + Varnish/ATS/Nginx: a defensive reverse-proxy cache. The functional roles are:
1. In front, HAProxy handles the separation of dynamic and static requests, session stickiness, node load balancing, failover, and, in emergencies, defense against HTTP-based CC attacks.
2. Behind it, a pluggable, replaceable reverse proxy cache engine: memory-based Varnish or disk-based ATS, chosen according to the production scenario and the size of the cached object set; or, when strong custom features (such as anti-leeching) are needed, Nginx plus plugins.
The biggest strengths of this combination are:
1. Support for external filter-rule files; in particular, key strings can be appended to a file directly, with no escaping.
2. Support for hot configuration loading and reload.
3. Pluggable cache components that flexibly meet different business needs.
4. Easy deployment, and easy switching between failed and healthy nodes.
Why no LVS? LVS is a heavyweight, efficient, and stable layer-4 forwarder, but it cannot understand layer-7 HTTP, and it would sit entirely in front of the layer-7 tier. Adding LVS does not change the network structure, so it can still be introduced later, provided its own single point of failure is taken into account.
Actual deployment
In the end we deployed eight CDN nodes around the master node (adjust the number to your company's strength and actual production needs; this figure is only a reference), grouped into four regions: North China (mainly Shandong and Hebei), Southwest (mainly Sichuan), East China (mainly Ningbo and Jiaxing), and South China (mainly Fujian and Hunan).
Overall Cost
Overall: eight single-line acceleration nodes (each with its bandwidth package) and eight Twin Star servers. After the one-time hardware investment, ongoing costs are essentially bandwidth only, backed by an emergency fund and a monthly CDN budget.
Project schedule:
Months 1-4, ramp-up. A tip here: sign monthly or quarterly contracts with the IDCs at first, and keep checking node quality through monitoring. If a node turns out poor and you switch providers, the loss is small; if a node is good, switch to half-year or annual payment. This secures the best quality at the best price.
Months 5-8, consolidation: add bandwidth at a measured pace within budget, and keep bandwidth redundant.
After month 8, stable operation: maximize node availability according to actual conditions; overall defense capability improves as well.
How the protection policies are implemented
Enable HAProxy's httplog to record logs.
HAProxy configuration policy:
global
    nbproc 24
    pidfile /var/run/haproxy.pid
    daemon
    quiet
    user nobody
    group nobody
    chroot /opt/haproxy
    spread-checks 2

defaults
    log 127.0.0.1 local5
    mode http
    option forwardfor
    option httplog
    option dontlognull
    option nolinger          # reduce FIN_WAIT1
    option redispatch
    retries 3
    option http-pretend-keepalive
    option http-server-close
    option accept-invalid-http-request
    timeout client 15s
    timeout connect 15s
    timeout server 15s
    timeout http-keep-alive 15s
    timeout http-request 15s
    stats enable
    stats uri /stats
    stats realm 53KF\ Proxy\ Status
    stats refresh 60s
    stats auth admin:adminxxx

listen Web_FB 0.0.0.0:80
    option httpchk GET /alive.php HTTP/1.0
    acl invalid_referer hdr_sub(referer) -i -f /opt/haproxy/etc/bad_ref.conf
    acl invalid_url url_sub -i -f /opt/haproxy/etc/bad_url.conf
    acl invalid_methods method -i -f /opt/haproxy/etc/bad_method.conf
    block if invalid_referer or invalid_url or invalid_methods
    acl dyn_host hdr(host) -i -f /opt/haproxy/etc/notcache_host.conf
    acl static_req path_end -i -f /opt/haproxy/etc/allow_cache_file.conf
    use_backend img_srv if static_req !dyn_host
    # acl shaohy
    acl geek hdr_dom(host) -i 17geek.com
    use_backend geek if geek

# backend shaohy
backend geek
    mode http
    balance source
    cookie SESSION_COOKIE insert indirect nocache
    option tcpka
    # server address reconstructed (the original line was garbled); it matches
    # the Varnish backend below
    server geek_1 127.0.0.1:81 cookie geek_1 maxconn 10000 weight 8

backend img_srv
    mode http
    option tcpka
    server img_srv 127.0.0.1:88 maxconn 30000 weight 8
Varnish Configuration Policy:
backend h_17geek_com_1 {
    .host = "127.0.0.1";
    .port = "81";
    .connect_timeout = 300s;
    .first_byte_timeout = 300s;
    .between_bytes_timeout = 300s;
}

director geek_srv random {
    { .backend = h_17geek_com_1; .weight = 3; }
}

sub vcl_recv {
    if (req.http.host ~ "^(www\.)?17geek.com$") {
        set req.backend = geek_srv;
        if (req.request != "GET" && req.request != "HEAD") {
            return (pipe);
        }
        if (req.url ~ "\.(php|jsp)($|\?)") {
            return (pass);
        } else {
            return (lookup);
        }
    }
}
For CC-type DDoS attacks, the abnormal-traffic monitoring method described in the first article still applies, and its advantages become more obvious here, because:
1. Each node records and analyzes its own logs, and after detecting abnormal requests filters them with ACL rules on the HAProxy frontend. The attack pressure is therefore never passed to the backend servers, keeping the backend safe.
2. If the attack traffic on a node is too high, the data center may null-route the IP or divert the traffic; the intelligent DNS in the background then automatically removes that node, and subsequent requests no longer pass through it.
In the next article in this series, we will introduce some subsequent improvements to the CDN architecture, including intelligent DNS, large-scale log analysis, and the use of OpenCDN to improve background management.
Self-Built CDN Defense Against DDoS Attacks (3): Subsequent Architecture Improvements
In the first article in this series, we introduced the situation of DDoS attacks on our customer service system and the reasons why we decided to use self-built CDN to solve this problem.
In the second, we introduced the concrete construction plan: hardware cost, bandwidth cost, architecture design, and actual deployment.
This third article introduces subsequent improvements to the CDN architecture: intelligent DNS resolution + round robin + liveness monitoring, centralized log analysis + attack defense, and rapid deployment with graphical management of a multi-node CDN.
1. Intelligent DNS resolution + round robin + liveness monitoring
A. Deploy intelligent DNS to match visitors to the nearest CDN node
Another purpose of the self-built CDN is to optimize access paths: these acceleration nodes were selected carefully, so bandwidth quality, data center environment, and security risk are all reliable and controllable.
With multiple CDN nodes deployed, to make them work together and optimize each visitor's path, configure Bind views to direct visitor IPs to the appropriate CDN node, so that visitors fetch page content from a nearby node matching their region and carrier line.
B. DNS round robin + automatic failure monitoring
DNS round robin distributes load across the website's nodes. With sufficient resources, redundant CDN nodes can be deployed in each region: this both relieves the load on any single node and provides mutual backup. When a regional node fails, the scheduling mechanism shifts its traffic to the currently available nodes in the shortest possible time, dynamically removing the failed node without affecting visitors' requests.
To implement DNS round robin, simply add multiple A records for the same domain name in Bind. Bind's view feature and node liveness checking are mature techniques with plenty of documentation; see for example "Using Bind to build a highly available intelligent DNS server". We will not go into detail here.
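To make the view mechanism concrete, here is a minimal named.conf sketch; the ACL range, zone name, and file paths are illustrative placeholders, not our production values:

```conf
// Match clients from one carrier/region to a dedicated zone file.
acl "telecom_east" { 115.192.0.0/11; };   // placeholder range

view "telecom_east" {
    match-clients { telecom_east; };
    zone "example.com" {
        type master;
        // A records here point at the East China Telecom node(s)
        file "zones/example.com.telecom_east";
    };
};

view "default" {
    match-clients { any; };
    zone "example.com" {
        type master;
        file "zones/example.com.default";  // fallback node(s)
    };
};
```

Round robin then needs nothing more than multiple A records for the same name inside each view's zone file (e.g. two `www IN A ...` lines).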
C. Bind view IP sorting script
The script below quickly sorts out the China Telecom and China Unicom IP ranges for East, South, North, and West China. Try it out if you are interested.
# This script downloads the China IP allocation list from APNIC and classifies
# the ranges into China Unicom (CNC), China Telecom (CTC) and other carriers.
get_apnic() {
    FILE=$PWD/ip_apnic
    CNC_FILE=$PWD/cnc
    CTC_FILE=$PWD/ctc
    TMP=/dev/shm/ip.tmp
    rm -f $FILE
    wget http://ftp.apnic.net/apnic/stats/apnic/delegated-apnic-latest -O $FILE

    grep 'apnic|CN|ipv4|' $FILE | cut -f 4,5 -d'|' | sed -e 's/|/ /g' | while read ip cnt
    do
        echo $ip:$cnt
        # Convert the address count into a prefix length with bc's log2
        mask=$(cat << EOF | bc | tail -1
pow = 32;
define log2(x) {
    if (x <= 1) return (pow);
    pow--;
    return (log2(x/2));
}
log2($cnt)
EOF
)
        # whois server name is our reconstruction; the original line was mangled
        whois -h whois.apnic.net $ip > $TMP.tmp
        sed -n '/^inetnum/,/source/p' $TMP.tmp | awk '/mnt-/||/netname/' > $TMP
        NETNAME=$(grep ^netname $TMP | sed -e 's/.*:\(.*\)/\1/g' -e 's/-.*//g' | sed 's/ //g')
        egrep -qi "(CNC|UNICOM|WASU|NBIP|CERNET|CHINAGBN|CHINACOMM|FibrLINK|BGCTVNET|DXTNET|CRTC)" $TMP
        if [ $? = 0 ]; then
            echo $ip/$mask >> $CNC_FILE
        else
            egrep -qi "(CHINATELECOM|CHINANET)" $TMP
            if [ $? = 0 ]; then
                echo $ip/$mask >> $CTC_FILE
            else
                # Fall back to the route object when inetnum gives no match
                sed -n '/^route/,/source/p' $TMP.tmp | awk '/mnt-/||/netname/' > $TMP
                egrep -qi "(CNC|UNICOM|WASU|NBIP|CERNET|CHINAGBN|CHINACOMM|FibrLINK|BGCTVNET|DXTNET|CRTC)" $TMP
                if [ $? = 0 ]; then
                    echo $ip/$mask >> $CNC_FILE
                else
                    egrep -qi "(CHINATELECOM|CHINANET)" $TMP
                    if [ $? = 0 ]; then
                        echo $ip/$mask >> $CTC_FILE
                    else
                        echo "$ip/$mask $NETNAME" >> $PWD/other
                    fi
                fi
            fi
        fi
    done
    rm -rf $TMP.tmp
}
# Extract the registrant address from the whois record to determine which
# province each range belongs to.
gen_zone() {
    FILE=$2
    [ ! -s $FILE ] && echo "$FILE file not found." && exit 0
    rm -rf $FILE.zone
    while read LINE; do
        LINE=$(echo "$LINE" | awk '{print $1}')
        echo "$LINE@"
        echo -n "$LINE@" >> $FILE.zone
        whois $LINE | egrep "address" | xargs echo >> $FILE.zone
        sleep $TIME
    done < $FILE
}
# Pick out the IP ranges for East, South, North and West China.
gen_area() {
    FILE=$2
    [ ! -s $FILE.zone ] && echo "$FILE.zone file not found." && exit 0
    STRING="none"
    echo $FILE | egrep -i -q "cnc" && STRING="cnc"
    echo $FILE | egrep -i -q "ctc" && STRING="ctc"
    echo $FILE | egrep -i -q "other" && STRING="other"
    [ $STRING = "none" ] && echo "Not cnc or ctc file" && exit 0
    cp -a $FILE.zone $FILE.tmp
    # $HD_STR/$HN_STR/$XI_STR/$HB_STR hold the per-region province keywords
    egrep -i "$HD_STR" $FILE.tmp > $HD_FILE.$STRING
    egrep -i -v "$HD_STR" $FILE.tmp > aaa
    mv aaa $FILE.tmp
    egrep -i "$HN_STR" $FILE.tmp > $HN_FILE.$STRING
    egrep -i -v "$HN_STR" $FILE.tmp > aaa
    mv aaa $FILE.tmp
    egrep -i "$XI_STR" $FILE.tmp > $XI_FILE.$STRING
    egrep -i -v "$XI_STR" $FILE.tmp > aaa
    mv aaa $FILE.tmp
    egrep -i "$HB_STR" $FILE.tmp > $HB_FILE.$STRING
    egrep -i -v "$HB_STR" $FILE.tmp > aaa
    mv aaa $FILE.tmp
    # Whatever is left falls back into the East China file
    grep ^[0-9] $FILE.tmp | awk '{print $1}' >> $HD_FILE.$STRING
    sed -r -i 's#@.*##g' *.$STRING
    rm -rf $FILE.tmp
}
2. Centralized log analysis + attack defense
As the website's front line, CDN nodes record every visitor's behavior in real time; the logs hold plenty of secrets. As far as we can tell, most websites do not make good use of their access logs, merely archiving them for backup. Analyzed and mined properly, these logs reveal the site's running state and expose anomalies at the business layer; in particular, in the face of DDoS attacks, they provide the evidence needed to single out malicious IP addresses.
Malicious attack patterns fall mainly into three types:
1. A single IP address issuing a large number of concurrent requests
2. Consecutive IP segments issuing large numbers of requests
3. A large number of scattered, unrelated IP addresses issuing requests
Currently our HAProxy log analysis runs per node. In production we truncate logs per time unit and write them into /dev/shm (memory), then analyze behavior with plain shell, awk, and sed. This avoids the bottleneck of disk I/O overhead; the drawback is that the analysis is crude and its efficiency needs improvement.
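As one sketch of this shell/awk style of analysis (a hypothetical helper, not our production script), detecting the second pattern above can be approximated by aggregating request counts per /24 segment:

```shell
#!/bin/bash
# Sketch (hypothetical helper): aggregate request counts by /24 segment to
# spot attacks launched from consecutive IP ranges. Assumes common/combined
# log format, where field 1 is the client IP.
top_segments() {
    local log="$1" threshold="$2"
    awk '{print $1}' "$log" |
        awk -F. '{print $1"."$2"."$3".0/24"}' |
        sort | uniq -c | sort -rn |
        awk -v t="$threshold" '$1 > t {print $2}'
}
```

Running such helpers against a log copy in /dev/shm keeps the whole pipeline in memory, which is the point of the tmpfs trick described above.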
A. Architecture for centralized multi-node log analysis + attack blocking
Log analysis confined to a single node has clear limitations:
1. Logs are scattered across nodes; analysis ignores the other nodes' data and cannot see the global picture.
2. A protection rule, once enabled, acts only on its own node; the other nodes still face attacks with the same signature.
3. Real-time analysis on a node that is under attack consumes significant system resources.
Therefore, in a multi-node CDN, to detect and block DDoS attacks promptly while spending as little of the nodes' resources as possible, attack behavior must be analyzed centrally at the global level, and the resulting defense/blocking rules pushed to all nodes in coordination.
Sorting out the difficulties, we found three problems to solve:
Collecting and storing the massive logs from multiple CDN nodes
Centralized risk analysis over those logs
A coordinated attack-blocking mechanism
The concrete architecture:
Nginx/HAProxy on each node as the enforcement point of the defense system
Node access logs shipped via syslog to a dedicated LogServer for collection
The dedicated LogServer handling log storage, risk analysis, and blocking-rule push
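The log-shipping leg of that architecture can be sketched with rsyslog's legacy syntax; the LogServer address and file paths below are placeholders, not our production values (HAProxy already emits to the local5 facility in the configuration shown earlier):

```conf
# Node side (/etc/rsyslog.conf sketch): forward HAProxy's facility over UDP.
# A single @ means UDP; the LogServer IP is a placeholder.
local5.*    @192.0.2.10:514

# LogServer side: accept UDP and file logs by originating host.
$ModLoad imudp
$UDPServerRun 514
$template CDNLog,"/data/logs/%HOSTNAME%/haproxy.log"
local5.*    ?CDNLog
```

UDP transport is lossy by design, which is acceptable here: the analysis looks for statistical patterns, not individual log lines.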
A. HAProxy/Nginx as the defense carrier
As mentioned in the previous article, we recommend HAProxy or Nginx as the defensive reverse proxy on each CDN node: ACL filtering rules can be written flexibly for attack defense and take effect in real time via hot loading.
B. Log storage
This has two parts: transmitting logs from the nodes to the LogServer, and storing them centrally on the LogServer. Logs generated on CDN nodes are written locally to a pipe and shipped via rsyslog over UDP to the dedicated LogServer, which stores them classified by domain name.
Hadoop can serve as the storage substrate for massive logs, with Map/Reduce jobs decomposing them to improve filtering efficiency. For more information, see comparisons of open source log systems.
C. Coordinated attack blocking
This is the most critical link: the whole architecture centers on "anti-attack". Following the analysis above, the most efficient way to defend a multi-node CDN is for the dedicated LogServer to perform centralized analysis and computation, generate protection policies from the results, and push them to every CDN node in real time, so defense/blocking rules act in coordination against the DDoS attack.
Three questions then arise:
What scripts and rules should analyze the logs?
How do the analysis results become ACL policies for HAProxy/iptables?
How are the generated ACL policies applied to all CDN nodes in concert?
Our design is as follows: once logs are fully stored on the LogServer, an analysis script performs feature matching, extracts the source IPs of malicious attacks, generates the corresponding HAProxy/iptables blocking rules for those IPs, and distributes them to all CDN nodes. This can be done in two ways:
1. Through a dedicated interface that talks to iptables and Nginx/HAProxy
2. Through a unified configuration management tool such as Puppet: the LogServer acts as the pushing/master end issuing commands, and each CDN node as the policy-receiving/executing end; on receiving a protection policy, a node appends it to the ACL list and runs the hot-load command.
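As a minimal sketch of the push-and-hot-load step (the node list, ACL file path, config path, and helper name are our own illustrative assumptions, not the article's actual tooling), a shell loop over SSH could look like this; HAProxy's `-sf` flag performs the graceful reload mentioned above:

```shell
#!/bin/bash
# Sketch (hypothetical helper): append newly blocked IPs to every CDN node's
# HAProxy ACL file and hot-reload HAProxy. Set RUN=echo for a dry run that
# only prints the commands instead of executing them over SSH.
NODES="${NODES:-node1 node2}"
ACL_FILE="${ACL_FILE:-/opt/haproxy/etc/bad_ip.conf}"
RUN="${RUN:-ssh}"

push_block_rules() {
    local ip node
    for ip in "$@"; do
        for node in $NODES; do
            # Append the IP, then reload without dropping connections:
            # -sf lets the old workers finish before exiting.
            $RUN "$node" "echo '$ip' >> $ACL_FILE && haproxy -f /opt/haproxy/haproxy.cfg -sf \$(cat /var/run/haproxy.pid)"
        done
    done
}
```

A Puppet-based rollout replaces the SSH loop with the master pushing the ACL file and an exec resource running the same reload command.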
B. Advantages of this architecture
Once implemented, horizontal scaling becomes very easy: CDN nodes can be added or removed dynamically according to traffic and resource load, with no change to the origin site.
It copes with DDoS attacks gracefully, automatically blocking attack sources while distributing the attack traffic.
Moreover, when an anomaly is detected on one website, new protection rules can be developed quickly and the blocking measures applied to every site behind the CDN, for global security protection.
Collecting and analyzing the logs of all CDN nodes yields the detailed access behavior of every user and records all illegitimate behavior; with business-level security rules on top, this supports both early warning and after-the-fact tracing.
3. Rapid deployment and graphical management of multi-node CDN
Managing and maintaining a CDN is a real challenge for any organization, especially one spanning multiple regions and carrier lines. You must keep track of the accelerated node list, define which web elements may be cached, and decide what ACL policies are required; all of this demands professional operations staff to configure and implement.
The usual mature approach is to prepare the CDN rules on a master machine and push the configuration files to each node with rsync. Efficient as it is, this approach has a real barrier to entry for CDN operators, and its strict requirements on server permissions make it hard to hand off to other engineers.
By lucky coincidence, we first encountered OpenCDN when it won an award at a hackathon. Integrating it filled the gap in our CDN's front-end management, so we worked to integrate deeply with the OpenCDN project, lowering the operations and management threshold to benefit more IT operations users.
A. What problems does OpenCDN solve?
OpenCDN is a tool for rapidly deploying CDN acceleration. It gives companies that offer CDN services, or that need multi-node CDN acceleration of their own, a convenient management platform that monitors each node's status and system load in real time. OpenCDN ships with several sets of common cache rules covering a variety of complex CDN caching scenarios. As its name suggests, OpenCDN is free and open source.
B. How is OpenCDN implemented today?
OpenCDN's architecture divides into a CDN management center and CDN acceleration nodes; there can be any number of nodes, with no limit on quantity. Users deploy acceleration nodes quickly through OpenCDN and manage them centrally from the management center.
OpenCDN thus implements two things: one-click integration of the CDN node deployment process, and centralized management of those nodes through its WebConsole.
C. Where is OpenCDN heading, and to what effect?
OpenCDN is designed for accelerating websites with multiple CDN nodes. It provides a convenient CDN management platform that lets users build self-built CDN nodes on demand, control costs flexibly, improve site response time, and absorb traffic spikes quickly.
Going forward, we will fold the CDN combination scheme described above into it to defend against high-volume DDoS attacks. We have open-sourced the platform so that those who need it can obtain it at the lowest possible cost, and we hope more developers will join us to complete it: all for one, and one for all.
D. Advantages of OpenCDN for a self-built CDN
First, it lowers the cost of acquiring a CDN, and above all it improves control over CDN node performance. Compared with renting a commercial CDN, there is no pay-per-traffic bill to calculate; the result is a leasing model with fixed overhead.
It is not tied to a particular node medium: physical servers or VPSes both work, and VPSes from different providers can be combined into a low-cost CDN acceleration cluster covering the whole country.
A commercial CDN node is shared among many sites simultaneously, which means its limited resources (concurrency) are shared too; users with high bandwidth/traffic demands are better served by a self-built architecture.
E. Which users does OpenCDN suit?
Currently OpenCDN suits gaming sites, vertical e-commerce sites, community forums, online video, and chat services.
What these websites have in common: moderate traffic, fierce competition, frequent attacks, high industry margins, and a willingness to spend.
Summary
This brings the "self-built CDN against DDoS" series to a close. If you have any questions, please contact us.