Building a fast scanner

Source: Internet
Author: User
Tags: pack, socket, error
0x00 Preface

My recent work required large-scale scanning, so I had to write a fingerprint-based scan engine. I ran into countless pitfalls along the way and only barely got through them, so I am writing this post to summarize the pitfalls and the approaches I tried.

0x01 Task Queue + multithreaded model

The first scanning model that came to mind uses Python's synchronized queue together with multiple threads that send packets, receive packets, and write results. The model is roughly as follows:

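from queue import Queue
from threading import Thread

class Worker(Thread):
    def __init__(self, queue):
        super().__init__()
        self.queue = queue

    def run(self):
        while True:
            ip = self.queue.get()       # block until a target is available
            sock.send(...)              # send the probe (sock is set up elsewhere)
            recv = sock.recv(buf_size)  # wait for the reply
            ...                         # do sth
            self.queue.task_done()      # mark this target as finished

def main():
    ...
    worker = Worker(queue)
    worker.daemon = True   # do not keep the process alive for this thread
    worker.start()
    queue.join()           # queue sync: wait until every queued target is done
    ...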

But it turned out to be really inefficient: it could only send a few hundred pps on 4M of bandwidth. By large-scale scanning I mean scanning billions of IPs, and at that scale this approach is not worth considering.
To summarize the reasons: first, with too many threads, context switching costs a lot of CPU time, and with the GIL on top of that, the efficiency is what you would expect. Second, there are too many IPs to put in the queue all at once, so the main function has to buffer them and feed them into the synchronized queue in batches. The resulting frequent calls to queue.join() mean that threads which have already finished spend more and more time waiting for slow ones. Compared with the inefficiency caused by these factors, network IO latency hardly matters.
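One way to avoid the batch-and-join pattern described above is a bounded queue: the producer blocks in put() whenever the queue is full, so the full target list never has to sit in memory and queue.join() is called only once at the end. The following is a minimal sketch under that assumption, not the engine's actual code. The names scan_one and run_scan, the worker count, and the queue size are hypothetical, and the probe itself is stubbed out.

```python
import threading
from queue import Queue

def scan_one(ip):
    # Hypothetical stand-in for "send probe, wait for reply".
    return f"{ip} scanned"

def run_scan(targets, num_workers=4, max_pending=100):
    queue = Queue(maxsize=max_pending)   # bounded: put() blocks when full
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            ip = queue.get()
            try:
                outcome = scan_one(ip)
                with lock:
                    results.append(outcome)
            finally:
                queue.task_done()        # always balance the get()

    for _ in range(num_workers):
        threading.Thread(target=worker, daemon=True).start()

    for ip in targets:                   # targets may be a lazy generator
        queue.put(ip)                    # blocks while max_pending targets wait

    queue.join()                         # a single join after everything is queued
    return results
```

This only removes the repeated joins. The GIL and the context-switching cost of many threads remain, which is why the next model drops the queue altogether.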
Later I tried using Celery for asynchronous tasks, to separate network IO from local IO: the time-consuming local IO operations would be moved out of the threads and run asynchronously in Celery, which should speed up the threads that send and receive packets. The result was even slower. I recommend against using Celery for very time-consuming background tasks; it felt like quite a pitfall.

0x02 Stateless scan model

In short, you do not track the interaction state of each packet, and there is no synchronized queue. Instead, two threads are started: one is responsible for send and the other for recv. This works for me because my use case needs UDP packets, which are connectionless, as the basic probes.
The advantage over sending and receiving in the same thread is that this model does not care about the interaction process of the network communication. The sending and receiving threads can therefore both run at full load: no thread waits on another and there is basically no synchronization, so efficiency improves greatly. The model is roughly as follows:

from threading import Thread

class Sender(Thread):
    def __init__(self, sock):
        super().__init__()
        self.sock = sock

    def run(self):
        for ip in ip_list:                   # targets, prepared elsewhere
            pack = package_create()          # assemble the probe packet
            self.sock.sendto(pack, server)   # send without waiting for a reply
            ...                              # do sth

class Receiver(Thread):
    def __init__(self, sock):
        super().__init__()
        self.sock = sock

    def run(self):
        while True:
            recv = self.sock.recvfrom(buf_size)  # (data, address) of any reply
            handle_recv_package(recv)            # parse the reply
            ...                                  # do sth

def main():
    sender = Sender(sock)
    recver = Receiver(sock)
    ...   # do sth
    sock.close()
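Since the receiver keeps no per-target state, each reply has to be interpreted on its own, using only the bytes of the packet and the source address returned by recvfrom(). The sketch below illustrates this with a made-up probe format: a 2-byte magic value, a 2-byte sequence number, and the 4-byte target IPv4 address. The format and the names build_probe and parse_reply are assumptions for illustration, not a real protocol and not the fingerprint probes used in the engine, and the parser assumes the reply echoes the probe's layout. A precompiled struct.Struct is used so the format string is parsed only once.

```python
import socket
import struct

# Hypothetical probe layout: magic (2 bytes), sequence (2 bytes), IPv4 (4 bytes),
# all in network byte order. Compiled once and reused for every packet.
PROBE = struct.Struct("!HH4s")
MAGIC = 0x5343

def build_probe(seq, ip):
    """Assemble one probe packet for the given target address."""
    return PROBE.pack(MAGIC, seq & 0xFFFF, socket.inet_aton(ip))

def parse_reply(data):
    """Return (seq, ip) for a well-formed packet, or None for anything else."""
    if len(data) < PROBE.size:
        return None                      # too short to be one of ours
    magic, seq, raw_ip = PROBE.unpack_from(data)
    if magic != MAGIC:
        return None                      # unrelated traffic on the same port
    return seq, socket.inet_ntoa(raw_ip)
```

In the model above, package_create() would play the role of build_probe() and handle_recv_package() the role of parse_reply().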

The trouble is that with this model you have to assemble, send, receive, and parse the messages yourself. The payoff is that efficiency improves greatly: 3k~4k pps comes easily (4M bandwidth / 1 CPU / 2 GB memory), and the bandwidth is basically saturated.

0x03 Performance Improvement

The main room for improvement is in the following areas:

  • Speed up the assembly of outgoing messages.
  • Speed up the processing of received messages.
  • Use a buffer for bulk IO, which reduces the number of IO operations to some extent.
  • Handle socket errors properly. I suggest an event model with a task rollback function: when a socket error occurs, end the thread, then create a new socket in the main thread and start the thread again.
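The socket-error handling described above can be sketched as follows. This is one possible reading of it, not the engine's actual code: the sender thread stops on the first OSError and remembers where it stopped, and the main thread then creates a fresh socket and starts a new thread from that position (the "rollback"). The names Sender, run_with_restart, make_udp_socket, and the build callback are hypothetical.

```python
import socket
import threading

def make_udp_socket():
    return socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

class Sender(threading.Thread):
    """Sends probes to targets[start:]; stops at the first socket error."""

    def __init__(self, sock, targets, start, build):
        super().__init__()
        self.sock = sock
        self.targets = targets
        self.position = start       # index of the next target to send
        self.build = build          # callable: target -> packet bytes
        self.failed = False

    def run(self):
        while self.position < len(self.targets):
            target = self.targets[self.position]
            try:
                self.sock.sendto(self.build(target), target)
            except OSError:
                self.failed = True  # position is unchanged, so the target is retried
                return
            self.position += 1

def run_with_restart(targets, build, make_sock=make_udp_socket, max_restarts=3):
    """Main-thread loop: after a failure, open a new socket and resume."""
    position = 0
    for _ in range(max_restarts + 1):
        sock = make_sock()
        sender = Sender(sock, targets, position, build)
        sender.start()
        sender.join()
        sock.close()
        position = sender.position  # roll back to the first unsent target
        if not sender.failed:
            return True
    return False                    # gave up after max_restarts new sockets
```

Here targets is a list of (ip, port) tuples. A real engine would probably also want to log the error and skip a target that fails repeatedly, which this sketch leaves out.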

Once these optimizations are done and the existing bandwidth is saturated, bandwidth becomes the bottleneck, and the only way to go faster is to get more bandwidth.
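As a rough sanity check on why bandwidth is the limit, the packet rate a link can carry is its bandwidth divided by the size of each packet on the wire. The sketch below assumes that "4M" means 4 Mbit/s and uses hypothetical packet sizes, since the actual probe size is not given here.

```python
def max_pps(bandwidth_bits_per_s, packet_bytes):
    """Upper bound on packets per second for packets of a fixed on-wire size."""
    return bandwidth_bits_per_s / (packet_bytes * 8)

def implied_packet_bytes(bandwidth_bits_per_s, pps):
    """Average on-wire packet size implied by a saturated link at a given pps."""
    return bandwidth_bits_per_s / (pps * 8)

LINK = 4_000_000  # assumption: "4M" is 4 Mbit/s

print(max_pps(LINK, 125))                # 4000.0
print(implied_packet_bytes(LINK, 3500))  # average size if 3.5k pps fills the link
```

Under that assumption, 3k~4k pps on a saturated link corresponds to roughly 125 to 167 bytes per packet including headers, so the same code on a faster link should scale up until the CPU becomes the limit.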
In addition, during testing I found that the network card sent UDP packets more slowly than TCP packets. I tested on two machines and the conclusion was the same.
