On October 16 I wrote an article titled "Capturing URLs with pcap and dpkt". That version ran as a single process. Its efficiency is acceptable, but it is still only one process, and we should pursue perfection: what happens if we use multithreading? (I have no experience in this area.)
First I tried calling the capture function from several threads and found the results were duplicated: every thread was doing exactly the same work. Then I considered a global list, with one thread writing and another reading, but that raises shared-access problems. Finally I found Queue in my book: it implements its own locking, so we do not need to worry about conflicts at all. OK, let's try it.
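To see why Queue removes the need for our own locking, here is a minimal standalone producer/consumer sketch (the thread names and item counts are made up for illustration; the try/except import covers both Python 2 and 3):

```python
import threading
try:
    from queue import Queue   # Python 3
except ImportError:
    from Queue import Queue   # Python 2

q = Queue()
results = []

def producer(name):
    # Each producer writes its own items; Queue serializes access internally.
    for i in range(3):
        q.put('%s-%d' % (name, i))

def consumer():
    # A single consumer drains all 6 items; get() blocks until data arrives.
    for _ in range(6):
        results.append(q.get())

threads = [threading.Thread(target=producer, args=('a',)),
           threading.Thread(target=producer, args=('b',)),
           threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results))  # every item arrives exactly once, with no explicit lock in our code
```

Nothing here synchronizes the two writers and the reader except the Queue itself; this is the same pattern the capture/analysis classes below rely on.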
I wrote two classes: one captures packets, the other analyzes them.
Code
#!/usr/bin/env python
# coding=utf-8
import pcap
import dpkt
import threading
from Queue import Queue

TOTAL_HEADER_LENGTH = 48  # bare SYN/ACK segments total 48 bytes here, including 8 bytes of options

# class QueueWraper:
#
#     def __init__(self):
#         self.queue = Queue()
#
#     def get(self):
#         # debug
#         return self.queue.get()
#
#     def put(self, obj):
#         # debug
#         self.queue.put(obj)

class Capture(threading.Thread):

    def __init__(self, threadname, queue, port):
        threading.Thread.__init__(self, name=threadname)
        self.fetchdata = queue  # shared queue
        self.port = port

    def run(self):
        pc = pcap.pcap()
        pc.setfilter('tcp port ' + str(self.port))  # note the space before the port number
        for ts, pkt in pc:
            eth = dpkt.ethernet.Ethernet(pkt)
            length = eth.data.len
            if length > TOTAL_HEADER_LENGTH:
                content = str(eth.data.data.data)  # TCP payload
                self.fetchdata.put(content)


class Makeresult(threading.Thread):

    def __init__(self, threadname, queue, filepath):
        threading.Thread.__init__(self, name=threadname)
        self.fetchdata = queue  # shared queue
        self.filepath = filepath

    def run(self):
        fp = open(self.filepath, 'ab')
        while True:
            urlist = self.fetchdata.get()
            if 'GET' in urlist:  # HTTP methods are upper-case on the wire
                print urlist
                url = urlist.split()[1]
                fp.write(url + '\r\n')
                fp.flush()


def main():
    queue = Queue()
    capture = Capture('capture', queue, 80)
    makeresult = Makeresult('makeresult', queue, r'c:\url.txt')
    capture.start()
    makeresult.start()

if __name__ == '__main__':
    main()
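The URL extraction in Makeresult is plain string splitting on the HTTP request line. The same step can be sketched on its own (extract_url is a hypothetical helper name, not part of the program above):

```python
def extract_url(payload):
    # Take the first line of the request ("GET /path HTTP/1.1")
    # and return the path; return None for anything that is not a GET.
    first_line = payload.split('\r\n', 1)[0]
    parts = first_line.split()
    if len(parts) >= 2 and parts[0] == 'GET':
        return parts[1]
    return None

print(extract_url('GET /index.html HTTP/1.1\r\nHost: example.com\r\n'))  # /index.html
```

Splitting like this is good enough for a quick experiment; a more robust version would parse the payload with dpkt's own HTTP request class instead of raw string operations.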